First arm pullreq of the cycle; this is mostly my softfloat NaN
handling series. (Lots more in my to-review queue, but I don't
like pullreqs growing too close to a hundred patches at a time :-))

thanks
-- PMM

The following changes since commit 97f2796a3736ed37a1b85dc1c76a6c45b829dd17:

  Open 10.0 development tree (2024-12-10 17:41:17 +0000)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20241211

for you to fetch changes up to 1abe28d519239eea5cf9620bb13149423e5665f8:

  MAINTAINERS: Add correct email address for Vikram Garhwal (2024-12-11 15:31:09 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/net/lan9118: Extract PHY model, reuse with imx_fec, fix bugs
 * fpu: Make muladd NaN handling runtime-selected, not compile-time
 * fpu: Make default NaN pattern runtime-selected, not compile-time
 * fpu: Minor NaN-related cleanups
 * MAINTAINERS: email address updates

----------------------------------------------------------------
Bernhard Beschow (5):
      hw/net/lan9118: Extract lan9118_phy
      hw/net/lan9118_phy: Reuse in imx_fec and consolidate implementations
      hw/net/lan9118_phy: Fix off-by-one error in MII_ANLPAR register
      hw/net/lan9118_phy: Reuse MII constants
      hw/net/lan9118_phy: Add missing 100 mbps full duplex advertisement

Leif Lindholm (1):
      MAINTAINERS: update email address for Leif Lindholm

Peter Maydell (54):
      fpu: handle raising Invalid for infzero in pick_nan_muladd
      fpu: Check for default_nan_mode before calling pickNaNMulAdd
      softfloat: Allow runtime choice of inf * 0 + NaN result
      tests/fp: Explicitly set inf-zero-nan rule
      target/arm: Set FloatInfZeroNaNRule explicitly
      target/s390: Set FloatInfZeroNaNRule explicitly
      target/ppc: Set FloatInfZeroNaNRule explicitly
      target/mips: Set FloatInfZeroNaNRule explicitly
      target/sparc: Set FloatInfZeroNaNRule explicitly
      target/xtensa: Set FloatInfZeroNaNRule explicitly
      target/x86: Set FloatInfZeroNaNRule explicitly
      target/loongarch: Set FloatInfZeroNaNRule explicitly
      target/hppa: Set FloatInfZeroNaNRule explicitly
      softfloat: Pass have_snan to pickNaNMulAdd
      softfloat: Allow runtime choice of NaN propagation for muladd
      tests/fp: Explicitly set 3-NaN propagation rule
      target/arm: Set Float3NaNPropRule explicitly
      target/loongarch: Set Float3NaNPropRule explicitly
      target/ppc: Set Float3NaNPropRule explicitly
      target/s390x: Set Float3NaNPropRule explicitly
      target/sparc: Set Float3NaNPropRule explicitly
      target/mips: Set Float3NaNPropRule explicitly
      target/xtensa: Set Float3NaNPropRule explicitly
      target/i386: Set Float3NaNPropRule explicitly
      target/hppa: Set Float3NaNPropRule explicitly
      fpu: Remove use_first_nan field from float_status
      target/m68k: Don't pass NULL float_status to floatx80_default_nan()
      softfloat: Create floatx80 default NaN from parts64_default_nan
      target/loongarch: Use normal float_status in fclass_s and fclass_d helpers
      target/m68k: In frem helper, initialize local float_status from env->fp_status
      target/m68k: Init local float_status from env fp_status in gdb get/set reg
      target/sparc: Initialize local scratch float_status from env->fp_status
      target/ppc: Use env->fp_status in helper_compute_fprf functions
      fpu: Allow runtime choice of default NaN value
      tests/fp: Set default NaN pattern explicitly
      target/microblaze: Set default NaN pattern explicitly
      target/i386: Set default NaN pattern explicitly
      target/hppa: Set default NaN pattern explicitly
      target/alpha: Set default NaN pattern explicitly
      target/arm: Set default NaN pattern explicitly
      target/loongarch: Set default NaN pattern explicitly
      target/m68k: Set default NaN pattern explicitly
      target/mips: Set default NaN pattern explicitly
      target/openrisc: Set default NaN pattern explicitly
      target/ppc: Set default NaN pattern explicitly
      target/sh4: Set default NaN pattern explicitly
      target/rx: Set default NaN pattern explicitly
      target/s390x: Set default NaN pattern explicitly
      target/sparc: Set default NaN pattern explicitly
      target/xtensa: Set default NaN pattern explicitly
      target/hexagon: Set default NaN pattern explicitly
      target/riscv: Set default NaN pattern explicitly
      target/tricore: Set default NaN pattern explicitly
      fpu: Remove default handling for dnan_pattern

Richard Henderson (11):
      target/arm: Copy entire float_status in is_ebf
      softfloat: Inline pickNaNMulAdd
      softfloat: Use goto for default nan case in pick_nan_muladd
      softfloat: Remove which from parts_pick_nan_muladd
      softfloat: Pad array size in pick_nan_muladd
      softfloat: Move propagateFloatx80NaN to softfloat.c
      softfloat: Use parts_pick_nan in propagateFloatx80NaN
      softfloat: Inline pickNaN
      softfloat: Share code between parts_pick_nan cases
      softfloat: Sink frac_cmp in parts_pick_nan until needed
      softfloat: Replace WHICH with RET in parts_pick_nan

Vikram Garhwal (1):
      MAINTAINERS: Add correct email address for Vikram Garhwal

 MAINTAINERS | 4 +-
 include/fpu/softfloat-helpers.h | 38 +++-
 include/fpu/softfloat-types.h | 89 +++++++-
 include/hw/net/imx_fec.h | 9 +-
 include/hw/net/lan9118_phy.h | 37 ++++
 include/hw/net/mii.h | 6 +
 target/mips/fpu_helper.h | 20 ++
 target/sparc/helper.h | 4 +-
 fpu/softfloat.c | 19 ++
 hw/net/imx_fec.c | 146 ++------------
 hw/net/lan9118.c | 137 ++-----------
 hw/net/lan9118_phy.c | 222 ++++++++++++++++++++
 linux-user/arm/nwfpe/fpa11.c | 5 +
 target/alpha/cpu.c | 2 +
 target/arm/cpu.c | 10 +
 target/arm/tcg/vec_helper.c | 20 +-
 target/hexagon/cpu.c | 2 +
 target/hppa/fpu_helper.c | 12 ++
 target/i386/tcg/fpu_helper.c | 12 ++
 target/loongarch/tcg/fpu_helper.c | 14 +-
 target/m68k/cpu.c | 14 +-
 target/m68k/fpu_helper.c | 6 +-
 target/m68k/helper.c | 6 +-
 target/microblaze/cpu.c | 2 +
 target/mips/msa.c | 10 +
 target/openrisc/cpu.c | 2 +
 target/ppc/cpu_init.c | 19 ++
 target/ppc/fpu_helper.c | 3 +-
 target/riscv/cpu.c | 2 +
 target/rx/cpu.c | 2 +
 target/s390x/cpu.c | 5 +
 target/sh4/cpu.c | 2 +
 target/sparc/cpu.c | 6 +
 target/sparc/fop_helper.c | 8 +-
 target/sparc/translate.c | 4 +-
 target/tricore/helper.c | 2 +
 target/xtensa/cpu.c | 4 +
 target/xtensa/fpu_helper.c | 3 +-
 tests/fp/fp-bench.c | 7 +
 tests/fp/fp-test-log2.c | 1 +
 tests/fp/fp-test.c | 7 +
 fpu/softfloat-parts.c.inc | 152 +++++++++++---
 fpu/softfloat-specialize.c.inc | 412 ++------------------------------------
 .mailmap | 5 +-
 hw/net/Kconfig | 5 +
 hw/net/meson.build | 1 +
 hw/net/trace-events | 10 +-
 47 files changed, 778 insertions(+), 730 deletions(-)
 create mode 100644 include/hw/net/lan9118_phy.h
 create mode 100644 hw/net/lan9118_phy.c
From: Bernhard Beschow <shentey@gmail.com>

A very similar implementation of the same device exists in imx_fec. Prepare for
a common implementation by extracting a device model into its own files.

Some migration state has been moved into the new device model which breaks
migration compatibility for the following machines:
* smdkc210
* realview-*
* vexpress-*
* kzm
* mps2-*

While breaking migration ABI, fix the size of the MII registers to be 16 bit,
as defined by IEEE 802.3u.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-2-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/lan9118_phy.h | 37 ++++++++
 hw/net/lan9118.c | 137 +++++-----------------------
 hw/net/lan9118_phy.c | 169 +++++++++++++++++++++++++++++++++++
 hw/net/Kconfig | 4 +
 hw/net/meson.build | 1 +
 5 files changed, 233 insertions(+), 115 deletions(-)
 create mode 100644 include/hw/net/lan9118_phy.h
 create mode 100644 hw/net/lan9118_phy.c

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
32
diff --git a/include/hw/net/lan9118_phy.h b/include/hw/net/lan9118_phy.h
34
index XXXXXXX..XXXXXXX 100644
35
--- a/default-configs/arm-softmmu.mak
36
+++ b/default-configs/arm-softmmu.mak
37
@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
38
CONFIG_FSL_IMX6UL=y
39
CONFIG_SEMIHOSTING=y
40
CONFIG_ALLWINNER_H3=y
41
+CONFIG_ACPI_APEI=y
42
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
43
index XXXXXXX..XXXXXXX 100644
44
--- a/include/hw/acpi/aml-build.h
45
+++ b/include/hw/acpi/aml-build.h
46
@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
47
GArray *rsdp;
48
GArray *tcpalog;
49
GArray *vmgenid;
50
+ GArray *hardware_errors;
51
BIOSLinker *linker;
52
} AcpiBuildTables;
53
54
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
55
new file mode 100644
33
new file mode 100644
56
index XXXXXXX..XXXXXXX
34
index XXXXXXX..XXXXXXX
57
--- /dev/null
35
--- /dev/null
58
+++ b/include/hw/acpi/ghes.h
36
+++ b/include/hw/net/lan9118_phy.h
59
@@ -XXX,XX +XXX,XX @@
37
@@ -XXX,XX +XXX,XX @@
60
+/*
38
+/*
61
+ * Support for generating APEI tables and recording CPER for Guests
39
+ * SMSC LAN9118 PHY emulation
62
+ *
40
+ *
63
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
41
+ * Copyright (c) 2009 CodeSourcery, LLC.
42
+ * Written by Paul Brook
64
+ *
43
+ *
65
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
66
+ *
45
+ * See the COPYING file in the top-level directory.
67
+ * This program is free software; you can redistribute it and/or modify
68
+ * it under the terms of the GNU General Public License as published by
69
+ * the Free Software Foundation; either version 2 of the License, or
70
+ * (at your option) any later version.
71
+
72
+ * This program is distributed in the hope that it will be useful,
73
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
74
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
75
+ * GNU General Public License for more details.
76
+
77
+ * You should have received a copy of the GNU General Public License along
78
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
79
+ */
46
+ */
80
+
47
+
81
+#ifndef ACPI_GHES_H
48
+#ifndef HW_NET_LAN9118_PHY_H
82
+#define ACPI_GHES_H
49
+#define HW_NET_LAN9118_PHY_H
83
+
50
+
84
+#include "hw/acpi/bios-linker-loader.h"
51
+#include "qom/object.h"
85
+
52
+#include "hw/sysbus.h"
86
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
53
+
54
+#define TYPE_LAN9118_PHY "lan9118-phy"
55
+OBJECT_DECLARE_SIMPLE_TYPE(Lan9118PhyState, LAN9118_PHY)
56
+
57
+typedef struct Lan9118PhyState {
58
+ SysBusDevice parent_obj;
59
+
60
+ uint16_t status;
61
+ uint16_t control;
62
+ uint16_t advertise;
63
+ uint16_t ints;
64
+ uint16_t int_mask;
65
+ qemu_irq irq;
66
+ bool link_down;
67
+} Lan9118PhyState;
68
+
69
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down);
70
+void lan9118_phy_reset(Lan9118PhyState *s);
71
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg);
72
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val);
73
+
87
+#endif
74
+#endif
88
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
75
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
89
index XXXXXXX..XXXXXXX 100644
76
index XXXXXXX..XXXXXXX 100644
90
--- a/hw/acpi/aml-build.c
77
--- a/hw/net/lan9118.c
91
+++ b/hw/acpi/aml-build.c
78
+++ b/hw/net/lan9118.c
92
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
79
@@ -XXX,XX +XXX,XX @@
93
tables->table_data = g_array_new(false, true /* clear */, 1);
80
#include "net/net.h"
94
tables->tcpalog = g_array_new(false, true /* clear */, 1);
81
#include "net/eth.h"
95
tables->vmgenid = g_array_new(false, true /* clear */, 1);
82
#include "hw/irq.h"
96
+ tables->hardware_errors = g_array_new(false, true /* clear */, 1);
83
+#include "hw/net/lan9118_phy.h"
97
tables->linker = bios_linker_loader_init();
84
#include "hw/net/lan9118.h"
85
#include "hw/ptimer.h"
86
#include "hw/qdev-properties.h"
87
@@ -XXX,XX +XXX,XX @@ do { printf("lan9118: " fmt , ## __VA_ARGS__); } while (0)
88
#define MAC_CR_RXEN 0x00000004
89
#define MAC_CR_RESERVED 0x7f404213
90
91
-#define PHY_INT_ENERGYON 0x80
92
-#define PHY_INT_AUTONEG_COMPLETE 0x40
93
-#define PHY_INT_FAULT 0x20
94
-#define PHY_INT_DOWN 0x10
95
-#define PHY_INT_AUTONEG_LP 0x08
96
-#define PHY_INT_PARFAULT 0x04
97
-#define PHY_INT_AUTONEG_PAGE 0x02
98
-
99
#define GPT_TIMER_EN 0x20000000
100
101
/*
102
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
103
uint32_t mac_mii_data;
104
uint32_t mac_flow;
105
106
- uint32_t phy_status;
107
- uint32_t phy_control;
108
- uint32_t phy_advertise;
109
- uint32_t phy_int;
110
- uint32_t phy_int_mask;
111
+ Lan9118PhyState mii;
112
+ IRQState mii_irq;
113
114
int32_t eeprom_writable;
115
uint8_t eeprom[128];
116
@@ -XXX,XX +XXX,XX @@ struct lan9118_state {
117
118
static const VMStateDescription vmstate_lan9118 = {
119
.name = "lan9118",
120
- .version_id = 2,
121
- .minimum_version_id = 1,
122
+ .version_id = 3,
123
+ .minimum_version_id = 3,
124
.fields = (const VMStateField[]) {
125
VMSTATE_PTIMER(timer, lan9118_state),
126
VMSTATE_UINT32(irq_cfg, lan9118_state),
127
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118 = {
128
VMSTATE_UINT32(mac_mii_acc, lan9118_state),
129
VMSTATE_UINT32(mac_mii_data, lan9118_state),
130
VMSTATE_UINT32(mac_flow, lan9118_state),
131
- VMSTATE_UINT32(phy_status, lan9118_state),
132
- VMSTATE_UINT32(phy_control, lan9118_state),
133
- VMSTATE_UINT32(phy_advertise, lan9118_state),
134
- VMSTATE_UINT32(phy_int, lan9118_state),
135
- VMSTATE_UINT32(phy_int_mask, lan9118_state),
136
VMSTATE_INT32(eeprom_writable, lan9118_state),
137
VMSTATE_UINT8_ARRAY(eeprom, lan9118_state, 128),
138
VMSTATE_INT32(tx_fifo_size, lan9118_state),
139
@@ -XXX,XX +XXX,XX @@ static void lan9118_reload_eeprom(lan9118_state *s)
140
lan9118_mac_changed(s);
98
}
141
}
99
142
100
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
143
-static void phy_update_irq(lan9118_state *s)
101
g_array_free(tables->table_data, true);
144
+static void lan9118_update_irq(void *opaque, int n, int level)
102
g_array_free(tables->tcpalog, mfre);
145
{
103
g_array_free(tables->vmgenid, mfre);
146
- if (s->phy_int & s->phy_int_mask) {
104
+ g_array_free(tables->hardware_errors, mfre);
147
+ lan9118_state *s = opaque;
148
+
149
+ if (level) {
150
s->int_sts |= PHY_INT;
151
} else {
152
s->int_sts &= ~PHY_INT;
153
@@ -XXX,XX +XXX,XX @@ static void phy_update_irq(lan9118_state *s)
154
lan9118_update(s);
105
}
155
}
106
156
107
/*
157
-static void phy_update_link(lan9118_state *s)
108
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
158
-{
159
- /* Autonegotiation status mirrors link status. */
160
- if (qemu_get_queue(s->nic)->link_down) {
161
- s->phy_status &= ~0x0024;
162
- s->phy_int |= PHY_INT_DOWN;
163
- } else {
164
- s->phy_status |= 0x0024;
165
- s->phy_int |= PHY_INT_ENERGYON;
166
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
167
- }
168
- phy_update_irq(s);
169
-}
170
-
171
static void lan9118_set_link(NetClientState *nc)
172
{
173
- phy_update_link(qemu_get_nic_opaque(nc));
174
-}
175
-
176
-static void phy_reset(lan9118_state *s)
177
-{
178
- s->phy_status = 0x7809;
179
- s->phy_control = 0x3000;
180
- s->phy_advertise = 0x01e1;
181
- s->phy_int_mask = 0;
182
- s->phy_int = 0;
183
- phy_update_link(s);
184
+ lan9118_phy_update_link(&LAN9118(qemu_get_nic_opaque(nc))->mii,
185
+ nc->link_down);
186
}
187
188
static void lan9118_reset(DeviceState *d)
189
@@ -XXX,XX +XXX,XX @@ static void lan9118_reset(DeviceState *d)
190
s->read_word_n = 0;
191
s->write_word_n = 0;
192
193
- phy_reset(s);
194
-
195
s->eeprom_writable = 0;
196
lan9118_reload_eeprom(s);
197
}
198
@@ -XXX,XX +XXX,XX @@ static void do_tx_packet(lan9118_state *s)
199
uint32_t status;
200
201
/* FIXME: Honor TX disable, and allow queueing of packets. */
202
- if (s->phy_control & 0x4000) {
203
+ if (s->mii.control & 0x4000) {
204
/* This assumes the receive routine doesn't touch the VLANClient. */
205
qemu_receive_packet(qemu_get_queue(s->nic), s->txp->data, s->txp->len);
206
} else {
207
@@ -XXX,XX +XXX,XX @@ static void tx_fifo_push(lan9118_state *s, uint32_t val)
208
}
209
}
210
211
-static uint32_t do_phy_read(lan9118_state *s, int reg)
212
-{
213
- uint32_t val;
214
-
215
- switch (reg) {
216
- case 0: /* Basic Control */
217
- return s->phy_control;
218
- case 1: /* Basic Status */
219
- return s->phy_status;
220
- case 2: /* ID1 */
221
- return 0x0007;
222
- case 3: /* ID2 */
223
- return 0xc0d1;
224
- case 4: /* Auto-neg advertisement */
225
- return s->phy_advertise;
226
- case 5: /* Auto-neg Link Partner Ability */
227
- return 0x0f71;
228
- case 6: /* Auto-neg Expansion */
229
- return 1;
230
- /* TODO 17, 18, 27, 29, 30, 31 */
231
- case 29: /* Interrupt source. */
232
- val = s->phy_int;
233
- s->phy_int = 0;
234
- phy_update_irq(s);
235
- return val;
236
- case 30: /* Interrupt mask */
237
- return s->phy_int_mask;
238
- default:
239
- qemu_log_mask(LOG_GUEST_ERROR,
240
- "do_phy_read: PHY read reg %d\n", reg);
241
- return 0;
242
- }
243
-}
244
-
245
-static void do_phy_write(lan9118_state *s, int reg, uint32_t val)
246
-{
247
- switch (reg) {
248
- case 0: /* Basic Control */
249
- if (val & 0x8000) {
250
- phy_reset(s);
251
- break;
252
- }
253
- s->phy_control = val & 0x7980;
254
- /* Complete autonegotiation immediately. */
255
- if (val & 0x1000) {
256
- s->phy_status |= 0x0020;
257
- }
258
- break;
259
- case 4: /* Auto-neg advertisement */
260
- s->phy_advertise = (val & 0x2d7f) | 0x80;
261
- break;
262
- /* TODO 17, 18, 27, 31 */
263
- case 30: /* Interrupt mask */
264
- s->phy_int_mask = val & 0xff;
265
- phy_update_irq(s);
266
- break;
267
- default:
268
- qemu_log_mask(LOG_GUEST_ERROR,
269
- "do_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
270
- }
271
-}
272
-
273
static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
274
{
275
switch (reg) {
276
@@ -XXX,XX +XXX,XX @@ static void do_mac_write(lan9118_state *s, int reg, uint32_t val)
277
if (val & 2) {
278
DPRINTF("PHY write %d = 0x%04x\n",
279
(val >> 6) & 0x1f, s->mac_mii_data);
280
- do_phy_write(s, (val >> 6) & 0x1f, s->mac_mii_data);
281
+ lan9118_phy_write(&s->mii, (val >> 6) & 0x1f, s->mac_mii_data);
282
} else {
283
- s->mac_mii_data = do_phy_read(s, (val >> 6) & 0x1f);
284
+ s->mac_mii_data = lan9118_phy_read(&s->mii, (val >> 6) & 0x1f);
285
DPRINTF("PHY read %d = 0x%04x\n",
286
(val >> 6) & 0x1f, s->mac_mii_data);
287
}
288
@@ -XXX,XX +XXX,XX @@ static void lan9118_writel(void *opaque, hwaddr offset,
289
break;
290
case CSR_PMT_CTRL:
291
if (val & 0x400) {
292
- phy_reset(s);
293
+ lan9118_phy_reset(&s->mii);
294
}
295
s->pmt_ctrl &= ~0x34e;
296
s->pmt_ctrl |= (val & 0x34e);
297
@@ -XXX,XX +XXX,XX @@ static void lan9118_realize(DeviceState *dev, Error **errp)
298
const MemoryRegionOps *mem_ops =
299
s->mode_16bit ? &lan9118_16bit_mem_ops : &lan9118_mem_ops;
300
301
+ qemu_init_irq(&s->mii_irq, lan9118_update_irq, s, 0);
302
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
303
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
304
+ return;
305
+ }
306
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
307
+
308
memory_region_init_io(&s->mmio, OBJECT(dev), mem_ops, s,
309
"lan9118-mmio", 0x100);
310
sysbus_init_mmio(sbd, &s->mmio);
311
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
109
new file mode 100644
312
new file mode 100644
110
index XXXXXXX..XXXXXXX
313
index XXXXXXX..XXXXXXX
111
--- /dev/null
314
--- /dev/null
112
+++ b/hw/acpi/ghes.c
315
+++ b/hw/net/lan9118_phy.c
113
@@ -XXX,XX +XXX,XX @@
316
@@ -XXX,XX +XXX,XX @@
114
+/*
317
+/*
115
+ * Support for generating APEI tables and recording CPER for Guests
318
+ * SMSC LAN9118 PHY emulation
116
+ *
319
+ *
117
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
320
+ * Copyright (c) 2009 CodeSourcery, LLC.
321
+ * Written by Paul Brook
118
+ *
322
+ *
119
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
323
+ * This code is licensed under the GNU GPL v2
120
+ *
324
+ *
121
+ * This program is free software; you can redistribute it and/or modify
325
+ * Contributions after 2012-01-13 are licensed under the terms of the
122
+ * it under the terms of the GNU General Public License as published by
326
+ * GNU GPL, version 2 or (at your option) any later version.
123
+ * the Free Software Foundation; either version 2 of the License, or
124
+ * (at your option) any later version.
125
+
126
+ * This program is distributed in the hope that it will be useful,
127
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
128
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
129
+ * GNU General Public License for more details.
130
+
131
+ * You should have received a copy of the GNU General Public License along
132
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
133
+ */
327
+ */
134
+
328
+
135
+#include "qemu/osdep.h"
329
+#include "qemu/osdep.h"
136
+#include "qemu/units.h"
330
+#include "hw/net/lan9118_phy.h"
137
+#include "hw/acpi/ghes.h"
331
+#include "hw/irq.h"
138
+#include "hw/acpi/aml-build.h"
332
+#include "hw/resettable.h"
139
+
333
+#include "migration/vmstate.h"
140
+#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
334
+#include "qemu/log.h"
141
+#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
335
+
142
+
336
+#define PHY_INT_ENERGYON (1 << 7)
143
+/* The max size in bytes for one error block */
337
+#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
144
+#define ACPI_GHES_MAX_RAW_DATA_LENGTH (1 * KiB)
338
+#define PHY_INT_FAULT (1 << 5)
145
+
339
+#define PHY_INT_DOWN (1 << 4)
146
+/* Now only support ARMv8 SEA notification type error source */
340
+#define PHY_INT_AUTONEG_LP (1 << 3)
147
+#define ACPI_GHES_ERROR_SOURCE_COUNT 1
341
+#define PHY_INT_PARFAULT (1 << 2)
148
+
342
+#define PHY_INT_AUTONEG_PAGE (1 << 1)
149
+/*
343
+
150
+ * Build table for the hardware error fw_cfg blob.
344
+static void lan9118_phy_update_irq(Lan9118PhyState *s)
151
+ * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
345
+{
152
+ * See docs/specs/acpi_hest_ghes.rst for blobs format.
346
+ qemu_set_irq(s->irq, !!(s->ints & s->int_mask));
153
+ */
347
+}
154
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
348
+
155
+{
349
+uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
156
+ int i, error_status_block_offset;
350
+{
157
+
351
+ uint16_t val;
158
+ /* Build error_block_address */
352
+
159
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
353
+ switch (reg) {
160
+ build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
354
+ case 0: /* Basic Control */
355
+ return s->control;
356
+ case 1: /* Basic Status */
357
+ return s->status;
358
+ case 2: /* ID1 */
359
+ return 0x0007;
360
+ case 3: /* ID2 */
361
+ return 0xc0d1;
362
+ case 4: /* Auto-neg advertisement */
363
+ return s->advertise;
364
+ case 5: /* Auto-neg Link Partner Ability */
365
+ return 0x0f71;
366
+ case 6: /* Auto-neg Expansion */
367
+ return 1;
368
+ /* TODO 17, 18, 27, 29, 30, 31 */
369
+ case 29: /* Interrupt source. */
370
+ val = s->ints;
371
+ s->ints = 0;
372
+ lan9118_phy_update_irq(s);
373
+ return val;
374
+ case 30: /* Interrupt mask */
375
+ return s->int_mask;
376
+ default:
377
+ qemu_log_mask(LOG_GUEST_ERROR,
378
+ "lan9118_phy_read: PHY read reg %d\n", reg);
379
+ return 0;
161
+ }
380
+ }
162
+
381
+}
163
+ /* Build read_ack_register */
382
+
164
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
383
+void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
165
+ /*
384
+{
166
+ * Initialize the value of read_ack_register to 1, so GHES can be
385
+ switch (reg) {
167
+ * writeable after (re)boot.
386
+ case 0: /* Basic Control */
168
+ * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
387
+ if (val & 0x8000) {
169
+ * (GHESv2 - Type 10)
388
+ lan9118_phy_reset(s);
170
+ */
389
+ break;
171
+ build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
390
+ }
391
+ s->control = val & 0x7980;
392
+ /* Complete autonegotiation immediately. */
393
+ if (val & 0x1000) {
394
+ s->status |= 0x0020;
395
+ }
396
+ break;
397
+ case 4: /* Auto-neg advertisement */
398
+ s->advertise = (val & 0x2d7f) | 0x80;
399
+ break;
400
+ /* TODO 17, 18, 27, 31 */
401
+ case 30: /* Interrupt mask */
402
+ s->int_mask = val & 0xff;
403
+ lan9118_phy_update_irq(s);
404
+ break;
405
+ default:
406
+ qemu_log_mask(LOG_GUEST_ERROR,
407
+ "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
172
+ }
408
+ }
173
+
409
+}
174
+ /* Generic Error Status Block offset in the hardware error fw_cfg blob */
410
+
175
+ error_status_block_offset = hardware_errors->len;
411
+void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
176
+
412
+{
177
+ /* Reserve space for Error Status Data Block */
413
+ s->link_down = link_down;
178
+ acpi_data_push(hardware_errors,
414
+
179
+ ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
415
+ /* Autonegotiation status mirrors link status. */
180
+
416
+ if (link_down) {
181
+ /* Tell guest firmware to place hardware_errors blob into RAM */
417
+ s->status &= ~0x0024;
182
+ bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
418
+ s->ints |= PHY_INT_DOWN;
183
+ hardware_errors, sizeof(uint64_t), false);
419
+ } else {
184
+
420
+ s->status |= 0x0024;
185
+ for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
421
+ s->ints |= PHY_INT_ENERGYON;
186
+ /*
422
+ s->ints |= PHY_INT_AUTONEG_COMPLETE;
187
+ * Tell firmware to patch error_block_address entries to point to
188
+ * corresponding "Generic Error Status Block"
189
+ */
190
+ bios_linker_loader_add_pointer(linker,
191
+ ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
192
+ sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
193
+ error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
194
+ }
423
+ }
195
+
424
+ lan9118_phy_update_irq(s);
196
+ /*
425
+}
197
+ * tell firmware to write hardware_errors GPA into
426
+
198
+ * hardware_errors_addr fw_cfg, once the former has been initialized.
427
+void lan9118_phy_reset(Lan9118PhyState *s)
199
+ */
428
+{
200
+ bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
429
+ s->control = 0x3000;
201
+ 0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
430
+ s->status = 0x7809;
202
+}
431
+ s->advertise = 0x01e1;
203
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
432
+ s->int_mask = 0;
433
+ s->ints = 0;
434
+ lan9118_phy_update_link(s, s->link_down);
435
+}
436
+
437
+static void lan9118_phy_reset_hold(Object *obj, ResetType type)
438
+{
439
+ Lan9118PhyState *s = LAN9118_PHY(obj);
440
+
441
+ lan9118_phy_reset(s);
442
+}
443
+
444
+static void lan9118_phy_init(Object *obj)
445
+{
446
+ Lan9118PhyState *s = LAN9118_PHY(obj);
447
+
448
+ qdev_init_gpio_out(DEVICE(s), &s->irq, 1);
449
+}
450
+
451
+static const VMStateDescription vmstate_lan9118_phy = {
452
+ .name = "lan9118-phy",
453
+ .version_id = 1,
454
+ .minimum_version_id = 1,
455
+ .fields = (const VMStateField[]) {
456
+ VMSTATE_UINT16(control, Lan9118PhyState),
457
+ VMSTATE_UINT16(status, Lan9118PhyState),
458
+ VMSTATE_UINT16(advertise, Lan9118PhyState),
459
+ VMSTATE_UINT16(ints, Lan9118PhyState),
460
+ VMSTATE_UINT16(int_mask, Lan9118PhyState),
461
+ VMSTATE_BOOL(link_down, Lan9118PhyState),
462
+ VMSTATE_END_OF_LIST()
463
+ }
464
+};
465
+
466
+static void lan9118_phy_class_init(ObjectClass *klass, void *data)
467
+{
468
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
469
+ DeviceClass *dc = DEVICE_CLASS(klass);
470
+
471
+ rc->phases.hold = lan9118_phy_reset_hold;
472
+ dc->vmsd = &vmstate_lan9118_phy;
473
+}
474
+
475
+static const TypeInfo types[] = {
476
+ {
477
+ .name = TYPE_LAN9118_PHY,
478
+ .parent = TYPE_SYS_BUS_DEVICE,
479
+ .instance_size = sizeof(Lan9118PhyState),
480
+ .instance_init = lan9118_phy_init,
481
+ .class_init = lan9118_phy_class_init,
482
+ }
483
+};
484
+
485
+DEFINE_TYPES(types)
486
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
204
index XXXXXXX..XXXXXXX 100644
487
index XXXXXXX..XXXXXXX 100644
205
--- a/hw/arm/virt-acpi-build.c
488
--- a/hw/net/Kconfig
206
+++ b/hw/arm/virt-acpi-build.c
489
+++ b/hw/net/Kconfig
207
@@ -XXX,XX +XXX,XX @@
490
@@ -XXX,XX +XXX,XX @@ config VMXNET3_PCI
208
#include "sysemu/reset.h"
491
config SMC91C111
209
#include "kvm_arm.h"
492
bool
210
#include "migration/vmstate.h"
493
211
+#include "hw/acpi/ghes.h"
494
+config LAN9118_PHY
212
495
+ bool
213
#define ARM_SPI_BASE 32
496
+
214
497
config LAN9118
215
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
498
bool
216
acpi_add_table(table_offsets, tables_blob);
499
+ select LAN9118_PHY
217
build_spcr(tables_blob, tables->linker, vms);
500
select PTIMER
218
501
219
+ if (vms->ras) {
502
config NE2000_ISA
220
+ build_ghes_error_table(tables->hardware_errors, tables->linker);
503
diff --git a/hw/net/meson.build b/hw/net/meson.build
221
+ }
222
+
223
if (ms->numa_state->num_nodes > 0) {
224
acpi_add_table(table_offsets, tables_blob);
225
build_srat(tables_blob, tables->linker, vms);
226
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
227
index XXXXXXX..XXXXXXX 100644
504
index XXXXXXX..XXXXXXX 100644
228
--- a/hw/acpi/Kconfig
505
--- a/hw/net/meson.build
229
+++ b/hw/acpi/Kconfig
506
+++ b/hw/net/meson.build
230
@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
507
@@ -XXX,XX +XXX,XX @@ system_ss.add(when: 'CONFIG_VMXNET3_PCI', if_true: files('vmxnet3.c'))
231
bool
508
232
depends on ACPI
509
system_ss.add(when: 'CONFIG_SMC91C111', if_true: files('smc91c111.c'))
233
510
system_ss.add(when: 'CONFIG_LAN9118', if_true: files('lan9118.c'))
234
+config ACPI_APEI
511
+system_ss.add(when: 'CONFIG_LAN9118_PHY', if_true: files('lan9118_phy.c'))
235
+ bool
512
system_ss.add(when: 'CONFIG_NE2000_ISA', if_true: files('ne2000-isa.c'))
236
+ depends on ACPI
513
system_ss.add(when: 'CONFIG_OPENCORES_ETH', if_true: files('opencores_eth.c'))
237
+
514
system_ss.add(when: 'CONFIG_XGMAC', if_true: files('xgmac.c'))
238
config ACPI_PCI
239
bool
240
depends on ACPI && PCI
241
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
242
index XXXXXXX..XXXXXXX 100644
243
--- a/hw/acpi/Makefile.objs
244
+++ b/hw/acpi/Makefile.objs
245
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
246
common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
247
common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
248
common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
249
+common-obj-$(CONFIG_ACPI_APEI) += ghes.o
250
common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
251
common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
252
253
--
515
--
254
2.20.1
516
2.34.1
255
256
From: Bernhard Beschow <shentey@gmail.com>

imx_fec models the same PHY as lan9118_phy. The code is almost the same with
imx_fec having more logging and tracing. Merge these improvements into
lan9118_phy and reuse in imx_fec to fix the code duplication.

Some migration state now resides in the new device model which breaks migration
compatibility for the following machines:
* imx25-pdk
* sabrelite
* mcimx7d-sabre
* mcimx6ul-evk

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241102125724.532843-3-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/imx_fec.h | 9 ++-
 hw/net/imx_fec.c | 146 ++++-----------------------
 hw/net/lan9118_phy.c | 82 ++++++++++++++------
 hw/net/Kconfig | 1 +
 hw/net/trace-events | 10 +--
 5 files changed, 85 insertions(+), 163 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
27
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
20
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/translate.h
29
--- a/include/hw/net/imx_fec.h
22
+++ b/target/arm/translate.h
30
+++ b/include/hw/net/imx_fec.h
23
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
31
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(IMXFECState, IMX_FEC)
24
uint64_t vfp_expand_imm(int size, uint8_t imm8);
32
#define TYPE_IMX_ENET "imx.enet"
25
33
26
/* Vector operations shared between ARM and AArch64. */
34
#include "hw/sysbus.h"
27
-extern const GVecGen2 ceq0_op[4];
35
+#include "hw/net/lan9118_phy.h"
28
-extern const GVecGen2 clt0_op[4];
36
+#include "hw/irq.h"
29
-extern const GVecGen2 cgt0_op[4];
37
#include "net/net.h"
30
-extern const GVecGen2 cle0_op[4];
38
31
-extern const GVecGen2 cge0_op[4];
39
#define ENET_EIR 1
32
+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
40
@@ -XXX,XX +XXX,XX @@ struct IMXFECState {
33
+ uint32_t opr_sz, uint32_t max_sz);
41
uint32_t tx_descriptor[ENET_TX_RING_NUM];
34
+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
42
uint32_t tx_ring_num;
35
+ uint32_t opr_sz, uint32_t max_sz);
43
36
+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
44
- uint32_t phy_status;
37
+ uint32_t opr_sz, uint32_t max_sz);
45
- uint32_t phy_control;
38
+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
46
- uint32_t phy_advertise;
39
+ uint32_t opr_sz, uint32_t max_sz);
47
- uint32_t phy_int;
40
+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
48
- uint32_t phy_int_mask;
41
+ uint32_t opr_sz, uint32_t max_sz);
49
+ Lan9118PhyState mii;
42
+
50
+ IRQState mii_irq;
43
extern const GVecGen3 mla_op[4];
51
uint32_t phy_num;
44
extern const GVecGen3 mls_op[4];
52
bool phy_connected;
45
extern const GVecGen3 cmtst_op[4];
53
struct IMXFECState *phy_consumer;
46
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
54
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
47
index XXXXXXX..XXXXXXX 100644
55
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/translate-a64.c
56
--- a/hw/net/imx_fec.c
49
+++ b/target/arm/translate-a64.c
57
+++ b/hw/net/imx_fec.c
50
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
58
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth_txdescs = {
51
is_q ? 16 : 8, vec_full_reg_size(s));
59
52
}
60
static const VMStateDescription vmstate_imx_eth = {
53
61
.name = TYPE_IMX_FEC,
54
-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
62
- .version_id = 2,
55
-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
63
- .minimum_version_id = 2,
56
- int rn, const GVecGen2 *gvec_op)
64
+ .version_id = 3,
65
+ .minimum_version_id = 3,
66
.fields = (const VMStateField[]) {
67
VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
68
VMSTATE_UINT32(rx_descriptor, IMXFECState),
69
VMSTATE_UINT32(tx_descriptor[0], IMXFECState),
70
- VMSTATE_UINT32(phy_status, IMXFECState),
71
- VMSTATE_UINT32(phy_control, IMXFECState),
72
- VMSTATE_UINT32(phy_advertise, IMXFECState),
73
- VMSTATE_UINT32(phy_int, IMXFECState),
74
- VMSTATE_UINT32(phy_int_mask, IMXFECState),
75
VMSTATE_END_OF_LIST()
76
},
77
.subsections = (const VMStateDescription * const []) {
78
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_imx_eth = {
79
},
80
};
81
82
-#define PHY_INT_ENERGYON (1 << 7)
83
-#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
84
-#define PHY_INT_FAULT (1 << 5)
85
-#define PHY_INT_DOWN (1 << 4)
86
-#define PHY_INT_AUTONEG_LP (1 << 3)
87
-#define PHY_INT_PARFAULT (1 << 2)
88
-#define PHY_INT_AUTONEG_PAGE (1 << 1)
89
-
90
static void imx_eth_update(IMXFECState *s);
91
92
/*
93
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
94
* For now we don't handle any GPIO/interrupt line, so the OS will
95
* have to poll for the PHY status.
96
*/
97
-static void imx_phy_update_irq(IMXFECState *s)
98
+static void imx_phy_update_irq(void *opaque, int n, int level)
99
{
100
- imx_eth_update(s);
101
-}
102
-
103
-static void imx_phy_update_link(IMXFECState *s)
57
-{
104
-{
58
- tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
105
- /* Autonegotiation status mirrors link status. */
59
- is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
106
- if (qemu_get_queue(s->nic)->link_down) {
107
- trace_imx_phy_update_link("down");
108
- s->phy_status &= ~0x0024;
109
- s->phy_int |= PHY_INT_DOWN;
110
- } else {
111
- trace_imx_phy_update_link("up");
112
- s->phy_status |= 0x0024;
113
- s->phy_int |= PHY_INT_ENERGYON;
114
- s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
115
- }
116
- imx_phy_update_irq(s);
117
+ imx_eth_update(opaque);
118
}
119
120
static void imx_eth_set_link(NetClientState *nc)
121
{
122
- imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
60
-}
123
-}
61
-
124
-
62
/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
125
-static void imx_phy_reset(IMXFECState *s)
63
static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
126
-{
64
int rn, int rm, const GVecGen3 *gvec_op)
127
- trace_imx_phy_reset();
65
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
128
-
129
- s->phy_status = 0x7809;
130
- s->phy_control = 0x3000;
131
- s->phy_advertise = 0x01e1;
132
- s->phy_int_mask = 0;
133
- s->phy_int = 0;
134
- imx_phy_update_link(s);
135
+ lan9118_phy_update_link(&IMX_FEC(qemu_get_nic_opaque(nc))->mii,
136
+ nc->link_down);
137
}
138
139
static uint32_t imx_phy_read(IMXFECState *s, int reg)
140
{
141
- uint32_t val;
142
uint32_t phy = reg / 32;
143
144
if (!s->phy_connected) {
145
@@ -XXX,XX +XXX,XX @@ static uint32_t imx_phy_read(IMXFECState *s, int reg)
146
147
reg %= 32;
148
149
- switch (reg) {
150
- case 0: /* Basic Control */
151
- val = s->phy_control;
152
- break;
153
- case 1: /* Basic Status */
154
- val = s->phy_status;
155
- break;
156
- case 2: /* ID1 */
157
- val = 0x0007;
158
- break;
159
- case 3: /* ID2 */
160
- val = 0xc0d1;
161
- break;
162
- case 4: /* Auto-neg advertisement */
163
- val = s->phy_advertise;
164
- break;
165
- case 5: /* Auto-neg Link Partner Ability */
166
- val = 0x0f71;
167
- break;
168
- case 6: /* Auto-neg Expansion */
169
- val = 1;
170
- break;
171
- case 29: /* Interrupt source. */
172
- val = s->phy_int;
173
- s->phy_int = 0;
174
- imx_phy_update_irq(s);
175
- break;
176
- case 30: /* Interrupt mask */
177
- val = s->phy_int_mask;
178
- break;
179
- case 17:
180
- case 18:
181
- case 27:
182
- case 31:
183
- qemu_log_mask(LOG_UNIMP, "[%s.phy]%s: reg %d not implemented\n",
184
- TYPE_IMX_FEC, __func__, reg);
185
- val = 0;
186
- break;
187
- default:
188
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
189
- TYPE_IMX_FEC, __func__, reg);
190
- val = 0;
191
- break;
192
- }
193
-
194
- trace_imx_phy_read(val, phy, reg);
195
-
196
- return val;
197
+ return lan9118_phy_read(&s->mii, reg);
198
}
199
200
static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
201
@@ -XXX,XX +XXX,XX @@ static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
202
203
reg %= 32;
204
205
- trace_imx_phy_write(val, phy, reg);
206
-
207
- switch (reg) {
208
- case 0: /* Basic Control */
209
- if (val & 0x8000) {
210
- imx_phy_reset(s);
211
- } else {
212
- s->phy_control = val & 0x7980;
213
- /* Complete autonegotiation immediately. */
214
- if (val & 0x1000) {
215
- s->phy_status |= 0x0020;
216
- }
217
- }
218
- break;
219
- case 4: /* Auto-neg advertisement */
220
- s->phy_advertise = (val & 0x2d7f) | 0x80;
221
- break;
222
- case 30: /* Interrupt mask */
223
- s->phy_int_mask = val & 0xff;
224
- imx_phy_update_irq(s);
225
- break;
226
- case 17:
227
- case 18:
228
- case 27:
229
- case 31:
230
- qemu_log_mask(LOG_UNIMP, "[%s.phy)%s: reg %d not implemented\n",
231
- TYPE_IMX_FEC, __func__, reg);
232
- break;
233
- default:
234
- qemu_log_mask(LOG_GUEST_ERROR, "[%s.phy]%s: Bad address at offset %d\n",
235
- TYPE_IMX_FEC, __func__, reg);
236
- break;
237
- }
238
+ lan9118_phy_write(&s->mii, reg, val);
239
}
240
241
static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
242
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
243
244
s->rx_descriptor = 0;
245
memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
246
-
247
- /* We also reset the PHY */
248
- imx_phy_reset(s);
249
}
250
251
static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
252
@@ -XXX,XX +XXX,XX @@ static void imx_eth_realize(DeviceState *dev, Error **errp)
253
sysbus_init_irq(sbd, &s->irq[0]);
254
sysbus_init_irq(sbd, &s->irq[1]);
255
256
+ qemu_init_irq(&s->mii_irq, imx_phy_update_irq, s, 0);
257
+ object_initialize_child(OBJECT(s), "mii", &s->mii, TYPE_LAN9118_PHY);
258
+ if (!sysbus_realize_and_unref(SYS_BUS_DEVICE(&s->mii), errp)) {
259
+ return;
260
+ }
261
+ qdev_connect_gpio_out(DEVICE(&s->mii), 0, &s->mii_irq);
262
+
263
qemu_macaddr_default_if_unset(&s->conf.macaddr);
264
265
s->nic = qemu_new_nic(&imx_eth_net_info, &s->conf,
266
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
267
index XXXXXXX..XXXXXXX 100644
268
--- a/hw/net/lan9118_phy.c
269
+++ b/hw/net/lan9118_phy.c
270
@@ -XXX,XX +XXX,XX @@
271
* Copyright (c) 2009 CodeSourcery, LLC.
272
* Written by Paul Brook
273
*
274
+ * Copyright (c) 2013 Jean-Christophe Dubois. <jcd@tribudubois.net>
275
+ *
276
* This code is licensed under the GNU GPL v2
277
*
278
* Contributions after 2012-01-13 are licensed under the terms of the
279
@@ -XXX,XX +XXX,XX @@
280
#include "hw/resettable.h"
281
#include "migration/vmstate.h"
282
#include "qemu/log.h"
283
+#include "trace.h"
284
285
#define PHY_INT_ENERGYON (1 << 7)
286
#define PHY_INT_AUTONEG_COMPLETE (1 << 6)
287
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
288
289
switch (reg) {
290
case 0: /* Basic Control */
291
- return s->control;
292
+ val = s->control;
293
+ break;
294
case 1: /* Basic Status */
295
- return s->status;
296
+ val = s->status;
297
+ break;
298
case 2: /* ID1 */
299
- return 0x0007;
300
+ val = 0x0007;
301
+ break;
302
case 3: /* ID2 */
303
- return 0xc0d1;
304
+ val = 0xc0d1;
305
+ break;
306
case 4: /* Auto-neg advertisement */
307
- return s->advertise;
308
+ val = s->advertise;
309
+ break;
310
case 5: /* Auto-neg Link Partner Ability */
311
- return 0x0f71;
312
+ val = 0x0f71;
313
+ break;
314
case 6: /* Auto-neg Expansion */
315
- return 1;
316
- /* TODO 17, 18, 27, 29, 30, 31 */
317
+ val = 1;
318
+ break;
319
case 29: /* Interrupt source. */
320
val = s->ints;
321
s->ints = 0;
322
lan9118_phy_update_irq(s);
323
- return val;
324
+ break;
325
case 30: /* Interrupt mask */
326
- return s->int_mask;
327
+ val = s->int_mask;
328
+ break;
329
+ case 17:
330
+ case 18:
331
+ case 27:
332
+ case 31:
333
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
334
+ __func__, reg);
335
+ val = 0;
336
+ break;
337
default:
338
- qemu_log_mask(LOG_GUEST_ERROR,
339
- "lan9118_phy_read: PHY read reg %d\n", reg);
340
- return 0;
341
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
342
+ __func__, reg);
343
+ val = 0;
344
+ break;
345
}
346
+
347
+ trace_lan9118_phy_read(val, reg);
348
+
349
+ return val;
350
}
351
352
void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
353
{
354
+ trace_lan9118_phy_write(val, reg);
355
+
356
switch (reg) {
357
case 0: /* Basic Control */
358
if (val & 0x8000) {
359
lan9118_phy_reset(s);
360
- break;
361
- }
362
- s->control = val & 0x7980;
363
- /* Complete autonegotiation immediately. */
364
- if (val & 0x1000) {
365
- s->status |= 0x0020;
366
+ } else {
367
+ s->control = val & 0x7980;
368
+ /* Complete autonegotiation immediately. */
369
+ if (val & 0x1000) {
370
+ s->status |= 0x0020;
371
+ }
66
}
372
}
67
break;
373
break;
68
case 0x8: /* CMGT, CMGE */
374
case 4: /* Auto-neg advertisement */
69
- gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
375
s->advertise = (val & 0x2d7f) | 0x80;
70
+ if (u) {
376
break;
71
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
377
- /* TODO 17, 18, 27, 31 */
72
+ } else {
378
case 30: /* Interrupt mask */
73
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
379
s->int_mask = val & 0xff;
74
+ }
380
lan9118_phy_update_irq(s);
75
return;
381
break;
76
case 0x9: /* CMEQ, CMLE */
382
+ case 17:
77
- gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
383
+ case 18:
78
+ if (u) {
384
+ case 27:
79
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
385
+ case 31:
80
+ } else {
386
+ qemu_log_mask(LOG_UNIMP, "%s: reg %d not implemented\n",
81
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
387
+ __func__, reg);
82
+ }
388
+ break;
83
return;
389
default:
84
case 0xa: /* CMLT */
390
- qemu_log_mask(LOG_GUEST_ERROR,
85
- gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
391
- "lan9118_phy_write: PHY write reg %d = 0x%04x\n", reg, val);
86
+ gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
392
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad address at offset %d\n",
87
return;
393
+ __func__, reg);
88
case 0xb:
394
+ break;
89
if (u) { /* ABS, NEG */
395
}
90
diff --git a/target/arm/translate.c b/target/arm/translate.c
396
}
397
398
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
399
400
/* Autonegotiation status mirrors link status. */
401
if (link_down) {
402
+ trace_lan9118_phy_update_link("down");
403
s->status &= ~0x0024;
404
s->ints |= PHY_INT_DOWN;
405
} else {
406
+ trace_lan9118_phy_update_link("up");
407
s->status |= 0x0024;
408
s->ints |= PHY_INT_ENERGYON;
409
s->ints |= PHY_INT_AUTONEG_COMPLETE;
410
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
411
412
void lan9118_phy_reset(Lan9118PhyState *s)
413
{
414
+ trace_lan9118_phy_reset();
415
+
416
s->control = 0x3000;
417
s->status = 0x7809;
418
s->advertise = 0x01e1;
419
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_phy = {
420
.version_id = 1,
421
.minimum_version_id = 1,
422
.fields = (const VMStateField[]) {
423
- VMSTATE_UINT16(control, Lan9118PhyState),
424
VMSTATE_UINT16(status, Lan9118PhyState),
425
+ VMSTATE_UINT16(control, Lan9118PhyState),
426
VMSTATE_UINT16(advertise, Lan9118PhyState),
427
VMSTATE_UINT16(ints, Lan9118PhyState),
428
VMSTATE_UINT16(int_mask, Lan9118PhyState),
429
diff --git a/hw/net/Kconfig b/hw/net/Kconfig
91
index XXXXXXX..XXXXXXX 100644
430
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/translate.c
431
--- a/hw/net/Kconfig
93
+++ b/target/arm/translate.c
432
+++ b/hw/net/Kconfig
94
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
433
@@ -XXX,XX +XXX,XX @@ config ALLWINNER_SUN8I_EMAC
95
return 1;
434
96
}
435
config IMX_FEC
97
436
bool
98
-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
437
+ select LAN9118_PHY
99
-{
438
100
- tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
439
config CADENCE
101
- tcg_gen_neg_i32(d, d);
440
bool
102
-}
441
diff --git a/hw/net/trace-events b/hw/net/trace-events
103
-
442
index XXXXXXX..XXXXXXX 100644
104
-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
443
--- a/hw/net/trace-events
105
-{
444
+++ b/hw/net/trace-events
106
- tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
445
@@ -XXX,XX +XXX,XX @@ allwinner_sun8i_emac_set_link(bool active) "Set link: active=%u"
107
- tcg_gen_neg_i64(d, d);
446
allwinner_sun8i_emac_read(uint64_t offset, uint64_t val) "MMIO read: offset=0x%" PRIx64 " value=0x%" PRIx64
108
-}
447
allwinner_sun8i_emac_write(uint64_t offset, uint64_t val) "MMIO write: offset=0x%" PRIx64 " value=0x%" PRIx64
109
-
448
110
-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
449
+# lan9118_phy.c
111
-{
450
+lan9118_phy_read(uint16_t val, int reg) "[0x%02x] -> 0x%04" PRIx16
112
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
451
+lan9118_phy_write(uint16_t val, int reg) "[0x%02x] <- 0x%04" PRIx16
113
- tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
452
+lan9118_phy_update_link(const char *s) "%s"
114
- tcg_temp_free_vec(zero);
453
+lan9118_phy_reset(void) ""
115
-}
454
+
116
+#define GEN_CMP0(NAME, COND) \
455
# lance.c
117
+ static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \
456
lance_mem_readw(uint64_t addr, uint32_t ret) "addr=0x%"PRIx64"val=0x%04x"
118
+ { \
457
lance_mem_writew(uint64_t addr, uint32_t val) "addr=0x%"PRIx64"val=0x%04x"
119
+ tcg_gen_setcondi_i32(COND, d, a, 0); \
458
@@ -XXX,XX +XXX,XX @@ i82596_set_multicast(uint16_t count) "Added %d multicast entries"
120
+ tcg_gen_neg_i32(d, d); \
459
i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
121
+ } \
460
122
+ static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \
461
# imx_fec.c
123
+ { \
462
-imx_phy_read(uint32_t val, int phy, int reg) "0x%04"PRIx32" <= phy[%d].reg[%d]"
124
+ tcg_gen_setcondi_i64(COND, d, a, 0); \
463
imx_phy_read_num(int phy, int configured) "read request from unconfigured phy %d (configured %d)"
125
+ tcg_gen_neg_i64(d, d); \
464
-imx_phy_write(uint32_t val, int phy, int reg) "0x%04"PRIx32" => phy[%d].reg[%d]"
126
+ } \
465
imx_phy_write_num(int phy, int configured) "write request to unconfigured phy %d (configured %d)"
127
+ static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
466
-imx_phy_update_link(const char *s) "%s"
128
+ { \
467
-imx_phy_reset(void) ""
129
+ TCGv_vec zero = tcg_const_zeros_vec_matching(d); \
468
imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
130
+ tcg_gen_cmp_vec(COND, vece, d, a, zero); \
469
imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
131
+ tcg_temp_free_vec(zero); \
470
imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
132
+ } \
133
+ void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \
134
+ uint32_t opr_sz, uint32_t max_sz) \
135
+ { \
136
+ const GVecGen2 op[4] = { \
137
+ { .fno = gen_helper_gvec_##NAME##0_b, \
138
+ .fniv = gen_##NAME##0_vec, \
139
+ .opt_opc = vecop_list_cmp, \
140
+ .vece = MO_8 }, \
141
+ { .fno = gen_helper_gvec_##NAME##0_h, \
142
+ .fniv = gen_##NAME##0_vec, \
143
+ .opt_opc = vecop_list_cmp, \
144
+ .vece = MO_16 }, \
145
+ { .fni4 = gen_##NAME##0_i32, \
146
+ .fniv = gen_##NAME##0_vec, \
147
+ .opt_opc = vecop_list_cmp, \
148
+ .vece = MO_32 }, \
149
+ { .fni8 = gen_##NAME##0_i64, \
150
+ .fniv = gen_##NAME##0_vec, \
151
+ .opt_opc = vecop_list_cmp, \
152
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64, \
153
+ .vece = MO_64 }, \
154
+ }; \
155
+ tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \
156
+ }
157
158
static const TCGOpcode vecop_list_cmp[] = {
159
INDEX_op_cmp_vec, 0
160
};
161
162
-const GVecGen2 ceq0_op[4] = {
163
- { .fno = gen_helper_gvec_ceq0_b,
164
- .fniv = gen_ceq0_vec,
165
- .opt_opc = vecop_list_cmp,
166
- .vece = MO_8 },
167
- { .fno = gen_helper_gvec_ceq0_h,
168
- .fniv = gen_ceq0_vec,
169
- .opt_opc = vecop_list_cmp,
170
- .vece = MO_16 },
171
- { .fni4 = gen_ceq0_i32,
172
- .fniv = gen_ceq0_vec,
173
- .opt_opc = vecop_list_cmp,
174
- .vece = MO_32 },
175
- { .fni8 = gen_ceq0_i64,
176
- .fniv = gen_ceq0_vec,
177
- .opt_opc = vecop_list_cmp,
178
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
179
- .vece = MO_64 },
180
-};
181
+GEN_CMP0(ceq, TCG_COND_EQ)
182
+GEN_CMP0(cle, TCG_COND_LE)
183
+GEN_CMP0(cge, TCG_COND_GE)
184
+GEN_CMP0(clt, TCG_COND_LT)
185
+GEN_CMP0(cgt, TCG_COND_GT)
186
187
-static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
188
-{
189
- tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
190
- tcg_gen_neg_i32(d, d);
191
-}
192
-
193
-static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
194
-{
195
- tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
196
- tcg_gen_neg_i64(d, d);
197
-}
198
-
199
-static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
200
-{
201
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
202
- tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
203
- tcg_temp_free_vec(zero);
204
-}
205
-
206
-const GVecGen2 cle0_op[4] = {
207
- { .fno = gen_helper_gvec_cle0_b,
208
- .fniv = gen_cle0_vec,
209
- .opt_opc = vecop_list_cmp,
210
- .vece = MO_8 },
211
- { .fno = gen_helper_gvec_cle0_h,
212
- .fniv = gen_cle0_vec,
213
- .opt_opc = vecop_list_cmp,
214
- .vece = MO_16 },
215
- { .fni4 = gen_cle0_i32,
216
- .fniv = gen_cle0_vec,
217
- .opt_opc = vecop_list_cmp,
218
- .vece = MO_32 },
219
- { .fni8 = gen_cle0_i64,
220
- .fniv = gen_cle0_vec,
221
- .opt_opc = vecop_list_cmp,
222
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
223
- .vece = MO_64 },
224
-};
225
-
226
-static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
227
-{
228
- tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
229
- tcg_gen_neg_i32(d, d);
230
-}
231
-
232
-static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
233
-{
234
- tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
235
- tcg_gen_neg_i64(d, d);
236
-}
237
-
238
-static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
239
-{
240
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
241
- tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
242
- tcg_temp_free_vec(zero);
243
-}
244
-
245
-const GVecGen2 cge0_op[4] = {
246
- { .fno = gen_helper_gvec_cge0_b,
247
- .fniv = gen_cge0_vec,
248
- .opt_opc = vecop_list_cmp,
249
- .vece = MO_8 },
250
- { .fno = gen_helper_gvec_cge0_h,
251
- .fniv = gen_cge0_vec,
252
- .opt_opc = vecop_list_cmp,
253
- .vece = MO_16 },
254
- { .fni4 = gen_cge0_i32,
255
- .fniv = gen_cge0_vec,
256
- .opt_opc = vecop_list_cmp,
257
- .vece = MO_32 },
258
- { .fni8 = gen_cge0_i64,
259
- .fniv = gen_cge0_vec,
260
- .opt_opc = vecop_list_cmp,
261
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
262
- .vece = MO_64 },
263
-};
264
-
265
-static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
266
-{
267
- tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
268
- tcg_gen_neg_i32(d, d);
269
-}
270
-
271
-static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
272
-{
273
- tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
274
- tcg_gen_neg_i64(d, d);
275
-}
276
-
277
-static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
278
-{
279
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
280
- tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
281
- tcg_temp_free_vec(zero);
282
-}
283
-
284
-const GVecGen2 clt0_op[4] = {
285
- { .fno = gen_helper_gvec_clt0_b,
286
- .fniv = gen_clt0_vec,
287
- .opt_opc = vecop_list_cmp,
288
- .vece = MO_8 },
289
- { .fno = gen_helper_gvec_clt0_h,
290
- .fniv = gen_clt0_vec,
291
- .opt_opc = vecop_list_cmp,
292
- .vece = MO_16 },
293
- { .fni4 = gen_clt0_i32,
294
- .fniv = gen_clt0_vec,
295
- .opt_opc = vecop_list_cmp,
296
- .vece = MO_32 },
297
- { .fni8 = gen_clt0_i64,
298
- .fniv = gen_clt0_vec,
299
- .opt_opc = vecop_list_cmp,
300
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
301
- .vece = MO_64 },
302
-};
303
-
304
-static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
305
-{
306
- tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
307
- tcg_gen_neg_i32(d, d);
308
-}
309
-
310
-static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
311
-{
312
- tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
313
- tcg_gen_neg_i64(d, d);
314
-}
315
-
316
-static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
317
-{
318
- TCGv_vec zero = tcg_const_zeros_vec_matching(d);
319
- tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
320
- tcg_temp_free_vec(zero);
321
-}
322
-
323
-const GVecGen2 cgt0_op[4] = {
324
- { .fno = gen_helper_gvec_cgt0_b,
325
- .fniv = gen_cgt0_vec,
326
- .opt_opc = vecop_list_cmp,
327
- .vece = MO_8 },
328
- { .fno = gen_helper_gvec_cgt0_h,
329
- .fniv = gen_cgt0_vec,
330
- .opt_opc = vecop_list_cmp,
331
- .vece = MO_16 },
332
- { .fni4 = gen_cgt0_i32,
333
- .fniv = gen_cgt0_vec,
334
- .opt_opc = vecop_list_cmp,
335
- .vece = MO_32 },
336
- { .fni8 = gen_cgt0_i64,
337
- .fniv = gen_cgt0_vec,
338
- .opt_opc = vecop_list_cmp,
339
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
340
- .vece = MO_64 },
341
-};
342
+#undef GEN_CMP0
343
344
static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
345
{
346
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
347
break;
348
349
case NEON_2RM_VCEQ0:
350
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
351
- vec_size, &ceq0_op[size]);
352
+ gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
353
break;
354
case NEON_2RM_VCGT0:
355
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
356
- vec_size, &cgt0_op[size]);
357
+ gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
358
break;
359
case NEON_2RM_VCLE0:
360
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
361
- vec_size, &cle0_op[size]);
362
+ gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
363
break;
364
case NEON_2RM_VCGE0:
365
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
366
- vec_size, &cge0_op[size]);
367
+ gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
368
break;
369
case NEON_2RM_VCLT0:
370
- tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
371
- vec_size, &clt0_op[size]);
372
+ gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
373
break;
374
375
default:
376
--
471
--
377
2.20.1
472
2.34.1
378
379
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
Record the GHEB address via a fw_cfg file; when recording
3
Turns 0x70 into 0xe0 (== 0x70 << 1) which adds the missing MII_ANLPAR_TX and
4
an error to CPER, it will use this address to find the
4
fixes the MSB of the selector field to be zero, as specified in the datasheet.
5
Generic Error Data Entries and write the error.
6
5
7
In order to avoid migration failure, make the hardware
6
Fixes: 2a424990170b "LAN9118 emulation"
8
error table address part of the GED device instead
7
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
9
of a global variable, so that this address will be migrated
8
Tested-by: Guenter Roeck <linux@roeck-us.net>
10
to the target QEMU.
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
10
Message-id: 20241102125724.532843-4-shentey@gmail.com
12
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
13
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
14
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
15
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
16
Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
---
12
---
19
include/hw/acpi/generic_event_device.h | 2 ++
13
hw/net/lan9118_phy.c | 2 +-
20
include/hw/acpi/ghes.h | 6 ++++++
14
1 file changed, 1 insertion(+), 1 deletion(-)
21
hw/acpi/generic_event_device.c | 19 +++++++++++++++++++
22
hw/acpi/ghes.c | 14 ++++++++++++++
23
hw/arm/virt-acpi-build.c | 8 ++++++++
24
5 files changed, 49 insertions(+)
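
A quick worked example of the ANLPAR fix described above ("Turns 0x70 into
0xe0"); this is not taken from either patch, and it assumes the usual MII
ANLPAR layout with the selector field in bits [4:0] and 100BASE-TX half
duplex in bit 7:

    /* standalone sketch; build with any C compiler */
    #include <stdio.h>

    int main(void)
    {
        unsigned old_val = 0x0f71; /* selector 0x11: MSB wrongly set, bit 7 clear */
        unsigned new_val = 0x0fe1; /* selector 0x01, bit 7 (100BASE-TX HD) now set */

        printf("old: selector=0x%02x bit7=%u\n", old_val & 0x1f, (old_val >> 7) & 1);
        printf("new: selector=0x%02x bit7=%u\n", new_val & 0x1f, (new_val >> 7) & 1);
        return 0;
    }

This prints selector 0x11 -> 0x01 and bit7 0 -> 1, which is the
"0x70 << 1 == 0xe0" change the commit message describes.
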
25
15
26
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
27
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
28
--- a/include/hw/acpi/generic_event_device.h
18
--- a/hw/net/lan9118_phy.c
29
+++ b/include/hw/acpi/generic_event_device.h
19
+++ b/hw/net/lan9118_phy.c
30
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
31
21
val = s->advertise;
32
#include "hw/sysbus.h"
22
break;
33
#include "hw/acpi/memory_hotplug.h"
23
case 5: /* Auto-neg Link Partner Ability */
34
+#include "hw/acpi/ghes.h"
24
- val = 0x0f71;
35
25
+ val = 0x0fe1;
36
#define ACPI_POWER_BUTTON_DEVICE "PWRB"
26
break;
37
27
case 6: /* Auto-neg Expansion */
38
@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
28
val = 1;
39
GEDState ged_state;
40
uint32_t ged_event_bitmap;
41
qemu_irq irq;
42
+ AcpiGhesState ghes_state;
43
} AcpiGedState;
44
45
void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
46
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/include/hw/acpi/ghes.h
49
+++ b/include/hw/acpi/ghes.h
50
@@ -XXX,XX +XXX,XX @@ enum {
51
ACPI_HEST_SRC_ID_RESERVED,
52
};
53
54
+typedef struct AcpiGhesState {
55
+ uint64_t ghes_addr_le;
56
+} AcpiGhesState;
57
+
58
void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
59
void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
60
+void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
61
+ GArray *hardware_errors);
62
#endif
63
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/hw/acpi/generic_event_device.c
66
+++ b/hw/acpi/generic_event_device.c
67
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
68
}
69
};
70
71
+static bool ghes_needed(void *opaque)
72
+{
73
+ AcpiGedState *s = opaque;
74
+ return s->ghes_state.ghes_addr_le;
75
+}
76
+
77
+static const VMStateDescription vmstate_ghes_state = {
78
+ .name = "acpi-ged/ghes",
79
+ .version_id = 1,
80
+ .minimum_version_id = 1,
81
+ .needed = ghes_needed,
82
+ .fields = (VMStateField[]) {
83
+ VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
84
+ vmstate_ghes_state, AcpiGhesState),
85
+ VMSTATE_END_OF_LIST()
86
+ }
87
+};
88
+
89
static const VMStateDescription vmstate_acpi_ged = {
90
.name = "acpi-ged",
91
.version_id = 1,
92
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
93
},
94
.subsections = (const VMStateDescription * []) {
95
&vmstate_memhp_state,
96
+ &vmstate_ghes_state,
97
NULL
98
}
99
};
100
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/hw/acpi/ghes.c
103
+++ b/hw/acpi/ghes.c
104
@@ -XXX,XX +XXX,XX @@
105
#include "hw/acpi/ghes.h"
106
#include "hw/acpi/aml-build.h"
107
#include "qemu/error-report.h"
108
+#include "hw/acpi/generic_event_device.h"
109
+#include "hw/nvram/fw_cfg.h"
110
111
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
112
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
113
@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
114
build_header(linker, table_data, (void *)(table_data->data + hest_start),
115
"HEST", table_data->len - hest_start, 1, NULL, NULL);
116
}
117
+
118
+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
119
+ GArray *hardware_error)
120
+{
121
+ /* Create a read-only fw_cfg file for GHES */
122
+ fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
123
+ hardware_error->len);
124
+
125
+ /* Create a read-write fw_cfg file for Address */
126
+ fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
127
+ NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
128
+}
129
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
130
index XXXXXXX..XXXXXXX 100644
131
--- a/hw/arm/virt-acpi-build.c
132
+++ b/hw/arm/virt-acpi-build.c
133
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
134
{
135
AcpiBuildTables tables;
136
AcpiBuildState *build_state;
137
+ AcpiGedState *acpi_ged_state;
138
139
if (!vms->fw_cfg) {
140
trace_virt_acpi_setup();
141
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
142
fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
143
acpi_data_len(tables.tcpalog));
144
145
+ if (vms->ras) {
146
+ assert(vms->acpi_dev);
147
+ acpi_ged_state = ACPI_GED(vms->acpi_dev);
148
+ acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
149
+ vms->fw_cfg, tables.hardware_errors);
150
+ }
151
+
152
build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
153
build_state, tables.rsdp,
154
ACPI_BUILD_RSDP_FILE, 0);
155
--
29
--
156
2.20.1
30
2.34.1
157
158
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
The little-endian UUID is used in many places, so turn
3
Prefer named constants over magic values for better readability.
4
NVDIMM_UUID_LE into a common macro that converts the UUID
5
to a little-endian byte array.
6
4
7
Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
8
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
9
Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
7
Tested-by: Guenter Roeck <linux@roeck-us.net>
8
Message-id: 20241102125724.532843-5-shentey@gmail.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
10
---
13
include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
11
include/hw/net/mii.h | 6 +++++
14
hw/acpi/nvdimm.c | 10 +++-------
12
hw/net/lan9118_phy.c | 63 ++++++++++++++++++++++++++++----------------
15
2 files changed, 30 insertions(+), 7 deletions(-)
13
2 files changed, 46 insertions(+), 23 deletions(-)
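
A small standalone illustration (not part of either patch) of the byte order
the UUID_LE macro produces, using the NVDIMM SPA UUID that the nvdimm.c hunk
below passes to it; the macro body is copied from the uuid.h hunk:

    #include <stdio.h>

    #define UUID_LE(time_low, time_mid, time_hi_and_version, \
        clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2, \
        node3, node4, node5) \
        { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
          ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
          (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff, \
          (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2), \
          (node3), (node4), (node5) }

    static const unsigned char spa_uuid[16] =
        UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
                0x18, 0xb7, 0x8c, 0xdb);

    int main(void)
    {
        for (int i = 0; i < 16; i++) {
            printf("%02x ", spa_uuid[i]);
        }
        printf("\n");
        return 0;
    }

The first three fields come out byte-swapped (79 d3 f0 66 f3 b4 74 40 ...)
while the remaining bytes are stored as given, i.e. the little-endian form of
66f0d379-b4f3-4074-ac43-0d3318b78cdb.
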
16
14
17
diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
15
diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
18
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
19
--- a/include/qemu/uuid.h
17
--- a/include/hw/net/mii.h
20
+++ b/include/qemu/uuid.h
18
+++ b/include/hw/net/mii.h
21
@@ -XXX,XX +XXX,XX @@ typedef struct {
19
@@ -XXX,XX +XXX,XX @@
22
};
20
#define MII_BMSR_JABBER (1 << 1) /* Jabber detected */
23
} QemuUUID;
21
#define MII_BMSR_EXTCAP (1 << 0) /* Ext-reg capability */
24
22
25
+/**
23
+#define MII_ANAR_RFAULT (1 << 13) /* Say we can detect faults */
26
+ * UUID_LE - converts the fields of a UUID to a little-endian array,
24
#define MII_ANAR_PAUSE_ASYM (1 << 11) /* Try for asymmetric pause */
27
+ * each parameter is a field of the UUID.
25
#define MII_ANAR_PAUSE (1 << 10) /* Try for pause */
28
+ *
26
#define MII_ANAR_TXFD (1 << 8)
29
+ * @time_low: The low field of the timestamp
27
@@ -XXX,XX +XXX,XX @@
30
+ * @time_mid: The middle field of the timestamp
28
#define MII_ANAR_10FD (1 << 6)
31
+ * @time_hi_and_version: The high field of the timestamp
29
#define MII_ANAR_10 (1 << 5)
32
+ * multiplexed with the version number
30
#define MII_ANAR_CSMACD (1 << 0)
33
+ * @clock_seq_hi_and_reserved: The high field of the clock
31
+#define MII_ANAR_SELECT (0x001f) /* Selector bits */
34
+ * sequence multiplexed with the variant
32
35
+ * @clock_seq_low: The low field of the clock sequence
33
#define MII_ANLPAR_ACK (1 << 14)
36
+ * @node0: The spatially unique node0 identifier
34
#define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
37
+ * @node1: The spatially unique node1 identifier
35
@@ -XXX,XX +XXX,XX @@
38
+ * @node2: The spatially unique node2 identifier
36
#define RTL8201CP_PHYID1 0x0000
39
+ * @node3: The spatially unique node3 identifier
37
#define RTL8201CP_PHYID2 0x8201
40
+ * @node4: The spatially unique node4 identifier
38
41
+ * @node5: The spatially unique node5 identifier
39
+/* SMSC LAN9118 */
42
+ */
40
+#define SMSCLAN9118_PHYID1 0x0007
43
+#define UUID_LE(time_low, time_mid, time_hi_and_version, \
41
+#define SMSCLAN9118_PHYID2 0xc0d1
44
+ clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2, \
45
+ node3, node4, node5) \
46
+ { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
47
+ ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
48
+ (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff, \
49
+ (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
50
+ (node3), (node4), (node5) }
51
+
42
+
52
#define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
43
/* RealTek 8211E */
53
"%02hhx%02hhx-%02hhx%02hhx-" \
44
#define RTL8211E_PHYID1 0x001c
54
"%02hhx%02hhx-" \
45
#define RTL8211E_PHYID2 0xc915
55
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
46
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
56
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
57
--- a/hw/acpi/nvdimm.c
48
--- a/hw/net/lan9118_phy.c
58
+++ b/hw/acpi/nvdimm.c
49
+++ b/hw/net/lan9118_phy.c
59
@@ -XXX,XX +XXX,XX @@
50
@@ -XXX,XX +XXX,XX @@
60
*/
61
51
62
#include "qemu/osdep.h"
52
#include "qemu/osdep.h"
63
+#include "qemu/uuid.h"
53
#include "hw/net/lan9118_phy.h"
64
#include "hw/acpi/acpi.h"
54
+#include "hw/net/mii.h"
65
#include "hw/acpi/aml-build.h"
55
#include "hw/irq.h"
66
#include "hw/acpi/bios-linker-loader.h"
56
#include "hw/resettable.h"
67
@@ -XXX,XX +XXX,XX @@
57
#include "migration/vmstate.h"
68
#include "hw/mem/nvdimm.h"
58
@@ -XXX,XX +XXX,XX @@ uint16_t lan9118_phy_read(Lan9118PhyState *s, int reg)
69
#include "qemu/nvdimm-utils.h"
59
uint16_t val;
70
60
71
-#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7) \
61
switch (reg) {
72
- { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
62
- case 0: /* Basic Control */
73
- (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff, \
63
+ case MII_BMCR:
74
- (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
64
val = s->control;
75
-
65
break;
76
/*
66
- case 1: /* Basic Status */
77
* define Byte Addressable Persistent Memory (PM) Region according to
67
+ case MII_BMSR:
78
* ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
68
val = s->status;
79
*/
69
break;
80
static const uint8_t nvdimm_nfit_spa_uuid[] =
70
- case 2: /* ID1 */
81
- NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
71
- val = 0x0007;
82
- 0x18, 0xb7, 0x8c, 0xdb);
72
+ case MII_PHYID1:
83
+ UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
73
+ val = SMSCLAN9118_PHYID1;
84
+ 0x18, 0xb7, 0x8c, 0xdb);
74
break;
85
75
- case 3: /* ID2 */
86
/*
76
- val = 0xc0d1;
87
* NVDIMM Firmware Interface Table
77
+ case MII_PHYID2:
78
+ val = SMSCLAN9118_PHYID2;
79
break;
80
- case 4: /* Auto-neg advertisement */
81
+ case MII_ANAR:
82
val = s->advertise;
83
break;
84
- case 5: /* Auto-neg Link Partner Ability */
85
- val = 0x0fe1;
86
+ case MII_ANLPAR:
87
+ val = MII_ANLPAR_PAUSEASY | MII_ANLPAR_PAUSE | MII_ANLPAR_T4 |
88
+ MII_ANLPAR_TXFD | MII_ANLPAR_TX | MII_ANLPAR_10FD |
89
+ MII_ANLPAR_10 | MII_ANLPAR_CSMACD;
90
break;
91
- case 6: /* Auto-neg Expansion */
92
- val = 1;
93
+ case MII_ANER:
94
+ val = MII_ANER_NWAY;
95
break;
96
case 29: /* Interrupt source. */
97
val = s->ints;
98
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
99
trace_lan9118_phy_write(val, reg);
100
101
switch (reg) {
102
- case 0: /* Basic Control */
103
- if (val & 0x8000) {
104
+ case MII_BMCR:
105
+ if (val & MII_BMCR_RESET) {
106
lan9118_phy_reset(s);
107
} else {
108
- s->control = val & 0x7980;
109
+ s->control = val & (MII_BMCR_LOOPBACK | MII_BMCR_SPEED100 |
110
+ MII_BMCR_AUTOEN | MII_BMCR_PDOWN | MII_BMCR_FD |
111
+ MII_BMCR_CTST);
112
/* Complete autonegotiation immediately. */
113
- if (val & 0x1000) {
114
- s->status |= 0x0020;
115
+ if (val & MII_BMCR_AUTOEN) {
116
+ s->status |= MII_BMSR_AN_COMP;
117
}
118
}
119
break;
120
- case 4: /* Auto-neg advertisement */
121
- s->advertise = (val & 0x2d7f) | 0x80;
122
+ case MII_ANAR:
123
+ s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
124
+ MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
125
+ MII_ANAR_SELECT))
126
+ | MII_ANAR_TX;
127
break;
128
case 30: /* Interrupt mask */
129
s->int_mask = val & 0xff;
130
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_update_link(Lan9118PhyState *s, bool link_down)
131
/* Autonegotiation status mirrors link status. */
132
if (link_down) {
133
trace_lan9118_phy_update_link("down");
134
- s->status &= ~0x0024;
135
+ s->status &= ~(MII_BMSR_AN_COMP | MII_BMSR_LINK_ST);
136
s->ints |= PHY_INT_DOWN;
137
} else {
138
trace_lan9118_phy_update_link("up");
139
- s->status |= 0x0024;
140
+ s->status |= MII_BMSR_AN_COMP | MII_BMSR_LINK_ST;
141
s->ints |= PHY_INT_ENERGYON;
142
s->ints |= PHY_INT_AUTONEG_COMPLETE;
143
}
144
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_reset(Lan9118PhyState *s)
145
{
146
trace_lan9118_phy_reset();
147
148
- s->control = 0x3000;
149
- s->status = 0x7809;
150
- s->advertise = 0x01e1;
151
+ s->control = MII_BMCR_AUTOEN | MII_BMCR_SPEED100;
152
+ s->status = MII_BMSR_100TX_FD
153
+ | MII_BMSR_100TX_HD
154
+ | MII_BMSR_10T_FD
155
+ | MII_BMSR_10T_HD
156
+ | MII_BMSR_AUTONEG
157
+ | MII_BMSR_EXTCAP;
158
+ s->advertise = MII_ANAR_TXFD
159
+ | MII_ANAR_TX
160
+ | MII_ANAR_10FD
161
+ | MII_ANAR_10
162
+ | MII_ANAR_CSMACD;
163
s->int_mask = 0;
164
s->ints = 0;
165
lan9118_phy_update_link(s, s->link_down);
88
--
166
--
89
2.20.1
167
2.34.1
90
91
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Bernhard Beschow <shentey@gmail.com>
2
2
3
The RAS virtualization feature is not supported yet, so
3
The real device advertises this mode and the device model already advertises
4
add a RAS machine option and disable it by default.
4
100 Mbps half duplex and 10 Mbps full+half duplex. So advertise this mode to
5
make the model more realistic.
5
6
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
8
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
8
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
9
Tested-by: Guenter Roeck <linux@roeck-us.net>
9
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
10
Message-id: 20241102125724.532843-6-shentey@gmail.com
10
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
11
Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
12
---
14
include/hw/arm/virt.h | 1 +
13
hw/net/lan9118_phy.c | 4 ++--
15
hw/arm/virt.c | 23 +++++++++++++++++++++++
14
1 file changed, 2 insertions(+), 2 deletions(-)
16
2 files changed, 24 insertions(+)
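
As a usage note (not from the patch itself): the new "ras" property is a
normal boolean machine option on the virt board, so once this is applied it
can be enabled on the command line, e.g.

    qemu-system-aarch64 -M virt,ras=on ...

and it stays off unless explicitly requested, matching the default set in
virt_instance_init() below.
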
17
15
18
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
16
diff --git a/hw/net/lan9118_phy.c b/hw/net/lan9118_phy.c
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/include/hw/arm/virt.h
18
--- a/hw/net/lan9118_phy.c
21
+++ b/include/hw/arm/virt.h
19
+++ b/hw/net/lan9118_phy.c
22
@@ -XXX,XX +XXX,XX @@ typedef struct {
20
@@ -XXX,XX +XXX,XX @@ void lan9118_phy_write(Lan9118PhyState *s, int reg, uint16_t val)
23
bool highmem_ecam;
21
break;
24
bool its;
22
case MII_ANAR:
25
bool virt;
23
s->advertise = (val & (MII_ANAR_RFAULT | MII_ANAR_PAUSE_ASYM |
26
+ bool ras;
24
- MII_ANAR_PAUSE | MII_ANAR_10FD | MII_ANAR_10 |
27
OnOffAuto acpi;
25
- MII_ANAR_SELECT))
28
VirtGICType gic_version;
26
+ MII_ANAR_PAUSE | MII_ANAR_TXFD | MII_ANAR_10FD |
29
VirtIOMMUType iommu;
27
+ MII_ANAR_10 | MII_ANAR_SELECT))
30
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
28
| MII_ANAR_TX;
31
index XXXXXXX..XXXXXXX 100644
29
break;
32
--- a/hw/arm/virt.c
30
case 30: /* Interrupt mask */
33
+++ b/hw/arm/virt.c
34
@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
35
visit_type_OnOffAuto(v, name, &vms->acpi, errp);
36
}
37
38
+static bool virt_get_ras(Object *obj, Error **errp)
39
+{
40
+ VirtMachineState *vms = VIRT_MACHINE(obj);
41
+
42
+ return vms->ras;
43
+}
44
+
45
+static void virt_set_ras(Object *obj, bool value, Error **errp)
46
+{
47
+ VirtMachineState *vms = VIRT_MACHINE(obj);
48
+
49
+ vms->ras = value;
50
+}
51
+
52
static char *virt_get_gic_version(Object *obj, Error **errp)
53
{
54
VirtMachineState *vms = VIRT_MACHINE(obj);
55
@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
56
"Valid values are none and smmuv3",
57
NULL);
58
59
+ /* Default disallows RAS instantiation */
60
+ vms->ras = false;
61
+ object_property_add_bool(obj, "ras", virt_get_ras,
62
+ virt_set_ras, NULL);
63
+ object_property_set_description(obj, "ras",
64
+ "Set on/off to enable/disable reporting host memory errors "
65
+ "to a KVM guest using ACPI and guest external abort exceptions",
66
+ NULL);
67
+
68
vms->irqmap = a15irqmap;
69
70
virt_flash_create(vms);
71
--
31
--
72
2.20.1
32
2.34.1
73
74
1
Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
1
For IEEE fused multiply-add, the (0 * inf) + NaN case should raise
2
insns to decodetree. (These are all the remaining non-accumulation
2
Invalid for the multiplication of 0 by infinity. Currently we handle
3
instructions in this group.)
3
this in the per-architecture ifdef ladder in pickNaNMulAdd().
4
However, since this isn't really architecture specific we can hoist
5
it up to the generic code.
6
7
For the cases where the infzero test in pickNaNMulAdd was
8
returning 2, we can delete the check entirely and allow the
9
code to fall into the normal pick-a-NaN handling, because this
10
will return 2 anyway (input 'c' being the only NaN in this case).
11
For the cases where infzero was returning 3 to indicate "return
12
the default NaN", we must retain that "return 3".
13
14
For Arm, this looks like it might be a behaviour change because we
15
used to set float_flag_invalid | float_flag_invalid_imz only if C is
16
a quiet NaN. However, it is not, because Arm target code never looks
17
at float_flag_invalid_imz, and for the (0 * inf) + SNaN case we
18
already raised float_flag_invalid via the "abc_mask &
19
float_cmask_snan" check in pick_nan_muladd.
20
21
For any target architecture using the "default implementation" at the
22
bottom of the ifdef, this is a behaviour change but will be fixing a
23
bug (where we failed to raise the Invalid exception for (0 * inf +
24
QNaN). The architectures using the default case are:
25
* hppa
26
* i386
27
* sh4
28
* tricore
29
30
The x86, Tricore and SH4 CPU architecture manuals are clear that this
31
should have raised Invalid; HPPA is a bit vaguer but still seems
32
clear enough.
4
33
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
34
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
35
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
36
Message-id: 20241202131347.498124-2-peter.maydell@linaro.org
8
---
37
---
9
target/arm/neon-dp.decode | 6 +++
38
fpu/softfloat-parts.c.inc | 13 +++++++------
10
target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
39
fpu/softfloat-specialize.c.inc | 29 +----------------------------
11
target/arm/translate.c | 42 +-------------------
40
2 files changed, 8 insertions(+), 34 deletions(-)
12
3 files changed, 78 insertions(+), 40 deletions(-)
13
41
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
42
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
15
index XXXXXXX..XXXXXXX 100644
43
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
44
--- a/fpu/softfloat-parts.c.inc
17
+++ b/target/arm/neon-dp.decode
45
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
46
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
19
VACGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
47
int ab_mask, int abc_mask)
20
VCGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
48
{
21
VACGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
49
int which;
22
+VMAX_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
50
+ bool infzero = (ab_mask == float_cmask_infzero);
23
+VMIN_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
51
24
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
52
if (unlikely(abc_mask & float_cmask_snan)) {
25
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
53
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
26
+VRECPS_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
54
}
27
+VRSQRTS_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
55
28
+VMAXNM_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
56
- which = pickNaNMulAdd(a->cls, b->cls, c->cls,
29
+VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
57
- ab_mask == float_cmask_infzero, s);
30
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
58
+ if (infzero) {
31
index XXXXXXX..XXXXXXX 100644
59
+ /* This is (0 * inf) + NaN or (inf * 0) + NaN */
32
--- a/target/arm/translate-neon.inc.c
60
+ float_raise(float_flag_invalid | float_flag_invalid_imz, s);
33
+++ b/target/arm/translate-neon.inc.c
34
@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
35
DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
36
DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
37
DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
38
+DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
39
+DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
40
41
static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
42
TCGv_ptr fpstatus)
43
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
44
DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
45
DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
46
47
+static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
48
+{
49
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
50
+ return false;
51
+ }
61
+ }
52
+
62
+
53
+ if (a->size != 0) {
63
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
54
+ /* TODO fp16 support */
64
55
+ return false;
65
if (s->default_nan_mode || which == 3) {
56
+ }
66
- /*
67
- * Note that this check is after pickNaNMulAdd so that function
68
- * has an opportunity to set the Invalid flag for infzero.
69
- */
70
parts_default_nan(a, s);
71
return a;
72
}
73
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
74
index XXXXXXX..XXXXXXX 100644
75
--- a/fpu/softfloat-specialize.c.inc
76
+++ b/fpu/softfloat-specialize.c.inc
77
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
78
* the default NaN
79
*/
80
if (infzero && is_qnan(c_cls)) {
81
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
82
return 3;
83
}
84
85
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
86
* case sets InvalidOp and returns the default NaN
87
*/
88
if (infzero) {
89
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
90
return 3;
91
}
92
/* Prefer sNaN over qNaN, in the a, b, c order. */
93
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
94
* For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
95
* case sets InvalidOp and returns the input value 'c'
96
*/
97
- if (infzero) {
98
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
99
- return 2;
100
- }
101
/* Prefer sNaN over qNaN, in the c, a, b order. */
102
if (is_snan(c_cls)) {
103
return 2;
104
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
105
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
106
* case sets InvalidOp and returns the input value 'c'
107
*/
108
- if (infzero) {
109
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
110
- return 2;
111
- }
57
+
112
+
58
+ return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
113
/* Prefer sNaN over qNaN, in the c, a, b order. */
59
+}
114
if (is_snan(c_cls)) {
60
+
115
return 2;
61
+static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
116
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
62
+{
117
* to return an input NaN if we have one (ie c) rather than generating
63
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
118
* a default NaN
64
+ return false;
119
*/
65
+ }
120
- if (infzero) {
66
+
121
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
67
+ if (a->size != 0) {
122
- return 2;
68
+ /* TODO fp16 support */
123
- }
69
+ return false;
124
70
+ }
125
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
71
+
126
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
72
+ return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
127
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
73
+}
128
return 1;
74
+
129
}
75
+WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
130
#elif defined(TARGET_RISCV)
76
+
131
- /* For RISC-V, InvalidOp is set when multiplicands are Inf and zero */
77
+static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
132
- if (infzero) {
78
+ uint32_t rn_ofs, uint32_t rm_ofs,
133
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
79
+ uint32_t oprsz, uint32_t maxsz)
134
- }
80
+{
135
return 3; /* default NaN */
81
+ static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
136
#elif defined(TARGET_S390X)
82
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
137
if (infzero) {
83
+}
138
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
84
+
139
return 3;
85
+static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
140
}
86
+{
141
87
+ if (a->size != 0) {
142
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
88
+ /* TODO fp16 support */
143
return 2;
89
+ return false;
144
}
90
+ }
145
#elif defined(TARGET_SPARC)
91
+
146
- /* For (inf,0,nan) return c. */
92
+ return do_3same(s, a, gen_VRECPS_fp_3s);
147
- if (infzero) {
93
+}
148
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
94
+
149
- return 2;
95
+WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
150
- }
96
+
151
/* Prefer SNaN over QNaN, order C, B, A. */
97
+static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
152
if (is_snan(c_cls)) {
98
+ uint32_t rn_ofs, uint32_t rm_ofs,
153
return 2;
99
+ uint32_t oprsz, uint32_t maxsz)
154
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
100
+{
155
* For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
101
+ static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
156
* an input NaN if we have one (ie c).
102
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
157
*/
103
+}
158
- if (infzero) {
104
+
159
- float_raise(float_flag_invalid | float_flag_invalid_imz, status);
105
+static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
160
- return 2;
106
+{
161
- }
107
+ if (a->size != 0) {
162
if (status->use_first_nan) {
108
+ /* TODO fp16 support */
163
if (is_nan(a_cls)) {
109
+ return false;
164
return 0;
110
+ }
111
+
112
+ return do_3same(s, a, gen_VRSQRTS_fp_3s);
113
+}
114
+
115
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
116
{
117
/* FP operations handled pairwise 32 bits at a time */
118
diff --git a/target/arm/translate.c b/target/arm/translate.c
119
index XXXXXXX..XXXXXXX 100644
120
--- a/target/arm/translate.c
121
+++ b/target/arm/translate.c
122
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
123
case NEON_3R_FLOAT_MULTIPLY:
124
case NEON_3R_FLOAT_CMP:
125
case NEON_3R_FLOAT_ACMP:
126
+ case NEON_3R_FLOAT_MINMAX:
127
+ case NEON_3R_FLOAT_MISC:
128
/* Already handled by decodetree */
129
return 1;
130
}
131
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
132
return 1;
133
}
134
switch (op) {
135
- case NEON_3R_FLOAT_MINMAX:
136
- if (u) {
137
- return 1; /* VPMIN/VPMAX handled by decodetree */
138
- }
139
- break;
140
- case NEON_3R_FLOAT_MISC:
141
- /* VMAXNM/VMINNM in ARMv8 */
142
- if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
143
- return 1;
144
- }
145
- break;
146
case NEON_3R_VFM_VQRDMLSH:
147
if (!dc_isar_feature(aa32_simdfmac, s)) {
148
return 1;
149
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
150
tmp = neon_load_reg(rn, pass);
151
tmp2 = neon_load_reg(rm, pass);
152
switch (op) {
153
- case NEON_3R_FLOAT_MINMAX:
154
- {
155
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
156
- if (size == 0) {
157
- gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
158
- } else {
159
- gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
160
- }
161
- tcg_temp_free_ptr(fpstatus);
162
- break;
163
- }
164
- case NEON_3R_FLOAT_MISC:
165
- if (u) {
166
- /* VMAXNM/VMINNM */
167
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
168
- if (size == 0) {
169
- gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
170
- } else {
171
- gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
172
- }
173
- tcg_temp_free_ptr(fpstatus);
174
- } else {
175
- if (size == 0) {
176
- gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
177
- } else {
178
- gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
179
- }
180
- }
181
- break;
182
case NEON_3R_VFM_VQRDMLSH:
183
{
184
/* VFMA, VFMS: fused multiply-add */
185
--
165
--
186
2.20.1
166
2.34.1
187
188
New patch
1
If the target sets default_nan_mode then we're always going to return
2
the default NaN, and pickNaNMulAdd() no longer has any side effects.
3
For consistency with pickNaN(), check for default_nan_mode before
4
calling pickNaNMulAdd().
1
5
6
When we convert pickNaNMulAdd() to allow runtime selection of the NaN
7
propagation rule, this means we won't have to make the targets which
8
use default_nan_mode also set a propagation rule.
9
10
Since RISC-V always uses default_nan_mode, this allows us to remove
11
its ifdef case from pickNaNMulAdd().
12
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Message-id: 20241202131347.498124-3-peter.maydell@linaro.org
16
---
17
fpu/softfloat-parts.c.inc | 8 ++++++--
18
fpu/softfloat-specialize.c.inc | 9 +++++++--
19
2 files changed, 13 insertions(+), 4 deletions(-)
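
A minimal sketch of the other side of this contract (not part of the patch):
a target that always wants the default NaN opts in once when it sets up its
float_status, using the existing set_default_nan_mode() helper;
"env->fp_status" is only a placeholder name here:

    /* e.g. somewhere in the target's CPU reset/init path */
    set_default_nan_mode(true, &env->fp_status);

With this patch applied, such a target never reaches pickNaNMulAdd() at all,
so it does not have to specify a NaN-propagation choice for the muladd case.
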
20
21
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
22
index XXXXXXX..XXXXXXX 100644
23
--- a/fpu/softfloat-parts.c.inc
24
+++ b/fpu/softfloat-parts.c.inc
25
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
26
float_raise(float_flag_invalid | float_flag_invalid_imz, s);
27
}
28
29
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
30
+ if (s->default_nan_mode) {
31
+ which = 3;
32
+ } else {
33
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
34
+ }
35
36
- if (s->default_nan_mode || which == 3) {
37
+ if (which == 3) {
38
parts_default_nan(a, s);
39
return a;
40
}
41
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
42
index XXXXXXX..XXXXXXX 100644
43
--- a/fpu/softfloat-specialize.c.inc
44
+++ b/fpu/softfloat-specialize.c.inc
45
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
46
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
47
bool infzero, float_status *status)
48
{
49
+ /*
50
+ * We guarantee not to require the target to tell us how to
51
+ * pick a NaN if we're always returning the default NaN.
52
+ * But if we're not in default-NaN mode then the target must
53
+ * specify.
54
+ */
55
+ assert(!status->default_nan_mode);
56
#if defined(TARGET_ARM)
57
/* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
58
* the default NaN
59
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
60
} else {
61
return 1;
62
}
63
-#elif defined(TARGET_RISCV)
64
- return 3; /* default NaN */
65
#elif defined(TARGET_S390X)
66
if (infzero) {
67
return 3;
68
--
69
2.34.1
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
IEEE 754 does not define a fixed rule for which NaN to return in
2
2
the case of a fused multiply-add of inf * 0 + NaN. Different
3
kvm_arch_on_sigbus_vcpu() error injection uses source_id as an
3
architectures thus do different things:
4
index into etc/hardware_errors to find the Error Status Data
4
* some return the default NaN
5
Block entry corresponding to the error source. So supported source_id
5
* some return the input NaN
6
values should be assigned here and not changed afterwards, to
6
* Arm returns the default NaN if the input NaN is quiet,
7
make sure that the guest will write the error into the expected Error Status
7
and the input NaN if it is signalling
8
Data Block.
8
9
9
We want to make this logic be runtime selected rather than
10
Before QEMU writes a new error to the ACPI table, it will check whether
10
hardcoded into the binary, because:
11
the previous error has been acknowledged. If not, the new
11
* this will let us have multiple targets in one QEMU binary
12
error will be ignored and not recorded. For the error section
12
* the Arm FEAT_AFP architectural feature includes letting
13
type, QEMU simulates it as a memory section error.
13
the guest select a NaN propagation rule at runtime
14
14
15
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
15
In this commit we add an enum for the propagation rule, the field in
16
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
16
float_status, and the corresponding getters and setters. We change
17
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
17
pickNaNMulAdd to honour this, but because all targets still leave
18
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
18
this field at its default 0 value, the fallback logic will pick the
19
Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
19
rule type with the old ifdef ladder.
20
21
Note that four architectures both use the muladd softfloat functions
22
and did not have a branch of the ifdef ladder to specify their
23
behaviour (and so were ending up with the "default" case, probably
24
wrongly): i386, HPPA, SH4 and Tricore. SH4 and Tricore both set
25
default_nan_mode, and so will never get into pickNaNMulAdd(). For
26
HPPA and i386 we retain the same behaviour as the old default-case,
27
which is to not ever return the default NaN. This might not be
28
correct but it is not a behaviour change.
29
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
30
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
31
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
32
Message-id: 20241202131347.498124-4-peter.maydell@linaro.org
21
---
33
---
22
include/hw/acpi/ghes.h | 1 +
34
include/fpu/softfloat-helpers.h | 11 ++++
23
hw/acpi/ghes.c | 219 +++++++++++++++++++++++++++++++++++++++++
35
include/fpu/softfloat-types.h | 23 +++++++++
24
2 files changed, 220 insertions(+)
36
fpu/softfloat-specialize.c.inc | 91 ++++++++++++++++++++++-----------
25
37
3 files changed, 95 insertions(+), 30 deletions(-)
26
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
38
39
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
27
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
28
--- a/include/hw/acpi/ghes.h
41
--- a/include/fpu/softfloat-helpers.h
29
+++ b/include/hw/acpi/ghes.h
42
+++ b/include/fpu/softfloat-helpers.h
30
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
43
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
31
void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
44
status->float_2nan_prop_rule = rule;
32
void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
45
}
33
GArray *hardware_errors);
46
34
+int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
47
+static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
35
#endif
48
+ float_status *status)
36
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
49
+{
50
+ status->float_infzeronan_rule = rule;
51
+}
52
+
53
static inline void set_flush_to_zero(bool val, float_status *status)
54
{
55
status->flush_to_zero = val;
56
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
57
return status->float_2nan_prop_rule;
58
}
59
60
+static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
61
+{
62
+ return status->float_infzeronan_rule;
63
+}
64
+
65
static inline bool get_flush_to_zero(float_status *status)
66
{
67
return status->flush_to_zero;
68
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
37
index XXXXXXX..XXXXXXX 100644
69
index XXXXXXX..XXXXXXX 100644
38
--- a/hw/acpi/ghes.c
70
--- a/include/fpu/softfloat-types.h
39
+++ b/hw/acpi/ghes.c
71
+++ b/include/fpu/softfloat-types.h
40
@@ -XXX,XX +XXX,XX @@
72
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
41
#include "qemu/error-report.h"
73
float_2nan_prop_x87,
42
#include "hw/acpi/generic_event_device.h"
74
} Float2NaNPropRule;
43
#include "hw/nvram/fw_cfg.h"
44
+#include "qemu/uuid.h"
45
46
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
47
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
48
@@ -XXX,XX +XXX,XX @@
49
/* Address offset in Generic Address Structure(GAS) */
50
#define GAS_ADDR_OFFSET 4
51
75
52
+/*
76
+/*
53
+ * The total size of Generic Error Data Entry
77
+ * Rule for result of fused multiply-add 0 * Inf + NaN.
54
+ * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
78
+ * This must be a NaN, but implementations differ on whether this
55
+ * Table 18-343 Generic Error Data Entry
79
+ * is the input NaN or the default NaN.
80
+ *
81
+ * You don't need to set this if default_nan_mode is enabled.
82
+ * When not in default-NaN mode, it is an error for the target
83
+ * not to set the rule in float_status if it uses muladd, and we
84
+ * will assert if we need to handle an input NaN and no rule was
85
+ * selected.
56
+ */
86
+ */
57
+#define ACPI_GHES_DATA_LENGTH 72
87
+typedef enum __attribute__((__packed__)) {
58
+
88
+ /* No propagation rule specified */
59
+/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
89
+ float_infzeronan_none = 0,
60
+#define ACPI_GHES_MEM_CPER_LENGTH 80
90
+ /* Result is never the default NaN (so always the input NaN) */
61
+
91
+ float_infzeronan_dnan_never,
62
+/* Masks for block_status flags */
92
+ /* Result is always the default NaN */
63
+#define ACPI_GEBS_UNCORRECTABLE 1
93
+ float_infzeronan_dnan_always,
64
+
94
+ /* Result is the default NaN if the input NaN is quiet */
65
+/*
95
+ float_infzeronan_dnan_if_qnan,
66
+ * Total size for Generic Error Status Block except Generic Error Data Entries
96
+} FloatInfZeroNaNRule;
67
+ * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
68
+ * Table 18-380 Generic Error Status Block
69
+ */
70
+#define ACPI_GHES_GESB_SIZE 20
71
+
72
+/*
73
+ * Values for error_severity field
74
+ */
75
+enum AcpiGenericErrorSeverity {
76
+ ACPI_CPER_SEV_RECOVERABLE = 0,
77
+ ACPI_CPER_SEV_FATAL = 1,
78
+ ACPI_CPER_SEV_CORRECTED = 2,
79
+ ACPI_CPER_SEV_NONE = 3,
80
+};
81
+
97
+
82
/*
98
/*
83
* Hardware Error Notification
99
* Floating Point Status. Individual architectures may maintain
84
* ACPI 4.0: 17.3.2.7 Hardware Error Notification
100
* several versions of float_status for different functions. The
85
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
101
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
86
build_append_int_noprefix(table, 0, 4);
102
FloatRoundMode float_rounding_mode;
87
}
103
FloatX80RoundPrec floatx80_rounding_precision;
88
104
Float2NaNPropRule float_2nan_prop_rule;
89
+/*
105
+ FloatInfZeroNaNRule float_infzeronan_rule;
90
+ * Generic Error Data Entry
106
bool tininess_before_rounding;
91
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
107
/* should denormalised results go to zero and set the inexact flag? */
92
+ */
108
bool flush_to_zero;
93
+static void acpi_ghes_generic_error_data(GArray *table,
109
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
94
+ const uint8_t *section_type, uint32_t error_severity,
110
index XXXXXXX..XXXXXXX 100644
95
+ uint8_t validation_bits, uint8_t flags,
111
--- a/fpu/softfloat-specialize.c.inc
96
+ uint32_t error_data_length, QemuUUID fru_id,
112
+++ b/fpu/softfloat-specialize.c.inc
97
+ uint64_t time_stamp)
113
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
98
+{
114
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
99
+ const uint8_t fru_text[20] = {0};
115
bool infzero, float_status *status)
100
+
116
{
101
+ /* Section Type */
117
+ FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
102
+ g_array_append_vals(table, section_type, 16);
118
+
103
+
119
/*
104
+ /* Error Severity */
120
* We guarantee not to require the target to tell us how to
105
+ build_append_int_noprefix(table, error_severity, 4);
121
* pick a NaN if we're always returning the default NaN.
106
+ /* Revision */
122
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
107
+ build_append_int_noprefix(table, 0x300, 2);
123
* specify.
108
+ /* Validation Bits */
124
*/
109
+ build_append_int_noprefix(table, validation_bits, 1);
125
assert(!status->default_nan_mode);
110
+ /* Flags */
126
+
111
+ build_append_int_noprefix(table, flags, 1);
127
+ if (rule == float_infzeronan_none) {
112
+ /* Error Data Length */
128
+ /*
113
+ build_append_int_noprefix(table, error_data_length, 4);
129
+ * Temporarily fall back to ifdef ladder
114
+
130
+ */
115
+ /* FRU Id */
131
#if defined(TARGET_ARM)
116
+ g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
132
- /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns
117
+
133
- * the default NaN
118
+ /* FRU Text */
134
- */
119
+ g_array_append_vals(table, fru_text, sizeof(fru_text));
135
- if (infzero && is_qnan(c_cls)) {
120
+
136
- return 3;
121
+ /* Timestamp */
137
+ /*
122
+ build_append_int_noprefix(table, time_stamp, 8);
138
+ * For ARM, the (inf,zero,qnan) case returns the default NaN,
123
+}
139
+ * but (inf,zero,snan) returns the input NaN.
124
+
140
+ */
125
+/*
141
+ rule = float_infzeronan_dnan_if_qnan;
126
+ * Generic Error Status Block
142
+#elif defined(TARGET_MIPS)
127
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
143
+ if (snan_bit_is_one(status)) {
128
+ */
144
+ /*
129
+static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
145
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
130
+ uint32_t raw_data_offset, uint32_t raw_data_length,
146
+ * case sets InvalidOp and returns the default NaN
131
+ uint32_t data_length, uint32_t error_severity)
147
+ */
132
+{
148
+ rule = float_infzeronan_dnan_always;
133
+ /* Block Status */
149
+ } else {
134
+ build_append_int_noprefix(table, block_status, 4);
150
+ /*
135
+ /* Raw Data Offset */
151
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
136
+ build_append_int_noprefix(table, raw_data_offset, 4);
152
+ * case sets InvalidOp and returns the input value 'c'
137
+ /* Raw Data Length */
153
+ */
138
+ build_append_int_noprefix(table, raw_data_length, 4);
154
+ rule = float_infzeronan_dnan_never;
139
+ /* Data Length */
155
+ }
140
+ build_append_int_noprefix(table, data_length, 4);
156
+#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
141
+ /* Error Severity */
157
+ defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
142
+ build_append_int_noprefix(table, error_severity, 4);
158
+ defined(TARGET_I386) || defined(TARGET_LOONGARCH)
143
+}
159
+ /*
144
+
160
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
145
+/* UEFI 2.6: N.2.5 Memory Error Section */
161
+ * case sets InvalidOp and returns the input value 'c'
146
+static void acpi_ghes_build_append_mem_cper(GArray *table,
162
+ */
147
+ uint64_t error_physical_addr)
163
+ /*
148
+{
164
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
149
+ /*
165
+ * to return an input NaN if we have one (ie c) rather than generating
150
+ * Memory Error Record
166
+ * a default NaN
151
+ */
167
+ */
152
+
168
+ rule = float_infzeronan_dnan_never;
153
+ /* Validation Bits */
169
+#elif defined(TARGET_S390X)
154
+ build_append_int_noprefix(table,
170
+ rule = float_infzeronan_dnan_always;
155
+ (1ULL << 14) | /* Type Valid */
171
+#endif
156
+ (1ULL << 1) /* Physical Address Valid */,
172
}
157
+ 8);
173
158
+ /* Error Status */
174
+ if (infzero) {
159
+ build_append_int_noprefix(table, 0, 8);
175
+ /*
160
+ /* Physical Address */
176
+ * Inf * 0 + NaN -- some implementations return the default NaN here,
161
+ build_append_int_noprefix(table, error_physical_addr, 8);
177
+ * and some return the input NaN.
162
+ /* Skip all the detailed information normally found in such a record */
178
+ */
163
+ build_append_int_noprefix(table, 0, 48);
179
+ switch (rule) {
164
+ /* Memory Error Type */
180
+ case float_infzeronan_dnan_never:
165
+ build_append_int_noprefix(table, 0 /* Unknown error */, 1);
181
+ return 2;
166
+ /* Skip all the detailed information normally found in such a record */
182
+ case float_infzeronan_dnan_always:
167
+ build_append_int_noprefix(table, 0, 7);
183
+ return 3;
168
+}
184
+ case float_infzeronan_dnan_if_qnan:
169
+
185
+ return is_qnan(c_cls) ? 3 : 2;
170
+static int acpi_ghes_record_mem_error(uint64_t error_block_address,
186
+ default:
171
+ uint64_t error_physical_addr)
187
+ g_assert_not_reached();
172
+{
188
+ }
173
+ GArray *block;
174
+
175
+ /* Memory Error Section Type */
176
+ const uint8_t uefi_cper_mem_sec[] =
177
+ UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
178
+ 0xED, 0x7C, 0x83, 0xB1);
179
+
180
+ /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
181
+ * Table 17-13 Generic Error Data Entry
182
+ */
183
+ QemuUUID fru_id = {};
184
+ uint32_t data_length;
185
+
186
+ block = g_array_new(false, true /* clear */, 1);
187
+
188
+ /* This is the length if adding a new generic error data entry*/
189
+ data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
190
+
191
+ /*
192
+ * Check whether it will run out of the preallocated memory if adding a new
193
+ * generic error data entry
194
+ */
195
+ if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
196
+ error_report("Not enough memory to record new CPER!!!");
197
+ g_array_free(block, true);
198
+ return -1;
199
+ }
189
+ }
200
+
190
+
201
+ /* Build the new generic error status block header */
191
+#if defined(TARGET_ARM)
202
+ acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
192
+
203
+ 0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
193
/* This looks different from the ARM ARM pseudocode, because the ARM ARM
204
+
194
* puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
205
+ /* Build this new generic error data entry header */
195
*/
206
+ acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
196
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
207
+ ACPI_CPER_SEV_RECOVERABLE, 0, 0,
197
}
208
+ ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
198
#elif defined(TARGET_MIPS)
209
+
199
if (snan_bit_is_one(status)) {
210
+ /* Build the memory section CPER for above new generic error data entry */
200
- /*
211
+ acpi_ghes_build_append_mem_cper(block, error_physical_addr);
201
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
212
+
202
- * case sets InvalidOp and returns the default NaN
213
+ /* Write the generic error data entry into guest memory */
203
- */
214
+ cpu_physical_memory_write(error_block_address, block->data, block->len);
204
- if (infzero) {
215
+
205
- return 3;
216
+ g_array_free(block, true);
206
- }
217
+
207
/* Prefer sNaN over qNaN, in the a, b, c order. */
218
+ return 0;
208
if (is_snan(a_cls)) {
219
+}
209
return 0;
220
+
210
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
221
/*
211
return 2;
222
* Build table for the hardware error fw_cfg blob.
212
}
223
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
213
} else {
224
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
214
- /*
225
fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
215
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
226
NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
216
- * case sets InvalidOp and returns the input value 'c'
227
}
217
- */
228
+
218
/* Prefer sNaN over qNaN, in the c, a, b order. */
229
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
219
if (is_snan(c_cls)) {
230
+{
220
return 2;
231
+ uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
221
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
232
+ uint64_t start_addr;
222
}
233
+ bool ret = -1;
223
}
234
+ AcpiGedState *acpi_ged_state;
224
#elif defined(TARGET_LOONGARCH64)
235
+ AcpiGhesState *ags;
225
- /*
236
+
226
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
237
+ assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
227
- * case sets InvalidOp and returns the input value 'c'
238
+
228
- */
239
+ acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
229
-
240
+ NULL));
230
/* Prefer sNaN over qNaN, in the c, a, b order. */
241
+ g_assert(acpi_ged_state);
231
if (is_snan(c_cls)) {
242
+ ags = &acpi_ged_state->ghes_state;
232
return 2;
243
+
233
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
244
+ start_addr = le64_to_cpu(ags->ghes_addr_le);
234
return 1;
245
+
235
}
246
+ if (physical_address) {
236
#elif defined(TARGET_PPC)
247
+
237
- /* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
248
+ if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
238
- * to return an input NaN if we have one (ie c) rather than generating
249
+ start_addr += source_id * sizeof(uint64_t);
239
- * a default NaN
250
+ }
240
- */
251
+
241
-
252
+ cpu_physical_memory_read(start_addr, &error_block_addr,
242
/* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
253
+ sizeof(error_block_addr));
243
* otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
254
+
244
*/
255
+ error_block_addr = le64_to_cpu(error_block_addr);
245
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
256
+
246
return 1;
257
+ read_ack_register_addr = start_addr +
247
}
258
+ ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
248
#elif defined(TARGET_S390X)
259
+
249
- if (infzero) {
260
+ cpu_physical_memory_read(read_ack_register_addr,
250
- return 3;
261
+ &read_ack_register, sizeof(read_ack_register));
251
- }
262
+
252
-
263
+ /* zero means OSPM does not acknowledge the error */
253
if (is_snan(a_cls)) {
264
+ if (!read_ack_register) {
254
return 0;
265
+ error_report("OSPM does not acknowledge previous error,"
255
} else if (is_snan(b_cls)) {
266
+ " so can not record CPER for current error anymore");
267
+ } else if (error_block_addr) {
268
+ read_ack_register = cpu_to_le64(0);
269
+ /*
270
+ * Clear the Read Ack Register, OSPM will write it to 1 when
271
+ * it acknowledges this error.
272
+ */
273
+ cpu_physical_memory_write(read_ack_register_addr,
274
+ &read_ack_register, sizeof(uint64_t));
275
+
276
+ ret = acpi_ghes_record_mem_error(error_block_addr,
277
+ physical_address);
278
+ } else
279
+ error_report("can not find Generic Error Status Block");
280
+ }
281
+
282
+ return ret;
283
+}
284
--
256
--
285
2.20.1
257
2.34.1
286
287
diff view generated by jsdifflib
New patch
1
Explicitly set a rule in the softfloat tests for the inf-zero-nan
2
muladd special case. In meson.build we put -DTARGET_ARM in fpcflags,
3
and so we should select here the Arm rule of
4
float_infzeronan_dnan_if_qnan.
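
For reference, the three FloatInfZeroNaNRule settings used across this series
behave as follows for a fused 0 * Inf + NaN (an illustrative summary, not part
of the patch; in this case the NaN is necessarily operand c):

    /*
     * rule                            0 * Inf + QNaN    0 * Inf + SNaN
     * float_infzeronan_dnan_never     input NaN         input NaN (quieted)
     * float_infzeronan_dnan_always    default NaN       default NaN
     * float_infzeronan_dnan_if_qnan   default NaN       input NaN (quieted)
     *
     * The invalid exception is raised for the 0 * Inf case whichever
     * rule is selected.
     */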
1
5
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20241202131347.498124-5-peter.maydell@linaro.org
9
---
10
tests/fp/fp-bench.c | 5 +++++
11
tests/fp/fp-test.c | 5 +++++
12
2 files changed, 10 insertions(+)
13
14
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/tests/fp/fp-bench.c
17
+++ b/tests/fp/fp-bench.c
18
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
19
{
20
bench_func_t f;
21
22
+ /*
23
+ * These implementation-defined choices for various things IEEE
24
+ * doesn't specify match those used by the Arm architecture.
25
+ */
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
28
29
f = bench_funcs[operation][precision];
30
g_assert(f);
31
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/tests/fp/fp-test.c
34
+++ b/tests/fp/fp-test.c
35
@@ -XXX,XX +XXX,XX @@ void run_test(void)
36
{
37
unsigned int i;
38
39
+ /*
40
+ * These implementation-defined choices for various things IEEE
41
+ * doesn't specify match those used by the Arm architecture.
42
+ */
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
44
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
45
46
genCases_setLevel(test_level);
47
verCases_maxErrorCount = n_max_errors;
48
--
49
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the Arm target,
2
so we can remove the ifdef from pickNaNMulAdd().
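
A minimal sketch of the behaviour this selects, using the public softfloat
API (illustration only, not part of the patch; the bit patterns assume the
usual IEEE encoding with snan_bit_is_one clear):

    float_status st = { };
    set_float_2nan_prop_rule(float_2nan_prop_s_ab, &st);
    set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &st);

    float64 zero = make_float64(0);
    float64 inf = make_float64(0x7ff0000000000000ULL);
    float64 snan = make_float64(0x7ff0000000000001ULL);

    /* 0 * Inf + QNaN: expect the default NaN back */
    float64 r1 = float64_muladd(zero, inf, float64_default_nan(&st), 0, &st);

    /* 0 * Inf + SNaN: expect the input NaN back, quieted */
    float64 r2 = float64_muladd(zero, inf, snan, 0, &st);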
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-6-peter.maydell@linaro.org
7
---
8
target/arm/cpu.c | 3 +++
9
fpu/softfloat-specialize.c.inc | 8 +-------
10
2 files changed, 4 insertions(+), 7 deletions(-)
11
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
17
* * tininess-before-rounding
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
20
+ * * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
21
+ * and the input NaN if it is signalling
22
*/
23
static void arm_set_default_fp_behaviours(float_status *s)
24
{
25
set_float_detect_tininess(float_tininess_before_rounding, s);
26
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
27
+ set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
28
}
29
30
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
32
index XXXXXXX..XXXXXXX 100644
33
--- a/fpu/softfloat-specialize.c.inc
34
+++ b/fpu/softfloat-specialize.c.inc
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
36
/*
37
* Temporarily fall back to ifdef ladder
38
*/
39
-#if defined(TARGET_ARM)
40
- /*
41
- * For ARM, the (inf,zero,qnan) case returns the default NaN,
42
- * but (inf,zero,snan) returns the input NaN.
43
- */
44
- rule = float_infzeronan_dnan_if_qnan;
45
-#elif defined(TARGET_MIPS)
46
+#if defined(TARGET_MIPS)
47
if (snan_bit_is_one(status)) {
48
/*
49
* For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
50
--
51
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for s390, so we
2
can remove the ifdef from pickNaNMulAdd().
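
As a contrast with the Arm rule (illustration only, not part of the patch):

    /* s390x, float_infzeronan_dnan_always:
     *   0 * Inf + SNaN -> default NaN (plus the invalid exception),
     * where the Arm dnan_if_qnan rule would have returned the quieted SNaN.
     */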
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-7-peter.maydell@linaro.org
7
---
8
target/s390x/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 2 insertions(+), 2 deletions(-)
11
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/s390x/cpu.c
15
+++ b/target/s390x/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
17
set_float_detect_tininess(float_tininess_before_rounding,
18
&env->fpu_status);
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
20
+ set_float_infzeronan_rule(float_infzeronan_dnan_always,
21
+ &env->fpu_status);
22
/* fall through */
23
case RESET_TYPE_S390_CPU_NORMAL:
24
env->psw.mask &= ~PSW_MASK_RI;
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
* a default NaN
31
*/
32
rule = float_infzeronan_dnan_never;
33
-#elif defined(TARGET_S390X)
34
- rule = float_infzeronan_dnan_always;
35
#endif
36
}
37
38
--
39
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the PPC target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-8-peter.maydell@linaro.org
7
---
8
target/ppc/cpu_init.c | 7 +++++++
9
fpu/softfloat-specialize.c.inc | 7 +------
10
2 files changed, 8 insertions(+), 6 deletions(-)
11
12
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/ppc/cpu_init.c
15
+++ b/target/ppc/cpu_init.c
16
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
17
*/
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
19
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
20
+ /*
21
+ * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
22
+ * to return an input NaN if we have one (ie c) rather than generating
23
+ * a default NaN
24
+ */
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
26
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
27
28
for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
29
ppc_spr_t *spr = &env->spr_cb[i];
30
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
index XXXXXXX..XXXXXXX 100644
32
--- a/fpu/softfloat-specialize.c.inc
33
+++ b/fpu/softfloat-specialize.c.inc
34
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
35
*/
36
rule = float_infzeronan_dnan_never;
37
}
38
-#elif defined(TARGET_PPC) || defined(TARGET_SPARC) || \
39
+#elif defined(TARGET_SPARC) || \
40
defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
41
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
42
/*
43
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
44
* case sets InvalidOp and returns the input value 'c'
45
*/
46
- /*
47
- * For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
48
- * to return an input NaN if we have one (ie c) rather than generating
49
- * a default NaN
50
- */
51
rule = float_infzeronan_dnan_never;
52
#endif
53
}
54
--
55
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the MIPS target,
2
so we can remove the ifdef from pickNaNMulAdd().
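
The rule is now chosen per-CPU at run time rather than per-binary at build
time; roughly (a sketch, not part of the patch, assuming a valid
CPUMIPSState *env):

    env->active_fpu.fcr31 |= 1 << FCR31_NAN2008;
    restore_snan_bit_mode(env);    /* IEEE754-2008 mode: dnan_never */

    env->active_fpu.fcr31 &= ~(1 << FCR31_NAN2008);
    restore_snan_bit_mode(env);    /* legacy IEEE754-1985 mode: dnan_always */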
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-9-peter.maydell@linaro.org
7
---
8
target/mips/fpu_helper.h | 9 +++++++++
9
target/mips/msa.c | 4 ++++
10
fpu/softfloat-specialize.c.inc | 16 +---------------
11
3 files changed, 14 insertions(+), 15 deletions(-)
12
13
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/mips/fpu_helper.h
16
+++ b/target/mips/fpu_helper.h
17
@@ -XXX,XX +XXX,XX @@ static inline void restore_flush_mode(CPUMIPSState *env)
18
static inline void restore_snan_bit_mode(CPUMIPSState *env)
19
{
20
bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
21
+ FloatInfZeroNaNRule izn_rule;
22
23
/*
24
* With nan2008, SNaNs are silenced in the usual way.
25
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
26
*/
27
set_snan_bit_is_one(!nan2008, &env->active_fpu.fp_status);
28
set_default_nan_mode(!nan2008, &env->active_fpu.fp_status);
29
+ /*
30
+ * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
31
+ * case sets InvalidOp and returns the default NaN.
32
+ * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
33
+ * case sets InvalidOp and returns the input value 'c'.
34
+ */
35
+ izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
36
+ set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
37
}
38
39
static inline void restore_fp_status(CPUMIPSState *env)
40
diff --git a/target/mips/msa.c b/target/mips/msa.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/mips/msa.c
43
+++ b/target/mips/msa.c
44
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
45
46
/* set proper signanling bit meaning ("1" means "quiet") */
47
set_snan_bit_is_one(0, &env->active_tc.msa_fp_status);
48
+
49
+ /* Inf * 0 + NaN returns the input NaN */
50
+ set_float_infzeronan_rule(float_infzeronan_dnan_never,
51
+ &env->active_tc.msa_fp_status);
52
}
53
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
54
index XXXXXXX..XXXXXXX 100644
55
--- a/fpu/softfloat-specialize.c.inc
56
+++ b/fpu/softfloat-specialize.c.inc
57
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
58
/*
59
* Temporarily fall back to ifdef ladder
60
*/
61
-#if defined(TARGET_MIPS)
62
- if (snan_bit_is_one(status)) {
63
- /*
64
- * For MIPS systems that conform to IEEE754-1985, the (inf,zero,nan)
65
- * case sets InvalidOp and returns the default NaN
66
- */
67
- rule = float_infzeronan_dnan_always;
68
- } else {
69
- /*
70
- * For MIPS systems that conform to IEEE754-2008, the (inf,zero,nan)
71
- * case sets InvalidOp and returns the input value 'c'
72
- */
73
- rule = float_infzeronan_dnan_never;
74
- }
75
-#elif defined(TARGET_SPARC) || \
76
+#if defined(TARGET_SPARC) || \
77
defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
78
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
79
/*
80
--
81
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the SPARC target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-10-peter.maydell@linaro.org
7
---
8
target/sparc/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 3 +--
10
2 files changed, 3 insertions(+), 2 deletions(-)
11
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/sparc/cpu.c
15
+++ b/target/sparc/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
17
* the CPU state struct so it won't get zeroed on reset.
18
*/
19
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
20
+ /* For inf * 0 + NaN, return the input NaN */
21
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
23
cpu_exec_realizefn(cs, &local_err);
24
if (local_err != NULL) {
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
/*
31
* Temporarily fall back to ifdef ladder
32
*/
33
-#if defined(TARGET_SPARC) || \
34
- defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
35
+#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
36
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
37
/*
38
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
39
--
40
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the xtensa target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-11-peter.maydell@linaro.org
7
---
8
target/xtensa/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 +-
10
2 files changed, 3 insertions(+), 1 deletion(-)
11
12
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/xtensa/cpu.c
15
+++ b/target/xtensa/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
17
reset_mmu(env);
18
cs->halted = env->runstall;
19
#endif
20
+ /* For inf * 0 + NaN, return the input NaN */
21
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
set_no_signaling_nans(!dfpu, &env->fp_status);
23
xtensa_use_first_nan(env, !dfpu);
24
}
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
/*
31
* Temporarily fall back to ifdef ladder
32
*/
33
-#if defined(TARGET_XTENSA) || defined(TARGET_HPPA) || \
34
+#if defined(TARGET_HPPA) || \
35
defined(TARGET_I386) || defined(TARGET_LOONGARCH)
36
/*
37
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
38
--
39
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the x86 target.
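
Illustration of the guest-visible behaviour being modelled (a paraphrase,
not part of the patch):

    /* An SSE/AVX FMA computing 0.0 * Inf + qNaN_x returns qNaN_x itself
     * rather than the "QNaN floating-point indefinite" (the default NaN),
     * hence dnan_never for sse_status. mmx_status is left alone because
     * MMX has no multiply-add instructions.
     */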
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-12-peter.maydell@linaro.org
6
---
7
target/i386/tcg/fpu_helper.c | 7 +++++++
8
fpu/softfloat-specialize.c.inc | 2 +-
9
2 files changed, 8 insertions(+), 1 deletion(-)
10
11
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/i386/tcg/fpu_helper.c
14
+++ b/target/i386/tcg/fpu_helper.c
15
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
16
*/
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->mmx_status);
18
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->sse_status);
19
+ /*
20
+ * Only SSE has multiply-add instructions. In the SDM Section 14.5.2
21
+ * "Fused-Multiply-ADD (FMA) Numeric Behavior" the NaN handling is
22
+ * specified -- for 0 * inf + NaN the input NaN is selected, and if
23
+ * there are multiple input NaNs they are selected in the order a, b, c.
24
+ */
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
26
}
27
28
static inline uint8_t save_exception_flags(CPUX86State *env)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
30
index XXXXXXX..XXXXXXX 100644
31
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
34
* Temporarily fall back to ifdef ladder
35
*/
36
#if defined(TARGET_HPPA) || \
37
- defined(TARGET_I386) || defined(TARGET_LOONGARCH)
38
+ defined(TARGET_LOONGARCH)
39
/*
40
* For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
41
* case sets InvalidOp and returns the input value 'c'
42
--
43
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the loongarch target.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-13-peter.maydell@linaro.org
6
---
7
target/loongarch/tcg/fpu_helper.c | 5 +++++
8
fpu/softfloat-specialize.c.inc | 7 +------
9
2 files changed, 6 insertions(+), 6 deletions(-)
10
11
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/loongarch/tcg/fpu_helper.c
14
+++ b/target/loongarch/tcg/fpu_helper.c
15
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
16
&env->fp_status);
17
set_flush_to_zero(0, &env->fp_status);
18
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
19
+ /*
20
+ * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
21
+ * case sets InvalidOp and returns the input value 'c'
22
+ */
23
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
}
25
26
int ieee_ex_to_loongarch(int xcpt)
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
index XXXXXXX..XXXXXXX 100644
29
--- a/fpu/softfloat-specialize.c.inc
30
+++ b/fpu/softfloat-specialize.c.inc
31
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
32
/*
33
* Temporarily fall back to ifdef ladder
34
*/
35
-#if defined(TARGET_HPPA) || \
36
- defined(TARGET_LOONGARCH)
37
- /*
38
- * For LoongArch systems that conform to IEEE754-2008, the (inf,zero,nan)
39
- * case sets InvalidOp and returns the input value 'c'
40
- */
41
+#if defined(TARGET_HPPA)
42
rule = float_infzeronan_dnan_never;
43
#endif
44
}
45
--
46
2.34.1
diff view generated by jsdifflib
New patch
1
Set the FloatInfZeroNaNRule explicitly for the HPPA target,
2
so we can remove the ifdef from pickNaNMulAdd().
1
3
4
As this is the last target to be converted to explicitly setting
5
the rule, we can remove the fallback code in pickNaNMulAdd()
6
entirely.
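
One consequence worth spelling out (illustrative, not part of the patch): a
new target that uses muladd outside default-NaN mode must now pick a rule in
its fp_status setup, along the lines of

    /* in the target's CPU reset or realize path */
    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);

otherwise pickNaNMulAdd() hits g_assert_not_reached() the first time it sees
0 * Inf + NaN, since float_infzeronan_none now falls through to the default
case of the switch.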
7
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20241202131347.498124-14-peter.maydell@linaro.org
11
---
12
target/hppa/fpu_helper.c | 2 ++
13
fpu/softfloat-specialize.c.inc | 13 +------------
14
2 files changed, 3 insertions(+), 12 deletions(-)
15
16
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/target/hppa/fpu_helper.c
19
+++ b/target/hppa/fpu_helper.c
20
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
21
* HPPA does note implement a CPU reset method at all...
22
*/
23
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
24
+ /* For inf * 0 + NaN, return the input NaN */
25
+ set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
26
}
27
28
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
29
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
30
index XXXXXXX..XXXXXXX 100644
31
--- a/fpu/softfloat-specialize.c.inc
32
+++ b/fpu/softfloat-specialize.c.inc
33
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
34
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
35
bool infzero, float_status *status)
36
{
37
- FloatInfZeroNaNRule rule = status->float_infzeronan_rule;
38
-
39
/*
40
* We guarantee not to require the target to tell us how to
41
* pick a NaN if we're always returning the default NaN.
42
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
43
*/
44
assert(!status->default_nan_mode);
45
46
- if (rule == float_infzeronan_none) {
47
- /*
48
- * Temporarily fall back to ifdef ladder
49
- */
50
-#if defined(TARGET_HPPA)
51
- rule = float_infzeronan_dnan_never;
52
-#endif
53
- }
54
-
55
if (infzero) {
56
/*
57
* Inf * 0 + NaN -- some implementations return the default NaN here,
58
* and some return the input NaN.
59
*/
60
- switch (rule) {
61
+ switch (status->float_infzeronan_rule) {
62
case float_infzeronan_dnan_never:
63
return 2;
64
case float_infzeronan_dnan_always:
65
--
66
2.34.1
diff view generated by jsdifflib
New patch
1
The new implementation of pickNaNMulAdd() will find it convenient
2
to know whether at least one of the three arguments to the muladd
3
was a signaling NaN. We already calculate that in the caller,
4
so pass it in as a new bool have_snan.
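
For context (paraphrasing the later patch in this series, not new behaviour
added here), the flag lets the data-driven rule decide whether the "prefer
SNaN" bit applies without re-scanning the operand classes:

    if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
        /* walk the preference order looking for a signalling NaN */
    } else {
        /* walk the preference order looking for any NaN */
    }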
1
5
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241202131347.498124-15-peter.maydell@linaro.org
9
---
10
fpu/softfloat-parts.c.inc | 5 +++--
11
fpu/softfloat-specialize.c.inc | 2 +-
12
2 files changed, 4 insertions(+), 3 deletions(-)
13
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
15
index XXXXXXX..XXXXXXX 100644
16
--- a/fpu/softfloat-parts.c.inc
17
+++ b/fpu/softfloat-parts.c.inc
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
19
{
20
int which;
21
bool infzero = (ab_mask == float_cmask_infzero);
22
+ bool have_snan = (abc_mask & float_cmask_snan);
23
24
- if (unlikely(abc_mask & float_cmask_snan)) {
25
+ if (unlikely(have_snan)) {
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
27
}
28
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
30
if (s->default_nan_mode) {
31
which = 3;
32
} else {
33
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, s);
34
+ which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
35
}
36
37
if (which == 3) {
38
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
39
index XXXXXXX..XXXXXXX 100644
40
--- a/fpu/softfloat-specialize.c.inc
41
+++ b/fpu/softfloat-specialize.c.inc
42
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
43
| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
44
*----------------------------------------------------------------------------*/
45
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
46
- bool infzero, float_status *status)
47
+ bool infzero, bool have_snan, float_status *status)
48
{
49
/*
50
* We guarantee not to require the target to tell us how to
51
--
52
2.34.1
diff view generated by jsdifflib
1
Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
1
IEEE 754 does not define a fixed rule for which NaN to pick as the
2
group to decodetree. We have already implemented the size==0b11
2
result if both operands of a 3-operand fused multiply-add operation
3
case of these insns; this commit handles the remaining sizes.
3
are NaNs. As a result different architectures have ended up with
4
different rules for propagating NaNs.
5
6
QEMU currently hardcodes the NaN propagation logic into the binary
7
because pickNaNMulAdd() has an ifdef ladder for different targets.
8
We want to make the propagation rule instead be selectable at
9
runtime, because:
10
* this will let us have multiple targets in one QEMU binary
11
* the Arm FEAT_AFP architectural feature includes letting
12
the guest select a NaN propagation rule at runtime
13
14
In this commit we add an enum for the propagation rule, the field in
15
float_status, and the corresponding getters and setters. We change
16
pickNaNMulAdd to honour this, but because all targets still leave
17
this field at its default 0 value, the fallback logic will pick the
18
rule type with the old ifdef ladder.
19
20
It's valid not to set a propagation rule if default_nan_mode is
21
enabled, because in that case there's no need to pick a NaN; all the
22
callers of pickNaNMulAdd() catch this case and skip calling it.
4
23
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
26
Message-id: 20241202131347.498124-16-peter.maydell@linaro.org
8
---
27
---
9
target/arm/neon-dp.decode | 30 ++++++++++++++++++-----
28
include/fpu/softfloat-helpers.h | 11 +++
10
target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
29
include/fpu/softfloat-types.h | 55 +++++++++++
11
target/arm/translate.c | 22 +++--------------
30
fpu/softfloat-specialize.c.inc | 167 ++++++++------------------------
12
3 files changed, 70 insertions(+), 25 deletions(-)
31
3 files changed, 107 insertions(+), 126 deletions(-)
13
32
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
33
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
15
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
35
--- a/include/fpu/softfloat-helpers.h
17
+++ b/target/arm/neon-dp.decode
36
+++ b/include/fpu/softfloat-helpers.h
18
@@ -XXX,XX +XXX,XX @@ VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
37
@@ -XXX,XX +XXX,XX @@ static inline void set_float_2nan_prop_rule(Float2NaNPropRule rule,
19
@3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \
38
status->float_2nan_prop_rule = rule;
20
&3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
39
}
21
40
22
-VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
41
+static inline void set_float_3nan_prop_rule(Float3NaNPropRule rule,
23
-VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
42
+ float_status *status)
24
-VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
25
-VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
26
-VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
27
-VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
28
+{
43
+{
29
+ VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
44
+ status->float_3nan_prop_rule = rule;
30
+ VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
31
+}
45
+}
46
+
47
static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
48
float_status *status)
49
{
50
@@ -XXX,XX +XXX,XX @@ static inline Float2NaNPropRule get_float_2nan_prop_rule(float_status *status)
51
return status->float_2nan_prop_rule;
52
}
53
54
+static inline Float3NaNPropRule get_float_3nan_prop_rule(float_status *status)
32
+{
55
+{
33
+ VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
56
+ return status->float_3nan_prop_rule;
34
+ VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
35
+}
57
+}
36
+{
58
+
37
+ VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
59
static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status)
38
+ VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
60
{
39
+}
61
return status->float_infzeronan_rule;
40
+{
62
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
41
+ VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
42
+ VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
43
+}
44
+{
45
+ VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
46
+ VQRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
47
+}
48
+{
49
+ VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
50
+ VQRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
51
+}
52
53
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
54
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
55
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
56
index XXXXXXX..XXXXXXX 100644
63
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/translate-neon.inc.c
64
--- a/include/fpu/softfloat-types.h
58
+++ b/target/arm/translate-neon.inc.c
65
+++ b/include/fpu/softfloat-types.h
59
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
66
@@ -XXX,XX +XXX,XX @@ this code that are retained.
60
return do_3same(s, a, gen_##INSN##_3s); \
67
#ifndef SOFTFLOAT_TYPES_H
68
#define SOFTFLOAT_TYPES_H
69
70
+#include "hw/registerfields.h"
71
+
72
/*
73
* Software IEC/IEEE floating-point types.
74
*/
75
@@ -XXX,XX +XXX,XX @@ typedef enum __attribute__((__packed__)) {
76
float_2nan_prop_x87,
77
} Float2NaNPropRule;
78
79
+/*
80
+ * 3-input NaN propagation rule, for fused multiply-add. Individual
81
+ * architectures have different rules for which input NaN is
82
+ * propagated to the output when there is more than one NaN on the
83
+ * input.
84
+ *
85
+ * If default_nan_mode is enabled then it is valid not to set a NaN
86
+ * propagation rule, because the softfloat code guarantees not to try
87
+ * to pick a NaN to propagate in default NaN mode. When not in
88
+ * default-NaN mode, it is an error for the target not to set the rule
89
+ * in float_status if it uses a muladd, and we will assert if we need
90
+ * to handle an input NaN and no rule was selected.
91
+ *
92
+ * The naming scheme for Float3NaNPropRule values is:
93
+ * float_3nan_prop_s_abc:
94
+ * = "Prefer SNaN over QNaN, then operand A over B over C"
95
+ * float_3nan_prop_abc:
96
+ * = "Prefer A over B over C regardless of SNaN vs QNAN"
97
+ *
98
+ * For QEMU, the multiply-add operation is A * B + C.
99
+ */
100
+
101
+/*
102
+ * We set the Float3NaNPropRule enum values up so we can select the
103
+ * right value in pickNaNMulAdd in a data driven way.
104
+ */
105
+FIELD(3NAN, 1ST, 0, 2) /* which operand is most preferred ? */
106
+FIELD(3NAN, 2ND, 2, 2) /* which operand is next most preferred ? */
107
+FIELD(3NAN, 3RD, 4, 2) /* which operand is least preferred ? */
108
+FIELD(3NAN, SNAN, 6, 1) /* do we prefer SNaN over QNaN ? */
109
+
110
+#define PROPRULE(X, Y, Z) \
111
+ ((X << R_3NAN_1ST_SHIFT) | (Y << R_3NAN_2ND_SHIFT) | (Z << R_3NAN_3RD_SHIFT))
112
+
113
+typedef enum __attribute__((__packed__)) {
114
+ float_3nan_prop_none = 0, /* No propagation rule specified */
115
+ float_3nan_prop_abc = PROPRULE(0, 1, 2),
116
+ float_3nan_prop_acb = PROPRULE(0, 2, 1),
117
+ float_3nan_prop_bac = PROPRULE(1, 0, 2),
118
+ float_3nan_prop_bca = PROPRULE(1, 2, 0),
119
+ float_3nan_prop_cab = PROPRULE(2, 0, 1),
120
+ float_3nan_prop_cba = PROPRULE(2, 1, 0),
121
+ float_3nan_prop_s_abc = float_3nan_prop_abc | R_3NAN_SNAN_MASK,
122
+ float_3nan_prop_s_acb = float_3nan_prop_acb | R_3NAN_SNAN_MASK,
123
+ float_3nan_prop_s_bac = float_3nan_prop_bac | R_3NAN_SNAN_MASK,
124
+ float_3nan_prop_s_bca = float_3nan_prop_bca | R_3NAN_SNAN_MASK,
125
+ float_3nan_prop_s_cab = float_3nan_prop_cab | R_3NAN_SNAN_MASK,
126
+ float_3nan_prop_s_cba = float_3nan_prop_cba | R_3NAN_SNAN_MASK,
127
+} Float3NaNPropRule;
128
+
129
+#undef PROPRULE
130
+
131
/*
132
* Rule for result of fused multiply-add 0 * Inf + NaN.
133
* This must be a NaN, but implementations differ on whether this
134
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
135
FloatRoundMode float_rounding_mode;
136
FloatX80RoundPrec floatx80_rounding_precision;
137
Float2NaNPropRule float_2nan_prop_rule;
138
+ Float3NaNPropRule float_3nan_prop_rule;
139
FloatInfZeroNaNRule float_infzeronan_rule;
140
bool tininess_before_rounding;
141
/* should denormalised results go to zero and set the inexact flag? */
142
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
143
index XXXXXXX..XXXXXXX 100644
144
--- a/fpu/softfloat-specialize.c.inc
145
+++ b/fpu/softfloat-specialize.c.inc
146
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
147
static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
148
bool infzero, bool have_snan, float_status *status)
149
{
150
+ FloatClass cls[3] = { a_cls, b_cls, c_cls };
151
+ Float3NaNPropRule rule = status->float_3nan_prop_rule;
152
+ int which;
153
+
154
/*
155
* We guarantee not to require the target to tell us how to
156
* pick a NaN if we're always returning the default NaN.
157
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
158
}
61
}
159
}
62
160
63
+/*
161
+ if (rule == float_3nan_prop_none) {
64
+ * Some helper functions need to be passed the cpu_env. In order
162
#if defined(TARGET_ARM)
65
+ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
163
-
66
+ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
164
- /* This looks different from the ARM ARM pseudocode, because the ARM ARM
67
+ * and which call a NeonGenTwoOpEnvFn().
165
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b.
68
+ */
166
- */
69
+#define WRAP_ENV_FN(WRAPNAME, FUNC) \
167
- if (is_snan(c_cls)) {
70
+ static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m) \
168
- return 2;
71
+ { \
169
- } else if (is_snan(a_cls)) {
72
+ FUNC(d, cpu_env, n, m); \
170
- return 0;
171
- } else if (is_snan(b_cls)) {
172
- return 1;
173
- } else if (is_qnan(c_cls)) {
174
- return 2;
175
- } else if (is_qnan(a_cls)) {
176
- return 0;
177
- } else {
178
- return 1;
179
- }
180
+ /*
181
+ * This looks different from the ARM ARM pseudocode, because the ARM ARM
182
+ * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
183
+ */
184
+ rule = float_3nan_prop_s_cab;
185
#elif defined(TARGET_MIPS)
186
- if (snan_bit_is_one(status)) {
187
- /* Prefer sNaN over qNaN, in the a, b, c order. */
188
- if (is_snan(a_cls)) {
189
- return 0;
190
- } else if (is_snan(b_cls)) {
191
- return 1;
192
- } else if (is_snan(c_cls)) {
193
- return 2;
194
- } else if (is_qnan(a_cls)) {
195
- return 0;
196
- } else if (is_qnan(b_cls)) {
197
- return 1;
198
+ if (snan_bit_is_one(status)) {
199
+ rule = float_3nan_prop_s_abc;
200
} else {
201
- return 2;
202
+ rule = float_3nan_prop_s_cab;
203
}
204
- } else {
205
- /* Prefer sNaN over qNaN, in the c, a, b order. */
206
- if (is_snan(c_cls)) {
207
- return 2;
208
- } else if (is_snan(a_cls)) {
209
- return 0;
210
- } else if (is_snan(b_cls)) {
211
- return 1;
212
- } else if (is_qnan(c_cls)) {
213
- return 2;
214
- } else if (is_qnan(a_cls)) {
215
- return 0;
216
- } else {
217
- return 1;
218
- }
219
- }
220
#elif defined(TARGET_LOONGARCH64)
221
- /* Prefer sNaN over qNaN, in the c, a, b order. */
222
- if (is_snan(c_cls)) {
223
- return 2;
224
- } else if (is_snan(a_cls)) {
225
- return 0;
226
- } else if (is_snan(b_cls)) {
227
- return 1;
228
- } else if (is_qnan(c_cls)) {
229
- return 2;
230
- } else if (is_qnan(a_cls)) {
231
- return 0;
232
- } else {
233
- return 1;
234
- }
235
+ rule = float_3nan_prop_s_cab;
236
#elif defined(TARGET_PPC)
237
- /* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
238
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
239
- */
240
- if (is_nan(a_cls)) {
241
- return 0;
242
- } else if (is_nan(c_cls)) {
243
- return 2;
244
- } else {
245
- return 1;
246
- }
247
+ /*
248
+ * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
249
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
250
+ */
251
+ rule = float_3nan_prop_acb;
252
#elif defined(TARGET_S390X)
253
- if (is_snan(a_cls)) {
254
- return 0;
255
- } else if (is_snan(b_cls)) {
256
- return 1;
257
- } else if (is_snan(c_cls)) {
258
- return 2;
259
- } else if (is_qnan(a_cls)) {
260
- return 0;
261
- } else if (is_qnan(b_cls)) {
262
- return 1;
263
- } else {
264
- return 2;
265
- }
266
+ rule = float_3nan_prop_s_abc;
267
#elif defined(TARGET_SPARC)
268
- /* Prefer SNaN over QNaN, order C, B, A. */
269
- if (is_snan(c_cls)) {
270
- return 2;
271
- } else if (is_snan(b_cls)) {
272
- return 1;
273
- } else if (is_snan(a_cls)) {
274
- return 0;
275
- } else if (is_qnan(c_cls)) {
276
- return 2;
277
- } else if (is_qnan(b_cls)) {
278
- return 1;
279
- } else {
280
- return 0;
281
- }
282
+ rule = float_3nan_prop_s_cba;
283
#elif defined(TARGET_XTENSA)
284
- /*
285
- * For Xtensa, the (inf,zero,nan) case sets InvalidOp and returns
286
- * an input NaN if we have one (ie c).
287
- */
288
- if (status->use_first_nan) {
289
- if (is_nan(a_cls)) {
290
- return 0;
291
- } else if (is_nan(b_cls)) {
292
- return 1;
293
+ if (status->use_first_nan) {
294
+ rule = float_3nan_prop_abc;
295
} else {
296
- return 2;
297
+ rule = float_3nan_prop_cba;
298
}
299
- } else {
300
- if (is_nan(c_cls)) {
301
- return 2;
302
- } else if (is_nan(b_cls)) {
303
- return 1;
304
- } else {
305
- return 0;
306
- }
307
- }
308
#else
309
- /* A default implementation: prefer a to b to c.
310
- * This is unlikely to actually match any real implementation.
311
- */
312
- if (is_nan(a_cls)) {
313
- return 0;
314
- } else if (is_nan(b_cls)) {
315
- return 1;
316
- } else {
317
- return 2;
318
- }
319
+ rule = float_3nan_prop_abc;
320
#endif
73
+ }
321
+ }
74
+
322
+
75
+#define DO_3SAME_32_ENV(INSN, FUNC) \
323
+ assert(rule != float_3nan_prop_none);
76
+ WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8); \
324
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
77
+ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16); \
325
+ /* We have at least one SNaN input and should prefer it */
78
+ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32); \
326
+ do {
79
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
327
+ which = rule & R_3NAN_1ST_MASK;
80
+ uint32_t rn_ofs, uint32_t rm_ofs, \
328
+ rule >>= R_3NAN_1ST_LENGTH;
81
+ uint32_t oprsz, uint32_t maxsz) \
329
+ } while (!is_snan(cls[which]));
82
+ { \
330
+ } else {
83
+ static const GVecGen3 ops[4] = { \
331
+ do {
84
+ { .fni4 = gen_##INSN##_tramp8 }, \
332
+ which = rule & R_3NAN_1ST_MASK;
85
+ { .fni4 = gen_##INSN##_tramp16 }, \
333
+ rule >>= R_3NAN_1ST_LENGTH;
86
+ { .fni4 = gen_##INSN##_tramp32 }, \
334
+ } while (!is_nan(cls[which]));
87
+ { 0 }, \
88
+ }; \
89
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
90
+ } \
91
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
92
+ { \
93
+ if (a->size > 2) { \
94
+ return false; \
95
+ } \
96
+ return do_3same(s, a, gen_##INSN##_3s); \
97
+ }
335
+ }
98
+
336
+ return which;
99
DO_3SAME_32(VHADD_S, hadd_s)
337
}
100
DO_3SAME_32(VHADD_U, hadd_u)
338
101
DO_3SAME_32(VHSUB_S, hsub_s)
339
/*----------------------------------------------------------------------------
102
DO_3SAME_32(VHSUB_U, hsub_u)
103
DO_3SAME_32(VRHADD_S, rhadd_s)
104
DO_3SAME_32(VRHADD_U, rhadd_u)
105
+DO_3SAME_32(VRSHL_S, rshl_s)
106
+DO_3SAME_32(VRSHL_U, rshl_u)
107
+
108
+DO_3SAME_32_ENV(VQSHL_S, qshl_s)
109
+DO_3SAME_32_ENV(VQSHL_U, qshl_u)
110
+DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
111
+DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
112
diff --git a/target/arm/translate.c b/target/arm/translate.c
113
index XXXXXXX..XXXXXXX 100644
114
--- a/target/arm/translate.c
115
+++ b/target/arm/translate.c
116
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
117
case NEON_3R_VHSUB:
118
case NEON_3R_VABD:
119
case NEON_3R_VABA:
120
+ case NEON_3R_VQSHL:
121
+ case NEON_3R_VRSHL:
122
+ case NEON_3R_VQRSHL:
123
/* Already handled by decodetree */
124
return 1;
125
}
126
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
127
}
128
pairwise = 0;
129
switch (op) {
130
- case NEON_3R_VQSHL:
131
- case NEON_3R_VRSHL:
132
- case NEON_3R_VQRSHL:
133
- {
134
- int rtmp;
135
- /* Shift instruction operands are reversed. */
136
- rtmp = rn;
137
- rn = rm;
138
- rm = rtmp;
139
- }
140
- break;
141
case NEON_3R_VPADD_VQRDMLAH:
142
case NEON_3R_VPMAX:
143
case NEON_3R_VPMIN:
144
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
145
tmp2 = neon_load_reg(rm, pass);
146
}
147
switch (op) {
148
- case NEON_3R_VQSHL:
149
- GEN_NEON_INTEGER_OP_ENV(qshl);
150
- break;
151
- case NEON_3R_VRSHL:
152
- GEN_NEON_INTEGER_OP(rshl);
153
- break;
154
- case NEON_3R_VQRSHL:
155
- GEN_NEON_INTEGER_OP_ENV(qrshl);
156
break;
157
case NEON_3R_VPMAX:
158
GEN_NEON_INTEGER_OP(pmax);
159
--
340
--
160
2.20.1
341
2.34.1
161
162
diff view generated by jsdifflib
New patch
1
Explicitly set a rule in the softfloat tests for propagating NaNs in
2
the muladd case. In meson.build we put -DTARGET_ARM in fpcflags, and
3
so we should select here the Arm rule of float_3nan_prop_s_cab.
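
As a reminder of what float_3nan_prop_s_cab means for muladd(a, b, c)
(illustrative only): SNaNs win first, then the operands are preferred in
c, a, b order:

    /*
     * a        b        c        -> NaN propagated
     * QNaN_a   QNaN_b   QNaN_c   -> c   (first in the c, a, b order)
     * QNaN_a   2.0      QNaN_c   -> c
     * SNaN_a   QNaN_b   QNaN_c   -> a   (an SNaN outranks the order)
     */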
1
4
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20241202131347.498124-17-peter.maydell@linaro.org
8
---
9
tests/fp/fp-bench.c | 1 +
10
tests/fp/fp-test.c | 1 +
11
2 files changed, 2 insertions(+)
12
13
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tests/fp/fp-bench.c
16
+++ b/tests/fp/fp-bench.c
17
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
18
* doesn't specify match those used by the Arm architecture.
19
*/
20
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
22
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
23
24
f = bench_funcs[operation][precision];
25
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/tests/fp/fp-test.c
28
+++ b/tests/fp/fp-test.c
29
@@ -XXX,XX +XXX,XX @@ void run_test(void)
30
* doesn't specify match those used by the Arm architecture.
31
*/
32
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
33
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
34
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
35
36
genCases_setLevel(test_level);
37
--
38
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for Arm, and remove the
2
ifdef from pickNaNMulAdd().
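
A hand-worked example of how this value is built from the FIELD()/PROPRULE()
encoding introduced earlier in the series (illustration, not part of the
patch):

    /*
     * float_3nan_prop_s_cab:
     *   3NAN.1ST  = 2 (operand c)    bits [1:0]
     *   3NAN.2ND  = 0 (operand a)    bits [3:2]
     *   3NAN.3RD  = 1 (operand b)    bits [5:4]
     *   3NAN.SNAN = 1 (SNaN first)   bit 6
     * i.e. (2 << 0) | (0 << 2) | (1 << 4) | (1 << 6) == 0x52.
     * pickNaNMulAdd() then peels two bits at a time off the bottom of
     * the rule until cls[which] is a NaN of the wanted kind.
     */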
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-18-peter.maydell@linaro.org
7
---
8
target/arm/cpu.c | 5 +++++
9
fpu/softfloat-specialize.c.inc | 8 +-------
10
2 files changed, 6 insertions(+), 7 deletions(-)
11
12
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.c
15
+++ b/target/arm/cpu.c
16
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
17
* * tininess-before-rounding
18
* * 2-input NaN propagation prefers SNaN over QNaN, and then
19
* operand A over operand B (see FPProcessNaNs() pseudocode)
20
+ * * 3-input NaN propagation prefers SNaN over QNaN, and then
21
+ * operand C over A over B (see FPProcessNaNs3() pseudocode,
22
+ * but note that for QEMU muladd is a * b + c, whereas for
23
+ * the pseudocode function the arguments are in the order c, a, b.
24
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
25
* and the input NaN if it is signalling
26
*/
27
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
28
{
29
set_float_detect_tininess(float_tininess_before_rounding, s);
30
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
31
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
32
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
33
}
34
35
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
36
index XXXXXXX..XXXXXXX 100644
37
--- a/fpu/softfloat-specialize.c.inc
38
+++ b/fpu/softfloat-specialize.c.inc
39
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
40
}
41
42
if (rule == float_3nan_prop_none) {
43
-#if defined(TARGET_ARM)
44
- /*
45
- * This looks different from the ARM ARM pseudocode, because the ARM ARM
46
- * puts the operands to a fused mac operation (a*b)+c in the order c,a,b
47
- */
48
- rule = float_3nan_prop_s_cab;
49
-#elif defined(TARGET_MIPS)
50
+#if defined(TARGET_MIPS)
51
if (snan_bit_is_one(status)) {
52
rule = float_3nan_prop_s_abc;
53
} else {
54
--
55
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for loongarch, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-19-peter.maydell@linaro.org
7
---
8
target/loongarch/tcg/fpu_helper.c | 1 +
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 1 insertion(+), 2 deletions(-)
11
12
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/loongarch/tcg/fpu_helper.c
15
+++ b/target/loongarch/tcg/fpu_helper.c
16
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
17
* case sets InvalidOp and returns the input value 'c'
18
*/
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
21
}
22
23
int ieee_ex_to_loongarch(int xcpt)
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
25
index XXXXXXX..XXXXXXX 100644
26
--- a/fpu/softfloat-specialize.c.inc
27
+++ b/fpu/softfloat-specialize.c.inc
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
29
} else {
30
rule = float_3nan_prop_s_cab;
31
}
32
-#elif defined(TARGET_LOONGARCH64)
33
- rule = float_3nan_prop_s_cab;
34
#elif defined(TARGET_PPC)
35
/*
36
* If fRA is a NaN return it; otherwise if fRB is a NaN return it;
37
--
38
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for PPC, and remove the
2
ifdef from pickNaNMulAdd().
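
Spelling the mapping out (illustration only):

    /*
     * PPC operand    QEMU muladd operand
     *   fRA            a
     *   fRC            b
     *   fRB            c
     * PPC's preference fRA, then fRB, then fRC therefore becomes
     * a, then c, then b, i.e. float_3nan_prop_acb.
     */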
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-20-peter.maydell@linaro.org
7
---
8
target/ppc/cpu_init.c | 8 ++++++++
9
fpu/softfloat-specialize.c.inc | 6 ------
10
2 files changed, 8 insertions(+), 6 deletions(-)
11
12
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/ppc/cpu_init.c
15
+++ b/target/ppc/cpu_init.c
16
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
17
*/
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
19
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->vec_status);
20
+ /*
21
+ * NaN propagation for fused multiply-add:
22
+ * if fRA is a NaN return it; otherwise if fRB is a NaN return it;
23
+ * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
24
+ * whereas QEMU labels the operands as (a * b) + c.
25
+ */
26
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->fp_status);
27
+ set_float_3nan_prop_rule(float_3nan_prop_acb, &env->vec_status);
28
/*
29
* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer
30
* to return an input NaN if we have one (ie c) rather than generating
31
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
32
index XXXXXXX..XXXXXXX 100644
33
--- a/fpu/softfloat-specialize.c.inc
34
+++ b/fpu/softfloat-specialize.c.inc
35
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
36
} else {
37
rule = float_3nan_prop_s_cab;
38
}
39
-#elif defined(TARGET_PPC)
40
- /*
41
- * If fRA is a NaN return it; otherwise if fRB is a NaN return it;
42
- * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB
43
- */
44
- rule = float_3nan_prop_acb;
45
#elif defined(TARGET_S390X)
46
rule = float_3nan_prop_s_abc;
47
#elif defined(TARGET_SPARC)
48
--
49
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for s390x, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-21-peter.maydell@linaro.org
7
---
8
target/s390x/cpu.c | 1 +
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 1 insertion(+), 2 deletions(-)
11
12
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/s390x/cpu.c
15
+++ b/target/s390x/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
17
set_float_detect_tininess(float_tininess_before_rounding,
18
&env->fpu_status);
19
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fpu_status);
20
+ set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
21
set_float_infzeronan_rule(float_infzeronan_dnan_always,
22
&env->fpu_status);
23
/* fall through */
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
25
index XXXXXXX..XXXXXXX 100644
26
--- a/fpu/softfloat-specialize.c.inc
27
+++ b/fpu/softfloat-specialize.c.inc
28
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
29
} else {
30
rule = float_3nan_prop_s_cab;
31
}
32
-#elif defined(TARGET_S390X)
33
- rule = float_3nan_prop_s_abc;
34
#elif defined(TARGET_SPARC)
35
rule = float_3nan_prop_s_cba;
36
#elif defined(TARGET_XTENSA)
37
--
38
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for SPARC, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-22-peter.maydell@linaro.org
7
---
8
target/sparc/cpu.c | 2 ++
9
fpu/softfloat-specialize.c.inc | 2 --
10
2 files changed, 2 insertions(+), 2 deletions(-)
11
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/sparc/cpu.c
15
+++ b/target/sparc/cpu.c
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
17
* the CPU state struct so it won't get zeroed on reset.
18
*/
19
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &env->fp_status);
20
+ /* For fused-multiply add, prefer SNaN over QNaN, then C->B->A */
21
+ set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
22
/* For inf * 0 + NaN, return the input NaN */
23
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
26
index XXXXXXX..XXXXXXX 100644
27
--- a/fpu/softfloat-specialize.c.inc
28
+++ b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
30
} else {
31
rule = float_3nan_prop_s_cab;
32
}
33
-#elif defined(TARGET_SPARC)
34
- rule = float_3nan_prop_s_cba;
35
#elif defined(TARGET_XTENSA)
36
if (status->use_first_nan) {
37
rule = float_3nan_prop_abc;
38
--
39
2.34.1
diff view generated by jsdifflib
New patch
1
Set the Float3NaNPropRule explicitly for MIPS, and remove the
2
ifdef from pickNaNMulAdd().
1
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20241202131347.498124-23-peter.maydell@linaro.org
7
---
8
target/mips/fpu_helper.h | 4 ++++
9
target/mips/msa.c | 3 +++
10
fpu/softfloat-specialize.c.inc | 8 +-------
11
3 files changed, 8 insertions(+), 7 deletions(-)
12
13
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/mips/fpu_helper.h
16
+++ b/target/mips/fpu_helper.h
17
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
18
{
19
bool nan2008 = env->active_fpu.fcr31 & (1 << FCR31_NAN2008);
20
FloatInfZeroNaNRule izn_rule;
21
+ Float3NaNPropRule nan3_rule;
22
23
/*
24
* With nan2008, SNaNs are silenced in the usual way.
25
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
26
*/
27
izn_rule = nan2008 ? float_infzeronan_dnan_never : float_infzeronan_dnan_always;
28
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
29
+ nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
30
+ set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
31
+
32
}
33
34
static inline void restore_fp_status(CPUMIPSState *env)
35
diff --git a/target/mips/msa.c b/target/mips/msa.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/target/mips/msa.c
38
+++ b/target/mips/msa.c
39
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
40
set_float_2nan_prop_rule(float_2nan_prop_s_ab,
41
&env->active_tc.msa_fp_status);
42
43
+ set_float_3nan_prop_rule(float_3nan_prop_s_cab,
44
+ &env->active_tc.msa_fp_status);
45
+
46
/* clear float_status exception flags */
47
set_float_exception_flags(0, &env->active_tc.msa_fp_status);
48
49
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
50
index XXXXXXX..XXXXXXX 100644
51
--- a/fpu/softfloat-specialize.c.inc
52
+++ b/fpu/softfloat-specialize.c.inc
53
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
54
}
55
56
if (rule == float_3nan_prop_none) {
57
-#if defined(TARGET_MIPS)
58
- if (snan_bit_is_one(status)) {
59
- rule = float_3nan_prop_s_abc;
60
- } else {
61
- rule = float_3nan_prop_s_cab;
62
- }
63
-#elif defined(TARGET_XTENSA)
64
+#if defined(TARGET_XTENSA)
65
if (status->use_first_nan) {
66
rule = float_3nan_prop_abc;
67
} else {
68
--
69
2.34.1
1
Convert the Neon integer 3-reg-same compare insns VCGE, VCGT,
1
Set the Float3NaNPropRule explicitly for xtensa, and remove the
2
VCEQ, VACGE and VACGT to decodetree.
2
ifdef from pickNaNMulAdd().
3
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-24-peter.maydell@linaro.org
7
---
7
---
8
target/arm/neon-dp.decode | 5 +++++
8
target/xtensa/fpu_helper.c | 2 ++
9
target/arm/translate-neon.inc.c | 6 +++++
9
fpu/softfloat-specialize.c.inc | 8 --------
10
target/arm/translate.c | 39 ++-------------------------------
10
2 files changed, 2 insertions(+), 8 deletions(-)
11
3 files changed, 13 insertions(+), 37 deletions(-)
12
11
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
14
--- a/target/xtensa/fpu_helper.c
16
+++ b/target/arm/neon-dp.decode
15
+++ b/target/xtensa/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
16
@@ -XXX,XX +XXX,XX @@ void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
18
VMLA_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
17
set_use_first_nan(use_first, &env->fp_status);
19
VMLS_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
18
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
20
VMUL_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
19
&env->fp_status);
21
+VCEQ_fp_3s 1111 001 0 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
20
+ set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
22
+VCGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
21
+ &env->fp_status);
23
+VACGE_fp_3s 1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
22
}
24
+VCGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
23
25
+VACGT_fp_3s 1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
24
void HELPER(wur_fpu2k_fcr)(CPUXtensaState *env, uint32_t v)
26
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
27
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
29
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
27
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/translate-neon.inc.c
28
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
29
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
33
return do_3same_fp(s, a, FUNC, READS_VD); \
34
}
30
}
35
31
36
+DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
32
if (rule == float_3nan_prop_none) {
37
+DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
33
-#if defined(TARGET_XTENSA)
38
+DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
34
- if (status->use_first_nan) {
39
+DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
35
- rule = float_3nan_prop_abc;
40
+DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
36
- } else {
41
+
37
- rule = float_3nan_prop_cba;
42
static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
43
TCGv_ptr fpstatus)
44
{
45
diff --git a/target/arm/translate.c b/target/arm/translate.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.c
48
+++ b/target/arm/translate.c
49
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
50
case NEON_3R_VQDMULH_VQRDMULH:
51
case NEON_3R_FLOAT_ARITH:
52
case NEON_3R_FLOAT_MULTIPLY:
53
+ case NEON_3R_FLOAT_CMP:
54
+ case NEON_3R_FLOAT_ACMP:
55
/* Already handled by decodetree */
56
return 1;
57
}
58
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
59
return 1; /* VPMIN/VPMAX handled by decodetree */
60
}
61
break;
62
- case NEON_3R_FLOAT_CMP:
63
- if (!u && size) {
64
- /* no encoding for U=0 C=1x */
65
- return 1;
66
- }
67
- break;
68
- case NEON_3R_FLOAT_ACMP:
69
- if (!u) {
70
- return 1;
71
- }
72
- break;
73
case NEON_3R_FLOAT_MISC:
74
/* VMAXNM/VMINNM in ARMv8 */
75
if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
76
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
77
tmp = neon_load_reg(rn, pass);
78
tmp2 = neon_load_reg(rm, pass);
79
switch (op) {
80
- case NEON_3R_FLOAT_CMP:
81
- {
82
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
83
- if (!u) {
84
- gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
85
- } else {
86
- if (size == 0) {
87
- gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
88
- } else {
89
- gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
90
- }
91
- }
92
- tcg_temp_free_ptr(fpstatus);
93
- break;
94
- }
38
- }
95
- case NEON_3R_FLOAT_ACMP:
39
-#else
96
- {
40
rule = float_3nan_prop_abc;
97
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
41
-#endif
98
- if (size == 0) {
42
}
99
- gen_helper_neon_acge_f32(tmp, tmp, tmp2, fpstatus);
43
100
- } else {
44
assert(rule != float_3nan_prop_none);
101
- gen_helper_neon_acgt_f32(tmp, tmp, tmp2, fpstatus);
102
- }
103
- tcg_temp_free_ptr(fpstatus);
104
- break;
105
- }
106
case NEON_3R_FLOAT_MINMAX:
107
{
108
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
109
--
45
--
110
2.20.1
46
2.34.1
111
112
New patch
1
Set the Float3NaNPropRule explicitly for i386. We had no
2
i386-specific behaviour in the old ifdef ladder, so we were using the
3
default "prefer a then b then c" fallback; this is actually the
4
correct per-the-spec handling for i386.
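(Concretely, with float_3nan_prop_abc a muladd whose a, b and c inputs
are all NaNs simply propagates a's NaN; there is no special preference
for a signalling NaN over a quiet one, which matches the existing
comment in cpu_init_fp_statuses() quoted in the diff context below.)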
1
5
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20241202131347.498124-25-peter.maydell@linaro.org
9
---
10
target/i386/tcg/fpu_helper.c | 1 +
11
1 file changed, 1 insertion(+)
12
13
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/i386/tcg/fpu_helper.c
16
+++ b/target/i386/tcg/fpu_helper.c
17
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
18
* there are multiple input NaNs they are selected in the order a, b, c.
19
*/
20
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
21
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
22
}
23
24
static inline uint8_t save_exception_flags(CPUX86State *env)
25
--
26
2.34.1
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
Set the Float3NaNPropRule explicitly for HPPA, and remove the
2
ifdef from pickNaNMulAdd().
2
3
3
This patch builds the Hardware Error Source Table (HEST) via fw_cfg blobs.
4
HPPA is the only target that was using the default branch of the
4
Now it only supports ARMv8 SEA, a type of Generic Hardware Error
5
ifdef ladder (other targets either do not use muladd or set
5
Source version 2 (GHESv2) error source. Afterwards, we can extend
6
default_nan_mode), so we can remove the ifdef fallback entirely now
6
the supported types if needed. For the CPER section, currently it
7
(allowing the "rule not set" case to fall into the default of the
7
is the memory section, because the kernel mainly wants userspace to handle
8
switch statement and assert).
8
the memory errors.
9
9
10
This patch follows the spec ACPI 6.2 to build the Hardware Error
10
We add a TODO note that the HPPA rule is probably wrong; this is
11
Source table. For more detailed information, please refer to
11
not a behavioural change for this refactoring.
12
document: docs/specs/acpi_hest_ghes.rst
13
12
14
build_ghes_hw_error_notification() helper will help to add Hardware
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Error Notification to ACPI tables without using packed C structures
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
16
and avoid endianness issues as API doesn't need explicit conversion.
15
Message-id: 20241202131347.498124-26-peter.maydell@linaro.org
16
---
17
target/hppa/fpu_helper.c | 8 ++++++++
18
fpu/softfloat-specialize.c.inc | 4 ----
19
2 files changed, 8 insertions(+), 4 deletions(-)
17
20
18
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
21
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
19
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
20
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
21
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
22
Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
24
---
25
include/hw/acpi/ghes.h | 39 ++++++++++++
26
hw/acpi/ghes.c | 126 +++++++++++++++++++++++++++++++++++++++
27
hw/arm/virt-acpi-build.c | 2 +
28
3 files changed, 167 insertions(+)
29
30
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
31
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
32
--- a/include/hw/acpi/ghes.h
23
--- a/target/hppa/fpu_helper.c
33
+++ b/include/hw/acpi/ghes.h
24
+++ b/target/hppa/fpu_helper.c
34
@@ -XXX,XX +XXX,XX @@
25
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
35
26
* HPPA does note implement a CPU reset method at all...
36
#include "hw/acpi/bios-linker-loader.h"
27
*/
37
28
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &env->fp_status);
38
+/*
29
+ /*
39
+ * Values for Hardware Error Notification Type field
30
+ * TODO: The HPPA architecture reference only documents its NaN
40
+ */
31
+ * propagation rule for 2-operand operations. Testing on real hardware
41
+enum AcpiGhesNotifyType {
32
+ * might be necessary to confirm whether this order for muladd is correct.
42
+ /* Polled */
33
+ * Not preferring the SNaN is almost certainly incorrect as it diverges
43
+ ACPI_GHES_NOTIFY_POLLED = 0,
34
+ * from the documented rules for 2-operand operations.
44
+ /* External Interrupt */
35
+ */
45
+ ACPI_GHES_NOTIFY_EXTERNAL = 1,
36
+ set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
46
+ /* Local Interrupt */
37
/* For inf * 0 + NaN, return the input NaN */
47
+ ACPI_GHES_NOTIFY_LOCAL = 2,
38
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
48
+ /* SCI */
39
}
49
+ ACPI_GHES_NOTIFY_SCI = 3,
40
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
50
+ /* NMI */
51
+ ACPI_GHES_NOTIFY_NMI = 4,
52
+ /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
53
+ ACPI_GHES_NOTIFY_CMCI = 5,
54
+ /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
55
+ ACPI_GHES_NOTIFY_MCE = 6,
56
+ /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
57
+ ACPI_GHES_NOTIFY_GPIO = 7,
58
+ /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
59
+ ACPI_GHES_NOTIFY_SEA = 8,
60
+ /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
61
+ ACPI_GHES_NOTIFY_SEI = 9,
62
+ /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
63
+ ACPI_GHES_NOTIFY_GSIV = 10,
64
+ /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
65
+ ACPI_GHES_NOTIFY_SDEI = 11,
66
+ /* 12 and greater are reserved */
67
+ ACPI_GHES_NOTIFY_RESERVED = 12
68
+};
69
+
70
+enum {
71
+ ACPI_HEST_SRC_ID_SEA = 0,
72
+ /* future ids go here */
73
+ ACPI_HEST_SRC_ID_RESERVED,
74
+};
75
+
76
void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
77
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
78
#endif
79
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
80
index XXXXXXX..XXXXXXX 100644
41
index XXXXXXX..XXXXXXX 100644
81
--- a/hw/acpi/ghes.c
42
--- a/fpu/softfloat-specialize.c.inc
82
+++ b/hw/acpi/ghes.c
43
+++ b/fpu/softfloat-specialize.c.inc
83
@@ -XXX,XX +XXX,XX @@
44
@@ -XXX,XX +XXX,XX @@ static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
84
#include "qemu/units.h"
45
}
85
#include "hw/acpi/ghes.h"
86
#include "hw/acpi/aml-build.h"
87
+#include "qemu/error-report.h"
88
89
#define ACPI_GHES_ERRORS_FW_CFG_FILE "etc/hardware_errors"
90
#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE "etc/hardware_errors_addr"
91
@@ -XXX,XX +XXX,XX @@
92
/* Now only support ARMv8 SEA notification type error source */
93
#define ACPI_GHES_ERROR_SOURCE_COUNT 1
94
95
+/* Generic Hardware Error Source version 2 */
96
+#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2 10
97
+
98
+/* Address offset in Generic Address Structure(GAS) */
99
+#define GAS_ADDR_OFFSET 4
100
+
101
+/*
102
+ * Hardware Error Notification
103
+ * ACPI 4.0: 17.3.2.7 Hardware Error Notification
104
+ * Composes dummy Hardware Error Notification descriptor of specified type
105
+ */
106
+static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
107
+{
108
+ /* Type */
109
+ build_append_int_noprefix(table, type, 1);
110
+ /*
111
+ * Length:
112
+ * Total length of the structure in bytes
113
+ */
114
+ build_append_int_noprefix(table, 28, 1);
115
+ /* Configuration Write Enable */
116
+ build_append_int_noprefix(table, 0, 2);
117
+ /* Poll Interval */
118
+ build_append_int_noprefix(table, 0, 4);
119
+ /* Vector */
120
+ build_append_int_noprefix(table, 0, 4);
121
+ /* Switch To Polling Threshold Value */
122
+ build_append_int_noprefix(table, 0, 4);
123
+ /* Switch To Polling Threshold Window */
124
+ build_append_int_noprefix(table, 0, 4);
125
+ /* Error Threshold Value */
126
+ build_append_int_noprefix(table, 0, 4);
127
+ /* Error Threshold Window */
128
+ build_append_int_noprefix(table, 0, 4);
129
+}
130
+
131
/*
132
* Build table for the hardware error fw_cfg blob.
133
* Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
134
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
135
bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
136
0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
137
}
138
+
139
+/* Build Generic Hardware Error Source version 2 (GHESv2) */
140
+static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
141
+{
142
+ uint64_t address_offset;
143
+ /*
144
+ * Type:
145
+ * Generic Hardware Error Source version 2(GHESv2 - Type 10)
146
+ */
147
+ build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
148
+ /* Source Id */
149
+ build_append_int_noprefix(table_data, source_id, 2);
150
+ /* Related Source Id */
151
+ build_append_int_noprefix(table_data, 0xffff, 2);
152
+ /* Flags */
153
+ build_append_int_noprefix(table_data, 0, 1);
154
+ /* Enabled */
155
+ build_append_int_noprefix(table_data, 1, 1);
156
+
157
+ /* Number of Records To Pre-allocate */
158
+ build_append_int_noprefix(table_data, 1, 4);
159
+ /* Max Sections Per Record */
160
+ build_append_int_noprefix(table_data, 1, 4);
161
+ /* Max Raw Data Length */
162
+ build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
163
+
164
+ address_offset = table_data->len;
165
+ /* Error Status Address */
166
+ build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
167
+ 4 /* QWord access */, 0);
168
+ bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
169
+ address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
170
+ ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
171
+
172
+ switch (source_id) {
173
+ case ACPI_HEST_SRC_ID_SEA:
174
+ /*
175
+ * Notification Structure
176
+ * Now only enable ARMv8 SEA notification type
177
+ */
178
+ build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
179
+ break;
180
+ default:
181
+ error_report("Not support this error source");
182
+ abort();
183
+ }
184
+
185
+ /* Error Status Block Length */
186
+ build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
187
+
188
+ /*
189
+ * Read Ack Register
190
+ * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
191
+ * version 2 (GHESv2 - Type 10)
192
+ */
193
+ address_offset = table_data->len;
194
+ build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
195
+ 4 /* QWord access */, 0);
196
+ bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
197
+ address_offset + GAS_ADDR_OFFSET,
198
+ sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
199
+ (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
200
+
201
+ /*
202
+ * Read Ack Preserve field
203
+ * We only provide the first bit in Read Ack Register to OSPM to write
204
+ * while the other bits are preserved.
205
+ */
206
+ build_append_int_noprefix(table_data, ~0x1ULL, 8);
207
+ /* Read Ack Write */
208
+ build_append_int_noprefix(table_data, 0x1, 8);
209
+}
210
+
211
+/* Build Hardware Error Source Table */
212
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
213
+{
214
+ uint64_t hest_start = table_data->len;
215
+
216
+ /* Hardware Error Source Table header*/
217
+ acpi_data_push(table_data, sizeof(AcpiTableHeader));
218
+
219
+ /* Error Source Count */
220
+ build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
221
+
222
+ build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
223
+
224
+ build_header(linker, table_data, (void *)(table_data->data + hest_start),
225
+ "HEST", table_data->len - hest_start, 1, NULL, NULL);
226
+}
227
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
228
index XXXXXXX..XXXXXXX 100644
229
--- a/hw/arm/virt-acpi-build.c
230
+++ b/hw/arm/virt-acpi-build.c
231
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
232
233
if (vms->ras) {
234
build_ghes_error_table(tables->hardware_errors, tables->linker);
235
+ acpi_add_table(table_offsets, tables_blob);
236
+ acpi_build_hest(tables_blob, tables->linker);
237
}
46
}
238
47
239
if (ms->numa_state->num_nodes > 0) {
48
- if (rule == float_3nan_prop_none) {
49
- rule = float_3nan_prop_abc;
50
- }
51
-
52
assert(rule != float_3nan_prop_none);
53
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
54
/* We have at least one SNaN input and should prefer it */
240
--
55
--
241
2.20.1
56
2.34.1
242
243
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The use_first_nan field in float_status was an xtensa-specific way to
2
select at runtime from two different NaN propagation rules. Now that
3
xtensa is using the target-agnostic NaN propagation rule selection
4
that we've just added, we can remove use_first_nan, because there is
5
no longer any code that reads it.
2
6
3
Provide a functional interface for the vector expansion.
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
This fits better with the existing set of helpers that
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
we provide for other operations.
9
Message-id: 20241202131347.498124-27-peter.maydell@linaro.org
10
---
11
include/fpu/softfloat-helpers.h | 5 -----
12
include/fpu/softfloat-types.h | 1 -
13
target/xtensa/fpu_helper.c | 1 -
14
3 files changed, 7 deletions(-)
6
15
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
16
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
target/arm/translate.h | 10 ++-
13
target/arm/translate-a64.c | 18 ++--
14
target/arm/translate-neon.inc.c | 23 +----
15
target/arm/translate.c | 146 +++++++++++++++++---------------
16
4 files changed, 95 insertions(+), 102 deletions(-)
17
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate.h
18
--- a/include/fpu/softfloat-helpers.h
21
+++ b/target/arm/translate.h
19
+++ b/include/fpu/softfloat-helpers.h
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
20
@@ -XXX,XX +XXX,XX @@ static inline void set_snan_bit_is_one(bool val, float_status *status)
23
void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
21
status->snan_bit_is_one = val;
24
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
26
-extern const GVecGen3 cmtst_op[4];
27
-extern const GVecGen3 sshl_op[4];
28
-extern const GVecGen3 ushl_op[4];
29
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
30
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
31
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
32
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
33
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
34
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
35
+
36
extern const GVecGen4 uqadd_op[4];
37
extern const GVecGen4 sqadd_op[4];
38
extern const GVecGen4 uqsub_op[4];
39
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate-a64.c
42
+++ b/target/arm/translate-a64.c
43
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
44
is_q ? 16 : 8, vec_full_reg_size(s));
45
}
22
}
46
23
47
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
24
-static inline void set_use_first_nan(bool val, float_status *status)
48
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
49
- int rn, int rm, const GVecGen3 *gvec_op)
50
-{
25
-{
51
- tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
26
- status->use_first_nan = val;
52
- vec_full_reg_offset(s, rm), is_q ? 16 : 8,
53
- vec_full_reg_size(s), gvec_op);
54
-}
27
-}
55
-
28
-
56
/* Expand a 3-operand operation using an out-of-line helper. */
29
static inline void set_no_signaling_nans(bool val, float_status *status)
57
static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
30
{
58
int rn, int rm, int data, gen_helper_gvec_3 *fn)
31
status->no_signaling_nans = val;
59
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
32
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
60
(u ? uqsub_op : sqsub_op) + size);
61
return;
62
case 0x08: /* SSHL, USHL */
63
- gen_gvec_op3(s, is_q, rd, rn, rm,
64
- u ? &ushl_op[size] : &sshl_op[size]);
65
+ if (u) {
66
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
67
+ } else {
68
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
69
+ }
70
return;
71
case 0x0c: /* SMAX, UMAX */
72
if (u) {
73
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
74
return;
75
case 0x11:
76
if (!u) { /* CMTST */
77
- gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
78
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
79
return;
80
}
81
/* else CMEQ */
82
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
83
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
84
--- a/target/arm/translate-neon.inc.c
34
--- a/include/fpu/softfloat-types.h
85
+++ b/target/arm/translate-neon.inc.c
35
+++ b/include/fpu/softfloat-types.h
86
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
36
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
87
DO_3SAME(VORR, tcg_gen_gvec_or)
37
* softfloat-specialize.inc.c)
88
DO_3SAME(VORN, tcg_gen_gvec_orc)
38
*/
89
DO_3SAME(VEOR, tcg_gen_gvec_xor)
39
bool snan_bit_is_one;
90
+DO_3SAME(VSHL_S, gen_gvec_sshl)
40
- bool use_first_nan;
91
+DO_3SAME(VSHL_U, gen_gvec_ushl)
41
bool no_signaling_nans;
92
42
/* should overflowed results subtract re_bias to its exponent? */
93
/* These insns are all gvec_bitsel but with the inputs in various orders. */
43
bool rebias_overflow;
94
#define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
44
diff --git a/target/xtensa/fpu_helper.c b/target/xtensa/fpu_helper.c
95
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
96
DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
97
DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
98
DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
99
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
100
101
#define DO_3SAME_CMP(INSN, COND) \
102
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
103
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
104
DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
105
DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
106
107
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
108
- uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
109
-{
110
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
111
-}
112
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
113
-
114
#define DO_3SAME_GVEC4(INSN, OPARRAY) \
115
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
116
uint32_t rn_ofs, uint32_t rm_ofs, \
117
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
118
}
119
return do_3same(s, a, gen_VMUL_p_3s);
120
}
121
-
122
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \
123
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
124
- uint32_t rn_ofs, uint32_t rm_ofs, \
125
- uint32_t oprsz, uint32_t maxsz) \
126
- { \
127
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
128
- oprsz, maxsz, &OPARRAY[vece]); \
129
- } \
130
- DO_3SAME(INSN, gen_##INSN##_3s)
131
-
132
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
133
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
134
diff --git a/target/arm/translate.c b/target/arm/translate.c
135
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
136
--- a/target/arm/translate.c
46
--- a/target/xtensa/fpu_helper.c
137
+++ b/target/arm/translate.c
47
+++ b/target/xtensa/fpu_helper.c
138
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
48
@@ -XXX,XX +XXX,XX @@ static const struct {
139
tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
49
140
}
50
void xtensa_use_first_nan(CPUXtensaState *env, bool use_first)
141
142
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
143
-
144
-const GVecGen3 cmtst_op[4] = {
145
- { .fni4 = gen_helper_neon_tst_u8,
146
- .fniv = gen_cmtst_vec,
147
- .opt_opc = vecop_list_cmtst,
148
- .vece = MO_8 },
149
- { .fni4 = gen_helper_neon_tst_u16,
150
- .fniv = gen_cmtst_vec,
151
- .opt_opc = vecop_list_cmtst,
152
- .vece = MO_16 },
153
- { .fni4 = gen_cmtst_i32,
154
- .fniv = gen_cmtst_vec,
155
- .opt_opc = vecop_list_cmtst,
156
- .vece = MO_32 },
157
- { .fni8 = gen_cmtst_i64,
158
- .fniv = gen_cmtst_vec,
159
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
160
- .opt_opc = vecop_list_cmtst,
161
- .vece = MO_64 },
162
-};
163
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
164
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
165
+{
166
+ static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
167
+ static const GVecGen3 ops[4] = {
168
+ { .fni4 = gen_helper_neon_tst_u8,
169
+ .fniv = gen_cmtst_vec,
170
+ .opt_opc = vecop_list,
171
+ .vece = MO_8 },
172
+ { .fni4 = gen_helper_neon_tst_u16,
173
+ .fniv = gen_cmtst_vec,
174
+ .opt_opc = vecop_list,
175
+ .vece = MO_16 },
176
+ { .fni4 = gen_cmtst_i32,
177
+ .fniv = gen_cmtst_vec,
178
+ .opt_opc = vecop_list,
179
+ .vece = MO_32 },
180
+ { .fni8 = gen_cmtst_i64,
181
+ .fniv = gen_cmtst_vec,
182
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
183
+ .opt_opc = vecop_list,
184
+ .vece = MO_64 },
185
+ };
186
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
187
+}
188
189
void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
190
{
51
{
191
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
52
- set_use_first_nan(use_first, &env->fp_status);
192
tcg_temp_free_vec(rsh);
53
set_float_2nan_prop_rule(use_first ? float_2nan_prop_ab : float_2nan_prop_ba,
193
}
54
&env->fp_status);
194
55
set_float_3nan_prop_rule(use_first ? float_3nan_prop_abc : float_3nan_prop_cba,
195
-static const TCGOpcode ushl_list[] = {
196
- INDEX_op_neg_vec, INDEX_op_shlv_vec,
197
- INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
198
-};
199
-
200
-const GVecGen3 ushl_op[4] = {
201
- { .fniv = gen_ushl_vec,
202
- .fno = gen_helper_gvec_ushl_b,
203
- .opt_opc = ushl_list,
204
- .vece = MO_8 },
205
- { .fniv = gen_ushl_vec,
206
- .fno = gen_helper_gvec_ushl_h,
207
- .opt_opc = ushl_list,
208
- .vece = MO_16 },
209
- { .fni4 = gen_ushl_i32,
210
- .fniv = gen_ushl_vec,
211
- .opt_opc = ushl_list,
212
- .vece = MO_32 },
213
- { .fni8 = gen_ushl_i64,
214
- .fniv = gen_ushl_vec,
215
- .opt_opc = ushl_list,
216
- .vece = MO_64 },
217
-};
218
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
219
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
220
+{
221
+ static const TCGOpcode vecop_list[] = {
222
+ INDEX_op_neg_vec, INDEX_op_shlv_vec,
223
+ INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
224
+ };
225
+ static const GVecGen3 ops[4] = {
226
+ { .fniv = gen_ushl_vec,
227
+ .fno = gen_helper_gvec_ushl_b,
228
+ .opt_opc = vecop_list,
229
+ .vece = MO_8 },
230
+ { .fniv = gen_ushl_vec,
231
+ .fno = gen_helper_gvec_ushl_h,
232
+ .opt_opc = vecop_list,
233
+ .vece = MO_16 },
234
+ { .fni4 = gen_ushl_i32,
235
+ .fniv = gen_ushl_vec,
236
+ .opt_opc = vecop_list,
237
+ .vece = MO_32 },
238
+ { .fni8 = gen_ushl_i64,
239
+ .fniv = gen_ushl_vec,
240
+ .opt_opc = vecop_list,
241
+ .vece = MO_64 },
242
+ };
243
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
244
+}
245
246
void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
247
{
248
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
249
tcg_temp_free_vec(tmp);
250
}
251
252
-static const TCGOpcode sshl_list[] = {
253
- INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
254
- INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
255
-};
256
-
257
-const GVecGen3 sshl_op[4] = {
258
- { .fniv = gen_sshl_vec,
259
- .fno = gen_helper_gvec_sshl_b,
260
- .opt_opc = sshl_list,
261
- .vece = MO_8 },
262
- { .fniv = gen_sshl_vec,
263
- .fno = gen_helper_gvec_sshl_h,
264
- .opt_opc = sshl_list,
265
- .vece = MO_16 },
266
- { .fni4 = gen_sshl_i32,
267
- .fniv = gen_sshl_vec,
268
- .opt_opc = sshl_list,
269
- .vece = MO_32 },
270
- { .fni8 = gen_sshl_i64,
271
- .fniv = gen_sshl_vec,
272
- .opt_opc = sshl_list,
273
- .vece = MO_64 },
274
-};
275
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
276
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
277
+{
278
+ static const TCGOpcode vecop_list[] = {
279
+ INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
280
+ INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
281
+ };
282
+ static const GVecGen3 ops[4] = {
283
+ { .fniv = gen_sshl_vec,
284
+ .fno = gen_helper_gvec_sshl_b,
285
+ .opt_opc = vecop_list,
286
+ .vece = MO_8 },
287
+ { .fniv = gen_sshl_vec,
288
+ .fno = gen_helper_gvec_sshl_h,
289
+ .opt_opc = vecop_list,
290
+ .vece = MO_16 },
291
+ { .fni4 = gen_sshl_i32,
292
+ .fniv = gen_sshl_vec,
293
+ .opt_opc = vecop_list,
294
+ .vece = MO_32 },
295
+ { .fni8 = gen_sshl_i64,
296
+ .fniv = gen_sshl_vec,
297
+ .opt_opc = vecop_list,
298
+ .vece = MO_64 },
299
+ };
300
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
301
+}
302
303
static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
304
TCGv_vec a, TCGv_vec b)
305
--
56
--
306
2.20.1
57
2.34.1
307
308
New patch
1
Currently m68k_cpu_reset_hold() calls floatx80_default_nan(NULL)
2
to get the NaN bit pattern to reset the FPU registers. This
3
works because it happens that our implementation of
4
floatx80_default_nan() doesn't actually look at the float_status
5
pointer except for TARGET_MIPS. However, this isn't guaranteed,
6
and to be able to remove the ifdef in floatx80_default_nan()
7
we're going to need a real float_status here.
1
8
9
Rearrange m68k_cpu_reset_hold() so that we initialize env->fp_status
10
earlier, and thus can pass it to floatx80_default_nan().
11
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20241202131347.498124-28-peter.maydell@linaro.org
15
---
16
target/m68k/cpu.c | 12 +++++++-----
17
1 file changed, 7 insertions(+), 5 deletions(-)
18
19
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/m68k/cpu.c
22
+++ b/target/m68k/cpu.c
23
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
24
CPUState *cs = CPU(obj);
25
M68kCPUClass *mcc = M68K_CPU_GET_CLASS(obj);
26
CPUM68KState *env = cpu_env(cs);
27
- floatx80 nan = floatx80_default_nan(NULL);
28
+ floatx80 nan;
29
int i;
30
31
if (mcc->parent_phases.hold) {
32
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
33
#else
34
cpu_m68k_set_sr(env, SR_S | SR_I);
35
#endif
36
- for (i = 0; i < 8; i++) {
37
- env->fregs[i].d = nan;
38
- }
39
- cpu_m68k_set_fpcr(env, 0);
40
/*
41
* M68000 FAMILY PROGRAMMER'S REFERENCE MANUAL
42
* 3.4 FLOATING-POINT INSTRUCTION DETAILS
43
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
44
* preceding paragraph for nonsignaling NaNs.
45
*/
46
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
47
+
48
+ nan = floatx80_default_nan(&env->fp_status);
49
+ for (i = 0; i < 8; i++) {
50
+ env->fregs[i].d = nan;
51
+ }
52
+ cpu_m68k_set_fpcr(env, 0);
53
env->fpsr = 0;
54
55
/* TODO: We should set PC from the interrupt vector. */
56
--
57
2.34.1
New patch
1
We create our 128-bit default NaN by calling parts64_default_nan()
2
and then adjusting the result. We can do the same trick for creating
3
the floatx80 default NaN, which lets us drop a target ifdef.
1
4
5
floatx80 is used only by:
6
i386
7
m68k
8
  arm nwfpe old floating-point emulation support
9
(which is essentially dead, especially the parts involving floatx80)
10
PPC (only in the xsrqpxp instruction, which just rounds an input
11
value by converting to floatx80 and back, so will never generate
12
the default NaN)
13
14
The floatx80 default NaN as currently implemented is:
15
m68k: sign = 0, exp = 1...1, int = 1, frac = 1....1
16
i386: sign = 1, exp = 1...1, int = 1, frac = 10...0
17
18
These are the same as the parts64_default_nan for these architectures.
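As a quick sanity check of the arithmetic: OR-ing the explicit integer
bit (bit 63 of the significand) onto those parts64 patterns and folding
the sign into bit 15 of the sign-and-exponent word gives

  m68k: high = 0x7FFF, low = 0xFFFFFFFFFFFFFFFF
  i386: high = 0xFFFF, low = 0xC000000000000000

which are exactly the constants the removed ifdef hard-coded.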
19
20
This is technically a possible behaviour change for arm linux-user
21
nwfpe emulation, because the default NaN will now have the
22
sign bit clear. But we were already generating a different floatx80
23
default NaN from the real kernel emulation we are supposedly
24
following, which appears to use an all-bits-1 value:
25
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L267
26
27
This won't affect the only "real" use of the nwfpe emulation, which
28
is ancient binaries that used it as part of the old floating point
29
calling convention; that only uses loads and stores of 32 and 64 bit
30
floats, not any of the floatx80 behaviour the original hardware had.
31
We also get the nwfpe float64 default NaN value wrong:
32
https://elixir.bootlin.com/linux/v6.12/source/arch/arm/nwfpe/softfloat-specialize#L166
33
so if we ever cared about this obscure corner the right fix would be
34
to correct that so nwfpe used its own default-NaN setting rather
35
than the Arm VFP one.
36
37
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
38
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
39
Message-id: 20241202131347.498124-29-peter.maydell@linaro.org
40
---
41
fpu/softfloat-specialize.c.inc | 20 ++++++++++----------
42
1 file changed, 10 insertions(+), 10 deletions(-)
43
44
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
45
index XXXXXXX..XXXXXXX 100644
46
--- a/fpu/softfloat-specialize.c.inc
47
+++ b/fpu/softfloat-specialize.c.inc
48
@@ -XXX,XX +XXX,XX @@ static void parts128_silence_nan(FloatParts128 *p, float_status *status)
49
floatx80 floatx80_default_nan(float_status *status)
50
{
51
floatx80 r;
52
+ /*
53
+ * Extrapolate from the choices made by parts64_default_nan to fill
54
+ * in the floatx80 format. We assume that floatx80's explicit
55
+ * integer bit is always set (this is true for i386 and m68k,
56
+ * which are the only real users of this format).
57
+ */
58
+ FloatParts64 p64;
59
+ parts64_default_nan(&p64, status);
60
61
- /* None of the targets that have snan_bit_is_one use floatx80. */
62
- assert(!snan_bit_is_one(status));
63
-#if defined(TARGET_M68K)
64
- r.low = UINT64_C(0xFFFFFFFFFFFFFFFF);
65
- r.high = 0x7FFF;
66
-#else
67
- /* X86 */
68
- r.low = UINT64_C(0xC000000000000000);
69
- r.high = 0xFFFF;
70
-#endif
71
+ r.high = 0x7FFF | (p64.sign << 15);
72
+ r.low = (1ULL << DECOMPOSED_BINARY_POINT) | p64.frac;
73
return r;
74
}
75
76
--
77
2.34.1
New patch
1
In target/loongarch's helper_fclass_s() and helper_fclass_d() we pass
2
a zero-initialized float_status struct to float32_is_quiet_nan() and
3
float64_is_quiet_nan(), with the cryptic comment "for
4
snan_bit_is_one".
1
5
6
This pattern appears to have been copied from target/riscv, where it
7
is used because the functions there do not have ready access to the
8
CPU state struct. The comment presumably refers to the fact that the
9
main reason the is_quiet_nan() functions want the float_status is
10
because they want to know about the snan_bit_is_one config.
11
12
In the loongarch helpers, though, we have the CPU state struct
13
to hand. Use the usual env->fp_status here. This avoids our needing
14
to track that we need to update the initializer of the local
15
float_status structs when the core softfloat code adds new
16
options for targets to configure their behaviour.
17
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Message-id: 20241202131347.498124-30-peter.maydell@linaro.org
21
---
22
target/loongarch/tcg/fpu_helper.c | 6 ++----
23
1 file changed, 2 insertions(+), 4 deletions(-)
24
25
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/loongarch/tcg/fpu_helper.c
28
+++ b/target/loongarch/tcg/fpu_helper.c
29
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_s(CPULoongArchState *env, uint64_t fj)
30
} else if (float32_is_zero_or_denormal(f)) {
31
return sign ? 1 << 4 : 1 << 8;
32
} else if (float32_is_any_nan(f)) {
33
- float_status s = { }; /* for snan_bit_is_one */
34
- return float32_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
35
+ return float32_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
36
} else {
37
return sign ? 1 << 3 : 1 << 7;
38
}
39
@@ -XXX,XX +XXX,XX @@ uint64_t helper_fclass_d(CPULoongArchState *env, uint64_t fj)
40
} else if (float64_is_zero_or_denormal(f)) {
41
return sign ? 1 << 4 : 1 << 8;
42
} else if (float64_is_any_nan(f)) {
43
- float_status s = { }; /* for snan_bit_is_one */
44
- return float64_is_quiet_nan(f, &s) ? 1 << 1 : 1 << 0;
45
+ return float64_is_quiet_nan(f, &env->fp_status) ? 1 << 1 : 1 << 0;
46
} else {
47
return sign ? 1 << 3 : 1 << 7;
48
}
49
--
50
2.34.1
New patch
1
In the frem helper, we have a local float_status because we want to
2
execute the floatx80_div() with a custom rounding mode. Instead of
3
zero-initializing the local float_status and then having to set it up
4
with the m68k standard behaviour (including the NaN propagation rule
5
and copying the rounding precision from env->fp_status), initialize
6
it as a complete copy of env->fp_status. This will avoid our having
7
to add new code in this function for every new config knob we add
8
to fp_status.
1
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241202131347.498124-31-peter.maydell@linaro.org
13
---
14
target/m68k/fpu_helper.c | 6 ++----
15
1 file changed, 2 insertions(+), 4 deletions(-)
16
17
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/m68k/fpu_helper.c
20
+++ b/target/m68k/fpu_helper.c
21
@@ -XXX,XX +XXX,XX @@ void HELPER(frem)(CPUM68KState *env, FPReg *res, FPReg *val0, FPReg *val1)
22
23
fp_rem = floatx80_rem(val1->d, val0->d, &env->fp_status);
24
if (!floatx80_is_any_nan(fp_rem)) {
25
- float_status fp_status = { };
26
+ /* Use local temporary fp_status to set different rounding mode */
27
+ float_status fp_status = env->fp_status;
28
uint32_t quotient;
29
int sign;
30
31
/* Calculate quotient directly using round to nearest mode */
32
- set_float_2nan_prop_rule(float_2nan_prop_ab, &fp_status);
33
set_float_rounding_mode(float_round_nearest_even, &fp_status);
34
- set_floatx80_rounding_precision(
35
- get_floatx80_rounding_precision(&env->fp_status), &fp_status);
36
fp_quot.d = floatx80_div(val1->d, val0->d, &fp_status);
37
38
sign = extractFloatx80Sign(fp_quot.d);
39
--
40
2.34.1
diff view generated by jsdifflib
New patch
1
In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we do the conversion
2
from float64 to floatx80 using a scratch float_status, because we
3
don't want the conversion to affect the CPU's floating point exception
4
status. Currently we use a zero-initialized float_status. This will
5
get steadily more awkward as we add config knobs to float_status
6
that the target must initialize. Avoid having to add any of that
7
configuration here by instead initializing our local float_status
8
from the env->fp_status.
1
9
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241202131347.498124-32-peter.maydell@linaro.org
13
---
14
target/m68k/helper.c | 6 ++++--
15
1 file changed, 4 insertions(+), 2 deletions(-)
16
17
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/m68k/helper.c
20
+++ b/target/m68k/helper.c
21
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_get_reg(CPUState *cs, GByteArray *mem_buf, int n)
22
CPUM68KState *env = &cpu->env;
23
24
if (n < 8) {
25
- float_status s = {};
26
+ /* Use scratch float_status so any exceptions don't change CPU state */
27
+ float_status s = env->fp_status;
28
return gdb_get_reg64(mem_buf, floatx80_to_float64(env->fregs[n].d, &s));
29
}
30
switch (n) {
31
@@ -XXX,XX +XXX,XX @@ static int cf_fpu_gdb_set_reg(CPUState *cs, uint8_t *mem_buf, int n)
32
CPUM68KState *env = &cpu->env;
33
34
if (n < 8) {
35
- float_status s = {};
36
+ /* Use scratch float_status so any exceptions don't change CPU state */
37
+ float_status s = env->fp_status;
38
env->fregs[n].d = float64_to_floatx80(ldq_be_p(mem_buf), &s);
39
return 8;
40
}
41
--
42
2.34.1
1
Convert the Neon floating point VFMA and VFMS insn to decodetree.
1
In the helper functions flcmps and flcmpd we use a scratch float_status
2
These are the last insns in the 3-reg-same group so we can
2
so that we don't change the CPU state if the comparison raises any
3
remove all the support/loop code from the old decoder.
3
floating point exception flags. Instead of zero-initializing this
4
scratch float_status, initialize it as a copy of env->fp_status. This
5
avoids the need to explicitly initialize settings like the NaN
6
propagation rule or others we might add to softfloat in future.
7
8
To do this we need to pass the CPU env pointer in to the helper.
4
9
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
12
Message-id: 20241202131347.498124-33-peter.maydell@linaro.org
8
---
13
---
9
target/arm/neon-dp.decode | 3 +
14
target/sparc/helper.h | 4 ++--
10
target/arm/translate-neon.inc.c | 41 ++++++++
15
target/sparc/fop_helper.c | 8 ++++----
11
target/arm/translate.c | 176 +-------------------------------
16
target/sparc/translate.c | 4 ++--
12
3 files changed, 46 insertions(+), 174 deletions(-)
17
3 files changed, 8 insertions(+), 8 deletions(-)
13
18
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
19
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
15
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
21
--- a/target/sparc/helper.h
17
+++ b/target/arm/neon-dp.decode
22
+++ b/target/sparc/helper.h
18
@@ -XXX,XX +XXX,XX @@ SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(fcmpd, TCG_CALL_NO_WG, i32, env, f64, f64)
19
SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
24
DEF_HELPER_FLAGS_3(fcmped, TCG_CALL_NO_WG, i32, env, f64, f64)
20
vm=%vm_dp vn=%vn_dp vd=%vd_dp
25
DEF_HELPER_FLAGS_3(fcmpq, TCG_CALL_NO_WG, i32, env, i128, i128)
21
26
DEF_HELPER_FLAGS_3(fcmpeq, TCG_CALL_NO_WG, i32, env, i128, i128)
22
+VFMA_fp_3s 1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
27
-DEF_HELPER_FLAGS_2(flcmps, TCG_CALL_NO_RWG_SE, i32, f32, f32)
23
+VFMS_fp_3s 1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
28
-DEF_HELPER_FLAGS_2(flcmpd, TCG_CALL_NO_RWG_SE, i32, f64, f64)
24
+
29
+DEF_HELPER_FLAGS_3(flcmps, TCG_CALL_NO_RWG_SE, i32, env, f32, f32)
25
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
30
+DEF_HELPER_FLAGS_3(flcmpd, TCG_CALL_NO_RWG_SE, i32, env, f64, f64)
26
31
DEF_HELPER_2(raise_exception, noreturn, env, int)
27
VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
32
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
33
DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
34
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
29
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
36
--- a/target/sparc/fop_helper.c
31
+++ b/target/arm/translate-neon.inc.c
37
+++ b/target/sparc/fop_helper.c
32
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
38
@@ -XXX,XX +XXX,XX @@ uint32_t helper_fcmpeq(CPUSPARCState *env, Int128 src1, Int128 src2)
33
return do_3same(s, a, gen_VRSQRTS_fp_3s);
39
return finish_fcmp(env, r, GETPC());
34
}
40
}
35
41
36
+static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
42
-uint32_t helper_flcmps(float32 src1, float32 src2)
37
+ TCGv_ptr fpstatus)
43
+uint32_t helper_flcmps(CPUSPARCState *env, float32 src1, float32 src2)
38
+{
39
+ gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
40
+}
41
+
42
+static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
43
+{
44
+ if (!dc_isar_feature(aa32_simdfmac, s)) {
45
+ return false;
46
+ }
47
+
48
+ if (a->size != 0) {
49
+ /* TODO fp16 support */
50
+ return false;
51
+ }
52
+
53
+ return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
54
+}
55
+
56
+static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
57
+ TCGv_ptr fpstatus)
58
+{
59
+ gen_helper_vfp_negs(vn, vn);
60
+ gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
61
+}
62
+
63
+static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
64
+{
65
+ if (!dc_isar_feature(aa32_simdfmac, s)) {
66
+ return false;
67
+ }
68
+
69
+ if (a->size != 0) {
70
+ /* TODO fp16 support */
71
+ return false;
72
+ }
73
+
74
+ return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
75
+}
76
+
77
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
78
{
44
{
79
/* FP operations handled pairwise 32 bits at a time */
45
/*
80
diff --git a/target/arm/translate.c b/target/arm/translate.c
46
* FLCMP never raises an exception nor modifies any FSR fields.
47
* Perform the comparison with a dummy fp environment.
48
*/
49
- float_status discard = { };
50
+ float_status discard = env->fp_status;
51
FloatRelation r;
52
53
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
54
@@ -XXX,XX +XXX,XX @@ uint32_t helper_flcmps(float32 src1, float32 src2)
55
g_assert_not_reached();
56
}
57
58
-uint32_t helper_flcmpd(float64 src1, float64 src2)
59
+uint32_t helper_flcmpd(CPUSPARCState *env, float64 src1, float64 src2)
60
{
61
- float_status discard = { };
62
+ float_status discard = env->fp_status;
63
FloatRelation r;
64
65
set_float_2nan_prop_rule(float_2nan_prop_s_ba, &discard);
66
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
81
index XXXXXXX..XXXXXXX 100644
67
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/translate.c
68
--- a/target/sparc/translate.c
83
+++ b/target/arm/translate.c
69
+++ b/target/sparc/translate.c
84
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
70
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPs(DisasContext *dc, arg_FLCMPs *a)
85
}
71
72
src1 = gen_load_fpr_F(dc, a->rs1);
73
src2 = gen_load_fpr_F(dc, a->rs2);
74
- gen_helper_flcmps(cpu_fcc[a->cc], src1, src2);
75
+ gen_helper_flcmps(cpu_fcc[a->cc], tcg_env, src1, src2);
76
return advance_pc(dc);
86
}
77
}
87
78
88
-/* Symbolic constants for op fields for Neon 3-register same-length.
79
@@ -XXX,XX +XXX,XX @@ static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
89
- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
80
90
- * table A7-9.
81
src1 = gen_load_fpr_D(dc, a->rs1);
91
- */
82
src2 = gen_load_fpr_D(dc, a->rs2);
92
-#define NEON_3R_VHADD 0
83
- gen_helper_flcmpd(cpu_fcc[a->cc], src1, src2);
93
-#define NEON_3R_VQADD 1
84
+ gen_helper_flcmpd(cpu_fcc[a->cc], tcg_env, src1, src2);
94
-#define NEON_3R_VRHADD 2
85
return advance_pc(dc);
95
-#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
86
}
96
-#define NEON_3R_VHSUB 4
87
97
-#define NEON_3R_VQSUB 5
98
-#define NEON_3R_VCGT 6
99
-#define NEON_3R_VCGE 7
100
-#define NEON_3R_VSHL 8
101
-#define NEON_3R_VQSHL 9
102
-#define NEON_3R_VRSHL 10
103
-#define NEON_3R_VQRSHL 11
104
-#define NEON_3R_VMAX 12
105
-#define NEON_3R_VMIN 13
106
-#define NEON_3R_VABD 14
107
-#define NEON_3R_VABA 15
108
-#define NEON_3R_VADD_VSUB 16
109
-#define NEON_3R_VTST_VCEQ 17
110
-#define NEON_3R_VML 18 /* VMLA, VMLS */
111
-#define NEON_3R_VMUL 19
112
-#define NEON_3R_VPMAX 20
113
-#define NEON_3R_VPMIN 21
114
-#define NEON_3R_VQDMULH_VQRDMULH 22
115
-#define NEON_3R_VPADD_VQRDMLAH 23
116
-#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
117
-#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
118
-#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
119
-#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
120
-#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
121
-#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
122
-#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
123
-#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
124
-
125
-static const uint8_t neon_3r_sizes[] = {
126
- [NEON_3R_VHADD] = 0x7,
127
- [NEON_3R_VQADD] = 0xf,
128
- [NEON_3R_VRHADD] = 0x7,
129
- [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
130
- [NEON_3R_VHSUB] = 0x7,
131
- [NEON_3R_VQSUB] = 0xf,
132
- [NEON_3R_VCGT] = 0x7,
133
- [NEON_3R_VCGE] = 0x7,
134
- [NEON_3R_VSHL] = 0xf,
135
- [NEON_3R_VQSHL] = 0xf,
136
- [NEON_3R_VRSHL] = 0xf,
137
- [NEON_3R_VQRSHL] = 0xf,
138
- [NEON_3R_VMAX] = 0x7,
139
- [NEON_3R_VMIN] = 0x7,
140
- [NEON_3R_VABD] = 0x7,
141
- [NEON_3R_VABA] = 0x7,
142
- [NEON_3R_VADD_VSUB] = 0xf,
143
- [NEON_3R_VTST_VCEQ] = 0x7,
144
- [NEON_3R_VML] = 0x7,
145
- [NEON_3R_VMUL] = 0x7,
146
- [NEON_3R_VPMAX] = 0x7,
147
- [NEON_3R_VPMIN] = 0x7,
148
- [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
149
- [NEON_3R_VPADD_VQRDMLAH] = 0x7,
150
- [NEON_3R_SHA] = 0xf, /* size field encodes op type */
151
- [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
152
- [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
153
- [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
154
- [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
155
- [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
156
- [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
157
- [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
158
-};
159
-
160
/* Symbolic constants for op fields for Neon 2-register miscellaneous.
161
* The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
162
* table A7-13.
163
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
164
rm_ofs = neon_reg_offset(rm, 0);
165
166
if ((insn & (1 << 23)) == 0) {
167
- /* Three register same length. */
168
- op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
169
- /* Catch invalid op and bad size combinations: UNDEF */
170
- if ((neon_3r_sizes[op] & (1 << size)) == 0) {
171
- return 1;
172
- }
173
- /* All insns of this form UNDEF for either this condition or the
174
- * superset of cases "Q==1"; we catch the latter later.
175
- */
176
- if (q && ((rd | rn | rm) & 1)) {
177
- return 1;
178
- }
179
- switch (op) {
180
- case NEON_3R_VFM_VQRDMLSH:
181
- if (!u) {
182
- /* VFM, VFMS */
183
- if (size == 1) {
184
- return 1;
185
- }
186
- break;
187
- }
188
- /* VQRDMLSH : handled by decodetree */
189
- return 1;
190
-
191
- case NEON_3R_VADD_VSUB:
192
- case NEON_3R_LOGIC:
193
- case NEON_3R_VMAX:
194
- case NEON_3R_VMIN:
195
- case NEON_3R_VTST_VCEQ:
196
- case NEON_3R_VCGT:
197
- case NEON_3R_VCGE:
198
- case NEON_3R_VQADD:
199
- case NEON_3R_VQSUB:
200
- case NEON_3R_VMUL:
201
- case NEON_3R_VML:
202
- case NEON_3R_VSHL:
203
- case NEON_3R_SHA:
204
- case NEON_3R_VHADD:
205
- case NEON_3R_VRHADD:
206
- case NEON_3R_VHSUB:
207
- case NEON_3R_VABD:
208
- case NEON_3R_VABA:
209
- case NEON_3R_VQSHL:
210
- case NEON_3R_VRSHL:
211
- case NEON_3R_VQRSHL:
212
- case NEON_3R_VPMAX:
213
- case NEON_3R_VPMIN:
214
- case NEON_3R_VPADD_VQRDMLAH:
215
- case NEON_3R_VQDMULH_VQRDMULH:
216
- case NEON_3R_FLOAT_ARITH:
217
- case NEON_3R_FLOAT_MULTIPLY:
218
- case NEON_3R_FLOAT_CMP:
219
- case NEON_3R_FLOAT_ACMP:
220
- case NEON_3R_FLOAT_MINMAX:
221
- case NEON_3R_FLOAT_MISC:
222
- /* Already handled by decodetree */
223
- return 1;
224
- }
225
-
226
- if (size == 3) {
227
- /* 64-bit element instructions: handled by decodetree */
228
- return 1;
229
- }
230
- switch (op) {
231
- case NEON_3R_VFM_VQRDMLSH:
232
- if (!dc_isar_feature(aa32_simdfmac, s)) {
233
- return 1;
234
- }
235
- break;
236
- default:
237
- break;
238
- }
239
-
240
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
241
-
242
- /* Elementwise. */
243
- tmp = neon_load_reg(rn, pass);
244
- tmp2 = neon_load_reg(rm, pass);
245
- switch (op) {
246
- case NEON_3R_VFM_VQRDMLSH:
247
- {
248
- /* VFMA, VFMS: fused multiply-add */
249
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
250
- TCGv_i32 tmp3 = neon_load_reg(rd, pass);
251
- if (size) {
252
- /* VFMS */
253
- gen_helper_vfp_negs(tmp, tmp);
254
- }
255
- gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
256
- tcg_temp_free_i32(tmp3);
257
- tcg_temp_free_ptr(fpstatus);
258
- break;
259
- }
260
- default:
261
- abort();
262
- }
263
- tcg_temp_free_i32(tmp2);
264
-
265
- neon_store_reg(rd, pass, tmp);
266
-
267
- } /* for pass */
268
- /* End of 3 register same size operations. */
269
+ /* Three register same length: handled by decodetree */
270
+ return 1;
271
} else if (insn & (1 << 4)) {
272
if ((insn & 0x00380080) != 0) {
273
/* Two registers and shift. */
274
--
88
--
275
2.20.1
89
2.34.1
276
277
New patch
1
In the helper_compute_fprf functions, we pass a dummy float_status
2
in to the is_signaling_nan() function. This is unnecessary, because
3
we have convenient access to the CPU env pointer here and that
4
is already set up with the correct values for the snan_bit_is_one
5
and no_signaling_nans config settings. is_signaling_nan() doesn't
6
ever update the fp_status with any exception flags, so there is
7
no reason not to use env->fp_status here.
1
8
9
Use env->fp_status instead of the dummy fp_status.
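
As a rough illustration of why the choice of float_status only matters
for its snan_bit_is_one configuration (this is a sketch, not the actual
softfloat implementation):

    /* Sketch only: a float64 NaN is signaling iff the fraction MSB is
     * clear (or set, when snan_bit_is_one); nothing else in the status
     * is consulted and no exception flags are raised.
     */
    static bool sketch_f64_is_snan(uint64_t bits, bool snan_bit_is_one)
    {
        bool is_nan = ((bits >> 52) & 0x7ff) == 0x7ff &&
                      (bits & 0x000fffffffffffffULL) != 0;
        bool frac_msb = (bits >> 51) & 1;

        return is_nan && (snan_bit_is_one ? frac_msb : !frac_msb);
    }

PPC's fp_status uses the usual snan_bit_is_one == 0 convention, so the
result is identical to what the zero-initialised dummy gave.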
10
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20241202131347.498124-34-peter.maydell@linaro.org
14
---
15
target/ppc/fpu_helper.c | 3 +--
16
1 file changed, 1 insertion(+), 2 deletions(-)
17
18
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/ppc/fpu_helper.c
21
+++ b/target/ppc/fpu_helper.c
22
@@ -XXX,XX +XXX,XX @@ void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
23
} else if (tp##_is_infinity(arg)) { \
24
fprf = neg ? 0x09 << FPSCR_FPRF : 0x05 << FPSCR_FPRF; \
25
} else { \
26
- float_status dummy = { }; /* snan_bit_is_one = 0 */ \
27
- if (tp##_is_signaling_nan(arg, &dummy)) { \
28
+ if (tp##_is_signaling_nan(arg, &env->fp_status)) { \
29
fprf = 0x00 << FPSCR_FPRF; \
30
} else { \
31
fprf = 0x11 << FPSCR_FPRF; \
32
--
33
2.34.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Include 64-bit element size in preparation for SVE2.
3
Now that float_status has a bunch of fp parameters,
4
it is easier to copy an existing structure than create
5
one from scratch. Begin by copying the structure that
6
corresponds to the FPSR and make only the adjustments
7
required for BFloat16 semantics.
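
The idiom this moves to is roughly (a sketch; the real change is in
is_ebf() in the diff below):

    float_status st = env->vfp.fp_status; /* inherit rounding, FZ, NaN rules */
    set_default_nan_mode(true, &st);      /* then adjust only what BF16 needs */

which saves having to repeat every runtime-selected parameter (the NaN
propagation rules, flush-to-zero settings and so on) in a from-scratch
initialiser.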
4
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20241203203949.483774-2-richard.henderson@linaro.org
7
Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
14
---
10
target/arm/helper.h | 17 +++--
15
target/arm/tcg/vec_helper.c | 20 +++++++-------------
11
target/arm/translate.h | 5 ++
16
1 file changed, 7 insertions(+), 13 deletions(-)
12
target/arm/neon_helper.c | 10 ---
13
target/arm/translate-a64.c | 17 ++---
14
target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++--
15
target/arm/vec_helper.c | 24 +++++++
16
6 files changed, 174 insertions(+), 33 deletions(-)
17
17
18
diff --git a/target/arm/helper.h b/target/arm/helper.h
18
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
19
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper.h
20
--- a/target/arm/tcg/vec_helper.c
21
+++ b/target/arm/helper.h
21
+++ b/target/arm/tcg/vec_helper.c
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
22
@@ -XXX,XX +XXX,XX @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
23
DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
23
* no effect on AArch32 instructions.
24
DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
24
*/
25
25
bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
26
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
26
- *statusp = (float_status){
27
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
27
- .tininess_before_rounding = float_tininess_before_rounding,
28
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
28
- .float_rounding_mode = float_round_to_odd_inf,
29
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
29
- .flush_to_zero = true,
30
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
30
- .flush_inputs_to_zero = true,
31
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
31
- .default_nan_mode = true,
32
- };
33
+
34
+ *statusp = env->vfp.fp_status;
35
+ set_default_nan_mode(true, statusp);
36
37
if (ebf) {
38
- float_status *fpst = &env->vfp.fp_status;
39
- set_flush_to_zero(get_flush_to_zero(fpst), statusp);
40
- set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), statusp);
41
- set_float_rounding_mode(get_float_rounding_mode(fpst), statusp);
32
-
42
-
33
DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
43
/* EBF=1 needs to do a step with round-to-odd semantics */
34
DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
44
*oddstatusp = *statusp;
35
DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
45
set_float_rounding_mode(float_round_to_odd, oddstatusp);
36
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
+ } else {
37
DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+ set_flush_to_zero(true, statusp);
38
DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
+ set_flush_inputs_to_zero(true, statusp);
39
49
+ set_float_rounding_mode(float_round_to_odd_inf, statusp);
40
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
50
}
41
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
+
45
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
49
+
50
#ifdef TARGET_AARCH64
51
#include "helper-a64.h"
52
#include "helper-sve.h"
53
diff --git a/target/arm/translate.h b/target/arm/translate.h
54
index XXXXXXX..XXXXXXX 100644
55
--- a/target/arm/translate.h
56
+++ b/target/arm/translate.h
57
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
58
void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
59
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
60
61
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
62
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
63
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
64
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
65
+
66
/*
67
* Forward to the isar_feature_* tests given a DisasContext pointer.
68
*/
69
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/neon_helper.c
72
+++ b/target/arm/neon_helper.c
73
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
74
NEON_POP(pmax_u16, neon_u16, 2)
75
#undef NEON_FN
76
77
-#define NEON_FN(dest, src1, src2) \
78
- dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
79
-NEON_VOP(abd_s8, neon_s8, 4)
80
-NEON_VOP(abd_u8, neon_u8, 4)
81
-NEON_VOP(abd_s16, neon_s16, 2)
82
-NEON_VOP(abd_u16, neon_u16, 2)
83
-NEON_VOP(abd_s32, neon_s32, 1)
84
-NEON_VOP(abd_u32, neon_u32, 1)
85
-#undef NEON_FN
86
-
51
-
87
#define NEON_FN(dest, src1, src2) do { \
52
return ebf;
88
int8_t tmp; \
89
tmp = (int8_t)src2; \
90
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/translate-a64.c
93
+++ b/target/arm/translate-a64.c
94
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
95
gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
96
}
97
return;
98
+ case 0xf: /* SABA, UABA */
99
+ if (u) {
100
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
101
+ } else {
102
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
103
+ }
104
+ return;
105
case 0x10: /* ADD, SUB */
106
if (u) {
107
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
108
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
109
genenvfn = fns[size][u];
110
break;
111
}
112
- case 0xf: /* SABA, UABA */
113
- {
114
- static NeonGenTwoOpFn * const fns[3][2] = {
115
- { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
116
- { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
117
- { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
118
- };
119
- genfn = fns[size][u];
120
- break;
121
- }
122
case 0x16: /* SQDMULH, SQRDMULH */
123
{
124
static NeonGenTwoOpEnvFn * const fns[2][2] = {
125
diff --git a/target/arm/translate.c b/target/arm/translate.c
126
index XXXXXXX..XXXXXXX 100644
127
--- a/target/arm/translate.c
128
+++ b/target/arm/translate.c
129
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
130
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
131
}
53
}
132
54
133
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
134
+{
135
+ TCGv_i32 t = tcg_temp_new_i32();
136
+ gen_sabd_i32(t, a, b);
137
+ tcg_gen_add_i32(d, d, t);
138
+ tcg_temp_free_i32(t);
139
+}
140
+
141
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
142
+{
143
+ TCGv_i64 t = tcg_temp_new_i64();
144
+ gen_sabd_i64(t, a, b);
145
+ tcg_gen_add_i64(d, d, t);
146
+ tcg_temp_free_i64(t);
147
+}
148
+
149
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
150
+{
151
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
152
+ gen_sabd_vec(vece, t, a, b);
153
+ tcg_gen_add_vec(vece, d, d, t);
154
+ tcg_temp_free_vec(t);
155
+}
156
+
157
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
158
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
159
+{
160
+ static const TCGOpcode vecop_list[] = {
161
+ INDEX_op_sub_vec, INDEX_op_add_vec,
162
+ INDEX_op_smin_vec, INDEX_op_smax_vec, 0
163
+ };
164
+ static const GVecGen3 ops[4] = {
165
+ { .fniv = gen_saba_vec,
166
+ .fno = gen_helper_gvec_saba_b,
167
+ .opt_opc = vecop_list,
168
+ .load_dest = true,
169
+ .vece = MO_8 },
170
+ { .fniv = gen_saba_vec,
171
+ .fno = gen_helper_gvec_saba_h,
172
+ .opt_opc = vecop_list,
173
+ .load_dest = true,
174
+ .vece = MO_16 },
175
+ { .fni4 = gen_saba_i32,
176
+ .fniv = gen_saba_vec,
177
+ .fno = gen_helper_gvec_saba_s,
178
+ .opt_opc = vecop_list,
179
+ .load_dest = true,
180
+ .vece = MO_32 },
181
+ { .fni8 = gen_saba_i64,
182
+ .fniv = gen_saba_vec,
183
+ .fno = gen_helper_gvec_saba_d,
184
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
185
+ .opt_opc = vecop_list,
186
+ .load_dest = true,
187
+ .vece = MO_64 },
188
+ };
189
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
190
+}
191
+
192
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
193
+{
194
+ TCGv_i32 t = tcg_temp_new_i32();
195
+ gen_uabd_i32(t, a, b);
196
+ tcg_gen_add_i32(d, d, t);
197
+ tcg_temp_free_i32(t);
198
+}
199
+
200
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
201
+{
202
+ TCGv_i64 t = tcg_temp_new_i64();
203
+ gen_uabd_i64(t, a, b);
204
+ tcg_gen_add_i64(d, d, t);
205
+ tcg_temp_free_i64(t);
206
+}
207
+
208
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
209
+{
210
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
211
+ gen_uabd_vec(vece, t, a, b);
212
+ tcg_gen_add_vec(vece, d, d, t);
213
+ tcg_temp_free_vec(t);
214
+}
215
+
216
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
217
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
218
+{
219
+ static const TCGOpcode vecop_list[] = {
220
+ INDEX_op_sub_vec, INDEX_op_add_vec,
221
+ INDEX_op_umin_vec, INDEX_op_umax_vec, 0
222
+ };
223
+ static const GVecGen3 ops[4] = {
224
+ { .fniv = gen_uaba_vec,
225
+ .fno = gen_helper_gvec_uaba_b,
226
+ .opt_opc = vecop_list,
227
+ .load_dest = true,
228
+ .vece = MO_8 },
229
+ { .fniv = gen_uaba_vec,
230
+ .fno = gen_helper_gvec_uaba_h,
231
+ .opt_opc = vecop_list,
232
+ .load_dest = true,
233
+ .vece = MO_16 },
234
+ { .fni4 = gen_uaba_i32,
235
+ .fniv = gen_uaba_vec,
236
+ .fno = gen_helper_gvec_uaba_s,
237
+ .opt_opc = vecop_list,
238
+ .load_dest = true,
239
+ .vece = MO_32 },
240
+ { .fni8 = gen_uaba_i64,
241
+ .fniv = gen_uaba_vec,
242
+ .fno = gen_helper_gvec_uaba_d,
243
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
244
+ .opt_opc = vecop_list,
245
+ .load_dest = true,
246
+ .vece = MO_64 },
247
+ };
248
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
249
+}
250
+
251
/* Translate a NEON data processing instruction. Return nonzero if the
252
instruction is invalid.
253
We process data in a mixture of 32-bit and 64-bit chunks.
254
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
255
}
256
return 0;
257
258
+ case NEON_3R_VABA:
259
+ if (u) {
260
+ gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
261
+ vec_size, vec_size);
262
+ } else {
263
+ gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
264
+ vec_size, vec_size);
265
+ }
266
+ return 0;
267
+
268
case NEON_3R_VADD_VSUB:
269
case NEON_3R_LOGIC:
270
case NEON_3R_VMAX:
271
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
272
case NEON_3R_VQRSHL:
273
GEN_NEON_INTEGER_OP_ENV(qrshl);
274
break;
275
- case NEON_3R_VABA:
276
- GEN_NEON_INTEGER_OP(abd);
277
- tcg_temp_free_i32(tmp2);
278
- tmp2 = neon_load_reg(rd, pass);
279
- gen_neon_add(size, tmp, tmp2);
280
- break;
281
case NEON_3R_VPMAX:
282
GEN_NEON_INTEGER_OP(pmax);
283
break;
284
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
285
index XXXXXXX..XXXXXXX 100644
286
--- a/target/arm/vec_helper.c
287
+++ b/target/arm/vec_helper.c
288
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
289
DO_ABD(gvec_uabd_d, uint64_t)
290
291
#undef DO_ABD
292
+
293
+#define DO_ABA(NAME, TYPE) \
294
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
295
+{ \
296
+ intptr_t i, opr_sz = simd_oprsz(desc); \
297
+ TYPE *d = vd, *n = vn, *m = vm; \
298
+ \
299
+ for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \
300
+ d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \
301
+ } \
302
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
303
+}
304
+
305
+DO_ABA(gvec_saba_b, int8_t)
306
+DO_ABA(gvec_saba_h, int16_t)
307
+DO_ABA(gvec_saba_s, int32_t)
308
+DO_ABA(gvec_saba_d, int64_t)
309
+
310
+DO_ABA(gvec_uaba_b, uint8_t)
311
+DO_ABA(gvec_uaba_h, uint16_t)
312
+DO_ABA(gvec_uaba_s, uint32_t)
313
+DO_ABA(gvec_uaba_d, uint64_t)
314
+
315
+#undef DO_ABA
316
--
55
--
317
2.20.1
56
2.34.1
318
57
319
58
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
Currently we hardcode the default NaN value in parts64_default_nan()
2
using a compile-time ifdef ladder. This is awkward for two cases:
3
* for single-QEMU-binary we can't hard-code target-specifics like this
4
* for Arm FEAT_AFP the default NaN value depends on FPCR.AH
5
(specifically the sign bit is different)
2
6
3
Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
7
Add a field to float_status to specify the default NaN value; fall
4
translates the host VA delivered by host to guest PA, then fills this PA
8
back to the old ifdef behaviour if these are not set.
5
to guest APEI GHES memory, then notifies guest according to the SIGBUS
6
type.
7
9
8
When guest accesses the poisoned memory, it will generate a Synchronous
10
The default NaN value is specified by setting a uint8_t to a
9
External Abort(SEA). Then host kernel gets an APEI notification and calls
11
pattern corresponding to the sign and upper fraction parts of
10
memory_failure() to unmap the affected page in stage 2, finally
12
the NaN; the lower bits of the fraction are set from bit 0 of
11
returns to guest.
13
the pattern.
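
As a concrete example of the encoding (using the packed float64 format;
these are the values the per-target patches later in the series pick):

    pattern 0b01000000: sign 0, frac MSB set -> 0x7FF8000000000000 (Arm, RISC-V, ...)
    pattern 0b11000000: sign 1, frac MSB set -> 0xFFF8000000000000 (x86, MicroBlaze)
    pattern 0b11111111: sign 1, all frac set -> 0xFFFFFFFFFFFFFFFF (Hexagon)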
12
14
13
Guest continues to access the PG_hwpoison page, it will trap to KVM as
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
16
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Qemu, Qemu records this error address into guest APEI GHES memory and
17
Message-id: 20241202131347.498124-35-peter.maydell@linaro.org
16
notifies the guest using a Synchronous External Abort (SEA).
18
---
19
include/fpu/softfloat-helpers.h | 11 +++++++
20
include/fpu/softfloat-types.h | 10 ++++++
21
fpu/softfloat-specialize.c.inc | 55 ++++++++++++++++++++-------------
22
3 files changed, 54 insertions(+), 22 deletions(-)
17
23
18
In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
24
diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
19
in which we can setup the type of exception and the syndrome information.
20
When switching to guest, the target vcpu will jump to the synchronous
21
external abort vector table entry.
22
23
The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
24
ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
25
not valid and hold an UNKNOWN value. These values will be set to KVM
26
register structures through KVM_SET_ONE_REG IOCTL.
27
28
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
29
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
30
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
31
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
32
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
33
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
34
Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
35
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
36
---
37
include/sysemu/kvm.h | 3 +-
38
target/arm/cpu.h | 4 +++
39
target/arm/internals.h | 5 +--
40
target/i386/cpu.h | 2 ++
41
target/arm/helper.c | 2 +-
42
target/arm/kvm64.c | 77 +++++++++++++++++++++++++++++++++++++++++
43
target/arm/tlb_helper.c | 2 +-
44
7 files changed, 89 insertions(+), 6 deletions(-)
45
46
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
47
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
48
--- a/include/sysemu/kvm.h
26
--- a/include/fpu/softfloat-helpers.h
49
+++ b/include/sysemu/kvm.h
27
+++ b/include/fpu/softfloat-helpers.h
50
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
28
@@ -XXX,XX +XXX,XX @@ static inline void set_float_infzeronan_rule(FloatInfZeroNaNRule rule,
51
/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
29
status->float_infzeronan_rule = rule;
52
unsigned long kvm_arch_vcpu_id(CPUState *cpu);
53
54
-#ifdef TARGET_I386
55
-#define KVM_HAVE_MCE_INJECTION 1
56
+#ifdef KVM_HAVE_MCE_INJECTION
57
void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
58
#endif
59
60
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/cpu.h
63
+++ b/target/arm/cpu.h
64
@@ -XXX,XX +XXX,XX @@
65
/* ARM processors have a weak memory model */
66
#define TCG_GUEST_DEFAULT_MO (0)
67
68
+#ifdef TARGET_AARCH64
69
+#define KVM_HAVE_MCE_INJECTION 1
70
+#endif
71
+
72
#define EXCP_UDEF 1 /* undefined instruction */
73
#define EXCP_SWI 2 /* software interrupt */
74
#define EXCP_PREFETCH_ABORT 3
75
diff --git a/target/arm/internals.h b/target/arm/internals.h
76
index XXXXXXX..XXXXXXX 100644
77
--- a/target/arm/internals.h
78
+++ b/target/arm/internals.h
79
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
80
| ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
81
}
30
}
82
31
83
-static inline uint32_t syn_data_abort_no_iss(int same_el,
32
+static inline void set_float_default_nan_pattern(uint8_t dnan_pattern,
84
+static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
33
+ float_status *status)
85
int ea, int cm, int s1ptw,
86
int wnr, int fsc)
87
{
88
return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
89
| ARM_EL_IL
90
- | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
91
+ | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
92
+ | (wnr << 6) | fsc;
93
}
94
95
static inline uint32_t syn_data_abort_with_iss(int same_el,
96
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
97
index XXXXXXX..XXXXXXX 100644
98
--- a/target/i386/cpu.h
99
+++ b/target/i386/cpu.h
100
@@ -XXX,XX +XXX,XX @@
101
/* The x86 has a strong memory model with some store-after-load re-ordering */
102
#define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
103
104
+#define KVM_HAVE_MCE_INJECTION 1
105
+
106
/* Maximum instruction code size */
107
#define TARGET_MAX_INSN_SIZE 16
108
109
diff --git a/target/arm/helper.c b/target/arm/helper.c
110
index XXXXXXX..XXXXXXX 100644
111
--- a/target/arm/helper.c
112
+++ b/target/arm/helper.c
113
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
114
* Report exception with ESR indicating a fault due to a
115
* translation table walk for a cache maintenance instruction.
116
*/
117
- syn = syn_data_abort_no_iss(current_el == target_el,
118
+ syn = syn_data_abort_no_iss(current_el == target_el, 0,
119
fi.ea, 1, fi.s1ptw, 1, fsc);
120
env->exception.vaddress = value;
121
env->exception.fsr = fsr;
122
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
123
index XXXXXXX..XXXXXXX 100644
124
--- a/target/arm/kvm64.c
125
+++ b/target/arm/kvm64.c
126
@@ -XXX,XX +XXX,XX @@
127
#include "sysemu/kvm_int.h"
128
#include "kvm_arm.h"
129
#include "internals.h"
130
+#include "hw/acpi/acpi.h"
131
+#include "hw/acpi/ghes.h"
132
+#include "hw/arm/virt.h"
133
134
static bool have_guest_debug;
135
136
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
137
return KVM_PUT_RUNTIME_STATE;
138
}
139
140
+/* Callers must hold the iothread mutex lock */
141
+static void kvm_inject_arm_sea(CPUState *c)
142
+{
34
+{
143
+ ARMCPU *cpu = ARM_CPU(c);
35
+ status->default_nan_pattern = dnan_pattern;
144
+ CPUARMState *env = &cpu->env;
145
+ CPUClass *cc = CPU_GET_CLASS(c);
146
+ uint32_t esr;
147
+ bool same_el;
148
+
149
+ c->exception_index = EXCP_DATA_ABORT;
150
+ env->exception.target_el = 1;
151
+
152
+ /*
153
+ * Set the DFSC to synchronous external abort and set FnV to not valid,
154
+ * this will tell guest the FAR_ELx is UNKNOWN for this abort.
155
+ */
156
+ same_el = arm_current_el(env) == env->exception.target_el;
157
+ esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
158
+
159
+ env->exception.syndrome = esr;
160
+
161
+ cc->do_interrupt(c);
162
+}
36
+}
163
+
37
+
164
#define AARCH64_CORE_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
38
static inline void set_flush_to_zero(bool val, float_status *status)
165
KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
39
{
166
40
status->flush_to_zero = val;
167
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
41
@@ -XXX,XX +XXX,XX @@ static inline FloatInfZeroNaNRule get_float_infzeronan_rule(float_status *status
168
return ret;
42
return status->float_infzeronan_rule;
169
}
43
}
170
44
171
+void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
45
+static inline uint8_t get_float_default_nan_pattern(float_status *status)
172
+{
46
+{
173
+ ram_addr_t ram_addr;
47
+ return status->default_nan_pattern;
174
+ hwaddr paddr;
175
+ Object *obj = qdev_get_machine();
176
+ VirtMachineState *vms = VIRT_MACHINE(obj);
177
+ bool acpi_enabled = virt_is_acpi_enabled(vms);
178
+
179
+ assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
180
+
181
+ if (acpi_enabled && addr &&
182
+ object_property_get_bool(obj, "ras", NULL)) {
183
+ ram_addr = qemu_ram_addr_from_host(addr);
184
+ if (ram_addr != RAM_ADDR_INVALID &&
185
+ kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
186
+ kvm_hwpoison_page_add(ram_addr);
187
+ /*
188
+ * If this is a BUS_MCEERR_AR, we know we have been called
189
+ * synchronously from the vCPU thread, so we can easily
190
+ * synchronize the state and inject an error.
191
+ *
192
+ * TODO: we currently don't tell the guest at all about
193
+ * BUS_MCEERR_AO. In that case we might either be being
194
+ * called synchronously from the vCPU thread, or a bit
195
+ * later from the main thread, so doing the injection of
196
+ * the error would be more complicated.
197
+ */
198
+ if (code == BUS_MCEERR_AR) {
199
+ kvm_cpu_synchronize_state(c);
200
+ if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
201
+ kvm_inject_arm_sea(c);
202
+ } else {
203
+ error_report("failed to record the error");
204
+ abort();
205
+ }
206
+ }
207
+ return;
208
+ }
209
+ if (code == BUS_MCEERR_AO) {
210
+ error_report("Hardware memory error at addr %p for memory used by "
211
+ "QEMU itself instead of guest system!", addr);
212
+ }
213
+ }
214
+
215
+ if (code == BUS_MCEERR_AR) {
216
+ error_report("Hardware memory error!");
217
+ exit(1);
218
+ }
219
+}
48
+}
220
+
49
+
221
/* C6.6.29 BRK instruction */
50
static inline bool get_flush_to_zero(float_status *status)
222
static const uint32_t brk_insn = 0xd4200000;
51
{
223
52
return status->flush_to_zero;
224
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
53
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
225
index XXXXXXX..XXXXXXX 100644
54
index XXXXXXX..XXXXXXX 100644
226
--- a/target/arm/tlb_helper.c
55
--- a/include/fpu/softfloat-types.h
227
+++ b/target/arm/tlb_helper.c
56
+++ b/include/fpu/softfloat-types.h
228
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
57
@@ -XXX,XX +XXX,XX @@ typedef struct float_status {
229
* ISV field.
58
/* should denormalised inputs go to zero and set the input_denormal flag? */
230
*/
59
bool flush_inputs_to_zero;
231
if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
60
bool default_nan_mode;
232
- syn = syn_data_abort_no_iss(same_el,
61
+ /*
233
+ syn = syn_data_abort_no_iss(same_el, 0,
62
+ * The pattern to use for the default NaN. Here the high bit specifies
234
ea, 0, s1ptw, is_write, fsc);
63
+ * the default NaN's sign bit, and bits 6..0 specify the high bits of the
235
} else {
64
+ * fractional part. The low bits of the fractional part are copies of bit 0.
236
/*
65
+ * The exponent of the default NaN is (as for any NaN) always all 1s.
66
+ * Note that a value of 0 here is not a valid NaN. The target must set
67
+ * this to the correct non-zero value, or we will assert when trying to
68
+ * create a default NaN.
69
+ */
70
+ uint8_t default_nan_pattern;
71
/*
72
* The flags below are not used on all specializations and may
73
* constant fold away (see snan_bit_is_one()/no_signalling_nans() in
74
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
75
index XXXXXXX..XXXXXXX 100644
76
--- a/fpu/softfloat-specialize.c.inc
77
+++ b/fpu/softfloat-specialize.c.inc
78
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
79
{
80
bool sign = 0;
81
uint64_t frac;
82
+ uint8_t dnan_pattern = status->default_nan_pattern;
83
84
+ if (dnan_pattern == 0) {
85
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
86
- /* !snan_bit_is_one, set all bits */
87
- frac = (1ULL << DECOMPOSED_BINARY_POINT) - 1;
88
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
89
+ /* Sign bit clear, all frac bits set */
90
+ dnan_pattern = 0b01111111;
91
+#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
92
|| defined(TARGET_MICROBLAZE)
93
- /* !snan_bit_is_one, set sign and msb */
94
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
95
- sign = 1;
96
+ /* Sign bit set, most significant frac bit set */
97
+ dnan_pattern = 0b11000000;
98
#elif defined(TARGET_HPPA)
99
- /* snan_bit_is_one, set msb-1. */
100
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 2);
101
+ /* Sign bit clear, msb-1 frac bit set */
102
+ dnan_pattern = 0b00100000;
103
#elif defined(TARGET_HEXAGON)
104
- sign = 1;
105
- frac = ~0ULL;
106
+ /* Sign bit set, all frac bits set. */
107
+ dnan_pattern = 0b11111111;
108
#else
109
- /*
110
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
111
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
112
- * do not have floating-point.
113
- */
114
- if (snan_bit_is_one(status)) {
115
- /* set all bits other than msb */
116
- frac = (1ULL << (DECOMPOSED_BINARY_POINT - 1)) - 1;
117
- } else {
118
- /* set msb */
119
- frac = 1ULL << (DECOMPOSED_BINARY_POINT - 1);
120
- }
121
+ /*
122
+ * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
123
+ * S390, SH4, TriCore, and Xtensa. Our other supported targets
124
+ * do not have floating-point.
125
+ */
126
+ if (snan_bit_is_one(status)) {
127
+ /* sign bit clear, set all frac bits other than msb */
128
+ dnan_pattern = 0b00111111;
129
+ } else {
130
+ /* sign bit clear, set frac msb */
131
+ dnan_pattern = 0b01000000;
132
+ }
133
#endif
134
+ }
135
+ assert(dnan_pattern != 0);
136
+
137
+ sign = dnan_pattern >> 7;
138
+ /*
139
+ * Place default_nan_pattern [6:0] into bits [62:56],
140
+ * and replicate bit [0] down into [55:0]
141
+ */
142
+ frac = deposit64(0, DECOMPOSED_BINARY_POINT - 7, 7, dnan_pattern);
143
+ frac = deposit64(frac, 0, DECOMPOSED_BINARY_POINT - 7, -(dnan_pattern & 1));
144
145
*p = (FloatParts64) {
146
.cls = float_class_qnan,
237
--
147
--
238
2.20.1
148
2.34.1
239
240
New patch
1
Set the default NaN pattern explicitly for the tests/fp code.
1
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-36-peter.maydell@linaro.org
6
---
7
tests/fp/fp-bench.c | 1 +
8
tests/fp/fp-test-log2.c | 1 +
9
tests/fp/fp-test.c | 1 +
10
3 files changed, 3 insertions(+)
11
12
diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tests/fp/fp-bench.c
15
+++ b/tests/fp/fp-bench.c
16
@@ -XXX,XX +XXX,XX @@ static void run_bench(void)
17
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &soft_status);
18
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &soft_status);
19
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &soft_status);
20
+ set_float_default_nan_pattern(0b01000000, &soft_status);
21
22
f = bench_funcs[operation][precision];
23
g_assert(f);
24
diff --git a/tests/fp/fp-test-log2.c b/tests/fp/fp-test-log2.c
25
index XXXXXXX..XXXXXXX 100644
26
--- a/tests/fp/fp-test-log2.c
27
+++ b/tests/fp/fp-test-log2.c
28
@@ -XXX,XX +XXX,XX @@ int main(int ac, char **av)
29
int i;
30
31
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
32
+ set_float_default_nan_pattern(0b01000000, &qsf);
33
set_float_rounding_mode(float_round_nearest_even, &qsf);
34
35
test.d = 0.0;
36
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/tests/fp/fp-test.c
39
+++ b/tests/fp/fp-test.c
40
@@ -XXX,XX +XXX,XX @@ void run_test(void)
41
*/
42
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &qsf);
43
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &qsf);
44
+ set_float_default_nan_pattern(0b01000000, &qsf);
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, &qsf);
46
47
genCases_setLevel(test_level);
48
--
49
2.34.1
1
Convert the Neon integer VMUL, VMLA, and VMLS 3-reg-same insns to
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
decodetree.
2
parts64_default_nan().
3
4
We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
5
need a loop function do_3same_fp(). This takes a reads_vd parameter
6
to do_3same_fp() which tells it to load the old value into vd before
7
calling the callback function, in the same way that the do_vfp_3op_sp()
8
and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
9
only uses in this patch pass reads_vd == true, but later commits
10
will use reads_vd == false.)
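
Per element, the operations converted here are:

    VMLA.F32: Vd[i] = Vd[i] + (Vn[i] * Vm[i])
    VMLS.F32: Vd[i] = Vd[i] - (Vn[i] * Vm[i])

so the old value of Vd is a third input, which is exactly what passing
reads_vd == true to do_3same_fp() provides.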
11
12
This conversion fixes in passing an underdecoding for VMUL
13
(originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
14
of the 'size' field must be 0. The old decoder didn't enforce this,
15
but the decodetree pattern does.
16
17
The gen_VMLA_fp_reg() function performs the addition operation
18
with the operands in the opposite order to the old decoder:
19
since Neon sets 'default NaN mode' float32_add operations are
20
commutative so there is no behaviour difference, but putting
21
them this way around matches the Arm ARM pseudocode and the
22
required operation order for the subtraction in gen_VMLS_fp_reg().
23
3
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
26
Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-37-peter.maydell@linaro.org
27
---
7
---
28
target/arm/neon-dp.decode | 3 ++
8
target/microblaze/cpu.c | 2 ++
29
target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 3 +--
30
target/arm/translate.c | 17 +------
10
2 files changed, 3 insertions(+), 2 deletions(-)
31
3 files changed, 85 insertions(+), 16 deletions(-)
32
11
33
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
34
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/neon-dp.decode
14
--- a/target/microblaze/cpu.c
36
+++ b/target/arm/neon-dp.decode
15
+++ b/target/microblaze/cpu.c
37
@@ -XXX,XX +XXX,XX @@ VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
16
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_reset_hold(Object *obj, ResetType type)
38
VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
17
* this architecture.
39
VPADD_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
18
*/
40
VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
19
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
41
+VMLA_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
20
+ /* Default NaN: sign bit set, most significant frac bit set */
42
+VMLS_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
43
+VMUL_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
22
44
VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
23
#if defined(CONFIG_USER_ONLY)
45
VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
24
/* start in user mode with interrupts enabled. */
46
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
47
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/translate-neon.inc.c
27
--- a/fpu/softfloat-specialize.c.inc
49
+++ b/target/arm/translate-neon.inc.c
28
+++ b/fpu/softfloat-specialize.c.inc
50
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
51
DO_3SAME_VQDMULH(VQDMULH, qdmulh)
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
52
DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
31
/* Sign bit clear, all frac bits set */
53
32
dnan_pattern = 0b01111111;
54
+static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
33
-#elif defined(TARGET_I386) || defined(TARGET_X86_64) \
55
+ bool reads_vd)
34
- || defined(TARGET_MICROBLAZE)
56
+{
35
+#elif defined(TARGET_I386) || defined(TARGET_X86_64)
57
+ /*
36
/* Sign bit set, most significant frac bit set */
58
+ * FP operations handled elementwise 32 bits at a time.
37
dnan_pattern = 0b11000000;
59
+ * If reads_vd is true then the old value of Vd will be
38
#elif defined(TARGET_HPPA)
60
+ * loaded before calling the callback function. This is
61
+ * used for multiply-accumulate type operations.
62
+ */
63
+ TCGv_i32 tmp, tmp2;
64
+ int pass;
65
+
66
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
67
+ return false;
68
+ }
69
+
70
+ /* UNDEF accesses to D16-D31 if they don't exist. */
71
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
72
+ ((a->vd | a->vn | a->vm) & 0x10)) {
73
+ return false;
74
+ }
75
+
76
+ if ((a->vn | a->vm | a->vd) & a->q) {
77
+ return false;
78
+ }
79
+
80
+ if (!vfp_access_check(s)) {
81
+ return true;
82
+ }
83
+
84
+ TCGv_ptr fpstatus = get_fpstatus_ptr(1);
85
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
86
+ tmp = neon_load_reg(a->vn, pass);
87
+ tmp2 = neon_load_reg(a->vm, pass);
88
+ if (reads_vd) {
89
+ TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
90
+ fn(tmp_rd, tmp, tmp2, fpstatus);
91
+ neon_store_reg(a->vd, pass, tmp_rd);
92
+ tcg_temp_free_i32(tmp);
93
+ } else {
94
+ fn(tmp, tmp, tmp2, fpstatus);
95
+ neon_store_reg(a->vd, pass, tmp);
96
+ }
97
+ tcg_temp_free_i32(tmp2);
98
+ }
99
+ tcg_temp_free_ptr(fpstatus);
100
+ return true;
101
+}
102
+
103
/*
104
* For all the functions using this macro, size == 1 means fp16,
105
* which is an architecture extension we don't implement yet.
106
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
107
DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
108
DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
109
DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
110
+DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
111
+
112
+/*
113
+ * For all the functions using this macro, size == 1 means fp16,
114
+ * which is an architecture extension we don't implement yet.
115
+ */
116
+#define DO_3S_FP(INSN,FUNC,READS_VD) \
117
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
118
+ { \
119
+ if (a->size != 0) { \
120
+ /* TODO fp16 support */ \
121
+ return false; \
122
+ } \
123
+ return do_3same_fp(s, a, FUNC, READS_VD); \
124
+ }
125
+
126
+static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
127
+ TCGv_ptr fpstatus)
128
+{
129
+ gen_helper_vfp_muls(vn, vn, vm, fpstatus);
130
+ gen_helper_vfp_adds(vd, vd, vn, fpstatus);
131
+}
132
+
133
+static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
134
+ TCGv_ptr fpstatus)
135
+{
136
+ gen_helper_vfp_muls(vn, vn, vm, fpstatus);
137
+ gen_helper_vfp_subs(vd, vd, vn, fpstatus);
138
+}
139
+
140
+DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
141
+DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
142
143
static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
144
{
145
diff --git a/target/arm/translate.c b/target/arm/translate.c
146
index XXXXXXX..XXXXXXX 100644
147
--- a/target/arm/translate.c
148
+++ b/target/arm/translate.c
149
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
150
case NEON_3R_VPADD_VQRDMLAH:
151
case NEON_3R_VQDMULH_VQRDMULH:
152
case NEON_3R_FLOAT_ARITH:
153
+ case NEON_3R_FLOAT_MULTIPLY:
154
/* Already handled by decodetree */
155
return 1;
156
}
157
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
158
tmp = neon_load_reg(rn, pass);
159
tmp2 = neon_load_reg(rm, pass);
160
switch (op) {
161
- case NEON_3R_FLOAT_MULTIPLY:
162
- {
163
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
164
- gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
165
- if (!u) {
166
- tcg_temp_free_i32(tmp2);
167
- tmp2 = neon_load_reg(rd, pass);
168
- if (size == 0) {
169
- gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
170
- } else {
171
- gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
172
- }
173
- }
174
- tcg_temp_free_ptr(fpstatus);
175
- break;
176
- }
177
case NEON_3R_FLOAT_CMP:
178
{
179
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
180
--
39
--
181
2.20.1
40
2.34.1
182
183
1
Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
decodetree. These are the only remaining 'pairwise' operations,
2
parts64_default_nan().
3
so we can delete the pairwise-specific bits of the old decoder's
4
for-each-element loop now.
5
3
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-38-peter.maydell@linaro.org
9
---
7
---
10
target/arm/neon-dp.decode | 5 +++
8
target/i386/tcg/fpu_helper.c | 4 ++++
11
target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 3 ---
12
target/arm/translate.c | 63 +++++----------------------------
10
2 files changed, 4 insertions(+), 3 deletions(-)
13
3 files changed, 76 insertions(+), 55 deletions(-)
14
11
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/neon-dp.decode
14
--- a/target/i386/tcg/fpu_helper.c
18
+++ b/target/arm/neon-dp.decode
15
+++ b/target/i386/tcg/fpu_helper.c
19
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@ void cpu_init_fp_statuses(CPUX86State *env)
20
# For FP insns the high bit of 'size' is used as part of opcode decode
17
*/
21
@3same_fp .... ... . . . . size:1 .... .... .... . q:1 . . .... \
18
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
22
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
19
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
23
+@3same_fp_q0 .... ... . . . . size:1 .... .... .... . 0 . . .... \
20
+ /* Default NaN: sign bit set, most significant frac bit set */
24
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
21
+ set_float_default_nan_pattern(0b11000000, &env->fp_status);
25
22
+ set_float_default_nan_pattern(0b11000000, &env->mmx_status);
26
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
23
+ set_float_default_nan_pattern(0b11000000, &env->sse_status);
27
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
24
}
28
@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
25
29
26
static inline uint8_t save_exception_flags(CPUX86State *env)
30
VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
27
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
31
VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
32
+VPADD_fp_3s 1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
33
VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
34
+VPMAX_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
35
+VPMIN_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
36
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
37
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/translate-neon.inc.c
29
--- a/fpu/softfloat-specialize.c.inc
39
+++ b/target/arm/translate-neon.inc.c
30
+++ b/fpu/softfloat-specialize.c.inc
40
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
31
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
41
DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
32
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
42
DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
33
/* Sign bit clear, all frac bits set */
43
DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
34
dnan_pattern = 0b01111111;
44
+
35
-#elif defined(TARGET_I386) || defined(TARGET_X86_64)
45
+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
36
- /* Sign bit set, most significant frac bit set */
46
+{
37
- dnan_pattern = 0b11000000;
47
+ /* FP operations handled pairwise 32 bits at a time */
38
#elif defined(TARGET_HPPA)
48
+ TCGv_i32 tmp, tmp2, tmp3;
39
/* Sign bit clear, msb-1 frac bit set */
49
+ TCGv_ptr fpstatus;
40
dnan_pattern = 0b00100000;
50
+
51
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
52
+ return false;
53
+ }
54
+
55
+ /* UNDEF accesses to D16-D31 if they don't exist. */
56
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
57
+ ((a->vd | a->vn | a->vm) & 0x10)) {
58
+ return false;
59
+ }
60
+
61
+ if (!vfp_access_check(s)) {
62
+ return true;
63
+ }
64
+
65
+ assert(a->q == 0); /* enforced by decode patterns */
66
+
67
+ /*
68
+ * Note that we have to be careful not to clobber the source operands
69
+ * in the "vm == vd" case by storing the result of the first pass too
70
+ * early. Since Q is 0 there are always just two passes, so instead
71
+ * of a complicated loop over each pass we just unroll.
72
+ */
73
+ fpstatus = get_fpstatus_ptr(1);
74
+ tmp = neon_load_reg(a->vn, 0);
75
+ tmp2 = neon_load_reg(a->vn, 1);
76
+ fn(tmp, tmp, tmp2, fpstatus);
77
+ tcg_temp_free_i32(tmp2);
78
+
79
+ tmp3 = neon_load_reg(a->vm, 0);
80
+ tmp2 = neon_load_reg(a->vm, 1);
81
+ fn(tmp3, tmp3, tmp2, fpstatus);
82
+ tcg_temp_free_i32(tmp2);
83
+ tcg_temp_free_ptr(fpstatus);
84
+
85
+ neon_store_reg(a->vd, 0, tmp);
86
+ neon_store_reg(a->vd, 1, tmp3);
87
+ return true;
88
+}
89
+
90
+/*
91
+ * For all the functions using this macro, size == 1 means fp16,
92
+ * which is an architecture extension we don't implement yet.
93
+ */
94
+#define DO_3S_FP_PAIR(INSN,FUNC) \
95
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
96
+ { \
97
+ if (a->size != 0) { \
98
+ /* TODO fp16 support */ \
99
+ return false; \
100
+ } \
101
+ return do_3same_fp_pair(s, a, FUNC); \
102
+ }
103
+
104
+DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
105
+DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
106
+DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
107
diff --git a/target/arm/translate.c b/target/arm/translate.c
108
index XXXXXXX..XXXXXXX 100644
109
--- a/target/arm/translate.c
110
+++ b/target/arm/translate.c
111
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
112
int shift;
113
int pass;
114
int count;
115
- int pairwise;
116
int u;
117
int vec_size;
118
uint32_t imm;
119
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
120
case NEON_3R_VPMIN:
121
case NEON_3R_VPADD_VQRDMLAH:
122
case NEON_3R_VQDMULH_VQRDMULH:
123
+ case NEON_3R_FLOAT_ARITH:
124
/* Already handled by decodetree */
125
return 1;
126
}
127
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
128
/* 64-bit element instructions: handled by decodetree */
129
return 1;
130
}
131
- pairwise = 0;
132
switch (op) {
133
- case NEON_3R_FLOAT_ARITH:
134
- pairwise = (u && size < 2); /* if VPADD (float) */
135
- if (!pairwise) {
136
- return 1; /* handled by decodetree */
137
- }
138
- break;
139
case NEON_3R_FLOAT_MINMAX:
140
- pairwise = u; /* if VPMIN/VPMAX (float) */
141
+ if (u) {
142
+ return 1; /* VPMIN/VPMAX handled by decodetree */
143
+ }
144
break;
145
case NEON_3R_FLOAT_CMP:
146
if (!u && size) {
147
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
148
break;
149
}
150
151
- if (pairwise && q) {
152
- /* All the pairwise insns UNDEF if Q is set */
153
- return 1;
154
- }
155
-
156
for (pass = 0; pass < (q ? 4 : 2); pass++) {
157
158
- if (pairwise) {
159
- /* Pairwise. */
160
- if (pass < 1) {
161
- tmp = neon_load_reg(rn, 0);
162
- tmp2 = neon_load_reg(rn, 1);
163
- } else {
164
- tmp = neon_load_reg(rm, 0);
165
- tmp2 = neon_load_reg(rm, 1);
166
- }
167
- } else {
168
- /* Elementwise. */
169
- tmp = neon_load_reg(rn, pass);
170
- tmp2 = neon_load_reg(rm, pass);
171
- }
172
+ /* Elementwise. */
173
+ tmp = neon_load_reg(rn, pass);
174
+ tmp2 = neon_load_reg(rm, pass);
175
switch (op) {
176
- case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
177
- {
178
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
179
- switch ((u << 2) | size) {
180
- case 4: /* VPADD */
181
- gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
182
- break;
183
- default:
184
- abort();
185
- }
186
- tcg_temp_free_ptr(fpstatus);
187
- break;
188
- }
189
case NEON_3R_FLOAT_MULTIPLY:
190
{
191
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
192
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
193
}
194
tcg_temp_free_i32(tmp2);
195
196
- /* Save the result. For elementwise operations we can put it
197
- straight into the destination register. For pairwise operations
198
- we have to be careful to avoid clobbering the source operands. */
199
- if (pairwise && rd == rm) {
200
- neon_store_scratch(pass, tmp);
201
- } else {
202
- neon_store_reg(rd, pass, tmp);
203
- }
204
+ neon_store_reg(rd, pass, tmp);
205
206
} /* for pass */
207
- if (pairwise && rd == rm) {
208
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
209
- tmp = neon_load_scratch(pass);
210
- neon_store_reg(rd, pass, tmp);
211
- }
212
- }
213
/* End of 3 register same size operations. */
214
} else if (insn & (1 << 4)) {
215
if ((insn & 0x00380080) != 0) {
216
--
41
--
217
2.20.1
42
2.34.1
218
219
1
Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
1
Set the default NaN pattern explicitly, and remove the ifdef from
2
decodetree. These are the last integer operations in the
2
parts64_default_nan().
3
3-reg-same group.
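
For reference, the per-element operation is unchanged by the conversion:

    VQDMULH:  res = sat((2 * a * b) >> esize)
    VQRDMULH: res = sat((2 * a * b + (1 << (esize - 1))) >> esize)

i.e. a saturating doubling multiply returning the high half, with
VQRDMULH adding a rounding constant before the shift.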
4
3
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-39-peter.maydell@linaro.org
8
---
7
---
9
target/arm/neon-dp.decode | 3 +++
8
target/hppa/fpu_helper.c | 2 ++
10
target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 3 ---
11
target/arm/translate.c | 24 +-----------------------
10
2 files changed, 2 insertions(+), 3 deletions(-)
12
3 files changed, 28 insertions(+), 23 deletions(-)
13
11
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/hppa/fpu_helper.c b/target/hppa/fpu_helper.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
14
--- a/target/hppa/fpu_helper.c
17
+++ b/target/arm/neon-dp.decode
15
+++ b/target/hppa/fpu_helper.c
18
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
16
@@ -XXX,XX +XXX,XX @@ void HELPER(loaded_fr0)(CPUHPPAState *env)
19
VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
17
set_float_3nan_prop_rule(float_3nan_prop_abc, &env->fp_status);
20
VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
18
/* For inf * 0 + NaN, return the input NaN */
21
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
+VQDMULH_3s 1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
20
+ /* Default NaN: sign bit clear, msb-1 frac bit set */
23
+VQRDMULH_3s 1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
21
+ set_float_default_nan_pattern(0b00100000, &env->fp_status);
24
+
22
}
25
VPADD_3s 1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
23
26
24
void cpu_hppa_loaded_fr0(CPUHPPAState *env)
27
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
29
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-neon.inc.c
27
--- a/fpu/softfloat-specialize.c.inc
31
+++ b/target/arm/translate-neon.inc.c
28
+++ b/fpu/softfloat-specialize.c.inc
32
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
33
DO_3SAME_PAIR(VPMAX_U, pmax_u)
30
#if defined(TARGET_SPARC) || defined(TARGET_M68K)
34
DO_3SAME_PAIR(VPMIN_U, pmin_u)
31
/* Sign bit clear, all frac bits set */
35
DO_3SAME_PAIR(VPADD, padd_u)
32
dnan_pattern = 0b01111111;
36
+
33
-#elif defined(TARGET_HPPA)
37
+#define DO_3SAME_VQDMULH(INSN, FUNC) \
34
- /* Sign bit clear, msb-1 frac bit set */
38
+ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
35
- dnan_pattern = 0b00100000;
39
+ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
36
#elif defined(TARGET_HEXAGON)
40
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
37
/* Sign bit set, all frac bits set. */
41
+ uint32_t rn_ofs, uint32_t rm_ofs, \
38
dnan_pattern = 0b11111111;
42
+ uint32_t oprsz, uint32_t maxsz) \
43
+ { \
44
+ static const GVecGen3 ops[2] = { \
45
+ { .fni4 = gen_##INSN##_tramp16 }, \
46
+ { .fni4 = gen_##INSN##_tramp32 }, \
47
+ }; \
48
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
49
+ } \
50
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
51
+ { \
52
+ if (a->size != 1 && a->size != 2) { \
53
+ return false; \
54
+ } \
55
+ return do_3same(s, a, gen_##INSN##_3s); \
56
+ }
57
+
58
+DO_3SAME_VQDMULH(VQDMULH, qdmulh)
59
+DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
60
diff --git a/target/arm/translate.c b/target/arm/translate.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/translate.c
63
+++ b/target/arm/translate.c
64
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
65
case NEON_3R_VPMAX:
66
case NEON_3R_VPMIN:
67
case NEON_3R_VPADD_VQRDMLAH:
68
+ case NEON_3R_VQDMULH_VQRDMULH:
69
/* Already handled by decodetree */
70
return 1;
71
}
72
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
73
tmp2 = neon_load_reg(rm, pass);
74
}
75
switch (op) {
76
- case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high. */
77
- if (!u) { /* VQDMULH */
78
- switch (size) {
79
- case 1:
80
- gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
81
- break;
82
- case 2:
83
- gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
84
- break;
85
- default: abort();
86
- }
87
- } else { /* VQRDMULH */
88
- switch (size) {
89
- case 1:
90
- gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
91
- break;
92
- case 2:
93
- gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
94
- break;
95
- default: abort();
96
- }
97
- }
98
- break;
99
case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
100
{
101
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
102
--
39
--
103
2.20.1
40
2.34.1
104
105
1
Convert the Neon integer VPADD 3-reg-same insns to decodetree. These
1
Set the default NaN pattern explicitly for the alpha target.
2
are 'pairwise' operations. (Note that VQRDMLAH, which shares the
3
same primary opcode but has U=1, has already been converted.)
4
2
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-40-peter.maydell@linaro.org
8
---
6
---
9
target/arm/neon-dp.decode | 2 ++
7
target/alpha/cpu.c | 2 ++
10
target/arm/translate-neon.inc.c | 2 ++
8
1 file changed, 2 insertions(+)
11
target/arm/translate.c | 19 +------------------
12
3 files changed, 5 insertions(+), 18 deletions(-)
13
9
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
10
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
12
--- a/target/alpha/cpu.c
17
+++ b/target/arm/neon-dp.decode
13
+++ b/target/alpha/cpu.c
18
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
14
@@ -XXX,XX +XXX,XX @@ static void alpha_cpu_initfn(Object *obj)
19
VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
15
* operand in Fa. That is float_2nan_prop_ba.
20
VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
16
*/
21
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
22
+VPADD_3s 1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
18
+ /* Default NaN: sign bit clear, msb frac bit set */
23
+
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
20
#if defined(CONFIG_USER_ONLY)
25
21
env->flags = ENV_FLAG_PS_USER | ENV_FLAG_FEN;
26
SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
22
cpu_alpha_store_fpcr(env, (uint64_t)(FPCR_INVD | FPCR_DZED | FPCR_OVFD
27
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/translate-neon.inc.c
30
+++ b/target/arm/translate-neon.inc.c
31
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
32
#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
33
#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
34
#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
35
+#define gen_helper_neon_padd_u32 tcg_gen_add_i32
36
37
DO_3SAME_PAIR(VPMAX_S, pmax_s)
38
DO_3SAME_PAIR(VPMIN_S, pmin_s)
39
DO_3SAME_PAIR(VPMAX_U, pmax_u)
40
DO_3SAME_PAIR(VPMIN_U, pmin_u)
41
+DO_3SAME_PAIR(VPADD, padd_u)
42
diff --git a/target/arm/translate.c b/target/arm/translate.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/translate.c
45
+++ b/target/arm/translate.c
46
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
47
return 1;
48
}
49
switch (op) {
50
- case NEON_3R_VPADD_VQRDMLAH:
51
- if (!u) {
52
- break; /* VPADD */
53
- }
54
- /* VQRDMLAH : handled by decodetree */
55
- return 1;
56
-
57
case NEON_3R_VFM_VQRDMLSH:
58
if (!u) {
59
/* VFM, VFMS */
60
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
61
case NEON_3R_VQRSHL:
62
case NEON_3R_VPMAX:
63
case NEON_3R_VPMIN:
64
+ case NEON_3R_VPADD_VQRDMLAH:
65
/* Already handled by decodetree */
66
return 1;
67
}
68
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
69
}
70
pairwise = 0;
71
switch (op) {
72
- case NEON_3R_VPADD_VQRDMLAH:
73
- pairwise = 1;
74
- break;
75
case NEON_3R_FLOAT_ARITH:
76
pairwise = (u && size < 2); /* if VPADD (float) */
77
break;
78
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
79
}
80
}
81
break;
82
- case NEON_3R_VPADD_VQRDMLAH:
83
- switch (size) {
84
- case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
85
- case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
86
- case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
87
- default: abort();
88
- }
89
- break;
90
case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
91
{
92
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
93
--
23
--
94
2.20.1
24
2.34.1
95
96
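
On the 'pairwise' wording in the VPADD patch above: the low half of the result is
built from adjacent pairs of Vn elements and the high half from adjacent pairs of Vm
elements. A rough scalar model for the four-lane 16-bit case (illustration only; the
function name is made up, and int16_t from <stdint.h> is assumed):

    /* VPADD.I16 Dd, Dn, Dm -- sketch of the element arrangement */
    static void vpadd_i16(int16_t d[4], const int16_t n[4], const int16_t m[4])
    {
        d[0] = n[0] + n[1];    /* pairs of Vn fill the low half ...  */
        d[1] = n[2] + n[3];
        d[2] = m[0] + m[1];    /* ... pairs of Vm fill the high half */
        d[3] = m[2] + m[3];
    }

This layout is also why the do_3same_pair() helper added later in this series reads
all the inputs before writing any results, so that Vd aliasing Vn or Vm is safe.
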
1
The usual location for the env argument in the argument list of a TCG helper
1
Set the default NaN pattern explicitly for the arm target.
2
is immediately after the return-value argument. recps_f32 and rsqrts_f32
2
This includes setting it for the old linux-user nwfpe emulation.
3
differ in that they put it at the end.
3
For nwfpe, our default doesn't match the real kernel, but we
4
4
avoid making a behaviour change in this commit.
5
Move the env argument to its usual place; this will allow us to
6
more easily use these helper functions with the gvec APIs.
7
5
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
8
Message-id: 20241202131347.498124-41-peter.maydell@linaro.org
11
---
9
---
12
target/arm/helper.h | 4 ++--
10
linux-user/arm/nwfpe/fpa11.c | 5 +++++
13
target/arm/translate.c | 4 ++--
11
target/arm/cpu.c | 2 ++
14
target/arm/vfp_helper.c | 4 ++--
12
2 files changed, 7 insertions(+)
15
3 files changed, 6 insertions(+), 6 deletions(-)
16
13
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
diff --git a/linux-user/arm/nwfpe/fpa11.c b/linux-user/arm/nwfpe/fpa11.c
18
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
16
--- a/linux-user/arm/nwfpe/fpa11.c
20
+++ b/target/arm/helper.h
17
+++ b/linux-user/arm/nwfpe/fpa11.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
18
@@ -XXX,XX +XXX,XX @@ void resetFPA11(void)
22
DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
19
* this late date.
23
DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
20
*/
24
21
set_float_2nan_prop_rule(float_2nan_prop_s_ab, &fpa11->fp_status);
25
-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
22
+ /*
26
-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
23
+ * Use the same default NaN value as Arm VFP. This doesn't match
27
+DEF_HELPER_3(recps_f32, f32, env, f32, f32)
24
+ * the Linux kernel's nwfpe emulation, which uses an all-1s value.
28
+DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
25
+ */
29
DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
26
+ set_float_default_nan_pattern(0b01000000, &fpa11->fp_status);
30
DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
27
}
31
DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
28
32
diff --git a/target/arm/translate.c b/target/arm/translate.c
29
void SetRoundingMode(const unsigned int opcode)
30
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
33
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate.c
32
--- a/target/arm/cpu.c
35
+++ b/target/arm/translate.c
33
+++ b/target/arm/cpu.c
36
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
34
@@ -XXX,XX +XXX,XX @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
37
tcg_temp_free_ptr(fpstatus);
35
* the pseudocode function the arguments are in the order c, a, b.
38
} else {
36
* * 0 * Inf + NaN returns the default NaN if the input NaN is quiet,
39
if (size == 0) {
37
* and the input NaN if it is signalling
40
- gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
38
+ * * Default NaN has sign bit clear, msb frac bit set
41
+ gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
39
*/
42
} else {
40
static void arm_set_default_fp_behaviours(float_status *s)
43
- gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
44
+ gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
45
}
46
}
47
break;
48
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/vfp_helper.c
51
+++ b/target/arm/vfp_helper.c
52
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
53
#define float32_three make_float32(0x40400000)
54
#define float32_one_point_five make_float32(0x3fc00000)
55
56
-float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
57
+float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
58
{
41
{
59
float_status *s = &env->vfp.standard_fp_status;
42
@@ -XXX,XX +XXX,XX @@ static void arm_set_default_fp_behaviours(float_status *s)
60
if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
43
set_float_2nan_prop_rule(float_2nan_prop_s_ab, s);
61
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
44
set_float_3nan_prop_rule(float_3nan_prop_s_cab, s);
62
return float32_sub(float32_two, float32_mul(a, b, s), s);
45
set_float_infzeronan_rule(float_infzeronan_dnan_if_qnan, s);
46
+ set_float_default_nan_pattern(0b01000000, s);
63
}
47
}
64
48
65
-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
49
static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
66
+float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
67
{
68
float_status *s = &env->vfp.standard_fp_status;
69
float32 product;
70
--
50
--
71
2.20.1
51
2.34.1
72
73
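
A note on the default-NaN patches in this series: the byte handed to
set_float_default_nan_pattern() packs the sign bit and the top fraction bits of the
NaN to be produced, with the low bit of the pattern effectively replicated into the
remaining fraction bits. Assuming that encoding, the patterns used by the various
targets correspond to these single-precision values (worked out by hand for
illustration, not quoted from the softfloat code):

    0b01000000  -> sign 0, frac MSB set          -> 0x7FC00000  (Arm, Alpha, PPC, ...)
    0b01111111  -> sign 0, all frac bits set     -> 0x7FFFFFFF  (m68k, SPARC)
    0b00111111  -> sign 0, frac MSB clear,
                   all other frac bits set       -> 0x7FBFFFFF  (sh4, legacy MIPS)
    0b11111111  -> sign 1, all frac bits set     -> 0xFFFFFFFF  (Hexagon)
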
1
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
1
Set the default NaN pattern explicitly for loongarch.
2
We already have gvec helpers for addition and subtraction, but must
3
add one for fabd.
4
2
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-42-peter.maydell@linaro.org
8
---
6
---
9
target/arm/helper.h | 3 ++-
7
target/loongarch/tcg/fpu_helper.c | 2 ++
10
target/arm/neon-dp.decode | 8 ++++++++
8
1 file changed, 2 insertions(+)
11
target/arm/neon_helper.c | 7 -------
12
target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
13
target/arm/translate.c | 10 +++-------
14
target/arm/vec_helper.c | 7 +++++++
15
6 files changed, 48 insertions(+), 15 deletions(-)
16
9
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
10
diff --git a/target/loongarch/tcg/fpu_helper.c b/target/loongarch/tcg/fpu_helper.c
18
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
12
--- a/target/loongarch/tcg/fpu_helper.c
20
+++ b/target/arm/helper.h
13
+++ b/target/loongarch/tcg/fpu_helper.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
14
@@ -XXX,XX +XXX,XX @@ void restore_fp_status(CPULoongArchState *env)
22
DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
15
*/
23
DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
17
set_float_3nan_prop_rule(float_3nan_prop_s_cab, &env->fp_status);
25
-DEF_HELPER_3(neon_abd_f32, i32, i32, i32, ptr)
18
+ /* Default NaN: sign bit clear, msb frac bit set */
26
DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
27
DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
28
DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
33
+DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+
35
DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
36
void, ptr, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
38
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/neon-dp.decode
41
+++ b/target/arm/neon-dp.decode
42
@@ -XXX,XX +XXX,XX @@
43
@3same_q0 .... ... . . . size:2 .... .... .... . 0 . . .... \
44
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
45
46
+# For FP insns the high bit of 'size' is used as part of opcode decode
47
+@3same_fp .... ... . . . . size:1 .... .... .... . q:1 . . .... \
48
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
49
+
50
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
51
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
52
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
53
@@ -XXX,XX +XXX,XX @@ SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
54
vm=%vm_dp vn=%vn_dp vd=%vd_dp
55
56
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
57
+
58
+VADD_fp_3s 1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
59
+VSUB_fp_3s 1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
60
+VABD_fp_3s 1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
61
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/neon_helper.c
64
+++ b/target/arm/neon_helper.c
65
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
66
}
20
}
67
21
68
/* NEON Float helpers. */
22
int ieee_ex_to_loongarch(int xcpt)
69
-uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b, void *fpstp)
70
-{
71
- float_status *fpst = fpstp;
72
- float32 f0 = make_float32(a);
73
- float32 f1 = make_float32(b);
74
- return float32_val(float32_abs(float32_sub(f0, f1, fpst)));
75
-}
76
77
/* Floating point comparisons produce an integer result.
78
* Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
79
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
80
index XXXXXXX..XXXXXXX 100644
81
--- a/target/arm/translate-neon.inc.c
82
+++ b/target/arm/translate-neon.inc.c
83
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
84
85
DO_3SAME_VQDMULH(VQDMULH, qdmulh)
86
DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
87
+
88
+/*
89
+ * For all the functions using this macro, size == 1 means fp16,
90
+ * which is an architecture extension we don't implement yet.
91
+ */
92
+#define DO_3S_FP_GVEC(INSN,FUNC) \
93
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
94
+ uint32_t rn_ofs, uint32_t rm_ofs, \
95
+ uint32_t oprsz, uint32_t maxsz) \
96
+ { \
97
+ TCGv_ptr fpst = get_fpstatus_ptr(1); \
98
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst, \
99
+ oprsz, maxsz, 0, FUNC); \
100
+ tcg_temp_free_ptr(fpst); \
101
+ } \
102
+ static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
103
+ { \
104
+ if (a->size != 0) { \
105
+ /* TODO fp16 support */ \
106
+ return false; \
107
+ } \
108
+ return do_3same(s, a, gen_##INSN##_3s); \
109
+ }
110
+
111
+
112
+DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
113
+DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
114
+DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
115
diff --git a/target/arm/translate.c b/target/arm/translate.c
116
index XXXXXXX..XXXXXXX 100644
117
--- a/target/arm/translate.c
118
+++ b/target/arm/translate.c
119
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
120
switch (op) {
121
case NEON_3R_FLOAT_ARITH:
122
pairwise = (u && size < 2); /* if VPADD (float) */
123
+ if (!pairwise) {
124
+ return 1; /* handled by decodetree */
125
+ }
126
break;
127
case NEON_3R_FLOAT_MINMAX:
128
pairwise = u; /* if VPMIN/VPMAX (float) */
129
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
130
{
131
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
132
switch ((u << 2) | size) {
133
- case 0: /* VADD */
134
case 4: /* VPADD */
135
gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
136
break;
137
- case 2: /* VSUB */
138
- gen_helper_vfp_subs(tmp, tmp, tmp2, fpstatus);
139
- break;
140
- case 6: /* VABD */
141
- gen_helper_neon_abd_f32(tmp, tmp, tmp2, fpstatus);
142
- break;
143
default:
144
abort();
145
}
146
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
147
index XXXXXXX..XXXXXXX 100644
148
--- a/target/arm/vec_helper.c
149
+++ b/target/arm/vec_helper.c
150
@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
151
return result;
152
}
153
154
+static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
155
+{
156
+ return float32_abs(float32_sub(op1, op2, stat));
157
+}
158
+
159
#define DO_3OP(NAME, FUNC, TYPE) \
160
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
161
{ \
162
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
163
DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
164
DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
165
166
+DO_3OP(gvec_fabd_s, float32_abd, float32)
167
+
168
#ifdef TARGET_AARCH64
169
170
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
171
--
23
--
172
2.20.1
24
2.34.1
173
174
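
For readers unfamiliar with the gvec helper pattern used above: DO_3OP stamps out a
flat loop over the vector elements, so the new gvec_fabd_s helper ends up shaped
roughly like the sketch below (simplified from the DO_3OP expansion in vec_helper.c,
not a verbatim copy):

    /* Approximate shape of the expanded helper */
    void helper_gvec_fabd_s(void *vd, void *vn, void *vm, void *stat, uint32_t desc)
    {
        intptr_t i, oprsz = simd_oprsz(desc);
        float32 *d = vd, *n = vn, *m = vm;

        for (i = 0; i < oprsz / sizeof(float32); i++) {
            d[i] = float32_abs(float32_sub(n[i], m[i], stat));
        }
        clear_tail(d, oprsz, simd_maxsz(desc));   /* zero the unused tail of the reg */
    }
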
1
Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
1
Set the default NaN pattern explicitly for m68k.
2
(These are all the other insns in 3-reg-same which were using
3
GEN_NEON_INTEGER_OP() and which are not pairwise or
4
reversed-operands.)
5
2
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-43-peter.maydell@linaro.org
9
---
6
---
10
target/arm/neon-dp.decode | 6 ++++++
7
target/m68k/cpu.c | 2 ++
11
target/arm/translate-neon.inc.c | 4 ++++
8
fpu/softfloat-specialize.c.inc | 2 +-
12
target/arm/translate.c | 8 ++------
9
2 files changed, 3 insertions(+), 1 deletion(-)
13
3 files changed, 12 insertions(+), 6 deletions(-)
14
10
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
11
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
16
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/neon-dp.decode
13
--- a/target/m68k/cpu.c
18
+++ b/target/arm/neon-dp.decode
14
+++ b/target/m68k/cpu.c
19
@@ -XXX,XX +XXX,XX @@ VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
15
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type)
20
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
16
* preceding paragraph for nonsignaling NaNs.
21
VQADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
17
*/
22
18
set_float_2nan_prop_rule(float_2nan_prop_ab, &env->fp_status);
23
+VRHADD_S_3s 1111 001 0 0 . .. .... .... 0001 . . . 0 .... @3same
19
+ /* Default NaN: sign bit clear, all frac bits set */
24
+VRHADD_U_3s 1111 001 1 0 . .. .... .... 0001 . . . 0 .... @3same
20
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
25
+
21
26
@3same_logic .... ... . . . .. .... .... .... . q:1 .. .... \
22
nan = floatx80_default_nan(&env->fp_status);
27
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
23
for (i = 0; i < 8; i++) {
28
24
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
29
@@ -XXX,XX +XXX,XX @@ VBSL_3s 1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
30
VBIT_3s 1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
31
VBIF_3s 1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
32
33
+VHSUB_S_3s 1111 001 0 0 . .. .... .... 0010 . . . 0 .... @3same
34
+VHSUB_U_3s 1111 001 1 0 . .. .... .... 0010 . . . 0 .... @3same
35
+
36
VQSUB_S_3s 1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
37
VQSUB_U_3s 1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
38
39
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
40
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate-neon.inc.c
26
--- a/fpu/softfloat-specialize.c.inc
42
+++ b/target/arm/translate-neon.inc.c
27
+++ b/fpu/softfloat-specialize.c.inc
43
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
28
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
44
29
uint8_t dnan_pattern = status->default_nan_pattern;
45
DO_3SAME_32(VHADD_S, hadd_s)
30
46
DO_3SAME_32(VHADD_U, hadd_u)
31
if (dnan_pattern == 0) {
47
+DO_3SAME_32(VHSUB_S, hsub_s)
32
-#if defined(TARGET_SPARC) || defined(TARGET_M68K)
48
+DO_3SAME_32(VHSUB_U, hsub_u)
33
+#if defined(TARGET_SPARC)
49
+DO_3SAME_32(VRHADD_S, rhadd_s)
34
/* Sign bit clear, all frac bits set */
50
+DO_3SAME_32(VRHADD_U, rhadd_u)
35
dnan_pattern = 0b01111111;
51
diff --git a/target/arm/translate.c b/target/arm/translate.c
36
#elif defined(TARGET_HEXAGON)
52
index XXXXXXX..XXXXXXX 100644
53
--- a/target/arm/translate.c
54
+++ b/target/arm/translate.c
55
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
56
case NEON_3R_VSHL:
57
case NEON_3R_SHA:
58
case NEON_3R_VHADD:
59
+ case NEON_3R_VRHADD:
60
+ case NEON_3R_VHSUB:
61
case NEON_3R_VABD:
62
case NEON_3R_VABA:
63
/* Already handled by decodetree */
64
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
65
tmp2 = neon_load_reg(rm, pass);
66
}
67
switch (op) {
68
- case NEON_3R_VRHADD:
69
- GEN_NEON_INTEGER_OP(rhadd);
70
- break;
71
- case NEON_3R_VHSUB:
72
- GEN_NEON_INTEGER_OP(hsub);
73
- break;
74
case NEON_3R_VQSHL:
75
GEN_NEON_INTEGER_OP_ENV(qshl);
76
break;
77
--
37
--
78
2.20.1
38
2.34.1
79
80
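
For reference, the halving operations converted in the VRHADD/VHSUB patch above do
their arithmetic in a wider intermediate so the half-sized result cannot overflow.
A scalar sketch for the signed 8-bit lanes (illustration only; other widths and the
unsigned forms are analogous, and int8_t/int16_t from <stdint.h> are assumed):

    /* Rounding halving add and halving subtract, signed 8-bit */
    static int8_t vrhadd_s8(int8_t a, int8_t b)
    {
        return ((int16_t)a + b + 1) >> 1;   /* average, rounding halves upwards */
    }

    static int8_t vhsub_s8(int8_t a, int8_t b)
    {
        return ((int16_t)a - b) >> 1;       /* halving subtract: (a - b) >> 1 */
    }
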
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for MIPS. Note that this
2
is our only target which currently changes the default NaN
3
at runtime (which it was previously doing indirectly when it
4
changed the snan_bit_is_one setting).
2
5
3
Pass a pointer directly to env->vfp.qc[0], rather than env.
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
This will allow SVE2, which does not modify QC, to pass a
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
pointer to dummy storage.
8
Message-id: 20241202131347.498124-44-peter.maydell@linaro.org
9
---
10
target/mips/fpu_helper.h | 7 +++++++
11
target/mips/msa.c | 3 +++
12
2 files changed, 10 insertions(+)
6
13
7
Change the return type of inl_qrdmlah_s16 and inl_qrdmlsh_s16 to match the
14
diff --git a/target/mips/fpu_helper.h b/target/mips/fpu_helper.h
8
sense of the operation: signed.
9
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
15
target/arm/translate.c | 18 ++++++++---
16
target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
17
2 files changed, 54 insertions(+), 34 deletions(-)
18
19
diff --git a/target/arm/translate.c b/target/arm/translate.c
20
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/translate.c
16
--- a/target/mips/fpu_helper.h
22
+++ b/target/arm/translate.c
17
+++ b/target/mips/fpu_helper.h
23
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
18
@@ -XXX,XX +XXX,XX @@ static inline void restore_snan_bit_mode(CPUMIPSState *env)
24
[NEON_2RM_VCVT_UF] = 0x4,
19
set_float_infzeronan_rule(izn_rule, &env->active_fpu.fp_status);
25
};
20
nan3_rule = nan2008 ? float_3nan_prop_s_cab : float_3nan_prop_s_abc;
26
21
set_float_3nan_prop_rule(nan3_rule, &env->active_fpu.fp_status);
27
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
22
+ /*
28
+ uint32_t opr_sz, uint32_t max_sz,
23
+ * With nan2008, the default NaN value has the sign bit clear and the
29
+ gen_helper_gvec_3_ptr *fn)
24
+ * frac msb set; with the older mode, the sign bit is clear, and all
30
+{
25
+ * frac bits except the msb are set.
31
+ TCGv_ptr qc_ptr = tcg_temp_new_ptr();
26
+ */
32
+
27
+ set_float_default_nan_pattern(nan2008 ? 0b01000000 : 0b00111111,
33
+ tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
28
+ &env->active_fpu.fp_status);
34
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
29
35
+ opr_sz, max_sz, 0, fn);
36
+ tcg_temp_free_ptr(qc_ptr);
37
+}
38
+
39
void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
40
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
41
{
42
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
43
gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
44
};
45
tcg_debug_assert(vece >= 1 && vece <= 2);
46
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
47
- opr_sz, max_sz, 0, fns[vece - 1]);
48
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
49
}
30
}
50
31
51
void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
32
diff --git a/target/mips/msa.c b/target/mips/msa.c
52
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
53
gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
54
};
55
tcg_debug_assert(vece >= 1 && vece <= 2);
56
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
57
- opr_sz, max_sz, 0, fns[vece - 1]);
58
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
59
}
60
61
#define GEN_CMP0(NAME, COND) \
62
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
63
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
64
--- a/target/arm/vec_helper.c
34
--- a/target/mips/msa.c
65
+++ b/target/arm/vec_helper.c
35
+++ b/target/mips/msa.c
66
@@ -XXX,XX +XXX,XX @@
36
@@ -XXX,XX +XXX,XX @@ void msa_reset(CPUMIPSState *env)
67
#define H4(x) (x)
37
/* Inf * 0 + NaN returns the input NaN */
68
#endif
38
set_float_infzeronan_rule(float_infzeronan_dnan_never,
69
39
&env->active_tc.msa_fp_status);
70
-#define SET_QC() env->vfp.qc[0] = 1
40
+ /* Default NaN: sign bit clear, frac msb set */
71
-
41
+ set_float_default_nan_pattern(0b01000000,
72
static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
42
+ &env->active_tc.msa_fp_status);
73
{
74
uint64_t *d = vd + opr_sz;
75
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
76
}
77
78
/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
79
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
80
- int16_t src2, int16_t src3)
81
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
82
+ int16_t src3, uint32_t *sat)
83
{
84
/* Simplify:
85
* = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
86
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
87
ret = ((int32_t)src3 << 15) + ret + (1 << 14);
88
ret >>= 15;
89
if (ret != (int16_t)ret) {
90
- SET_QC();
91
+ *sat = 1;
92
ret = (ret < 0 ? -0x8000 : 0x7fff);
93
}
94
return ret;
95
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
96
uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
97
uint32_t src2, uint32_t src3)
98
{
99
- uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
100
- uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
101
+ uint32_t *sat = &env->vfp.qc[0];
102
+ uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
103
+ uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
104
return deposit32(e1, 16, 16, e2);
105
}
106
107
void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
108
- void *ve, uint32_t desc)
109
+ void *vq, uint32_t desc)
110
{
111
uintptr_t opr_sz = simd_oprsz(desc);
112
int16_t *d = vd;
113
int16_t *n = vn;
114
int16_t *m = vm;
115
- CPUARMState *env = ve;
116
uintptr_t i;
117
118
for (i = 0; i < opr_sz / 2; ++i) {
119
- d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
120
+ d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
121
}
122
clear_tail(d, opr_sz, simd_maxsz(desc));
123
}
124
125
/* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
126
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
127
- int16_t src2, int16_t src3)
128
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
129
+ int16_t src3, uint32_t *sat)
130
{
131
/* Similarly, using subtraction:
132
* = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
133
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
134
ret = ((int32_t)src3 << 15) - ret + (1 << 14);
135
ret >>= 15;
136
if (ret != (int16_t)ret) {
137
- SET_QC();
138
+ *sat = 1;
139
ret = (ret < 0 ? -0x8000 : 0x7fff);
140
}
141
return ret;
142
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
143
uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
144
uint32_t src2, uint32_t src3)
145
{
146
- uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
147
- uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
148
+ uint32_t *sat = &env->vfp.qc[0];
149
+ uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
150
+ uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
151
return deposit32(e1, 16, 16, e2);
152
}
153
154
void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
155
- void *ve, uint32_t desc)
156
+ void *vq, uint32_t desc)
157
{
158
uintptr_t opr_sz = simd_oprsz(desc);
159
int16_t *d = vd;
160
int16_t *n = vn;
161
int16_t *m = vm;
162
- CPUARMState *env = ve;
163
uintptr_t i;
164
165
for (i = 0; i < opr_sz / 2; ++i) {
166
- d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
167
+ d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
168
}
169
clear_tail(d, opr_sz, simd_maxsz(desc));
170
}
171
172
/* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
173
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
174
- int32_t src2, int32_t src3)
175
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
176
+ int32_t src3, uint32_t *sat)
177
{
178
/* Simplify similarly to int_qrdmlah_s16 above. */
179
int64_t ret = (int64_t)src1 * src2;
180
ret = ((int64_t)src3 << 31) + ret + (1 << 30);
181
ret >>= 31;
182
if (ret != (int32_t)ret) {
183
- SET_QC();
184
+ *sat = 1;
185
ret = (ret < 0 ? INT32_MIN : INT32_MAX);
186
}
187
return ret;
188
}
189
190
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
191
+ int32_t src2, int32_t src3)
192
+{
193
+ uint32_t *sat = &env->vfp.qc[0];
194
+ return inl_qrdmlah_s32(src1, src2, src3, sat);
195
+}
196
+
197
void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
198
- void *ve, uint32_t desc)
199
+ void *vq, uint32_t desc)
200
{
201
uintptr_t opr_sz = simd_oprsz(desc);
202
int32_t *d = vd;
203
int32_t *n = vn;
204
int32_t *m = vm;
205
- CPUARMState *env = ve;
206
uintptr_t i;
207
208
for (i = 0; i < opr_sz / 4; ++i) {
209
- d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
210
+ d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
211
}
212
clear_tail(d, opr_sz, simd_maxsz(desc));
213
}
214
215
/* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
216
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
217
- int32_t src2, int32_t src3)
218
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
219
+ int32_t src3, uint32_t *sat)
220
{
221
/* Simplify similarly to int_qrdmlsh_s16 above. */
222
int64_t ret = (int64_t)src1 * src2;
223
ret = ((int64_t)src3 << 31) - ret + (1 << 30);
224
ret >>= 31;
225
if (ret != (int32_t)ret) {
226
- SET_QC();
227
+ *sat = 1;
228
ret = (ret < 0 ? INT32_MIN : INT32_MAX);
229
}
230
return ret;
231
}
232
233
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
234
+ int32_t src2, int32_t src3)
235
+{
236
+ uint32_t *sat = &env->vfp.qc[0];
237
+ return inl_qrdmlsh_s32(src1, src2, src3, sat);
238
+}
239
+
240
void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
241
- void *ve, uint32_t desc)
242
+ void *vq, uint32_t desc)
243
{
244
uintptr_t opr_sz = simd_oprsz(desc);
245
int32_t *d = vd;
246
int32_t *n = vn;
247
int32_t *m = vm;
248
- CPUARMState *env = ve;
249
uintptr_t i;
250
251
for (i = 0; i < opr_sz / 4; ++i) {
252
- d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
253
+ d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
254
}
255
clear_tail(d, opr_sz, simd_maxsz(desc));
256
}
43
}
257
--
44
--
258
2.20.1
45
2.34.1
259
260
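
The point of routing saturation through a uint32_t pointer in the patch above is that
a caller which must not disturb QC (the SVE2 variants mentioned in the commit message)
can pass scratch storage instead of &env->vfp.qc[0]. A small toy version of the idea
(names invented for the illustration, not QEMU code):

    #include <stdint.h>

    static int16_t sat_add16(int16_t a, int16_t b, uint32_t *sat)
    {
        int32_t r = (int32_t)a + b;
        if (r != (int16_t)r) {
            *sat = 1;                       /* report saturation via the pointer */
            r = r < 0 ? INT16_MIN : INT16_MAX;
        }
        return r;
    }

    /* An AdvSIMD-style caller records the flag; an SVE2-style caller discards it: */
    /*   uint32_t qc = 0, discard;                                                 */
    /*   sat_add16(x, y, &qc);       -> updates qc on saturation                   */
    /*   sat_add16(x, y, &discard);  -> same arithmetic, qc untouched              */
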
1
Convert the Neon VABA and VABD insns in the 3-reg-same group to
1
Set the default NaN pattern explicitly for openrisc.
2
decodetree.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-45-peter.maydell@linaro.org
7
---
6
---
8
target/arm/neon-dp.decode | 6 ++++++
7
target/openrisc/cpu.c | 2 ++
9
target/arm/translate-neon.inc.c | 4 ++++
8
1 file changed, 2 insertions(+)
10
target/arm/translate.c | 22 ++--------------------
11
3 files changed, 12 insertions(+), 20 deletions(-)
12
9
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
10
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
12
--- a/target/openrisc/cpu.c
16
+++ b/target/arm/neon-dp.decode
13
+++ b/target/openrisc/cpu.c
17
@@ -XXX,XX +XXX,XX @@ VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
14
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_reset_hold(Object *obj, ResetType type)
18
VMIN_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
15
*/
19
VMIN_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
16
set_float_2nan_prop_rule(float_2nan_prop_x87, &cpu->env.fp_status);
20
17
21
+VABD_S_3s 1111 001 0 0 . .. .... .... 0111 . . . 0 .... @3same
18
+ /* Default NaN: sign bit clear, frac msb set */
22
+VABD_U_3s 1111 001 1 0 . .. .... .... 0111 . . . 0 .... @3same
19
+ set_float_default_nan_pattern(0b01000000, &cpu->env.fp_status);
23
+
20
24
+VABA_S_3s 1111 001 0 0 . .. .... .... 0111 . . . 1 .... @3same
21
#ifndef CONFIG_USER_ONLY
25
+VABA_U_3s 1111 001 1 0 . .. .... .... 0111 . . . 1 .... @3same
22
cpu->env.picmr = 0x00000000;
26
+
27
VADD_3s 1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
28
VSUB_3s 1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
29
30
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/translate-neon.inc.c
33
+++ b/target/arm/translate-neon.inc.c
34
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
35
DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
36
DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
37
DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
38
+DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
39
+DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
40
+DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
41
+DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
42
43
#define DO_3SAME_CMP(INSN, COND) \
44
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
45
diff --git a/target/arm/translate.c b/target/arm/translate.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.c
48
+++ b/target/arm/translate.c
49
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
50
/* VQRDMLSH : handled by decodetree */
51
return 1;
52
53
- case NEON_3R_VABD:
54
- if (u) {
55
- gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
56
- vec_size, vec_size);
57
- } else {
58
- gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
59
- vec_size, vec_size);
60
- }
61
- return 0;
62
-
63
- case NEON_3R_VABA:
64
- if (u) {
65
- gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
66
- vec_size, vec_size);
67
- } else {
68
- gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
69
- vec_size, vec_size);
70
- }
71
- return 0;
72
-
73
case NEON_3R_VADD_VSUB:
74
case NEON_3R_LOGIC:
75
case NEON_3R_VMAX:
76
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
77
case NEON_3R_VSHL:
78
case NEON_3R_SHA:
79
case NEON_3R_VHADD:
80
+ case NEON_3R_VABD:
81
+ case NEON_3R_VABA:
82
/* Already handled by decodetree */
83
return 1;
84
}
85
--
23
--
86
2.20.1
24
2.34.1
87
88
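
Scalar reference semantics for the two operations converted above, as a rough guide
(the real code uses the vectorised gen_gvec_{s,u}abd and gen_gvec_{s,u}aba expanders;
the helper names below are invented for the sketch):

    /* Absolute difference, and accumulate-absolute-difference, signed 32-bit lanes */
    static int32_t vabd_s32(int32_t a, int32_t b)
    {
        return a > b ? a - b : b - a;    /* |a - b|; the insn wraps modulo 2^32 */
    }

    static int32_t vaba_s32(int32_t acc, int32_t a, int32_t b)
    {
        return acc + vabd_s32(a, b);     /* accumulate into the destination */
    }
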
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for ppc.
2
2
3
These operations do not touch fp_status.
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-46-peter.maydell@linaro.org
6
---
7
target/ppc/cpu_init.c | 4 ++++
8
1 file changed, 4 insertions(+)
4
9
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/helper.h | 4 ++--
11
target/arm/translate-a64.c | 5 ++---
12
target/arm/translate.c | 12 ++----------
13
target/arm/vfp_helper.c | 5 ++---
14
4 files changed, 8 insertions(+), 18 deletions(-)
15
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.h
12
--- a/target/ppc/cpu_init.c
19
+++ b/target/arm/helper.h
13
+++ b/target/ppc/cpu_init.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
14
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_reset_hold(Object *obj, ResetType type)
21
DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
15
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->vec_status);
23
DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
17
24
-DEF_HELPER_2(recpe_u32, i32, i32, ptr)
18
+ /* Default NaN: sign bit clear, set frac msb */
25
-DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
26
+DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
20
+ set_float_default_nan_pattern(0b01000000, &env->vec_status);
27
+DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
21
+
28
DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
22
for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
29
23
ppc_spr_t *spr = &env->spr_cb[i];
30
DEF_HELPER_3(shl_cc, i32, env, i32, i32)
31
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate-a64.c
34
+++ b/target/arm/translate-a64.c
35
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
36
37
switch (opcode) {
38
case 0x3c: /* URECPE */
39
- gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
40
+ gen_helper_recpe_u32(tcg_res, tcg_op);
41
break;
42
case 0x3d: /* FRECPE */
43
gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
44
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
45
unallocated_encoding(s);
46
return;
47
}
48
- need_fpstatus = true;
49
break;
50
case 0x1e: /* FRINT32Z */
51
case 0x1f: /* FRINT64Z */
52
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
53
gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
54
break;
55
case 0x7c: /* URSQRTE */
56
- gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
57
+ gen_helper_rsqrte_u32(tcg_res, tcg_op);
58
break;
59
case 0x1e: /* FRINT32Z */
60
case 0x5e: /* FRINT32X */
61
diff --git a/target/arm/translate.c b/target/arm/translate.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/translate.c
64
+++ b/target/arm/translate.c
65
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
66
break;
67
}
68
case NEON_2RM_VRECPE:
69
- {
70
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
71
- gen_helper_recpe_u32(tmp, tmp, fpstatus);
72
- tcg_temp_free_ptr(fpstatus);
73
+ gen_helper_recpe_u32(tmp, tmp);
74
break;
75
- }
76
case NEON_2RM_VRSQRTE:
77
- {
78
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
79
- gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
80
- tcg_temp_free_ptr(fpstatus);
81
+ gen_helper_rsqrte_u32(tmp, tmp);
82
break;
83
- }
84
case NEON_2RM_VRECPE_F:
85
{
86
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
87
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
88
index XXXXXXX..XXXXXXX 100644
89
--- a/target/arm/vfp_helper.c
90
+++ b/target/arm/vfp_helper.c
91
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
92
return make_float64(val);
93
}
94
95
-uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
96
+uint32_t HELPER(recpe_u32)(uint32_t a)
97
{
98
- /* float_status *s = fpstp; */
99
int input, estimate;
100
101
if ((a & 0x80000000) == 0) {
102
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
103
return deposit32(0, (32 - 9), 9, estimate);
104
}
105
106
-uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
107
+uint32_t HELPER(rsqrte_u32)(uint32_t a)
108
{
109
int estimate;
110
24
111
--
25
--
112
2.20.1
26
2.34.1
113
114
1
Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
1
Set the default NaN pattern explicitly for sh4. Note that sh4
2
decodetree. These are 'pairwise' operations.
2
is one of the only three targets (the others being HPPA and
3
sometimes MIPS) that has snan_bit_is_one set.
3
4
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
7
Message-id: 20241202131347.498124-47-peter.maydell@linaro.org
7
---
8
---
8
target/arm/neon-dp.decode | 9 +++++
9
target/sh4/cpu.c | 2 ++
9
target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
10
1 file changed, 2 insertions(+)
10
target/arm/translate.c | 17 +-------
11
3 files changed, 82 insertions(+), 15 deletions(-)
12
11
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
14
--- a/target/sh4/cpu.c
16
+++ b/target/arm/neon-dp.decode
15
+++ b/target/sh4/cpu.c
17
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_reset_hold(Object *obj, ResetType type)
18
@3same .... ... . . . size:2 .... .... .... . q:1 . . .... \
17
set_flush_to_zero(1, &env->fp_status);
19
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
18
#endif
20
19
set_default_nan_mode(1, &env->fp_status);
21
+@3same_q0 .... ... . . . size:2 .... .... .... . 0 . . .... \
20
+ /* sign bit clear, set all frac bits other than msb */
22
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
21
+ set_float_default_nan_pattern(0b00111111, &env->fp_status);
23
+
24
VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
25
VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
26
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
27
@@ -XXX,XX +XXX,XX @@ VMLS_3s 1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
28
VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
29
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
30
31
+VPMAX_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
32
+VPMAX_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
33
+
34
+VPMIN_S_3s 1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
35
+VPMIN_U_3s 1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
36
+
37
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
38
39
SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
40
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/translate-neon.inc.c
43
+++ b/target/arm/translate-neon.inc.c
44
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
45
DO_3SAME_32_ENV(VQSHL_U, qshl_u)
46
DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
47
DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
48
+
49
+static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
50
+{
51
+ /* Operations handled pairwise 32 bits at a time */
52
+ TCGv_i32 tmp, tmp2, tmp3;
53
+
54
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
55
+ return false;
56
+ }
57
+
58
+ /* UNDEF accesses to D16-D31 if they don't exist. */
59
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
60
+ ((a->vd | a->vn | a->vm) & 0x10)) {
61
+ return false;
62
+ }
63
+
64
+ if (a->size == 3) {
65
+ return false;
66
+ }
67
+
68
+ if (!vfp_access_check(s)) {
69
+ return true;
70
+ }
71
+
72
+ assert(a->q == 0); /* enforced by decode patterns */
73
+
74
+ /*
75
+ * Note that we have to be careful not to clobber the source operands
76
+ * in the "vm == vd" case by storing the result of the first pass too
77
+ * early. Since Q is 0 there are always just two passes, so instead
78
+ * of a complicated loop over each pass we just unroll.
79
+ */
80
+ tmp = neon_load_reg(a->vn, 0);
81
+ tmp2 = neon_load_reg(a->vn, 1);
82
+ fn(tmp, tmp, tmp2);
83
+ tcg_temp_free_i32(tmp2);
84
+
85
+ tmp3 = neon_load_reg(a->vm, 0);
86
+ tmp2 = neon_load_reg(a->vm, 1);
87
+ fn(tmp3, tmp3, tmp2);
88
+ tcg_temp_free_i32(tmp2);
89
+
90
+ neon_store_reg(a->vd, 0, tmp);
91
+ neon_store_reg(a->vd, 1, tmp3);
92
+ return true;
93
+}
94
+
95
+#define DO_3SAME_PAIR(INSN, func) \
96
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
97
+ { \
98
+ static NeonGenTwoOpFn * const fns[] = { \
99
+ gen_helper_neon_##func##8, \
100
+ gen_helper_neon_##func##16, \
101
+ gen_helper_neon_##func##32, \
102
+ }; \
103
+ if (a->size > 2) { \
104
+ return false; \
105
+ } \
106
+ return do_3same_pair(s, a, fns[a->size]); \
107
+ }
108
+
109
+/* 32-bit pairwise ops end up the same as the elementwise versions. */
110
+#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
111
+#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
112
+#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
113
+#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
114
+
115
+DO_3SAME_PAIR(VPMAX_S, pmax_s)
116
+DO_3SAME_PAIR(VPMIN_S, pmin_s)
117
+DO_3SAME_PAIR(VPMAX_U, pmax_u)
118
+DO_3SAME_PAIR(VPMIN_U, pmin_u)
119
diff --git a/target/arm/translate.c b/target/arm/translate.c
120
index XXXXXXX..XXXXXXX 100644
121
--- a/target/arm/translate.c
122
+++ b/target/arm/translate.c
123
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
124
}
125
}
22
}
126
23
127
-/* 32-bit pairwise ops end up the same as the elementwise versions. */
24
static void superh_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
128
-#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
129
-#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
130
-#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
131
-#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
132
-
133
#define GEN_NEON_INTEGER_OP_ENV(name) do { \
134
switch ((size << 1) | u) { \
135
case 0: \
136
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
137
case NEON_3R_VQSHL:
138
case NEON_3R_VRSHL:
139
case NEON_3R_VQRSHL:
140
+ case NEON_3R_VPMAX:
141
+ case NEON_3R_VPMIN:
142
/* Already handled by decodetree */
143
return 1;
144
}
145
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
146
pairwise = 0;
147
switch (op) {
148
case NEON_3R_VPADD_VQRDMLAH:
149
- case NEON_3R_VPMAX:
150
- case NEON_3R_VPMIN:
151
pairwise = 1;
152
break;
153
case NEON_3R_FLOAT_ARITH:
154
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
155
tmp2 = neon_load_reg(rm, pass);
156
}
157
switch (op) {
158
- break;
159
- case NEON_3R_VPMAX:
160
- GEN_NEON_INTEGER_OP(pmax);
161
- break;
162
- case NEON_3R_VPMIN:
163
- GEN_NEON_INTEGER_OP(pmin);
164
- break;
165
case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high. */
166
if (!u) { /* VQDMULH */
167
switch (size) {
168
--
25
--
169
2.20.1
26
2.34.1
170
171
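
On the snan_bit_is_one remark in the sh4 patch above: when that flag is set, the
meaning of the fraction MSB is inverted relative to IEEE 754-2008, so the default
(quiet) NaN has to keep that bit clear, which is why its pattern is 0b00111111 rather
than 0b01000000. A quick single-precision comparison (illustrative values, assuming
the usual encodings):

    snan_bit_is_one = 0 (IEEE 754-2008 style, e.g. Arm):
        quiet NaN       0x7FC00000   (frac MSB set)
        signalling NaN  0x7FA00000   (frac MSB clear, fraction non-zero)
    snan_bit_is_one = 1 (sh4, HPPA, legacy MIPS):
        quiet NaN       0x7FBFFFFF   (frac MSB clear)
        signalling NaN  0x7FC00000   (frac MSB set)
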
1
From: Patrick Williams <patrick@stwcx.xyz>
1
Set the default NaN pattern explicitly for rx.
2
2
3
Sonora Pass is a 2 socket x86 motherboard designed by Facebook
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
and supported by OpenBMC. Strapping configuration was obtained
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
from hardware and i2c configuration is based on dts found at:
5
Message-id: 20241202131347.498124-48-peter.maydell@linaro.org
6
---
7
target/rx/cpu.c | 2 ++
8
1 file changed, 2 insertions(+)
6
9
7
https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts
10
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
8
9
Booted a test image of http://github.com/facebook/openbmc to login
10
prompt.
11
12
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
13
Reviewed-by: Amithash Prasad <amithash@fb.com>
14
Reviewed-by: Cédric Le Goater <clg@kaod.org>
15
[PMM: fixed block comment style nit]
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
18
hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
19
1 file changed, 78 insertions(+)
20
21
diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
22
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
23
--- a/hw/arm/aspeed.c
12
--- a/target/rx/cpu.c
24
+++ b/hw/arm/aspeed.c
13
+++ b/target/rx/cpu.c
25
@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
14
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_reset_hold(Object *obj, ResetType type)
26
SCU_AST2500_HW_STRAP_ACPI_ENABLE | \
15
* then prefer dest over source", which is float_2nan_prop_s_ab.
27
SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
16
*/
28
17
set_float_2nan_prop_rule(float_2nan_prop_x87, &env->fp_status);
29
+/* Sonorapass hardware value: 0xF100D216 */
18
+ /* Default NaN value: sign bit clear, set frac msb */
30
+#define SONORAPASS_BMC_HW_STRAP1 ( \
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
31
+ SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE | \
32
+ SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE | \
33
+ SCU_AST2500_HW_STRAP_UART_DEBUG | \
34
+ SCU_AST2500_HW_STRAP_RESERVED28 | \
35
+ SCU_AST2500_HW_STRAP_DDR4_ENABLE | \
36
+ SCU_HW_STRAP_VGA_CLASS_CODE | \
37
+ SCU_HW_STRAP_LPC_RESET_PIN | \
38
+ SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) | \
39
+ SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) | \
40
+ SCU_HW_STRAP_VGA_BIOS_ROM | \
41
+ SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) | \
42
+ SCU_AST2500_HW_STRAP_RESERVED1)
43
+
44
/* Swift hardware value: 0xF11AD206 */
45
#define SWIFT_BMC_HW_STRAP1 ( \
46
AST2500_HW_STRAP1_DEFAULTS | \
47
@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
48
i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
49
}
20
}
50
21
51
+static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
22
static ObjectClass *rx_cpu_class_by_name(const char *cpu_model)
52
+{
53
+ AspeedSoCState *soc = &bmc->soc;
54
+
55
+ /* bus 2 : */
56
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
57
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
58
+ /* bus 2 : pca9546 @ 0x73 */
59
+
60
+ /* bus 3 : pca9548 @ 0x70 */
61
+
62
+ /* bus 4 : */
63
+ uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
64
+ smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
65
+ eeprom4_54);
66
+ /* PCA9539 @ 0x76, but PCA9552 is compatible */
67
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
68
+ /* PCA9539 @ 0x77, but PCA9552 is compatible */
69
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
70
+
71
+ /* bus 6 : */
72
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
73
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
74
+ /* bus 6 : pca9546 @ 0x73 */
75
+
76
+ /* bus 8 : */
77
+ uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
78
+ smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
79
+ eeprom8_56);
80
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
81
+ i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
82
+ /* bus 8 : adc128d818 @ 0x1d */
83
+ /* bus 8 : adc128d818 @ 0x1f */
84
+
85
+ /*
86
+ * bus 13 : pca9548 @ 0x71
87
+ * - channel 3:
88
+ * - tmp421 @ 0x4c
89
+ * - tmp421 @ 0x4e
90
+ * - tmp421 @ 0x4f
91
+ */
92
+
93
+}
94
+
95
static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
96
{
97
AspeedSoCState *soc = &bmc->soc;
98
@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
99
mc->default_ram_size = 512 * MiB;
100
};
101
102
+static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
103
+{
104
+ MachineClass *mc = MACHINE_CLASS(oc);
105
+ AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
106
+
107
+ mc->desc = "OCP SonoraPass BMC (ARM1176)";
108
+ amc->soc_name = "ast2500-a1";
109
+ amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
110
+ amc->fmc_model = "mx66l1g45g";
111
+ amc->spi_model = "mx66l1g45g";
112
+ amc->num_cs = 2;
113
+ amc->i2c_init = sonorapass_bmc_i2c_init;
114
+ mc->default_ram_size = 512 * MiB;
115
+};
116
+
117
static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
118
{
119
MachineClass *mc = MACHINE_CLASS(oc);
120
@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
121
.name = MACHINE_TYPE_NAME("swift-bmc"),
122
.parent = TYPE_ASPEED_MACHINE,
123
.class_init = aspeed_machine_swift_class_init,
124
+ }, {
125
+ .name = MACHINE_TYPE_NAME("sonorapass-bmc"),
126
+ .parent = TYPE_ASPEED_MACHINE,
127
+ .class_init = aspeed_machine_sonorapass_class_init,
128
}, {
129
.name = MACHINE_TYPE_NAME("witherspoon-bmc"),
130
.parent = TYPE_ASPEED_MACHINE,
131
--
23
--
132
2.20.1
24
2.34.1
133
134
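
For anyone who wants to try the new board, an invocation along these lines should
boot an OpenBMC flash image (illustrative only; the image filename is a placeholder
and the usual aspeed flash-image caveats apply):

    qemu-system-arm -M sonorapass-bmc \
        -drive file=obmc-sonorapass.mtd,format=raw,if=mtd \
        -nographic
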
1
Convert the Neon VHADD insns in the 3-reg-same group to decodetree.
1
Set the default NaN pattern explicitly for s390x.
2
2
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-49-peter.maydell@linaro.org
6
---
6
---
7
target/arm/neon-dp.decode | 2 ++
7
target/s390x/cpu.c | 2 ++
8
target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
8
1 file changed, 2 insertions(+)
9
target/arm/translate.c | 4 +---
10
3 files changed, 27 insertions(+), 3 deletions(-)
11
9
12
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
10
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
13
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/neon-dp.decode
12
--- a/target/s390x/cpu.c
15
+++ b/target/arm/neon-dp.decode
13
+++ b/target/s390x/cpu.c
16
@@ -XXX,XX +XXX,XX @@
14
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_reset_hold(Object *obj, ResetType type)
17
@3same .... ... . . . size:2 .... .... .... . q:1 . . .... \
15
set_float_3nan_prop_rule(float_3nan_prop_s_abc, &env->fpu_status);
18
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
16
set_float_infzeronan_rule(float_infzeronan_dnan_always,
19
17
&env->fpu_status);
20
+VHADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
18
+ /* Default NaN value: sign bit clear, frac msb set */
21
+VHADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
19
+ set_float_default_nan_pattern(0b01000000, &env->fpu_status);
22
VQADD_S_3s 1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
20
/* fall through */
23
VQADD_U_3s 1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
21
case RESET_TYPE_S390_CPU_NORMAL:
24
22
env->psw.mask &= ~PSW_MASK_RI;
25
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate-neon.inc.c
28
+++ b/target/arm/translate-neon.inc.c
29
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
30
DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
31
DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
32
DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
33
+
34
+#define DO_3SAME_32(INSN, FUNC) \
35
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
36
+ uint32_t rn_ofs, uint32_t rm_ofs, \
37
+ uint32_t oprsz, uint32_t maxsz) \
38
+ { \
39
+ static const GVecGen3 ops[4] = { \
40
+ { .fni4 = gen_helper_neon_##FUNC##8 }, \
41
+ { .fni4 = gen_helper_neon_##FUNC##16 }, \
42
+ { .fni4 = gen_helper_neon_##FUNC##32 }, \
43
+ { 0 }, \
44
+ }; \
45
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
46
+ } \
47
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
48
+ { \
49
+ if (a->size > 2) { \
50
+ return false; \
51
+ } \
52
+ return do_3same(s, a, gen_##INSN##_3s); \
53
+ }
54
+
55
+DO_3SAME_32(VHADD_S, hadd_s)
56
+DO_3SAME_32(VHADD_U, hadd_u)
57
diff --git a/target/arm/translate.c b/target/arm/translate.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/translate.c
60
+++ b/target/arm/translate.c
61
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
62
case NEON_3R_VML:
63
case NEON_3R_VSHL:
64
case NEON_3R_SHA:
65
+ case NEON_3R_VHADD:
66
/* Already handled by decodetree */
67
return 1;
68
}
69
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
70
tmp2 = neon_load_reg(rm, pass);
71
}
72
switch (op) {
73
- case NEON_3R_VHADD:
74
- GEN_NEON_INTEGER_OP(hadd);
75
- break;
76
case NEON_3R_VRHADD:
77
GEN_NEON_INTEGER_OP(rhadd);
78
break;
79
--
23
--
80
2.20.1
24
2.34.1
81
82
1
Convert the 64-bit element insns in the 3-reg-same group
1
Set the default NaN pattern explicitly for SPARC, and remove
2
to decodetree. This covers VQSHL, VRSHL and VQRSHL where
2
the ifdef from parts64_default_nan.
3
size==0b11.
4
3
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
6
Message-id: 20241202131347.498124-50-peter.maydell@linaro.org
8
---
7
---
9
target/arm/neon-dp.decode | 13 +++++++++++
8
target/sparc/cpu.c | 2 ++
10
target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
9
fpu/softfloat-specialize.c.inc | 5 +----
11
target/arm/translate.c | 38 ++-------------------------------
10
2 files changed, 3 insertions(+), 4 deletions(-)
12
3 files changed, 39 insertions(+), 36 deletions(-)
13
11
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/neon-dp.decode
14
--- a/target/sparc/cpu.c
17
+++ b/target/arm/neon-dp.decode
15
+++ b/target/sparc/cpu.c
18
@@ -XXX,XX +XXX,XX @@ VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
16
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
19
VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
17
set_float_3nan_prop_rule(float_3nan_prop_s_cba, &env->fp_status);
20
VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
18
/* For inf * 0 + NaN, return the input NaN */
21
19
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
22
+# Insns operating on 64-bit elements (size!=0b11 handled elsewhere)
20
+ /* Default NaN value: sign bit clear, all frac bits set */
23
+# The _rev suffix indicates that Vn and Vm are reversed (as explained
21
+ set_float_default_nan_pattern(0b01111111, &env->fp_status);
24
+# by the comment for the @3same_rev format).
22
25
+@3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \
23
cpu_exec_realizefn(cs, &local_err);
26
+ &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
24
if (local_err != NULL) {
27
+
25
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
28
+VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
29
+VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
30
+VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
31
+VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
32
+VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
33
+VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
34
+
35
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
36
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
37
VMIN_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
38
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
39
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate-neon.inc.c
27
--- a/fpu/softfloat-specialize.c.inc
41
+++ b/target/arm/translate-neon.inc.c
28
+++ b/fpu/softfloat-specialize.c.inc
42
@@ -XXX,XX +XXX,XX @@ static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
29
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
43
30
uint8_t dnan_pattern = status->default_nan_pattern;
44
return true;
31
45
}
32
if (dnan_pattern == 0) {
46
+
33
-#if defined(TARGET_SPARC)
47
+#define DO_3SAME_64(INSN, FUNC) \
34
- /* Sign bit clear, all frac bits set */
48
+ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
35
- dnan_pattern = 0b01111111;
49
+ uint32_t rn_ofs, uint32_t rm_ofs, \
36
-#elif defined(TARGET_HEXAGON)
50
+ uint32_t oprsz, uint32_t maxsz) \
37
+#if defined(TARGET_HEXAGON)
51
+ { \
38
/* Sign bit set, all frac bits set. */
52
+ static const GVecGen3 op = { .fni8 = FUNC }; \
39
dnan_pattern = 0b11111111;
53
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op); \
40
#else
54
+ } \
55
+ DO_3SAME(INSN, gen_##INSN##_3s)
56
+
57
+#define DO_3SAME_64_ENV(INSN, FUNC) \
58
+ static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m) \
59
+ { \
60
+ FUNC(d, cpu_env, n, m); \
61
+ } \
62
+ DO_3SAME_64(INSN, gen_##INSN##_elt)
63
+
64
+DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64)
65
+DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64)
66
+DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
67
+DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
68
+DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
69
+DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
70
diff --git a/target/arm/translate.c b/target/arm/translate.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate.c
73
+++ b/target/arm/translate.c
74
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
75
}
76
77
if (size == 3) {
78
- /* 64-bit element instructions. */
79
- for (pass = 0; pass < (q ? 2 : 1); pass++) {
80
- neon_load_reg64(cpu_V0, rn + pass);
81
- neon_load_reg64(cpu_V1, rm + pass);
82
- switch (op) {
83
- case NEON_3R_VQSHL:
84
- if (u) {
85
- gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
86
- cpu_V1, cpu_V0);
87
- } else {
88
- gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
89
- cpu_V1, cpu_V0);
90
- }
91
- break;
92
- case NEON_3R_VRSHL:
93
- if (u) {
94
- gen_helper_neon_rshl_u64(cpu_V0, cpu_V1, cpu_V0);
95
- } else {
96
- gen_helper_neon_rshl_s64(cpu_V0, cpu_V1, cpu_V0);
97
- }
98
- break;
99
- case NEON_3R_VQRSHL:
100
- if (u) {
101
- gen_helper_neon_qrshl_u64(cpu_V0, cpu_env,
102
- cpu_V1, cpu_V0);
103
- } else {
104
- gen_helper_neon_qrshl_s64(cpu_V0, cpu_env,
105
- cpu_V1, cpu_V0);
106
- }
107
- break;
108
- default:
109
- abort();
110
- }
111
- neon_store_reg64(cpu_V0, rd + pass);
112
- }
113
- return 0;
114
+ /* 64-bit element instructions: handled by decodetree */
115
+ return 1;
116
}
117
pairwise = 0;
118
switch (op) {
119
--
41
--
120
2.20.1
42
2.34.1
121
122
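The 8-bit argument to set_float_default_nan_pattern() encodes the whole
default NaN choice: going by the comments in these patches, bit 7 is the
sign and bits 6:0 are the most significant fraction bits, with the low
pattern bit effectively extended through the remaining fraction bits (so
0b01111111 means all fraction bits set). A minimal sketch of that reading
for a packed float64, illustrative only and not the softfloat
implementation:

    #include <stdint.h>

    /*
     * Sketch only: expand an 8-bit default-NaN pattern to a packed float64,
     * assuming bit 7 = sign, bits 6:0 = frac[51:45], low bit extended down.
     */
    static uint64_t dnan_pattern_to_float64(uint8_t pattern)
    {
        uint64_t sign = (uint64_t)(pattern >> 7) << 63;
        uint64_t exp = 0x7ffULL << 52;                    /* all-ones exponent */
        uint64_t frac = (uint64_t)(pattern & 0x7f) << 45; /* top 7 frac bits */

        if (pattern & 1) {
            frac |= (1ULL << 45) - 1;                     /* extend low bit */
        }
        return sign | exp | frac;
    }

    /*
     * 0b01000000 -> 0x7ff8000000000000 (sign clear, frac msb set)
     * 0b01111111 -> 0x7fffffffffffffff (sign clear, all frac bits set, SPARC)
     * 0b11111111 -> 0xffffffffffffffff (sign set, all frac bits set, Hexagon)
     */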
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for xtensa.
2
2
3
Include 64-bit element size in preparation for SVE2.
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-51-peter.maydell@linaro.org
6
---
7
target/xtensa/cpu.c | 2 ++
8
1 file changed, 2 insertions(+)
4
9
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/helper.h | 10 +++
11
target/arm/translate.h | 5 ++
12
target/arm/translate-a64.c | 8 ++-
13
target/arm/translate.c | 133 ++++++++++++++++++++++++++++++++++++-
14
target/arm/vec_helper.c | 24 +++++++
15
5 files changed, 176 insertions(+), 4 deletions(-)
16
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
18
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
12
--- a/target/xtensa/cpu.c
20
+++ b/target/arm/helper.h
13
+++ b/target/xtensa/cpu.c
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
14
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_reset_hold(Object *obj, ResetType type)
22
DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
15
/* For inf * 0 + NaN, return the input NaN */
23
DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
16
set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->fp_status);
24
17
set_no_signaling_nans(!dfpu, &env->fp_status);
25
+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
18
+ /* Default NaN value: sign bit clear, set frac msb */
26
+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
27
+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
xtensa_use_first_nan(env, !dfpu);
28
+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+
35
#ifdef TARGET_AARCH64
36
#include "helper-a64.h"
37
#include "helper-sve.h"
38
diff --git a/target/arm/translate.h b/target/arm/translate.h
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/translate.h
41
+++ b/target/arm/translate.h
42
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
43
void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
44
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
45
46
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
47
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
48
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
49
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
50
+
51
/*
52
* Forward to the isar_feature_* tests given a DisasContext pointer.
53
*/
54
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
55
index XXXXXXX..XXXXXXX 100644
56
--- a/target/arm/translate-a64.c
57
+++ b/target/arm/translate-a64.c
58
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
59
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
60
}
61
return;
62
+ case 0xe: /* SABD, UABD */
63
+ if (u) {
64
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
65
+ } else {
66
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
67
+ }
68
+ return;
69
case 0x10: /* ADD, SUB */
70
if (u) {
71
gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
72
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
73
genenvfn = fns[size][u];
74
break;
75
}
76
- case 0xe: /* SABD, UABD */
77
case 0xf: /* SABA, UABA */
78
{
79
static NeonGenTwoOpFn * const fns[3][2] = {
80
diff --git a/target/arm/translate.c b/target/arm/translate.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/translate.c
83
+++ b/target/arm/translate.c
84
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
85
rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
86
}
21
}
87
22
88
+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
89
+{
90
+ TCGv_i32 t = tcg_temp_new_i32();
91
+
92
+ tcg_gen_sub_i32(t, a, b);
93
+ tcg_gen_sub_i32(d, b, a);
94
+ tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
95
+ tcg_temp_free_i32(t);
96
+}
97
+
98
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
99
+{
100
+ TCGv_i64 t = tcg_temp_new_i64();
101
+
102
+ tcg_gen_sub_i64(t, a, b);
103
+ tcg_gen_sub_i64(d, b, a);
104
+ tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
105
+ tcg_temp_free_i64(t);
106
+}
107
+
108
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
109
+{
110
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
111
+
112
+ tcg_gen_smin_vec(vece, t, a, b);
113
+ tcg_gen_smax_vec(vece, d, a, b);
114
+ tcg_gen_sub_vec(vece, d, d, t);
115
+ tcg_temp_free_vec(t);
116
+}
117
+
118
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
119
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
120
+{
121
+ static const TCGOpcode vecop_list[] = {
122
+ INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
123
+ };
124
+ static const GVecGen3 ops[4] = {
125
+ { .fniv = gen_sabd_vec,
126
+ .fno = gen_helper_gvec_sabd_b,
127
+ .opt_opc = vecop_list,
128
+ .vece = MO_8 },
129
+ { .fniv = gen_sabd_vec,
130
+ .fno = gen_helper_gvec_sabd_h,
131
+ .opt_opc = vecop_list,
132
+ .vece = MO_16 },
133
+ { .fni4 = gen_sabd_i32,
134
+ .fniv = gen_sabd_vec,
135
+ .fno = gen_helper_gvec_sabd_s,
136
+ .opt_opc = vecop_list,
137
+ .vece = MO_32 },
138
+ { .fni8 = gen_sabd_i64,
139
+ .fniv = gen_sabd_vec,
140
+ .fno = gen_helper_gvec_sabd_d,
141
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
142
+ .opt_opc = vecop_list,
143
+ .vece = MO_64 },
144
+ };
145
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
146
+}
147
+
148
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
149
+{
150
+ TCGv_i32 t = tcg_temp_new_i32();
151
+
152
+ tcg_gen_sub_i32(t, a, b);
153
+ tcg_gen_sub_i32(d, b, a);
154
+ tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
155
+ tcg_temp_free_i32(t);
156
+}
157
+
158
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
159
+{
160
+ TCGv_i64 t = tcg_temp_new_i64();
161
+
162
+ tcg_gen_sub_i64(t, a, b);
163
+ tcg_gen_sub_i64(d, b, a);
164
+ tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
165
+ tcg_temp_free_i64(t);
166
+}
167
+
168
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
169
+{
170
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
171
+
172
+ tcg_gen_umin_vec(vece, t, a, b);
173
+ tcg_gen_umax_vec(vece, d, a, b);
174
+ tcg_gen_sub_vec(vece, d, d, t);
175
+ tcg_temp_free_vec(t);
176
+}
177
+
178
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
179
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
180
+{
181
+ static const TCGOpcode vecop_list[] = {
182
+ INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
183
+ };
184
+ static const GVecGen3 ops[4] = {
185
+ { .fniv = gen_uabd_vec,
186
+ .fno = gen_helper_gvec_uabd_b,
187
+ .opt_opc = vecop_list,
188
+ .vece = MO_8 },
189
+ { .fniv = gen_uabd_vec,
190
+ .fno = gen_helper_gvec_uabd_h,
191
+ .opt_opc = vecop_list,
192
+ .vece = MO_16 },
193
+ { .fni4 = gen_uabd_i32,
194
+ .fniv = gen_uabd_vec,
195
+ .fno = gen_helper_gvec_uabd_s,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_32 },
198
+ { .fni8 = gen_uabd_i64,
199
+ .fniv = gen_uabd_vec,
200
+ .fno = gen_helper_gvec_uabd_d,
201
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
202
+ .opt_opc = vecop_list,
203
+ .vece = MO_64 },
204
+ };
205
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
206
+}
207
+
208
/* Translate a NEON data processing instruction. Return nonzero if the
209
instruction is invalid.
210
We process data in a mixture of 32-bit and 64-bit chunks.
211
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
212
}
213
return 1;
214
215
+ case NEON_3R_VABD:
216
+ if (u) {
217
+ gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
218
+ vec_size, vec_size);
219
+ } else {
220
+ gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
221
+ vec_size, vec_size);
222
+ }
223
+ return 0;
224
+
225
case NEON_3R_VADD_VSUB:
226
case NEON_3R_LOGIC:
227
case NEON_3R_VMAX:
228
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
229
case NEON_3R_VQRSHL:
230
GEN_NEON_INTEGER_OP_ENV(qrshl);
231
break;
232
- case NEON_3R_VABD:
233
- GEN_NEON_INTEGER_OP(abd);
234
- break;
235
case NEON_3R_VABA:
236
GEN_NEON_INTEGER_OP(abd);
237
tcg_temp_free_i32(tmp2);
238
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
239
index XXXXXXX..XXXXXXX 100644
240
--- a/target/arm/vec_helper.c
241
+++ b/target/arm/vec_helper.c
242
@@ -XXX,XX +XXX,XX @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
243
DO_CMP0(gvec_cge0_h, int16_t, >=)
244
245
#undef DO_CMP0
246
+
247
+#define DO_ABD(NAME, TYPE) \
248
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
249
+{ \
250
+ intptr_t i, opr_sz = simd_oprsz(desc); \
251
+ TYPE *d = vd, *n = vn, *m = vm; \
252
+ \
253
+ for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \
254
+ d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \
255
+ } \
256
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
257
+}
258
+
259
+DO_ABD(gvec_sabd_b, int8_t)
260
+DO_ABD(gvec_sabd_h, int16_t)
261
+DO_ABD(gvec_sabd_s, int32_t)
262
+DO_ABD(gvec_sabd_d, int64_t)
263
+
264
+DO_ABD(gvec_uabd_b, uint8_t)
265
+DO_ABD(gvec_uabd_h, uint16_t)
266
+DO_ABD(gvec_uabd_s, uint32_t)
267
+DO_ABD(gvec_uabd_d, uint64_t)
268
+
269
+#undef DO_ABD
270
--
23
--
271
2.20.1
24
2.34.1
272
273
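The DO_ABD expansion above leaves the absolute-difference arithmetic
entirely to the element type: the signed and unsigned helpers use the same
expression and differ only in whether the compare and subtract are done on
signed or unsigned elements. A standalone sketch (not QEMU code) showing
how the same byte pair gives different SABD and UABD results:

    #include <stdint.h>
    #include <stdio.h>

    /* Same per-element expression as the DO_ABD helpers above. */
    #define ABD(TYPE, n, m) \
        ((TYPE)(n) < (TYPE)(m) ? (TYPE)((TYPE)(m) - (TYPE)(n)) \
                               : (TYPE)((TYPE)(n) - (TYPE)(m)))

    int main(void)
    {
        uint8_t n = 0x80, m = 0x7f;

        /* signed: -128 vs 127, the difference wraps to 0xff in the result byte */
        printf("sabd 0x%02x\n", (unsigned)(uint8_t)ABD(int8_t, n, m));
        /* unsigned: 128 vs 127, the difference is 1 */
        printf("uabd 0x%02x\n", (unsigned)(uint8_t)ABD(uint8_t, n, m));
        return 0;
    }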
1
GDB's remote protocol requires M-profile cores to use the feature
1
Set the default NaN pattern explicitly for hexagon.
2
name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
2
Remove the ifdef from parts64_default_nan(); the only
3
feature used for A- and R-profile cores. We weren't doing this, which
3
remaining unconverted targets all use the default case.
4
meant GDB treated our M-profile cores like A-profile ones. This mostly
5
doesn't matter, but for instance means that it doesn't correctly
6
handle backtraces where an M-profile exception frame is involved.
7
4
8
Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
cores. The integer registers have the same offsets as the
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
arm-core.xml, but register 25 is the M-profile XPSR rather than the
7
Message-id: 20241202131347.498124-52-peter.maydell@linaro.org
11
A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
8
---
12
arm_cpu_gdb_write_register() to handle XPSR reads and writes.
9
target/hexagon/cpu.c | 2 ++
10
fpu/softfloat-specialize.c.inc | 5 -----
11
2 files changed, 2 insertions(+), 5 deletions(-)
13
12
14
Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
13
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
17
Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
18
---
19
configure | 4 ++--
20
target/arm/cpu_tcg.c | 1 +
21
target/arm/gdbstub.c | 22 ++++++++++++++++++----
22
gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
23
4 files changed, 48 insertions(+), 6 deletions(-)
24
create mode 100644 gdb-xml/arm-m-profile.xml
25
26
diff --git a/configure b/configure
27
index XXXXXXX..XXXXXXX 100755
28
--- a/configure
29
+++ b/configure
30
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
31
TARGET_SYSTBL_ABI=common,oabi
32
bflt="yes"
33
mttcg="yes"
34
- gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
35
+ gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
36
;;
37
aarch64|aarch64_be)
38
TARGET_ARCH=aarch64
39
TARGET_BASE_ARCH=arm
40
bflt="yes"
41
mttcg="yes"
42
- gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
43
+ gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
44
;;
45
cris)
46
;;
47
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
48
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/cpu_tcg.c
15
--- a/target/hexagon/cpu.c
50
+++ b/target/arm/cpu_tcg.c
16
+++ b/target/hexagon/cpu.c
51
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
17
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_reset_hold(Object *obj, ResetType type)
52
#endif
18
53
19
set_default_nan_mode(1, &env->fp_status);
54
cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
20
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
55
+ cc->gdb_core_xml_file = "arm-m-profile.xml";
21
+ /* Default NaN value: sign bit set, all frac bits set */
22
+ set_float_default_nan_pattern(0b11111111, &env->fp_status);
56
}
23
}
57
24
58
static const ARMCPUInfo arm_tcg_cpus[] = {
25
static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
59
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
26
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
60
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/gdbstub.c
28
--- a/fpu/softfloat-specialize.c.inc
62
+++ b/target/arm/gdbstub.c
29
+++ b/fpu/softfloat-specialize.c.inc
63
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
30
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
31
uint8_t dnan_pattern = status->default_nan_pattern;
32
33
if (dnan_pattern == 0) {
34
-#if defined(TARGET_HEXAGON)
35
- /* Sign bit set, all frac bits set. */
36
- dnan_pattern = 0b11111111;
37
-#else
38
/*
39
* This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
40
* S390, SH4, TriCore, and Xtensa. Our other supported targets
41
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
42
/* sign bit clear, set frac msb */
43
dnan_pattern = 0b01000000;
64
}
44
}
65
return gdb_get_reg32(mem_buf, 0);
45
-#endif
66
case 25:
67
- /* CPSR */
68
- return gdb_get_reg32(mem_buf, cpsr_read(env));
69
+ /* CPSR, or XPSR for M-profile */
70
+ if (arm_feature(env, ARM_FEATURE_M)) {
71
+ return gdb_get_reg32(mem_buf, xpsr_read(env));
72
+ } else {
73
+ return gdb_get_reg32(mem_buf, cpsr_read(env));
74
+ }
75
}
46
}
76
/* Unknown register. */
47
assert(dnan_pattern != 0);
77
return 0;
48
78
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
79
}
80
return 4;
81
case 25:
82
- /* CPSR */
83
- cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
84
+ /* CPSR, or XPSR for M-profile */
85
+ if (arm_feature(env, ARM_FEATURE_M)) {
86
+ /*
87
+ * Don't allow writing to XPSR.Exception as it can cause
88
+ * a transition into or out of handler mode (it's not
89
+ * writeable via the MSR insn so this is a reasonable
90
+ * restriction). Other fields are safe to update.
91
+ */
92
+ xpsr_write(env, tmp, ~XPSR_EXCP);
93
+ } else {
94
+ cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
95
+ }
96
return 4;
97
}
98
/* Unknown register. */
99
diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
100
new file mode 100644
101
index XXXXXXX..XXXXXXX
102
--- /dev/null
103
+++ b/gdb-xml/arm-m-profile.xml
104
@@ -XXX,XX +XXX,XX @@
105
+<?xml version="1.0"?>
106
+<!-- Copyright (C) 2010-2020 Free Software Foundation, Inc.
107
+
108
+ Copying and distribution of this file, with or without modification,
109
+ are permitted in any medium without royalty provided the copyright
110
+ notice and this notice are preserved. -->
111
+
112
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
113
+<feature name="org.gnu.gdb.arm.m-profile">
114
+ <reg name="r0" bitsize="32"/>
115
+ <reg name="r1" bitsize="32"/>
116
+ <reg name="r2" bitsize="32"/>
117
+ <reg name="r3" bitsize="32"/>
118
+ <reg name="r4" bitsize="32"/>
119
+ <reg name="r5" bitsize="32"/>
120
+ <reg name="r6" bitsize="32"/>
121
+ <reg name="r7" bitsize="32"/>
122
+ <reg name="r8" bitsize="32"/>
123
+ <reg name="r9" bitsize="32"/>
124
+ <reg name="r10" bitsize="32"/>
125
+ <reg name="r11" bitsize="32"/>
126
+ <reg name="r12" bitsize="32"/>
127
+ <reg name="sp" bitsize="32" type="data_ptr"/>
128
+ <reg name="lr" bitsize="32"/>
129
+ <reg name="pc" bitsize="32" type="code_ptr"/>
130
+ <reg name="xpsr" bitsize="32" regnum="25"/>
131
+</feature>
132
--
49
--
133
2.20.1
50
2.34.1
134
135
1
Convert the Neon SHA instructions in the 3-reg-same group
1
Set the default NaN pattern explicitly for riscv.
2
to decodetree.
3
2
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
5
Message-id: 20241202131347.498124-53-peter.maydell@linaro.org
7
---
6
---
8
target/arm/neon-dp.decode | 10 +++
7
target/riscv/cpu.c | 2 ++
9
target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
8
1 file changed, 2 insertions(+)
10
target/arm/translate.c | 46 +----------
11
3 files changed, 151 insertions(+), 44 deletions(-)
12
9
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
10
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/neon-dp.decode
12
--- a/target/riscv/cpu.c
16
+++ b/target/arm/neon-dp.decode
13
+++ b/target/riscv/cpu.c
17
@@ -XXX,XX +XXX,XX @@ VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
14
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type)
18
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
15
cs->exception_index = RISCV_EXCP_NONE;
19
16
env->load_res = -1;
20
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
17
set_default_nan_mode(1, &env->fp_status);
21
+
18
+ /* Default NaN value: sign bit clear, frac msb set */
22
+SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
23
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
20
env->vill = true;
24
+SHA256H_3s 1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
21
25
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
22
#ifndef CONFIG_USER_ONLY
26
+SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
27
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
28
+SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
29
+ vm=%vm_dp vn=%vn_dp vd=%vd_dp
30
+
31
VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
32
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate-neon.inc.c
35
+++ b/target/arm/translate-neon.inc.c
36
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
37
38
DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
39
DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
40
+
41
+static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
42
+{
43
+ TCGv_ptr ptr1, ptr2, ptr3;
44
+ TCGv_i32 tmp;
45
+
46
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
47
+ !dc_isar_feature(aa32_sha1, s)) {
48
+ return false;
49
+ }
50
+
51
+ /* UNDEF accesses to D16-D31 if they don't exist. */
52
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
53
+ ((a->vd | a->vn | a->vm) & 0x10)) {
54
+ return false;
55
+ }
56
+
57
+ if ((a->vn | a->vm | a->vd) & 1) {
58
+ return false;
59
+ }
60
+
61
+ if (!vfp_access_check(s)) {
62
+ return true;
63
+ }
64
+
65
+ ptr1 = vfp_reg_ptr(true, a->vd);
66
+ ptr2 = vfp_reg_ptr(true, a->vn);
67
+ ptr3 = vfp_reg_ptr(true, a->vm);
68
+ tmp = tcg_const_i32(a->optype);
69
+ gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
70
+ tcg_temp_free_i32(tmp);
71
+ tcg_temp_free_ptr(ptr1);
72
+ tcg_temp_free_ptr(ptr2);
73
+ tcg_temp_free_ptr(ptr3);
74
+
75
+ return true;
76
+}
77
+
78
+static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
79
+{
80
+ TCGv_ptr ptr1, ptr2, ptr3;
81
+
82
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
83
+ !dc_isar_feature(aa32_sha2, s)) {
84
+ return false;
85
+ }
86
+
87
+ /* UNDEF accesses to D16-D31 if they don't exist. */
88
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
89
+ ((a->vd | a->vn | a->vm) & 0x10)) {
90
+ return false;
91
+ }
92
+
93
+ if ((a->vn | a->vm | a->vd) & 1) {
94
+ return false;
95
+ }
96
+
97
+ if (!vfp_access_check(s)) {
98
+ return true;
99
+ }
100
+
101
+ ptr1 = vfp_reg_ptr(true, a->vd);
102
+ ptr2 = vfp_reg_ptr(true, a->vn);
103
+ ptr3 = vfp_reg_ptr(true, a->vm);
104
+ gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
105
+ tcg_temp_free_ptr(ptr1);
106
+ tcg_temp_free_ptr(ptr2);
107
+ tcg_temp_free_ptr(ptr3);
108
+
109
+ return true;
110
+}
111
+
112
+static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
113
+{
114
+ TCGv_ptr ptr1, ptr2, ptr3;
115
+
116
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
117
+ !dc_isar_feature(aa32_sha2, s)) {
118
+ return false;
119
+ }
120
+
121
+ /* UNDEF accesses to D16-D31 if they don't exist. */
122
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
123
+ ((a->vd | a->vn | a->vm) & 0x10)) {
124
+ return false;
125
+ }
126
+
127
+ if ((a->vn | a->vm | a->vd) & 1) {
128
+ return false;
129
+ }
130
+
131
+ if (!vfp_access_check(s)) {
132
+ return true;
133
+ }
134
+
135
+ ptr1 = vfp_reg_ptr(true, a->vd);
136
+ ptr2 = vfp_reg_ptr(true, a->vn);
137
+ ptr3 = vfp_reg_ptr(true, a->vm);
138
+ gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
139
+ tcg_temp_free_ptr(ptr1);
140
+ tcg_temp_free_ptr(ptr2);
141
+ tcg_temp_free_ptr(ptr3);
142
+
143
+ return true;
144
+}
145
+
146
+static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
147
+{
148
+ TCGv_ptr ptr1, ptr2, ptr3;
149
+
150
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
151
+ !dc_isar_feature(aa32_sha2, s)) {
152
+ return false;
153
+ }
154
+
155
+ /* UNDEF accesses to D16-D31 if they don't exist. */
156
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
157
+ ((a->vd | a->vn | a->vm) & 0x10)) {
158
+ return false;
159
+ }
160
+
161
+ if ((a->vn | a->vm | a->vd) & 1) {
162
+ return false;
163
+ }
164
+
165
+ if (!vfp_access_check(s)) {
166
+ return true;
167
+ }
168
+
169
+ ptr1 = vfp_reg_ptr(true, a->vd);
170
+ ptr2 = vfp_reg_ptr(true, a->vn);
171
+ ptr3 = vfp_reg_ptr(true, a->vm);
172
+ gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
173
+ tcg_temp_free_ptr(ptr1);
174
+ tcg_temp_free_ptr(ptr2);
175
+ tcg_temp_free_ptr(ptr3);
176
+
177
+ return true;
178
+}
179
diff --git a/target/arm/translate.c b/target/arm/translate.c
180
index XXXXXXX..XXXXXXX 100644
181
--- a/target/arm/translate.c
182
+++ b/target/arm/translate.c
183
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
184
int vec_size;
185
uint32_t imm;
186
TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
187
- TCGv_ptr ptr1, ptr2, ptr3;
188
+ TCGv_ptr ptr1, ptr2;
189
TCGv_i64 tmp64;
190
191
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
192
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
193
return 1;
194
}
195
switch (op) {
196
- case NEON_3R_SHA:
197
- /* The SHA-1/SHA-256 3-register instructions require special
198
- * treatment here, as their size field is overloaded as an
199
- * op type selector, and they all consume their input in a
200
- * single pass.
201
- */
202
- if (!q) {
203
- return 1;
204
- }
205
- if (!u) { /* SHA-1 */
206
- if (!dc_isar_feature(aa32_sha1, s)) {
207
- return 1;
208
- }
209
- ptr1 = vfp_reg_ptr(true, rd);
210
- ptr2 = vfp_reg_ptr(true, rn);
211
- ptr3 = vfp_reg_ptr(true, rm);
212
- tmp4 = tcg_const_i32(size);
213
- gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
214
- tcg_temp_free_i32(tmp4);
215
- } else { /* SHA-256 */
216
- if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
217
- return 1;
218
- }
219
- ptr1 = vfp_reg_ptr(true, rd);
220
- ptr2 = vfp_reg_ptr(true, rn);
221
- ptr3 = vfp_reg_ptr(true, rm);
222
- switch (size) {
223
- case 0:
224
- gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
225
- break;
226
- case 1:
227
- gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
228
- break;
229
- case 2:
230
- gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
231
- break;
232
- }
233
- }
234
- tcg_temp_free_ptr(ptr1);
235
- tcg_temp_free_ptr(ptr2);
236
- tcg_temp_free_ptr(ptr3);
237
- return 0;
238
-
239
case NEON_3R_VPADD_VQRDMLAH:
240
if (!u) {
241
break; /* VPADD */
242
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
243
case NEON_3R_VMUL:
244
case NEON_3R_VML:
245
case NEON_3R_VSHL:
246
+ case NEON_3R_SHA:
247
/* Already handled by decodetree */
248
return 1;
249
}
250
--
23
--
251
2.20.1
24
2.34.1
252
253
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Set the default NaN pattern explicitly for tricore.
2
2
3
Must clear the tail for AdvSIMD when SVE is enabled.
4
5
Fixes: ca40a6e6e39
6
Cc: qemu-stable@nongnu.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20241202131347.498124-54-peter.maydell@linaro.org
11
---
6
---
12
target/arm/vec_helper.c | 2 ++
7
target/tricore/helper.c | 2 ++
13
1 file changed, 2 insertions(+)
8
1 file changed, 2 insertions(+)
14
9
15
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
10
diff --git a/target/tricore/helper.c b/target/tricore/helper.c
16
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/vec_helper.c
12
--- a/target/tricore/helper.c
18
+++ b/target/arm/vec_helper.c
13
+++ b/target/tricore/helper.c
19
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
14
@@ -XXX,XX +XXX,XX @@ void fpu_set_state(CPUTriCoreState *env)
20
d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
15
set_flush_to_zero(1, &env->fp_status);
21
} \
16
set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
22
} \
17
set_default_nan_mode(1, &env->fp_status);
23
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
18
+ /* Default NaN pattern: sign bit clear, frac msb set */
19
+ set_float_default_nan_pattern(0b01000000, &env->fp_status);
24
}
20
}
25
21
26
DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
22
uint32_t psw_read(CPUTriCoreState *env)
27
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
28
mm, a[i + j], 0, stat); \
29
} \
30
} \
31
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
32
}
33
34
DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
35
--
23
--
36
2.20.1
24
2.34.1
37
38
1
Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
1
Now that all our targets have been converted to explicitly specify
2
to decodetree. These don't use do_3same() because they want to
2
their pattern for the default NaN value, we can remove the remaining
3
operate on VFP double registers, whose offsets are different from the
3
fallback code in parts64_default_nan().
4
neon_reg_offset() calculations that do_3same does.
5
4
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
7
Message-id: 20241202131347.498124-55-peter.maydell@linaro.org
9
---
8
---
10
target/arm/neon-dp.decode | 3 +++
9
fpu/softfloat-specialize.c.inc | 14 --------------
11
target/arm/translate-neon.inc.c | 15 +++++++++++++++
10
1 file changed, 14 deletions(-)
12
target/arm/translate.c | 14 ++------------
13
3 files changed, 20 insertions(+), 12 deletions(-)
14
11
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
12
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/neon-dp.decode
14
--- a/fpu/softfloat-specialize.c.inc
18
+++ b/target/arm/neon-dp.decode
15
+++ b/fpu/softfloat-specialize.c.inc
19
@@ -XXX,XX +XXX,XX @@ VMLS_3s 1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
16
@@ -XXX,XX +XXX,XX @@ static void parts64_default_nan(FloatParts64 *p, float_status *status)
20
17
uint64_t frac;
21
VMUL_3s 1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
18
uint8_t dnan_pattern = status->default_nan_pattern;
22
VMUL_p_3s 1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
19
23
+
20
- if (dnan_pattern == 0) {
24
+VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
21
- /*
25
+VQRDMLSH_3s 1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
22
- * This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
26
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
23
- * S390, SH4, TriCore, and Xtensa. Our other supported targets
27
index XXXXXXX..XXXXXXX 100644
24
- * do not have floating-point.
28
--- a/target/arm/translate-neon.inc.c
25
- */
29
+++ b/target/arm/translate-neon.inc.c
26
- if (snan_bit_is_one(status)) {
30
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
27
- /* sign bit clear, set all frac bits other than msb */
31
}
28
- dnan_pattern = 0b00111111;
32
return do_3same(s, a, gen_VMUL_p_3s);
29
- } else {
33
}
30
- /* sign bit clear, set frac msb */
34
+
31
- dnan_pattern = 0b01000000;
35
+#define DO_VQRDMLAH(INSN, FUNC) \
32
- }
36
+ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
33
- }
37
+ { \
34
assert(dnan_pattern != 0);
38
+ if (!dc_isar_feature(aa32_rdm, s)) { \
35
39
+ return false; \
36
sign = dnan_pattern >> 7;
40
+ } \
41
+ if (a->size != 1 && a->size != 2) { \
42
+ return false; \
43
+ } \
44
+ return do_3same(s, a, FUNC); \
45
+ }
46
+
47
+DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
48
+DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
49
diff --git a/target/arm/translate.c b/target/arm/translate.c
50
index XXXXXXX..XXXXXXX 100644
51
--- a/target/arm/translate.c
52
+++ b/target/arm/translate.c
53
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
54
if (!u) {
55
break; /* VPADD */
56
}
57
- /* VQRDMLAH */
58
- if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
59
- gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
60
- vec_size, vec_size);
61
- return 0;
62
- }
63
+ /* VQRDMLAH : handled by decodetree */
64
return 1;
65
66
case NEON_3R_VFM_VQRDMLSH:
67
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
68
}
69
break;
70
}
71
- /* VQRDMLSH */
72
- if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
73
- gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
74
- vec_size, vec_size);
75
- return 0;
76
- }
77
+ /* VQRDMLSH : handled by decodetree */
78
return 1;
79
80
case NEON_3R_VABD:
81
--
37
--
82
2.20.1
38
2.34.1
83
84
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Provide a functional interface for the vector expansion.
3
Inline pickNaNMulAdd into its only caller. This makes
4
This fits better with the existing set of helpers that
4
one assert redundant with the immediately preceding IF.
5
we provide for other operations.
6
5
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-3-richard.henderson@linaro.org
9
[PMM: keep comment from old code in new location]
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
11
---
12
target/arm/translate.h | 5 ++++
12
fpu/softfloat-parts.c.inc | 41 +++++++++++++++++++++++++-
13
target/arm/translate-a64.c | 34 ++----------------------
13
fpu/softfloat-specialize.c.inc | 54 ----------------------------------
14
target/arm/translate.c | 54 +++++++++++++++++++-------------------
14
2 files changed, 40 insertions(+), 55 deletions(-)
15
3 files changed, 34 insertions(+), 59 deletions(-)
16
15
17
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
18
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/translate.h
18
--- a/fpu/softfloat-parts.c.inc
20
+++ b/target/arm/translate.h
19
+++ b/fpu/softfloat-parts.c.inc
21
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
22
void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
21
}
23
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
22
24
23
if (s->default_nan_mode) {
25
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
+ /*
26
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
+ * We guarantee not to require the target to tell us how to
27
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
+ * pick a NaN if we're always returning the default NaN.
28
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
27
+ * But if we're not in default-NaN mode then the target must
28
+ * specify.
29
+ */
30
which = 3;
31
+ } else if (infzero) {
32
+ /*
33
+ * Inf * 0 + NaN -- some implementations return the
34
+ * default NaN here, and some return the input NaN.
35
+ */
36
+ switch (s->float_infzeronan_rule) {
37
+ case float_infzeronan_dnan_never:
38
+ which = 2;
39
+ break;
40
+ case float_infzeronan_dnan_always:
41
+ which = 3;
42
+ break;
43
+ case float_infzeronan_dnan_if_qnan:
44
+ which = is_qnan(c->cls) ? 3 : 2;
45
+ break;
46
+ default:
47
+ g_assert_not_reached();
48
+ }
49
} else {
50
- which = pickNaNMulAdd(a->cls, b->cls, c->cls, infzero, have_snan, s);
51
+ FloatClass cls[3] = { a->cls, b->cls, c->cls };
52
+ Float3NaNPropRule rule = s->float_3nan_prop_rule;
29
+
53
+
30
/*
54
+ assert(rule != float_3nan_prop_none);
31
* Forward to the isar_feature_* tests given a DisasContext pointer.
55
+ if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
32
*/
56
+ /* We have at least one SNaN input and should prefer it */
33
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
57
+ do {
58
+ which = rule & R_3NAN_1ST_MASK;
59
+ rule >>= R_3NAN_1ST_LENGTH;
60
+ } while (!is_snan(cls[which]));
61
+ } else {
62
+ do {
63
+ which = rule & R_3NAN_1ST_MASK;
64
+ rule >>= R_3NAN_1ST_LENGTH;
65
+ } while (!is_nan(cls[which]));
66
+ }
67
}
68
69
if (which == 3) {
70
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
34
index XXXXXXX..XXXXXXX 100644
71
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/translate-a64.c
72
--- a/fpu/softfloat-specialize.c.inc
36
+++ b/target/arm/translate-a64.c
73
+++ b/fpu/softfloat-specialize.c.inc
37
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
74
@@ -XXX,XX +XXX,XX @@ static int pickNaN(FloatClass a_cls, FloatClass b_cls,
38
is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
75
}
39
}
76
}
40
77
41
-/* Expand a 3-operand + env pointer operation using
78
-/*----------------------------------------------------------------------------
42
- * an out-of-line helper.
79
-| Select which NaN to propagate for a three-input operation.
43
- */
80
-| For the moment we assume that no CPU needs the 'larger significand'
44
-static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
81
-| information.
45
- int rn, int rm, gen_helper_gvec_3_ptr *fn)
82
-| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN
83
-*----------------------------------------------------------------------------*/
84
-static int pickNaNMulAdd(FloatClass a_cls, FloatClass b_cls, FloatClass c_cls,
85
- bool infzero, bool have_snan, float_status *status)
46
-{
86
-{
47
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
87
- FloatClass cls[3] = { a_cls, b_cls, c_cls };
48
- vec_full_reg_offset(s, rn),
88
- Float3NaNPropRule rule = status->float_3nan_prop_rule;
49
- vec_full_reg_offset(s, rm), cpu_env,
89
- int which;
50
- is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
51
-}
52
-
90
-
53
/* Expand a 3-operand + fpstatus pointer + simd data value operation using
91
- /*
54
* an out-of-line helper.
92
- * We guarantee not to require the target to tell us how to
55
*/
93
- * pick a NaN if we're always returning the default NaN.
56
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
94
- * But if we're not in default-NaN mode then the target must
57
95
- * specify.
58
switch (opcode) {
96
- */
59
case 0x0: /* SQRDMLAH (vector) */
97
- assert(!status->default_nan_mode);
60
- switch (size) {
98
-
61
- case 1:
99
- if (infzero) {
62
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
100
- /*
63
- break;
101
- * Inf * 0 + NaN -- some implementations return the default NaN here,
64
- case 2:
102
- * and some return the input NaN.
65
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
103
- */
66
- break;
104
- switch (status->float_infzeronan_rule) {
105
- case float_infzeronan_dnan_never:
106
- return 2;
107
- case float_infzeronan_dnan_always:
108
- return 3;
109
- case float_infzeronan_dnan_if_qnan:
110
- return is_qnan(c_cls) ? 3 : 2;
67
- default:
111
- default:
68
- g_assert_not_reached();
112
- g_assert_not_reached();
69
- }
113
- }
70
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
114
- }
71
return;
72
73
case 0x1: /* SQRDMLSH (vector) */
74
- switch (size) {
75
- case 1:
76
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
77
- break;
78
- case 2:
79
- gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
80
- break;
81
- default:
82
- g_assert_not_reached();
83
- }
84
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
85
return;
86
87
case 0x2: /* SDOT / UDOT */
88
diff --git a/target/arm/translate.c b/target/arm/translate.c
89
index XXXXXXX..XXXXXXX 100644
90
--- a/target/arm/translate.c
91
+++ b/target/arm/translate.c
92
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
93
[NEON_2RM_VCVT_UF] = 0x4,
94
};
95
96
-
115
-
97
-/* Expand v8.1 simd helper. */
116
- assert(rule != float_3nan_prop_none);
98
-static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
117
- if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
99
- int q, int rd, int rn, int rm)
118
- /* We have at least one SNaN input and should prefer it */
100
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
119
- do {
101
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
120
- which = rule & R_3NAN_1ST_MASK;
102
{
121
- rule >>= R_3NAN_1ST_LENGTH;
103
- if (dc_isar_feature(aa32_rdm, s)) {
122
- } while (!is_snan(cls[which]));
104
- int opr_sz = (1 + q) * 8;
123
- } else {
105
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
124
- do {
106
- vfp_reg_offset(1, rn),
125
- which = rule & R_3NAN_1ST_MASK;
107
- vfp_reg_offset(1, rm), cpu_env,
126
- rule >>= R_3NAN_1ST_LENGTH;
108
- opr_sz, opr_sz, 0, fn);
127
- } while (!is_nan(cls[which]));
109
- return 0;
110
- }
128
- }
111
- return 1;
129
- return which;
112
+ static gen_helper_gvec_3_ptr * const fns[2] = {
130
-}
113
+ gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
131
-
114
+ };
132
/*----------------------------------------------------------------------------
115
+ tcg_debug_assert(vece >= 1 && vece <= 2);
133
| Returns 1 if the double-precision floating-point value `a' is a quiet
116
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
134
| NaN; otherwise returns 0.
117
+ opr_sz, max_sz, 0, fns[vece - 1]);
118
+}
119
+
120
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
121
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
122
+{
123
+ static gen_helper_gvec_3_ptr * const fns[2] = {
124
+ gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
125
+ };
126
+ tcg_debug_assert(vece >= 1 && vece <= 2);
127
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
128
+ opr_sz, max_sz, 0, fns[vece - 1]);
129
}
130
131
#define GEN_CMP0(NAME, COND) \
132
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
133
break; /* VPADD */
134
}
135
/* VQRDMLAH */
136
- switch (size) {
137
- case 1:
138
- return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
139
- q, rd, rn, rm);
140
- case 2:
141
- return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
142
- q, rd, rn, rm);
143
+ if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
144
+ gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
145
+ vec_size, vec_size);
146
+ return 0;
147
}
148
return 1;
149
150
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
151
break;
152
}
153
/* VQRDMLSH */
154
- switch (size) {
155
- case 1:
156
- return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
157
- q, rd, rn, rm);
158
- case 2:
159
- return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
160
- q, rd, rn, rm);
161
+ if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
162
+ gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
163
+ vec_size, vec_size);
164
+ return 0;
165
}
166
return 1;
167
168
--
135
--
169
2.20.1
136
2.34.1
170
137
171
138
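For the Inf * 0 + NaN path that the inlined code above handles, the three
float_infzeronan_rule settings boil down to: never use the default NaN
(return the input NaN c), always use it, or use it only when c is a quiet
NaN. A compact restatement of that switch, as a sketch rather than the
softfloat source:

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Sketch of the Inf * 0 + NaN outcomes; names shortened from the
     * float_infzeronan_dnan_* rules used above. 'c' is the addend NaN.
     */
    typedef enum { DNAN_NEVER, DNAN_ALWAYS, DNAN_IF_QNAN } InfZeroRule;

    static const char *infzeronan_result(InfZeroRule rule, bool c_is_qnan)
    {
        switch (rule) {
        case DNAN_NEVER:
            return "input NaN c";
        case DNAN_ALWAYS:
            return "default NaN";
        case DNAN_IF_QNAN:
            return c_is_qnan ? "default NaN" : "input NaN c";
        }
        return NULL; /* not reached */
    }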
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Provide a functional interface for the vector expansion.
3
Remove "3" as a special case for which and simply
4
This fits better with the existing set of helpers that
4
branch to return the desired value.
5
we provide for other operations.
6
5
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-4-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/translate.h | 13 +-
11
fpu/softfloat-parts.c.inc | 20 ++++++++++----------
13
target/arm/translate-a64.c | 22 ++-
12
1 file changed, 10 insertions(+), 10 deletions(-)
14
target/arm/translate-neon.inc.c | 19 +--
15
target/arm/translate.c | 228 +++++++++++++++++---------------
16
4 files changed, 147 insertions(+), 135 deletions(-)
17
13
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate.h
16
--- a/fpu/softfloat-parts.c.inc
21
+++ b/target/arm/translate.h
17
+++ b/fpu/softfloat-parts.c.inc
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
23
void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
19
* But if we're not in default-NaN mode then the target must
24
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
20
* specify.
25
21
*/
26
-extern const GVecGen4 uqadd_op[4];
22
- which = 3;
27
-extern const GVecGen4 sqadd_op[4];
23
+ goto default_nan;
28
-extern const GVecGen4 uqsub_op[4];
24
} else if (infzero) {
29
-extern const GVecGen4 sqsub_op[4];
25
/*
30
void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
26
* Inf * 0 + NaN -- some implementations return the
31
void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
27
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
32
void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
28
*/
33
void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
29
switch (s->float_infzeronan_rule) {
34
void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
30
case float_infzeronan_dnan_never:
35
31
- which = 2;
36
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
32
break;
37
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
33
case float_infzeronan_dnan_always:
38
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
34
- which = 3;
39
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
35
- break;
40
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
36
+ goto default_nan;
41
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
37
case float_infzeronan_dnan_if_qnan:
42
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
38
- which = is_qnan(c->cls) ? 3 : 2;
43
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
39
+ if (is_qnan(c->cls)) {
40
+ goto default_nan;
41
+ }
42
break;
43
default:
44
g_assert_not_reached();
45
}
46
+ which = 2;
47
} else {
48
FloatClass cls[3] = { a->cls, b->cls, c->cls };
49
Float3NaNPropRule rule = s->float_3nan_prop_rule;
50
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
51
}
52
}
53
54
- if (which == 3) {
55
- parts_default_nan(a, s);
56
- return a;
57
- }
58
-
59
switch (which) {
60
case 0:
61
break;
62
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
63
parts_silence_nan(a, s);
64
}
65
return a;
44
+
66
+
45
void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
67
+ default_nan:
46
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
68
+ parts_default_nan(a, s);
47
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
69
+ return a;
48
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/translate-a64.c
51
+++ b/target/arm/translate-a64.c
52
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
53
54
switch (opcode) {
55
case 0x01: /* SQADD, UQADD */
56
- tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
57
- offsetof(CPUARMState, vfp.qc),
58
- vec_full_reg_offset(s, rn),
59
- vec_full_reg_offset(s, rm),
60
- is_q ? 16 : 8, vec_full_reg_size(s),
61
- (u ? uqadd_op : sqadd_op) + size);
62
+ if (u) {
63
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
64
+ } else {
65
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
66
+ }
67
return;
68
case 0x05: /* SQSUB, UQSUB */
69
- tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
70
- offsetof(CPUARMState, vfp.qc),
71
- vec_full_reg_offset(s, rn),
72
- vec_full_reg_offset(s, rm),
73
- is_q ? 16 : 8, vec_full_reg_size(s),
74
- (u ? uqsub_op : sqsub_op) + size);
75
+ if (u) {
76
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
77
+ } else {
78
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
79
+ }
80
return;
81
case 0x08: /* SSHL, USHL */
82
if (u) {
83
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
84
index XXXXXXX..XXXXXXX 100644
85
--- a/target/arm/translate-neon.inc.c
86
+++ b/target/arm/translate-neon.inc.c
87
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
88
DO_3SAME(VEOR, tcg_gen_gvec_xor)
89
DO_3SAME(VSHL_S, gen_gvec_sshl)
90
DO_3SAME(VSHL_U, gen_gvec_ushl)
91
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
92
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
93
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
94
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
95
96
/* These insns are all gvec_bitsel but with the inputs in various orders. */
97
#define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
98
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
99
DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
100
DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
101
102
-#define DO_3SAME_GVEC4(INSN, OPARRAY) \
103
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
104
- uint32_t rn_ofs, uint32_t rm_ofs, \
105
- uint32_t oprsz, uint32_t maxsz) \
106
- { \
107
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), \
108
- rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]); \
109
- } \
110
- DO_3SAME(INSN, gen_##INSN##_3s)
111
-
112
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
113
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
114
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
115
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
116
-
117
static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
118
uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
119
{
120
diff --git a/target/arm/translate.c b/target/arm/translate.c
121
index XXXXXXX..XXXXXXX 100644
122
--- a/target/arm/translate.c
123
+++ b/target/arm/translate.c
124
@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
125
tcg_temp_free_vec(x);
126
}
70
}
127
71
128
-static const TCGOpcode vecop_list_uqadd[] = {
72
/*
129
- INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
130
-};
131
-
132
-const GVecGen4 uqadd_op[4] = {
133
- { .fniv = gen_uqadd_vec,
134
- .fno = gen_helper_gvec_uqadd_b,
135
- .write_aofs = true,
136
- .opt_opc = vecop_list_uqadd,
137
- .vece = MO_8 },
138
- { .fniv = gen_uqadd_vec,
139
- .fno = gen_helper_gvec_uqadd_h,
140
- .write_aofs = true,
141
- .opt_opc = vecop_list_uqadd,
142
- .vece = MO_16 },
143
- { .fniv = gen_uqadd_vec,
144
- .fno = gen_helper_gvec_uqadd_s,
145
- .write_aofs = true,
146
- .opt_opc = vecop_list_uqadd,
147
- .vece = MO_32 },
148
- { .fniv = gen_uqadd_vec,
149
- .fno = gen_helper_gvec_uqadd_d,
150
- .write_aofs = true,
151
- .opt_opc = vecop_list_uqadd,
152
- .vece = MO_64 },
153
-};
154
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
155
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
156
+{
157
+ static const TCGOpcode vecop_list[] = {
158
+ INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
159
+ };
160
+ static const GVecGen4 ops[4] = {
161
+ { .fniv = gen_uqadd_vec,
162
+ .fno = gen_helper_gvec_uqadd_b,
163
+ .write_aofs = true,
164
+ .opt_opc = vecop_list,
165
+ .vece = MO_8 },
166
+ { .fniv = gen_uqadd_vec,
167
+ .fno = gen_helper_gvec_uqadd_h,
168
+ .write_aofs = true,
169
+ .opt_opc = vecop_list,
170
+ .vece = MO_16 },
171
+ { .fniv = gen_uqadd_vec,
172
+ .fno = gen_helper_gvec_uqadd_s,
173
+ .write_aofs = true,
174
+ .opt_opc = vecop_list,
175
+ .vece = MO_32 },
176
+ { .fniv = gen_uqadd_vec,
177
+ .fno = gen_helper_gvec_uqadd_d,
178
+ .write_aofs = true,
179
+ .opt_opc = vecop_list,
180
+ .vece = MO_64 },
181
+ };
182
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
183
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
184
+}
185
186
static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
187
TCGv_vec a, TCGv_vec b)
188
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
189
tcg_temp_free_vec(x);
190
}
191
192
-static const TCGOpcode vecop_list_sqadd[] = {
193
- INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
194
-};
195
-
196
-const GVecGen4 sqadd_op[4] = {
197
- { .fniv = gen_sqadd_vec,
198
- .fno = gen_helper_gvec_sqadd_b,
199
- .opt_opc = vecop_list_sqadd,
200
- .write_aofs = true,
201
- .vece = MO_8 },
202
- { .fniv = gen_sqadd_vec,
203
- .fno = gen_helper_gvec_sqadd_h,
204
- .opt_opc = vecop_list_sqadd,
205
- .write_aofs = true,
206
- .vece = MO_16 },
207
- { .fniv = gen_sqadd_vec,
208
- .fno = gen_helper_gvec_sqadd_s,
209
- .opt_opc = vecop_list_sqadd,
210
- .write_aofs = true,
211
- .vece = MO_32 },
212
- { .fniv = gen_sqadd_vec,
213
- .fno = gen_helper_gvec_sqadd_d,
214
- .opt_opc = vecop_list_sqadd,
215
- .write_aofs = true,
216
- .vece = MO_64 },
217
-};
218
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
219
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
220
+{
221
+ static const TCGOpcode vecop_list[] = {
222
+ INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
223
+ };
224
+ static const GVecGen4 ops[4] = {
225
+ { .fniv = gen_sqadd_vec,
226
+ .fno = gen_helper_gvec_sqadd_b,
227
+ .opt_opc = vecop_list,
228
+ .write_aofs = true,
229
+ .vece = MO_8 },
230
+ { .fniv = gen_sqadd_vec,
231
+ .fno = gen_helper_gvec_sqadd_h,
232
+ .opt_opc = vecop_list,
233
+ .write_aofs = true,
234
+ .vece = MO_16 },
235
+ { .fniv = gen_sqadd_vec,
236
+ .fno = gen_helper_gvec_sqadd_s,
237
+ .opt_opc = vecop_list,
238
+ .write_aofs = true,
239
+ .vece = MO_32 },
240
+ { .fniv = gen_sqadd_vec,
241
+ .fno = gen_helper_gvec_sqadd_d,
242
+ .opt_opc = vecop_list,
243
+ .write_aofs = true,
244
+ .vece = MO_64 },
245
+ };
246
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
247
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
248
+}
249
250
static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
251
TCGv_vec a, TCGv_vec b)
252
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
253
tcg_temp_free_vec(x);
254
}
255
256
-static const TCGOpcode vecop_list_uqsub[] = {
257
- INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
258
-};
259
-
260
-const GVecGen4 uqsub_op[4] = {
261
- { .fniv = gen_uqsub_vec,
262
- .fno = gen_helper_gvec_uqsub_b,
263
- .opt_opc = vecop_list_uqsub,
264
- .write_aofs = true,
265
- .vece = MO_8 },
266
- { .fniv = gen_uqsub_vec,
267
- .fno = gen_helper_gvec_uqsub_h,
268
- .opt_opc = vecop_list_uqsub,
269
- .write_aofs = true,
270
- .vece = MO_16 },
271
- { .fniv = gen_uqsub_vec,
272
- .fno = gen_helper_gvec_uqsub_s,
273
- .opt_opc = vecop_list_uqsub,
274
- .write_aofs = true,
275
- .vece = MO_32 },
276
- { .fniv = gen_uqsub_vec,
277
- .fno = gen_helper_gvec_uqsub_d,
278
- .opt_opc = vecop_list_uqsub,
279
- .write_aofs = true,
280
- .vece = MO_64 },
281
-};
282
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
283
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
284
+{
285
+ static const TCGOpcode vecop_list[] = {
286
+ INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
287
+ };
288
+ static const GVecGen4 ops[4] = {
289
+ { .fniv = gen_uqsub_vec,
290
+ .fno = gen_helper_gvec_uqsub_b,
291
+ .opt_opc = vecop_list,
292
+ .write_aofs = true,
293
+ .vece = MO_8 },
294
+ { .fniv = gen_uqsub_vec,
295
+ .fno = gen_helper_gvec_uqsub_h,
296
+ .opt_opc = vecop_list,
297
+ .write_aofs = true,
298
+ .vece = MO_16 },
299
+ { .fniv = gen_uqsub_vec,
300
+ .fno = gen_helper_gvec_uqsub_s,
301
+ .opt_opc = vecop_list,
302
+ .write_aofs = true,
303
+ .vece = MO_32 },
304
+ { .fniv = gen_uqsub_vec,
305
+ .fno = gen_helper_gvec_uqsub_d,
306
+ .opt_opc = vecop_list,
307
+ .write_aofs = true,
308
+ .vece = MO_64 },
309
+ };
310
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
311
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
312
+}
313
314
static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
315
TCGv_vec a, TCGv_vec b)
316
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
317
tcg_temp_free_vec(x);
318
}
319
320
-static const TCGOpcode vecop_list_sqsub[] = {
321
- INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
322
-};
323
-
324
-const GVecGen4 sqsub_op[4] = {
325
- { .fniv = gen_sqsub_vec,
326
- .fno = gen_helper_gvec_sqsub_b,
327
- .opt_opc = vecop_list_sqsub,
328
- .write_aofs = true,
329
- .vece = MO_8 },
330
- { .fniv = gen_sqsub_vec,
331
- .fno = gen_helper_gvec_sqsub_h,
332
- .opt_opc = vecop_list_sqsub,
333
- .write_aofs = true,
334
- .vece = MO_16 },
335
- { .fniv = gen_sqsub_vec,
336
- .fno = gen_helper_gvec_sqsub_s,
337
- .opt_opc = vecop_list_sqsub,
338
- .write_aofs = true,
339
- .vece = MO_32 },
340
- { .fniv = gen_sqsub_vec,
341
- .fno = gen_helper_gvec_sqsub_d,
342
- .opt_opc = vecop_list_sqsub,
343
- .write_aofs = true,
344
- .vece = MO_64 },
345
-};
346
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
347
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
348
+{
349
+ static const TCGOpcode vecop_list[] = {
350
+ INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
351
+ };
352
+ static const GVecGen4 ops[4] = {
353
+ { .fniv = gen_sqsub_vec,
354
+ .fno = gen_helper_gvec_sqsub_b,
355
+ .opt_opc = vecop_list,
356
+ .write_aofs = true,
357
+ .vece = MO_8 },
358
+ { .fniv = gen_sqsub_vec,
359
+ .fno = gen_helper_gvec_sqsub_h,
360
+ .opt_opc = vecop_list,
361
+ .write_aofs = true,
362
+ .vece = MO_16 },
363
+ { .fniv = gen_sqsub_vec,
364
+ .fno = gen_helper_gvec_sqsub_s,
365
+ .opt_opc = vecop_list,
366
+ .write_aofs = true,
367
+ .vece = MO_32 },
368
+ { .fniv = gen_sqsub_vec,
369
+ .fno = gen_helper_gvec_sqsub_d,
370
+ .opt_opc = vecop_list,
371
+ .write_aofs = true,
372
+ .vece = MO_64 },
373
+ };
374
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
375
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
376
+}
377
378
/* Translate a NEON data processing instruction. Return nonzero if the
379
instruction is invalid.
380
--
73
--
381
2.20.1
74
2.34.1
382
75
383
76
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Assign the pointer return value to 'a' directly,
rather than going through an intermediary index.
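
The shape of that functional interface can be sketched in isolation as below; OpDesc and expand_with() are invented stand-ins for the real GVecGen3 descriptor and tcg_gen_gvec_3() call, so treat this as an outline of the pattern rather than the actual code.

/* Sketch: keep the per-element-size descriptor table file-local and
 * export a function instead of the const array itself. */
#include <stdio.h>

typedef struct {
    const char *name;   /* stand-in for the .fni4/.fniv/.fno expanders */
    unsigned vece;      /* element size log2: 0=8b, 1=16b, 2=32b, 3=64b */
} OpDesc;

static void expand_with(const OpDesc *op)
{
    printf("expand with %s (vece=%u)\n", op->name, op->vece);
}

static void gen_gvec_mla_sketch(unsigned vece)
{
    static const OpDesc ops[4] = {
        { "mla_b", 0 }, { "mla_h", 1 }, { "mla_s", 2 }, { "mla_d", 3 },
    };
    expand_with(&ops[vece]);
}

int main(void)
{
    gen_gvec_mla_sketch(2);   /* callers use the function, not mla_op[] */
    return 0;
}

Callers then go through a single entry point rather than indexing an exported const table.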
6
5
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-5-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/translate.h | 7 +-
11
fpu/softfloat-parts.c.inc | 32 ++++++++++----------------------
13
target/arm/translate-a64.c | 4 +-
12
1 file changed, 10 insertions(+), 22 deletions(-)
14
target/arm/translate-neon.inc.c | 16 +----
15
target/arm/translate.c | 117 +++++++++++++++++---------------
16
4 files changed, 71 insertions(+), 73 deletions(-)
17
13
18
diff --git a/target/arm/translate.h b/target/arm/translate.h
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/translate.h
16
--- a/fpu/softfloat-parts.c.inc
21
+++ b/target/arm/translate.h
17
+++ b/fpu/softfloat-parts.c.inc
22
@@ -XXX,XX +XXX,XX @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
23
void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
19
FloatPartsN *c, float_status *s,
24
uint32_t opr_sz, uint32_t max_sz);
20
int ab_mask, int abc_mask)
25
21
{
26
-extern const GVecGen3 mla_op[4];
22
- int which;
27
-extern const GVecGen3 mls_op[4];
23
bool infzero = (ab_mask == float_cmask_infzero);
28
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
24
bool have_snan = (abc_mask & float_cmask_snan);
29
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
25
+ FloatPartsN *ret;
30
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
31
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
27
if (unlikely(have_snan)) {
32
+
28
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
33
extern const GVecGen3 cmtst_op[4];
29
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
34
extern const GVecGen3 sshl_op[4];
30
default:
35
extern const GVecGen3 ushl_op[4];
31
g_assert_not_reached();
36
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
32
}
37
index XXXXXXX..XXXXXXX 100644
33
- which = 2;
38
--- a/target/arm/translate-a64.c
34
+ ret = c;
39
+++ b/target/arm/translate-a64.c
35
} else {
40
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
36
- FloatClass cls[3] = { a->cls, b->cls, c->cls };
41
return;
37
+ FloatPartsN *val[3] = { a, b, c };
42
case 0x12: /* MLA, MLS */
38
Float3NaNPropRule rule = s->float_3nan_prop_rule;
43
if (u) {
39
44
- gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
40
assert(rule != float_3nan_prop_none);
45
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
41
if (have_snan && (rule & R_3NAN_SNAN_MASK)) {
42
/* We have at least one SNaN input and should prefer it */
43
do {
44
- which = rule & R_3NAN_1ST_MASK;
45
+ ret = val[rule & R_3NAN_1ST_MASK];
46
rule >>= R_3NAN_1ST_LENGTH;
47
- } while (!is_snan(cls[which]));
48
+ } while (!is_snan(ret->cls));
46
} else {
49
} else {
47
- gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
50
do {
48
+ gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
51
- which = rule & R_3NAN_1ST_MASK;
52
+ ret = val[rule & R_3NAN_1ST_MASK];
53
rule >>= R_3NAN_1ST_LENGTH;
54
- } while (!is_nan(cls[which]));
55
+ } while (!is_nan(ret->cls));
49
}
56
}
50
return;
57
}
51
case 0x11:
58
52
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
59
- switch (which) {
53
index XXXXXXX..XXXXXXX 100644
60
- case 0:
54
--- a/target/arm/translate-neon.inc.c
61
- break;
55
+++ b/target/arm/translate-neon.inc.c
62
- case 1:
56
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
63
- a = b;
57
DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
64
- break;
58
DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
65
- case 2:
59
DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
66
- a = c;
60
+DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
67
- break;
61
+DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
68
- default:
62
69
- g_assert_not_reached();
63
#define DO_3SAME_CMP(INSN, COND) \
70
+ if (is_snan(ret->cls)) {
64
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
71
+ parts_silence_nan(ret, s);
65
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
72
}
66
return do_3same(s, a, gen_VMUL_p_3s);
73
- if (is_snan(a->cls)) {
67
}
74
- parts_silence_nan(a, s);
68
75
- }
69
-#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY) \
76
- return a;
70
- static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
77
+ return ret;
71
- uint32_t rn_ofs, uint32_t rm_ofs, \
78
72
- uint32_t oprsz, uint32_t maxsz) \
79
default_nan:
73
- { \
80
parts_default_nan(a, s);
74
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
75
- oprsz, maxsz, &OPARRAY[vece]); \
76
- } \
77
- DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
78
-
79
-
80
-DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
81
-DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
82
-
83
#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \
84
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
85
uint32_t rn_ofs, uint32_t rm_ofs, \
86
diff --git a/target/arm/translate.c b/target/arm/translate.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/translate.c
89
+++ b/target/arm/translate.c
90
@@ -XXX,XX +XXX,XX @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
91
/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
92
* these tables are shared with AArch64 which does support them.
93
*/
94
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
95
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
96
+{
97
+ static const TCGOpcode vecop_list[] = {
98
+ INDEX_op_mul_vec, INDEX_op_add_vec, 0
99
+ };
100
+ static const GVecGen3 ops[4] = {
101
+ { .fni4 = gen_mla8_i32,
102
+ .fniv = gen_mla_vec,
103
+ .load_dest = true,
104
+ .opt_opc = vecop_list,
105
+ .vece = MO_8 },
106
+ { .fni4 = gen_mla16_i32,
107
+ .fniv = gen_mla_vec,
108
+ .load_dest = true,
109
+ .opt_opc = vecop_list,
110
+ .vece = MO_16 },
111
+ { .fni4 = gen_mla32_i32,
112
+ .fniv = gen_mla_vec,
113
+ .load_dest = true,
114
+ .opt_opc = vecop_list,
115
+ .vece = MO_32 },
116
+ { .fni8 = gen_mla64_i64,
117
+ .fniv = gen_mla_vec,
118
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
119
+ .load_dest = true,
120
+ .opt_opc = vecop_list,
121
+ .vece = MO_64 },
122
+ };
123
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
124
+}
125
126
-static const TCGOpcode vecop_list_mla[] = {
127
- INDEX_op_mul_vec, INDEX_op_add_vec, 0
128
-};
129
-
130
-static const TCGOpcode vecop_list_mls[] = {
131
- INDEX_op_mul_vec, INDEX_op_sub_vec, 0
132
-};
133
-
134
-const GVecGen3 mla_op[4] = {
135
- { .fni4 = gen_mla8_i32,
136
- .fniv = gen_mla_vec,
137
- .load_dest = true,
138
- .opt_opc = vecop_list_mla,
139
- .vece = MO_8 },
140
- { .fni4 = gen_mla16_i32,
141
- .fniv = gen_mla_vec,
142
- .load_dest = true,
143
- .opt_opc = vecop_list_mla,
144
- .vece = MO_16 },
145
- { .fni4 = gen_mla32_i32,
146
- .fniv = gen_mla_vec,
147
- .load_dest = true,
148
- .opt_opc = vecop_list_mla,
149
- .vece = MO_32 },
150
- { .fni8 = gen_mla64_i64,
151
- .fniv = gen_mla_vec,
152
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
153
- .load_dest = true,
154
- .opt_opc = vecop_list_mla,
155
- .vece = MO_64 },
156
-};
157
-
158
-const GVecGen3 mls_op[4] = {
159
- { .fni4 = gen_mls8_i32,
160
- .fniv = gen_mls_vec,
161
- .load_dest = true,
162
- .opt_opc = vecop_list_mls,
163
- .vece = MO_8 },
164
- { .fni4 = gen_mls16_i32,
165
- .fniv = gen_mls_vec,
166
- .load_dest = true,
167
- .opt_opc = vecop_list_mls,
168
- .vece = MO_16 },
169
- { .fni4 = gen_mls32_i32,
170
- .fniv = gen_mls_vec,
171
- .load_dest = true,
172
- .opt_opc = vecop_list_mls,
173
- .vece = MO_32 },
174
- { .fni8 = gen_mls64_i64,
175
- .fniv = gen_mls_vec,
176
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
177
- .load_dest = true,
178
- .opt_opc = vecop_list_mls,
179
- .vece = MO_64 },
180
-};
181
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
182
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
183
+{
184
+ static const TCGOpcode vecop_list[] = {
185
+ INDEX_op_mul_vec, INDEX_op_sub_vec, 0
186
+ };
187
+ static const GVecGen3 ops[4] = {
188
+ { .fni4 = gen_mls8_i32,
189
+ .fniv = gen_mls_vec,
190
+ .load_dest = true,
191
+ .opt_opc = vecop_list,
192
+ .vece = MO_8 },
193
+ { .fni4 = gen_mls16_i32,
194
+ .fniv = gen_mls_vec,
195
+ .load_dest = true,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_16 },
198
+ { .fni4 = gen_mls32_i32,
199
+ .fniv = gen_mls_vec,
200
+ .load_dest = true,
201
+ .opt_opc = vecop_list,
202
+ .vece = MO_32 },
203
+ { .fni8 = gen_mls64_i64,
204
+ .fniv = gen_mls_vec,
205
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
206
+ .load_dest = true,
207
+ .opt_opc = vecop_list,
208
+ .vece = MO_64 },
209
+ };
210
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
211
+}
212
213
/* CMTST : test is "if (X & Y != 0)". */
214
static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
215
--
81
--
216
2.20.1
82
2.34.1
217
83
218
84
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Rather than perform the argument swap during code generation,
perform it during decode. This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation. Hopefully the decode comment addresses any confusion
that might arise in between.

While all indices into val[] should be in [0-2], the mask
applied is two bits. To help static analysis see there is
no possibility of read beyond the end of the array, pad the
array to 4 entries, with the final being (implicitly) NULL.
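
As a stand-alone sketch of that padding trick (plain C with invented names, not the softfloat code itself):

/* A two-bit index mask admits values 0..3, so size the array for 4
 * entries even though only 0..2 are ever produced; the fourth slot is
 * implicitly NULL, so any index the mask can yield stays in bounds. */
#include <stdio.h>

#define RULE_MASK 0x3   /* two-bit field */

int main(void)
{
    int a = 10, b = 20, c = 30;
    int *val[RULE_MASK + 1] = { &a, &b, &c };   /* val[3] is NULL */
    unsigned rule = 1;                          /* 0..2 in practice */
    int *p = val[rule & RULE_MASK];
    printf("%d\n", p ? *p : -1);
    return 0;
}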
8
7
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Message-id: 20241203203949.483774-6-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
12
---
14
target/arm/neon-dp.decode | 17 +++++++++++++++--
13
fpu/softfloat-parts.c.inc | 2 +-
15
target/arm/translate-neon.inc.c | 3 +--
14
1 file changed, 1 insertion(+), 1 deletion(-)
16
2 files changed, 16 insertions(+), 4 deletions(-)
17
15
18
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
16
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/neon-dp.decode
18
--- a/fpu/softfloat-parts.c.inc
21
+++ b/target/arm/neon-dp.decode
19
+++ b/fpu/softfloat-parts.c.inc
22
@@ -XXX,XX +XXX,XX @@ VCGT_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
20
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
23
VCGE_S_3s 1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
21
}
24
VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
22
ret = c;
25
23
} else {
26
-VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
24
- FloatPartsN *val[3] = { a, b, c };
27
-VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
25
+ FloatPartsN *val[R_3NAN_1ST_MASK + 1] = { a, b, c };
28
+# The _rev suffix indicates that Vn and Vm are reversed. This is
26
Float3NaNPropRule rule = s->float_3nan_prop_rule;
29
+# the case for shifts. In the Arm ARM these insns are documented
27
30
+# with the Vm and Vn fields in their usual places, but in the
28
assert(rule != float_3nan_prop_none);
31
+# assembly the operands are listed "backwards", ie in the order
32
+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
33
+# to consider Vm and Vn as being in different fields in the insn,
34
+# which allows us to avoid special-casing shifts in the trans_
35
+# function code. We would otherwise need to manually swap the operands
36
+# over to call Neon helper functions that are shared with AArch64,
37
+# which does not have this odd reversed-operand situation.
38
+@3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \
39
+ &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
40
+
41
+VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
42
+VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
43
44
VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
45
VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
46
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/translate-neon.inc.c
49
+++ b/target/arm/translate-neon.inc.c
50
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
51
uint32_t rn_ofs, uint32_t rm_ofs, \
52
uint32_t oprsz, uint32_t maxsz) \
53
{ \
54
- /* Note the operation is vshl vd,vm,vn */ \
55
- tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, \
56
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
57
oprsz, maxsz, &OPARRAY[vece]); \
58
} \
59
DO_3SAME(INSN, gen_##INSN##_3s)
60
--
29
--
61
2.20.1
30
2.34.1
62
31
63
32
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This function is part of the public interface and
is not "specialized" to any target in any way.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

kvm_hwpoison_page_add() and kvm_unpoison_all() will both
be used by X86 and ARM platforms, so move them into
"accel/kvm/kvm-all.c" to avoid duplicate code.

For architectures that don't use the poison-list functionality,
the reset handler will harmlessly do nothing, so let's register
the kvm_unpoison_all() function in the generic kvm_init() function.
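
In outline, the bookkeeping these helpers share looks like the simplified sketch below; a fixed-size array stands in for QEMU's QLIST, so this is an illustration of the idea rather than the code being moved.

/* Record a poisoned page only once; forget all of them on reset. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t ram_addr_t;

static ram_addr_t poisoned[64];
static size_t n_poisoned;

static void hwpoison_page_add(ram_addr_t addr)
{
    for (size_t i = 0; i < n_poisoned; i++) {
        if (poisoned[i] == addr) {
            return;                 /* already tracked */
        }
    }
    if (n_poisoned < 64) {
        poisoned[n_poisoned++] = addr;
    }
}

static void unpoison_all(void)
{
    n_poisoned = 0;                 /* harmless no-op when nothing is listed */
}

int main(void)
{
    hwpoison_page_add(0x1000);
    hwpoison_page_add(0x1000);      /* deduplicated */
    printf("%zu page(s) tracked\n", n_poisoned);
    unpoison_all();
    return 0;
}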
10
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
8
Message-id: 20241203203949.483774-7-richard.henderson@linaro.org
13
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
14
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
15
Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
10
---
18
include/sysemu/kvm_int.h | 12 ++++++++++++
11
fpu/softfloat.c | 52 ++++++++++++++++++++++++++++++++++
19
accel/kvm/kvm-all.c | 36 ++++++++++++++++++++++++++++++++++++
12
fpu/softfloat-specialize.c.inc | 52 ----------------------------------
20
target/i386/kvm.c | 36 ------------------------------------
13
2 files changed, 52 insertions(+), 52 deletions(-)
21
3 files changed, 48 insertions(+), 36 deletions(-)
22
14
23
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
24
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
25
--- a/include/sysemu/kvm_int.h
17
--- a/fpu/softfloat.c
26
+++ b/include/sysemu/kvm_int.h
18
+++ b/fpu/softfloat.c
27
@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
28
AddressSpace *as, int as_id);
20
*zExpPtr = 1 - shiftCount;
29
21
}
30
void kvm_set_max_memslot_size(hwaddr max_slot_size);
22
23
+/*----------------------------------------------------------------------------
24
+| Takes two extended double-precision floating-point values `a' and `b', one
25
+| of which is a NaN, and returns the appropriate NaN result. If either `a' or
26
+| `b' is a signaling NaN, the invalid exception is raised.
27
+*----------------------------------------------------------------------------*/
31
+
28
+
32
+/**
29
+floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
33
+ * kvm_hwpoison_page_add:
30
+{
34
+ *
31
+ bool aIsLargerSignificand;
35
+ * Parameters:
32
+ FloatClass a_cls, b_cls;
36
+ * @ram_addr: the address in the RAM for the poisoned page
37
+ *
38
+ * Add a poisoned page to the list
39
+ *
40
+ * Return: None.
41
+ */
42
+void kvm_hwpoison_page_add(ram_addr_t ram_addr);
43
#endif
44
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
45
index XXXXXXX..XXXXXXX 100644
46
--- a/accel/kvm/kvm-all.c
47
+++ b/accel/kvm/kvm-all.c
48
@@ -XXX,XX +XXX,XX @@
49
#include "qapi/visitor.h"
50
#include "qapi/qapi-types-common.h"
51
#include "qapi/qapi-visit-common.h"
52
+#include "sysemu/reset.h"
53
54
#include "hw/boards.h"
55
56
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
57
return ret;
58
}
59
60
+typedef struct HWPoisonPage {
61
+ ram_addr_t ram_addr;
62
+ QLIST_ENTRY(HWPoisonPage) list;
63
+} HWPoisonPage;
64
+
33
+
65
+static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
34
+ /* This is not complete, but is good enough for pickNaN. */
66
+ QLIST_HEAD_INITIALIZER(hwpoison_page_list);
35
+ a_cls = (!floatx80_is_any_nan(a)
36
+ ? float_class_normal
37
+ : floatx80_is_signaling_nan(a, status)
38
+ ? float_class_snan
39
+ : float_class_qnan);
40
+ b_cls = (!floatx80_is_any_nan(b)
41
+ ? float_class_normal
42
+ : floatx80_is_signaling_nan(b, status)
43
+ ? float_class_snan
44
+ : float_class_qnan);
67
+
45
+
68
+static void kvm_unpoison_all(void *param)
46
+ if (is_snan(a_cls) || is_snan(b_cls)) {
69
+{
47
+ float_raise(float_flag_invalid, status);
70
+ HWPoisonPage *page, *next_page;
48
+ }
71
+
49
+
72
+ QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
50
+ if (status->default_nan_mode) {
73
+ QLIST_REMOVE(page, list);
51
+ return floatx80_default_nan(status);
74
+ qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
52
+ }
75
+ g_free(page);
53
+
54
+ if (a.low < b.low) {
55
+ aIsLargerSignificand = 0;
56
+ } else if (b.low < a.low) {
57
+ aIsLargerSignificand = 1;
58
+ } else {
59
+ aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
60
+ }
61
+
62
+ if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
63
+ if (is_snan(b_cls)) {
64
+ return floatx80_silence_nan(b, status);
65
+ }
66
+ return b;
67
+ } else {
68
+ if (is_snan(a_cls)) {
69
+ return floatx80_silence_nan(a, status);
70
+ }
71
+ return a;
76
+ }
72
+ }
77
+}
73
+}
78
+
74
+
79
+void kvm_hwpoison_page_add(ram_addr_t ram_addr)
75
/*----------------------------------------------------------------------------
80
+{
76
| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
81
+ HWPoisonPage *page;
77
| and extended significand formed by the concatenation of `zSig0' and `zSig1',
82
+
78
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
83
+ QLIST_FOREACH(page, &hwpoison_page_list, list) {
84
+ if (page->ram_addr == ram_addr) {
85
+ return;
86
+ }
87
+ }
88
+ page = g_new(HWPoisonPage, 1);
89
+ page->ram_addr = ram_addr;
90
+ QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
91
+}
92
+
93
static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
94
{
95
#if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
96
@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
97
s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
98
}
99
100
+ qemu_register_reset(kvm_unpoison_all, NULL);
101
+
102
if (s->kernel_irqchip_allowed) {
103
kvm_irqchip_create(s);
104
}
105
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
106
index XXXXXXX..XXXXXXX 100644
79
index XXXXXXX..XXXXXXX 100644
107
--- a/target/i386/kvm.c
80
--- a/fpu/softfloat-specialize.c.inc
108
+++ b/target/i386/kvm.c
81
+++ b/fpu/softfloat-specialize.c.inc
109
@@ -XXX,XX +XXX,XX @@
82
@@ -XXX,XX +XXX,XX @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
110
#include "sysemu/sysemu.h"
83
return a;
111
#include "sysemu/hw_accel.h"
112
#include "sysemu/kvm_int.h"
113
-#include "sysemu/reset.h"
114
#include "sysemu/runstate.h"
115
#include "kvm_i386.h"
116
#include "hyperv.h"
117
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
118
}
119
}
84
}
120
85
86
-/*----------------------------------------------------------------------------
87
-| Takes two extended double-precision floating-point values `a' and `b', one
88
-| of which is a NaN, and returns the appropriate NaN result. If either `a' or
89
-| `b' is a signaling NaN, the invalid exception is raised.
90
-*----------------------------------------------------------------------------*/
121
-
91
-
122
-typedef struct HWPoisonPage {
92
-floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
123
- ram_addr_t ram_addr;
93
-{
124
- QLIST_ENTRY(HWPoisonPage) list;
94
- bool aIsLargerSignificand;
125
-} HWPoisonPage;
95
- FloatClass a_cls, b_cls;
126
-
96
-
127
-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
97
- /* This is not complete, but is good enough for pickNaN. */
128
- QLIST_HEAD_INITIALIZER(hwpoison_page_list);
98
- a_cls = (!floatx80_is_any_nan(a)
99
- ? float_class_normal
100
- : floatx80_is_signaling_nan(a, status)
101
- ? float_class_snan
102
- : float_class_qnan);
103
- b_cls = (!floatx80_is_any_nan(b)
104
- ? float_class_normal
105
- : floatx80_is_signaling_nan(b, status)
106
- ? float_class_snan
107
- : float_class_qnan);
129
-
108
-
130
-static void kvm_unpoison_all(void *param)
109
- if (is_snan(a_cls) || is_snan(b_cls)) {
131
-{
110
- float_raise(float_flag_invalid, status);
132
- HWPoisonPage *page, *next_page;
111
- }
133
-
112
-
134
- QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
113
- if (status->default_nan_mode) {
135
- QLIST_REMOVE(page, list);
114
- return floatx80_default_nan(status);
136
- qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
115
- }
137
- g_free(page);
116
-
117
- if (a.low < b.low) {
118
- aIsLargerSignificand = 0;
119
- } else if (b.low < a.low) {
120
- aIsLargerSignificand = 1;
121
- } else {
122
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
123
- }
124
-
125
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
126
- if (is_snan(b_cls)) {
127
- return floatx80_silence_nan(b, status);
128
- }
129
- return b;
130
- } else {
131
- if (is_snan(a_cls)) {
132
- return floatx80_silence_nan(a, status);
133
- }
134
- return a;
138
- }
135
- }
139
-}
136
-}
140
-
137
-
141
-static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
138
/*----------------------------------------------------------------------------
142
-{
139
| Returns 1 if the quadruple-precision floating-point value `a' is a quiet
143
- HWPoisonPage *page;
140
| NaN; otherwise returns 0.
144
-
145
- QLIST_FOREACH(page, &hwpoison_page_list, list) {
146
- if (page->ram_addr == ram_addr) {
147
- return;
148
- }
149
- }
150
- page = g_new(HWPoisonPage, 1);
151
- page->ram_addr = ram_addr;
152
- QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
153
-}
154
-
155
static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
156
int *max_banks)
157
{
158
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
159
fprintf(stderr, "e820_add_entry() table is full\n");
160
return ret;
161
}
162
- qemu_register_reset(kvm_unpoison_all, NULL);
163
164
shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
165
if (shadow_mem != -1) {
166
--
141
--
167
2.20.1
142
2.34.1
168
169
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The functions eliminate duplication of the special cases for
3
Unpacking and repacking the parts may be slightly more work
4
this operation. They match up with the GVecGen2iFn typedef.
4
than we did before, but we get to reuse more code. For a
5
code path handling exceptional values, this is an improvement.
5
6
6
Add out-of-line helpers. We got away with only having inline
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
expanders because the neon vector size is only 16 bytes, and
8
Message-id: 20241203203949.483774-8-richard.henderson@linaro.org
8
we know that the inline expansion will always succeed.
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
When we reuse this for SVE, tcg-gvec-op may decide to use an
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
out-of-line helper due to longer vector lengths.
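
For the shift-and-accumulate operation converted here (SSRA/USRA), the large-shift special cases that the new functions absorb can be seen in a stand-alone sketch of the per-element step; this is illustrative C, not the QEMU helpers, and the signed case assumes the usual arithmetic right shift.

/* d[i] += n[i] >> shift for an 8-bit lane, with shift == element size:
 * the signed result is all copies of the sign bit (clamp the shift to 7),
 * the unsigned result contributes zero (the accumulate becomes a nop). */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int8_t  sn = -4;
    uint8_t un = 0x80;
    int shift = 8;                                  /* architecturally valid */

    int ssra_add = sn >> (shift > 7 ? 7 : shift);   /* -1 */
    int usra_add = shift >= 8 ? 0 : un >> shift;    /* 0  */

    printf("ssra adds %d, usra adds %d\n", ssra_add, usra_add);
    return 0;
}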
11
---
12
fpu/softfloat.c | 43 +++++--------------------------------------
13
1 file changed, 5 insertions(+), 38 deletions(-)
11
14
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
17
target/arm/helper.h | 10 +++
18
target/arm/translate.h | 7 +-
19
target/arm/translate-a64.c | 15 +---
20
target/arm/translate.c | 161 ++++++++++++++++++++++---------------
21
target/arm/vec_helper.c | 25 ++++++
22
5 files changed, 139 insertions(+), 79 deletions(-)
23
24
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/helper.h
17
--- a/fpu/softfloat.c
27
+++ b/target/arm/helper.h
18
+++ b/fpu/softfloat.c
28
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
@@ -XXX,XX +XXX,XX @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t *zExpPtr,
29
20
30
DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, float_status *status)
31
22
{
32
+DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
- bool aIsLargerSignificand;
33
+DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
- FloatClass a_cls, b_cls;
34
+DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
+ FloatParts128 pa, pb, *pr;
35
+DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
36
+
27
- /* This is not complete, but is good enough for pickNaN. */
37
+DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
28
- a_cls = (!floatx80_is_any_nan(a)
38
+DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
29
- ? float_class_normal
39
+DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
- : floatx80_is_signaling_nan(a, status)
40
+DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
31
- ? float_class_snan
41
+
32
- : float_class_qnan);
42
#ifdef TARGET_AARCH64
33
- b_cls = (!floatx80_is_any_nan(b)
43
#include "helper-a64.h"
34
- ? float_class_normal
44
#include "helper-sve.h"
35
- : floatx80_is_signaling_nan(b, status)
45
diff --git a/target/arm/translate.h b/target/arm/translate.h
36
- ? float_class_snan
46
index XXXXXXX..XXXXXXX 100644
37
- : float_class_qnan);
47
--- a/target/arm/translate.h
38
-
48
+++ b/target/arm/translate.h
39
- if (is_snan(a_cls) || is_snan(b_cls)) {
49
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
40
- float_raise(float_flag_invalid, status);
50
extern const GVecGen3 cmtst_op[4];
41
- }
51
extern const GVecGen3 sshl_op[4];
42
-
52
extern const GVecGen3 ushl_op[4];
43
- if (status->default_nan_mode) {
53
-extern const GVecGen2i ssra_op[4];
44
+ if (!floatx80_unpack_canonical(&pa, a, status) ||
54
-extern const GVecGen2i usra_op[4];
45
+ !floatx80_unpack_canonical(&pb, b, status)) {
55
extern const GVecGen2i sri_op[4];
46
return floatx80_default_nan(status);
56
extern const GVecGen2i sli_op[4];
47
}
57
extern const GVecGen4 uqadd_op[4];
48
58
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
49
- if (a.low < b.low) {
59
void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
50
- aIsLargerSignificand = 0;
60
void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
51
- } else if (b.low < a.low) {
61
52
- aIsLargerSignificand = 1;
62
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
53
- } else {
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
54
- aIsLargerSignificand = (a.high < b.high) ? 1 : 0;
64
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
55
- }
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
56
-
66
+
57
- if (pickNaN(a_cls, b_cls, aIsLargerSignificand, status)) {
67
/*
58
- if (is_snan(b_cls)) {
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
59
- return floatx80_silence_nan(b, status);
69
*/
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
74
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
75
76
switch (opcode) {
77
case 0x02: /* SSRA / USRA (accumulate) */
78
- if (is_u) {
79
- /* Shift count same as element size produces zero to add. */
80
- if (shift == 8 << size) {
81
- goto done;
82
- }
83
- gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
84
- } else {
85
- /* Shift count same as element size produces all sign to add. */
86
- if (shift == 8 << size) {
87
- shift -= 1;
88
- }
89
- gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
90
- }
60
- }
91
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
61
- return b;
92
+ is_u ? gen_gvec_usra : gen_gvec_ssra, size);
62
- } else {
93
return;
63
- if (is_snan(a_cls)) {
94
case 0x08: /* SRI */
64
- return floatx80_silence_nan(a, status);
95
/* Shift count same as element size is valid but does nothing. */
65
- }
96
diff --git a/target/arm/translate.c b/target/arm/translate.c
66
- return a;
97
index XXXXXXX..XXXXXXX 100644
67
- }
98
--- a/target/arm/translate.c
68
+ pr = parts_pick_nan(&pa, &pb, status);
99
+++ b/target/arm/translate.c
69
+ return floatx80_round_pack_canonical(pr, status);
100
@@ -XXX,XX +XXX,XX @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
101
tcg_gen_add_vec(vece, d, d, a);
102
}
70
}
103
71
104
-static const TCGOpcode vecop_list_ssra[] = {
72
/*----------------------------------------------------------------------------
105
- INDEX_op_sari_vec, INDEX_op_add_vec, 0
106
-};
107
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
108
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
109
+{
110
+ static const TCGOpcode vecop_list[] = {
111
+ INDEX_op_sari_vec, INDEX_op_add_vec, 0
112
+ };
113
+ static const GVecGen2i ops[4] = {
114
+ { .fni8 = gen_ssra8_i64,
115
+ .fniv = gen_ssra_vec,
116
+ .fno = gen_helper_gvec_ssra_b,
117
+ .load_dest = true,
118
+ .opt_opc = vecop_list,
119
+ .vece = MO_8 },
120
+ { .fni8 = gen_ssra16_i64,
121
+ .fniv = gen_ssra_vec,
122
+ .fno = gen_helper_gvec_ssra_h,
123
+ .load_dest = true,
124
+ .opt_opc = vecop_list,
125
+ .vece = MO_16 },
126
+ { .fni4 = gen_ssra32_i32,
127
+ .fniv = gen_ssra_vec,
128
+ .fno = gen_helper_gvec_ssra_s,
129
+ .load_dest = true,
130
+ .opt_opc = vecop_list,
131
+ .vece = MO_32 },
132
+ { .fni8 = gen_ssra64_i64,
133
+ .fniv = gen_ssra_vec,
134
+ .fno = gen_helper_gvec_ssra_b,
135
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
136
+ .opt_opc = vecop_list,
137
+ .load_dest = true,
138
+ .vece = MO_64 },
139
+ };
140
141
-const GVecGen2i ssra_op[4] = {
142
- { .fni8 = gen_ssra8_i64,
143
- .fniv = gen_ssra_vec,
144
- .load_dest = true,
145
- .opt_opc = vecop_list_ssra,
146
- .vece = MO_8 },
147
- { .fni8 = gen_ssra16_i64,
148
- .fniv = gen_ssra_vec,
149
- .load_dest = true,
150
- .opt_opc = vecop_list_ssra,
151
- .vece = MO_16 },
152
- { .fni4 = gen_ssra32_i32,
153
- .fniv = gen_ssra_vec,
154
- .load_dest = true,
155
- .opt_opc = vecop_list_ssra,
156
- .vece = MO_32 },
157
- { .fni8 = gen_ssra64_i64,
158
- .fniv = gen_ssra_vec,
159
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
160
- .opt_opc = vecop_list_ssra,
161
- .load_dest = true,
162
- .vece = MO_64 },
163
-};
164
+ /* tszimm encoding produces immediates in the range [1..esize]. */
165
+ tcg_debug_assert(shift > 0);
166
+ tcg_debug_assert(shift <= (8 << vece));
167
+
168
+ /*
169
+ * Shifts larger than the element size are architecturally valid.
170
+ * Signed results in all sign bits.
171
+ */
172
+ shift = MIN(shift, (8 << vece) - 1);
173
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
174
+}
175
176
static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
177
{
178
@@ -XXX,XX +XXX,XX @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
179
tcg_gen_add_vec(vece, d, d, a);
180
}
181
182
-static const TCGOpcode vecop_list_usra[] = {
183
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
184
-};
185
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
186
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
187
+{
188
+ static const TCGOpcode vecop_list[] = {
189
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
190
+ };
191
+ static const GVecGen2i ops[4] = {
192
+ { .fni8 = gen_usra8_i64,
193
+ .fniv = gen_usra_vec,
194
+ .fno = gen_helper_gvec_usra_b,
195
+ .load_dest = true,
196
+ .opt_opc = vecop_list,
197
+ .vece = MO_8, },
198
+ { .fni8 = gen_usra16_i64,
199
+ .fniv = gen_usra_vec,
200
+ .fno = gen_helper_gvec_usra_h,
201
+ .load_dest = true,
202
+ .opt_opc = vecop_list,
203
+ .vece = MO_16, },
204
+ { .fni4 = gen_usra32_i32,
205
+ .fniv = gen_usra_vec,
206
+ .fno = gen_helper_gvec_usra_s,
207
+ .load_dest = true,
208
+ .opt_opc = vecop_list,
209
+ .vece = MO_32, },
210
+ { .fni8 = gen_usra64_i64,
211
+ .fniv = gen_usra_vec,
212
+ .fno = gen_helper_gvec_usra_d,
213
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
214
+ .load_dest = true,
215
+ .opt_opc = vecop_list,
216
+ .vece = MO_64, },
217
+ };
218
219
-const GVecGen2i usra_op[4] = {
220
- { .fni8 = gen_usra8_i64,
221
- .fniv = gen_usra_vec,
222
- .load_dest = true,
223
- .opt_opc = vecop_list_usra,
224
- .vece = MO_8, },
225
- { .fni8 = gen_usra16_i64,
226
- .fniv = gen_usra_vec,
227
- .load_dest = true,
228
- .opt_opc = vecop_list_usra,
229
- .vece = MO_16, },
230
- { .fni4 = gen_usra32_i32,
231
- .fniv = gen_usra_vec,
232
- .load_dest = true,
233
- .opt_opc = vecop_list_usra,
234
- .vece = MO_32, },
235
- { .fni8 = gen_usra64_i64,
236
- .fniv = gen_usra_vec,
237
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
238
- .load_dest = true,
239
- .opt_opc = vecop_list_usra,
240
- .vece = MO_64, },
241
-};
242
+ /* tszimm encoding produces immediates in the range [1..esize]. */
243
+ tcg_debug_assert(shift > 0);
244
+ tcg_debug_assert(shift <= (8 << vece));
245
+
246
+ /*
247
+ * Shifts larger than the element size are architecturally valid.
248
+ * Unsigned results in all zeros as input to accumulate: nop.
249
+ */
250
+ if (shift < (8 << vece)) {
251
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
252
+ } else {
253
+ /* Nop, but we do need to clear the tail. */
254
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
255
+ }
256
+}
257
258
static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
259
{
260
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
261
case 1: /* VSRA */
262
/* Right shift comes here negative. */
263
shift = -shift;
264
- /* Shifts larger than the element size are architecturally
265
- * valid. Unsigned results in all zeros; signed results
266
- * in all sign bits.
267
- */
268
- if (!u) {
269
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
270
- MIN(shift, (8 << size) - 1),
271
- &ssra_op[size]);
272
- } else if (shift >= 8 << size) {
273
- /* rd += 0 */
274
+ if (u) {
275
+ gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
276
+ vec_size, vec_size);
277
} else {
278
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
279
- shift, &usra_op[size]);
280
+ gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
281
+ vec_size, vec_size);
282
}
283
return 0;
284
285
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
286
index XXXXXXX..XXXXXXX 100644
287
--- a/target/arm/vec_helper.c
288
+++ b/target/arm/vec_helper.c
289
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
290
clear_tail(d, oprsz, simd_maxsz(desc));
291
}
292
293
+
294
+#define DO_SRA(NAME, TYPE) \
295
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
296
+{ \
297
+ intptr_t i, oprsz = simd_oprsz(desc); \
298
+ int shift = simd_data(desc); \
299
+ TYPE *d = vd, *n = vn; \
300
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
301
+ d[i] += n[i] >> shift; \
302
+ } \
303
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
304
+}
305
+
306
+DO_SRA(gvec_ssra_b, int8_t)
307
+DO_SRA(gvec_ssra_h, int16_t)
308
+DO_SRA(gvec_ssra_s, int32_t)
309
+DO_SRA(gvec_ssra_d, int64_t)
310
+
311
+DO_SRA(gvec_usra_b, uint8_t)
312
+DO_SRA(gvec_usra_h, uint16_t)
313
+DO_SRA(gvec_usra_s, uint32_t)
314
+DO_SRA(gvec_usra_d, uint64_t)
315
+
316
+#undef DO_SRA
317
+
318
/*
319
* Convert float16 to float32, raising no exceptions and
320
* preserving exceptional values, including SNaN.
321
--
73
--
322
2.20.1
74
2.34.1
323
324
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The functions eliminate duplication of the special cases for
3
Inline pickNaN into its only caller. This makes one assert
4
this operation. They match up with the GVecGen2iFn typedef.
4
redundant with the immediately preceding IF.
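
The rule-driven choice that pickNaN makes can be outlined in a few lines of plain C covering just two of the propagation rules; this is an illustration of the idea, not the softfloat implementation.

/* Return 0 to propagate operand a, 1 to propagate operand b. */
#include <stdbool.h>
#include <stdio.h>

enum prop_rule { PROP_AB, PROP_BA };   /* stand-ins for float_2nan_prop_* */

static int pick_2nan(bool a_is_nan, bool b_is_nan, enum prop_rule rule)
{
    switch (rule) {
    case PROP_AB:                      /* prefer a when it is a NaN */
        return a_is_nan ? 0 : 1;
    case PROP_BA:                      /* prefer b when it is a NaN */
        return b_is_nan ? 1 : 0;
    }
    return 0;
}

int main(void)
{
    printf("%d\n", pick_2nan(true, true, PROP_BA));   /* prints 1 */
    return 0;
}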
5
5
6
Add out-of-line helpers. We got away with only having inline
7
expanders because the neon vector size is only 16 bytes, and
8
we know that the inline expansion will always succeed.
9
When we reuse this for SVE, tcg-gvec-op may decide to use an
10
out-of-line helper due to longer vector lengths.
11
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-9-richard.henderson@linaro.org
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
10
---
17
target/arm/helper.h | 10 ++
11
fpu/softfloat-parts.c.inc | 82 +++++++++++++++++++++++++----
18
target/arm/translate.h | 7 +-
12
fpu/softfloat-specialize.c.inc | 96 ----------------------------------
19
target/arm/translate-a64.c | 20 +---
13
2 files changed, 73 insertions(+), 105 deletions(-)
20
target/arm/translate.c | 186 +++++++++++++++++++++----------------
14
21
target/arm/vec_helper.c | 38 ++++++++
15
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
22
5 files changed, 160 insertions(+), 101 deletions(-)
23
24
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/helper.h
17
--- a/fpu/softfloat-parts.c.inc
27
+++ b/target/arm/helper.h
18
+++ b/fpu/softfloat-parts.c.inc
28
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
19
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
29
DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
20
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
30
DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
21
float_status *s)
31
22
{
32
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
+ int cmp, which;
33
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
36
+
24
+
37
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
if (is_snan(a->cls) || is_snan(b->cls)) {
38
+DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
39
+DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
27
}
40
+DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
28
29
if (s->default_nan_mode) {
30
parts_default_nan(a, s);
31
- } else {
32
- int cmp = frac_cmp(a, b);
33
- if (cmp == 0) {
34
- cmp = a->sign < b->sign;
35
- }
36
+ return a;
37
+ }
38
39
- if (pickNaN(a->cls, b->cls, cmp > 0, s)) {
40
- a = b;
41
- }
42
+ cmp = frac_cmp(a, b);
43
+ if (cmp == 0) {
44
+ cmp = a->sign < b->sign;
45
+ }
41
+
46
+
42
#ifdef TARGET_AARCH64
47
+ switch (s->float_2nan_prop_rule) {
43
#include "helper-a64.h"
48
+ case float_2nan_prop_s_ab:
44
#include "helper-sve.h"
49
if (is_snan(a->cls)) {
45
diff --git a/target/arm/translate.h b/target/arm/translate.h
50
- parts_silence_nan(a, s);
51
+ which = 0;
52
+ } else if (is_snan(b->cls)) {
53
+ which = 1;
54
+ } else if (is_qnan(a->cls)) {
55
+ which = 0;
56
+ } else {
57
+ which = 1;
58
}
59
+ break;
60
+ case float_2nan_prop_s_ba:
61
+ if (is_snan(b->cls)) {
62
+ which = 1;
63
+ } else if (is_snan(a->cls)) {
64
+ which = 0;
65
+ } else if (is_qnan(b->cls)) {
66
+ which = 1;
67
+ } else {
68
+ which = 0;
69
+ }
70
+ break;
71
+ case float_2nan_prop_ab:
72
+ which = is_nan(a->cls) ? 0 : 1;
73
+ break;
74
+ case float_2nan_prop_ba:
75
+ which = is_nan(b->cls) ? 1 : 0;
76
+ break;
77
+ case float_2nan_prop_x87:
78
+ /*
79
+ * This implements x87 NaN propagation rules:
80
+ * SNaN + QNaN => return the QNaN
81
+ * two SNaNs => return the one with the larger significand, silenced
82
+ * two QNaNs => return the one with the larger significand
83
+ * SNaN and a non-NaN => return the SNaN, silenced
84
+ * QNaN and a non-NaN => return the QNaN
85
+ *
86
+ * If we get down to comparing significands and they are the same,
87
+ * return the NaN with the positive sign bit (if any).
88
+ */
89
+ if (is_snan(a->cls)) {
90
+ if (is_snan(b->cls)) {
91
+ which = cmp > 0 ? 0 : 1;
92
+ } else {
93
+ which = is_qnan(b->cls) ? 1 : 0;
94
+ }
95
+ } else if (is_qnan(a->cls)) {
96
+ if (is_snan(b->cls) || !is_qnan(b->cls)) {
97
+ which = 0;
98
+ } else {
99
+ which = cmp > 0 ? 0 : 1;
100
+ }
101
+ } else {
102
+ which = 1;
103
+ }
104
+ break;
105
+ default:
106
+ g_assert_not_reached();
107
+ }
108
+
109
+ if (which) {
110
+ a = b;
111
+ }
112
+ if (is_snan(a->cls)) {
113
+ parts_silence_nan(a, s);
114
}
115
return a;
116
}
117
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
46
index XXXXXXX..XXXXXXX 100644
118
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.h
119
--- a/fpu/softfloat-specialize.c.inc
48
+++ b/target/arm/translate.h
120
+++ b/fpu/softfloat-specialize.c.inc
49
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
121
@@ -XXX,XX +XXX,XX @@ bool float32_is_signaling_nan(float32 a_, float_status *status)
50
extern const GVecGen3 cmtst_op[4];
122
}
51
extern const GVecGen3 sshl_op[4];
52
extern const GVecGen3 ushl_op[4];
53
-extern const GVecGen2i sri_op[4];
54
-extern const GVecGen2i sli_op[4];
55
extern const GVecGen4 uqadd_op[4];
56
extern const GVecGen4 sqadd_op[4];
57
extern const GVecGen4 uqsub_op[4];
58
@@ -XXX,XX +XXX,XX @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
59
void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
60
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
61
62
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
64
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
66
+
67
/*
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
69
*/
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
74
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
75
is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
76
}
123
}
77
124
78
-/* Expand a 2-operand + immediate AdvSIMD vector operation using
125
-/*----------------------------------------------------------------------------
79
- * an op descriptor.
126
-| Select which NaN to propagate for a two-input operation.
80
- */
127
-| IEEE754 doesn't specify all the details of this, so the
81
-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
128
-| algorithm is target-specific.
82
- int rn, int64_t imm, const GVecGen2i *gvec_op)
129
-| The routine is passed various bits of information about the
130
-| two NaNs and should return 0 to select NaN a and 1 for NaN b.
131
-| Note that signalling NaNs are always squashed to quiet NaNs
132
-| by the caller, by calling floatXX_silence_nan() before
133
-| returning them.
134
-|
135
-| aIsLargerSignificand is only valid if both a and b are NaNs
136
-| of some kind, and is true if a has the larger significand,
137
-| or if both a and b have the same significand but a is
138
-| positive but b is negative. It is only needed for the x87
139
-| tie-break rule.
140
-*----------------------------------------------------------------------------*/
141
-
142
-static int pickNaN(FloatClass a_cls, FloatClass b_cls,
143
- bool aIsLargerSignificand, float_status *status)
83
-{
144
-{
84
- tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
145
- /*
85
- is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
146
- * We guarantee not to require the target to tell us how to
147
- * pick a NaN if we're always returning the default NaN.
148
- * But if we're not in default-NaN mode then the target must
149
- * specify via set_float_2nan_prop_rule().
150
- */
151
- assert(!status->default_nan_mode);
152
-
153
- switch (status->float_2nan_prop_rule) {
154
- case float_2nan_prop_s_ab:
155
- if (is_snan(a_cls)) {
156
- return 0;
157
- } else if (is_snan(b_cls)) {
158
- return 1;
159
- } else if (is_qnan(a_cls)) {
160
- return 0;
161
- } else {
162
- return 1;
163
- }
164
- break;
165
- case float_2nan_prop_s_ba:
166
- if (is_snan(b_cls)) {
167
- return 1;
168
- } else if (is_snan(a_cls)) {
169
- return 0;
170
- } else if (is_qnan(b_cls)) {
171
- return 1;
172
- } else {
173
- return 0;
174
- }
175
- break;
176
- case float_2nan_prop_ab:
177
- if (is_nan(a_cls)) {
178
- return 0;
179
- } else {
180
- return 1;
181
- }
182
- break;
183
- case float_2nan_prop_ba:
184
- if (is_nan(b_cls)) {
185
- return 1;
186
- } else {
187
- return 0;
188
- }
189
- break;
190
- case float_2nan_prop_x87:
191
- /*
192
- * This implements x87 NaN propagation rules:
193
- * SNaN + QNaN => return the QNaN
194
- * two SNaNs => return the one with the larger significand, silenced
195
- * two QNaNs => return the one with the larger significand
196
- * SNaN and a non-NaN => return the SNaN, silenced
197
- * QNaN and a non-NaN => return the QNaN
198
- *
199
- * If we get down to comparing significands and they are the same,
200
- * return the NaN with the positive sign bit (if any).
201
- */
202
- if (is_snan(a_cls)) {
203
- if (is_snan(b_cls)) {
204
- return aIsLargerSignificand ? 0 : 1;
205
- }
206
- return is_qnan(b_cls) ? 1 : 0;
207
- } else if (is_qnan(a_cls)) {
208
- if (is_snan(b_cls) || !is_qnan(b_cls)) {
209
- return 0;
210
- } else {
211
- return aIsLargerSignificand ? 0 : 1;
212
- }
213
- } else {
214
- return 1;
215
- }
216
- default:
217
- g_assert_not_reached();
218
- }
86
-}
219
-}
87
-
220
-
88
/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
221
/*----------------------------------------------------------------------------
89
static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
222
| Returns 1 if the double-precision floating-point value `a' is a quiet
90
int rn, int rm, const GVecGen3 *gvec_op)
223
| NaN; otherwise returns 0.
91
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
92
gen_gvec_fn2i(s, is_q, rd, rn, shift,
93
is_u ? gen_gvec_usra : gen_gvec_ssra, size);
94
return;
95
+
96
case 0x08: /* SRI */
97
- /* Shift count same as element size is valid but does nothing. */
98
- if (shift == 8 << size) {
99
- goto done;
100
- }
101
- gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
102
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
103
return;
104
105
case 0x00: /* SSHR / USHR */
106
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
107
}
108
tcg_temp_free_i64(tcg_round);
109
110
- done:
111
clear_vec_high(s, is_q, rd);
112
}
113
114
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
115
}
116
117
if (insert) {
118
- gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
119
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
120
} else {
121
gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
122
}
123
diff --git a/target/arm/translate.c b/target/arm/translate.c
124
index XXXXXXX..XXXXXXX 100644
125
--- a/target/arm/translate.c
126
+++ b/target/arm/translate.c
127
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
128
129
static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
130
{
131
- if (sh == 0) {
132
- tcg_gen_mov_vec(d, a);
133
- } else {
134
- TCGv_vec t = tcg_temp_new_vec_matching(d);
135
- TCGv_vec m = tcg_temp_new_vec_matching(d);
136
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
137
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
138
139
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
140
- tcg_gen_shri_vec(vece, t, a, sh);
141
- tcg_gen_and_vec(vece, d, d, m);
142
- tcg_gen_or_vec(vece, d, d, t);
143
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
144
+ tcg_gen_shri_vec(vece, t, a, sh);
145
+ tcg_gen_and_vec(vece, d, d, m);
146
+ tcg_gen_or_vec(vece, d, d, t);
147
148
- tcg_temp_free_vec(t);
149
- tcg_temp_free_vec(m);
150
- }
151
+ tcg_temp_free_vec(t);
152
+ tcg_temp_free_vec(m);
153
}
154
155
-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
156
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
157
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
158
+{
159
+ static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
160
+ const GVecGen2i ops[4] = {
161
+ { .fni8 = gen_shr8_ins_i64,
162
+ .fniv = gen_shr_ins_vec,
163
+ .fno = gen_helper_gvec_sri_b,
164
+ .load_dest = true,
165
+ .opt_opc = vecop_list,
166
+ .vece = MO_8 },
167
+ { .fni8 = gen_shr16_ins_i64,
168
+ .fniv = gen_shr_ins_vec,
169
+ .fno = gen_helper_gvec_sri_h,
170
+ .load_dest = true,
171
+ .opt_opc = vecop_list,
172
+ .vece = MO_16 },
173
+ { .fni4 = gen_shr32_ins_i32,
174
+ .fniv = gen_shr_ins_vec,
175
+ .fno = gen_helper_gvec_sri_s,
176
+ .load_dest = true,
177
+ .opt_opc = vecop_list,
178
+ .vece = MO_32 },
179
+ { .fni8 = gen_shr64_ins_i64,
180
+ .fniv = gen_shr_ins_vec,
181
+ .fno = gen_helper_gvec_sri_d,
182
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
183
+ .load_dest = true,
184
+ .opt_opc = vecop_list,
185
+ .vece = MO_64 },
186
+ };
187
188
-const GVecGen2i sri_op[4] = {
189
- { .fni8 = gen_shr8_ins_i64,
190
- .fniv = gen_shr_ins_vec,
191
- .load_dest = true,
192
- .opt_opc = vecop_list_sri,
193
- .vece = MO_8 },
194
- { .fni8 = gen_shr16_ins_i64,
195
- .fniv = gen_shr_ins_vec,
196
- .load_dest = true,
197
- .opt_opc = vecop_list_sri,
198
- .vece = MO_16 },
199
- { .fni4 = gen_shr32_ins_i32,
200
- .fniv = gen_shr_ins_vec,
201
- .load_dest = true,
202
- .opt_opc = vecop_list_sri,
203
- .vece = MO_32 },
204
- { .fni8 = gen_shr64_ins_i64,
205
- .fniv = gen_shr_ins_vec,
206
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
207
- .load_dest = true,
208
- .opt_opc = vecop_list_sri,
209
- .vece = MO_64 },
210
-};
211
+ /* tszimm encoding produces immediates in the range [1..esize]. */
212
+ tcg_debug_assert(shift > 0);
213
+ tcg_debug_assert(shift <= (8 << vece));
214
+
215
+ /* Shift of esize leaves destination unchanged. */
216
+ if (shift < (8 << vece)) {
217
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
218
+ } else {
219
+ /* Nop, but we do need to clear the tail. */
220
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
221
+ }
222
+}
223
224
static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
225
{
226
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
227
228
static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
229
{
230
- if (sh == 0) {
231
- tcg_gen_mov_vec(d, a);
232
- } else {
233
- TCGv_vec t = tcg_temp_new_vec_matching(d);
234
- TCGv_vec m = tcg_temp_new_vec_matching(d);
235
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
236
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
237
238
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
239
- tcg_gen_shli_vec(vece, t, a, sh);
240
- tcg_gen_and_vec(vece, d, d, m);
241
- tcg_gen_or_vec(vece, d, d, t);
242
+ tcg_gen_shli_vec(vece, t, a, sh);
243
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
244
+ tcg_gen_and_vec(vece, d, d, m);
245
+ tcg_gen_or_vec(vece, d, d, t);
246
247
- tcg_temp_free_vec(t);
248
- tcg_temp_free_vec(m);
249
- }
250
+ tcg_temp_free_vec(t);
251
+ tcg_temp_free_vec(m);
252
}
253
254
-static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
255
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
256
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
257
+{
258
+ static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
259
+ const GVecGen2i ops[4] = {
260
+ { .fni8 = gen_shl8_ins_i64,
261
+ .fniv = gen_shl_ins_vec,
262
+ .fno = gen_helper_gvec_sli_b,
263
+ .load_dest = true,
264
+ .opt_opc = vecop_list,
265
+ .vece = MO_8 },
266
+ { .fni8 = gen_shl16_ins_i64,
267
+ .fniv = gen_shl_ins_vec,
268
+ .fno = gen_helper_gvec_sli_h,
269
+ .load_dest = true,
270
+ .opt_opc = vecop_list,
271
+ .vece = MO_16 },
272
+ { .fni4 = gen_shl32_ins_i32,
273
+ .fniv = gen_shl_ins_vec,
274
+ .fno = gen_helper_gvec_sli_s,
275
+ .load_dest = true,
276
+ .opt_opc = vecop_list,
277
+ .vece = MO_32 },
278
+ { .fni8 = gen_shl64_ins_i64,
279
+ .fniv = gen_shl_ins_vec,
280
+ .fno = gen_helper_gvec_sli_d,
281
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
282
+ .load_dest = true,
283
+ .opt_opc = vecop_list,
284
+ .vece = MO_64 },
285
+ };
286
287
-const GVecGen2i sli_op[4] = {
288
- { .fni8 = gen_shl8_ins_i64,
289
- .fniv = gen_shl_ins_vec,
290
- .load_dest = true,
291
- .opt_opc = vecop_list_sli,
292
- .vece = MO_8 },
293
- { .fni8 = gen_shl16_ins_i64,
294
- .fniv = gen_shl_ins_vec,
295
- .load_dest = true,
296
- .opt_opc = vecop_list_sli,
297
- .vece = MO_16 },
298
- { .fni4 = gen_shl32_ins_i32,
299
- .fniv = gen_shl_ins_vec,
300
- .load_dest = true,
301
- .opt_opc = vecop_list_sli,
302
- .vece = MO_32 },
303
- { .fni8 = gen_shl64_ins_i64,
304
- .fniv = gen_shl_ins_vec,
305
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
306
- .load_dest = true,
307
- .opt_opc = vecop_list_sli,
308
- .vece = MO_64 },
309
-};
310
+ /* tszimm encoding produces immediates in the range [0..esize-1]. */
311
+ tcg_debug_assert(shift >= 0);
312
+ tcg_debug_assert(shift < (8 << vece));
313
+
314
+ if (shift == 0) {
315
+ tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
316
+ } else {
317
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
318
+ }
319
+}
320
321
static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
322
{
323
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
324
}
325
/* Right shift comes here negative. */
326
shift = -shift;
327
- /* Shift out of range leaves destination unchanged. */
328
- if (shift < 8 << size) {
329
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
330
- shift, &sri_op[size]);
331
- }
332
+ gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
333
+ vec_size, vec_size);
334
return 0;
335
336
case 5: /* VSHL, VSLI */
337
if (u) { /* VSLI */
338
- /* Shift out of range leaves destination unchanged. */
339
- if (shift < 8 << size) {
340
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
341
- vec_size, shift, &sli_op[size]);
342
- }
343
+ gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
344
+ vec_size, vec_size);
345
} else { /* VSHL */
346
/* Shifts larger than the element size are
347
* architecturally valid and results in zero.
348
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
349
index XXXXXXX..XXXXXXX 100644
350
--- a/target/arm/vec_helper.c
351
+++ b/target/arm/vec_helper.c
352
@@ -XXX,XX +XXX,XX @@ DO_RSRA(gvec_ursra_d, uint64_t)
353
354
#undef DO_RSRA
355
356
+#define DO_SRI(NAME, TYPE) \
357
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
358
+{ \
359
+ intptr_t i, oprsz = simd_oprsz(desc); \
360
+ int shift = simd_data(desc); \
361
+ TYPE *d = vd, *n = vn; \
362
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
363
+ d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
364
+ } \
365
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
366
+}
367
+
368
+DO_SRI(gvec_sri_b, uint8_t)
369
+DO_SRI(gvec_sri_h, uint16_t)
370
+DO_SRI(gvec_sri_s, uint32_t)
371
+DO_SRI(gvec_sri_d, uint64_t)
372
+
373
+#undef DO_SRI
374
+
375
+#define DO_SLI(NAME, TYPE) \
376
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
377
+{ \
378
+ intptr_t i, oprsz = simd_oprsz(desc); \
379
+ int shift = simd_data(desc); \
380
+ TYPE *d = vd, *n = vn; \
381
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
382
+ d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
383
+ } \
384
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
385
+}
386
+
387
+DO_SLI(gvec_sli_b, uint8_t)
388
+DO_SLI(gvec_sli_h, uint16_t)
389
+DO_SLI(gvec_sli_s, uint32_t)
390
+DO_SLI(gvec_sli_d, uint64_t)
391
+
392
+#undef DO_SLI
393
+
394
/*
395
* Convert float16 to float32, raising no exceptions and
396
* preserving exceptional values, including SNaN.
397
--
224
--
398
2.20.1
225
2.34.1
399
226
400
227
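For reference, the per-element behaviour of the SRI/SLI helpers added above can be modelled outside QEMU in a few lines. This is only an illustrative stand-alone sketch of the 8-bit case; the names sri8/sli8 are invented here and are not part of the patch:

    /* sketch.c: scalar model of the SRI/SLI insert shifts (8-bit lanes) */
    #include <stdint.h>
    #include <stdio.h>

    /* SRI #sh: shift the source right and insert it into the low bits of
     * the destination, keeping the destination's top 'sh' bits. */
    static uint8_t sri8(uint8_t d, uint8_t n, int sh)
    {
        uint8_t mask = (sh == 8) ? 0 : (uint8_t)(0xffu >> sh);
        return (uint8_t)((d & ~mask) | ((uint8_t)(n >> sh) & mask));
    }

    /* SLI #sh: shift the source left and insert it into the high bits of
     * the destination, keeping the destination's low 'sh' bits. */
    static uint8_t sli8(uint8_t d, uint8_t n, int sh)
    {
        uint8_t mask = (uint8_t)(0xffu << sh);
        return (uint8_t)((d & ~mask) | ((uint8_t)(n << sh) & mask));
    }

    int main(void)
    {
        printf("SRI #2: %02x\n", sri8(0xaa, 0xf0, 2)); /* 0xbc */
        printf("SLI #4: %02x\n", sli8(0xaa, 0x0f, 4)); /* 0xfa */
        printf("SRI #8: %02x\n", sri8(0xaa, 0xf0, 8)); /* 0xaa: unchanged */
        return 0;
    }

A right shift equal to the element size inserts no source bits at all, which is why the expander treats that case as a move of the unchanged destination (it still has to clear the tail of the vector register).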
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
In 1dc8425e551, while converting to gvec, I added an extra range check
3
Remember if there was an SNaN, and use that to simplify
4
against the shift count. This was unnecessary because the encoding of
4
float_2nan_prop_s_{ab,ba} to only the SNaN component.
5
the shift count produces values in the range 0 to the element size - 1.
5
Then, fall through to the corresponding
6
float_2nan_prop_{ab,ba} case to handle any remaining
7
NaNs, which must be quiet.
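The old and new selections agree whenever at least one operand is a NaN, which is what the fall-through relies on. The stand-alone sketch below checks this exhaustively for the s_ab rule; the enum and the pick_old/pick_new names are invented for illustration and are not softfloat code:

    /* check.c: old vs simplified float_2nan_prop_s_ab selection */
    #include <stdbool.h>
    #include <stdio.h>

    enum { NOT_NAN, QNAN, SNAN };           /* stand-in operand classes */
    #define is_nan(c)  ((c) != NOT_NAN)
    #define is_snan(c) ((c) == SNAN)
    #define is_qnan(c) ((c) == QNAN)

    /* Original rule: prefer an SNaN (a first), then a QNaN (a first).
     * Returns 0 to pick a, 1 to pick b. */
    static int pick_old(int a, int b)
    {
        if (is_snan(a)) {
            return 0;
        } else if (is_snan(b)) {
            return 1;
        } else if (is_qnan(a)) {
            return 0;
        }
        return 1;
    }

    /* Simplified rule: with an SNaN present only the SNaN test matters;
     * otherwise fall through to the plain prop_ab rule. */
    static int pick_new(int a, int b, bool have_snan)
    {
        if (have_snan) {
            return is_snan(a) ? 0 : 1;
        }
        return is_nan(a) ? 0 : 1;
    }

    int main(void)
    {
        /* pick_nan() is only reached when at least one operand is a NaN */
        for (int a = NOT_NAN; a <= SNAN; a++) {
            for (int b = NOT_NAN; b <= SNAN; b++) {
                if (!is_nan(a) && !is_nan(b)) {
                    continue;
                }
                bool have_snan = is_snan(a) || is_snan(b);
                if (pick_old(a, b) != pick_new(a, b, have_snan)) {
                    printf("mismatch: a=%d b=%d\n", a, b);
                }
            }
        }
        printf("done\n");
        return 0;
    }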
6
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20241203203949.483774-10-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
target/arm/translate.c | 12 ++----------
14
fpu/softfloat-parts.c.inc | 32 ++++++++++++--------------------
13
1 file changed, 2 insertions(+), 10 deletions(-)
15
1 file changed, 12 insertions(+), 20 deletions(-)
14
16
15
diff --git a/target/arm/translate.c b/target/arm/translate.c
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.c
19
--- a/fpu/softfloat-parts.c.inc
18
+++ b/target/arm/translate.c
20
+++ b/fpu/softfloat-parts.c.inc
19
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
21
@@ -XXX,XX +XXX,XX @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s)
20
gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
22
static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
21
vec_size, vec_size);
23
float_status *s)
22
} else { /* VSHL */
24
{
23
- /* Shifts larger than the element size are
25
+ bool have_snan = false;
24
- * architecturally valid and results in zero.
26
int cmp, which;
25
- */
27
26
- if (shift >= 8 << size) {
28
if (is_snan(a->cls) || is_snan(b->cls)) {
27
- tcg_gen_gvec_dup_imm(size, rd_ofs,
29
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
28
- vec_size, vec_size, 0);
30
+ have_snan = true;
29
- } else {
31
}
30
- tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
32
31
- vec_size, vec_size);
33
if (s->default_nan_mode) {
32
- }
34
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
33
+ tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
35
34
+ vec_size, vec_size);
36
switch (s->float_2nan_prop_rule) {
35
}
37
case float_2nan_prop_s_ab:
36
return 0;
38
- if (is_snan(a->cls)) {
37
}
39
- which = 0;
40
- } else if (is_snan(b->cls)) {
41
- which = 1;
42
- } else if (is_qnan(a->cls)) {
43
- which = 0;
44
- } else {
45
- which = 1;
46
+ if (have_snan) {
47
+ which = is_snan(a->cls) ? 0 : 1;
48
+ break;
49
}
50
- break;
51
- case float_2nan_prop_s_ba:
52
- if (is_snan(b->cls)) {
53
- which = 1;
54
- } else if (is_snan(a->cls)) {
55
- which = 0;
56
- } else if (is_qnan(b->cls)) {
57
- which = 1;
58
- } else {
59
- which = 0;
60
- }
61
- break;
62
+ /* fall through */
63
case float_2nan_prop_ab:
64
which = is_nan(a->cls) ? 0 : 1;
65
break;
66
+ case float_2nan_prop_s_ba:
67
+ if (have_snan) {
68
+ which = is_snan(b->cls) ? 1 : 0;
69
+ break;
70
+ }
71
+ /* fall through */
72
case float_2nan_prop_ba:
73
which = is_nan(b->cls) ? 1 : 0;
74
break;
38
--
75
--
39
2.20.1
76
2.34.1
40
41
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Create vectorized versions of handle_shri_with_rndacc
3
Move the fractional comparison to the end of the
4
for shift+round and shift+round+accumulate. Add out-of-line
4
float_2nan_prop_x87 case. This is not required for
5
helpers in preparation for longer vector lengths from SVE.
5
any other 2nan propagation rule. Reorganize the
6
x87 case itself to break out of the switch when the
7
fractional comparison is not required.
6
8
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20241203203949.483774-11-richard.henderson@linaro.org
9
Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
target/arm/helper.h | 20 ++
14
fpu/softfloat-parts.c.inc | 19 +++++++++----------
13
target/arm/translate.h | 9 +
15
1 file changed, 9 insertions(+), 10 deletions(-)
14
target/arm/translate-a64.c | 11 +-
15
target/arm/translate.c | 463 +++++++++++++++++++++++++++++++++++--
16
target/arm/vec_helper.c | 50 ++++
17
5 files changed, 527 insertions(+), 26 deletions(-)
18
16
19
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
20
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/helper.h
19
--- a/fpu/softfloat-parts.c.inc
22
+++ b/target/arm/helper.h
20
+++ b/fpu/softfloat-parts.c.inc
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
24
DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
22
return a;
25
DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
23
}
26
24
27
+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
- cmp = frac_cmp(a, b);
28
+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
- if (cmp == 0) {
29
+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
27
- cmp = a->sign < b->sign;
30
+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
28
- }
31
+
29
-
32
+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
switch (s->float_2nan_prop_rule) {
33
+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
31
case float_2nan_prop_s_ab:
34
+DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
32
if (have_snan) {
35
+DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
33
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
36
+
34
* return the NaN with the positive sign bit (if any).
37
+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
35
*/
38
+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
36
if (is_snan(a->cls)) {
39
+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
37
- if (is_snan(b->cls)) {
40
+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
38
- which = cmp > 0 ? 0 : 1;
41
+
39
- } else {
42
+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
40
+ if (!is_snan(b->cls)) {
43
+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
41
which = is_qnan(b->cls) ? 1 : 0;
44
+DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
42
+ break;
45
+DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
43
}
46
+
44
} else if (is_qnan(a->cls)) {
47
#ifdef TARGET_AARCH64
45
if (is_snan(b->cls) || !is_qnan(b->cls)) {
48
#include "helper-a64.h"
46
which = 0;
49
#include "helper-sve.h"
47
- } else {
50
diff --git a/target/arm/translate.h b/target/arm/translate.h
48
- which = cmp > 0 ? 0 : 1;
51
index XXXXXXX..XXXXXXX 100644
49
+ break;
52
--- a/target/arm/translate.h
50
}
53
+++ b/target/arm/translate.h
51
} else {
54
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
52
which = 1;
55
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
53
+ break;
56
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
54
}
57
55
+ cmp = frac_cmp(a, b);
58
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
56
+ if (cmp == 0) {
59
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
57
+ cmp = a->sign < b->sign;
60
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
58
+ }
61
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
59
+ which = cmp > 0 ? 0 : 1;
62
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
60
break;
63
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
64
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
65
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz);
66
+
67
/*
68
* Forward to the isar_feature_* tests given a DisasContext pointer.
69
*/
70
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-a64.c
73
+++ b/target/arm/translate-a64.c
74
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
75
return;
76
77
case 0x04: /* SRSHR / URSHR (rounding) */
78
- break;
79
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
80
+ is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
81
+ return;
82
+
83
case 0x06: /* SRSRA / URSRA (accum + rounding) */
84
- accumulate = true;
85
- break;
86
+ gen_gvec_fn2i(s, is_q, rd, rn, shift,
87
+ is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
88
+ return;
89
+
90
default:
61
default:
91
g_assert_not_reached();
62
g_assert_not_reached();
92
}
93
diff --git a/target/arm/translate.c b/target/arm/translate.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/translate.c
96
+++ b/target/arm/translate.c
97
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
98
}
99
}
100
101
+/*
102
+ * Shift one less than the requested amount, and the low bit is
103
+ * the rounding bit. For the 8 and 16-bit operations, because we
104
+ * mask the low bit, we can perform a normal integer shift instead
105
+ * of a vector shift.
106
+ */
107
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
108
+{
109
+ TCGv_i64 t = tcg_temp_new_i64();
110
+
111
+ tcg_gen_shri_i64(t, a, sh - 1);
112
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
113
+ tcg_gen_vec_sar8i_i64(d, a, sh);
114
+ tcg_gen_vec_add8_i64(d, d, t);
115
+ tcg_temp_free_i64(t);
116
+}
117
+
118
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
119
+{
120
+ TCGv_i64 t = tcg_temp_new_i64();
121
+
122
+ tcg_gen_shri_i64(t, a, sh - 1);
123
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
124
+ tcg_gen_vec_sar16i_i64(d, a, sh);
125
+ tcg_gen_vec_add16_i64(d, d, t);
126
+ tcg_temp_free_i64(t);
127
+}
128
+
129
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
130
+{
131
+ TCGv_i32 t = tcg_temp_new_i32();
132
+
133
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
134
+ tcg_gen_sari_i32(d, a, sh);
135
+ tcg_gen_add_i32(d, d, t);
136
+ tcg_temp_free_i32(t);
137
+}
138
+
139
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
140
+{
141
+ TCGv_i64 t = tcg_temp_new_i64();
142
+
143
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
144
+ tcg_gen_sari_i64(d, a, sh);
145
+ tcg_gen_add_i64(d, d, t);
146
+ tcg_temp_free_i64(t);
147
+}
148
+
149
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
150
+{
151
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
152
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
153
+
154
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
155
+ tcg_gen_dupi_vec(vece, ones, 1);
156
+ tcg_gen_and_vec(vece, t, t, ones);
157
+ tcg_gen_sari_vec(vece, d, a, sh);
158
+ tcg_gen_add_vec(vece, d, d, t);
159
+
160
+ tcg_temp_free_vec(t);
161
+ tcg_temp_free_vec(ones);
162
+}
163
+
164
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
165
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
166
+{
167
+ static const TCGOpcode vecop_list[] = {
168
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
169
+ };
170
+ static const GVecGen2i ops[4] = {
171
+ { .fni8 = gen_srshr8_i64,
172
+ .fniv = gen_srshr_vec,
173
+ .fno = gen_helper_gvec_srshr_b,
174
+ .opt_opc = vecop_list,
175
+ .vece = MO_8 },
176
+ { .fni8 = gen_srshr16_i64,
177
+ .fniv = gen_srshr_vec,
178
+ .fno = gen_helper_gvec_srshr_h,
179
+ .opt_opc = vecop_list,
180
+ .vece = MO_16 },
181
+ { .fni4 = gen_srshr32_i32,
182
+ .fniv = gen_srshr_vec,
183
+ .fno = gen_helper_gvec_srshr_s,
184
+ .opt_opc = vecop_list,
185
+ .vece = MO_32 },
186
+ { .fni8 = gen_srshr64_i64,
187
+ .fniv = gen_srshr_vec,
188
+ .fno = gen_helper_gvec_srshr_d,
189
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
190
+ .opt_opc = vecop_list,
191
+ .vece = MO_64 },
192
+ };
193
+
194
+ /* tszimm encoding produces immediates in the range [1..esize] */
195
+ tcg_debug_assert(shift > 0);
196
+ tcg_debug_assert(shift <= (8 << vece));
197
+
198
+ if (shift == (8 << vece)) {
199
+ /*
200
+ * Shifts larger than the element size are architecturally valid.
201
+ * Signed results in all sign bits. With rounding, this produces
202
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
203
+ * I.e. always zero.
204
+ */
205
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
206
+ } else {
207
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
208
+ }
209
+}
210
+
211
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
212
+{
213
+ TCGv_i64 t = tcg_temp_new_i64();
214
+
215
+ gen_srshr8_i64(t, a, sh);
216
+ tcg_gen_vec_add8_i64(d, d, t);
217
+ tcg_temp_free_i64(t);
218
+}
219
+
220
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
221
+{
222
+ TCGv_i64 t = tcg_temp_new_i64();
223
+
224
+ gen_srshr16_i64(t, a, sh);
225
+ tcg_gen_vec_add16_i64(d, d, t);
226
+ tcg_temp_free_i64(t);
227
+}
228
+
229
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
230
+{
231
+ TCGv_i32 t = tcg_temp_new_i32();
232
+
233
+ gen_srshr32_i32(t, a, sh);
234
+ tcg_gen_add_i32(d, d, t);
235
+ tcg_temp_free_i32(t);
236
+}
237
+
238
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
239
+{
240
+ TCGv_i64 t = tcg_temp_new_i64();
241
+
242
+ gen_srshr64_i64(t, a, sh);
243
+ tcg_gen_add_i64(d, d, t);
244
+ tcg_temp_free_i64(t);
245
+}
246
+
247
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
248
+{
249
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
250
+
251
+ gen_srshr_vec(vece, t, a, sh);
252
+ tcg_gen_add_vec(vece, d, d, t);
253
+ tcg_temp_free_vec(t);
254
+}
255
+
256
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
257
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
258
+{
259
+ static const TCGOpcode vecop_list[] = {
260
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
261
+ };
262
+ static const GVecGen2i ops[4] = {
263
+ { .fni8 = gen_srsra8_i64,
264
+ .fniv = gen_srsra_vec,
265
+ .fno = gen_helper_gvec_srsra_b,
266
+ .opt_opc = vecop_list,
267
+ .load_dest = true,
268
+ .vece = MO_8 },
269
+ { .fni8 = gen_srsra16_i64,
270
+ .fniv = gen_srsra_vec,
271
+ .fno = gen_helper_gvec_srsra_h,
272
+ .opt_opc = vecop_list,
273
+ .load_dest = true,
274
+ .vece = MO_16 },
275
+ { .fni4 = gen_srsra32_i32,
276
+ .fniv = gen_srsra_vec,
277
+ .fno = gen_helper_gvec_srsra_s,
278
+ .opt_opc = vecop_list,
279
+ .load_dest = true,
280
+ .vece = MO_32 },
281
+ { .fni8 = gen_srsra64_i64,
282
+ .fniv = gen_srsra_vec,
283
+ .fno = gen_helper_gvec_srsra_d,
284
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
285
+ .opt_opc = vecop_list,
286
+ .load_dest = true,
287
+ .vece = MO_64 },
288
+ };
289
+
290
+ /* tszimm encoding produces immediates in the range [1..esize] */
291
+ tcg_debug_assert(shift > 0);
292
+ tcg_debug_assert(shift <= (8 << vece));
293
+
294
+ /*
295
+ * Shifts larger than the element size are architecturally valid.
296
+ * Signed results in all sign bits. With rounding, this produces
297
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
298
+ * I.e. always zero. With accumulation, this leaves D unchanged.
299
+ */
300
+ if (shift == (8 << vece)) {
301
+ /* Nop, but we do need to clear the tail. */
302
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
303
+ } else {
304
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
305
+ }
306
+}
307
+
308
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
309
+{
310
+ TCGv_i64 t = tcg_temp_new_i64();
311
+
312
+ tcg_gen_shri_i64(t, a, sh - 1);
313
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
314
+ tcg_gen_vec_shr8i_i64(d, a, sh);
315
+ tcg_gen_vec_add8_i64(d, d, t);
316
+ tcg_temp_free_i64(t);
317
+}
318
+
319
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
320
+{
321
+ TCGv_i64 t = tcg_temp_new_i64();
322
+
323
+ tcg_gen_shri_i64(t, a, sh - 1);
324
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
325
+ tcg_gen_vec_shr16i_i64(d, a, sh);
326
+ tcg_gen_vec_add16_i64(d, d, t);
327
+ tcg_temp_free_i64(t);
328
+}
329
+
330
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
331
+{
332
+ TCGv_i32 t = tcg_temp_new_i32();
333
+
334
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
335
+ tcg_gen_shri_i32(d, a, sh);
336
+ tcg_gen_add_i32(d, d, t);
337
+ tcg_temp_free_i32(t);
338
+}
339
+
340
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
341
+{
342
+ TCGv_i64 t = tcg_temp_new_i64();
343
+
344
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
345
+ tcg_gen_shri_i64(d, a, sh);
346
+ tcg_gen_add_i64(d, d, t);
347
+ tcg_temp_free_i64(t);
348
+}
349
+
350
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
351
+{
352
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
353
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
354
+
355
+ tcg_gen_shri_vec(vece, t, a, shift - 1);
356
+ tcg_gen_dupi_vec(vece, ones, 1);
357
+ tcg_gen_and_vec(vece, t, t, ones);
358
+ tcg_gen_shri_vec(vece, d, a, shift);
359
+ tcg_gen_add_vec(vece, d, d, t);
360
+
361
+ tcg_temp_free_vec(t);
362
+ tcg_temp_free_vec(ones);
363
+}
364
+
365
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
366
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
367
+{
368
+ static const TCGOpcode vecop_list[] = {
369
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
370
+ };
371
+ static const GVecGen2i ops[4] = {
372
+ { .fni8 = gen_urshr8_i64,
373
+ .fniv = gen_urshr_vec,
374
+ .fno = gen_helper_gvec_urshr_b,
375
+ .opt_opc = vecop_list,
376
+ .vece = MO_8 },
377
+ { .fni8 = gen_urshr16_i64,
378
+ .fniv = gen_urshr_vec,
379
+ .fno = gen_helper_gvec_urshr_h,
380
+ .opt_opc = vecop_list,
381
+ .vece = MO_16 },
382
+ { .fni4 = gen_urshr32_i32,
383
+ .fniv = gen_urshr_vec,
384
+ .fno = gen_helper_gvec_urshr_s,
385
+ .opt_opc = vecop_list,
386
+ .vece = MO_32 },
387
+ { .fni8 = gen_urshr64_i64,
388
+ .fniv = gen_urshr_vec,
389
+ .fno = gen_helper_gvec_urshr_d,
390
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
391
+ .opt_opc = vecop_list,
392
+ .vece = MO_64 },
393
+ };
394
+
395
+ /* tszimm encoding produces immediates in the range [1..esize] */
396
+ tcg_debug_assert(shift > 0);
397
+ tcg_debug_assert(shift <= (8 << vece));
398
+
399
+ if (shift == (8 << vece)) {
400
+ /*
401
+ * Shifts larger than the element size are architecturally valid.
402
+ * Unsigned results in zero. With rounding, this produces a
403
+ * copy of the most significant bit.
404
+ */
405
+ tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
406
+ } else {
407
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
408
+ }
409
+}
410
+
411
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
412
+{
413
+ TCGv_i64 t = tcg_temp_new_i64();
414
+
415
+ if (sh == 8) {
416
+ tcg_gen_vec_shr8i_i64(t, a, 7);
417
+ } else {
418
+ gen_urshr8_i64(t, a, sh);
419
+ }
420
+ tcg_gen_vec_add8_i64(d, d, t);
421
+ tcg_temp_free_i64(t);
422
+}
423
+
424
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
425
+{
426
+ TCGv_i64 t = tcg_temp_new_i64();
427
+
428
+ if (sh == 16) {
429
+ tcg_gen_vec_shr16i_i64(t, a, 15);
430
+ } else {
431
+ gen_urshr16_i64(t, a, sh);
432
+ }
433
+ tcg_gen_vec_add16_i64(d, d, t);
434
+ tcg_temp_free_i64(t);
435
+}
436
+
437
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
438
+{
439
+ TCGv_i32 t = tcg_temp_new_i32();
440
+
441
+ if (sh == 32) {
442
+ tcg_gen_shri_i32(t, a, 31);
443
+ } else {
444
+ gen_urshr32_i32(t, a, sh);
445
+ }
446
+ tcg_gen_add_i32(d, d, t);
447
+ tcg_temp_free_i32(t);
448
+}
449
+
450
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
451
+{
452
+ TCGv_i64 t = tcg_temp_new_i64();
453
+
454
+ if (sh == 64) {
455
+ tcg_gen_shri_i64(t, a, 63);
456
+ } else {
457
+ gen_urshr64_i64(t, a, sh);
458
+ }
459
+ tcg_gen_add_i64(d, d, t);
460
+ tcg_temp_free_i64(t);
461
+}
462
+
463
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
464
+{
465
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
466
+
467
+ if (sh == (8 << vece)) {
468
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
469
+ } else {
470
+ gen_urshr_vec(vece, t, a, sh);
471
+ }
472
+ tcg_gen_add_vec(vece, d, d, t);
473
+ tcg_temp_free_vec(t);
474
+}
475
+
476
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
477
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
478
+{
479
+ static const TCGOpcode vecop_list[] = {
480
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
481
+ };
482
+ static const GVecGen2i ops[4] = {
483
+ { .fni8 = gen_ursra8_i64,
484
+ .fniv = gen_ursra_vec,
485
+ .fno = gen_helper_gvec_ursra_b,
486
+ .opt_opc = vecop_list,
487
+ .load_dest = true,
488
+ .vece = MO_8 },
489
+ { .fni8 = gen_ursra16_i64,
490
+ .fniv = gen_ursra_vec,
491
+ .fno = gen_helper_gvec_ursra_h,
492
+ .opt_opc = vecop_list,
493
+ .load_dest = true,
494
+ .vece = MO_16 },
495
+ { .fni4 = gen_ursra32_i32,
496
+ .fniv = gen_ursra_vec,
497
+ .fno = gen_helper_gvec_ursra_s,
498
+ .opt_opc = vecop_list,
499
+ .load_dest = true,
500
+ .vece = MO_32 },
501
+ { .fni8 = gen_ursra64_i64,
502
+ .fniv = gen_ursra_vec,
503
+ .fno = gen_helper_gvec_ursra_d,
504
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
505
+ .opt_opc = vecop_list,
506
+ .load_dest = true,
507
+ .vece = MO_64 },
508
+ };
509
+
510
+ /* tszimm encoding produces immediates in the range [1..esize] */
511
+ tcg_debug_assert(shift > 0);
512
+ tcg_debug_assert(shift <= (8 << vece));
513
+
514
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
515
+}
516
+
517
static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
518
{
519
uint64_t mask = dup_const(MO_8, 0xff >> shift);
520
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
521
}
522
return 0;
523
524
+ case 2: /* VRSHR */
525
+ /* Right shift comes here negative. */
526
+ shift = -shift;
527
+ if (u) {
528
+ gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
529
+ vec_size, vec_size);
530
+ } else {
531
+ gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
532
+ vec_size, vec_size);
533
+ }
534
+ return 0;
535
+
536
+ case 3: /* VRSRA */
537
+ /* Right shift comes here negative. */
538
+ shift = -shift;
539
+ if (u) {
540
+ gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
541
+ vec_size, vec_size);
542
+ } else {
543
+ gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
544
+ vec_size, vec_size);
545
+ }
546
+ return 0;
547
+
548
case 4: /* VSRI */
549
if (!u) {
550
return 1;
551
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
552
neon_load_reg64(cpu_V0, rm + pass);
553
tcg_gen_movi_i64(cpu_V1, imm);
554
switch (op) {
555
- case 2: /* VRSHR */
556
- case 3: /* VRSRA */
557
- if (u)
558
- gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
559
- else
560
- gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
561
- break;
562
case 6: /* VQSHLU */
563
gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
564
cpu_V0, cpu_V1);
565
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
566
default:
567
g_assert_not_reached();
568
}
569
- if (op == 3) {
570
- /* Accumulate. */
571
- neon_load_reg64(cpu_V1, rd + pass);
572
- tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
573
- }
574
neon_store_reg64(cpu_V0, rd + pass);
575
} else { /* size < 3 */
576
/* Operands in T0 and T1. */
577
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
578
tmp2 = tcg_temp_new_i32();
579
tcg_gen_movi_i32(tmp2, imm);
580
switch (op) {
581
- case 2: /* VRSHR */
582
- case 3: /* VRSRA */
583
- GEN_NEON_INTEGER_OP(rshl);
584
- break;
585
case 6: /* VQSHLU */
586
switch (size) {
587
case 0:
588
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
589
g_assert_not_reached();
590
}
591
tcg_temp_free_i32(tmp2);
592
-
593
- if (op == 3) {
594
- /* Accumulate. */
595
- tmp2 = neon_load_reg(rd, pass);
596
- gen_neon_add(size, tmp, tmp2);
597
- tcg_temp_free_i32(tmp2);
598
- }
599
neon_store_reg(rd, pass, tmp);
600
}
601
} /* for pass */
602
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
603
index XXXXXXX..XXXXXXX 100644
604
--- a/target/arm/vec_helper.c
605
+++ b/target/arm/vec_helper.c
606
@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
607
608
#undef DO_SRA
609
610
+#define DO_RSHR(NAME, TYPE) \
611
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
612
+{ \
613
+ intptr_t i, oprsz = simd_oprsz(desc); \
614
+ int shift = simd_data(desc); \
615
+ TYPE *d = vd, *n = vn; \
616
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
617
+ TYPE tmp = n[i] >> (shift - 1); \
618
+ d[i] = (tmp >> 1) + (tmp & 1); \
619
+ } \
620
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
621
+}
622
+
623
+DO_RSHR(gvec_srshr_b, int8_t)
624
+DO_RSHR(gvec_srshr_h, int16_t)
625
+DO_RSHR(gvec_srshr_s, int32_t)
626
+DO_RSHR(gvec_srshr_d, int64_t)
627
+
628
+DO_RSHR(gvec_urshr_b, uint8_t)
629
+DO_RSHR(gvec_urshr_h, uint16_t)
630
+DO_RSHR(gvec_urshr_s, uint32_t)
631
+DO_RSHR(gvec_urshr_d, uint64_t)
632
+
633
+#undef DO_RSHR
634
+
635
+#define DO_RSRA(NAME, TYPE) \
636
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
637
+{ \
638
+ intptr_t i, oprsz = simd_oprsz(desc); \
639
+ int shift = simd_data(desc); \
640
+ TYPE *d = vd, *n = vn; \
641
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
642
+ TYPE tmp = n[i] >> (shift - 1); \
643
+ d[i] += (tmp >> 1) + (tmp & 1); \
644
+ } \
645
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
646
+}
647
+
648
+DO_RSRA(gvec_srsra_b, int8_t)
649
+DO_RSRA(gvec_srsra_h, int16_t)
650
+DO_RSRA(gvec_srsra_s, int32_t)
651
+DO_RSRA(gvec_srsra_d, int64_t)
652
+
653
+DO_RSRA(gvec_ursra_b, uint8_t)
654
+DO_RSRA(gvec_ursra_h, uint16_t)
655
+DO_RSRA(gvec_ursra_s, uint32_t)
656
+DO_RSRA(gvec_ursra_d, uint64_t)
657
+
658
+#undef DO_RSRA
659
+
660
/*
661
* Convert float16 to float32, raising no exceptions and
662
* preserving exceptional values, including SNaN.
663
--
63
--
664
2.20.1
64
2.34.1
665
666
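The rounding helpers added above all use the same identity: shift by one less than requested so the rounding bit lands in bit 0, then finish the shift and add that bit back in, avoiding a wider intermediate. A minimal stand-alone check of that identity for the unsigned 8-bit case (the function names are made up for this sketch and do not appear in the patch):

    /* rshr_check.c: rounding right shift, reference vs low-bit form */
    #include <stdint.h>
    #include <stdio.h>

    /* Reference: add half of the discarded weight, then shift.  The
     * unsigned int intermediate keeps the addition from overflowing. */
    static uint8_t urshr8_ref(uint8_t x, int sh)
    {
        return (uint8_t)(((unsigned)x + (1u << (sh - 1))) >> sh);
    }

    /* Form used by the helpers: shift by sh-1, keep bit 0 as the
     * rounding bit, finish the shift and add the bit back. */
    static uint8_t urshr8_new(uint8_t x, int sh)
    {
        uint8_t t = (uint8_t)(x >> (sh - 1));
        return (uint8_t)((t >> 1) + (t & 1));
    }

    int main(void)
    {
        /* the tszimm encoding allows shifts of 1..esize */
        for (int sh = 1; sh <= 8; sh++) {
            for (unsigned x = 0; x < 256; x++) {
                if (urshr8_ref((uint8_t)x, sh) != urshr8_new((uint8_t)x, sh)) {
                    printf("mismatch: x=%u sh=%d\n", x, sh);
                }
            }
        }
        printf("done\n");
        return 0;
    }

For a shift equal to the element size both forms reduce to a copy of the most significant bit, which matches the special case commented in gen_gvec_urshr().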
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Now that we've converted all cases to gvec, there is quite a bit
3
Replace the "index" selecting between A and B with a result variable
4
of dead code at the end of the function. Remove it.
4
of the proper type. This improves clarity within the function.
5
5
6
Sink the call to gen_gvec_fn2i to the end, loading a function
7
pointer within the switch statement.
8
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20241203203949.483774-12-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
10
---
14
target/arm/translate-a64.c | 56 ++++++++++----------------------------
11
fpu/softfloat-parts.c.inc | 28 +++++++++++++---------------
15
1 file changed, 14 insertions(+), 42 deletions(-)
12
1 file changed, 13 insertions(+), 15 deletions(-)
16
13
17
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
14
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
18
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/translate-a64.c
16
--- a/fpu/softfloat-parts.c.inc
20
+++ b/target/arm/translate-a64.c
17
+++ b/fpu/softfloat-parts.c.inc
21
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
18
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
22
int size = 32 - clz32(immh) - 1;
19
float_status *s)
23
int immhb = immh << 3 | immb;
20
{
24
int shift = 2 * (8 << size) - immhb;
21
bool have_snan = false;
25
- bool accumulate = false;
22
- int cmp, which;
26
- int dsize = is_q ? 128 : 64;
23
+ FloatPartsN *ret;
27
- int esize = 8 << size;
24
+ int cmp;
28
- int elements = dsize/esize;
25
29
- MemOp memop = size | (is_u ? 0 : MO_SIGN);
26
if (is_snan(a->cls) || is_snan(b->cls)) {
30
- TCGv_i64 tcg_rn = new_tmp_a64(s);
27
float_raise(float_flag_invalid | float_flag_invalid_snan, s);
31
- TCGv_i64 tcg_rd = new_tmp_a64(s);
28
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
32
- TCGv_i64 tcg_round;
29
switch (s->float_2nan_prop_rule) {
33
- uint64_t round_const;
30
case float_2nan_prop_s_ab:
34
- int i;
31
if (have_snan) {
35
+ GVecGen2iFn *gvec_fn;
32
- which = is_snan(a->cls) ? 0 : 1;
36
33
+ ret = is_snan(a->cls) ? a : b;
37
if (extract32(immh, 3, 1) && !is_q) {
34
break;
38
unallocated_encoding(s);
35
}
39
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
36
/* fall through */
40
37
case float_2nan_prop_ab:
41
switch (opcode) {
38
- which = is_nan(a->cls) ? 0 : 1;
42
case 0x02: /* SSRA / USRA (accumulate) */
39
+ ret = is_nan(a->cls) ? a : b;
43
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
40
break;
44
- is_u ? gen_gvec_usra : gen_gvec_ssra, size);
41
case float_2nan_prop_s_ba:
45
- return;
42
if (have_snan) {
46
+ gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
43
- which = is_snan(b->cls) ? 1 : 0;
47
+ break;
44
+ ret = is_snan(b->cls) ? b : a;
48
45
break;
49
case 0x08: /* SRI */
46
}
50
- gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
47
/* fall through */
51
- return;
48
case float_2nan_prop_ba:
52
+ gvec_fn = gen_gvec_sri;
49
- which = is_nan(b->cls) ? 1 : 0;
53
+ break;
50
+ ret = is_nan(b->cls) ? b : a;
54
51
break;
55
case 0x00: /* SSHR / USHR */
52
case float_2nan_prop_x87:
56
if (is_u) {
53
/*
57
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
54
@@ -XXX,XX +XXX,XX @@ static FloatPartsN *partsN(pick_nan)(FloatPartsN *a, FloatPartsN *b,
58
/* Shift count the same size as element size produces zero. */
55
*/
59
tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
56
if (is_snan(a->cls)) {
60
is_q ? 16 : 8, vec_full_reg_size(s), 0);
57
if (!is_snan(b->cls)) {
61
- } else {
58
- which = is_qnan(b->cls) ? 1 : 0;
62
- gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
59
+ ret = is_qnan(b->cls) ? b : a;
63
+ return;
60
break;
64
}
61
}
65
+ gvec_fn = tcg_gen_gvec_shri;
62
} else if (is_qnan(a->cls)) {
63
if (is_snan(b->cls) || !is_qnan(b->cls)) {
64
- which = 0;
65
+ ret = a;
66
break;
67
}
66
} else {
68
} else {
67
/* Shift count the same size as element size produces all sign. */
69
- which = 1;
68
if (shift == 8 << size) {
70
+ ret = b;
69
shift -= 1;
71
break;
70
}
71
- gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
72
+ gvec_fn = tcg_gen_gvec_sari;
73
}
72
}
74
- return;
73
cmp = frac_cmp(a, b);
75
+ break;
74
if (cmp == 0) {
76
75
cmp = a->sign < b->sign;
77
case 0x04: /* SRSHR / URSHR (rounding) */
76
}
78
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
77
- which = cmp > 0 ? 0 : 1;
79
- is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
78
+ ret = cmp > 0 ? a : b;
80
- return;
79
break;
81
+ gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
82
+ break;
83
84
case 0x06: /* SRSRA / URSRA (accum + rounding) */
85
- gen_gvec_fn2i(s, is_q, rd, rn, shift,
86
- is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
87
- return;
88
+ gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
89
+ break;
90
91
default:
80
default:
92
g_assert_not_reached();
81
g_assert_not_reached();
93
}
82
}
94
83
95
- round_const = 1ULL << (shift - 1);
84
- if (which) {
96
- tcg_round = tcg_const_i64(round_const);
85
- a = b;
97
-
86
+ if (is_snan(ret->cls)) {
98
- for (i = 0; i < elements; i++) {
87
+ parts_silence_nan(ret, s);
99
- read_vec_element(s, tcg_rn, rn, i, memop);
88
}
100
- if (accumulate) {
89
- if (is_snan(a->cls)) {
101
- read_vec_element(s, tcg_rd, rd, i, memop);
90
- parts_silence_nan(a, s);
102
- }
103
-
104
- handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
105
- accumulate, is_u, size, shift);
106
-
107
- write_vec_element(s, tcg_rd, rd, i, size);
108
- }
91
- }
109
- tcg_temp_free_i64(tcg_round);
92
- return a;
110
-
93
+ return ret;
111
- clear_vec_high(s, is_q, rd);
112
+ gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
113
}
94
}
114
95
115
/* SHL/SLI - Vector shift left */
96
static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
116
--
97
--
117
2.20.1
98
2.34.1
118
99
119
100
diff view generated by jsdifflib
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Leif Lindholm <quic_llindhol@quicinc.com>
2
2
3
Add APEI/GHES detailed design document
3
I'm migrating to Qualcomm's new open source email infrastructure, so
4
update my email address, and update the mailmap to match.
4
5
5
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
6
Signed-off-by: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
6
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
7
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
7
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
8
Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
8
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
9
Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
10
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Message-id: 20241205114047.1125842-1-leif.lindholm@oss.qualcomm.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
14
MAINTAINERS | 2 +-
13
docs/specs/index.rst | 1 +
15
.mailmap | 5 +++--
14
2 files changed, 111 insertions(+)
16
2 files changed, 4 insertions(+), 3 deletions(-)
15
create mode 100644 docs/specs/acpi_hest_ghes.rst
16
17
17
diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
18
diff --git a/MAINTAINERS b/MAINTAINERS
18
new file mode 100644
19
index XXXXXXX..XXXXXXX
20
--- /dev/null
21
+++ b/docs/specs/acpi_hest_ghes.rst
22
@@ -XXX,XX +XXX,XX @@
23
+APEI tables generating and CPER record
24
+======================================
25
+
26
+..
27
+ Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
28
+
29
+ This work is licensed under the terms of the GNU GPL, version 2 or later.
30
+ See the COPYING file in the top-level directory.
31
+
32
+Design Details
33
+--------------
34
+
35
+::
36
+
37
+ etc/acpi/tables etc/hardware_errors
38
+ ==================== ===============================
39
+ + +--------------------------+ +----------------------------+
40
+ | | HEST | +--------->| error_block_address1 |------+
41
+ | +--------------------------+ | +----------------------------+ |
42
+ | | GHES1 | | +------->| error_block_address2 |------+-+
43
+ | +--------------------------+ | | +----------------------------+ | |
44
+ | | ................. | | | | .............. | | |
45
+ | | error_status_address-----+-+ | -----------------------------+ | |
46
+ | | ................. | | +--->| error_block_addressN |------+-+---+
47
+ | | read_ack_register--------+-+ | | +----------------------------+ | | |
48
+ | | read_ack_preserve | +-+---+--->| read_ack_register1 | | | |
49
+ | | read_ack_write | | | +----------------------------+ | | |
50
+ + +--------------------------+ | +-+--->| read_ack_register2 | | | |
51
+ | | GHES2 | | | | +----------------------------+ | | |
52
+ + +--------------------------+ | | | | ............. | | | |
53
+ | | ................. | | | | +----------------------------+ | | |
54
+ | | error_status_address-----+---+ | | +->| read_ack_registerN | | | |
55
+ | | ................. | | | | +----------------------------+ | | |
56
+ | | read_ack_register--------+-----+ | | |Generic Error Status Block 1|<-----+ | |
57
+ | | read_ack_preserve | | | |-+------------------------+-+ | |
58
+ | | read_ack_write | | | | | CPER | | | |
59
+ + +--------------------------| | | | | CPER | | | |
60
+ | | ............... | | | | | .... | | | |
61
+ + +--------------------------+ | | | | CPER | | | |
62
+ | | GHESN | | | |-+------------------------+-| | |
63
+ + +--------------------------+ | | |Generic Error Status Block 2|<-------+ |
64
+ | | ................. | | | |-+------------------------+-+ |
65
+ | | error_status_address-----+-------+ | | | CPER | | |
66
+ | | ................. | | | | CPER | | |
67
+ | | read_ack_register--------+---------+ | | .... | | |
68
+ | | read_ack_preserve | | | CPER | | |
69
+ | | read_ack_write | +-+------------------------+-+ |
70
+ + +--------------------------+ | .......... | |
71
+ |----------------------------+ |
72
+ |Generic Error Status Block N |<----------+
73
+ |-+-------------------------+-+
74
+ | | CPER | |
75
+ | | CPER | |
76
+ | | .... | |
77
+ | | CPER | |
78
+ +-+-------------------------+-+
79
+
80
+
81
+(1) QEMU generates the ACPI HEST table. This table goes in the current
82
+ "etc/acpi/tables" fw_cfg blob. Each error source has different
83
+ notification types.
84
+
85
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
86
+ also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
87
+ contains an address registers table and an Error Status Data Block table.
88
+
89
+(3) The address registers table contains N Error Block Address entries
90
+ and N Read Ack Register entries. The size of each entry is 8 bytes.
91
+ The Error Status Data Block table contains N Error Status Data Block
92
+ entries. The size for each entry is 4096(0x1000) bytes. The total size
93
+ for the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
94
+ N is the number of the kinds of hardware error sources.
95
+
96
+(4) QEMU generates the ACPI linker/loader script for the firmware. The
97
+ firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
98
+ and copies blob contents there.
99
+
100
+(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
101
+ "error_status_address" fields of the HEST table with a pointer to the
102
+ corresponding "address registers" in the "etc/hardware_errors" blob.
103
+
104
+(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
105
+ "read_ack_register" fields of the HEST table with a pointer to the
106
+ corresponding "read_ack_register" within the "etc/hardware_errors" blob.
107
+
108
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
109
+ addresses in the "error_block_address" fields with a pointer to the
110
+ respective "Error Status Data Block" in the "etc/hardware_errors" blob.
111
+
112
+(8) QEMU defines a third and write-only fw_cfg blob which is called
113
+ "etc/hardware_errors_addr". Through that blob, the firmware can send back
114
+ the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
115
+ blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
116
+ for the firmware. The firmware will write back the start address of
117
+ "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
118
+
119
+(9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
120
+ "Error Status Data Block", guest memory, and then injects platform specific
121
+ interrupt (in case of arm/virt machine it's Synchronous External Abort) as a
122
+ notification which is necessary for notifying the guest.
123
+
124
+(10) This notification (in virtual hardware) will be handled by the guest
125
+ kernel; on receiving the notification, the guest APEI driver can read the CPER error
126
+ and take appropriate action.
127
+
128
+(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
129
+ find out "Error Status Data Block" entry corresponding to error source. So supported
130
+ source_id values should be assigned here and not be changed afterwards to make sure
131
+ that guest will write error into expected "Error Status Data Block" even if guest was
132
+ migrated to a newer QEMU.
133
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
134
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
135
--- a/docs/specs/index.rst
20
--- a/MAINTAINERS
136
+++ b/docs/specs/index.rst
21
+++ b/MAINTAINERS
137
@@ -XXX,XX +XXX,XX @@ Contents:
22
@@ -XXX,XX +XXX,XX @@ F: include/hw/ssi/imx_spi.h
138
ppc-spapr-xive
23
SBSA-REF
139
acpi_hw_reduced_hotplug
24
M: Radoslaw Biernacki <rad@semihalf.com>
140
tpm
25
M: Peter Maydell <peter.maydell@linaro.org>
141
+ acpi_hest_ghes
26
-R: Leif Lindholm <quic_llindhol@quicinc.com>
27
+R: Leif Lindholm <leif.lindholm@oss.qualcomm.com>
28
R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
29
L: qemu-arm@nongnu.org
30
S: Maintained
31
diff --git a/.mailmap b/.mailmap
32
index XXXXXXX..XXXXXXX 100644
33
--- a/.mailmap
34
+++ b/.mailmap
35
@@ -XXX,XX +XXX,XX @@ Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
36
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
37
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
38
Juan Quintela <quintela@trasno.org> <quintela@redhat.com>
39
-Leif Lindholm <quic_llindhol@quicinc.com> <leif.lindholm@linaro.org>
40
-Leif Lindholm <quic_llindhol@quicinc.com> <leif@nuviainc.com>
41
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <quic_llindhol@quicinc.com>
42
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif.lindholm@linaro.org>
43
+Leif Lindholm <leif.lindholm@oss.qualcomm.com> <leif@nuviainc.com>
44
Luc Michel <luc@lmichel.fr> <luc.michel@git.antfield.fr>
45
Luc Michel <luc@lmichel.fr> <luc.michel@greensocs.com>
46
Luc Michel <luc@lmichel.fr> <lmichel@kalray.eu>
142
--
47
--
143
2.20.1
48
2.34.1
144
49
145
50
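As a rough illustration of the arithmetic implied by the blob layout described in the GHES design document above, the following stand-alone sketch computes the total size of "etc/hardware_errors" and the offset of one status block. The function names are invented, and the assumption that the N status blocks follow the two 8-byte register tables is taken from the description rather than from the QEMU source:

    /* ghes_layout.c: size/offset arithmetic for etc/hardware_errors */
    #include <stdint.h>
    #include <stdio.h>

    #define ADDR_ENTRY_SIZE   8u      /* error block address entry */
    #define ACK_ENTRY_SIZE    8u      /* read ack register entry   */
    #define STATUS_BLOCK_SIZE 4096u   /* one Error Status Data Block */

    static uint64_t hardware_errors_blob_size(unsigned n_sources)
    {
        /* matches the (N * 8 * 2 + N * 4096) formula in the text */
        return (uint64_t)n_sources * (ADDR_ENTRY_SIZE + ACK_ENTRY_SIZE)
             + (uint64_t)n_sources * STATUS_BLOCK_SIZE;
    }

    static uint64_t status_block_offset(unsigned n_sources, unsigned source_id)
    {
        /* status blocks assumed to start after both 8-byte tables */
        return (uint64_t)n_sources * (ADDR_ENTRY_SIZE + ACK_ENTRY_SIZE)
             + (uint64_t)source_id * STATUS_BLOCK_SIZE;
    }

    int main(void)
    {
        unsigned n = 2;   /* e.g. two error sources */
        printf("blob size: %llu bytes\n",
               (unsigned long long)hardware_errors_blob_size(n));
        printf("status block 1 offset: %llu\n",
               (unsigned long long)status_block_offset(n, 1));
        return 0;
    }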
1
From: Dongjiu Geng <gengdongjiu@huawei.com>
1
From: Vikram Garhwal <vikram.garhwal@bytedance.com>
2
2
3
Xiang and I are willing to review the APEI-related patches and
3
Previously, the maintainer role was paused due to an inactive email address. Commit id:
4
volunteer as the reviewers for the HEST/GHES part.
4
c009d715721861984c4987bcc78b7ee183e86d75.
5
5
6
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
6
Signed-off-by: Vikram Garhwal <vikram.garhwal@bytedance.com>
7
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
7
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Message-id: 20241204184205.12952-1-vikram.garhwal@bytedance.com
9
Acked-by: Michael S. Tsirkin <mst@redhat.com>
10
Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
10
---
13
MAINTAINERS | 9 +++++++++
11
MAINTAINERS | 2 ++
14
1 file changed, 9 insertions(+)
12
1 file changed, 2 insertions(+)
15
13
16
diff --git a/MAINTAINERS b/MAINTAINERS
14
diff --git a/MAINTAINERS b/MAINTAINERS
17
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
18
--- a/MAINTAINERS
16
--- a/MAINTAINERS
19
+++ b/MAINTAINERS
17
+++ b/MAINTAINERS
20
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
18
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/fuzz-sb16-test.c
21
F: tests/qtest/acpi-utils.[hc]
19
22
F: tests/data/acpi/
20
Xilinx CAN
23
21
M: Francisco Iglesias <francisco.iglesias@amd.com>
24
+ACPI/HEST/GHES
22
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
25
+R: Dongjiu Geng <gengdongjiu@huawei.com>
23
S: Maintained
26
+R: Xiang Zheng <zhengxiang9@huawei.com>
24
F: hw/net/can/xlnx-*
27
+L: qemu-arm@nongnu.org
25
F: include/hw/net/xlnx-*
28
+S: Maintained
26
@@ -XXX,XX +XXX,XX @@ F: include/hw/rx/
29
+F: hw/acpi/ghes.c
27
CAN bus subsystem and hardware
30
+F: include/hw/acpi/ghes.h
28
M: Pavel Pisa <pisa@cmp.felk.cvut.cz>
31
+F: docs/specs/acpi_hest_ghes.rst
29
M: Francisco Iglesias <francisco.iglesias@amd.com>
32
+
30
+M: Vikram Garhwal <vikram.garhwal@bytedance.com>
33
ppc4xx
31
S: Maintained
34
M: David Gibson <david@gibson.dropbear.id.au>
32
W: https://canbus.pages.fel.cvut.cz/
35
L: qemu-ppc@nongnu.org
33
F: net/can/*
36
--
34
--
37
2.20.1
35
2.34.1
38
39