1
I don't have anything else queued up at the moment, so this is just
1
The following changes since commit 5767815218efd3cbfd409505ed824d5f356044ae:
2
Richard's SME patches.
3
2
4
-- PMM
3
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging (2024-02-14 15:45:52 +0000)
5
6
The following changes since commit 63b38f6c85acd312c2cab68554abf33adf4ee2b3:
7
8
Merge tag 'pull-target-arm-20220707' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2022-07-08 06:17:11 +0530)
9
4
10
are available in the Git repository at:
5
are available in the Git repository at:
11
6
12
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220711
7
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20240215
13
8
14
for you to fetch changes up to f9982ceaf26df27d15547a3a7990a95019e9e3a8:
9
for you to fetch changes up to f780e63fe731b058fe52d43653600d8729a1b5f2:
15
10
16
linux-user/aarch64: Add SME related hwcap entries (2022-07-11 13:43:52 +0100)
11
docs: Add documentation for the mps3-an536 board (2024-02-15 14:32:39 +0000)
17
12
18
----------------------------------------------------------------
13
----------------------------------------------------------------
19
target-arm:
14
target-arm queue:
20
* Implement SME emulation, for both system and linux-user
15
* hw/arm/xilinx_zynq: Wire FIQ between CPU <> GIC
16
* linux-user/aarch64: Choose SYNC as the preferred MTE mode
17
* Fix some errors in SVE/SME handling of MTE tags
18
* hw/pci-host/raven.c: Mark raven_io_ops as implementing unaligned accesses
19
* hw/block/tc58128: Don't emit deprecation warning under qtest
20
* tests/qtest: Fix handling of npcm7xx and GMAC tests
21
* hw/arm/virt: Wire up non-secure EL2 virtual timer IRQ
22
* tests/qtest/npcm7xx_emc-test: Connect all NICs to a backend
23
* Don't assert on vmload/vmsave of M-profile CPUs
24
* hw/arm/smmuv3: add support for stage 1 access fault
25
* hw/arm/stellaris: QOM cleanups
26
* Use new CBAR encoding for all v8 CPUs, not all aarch64 CPUs
27
* Improve Cortex_R52 IMPDEF sysreg modelling
28
* Allow access to SPSR_hyp from hyp mode
29
* New board model mps3-an536 (Cortex-R52)
21
30
22
----------------------------------------------------------------
31
----------------------------------------------------------------
23
Richard Henderson (45):
32
Luc Michel (1):
24
target/arm: Handle SME in aarch64_cpu_dump_state
33
hw/arm/smmuv3: add support for stage 1 access fault
25
target/arm: Add infrastructure for disas_sme
26
target/arm: Trap non-streaming usage when Streaming SVE is active
27
target/arm: Mark ADR as non-streaming
28
target/arm: Mark RDFFR, WRFFR, SETFFR as non-streaming
29
target/arm: Mark BDEP, BEXT, BGRP, COMPACT, FEXPA, FTSSEL as non-streaming
30
target/arm: Mark PMULL, FMMLA as non-streaming
31
target/arm: Mark FTSMUL, FTMAD, FADDA as non-streaming
32
target/arm: Mark SMMLA, UMMLA, USMMLA as non-streaming
33
target/arm: Mark string/histo/crypto as non-streaming
34
target/arm: Mark gather/scatter load/store as non-streaming
35
target/arm: Mark gather prefetch as non-streaming
36
target/arm: Mark LDFF1 and LDNF1 as non-streaming
37
target/arm: Mark LD1RO as non-streaming
38
target/arm: Add SME enablement checks
39
target/arm: Handle SME in sve_access_check
40
target/arm: Implement SME RDSVL, ADDSVL, ADDSPL
41
target/arm: Implement SME ZERO
42
target/arm: Implement SME MOVA
43
target/arm: Implement SME LD1, ST1
44
target/arm: Export unpredicated ld/st from translate-sve.c
45
target/arm: Implement SME LDR, STR
46
target/arm: Implement SME ADDHA, ADDVA
47
target/arm: Implement FMOPA, FMOPS (non-widening)
48
target/arm: Implement BFMOPA, BFMOPS
49
target/arm: Implement FMOPA, FMOPS (widening)
50
target/arm: Implement SME integer outer product
51
target/arm: Implement PSEL
52
target/arm: Implement REVD
53
target/arm: Implement SCLAMP, UCLAMP
54
target/arm: Reset streaming sve state on exception boundaries
55
target/arm: Enable SME for -cpu max
56
linux-user/aarch64: Clear tpidr2_el0 if CLONE_SETTLS
57
linux-user/aarch64: Reset PSTATE.SM on syscalls
58
linux-user/aarch64: Add SM bit to SVE signal context
59
linux-user/aarch64: Tidy target_restore_sigframe error return
60
linux-user/aarch64: Do not allow duplicate or short sve records
61
linux-user/aarch64: Verify extra record lock succeeded
62
linux-user/aarch64: Move sve record checks into restore
63
linux-user/aarch64: Implement SME signal handling
64
linux-user: Rename sve prctls
65
linux-user/aarch64: Implement PR_SME_GET_VL, PR_SME_SET_VL
66
target/arm: Only set ZEN in reset if SVE present
67
target/arm: Enable SME for user-only
68
linux-user/aarch64: Add SME related hwcap entries
69
34
70
docs/system/arm/emulation.rst | 4 +
35
Nabih Estefan (1):
71
linux-user/aarch64/target_cpu.h | 5 +-
36
tests/qtest: Fix GMAC test to run on a machine in upstream QEMU
72
linux-user/aarch64/target_prctl.h | 62 +-
37
73
target/arm/cpu.h | 7 +
38
Peter Maydell (22):
74
target/arm/helper-sme.h | 126 ++++
39
hw/pci-host/raven.c: Mark raven_io_ops as implementing unaligned accesses
75
target/arm/helper-sve.h | 4 +
40
hw/block/tc58128: Don't emit deprecation warning under qtest
76
target/arm/helper.h | 18 +
41
tests/qtest/meson.build: Don't include qtests_npcm7xx in qtests_aarch64
77
target/arm/translate-a64.h | 45 ++
42
tests/qtest/bios-tables-test: Allow changes to virt GTDT
78
target/arm/translate.h | 16 +
43
hw/arm/virt: Wire up non-secure EL2 virtual timer IRQ
79
target/arm/sme-fa64.decode | 60 ++
44
tests/qtest/bios-tables-tests: Update virt golden reference
80
target/arm/sme.decode | 88 +++
45
hw/arm/npcm7xx: Call qemu_configure_nic_device() for GMAC modules
81
target/arm/sve.decode | 41 +-
46
tests/qtest/npcm7xx_emc-test: Connect all NICs to a backend
82
linux-user/aarch64/cpu_loop.c | 9 +
47
target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
83
linux-user/aarch64/signal.c | 243 ++++++--
48
target/arm: Use new CBAR encoding for all v8 CPUs, not all aarch64 CPUs
84
linux-user/elfload.c | 20 +
49
target/arm: The Cortex-R52 has a read-only CBAR
85
linux-user/syscall.c | 28 +-
50
target/arm: Add Cortex-R52 IMPDEF sysregs
86
target/arm/cpu.c | 35 +-
51
target/arm: Allow access to SPSR_hyp from hyp mode
87
target/arm/cpu64.c | 11 +
52
hw/misc/mps2-scc: Fix condition for CFG3 register
88
target/arm/helper.c | 56 +-
53
hw/misc/mps2-scc: Factor out which-board conditionals
89
target/arm/sme_helper.c | 1140 +++++++++++++++++++++++++++++++++++++
54
hw/misc/mps2-scc: Make changes needed for AN536 FPGA image
90
target/arm/sve_helper.c | 28 +
55
hw/arm/mps3r: Initial skeleton for mps3-an536 board
91
target/arm/translate-a64.c | 103 +++-
56
hw/arm/mps3r: Add CPUs, GIC, and per-CPU RAM
92
target/arm/translate-sme.c | 373 ++++++++++++
57
hw/arm/mps3r: Add UARTs
93
target/arm/translate-sve.c | 393 ++++++++++---
58
hw/arm/mps3r: Add GPIO, watchdog, dual-timer, I2C devices
94
target/arm/translate-vfp.c | 12 +
59
hw/arm/mps3r: Add remaining devices
95
target/arm/translate.c | 2 +
60
docs: Add documentation for the mps3-an536 board
96
target/arm/vec_helper.c | 24 +
61
97
target/arm/meson.build | 3 +
62
Philippe Mathieu-Daudé (5):
98
28 files changed, 2821 insertions(+), 135 deletions(-)
63
hw/arm/xilinx_zynq: Wire FIQ between CPU <> GIC
99
create mode 100644 target/arm/sme-fa64.decode
64
hw/arm/stellaris: Convert ADC controller to Resettable interface
100
create mode 100644 target/arm/sme.decode
65
hw/arm/stellaris: Convert I2C controller to Resettable interface
101
create mode 100644 target/arm/translate-sme.c
66
hw/arm/stellaris: Add missing QOM 'machine' parent
67
hw/arm/stellaris: Add missing QOM 'SoC' parent
68
69
Richard Henderson (6):
70
linux-user/aarch64: Choose SYNC as the preferred MTE mode
71
target/arm: Fix nregs computation in do_{ld,st}_zpa
72
target/arm: Adjust and validate mtedesc sizem1
73
target/arm: Split out make_svemte_desc
74
target/arm: Handle mte in do_ldrq, do_ldro
75
target/arm: Fix SVE/SME gross MTE suppression checks
76
77
MAINTAINERS | 3 +-
78
docs/system/arm/mps2.rst | 37 +-
79
configs/devices/arm-softmmu/default.mak | 1 +
80
hw/arm/smmuv3-internal.h | 1 +
81
include/hw/arm/smmu-common.h | 1 +
82
include/hw/arm/virt.h | 2 +
83
include/hw/misc/mps2-scc.h | 1 +
84
linux-user/aarch64/target_prctl.h | 29 +-
85
target/arm/internals.h | 2 +-
86
target/arm/tcg/translate-a64.h | 2 +
87
hw/arm/mps3r.c | 640 ++++++++++++++++++++++++++++++++
88
hw/arm/npcm7xx.c | 1 +
89
hw/arm/smmu-common.c | 11 +
90
hw/arm/smmuv3.c | 1 +
91
hw/arm/stellaris.c | 47 ++-
92
hw/arm/virt-acpi-build.c | 20 +-
93
hw/arm/virt.c | 60 ++-
94
hw/arm/xilinx_zynq.c | 2 +
95
hw/block/tc58128.c | 4 +-
96
hw/misc/mps2-scc.c | 138 ++++++-
97
hw/pci-host/raven.c | 1 +
98
target/arm/helper.c | 14 +-
99
target/arm/tcg/cpu32.c | 109 ++++++
100
target/arm/tcg/op_helper.c | 43 ++-
101
target/arm/tcg/sme_helper.c | 8 +-
102
target/arm/tcg/sve_helper.c | 12 +-
103
target/arm/tcg/translate-sme.c | 15 +-
104
target/arm/tcg/translate-sve.c | 83 +++--
105
target/arm/tcg/translate.c | 19 +-
106
tests/qtest/npcm7xx_emc-test.c | 5 +-
107
tests/qtest/npcm_gmac-test.c | 84 +----
108
hw/arm/Kconfig | 5 +
109
hw/arm/meson.build | 1 +
110
tests/data/acpi/virt/FACP | Bin 276 -> 276 bytes
111
tests/data/acpi/virt/GTDT | Bin 96 -> 104 bytes
112
tests/qtest/meson.build | 4 +-
113
36 files changed, 1184 insertions(+), 222 deletions(-)
114
create mode 100644 hw/arm/mps3r.c
115
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Dump SVCR, plus use the correct access check for Streaming Mode.
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20220708151540.18136-2-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/cpu.c | 17 ++++++++++++++++-
11
1 file changed, 16 insertions(+), 1 deletion(-)
12
13
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/cpu.c
16
+++ b/target/arm/cpu.c
17
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
18
int i;
19
int el = arm_current_el(env);
20
const char *ns_status;
21
+ bool sve;
22
23
qemu_fprintf(f, " PC=%016" PRIx64 " ", env->pc);
24
for (i = 0; i < 32; i++) {
25
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
26
el,
27
psr & PSTATE_SP ? 'h' : 't');
28
29
+ if (cpu_isar_feature(aa64_sme, cpu)) {
30
+ qemu_fprintf(f, " SVCR=%08" PRIx64 " %c%c",
31
+ env->svcr,
32
+ (FIELD_EX64(env->svcr, SVCR, ZA) ? 'Z' : '-'),
33
+ (FIELD_EX64(env->svcr, SVCR, SM) ? 'S' : '-'));
34
+ }
35
if (cpu_isar_feature(aa64_bti, cpu)) {
36
qemu_fprintf(f, " BTYPE=%d", (psr & PSTATE_BTYPE) >> 10);
37
}
38
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
39
qemu_fprintf(f, " FPCR=%08x FPSR=%08x\n",
40
vfp_get_fpcr(env), vfp_get_fpsr(env));
41
42
- if (cpu_isar_feature(aa64_sve, cpu) && sve_exception_el(env, el) == 0) {
43
+ if (cpu_isar_feature(aa64_sme, cpu) && FIELD_EX64(env->svcr, SVCR, SM)) {
44
+ sve = sme_exception_el(env, el) == 0;
45
+ } else if (cpu_isar_feature(aa64_sve, cpu)) {
46
+ sve = sve_exception_el(env, el) == 0;
47
+ } else {
48
+ sve = false;
49
+ }
50
+
51
+ if (sve) {
52
int j, zcr_len = sve_vqm1_for_el(env, el);
53
54
for (i = 0; i <= FFR_PRED_NUM; i++) {
55
--
56
2.25.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
Similarly to commits dadbb58f59..5ae79fe825 for other ARM boards,
4
connect FIQ output of the GIC CPU interfaces to the CPU.
5
6
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Message-id: 20240130152548.17855-1-philmd@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20220708151540.18136-46-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
linux-user/elfload.c | 20 ++++++++++++++++++++
11
hw/arm/xilinx_zynq.c | 2 ++
9
1 file changed, 20 insertions(+)
12
1 file changed, 2 insertions(+)
10
13
11
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
14
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
12
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
13
--- a/linux-user/elfload.c
16
--- a/hw/arm/xilinx_zynq.c
14
+++ b/linux-user/elfload.c
17
+++ b/hw/arm/xilinx_zynq.c
15
@@ -XXX,XX +XXX,XX @@ enum {
18
@@ -XXX,XX +XXX,XX @@ static void zynq_init(MachineState *machine)
16
ARM_HWCAP2_A64_RNG = 1 << 16,
19
sysbus_mmio_map(busdev, 0, MPCORE_PERIPHBASE);
17
ARM_HWCAP2_A64_BTI = 1 << 17,
20
sysbus_connect_irq(busdev, 0,
18
ARM_HWCAP2_A64_MTE = 1 << 18,
21
qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
19
+ ARM_HWCAP2_A64_ECV = 1 << 19,
22
+ sysbus_connect_irq(busdev, 1,
20
+ ARM_HWCAP2_A64_AFP = 1 << 20,
23
+ qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_FIQ));
21
+ ARM_HWCAP2_A64_RPRES = 1 << 21,
24
22
+ ARM_HWCAP2_A64_MTE3 = 1 << 22,
25
for (n = 0; n < 64; n++) {
23
+ ARM_HWCAP2_A64_SME = 1 << 23,
26
pic[n] = qdev_get_gpio_in(dev, n);
24
+ ARM_HWCAP2_A64_SME_I16I64 = 1 << 24,
25
+ ARM_HWCAP2_A64_SME_F64F64 = 1 << 25,
26
+ ARM_HWCAP2_A64_SME_I8I32 = 1 << 26,
27
+ ARM_HWCAP2_A64_SME_F16F32 = 1 << 27,
28
+ ARM_HWCAP2_A64_SME_B16F32 = 1 << 28,
29
+ ARM_HWCAP2_A64_SME_F32F32 = 1 << 29,
30
+ ARM_HWCAP2_A64_SME_FA64 = 1 << 30,
31
};
32
33
#define ELF_HWCAP get_elf_hwcap()
34
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap2(void)
35
GET_FEATURE_ID(aa64_rndr, ARM_HWCAP2_A64_RNG);
36
GET_FEATURE_ID(aa64_bti, ARM_HWCAP2_A64_BTI);
37
GET_FEATURE_ID(aa64_mte, ARM_HWCAP2_A64_MTE);
38
+ GET_FEATURE_ID(aa64_sme, (ARM_HWCAP2_A64_SME |
39
+ ARM_HWCAP2_A64_SME_F32F32 |
40
+ ARM_HWCAP2_A64_SME_B16F32 |
41
+ ARM_HWCAP2_A64_SME_F16F32 |
42
+ ARM_HWCAP2_A64_SME_I8I32));
43
+ GET_FEATURE_ID(aa64_sme_f64f64, ARM_HWCAP2_A64_SME_F64F64);
44
+ GET_FEATURE_ID(aa64_sme_i16i64, ARM_HWCAP2_A64_SME_I16I64);
45
+ GET_FEATURE_ID(aa64_sme_fa64, ARM_HWCAP2_A64_SME_FA64);
46
47
return hwcaps;
48
}
49
--
27
--
50
2.25.1
28
2.34.1
29
30
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
These prctl set the Streaming SVE vector length, which may
3
The API does not generate an error for setting ASYNC | SYNC; that merely
4
be completely different from the Normal SVE vector length.
4
constrains the selection vs the per-cpu default. For qemu linux-user,
5
choose SYNC as the default.
5
6
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Cc: qemu-stable@nongnu.org
8
Reported-by: Gustavo Romero <gustavo.romero@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-43-richard.henderson@linaro.org
10
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
11
Message-id: 20240207025210.8837-2-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
13
---
11
linux-user/aarch64/target_prctl.h | 54 +++++++++++++++++++++++++++++++
14
linux-user/aarch64/target_prctl.h | 29 +++++++++++++++++------------
12
linux-user/syscall.c | 16 +++++++++
15
1 file changed, 17 insertions(+), 12 deletions(-)
13
2 files changed, 70 insertions(+)
14
16
15
diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
17
diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/linux-user/aarch64/target_prctl.h
19
--- a/linux-user/aarch64/target_prctl.h
18
+++ b/linux-user/aarch64/target_prctl.h
20
+++ b/linux-user/aarch64/target_prctl.h
19
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_sve_get_vl(CPUArchState *env)
21
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
20
{
22
env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
21
ARMCPU *cpu = env_archcpu(env);
23
22
if (cpu_isar_feature(aa64_sve, cpu)) {
24
if (cpu_isar_feature(aa64_mte, cpu)) {
23
+ /* PSTATE.SM is always unset on syscall entry. */
25
- switch (arg2 & PR_MTE_TCF_MASK) {
24
return sve_vq(env) * 16;
26
- case PR_MTE_TCF_NONE:
25
}
27
- case PR_MTE_TCF_SYNC:
26
return -TARGET_EINVAL;
28
- case PR_MTE_TCF_ASYNC:
27
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_sve_set_vl(CPUArchState *env, abi_long arg2)
29
- break;
28
&& arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
30
- default:
29
uint32_t vq, old_vq;
31
- return -EINVAL;
30
32
- }
31
+ /* PSTATE.SM is always unset on syscall entry. */
33
-
32
old_vq = sve_vq(env);
33
34
/*
34
/*
35
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_sve_set_vl(CPUArchState *env, abi_long arg2)
35
* Write PR_MTE_TCF to SCTLR_EL1[TCF0].
36
}
36
- * Note that the syscall values are consistent with hw.
37
#define do_prctl_sve_set_vl do_prctl_sve_set_vl
37
+ *
38
38
+ * The kernel has a per-cpu configuration for the sysadmin,
39
+static abi_long do_prctl_sme_get_vl(CPUArchState *env)
39
+ * /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred,
40
+{
40
+ * which qemu does not implement.
41
+ ARMCPU *cpu = env_archcpu(env);
41
+ *
42
+ if (cpu_isar_feature(aa64_sme, cpu)) {
42
+ * Because there is no performance difference between the modes, and
43
+ return sme_vq(env) * 16;
43
+ * because SYNC is most useful for debugging MTE errors, choose SYNC
44
+ }
44
+ * as the preferred mode. With this preference, and the way the API
45
+ return -TARGET_EINVAL;
45
+ * uses only two bits, there is no way for the program to select
46
+}
46
+ * ASYMM mode.
47
+#define do_prctl_sme_get_vl do_prctl_sme_get_vl
47
*/
48
+
48
- env->cp15.sctlr_el[1] =
49
+static abi_long do_prctl_sme_set_vl(CPUArchState *env, abi_long arg2)
49
- deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
50
+{
50
+ unsigned tcf = 0;
51
+ /*
51
+ if (arg2 & PR_MTE_TCF_SYNC) {
52
+ * We cannot support either PR_SME_SET_VL_ONEXEC or PR_SME_VL_INHERIT.
52
+ tcf = 1;
53
+ * Note the kernel definition of sve_vl_valid allows for VQ=512,
53
+ } else if (arg2 & PR_MTE_TCF_ASYNC) {
54
+ * i.e. VL=8192, even though the architectural maximum is VQ=16.
54
+ tcf = 2;
55
+ */
56
+ if (cpu_isar_feature(aa64_sme, env_archcpu(env))
57
+ && arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
58
+ int vq, old_vq;
59
+
60
+ old_vq = sme_vq(env);
61
+
62
+ /*
63
+ * Bound the value of vq, so that we know that it fits into
64
+ * the 4-bit field in SMCR_EL1. Because PSTATE.SM is cleared
65
+ * on syscall entry, we are not modifying the current SVE
66
+ * vector length.
67
+ */
68
+ vq = MAX(arg2 / 16, 1);
69
+ vq = MIN(vq, 16);
70
+ env->vfp.smcr_el[1] =
71
+ FIELD_DP64(env->vfp.smcr_el[1], SMCR, LEN, vq - 1);
72
+
73
+ /* Delay rebuilding hflags until we know if ZA must change. */
74
+ vq = sve_vqm1_for_el_sm(env, 0, true) + 1;
75
+
76
+ if (vq != old_vq) {
77
+ /*
78
+ * PSTATE.ZA state is cleared on any change to SVL.
79
+ * We need not call arm_rebuild_hflags because PSTATE.SM was
80
+ * cleared on syscall entry, so this hasn't changed VL.
81
+ */
82
+ env->svcr = FIELD_DP64(env->svcr, SVCR, ZA, 0);
83
+ arm_rebuild_hflags(env);
84
+ }
55
+ }
85
+ return vq * 16;
56
+ env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
86
+ }
57
87
+ return -TARGET_EINVAL;
58
/*
88
+}
59
* Write PR_MTE_TAG to GCR_EL1[Exclude].
89
+#define do_prctl_sme_set_vl do_prctl_sme_set_vl
90
+
91
static abi_long do_prctl_reset_keys(CPUArchState *env, abi_long arg2)
92
{
93
ARMCPU *cpu = env_archcpu(env);
94
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
95
index XXXXXXX..XXXXXXX 100644
96
--- a/linux-user/syscall.c
97
+++ b/linux-user/syscall.c
98
@@ -XXX,XX +XXX,XX @@ abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
99
#ifndef PR_SET_SYSCALL_USER_DISPATCH
100
# define PR_SET_SYSCALL_USER_DISPATCH 59
101
#endif
102
+#ifndef PR_SME_SET_VL
103
+# define PR_SME_SET_VL 63
104
+# define PR_SME_GET_VL 64
105
+# define PR_SME_VL_LEN_MASK 0xffff
106
+# define PR_SME_VL_INHERIT (1 << 17)
107
+#endif
108
109
#include "target_prctl.h"
110
111
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_inval1(CPUArchState *env, abi_long arg2)
112
#ifndef do_prctl_set_unalign
113
#define do_prctl_set_unalign do_prctl_inval1
114
#endif
115
+#ifndef do_prctl_sme_get_vl
116
+#define do_prctl_sme_get_vl do_prctl_inval0
117
+#endif
118
+#ifndef do_prctl_sme_set_vl
119
+#define do_prctl_sme_set_vl do_prctl_inval1
120
+#endif
121
122
static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
123
abi_long arg3, abi_long arg4, abi_long arg5)
124
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
125
return do_prctl_sve_get_vl(env);
126
case PR_SVE_SET_VL:
127
return do_prctl_sve_set_vl(env, arg2);
128
+ case PR_SME_GET_VL:
129
+ return do_prctl_sme_get_vl(env);
130
+ case PR_SME_SET_VL:
131
+ return do_prctl_sme_set_vl(env, arg2);
132
case PR_PAC_RESET_KEYS:
133
if (arg3 || arg4 || arg5) {
134
return -TARGET_EINVAL;
135
--
60
--
136
2.25.1
61
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Fold the return value setting into the goto, so each
3
The field is encoded as [0-3], which is convenient for
4
point of failure need not do both.
4
indexing our array of function pointers, but the true
5
value is [1-4]. Adjust before calling do_mem_zpa.
5
6
7
Add an assert, and move the comment re passing ZT to
8
the helper back next to the relevant code.
9
10
Cc: qemu-stable@nongnu.org
11
Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
14
Message-id: 20240207025210.8837-3-richard.henderson@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-37-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
17
---
11
linux-user/aarch64/signal.c | 26 +++++++++++---------------
18
target/arm/tcg/translate-sve.c | 16 ++++++++--------
12
1 file changed, 11 insertions(+), 15 deletions(-)
19
1 file changed, 8 insertions(+), 8 deletions(-)
13
20
14
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
21
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
15
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
16
--- a/linux-user/aarch64/signal.c
23
--- a/target/arm/tcg/translate-sve.c
17
+++ b/linux-user/aarch64/signal.c
24
+++ b/target/arm/tcg/translate-sve.c
18
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
25
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
19
struct target_sve_context *sve = NULL;
26
TCGv_ptr t_pg;
20
uint64_t extra_datap = 0;
27
int desc = 0;
21
bool used_extra = false;
28
22
- bool err = false;
29
- /*
23
int vq = 0, sve_size = 0;
30
- * For e.g. LD4, there are not enough arguments to pass all 4
24
31
- * registers as pointers, so encode the regno into the data field.
25
target_restore_general_frame(env, sf);
32
- * For consistency, do this even for LD1.
26
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
33
- */
27
switch (magic) {
34
+ assert(mte_n >= 1 && mte_n <= 4);
28
case 0:
35
if (s->mte_active[0]) {
29
if (size != 0) {
36
int msz = dtype_msz(dtype);
30
- err = true;
37
31
- goto exit;
38
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
32
+ goto err;
39
addr = clean_data_tbi(s, addr);
33
}
34
if (used_extra) {
35
ctx = NULL;
36
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
37
38
case TARGET_FPSIMD_MAGIC:
39
if (fpsimd || size != sizeof(struct target_fpsimd_context)) {
40
- err = true;
41
- goto exit;
42
+ goto err;
43
}
44
fpsimd = (struct target_fpsimd_context *)ctx;
45
break;
46
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
47
break;
48
}
49
}
50
- err = true;
51
- goto exit;
52
+ goto err;
53
54
case TARGET_EXTRA_MAGIC:
55
if (extra || size != sizeof(struct target_extra_context)) {
56
- err = true;
57
- goto exit;
58
+ goto err;
59
}
60
__get_user(extra_datap,
61
&((struct target_extra_context *)ctx)->datap);
62
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
63
/* Unknown record -- we certainly didn't generate it.
64
* Did we in fact get out of sync?
65
*/
66
- err = true;
67
- goto exit;
68
+ goto err;
69
}
70
ctx = (void *)ctx + size;
71
}
40
}
72
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
41
73
if (fpsimd) {
42
+ /*
74
target_restore_fpsimd_record(env, fpsimd);
43
+ * For e.g. LD4, there are not enough arguments to pass all 4
44
+ * registers as pointers, so encode the regno into the data field.
45
+ * For consistency, do this even for LD1.
46
+ */
47
desc = simd_desc(vsz, vsz, zt | desc);
48
t_pg = tcg_temp_new_ptr();
49
50
@@ -XXX,XX +XXX,XX @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
51
* accessible via the instruction encoding.
52
*/
53
assert(fn != NULL);
54
- do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
55
+ do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
56
}
57
58
static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
59
@@ -XXX,XX +XXX,XX @@ static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
60
if (nreg == 0) {
61
/* ST1 */
62
fn = fn_single[s->mte_active[0]][be][msz][esz];
63
- nreg = 1;
75
} else {
64
} else {
76
- err = true;
65
/* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
77
+ goto err;
66
assert(msz == esz);
67
fn = fn_multiple[s->mte_active[0]][be][nreg - 1][msz];
78
}
68
}
79
69
assert(fn != NULL);
80
/* SVE data, if present, overwrites FPSIMD data. */
70
- do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg, true, fn);
81
if (sve) {
71
+ do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg + 1, true, fn);
82
target_restore_sve_record(env, sve, vq);
83
}
84
-
85
- exit:
86
unlock_user(extra, extra_datap, 0);
87
- return err;
88
+ return 0;
89
+
90
+ err:
91
+ unlock_user(extra, extra_datap, 0);
92
+ return 1;
93
}
72
}
94
73
95
static abi_ulong get_sigframe(struct target_sigaction *ka,
74
static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a)
96
--
75
--
97
2.25.1
76
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Enable SME, TPIDR2_EL0, and FA64 if supported by the cpu.
3
When we added SVE_MTEDESC_SHIFT, we effectively limited the
4
maximum size of MTEDESC. Adjust SIZEM1 to consume the remaining
5
bits (32 - 10 - 5 - 12 == 5). Assert that the data to be stored
6
fits within the field (expecting 8 * 4 - 1 == 31, exact fit).
4
7
8
Cc: qemu-stable@nongnu.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20220708151540.18136-45-richard.henderson@linaro.org
11
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
12
Message-id: 20240207025210.8837-4-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
14
---
10
target/arm/cpu.c | 11 +++++++++++
15
target/arm/internals.h | 2 +-
11
1 file changed, 11 insertions(+)
16
target/arm/tcg/translate-sve.c | 7 ++++---
17
2 files changed, 5 insertions(+), 4 deletions(-)
12
18
13
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
19
diff --git a/target/arm/internals.h b/target/arm/internals.h
14
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/cpu.c
21
--- a/target/arm/internals.h
16
+++ b/target/arm/cpu.c
22
+++ b/target/arm/internals.h
17
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
23
@@ -XXX,XX +XXX,XX @@ FIELD(MTEDESC, TBI, 4, 2)
18
CPACR_EL1, ZEN, 3);
24
FIELD(MTEDESC, TCMA, 6, 2)
19
env->vfp.zcr_el[1] = cpu->sve_default_vq - 1;
25
FIELD(MTEDESC, WRITE, 8, 1)
20
}
26
FIELD(MTEDESC, ALIGN, 9, 3)
21
+ /* and for SME instructions, with default vector length, and TPIDR2 */
27
-FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12) /* size - 1 */
22
+ if (cpu_isar_feature(aa64_sme, cpu)) {
28
+FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12) /* size - 1 */
23
+ env->cp15.sctlr_el[1] |= SCTLR_EnTP2;
29
24
+ env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
30
bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
25
+ CPACR_EL1, SMEN, 3);
31
uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
26
+ env->vfp.smcr_el[1] = cpu->sme_default_vq - 1;
32
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
27
+ if (cpu_isar_feature(aa64_sme_fa64, cpu)) {
33
index XXXXXXX..XXXXXXX 100644
28
+ env->vfp.smcr_el[1] = FIELD_DP64(env->vfp.smcr_el[1],
34
--- a/target/arm/tcg/translate-sve.c
29
+ SMCR, FA64, 1);
35
+++ b/target/arm/tcg/translate-sve.c
30
+ }
36
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
31
+ }
37
{
32
/*
38
unsigned vsz = vec_full_reg_size(s);
33
* Enable 48-bit address space (TODO: take reserved_va into account).
39
TCGv_ptr t_pg;
34
* Enable TBI0 but not TBI1.
40
+ uint32_t sizem1;
41
int desc = 0;
42
43
assert(mte_n >= 1 && mte_n <= 4);
44
+ sizem1 = (mte_n << dtype_msz(dtype)) - 1;
45
+ assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
46
if (s->mte_active[0]) {
47
- int msz = dtype_msz(dtype);
48
-
49
desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
50
desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
51
desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
52
desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
53
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
54
+ desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
55
desc <<= SVE_MTEDESC_SHIFT;
56
} else {
57
addr = clean_data_tbi(s, addr);
35
--
58
--
36
2.25.1
59
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
These functions will be used to verify that the cpu
3
Share code that creates mtedesc and embeds within simd_desc.
4
is in the correct state for a given instruction.
5
4
5
Cc: qemu-stable@nongnu.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-16-richard.henderson@linaro.org
8
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
9
Message-id: 20240207025210.8837-5-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
---
11
target/arm/translate-a64.h | 21 +++++++++++++++++++++
12
target/arm/tcg/translate-a64.h | 2 ++
12
target/arm/translate-a64.c | 34 ++++++++++++++++++++++++++++++++++
13
target/arm/tcg/translate-sme.c | 15 +++--------
13
2 files changed, 55 insertions(+)
14
target/arm/tcg/translate-sve.c | 47 ++++++++++++++++++----------------
15
3 files changed, 31 insertions(+), 33 deletions(-)
14
16
15
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
17
diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-a64.h
19
--- a/target/arm/tcg/translate-a64.h
18
+++ b/target/arm/translate-a64.h
20
+++ b/target/arm/tcg/translate-a64.h
19
@@ -XXX,XX +XXX,XX @@ void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v);
21
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
20
bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
21
unsigned int imms, unsigned int immr);
22
bool sve_access_check(DisasContext *s);
22
bool sve_access_check(DisasContext *s);
23
+bool sme_enabled_check(DisasContext *s);
23
bool sme_enabled_check(DisasContext *s);
24
+bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
24
bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
25
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
26
+ uint32_t msz, bool is_write, uint32_t data);
27
28
/* This function corresponds to CheckStreamingSVEEnabled. */
29
static inline bool sme_sm_enabled_check(DisasContext *s)
30
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/tcg/translate-sme.c
33
+++ b/target/arm/tcg/translate-sme.c
34
@@ -XXX,XX +XXX,XX @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
35
36
TCGv_ptr t_za, t_pg;
37
TCGv_i64 addr;
38
- int svl, desc = 0;
39
+ uint32_t desc;
40
bool be = s->be_data == MO_BE;
41
bool mte = s->mte_active[0];
42
43
@@ -XXX,XX +XXX,XX @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
44
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
45
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
46
47
- if (mte) {
48
- desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
49
- desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
50
- desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
51
- desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
52
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
53
- desc <<= SVE_MTEDESC_SHIFT;
54
- } else {
55
+ if (!mte) {
56
addr = clean_data_tbi(s, addr);
57
}
58
- svl = streaming_vec_reg_size(s);
59
- desc = simd_desc(svl, svl, desc);
25
+
60
+
26
+/* This function corresponds to CheckStreamingSVEEnabled. */
61
+ desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0);
27
+static inline bool sme_sm_enabled_check(DisasContext *s)
62
28
+{
63
fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr,
29
+ return sme_enabled_check_with_svcr(s, R_SVCR_SM_MASK);
64
tcg_constant_i32(desc));
65
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
66
index XXXXXXX..XXXXXXX 100644
67
--- a/target/arm/tcg/translate-sve.c
68
+++ b/target/arm/tcg/translate-sve.c
69
@@ -XXX,XX +XXX,XX @@ static const uint8_t dtype_esz[16] = {
70
3, 2, 1, 3
71
};
72
73
-static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
74
- int dtype, uint32_t mte_n, bool is_write,
75
- gen_helper_gvec_mem *fn)
76
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
77
+ uint32_t msz, bool is_write, uint32_t data)
78
{
79
- unsigned vsz = vec_full_reg_size(s);
80
- TCGv_ptr t_pg;
81
uint32_t sizem1;
82
- int desc = 0;
83
+ uint32_t desc = 0;
84
85
- assert(mte_n >= 1 && mte_n <= 4);
86
- sizem1 = (mte_n << dtype_msz(dtype)) - 1;
87
+ /* Assert all of the data fits, with or without MTE enabled. */
88
+ assert(nregs >= 1 && nregs <= 4);
89
+ sizem1 = (nregs << msz) - 1;
90
assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
91
+ assert(data < 1u << SVE_MTEDESC_SHIFT);
92
+
93
if (s->mte_active[0]) {
94
desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
95
desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
96
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
97
desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
98
desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
99
desc <<= SVE_MTEDESC_SHIFT;
100
- } else {
101
+ }
102
+ return simd_desc(vsz, vsz, desc | data);
30
+}
103
+}
31
+
104
+
32
+/* This function corresponds to CheckSMEAndZAEnabled. */
105
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
33
+static inline bool sme_za_enabled_check(DisasContext *s)
106
+ int dtype, uint32_t nregs, bool is_write,
107
+ gen_helper_gvec_mem *fn)
34
+{
108
+{
35
+ return sme_enabled_check_with_svcr(s, R_SVCR_ZA_MASK);
109
+ TCGv_ptr t_pg;
36
+}
110
+ uint32_t desc;
37
+
111
+
38
+/* Note that this function corresponds to CheckStreamingSVEAndZAEnabled. */
112
+ if (!s->mte_active[0]) {
39
+static inline bool sme_smza_enabled_check(DisasContext *s)
113
addr = clean_data_tbi(s, addr);
40
+{
114
}
41
+ return sme_enabled_check_with_svcr(s, R_SVCR_SM_MASK | R_SVCR_ZA_MASK);
115
42
+}
116
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
117
* registers as pointers, so encode the regno into the data field.
118
* For consistency, do this even for LD1.
119
*/
120
- desc = simd_desc(vsz, vsz, zt | desc);
121
+ desc = make_svemte_desc(s, vec_full_reg_size(s), nregs,
122
+ dtype_msz(dtype), is_write, zt);
123
t_pg = tcg_temp_new_ptr();
124
125
tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
126
@@ -XXX,XX +XXX,XX @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
127
int scale, TCGv_i64 scalar, int msz, bool is_write,
128
gen_helper_gvec_mem_scatter *fn)
129
{
130
- unsigned vsz = vec_full_reg_size(s);
131
TCGv_ptr t_zm = tcg_temp_new_ptr();
132
TCGv_ptr t_pg = tcg_temp_new_ptr();
133
TCGv_ptr t_zt = tcg_temp_new_ptr();
134
- int desc = 0;
135
-
136
- if (s->mte_active[0]) {
137
- desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
138
- desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
139
- desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
140
- desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
141
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << msz) - 1);
142
- desc <<= SVE_MTEDESC_SHIFT;
143
- }
144
- desc = simd_desc(vsz, vsz, desc | scale);
145
+ uint32_t desc;
146
147
tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
148
tcg_gen_addi_ptr(t_zm, tcg_env, vec_full_reg_offset(s, zm));
149
tcg_gen_addi_ptr(t_zt, tcg_env, vec_full_reg_offset(s, zt));
43
+
150
+
44
TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr);
151
+ desc = make_svemte_desc(s, vec_full_reg_size(s), 1, msz, is_write, scale);
45
TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
152
fn(tcg_env, t_zt, t_pg, t_zm, scalar, tcg_constant_i32(desc));
46
bool tag_checked, int log2_size);
47
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/translate-a64.c
50
+++ b/target/arm/translate-a64.c
51
@@ -XXX,XX +XXX,XX @@ static bool sme_access_check(DisasContext *s)
52
return true;
53
}
153
}
54
154
55
+/* This function corresponds to CheckSMEEnabled. */
56
+bool sme_enabled_check(DisasContext *s)
57
+{
58
+ /*
59
+ * Note that unlike sve_excp_el, we have not constrained sme_excp_el
60
+ * to be zero when fp_excp_el has priority. This is because we need
61
+ * sme_excp_el by itself for cpregs access checks.
62
+ */
63
+ if (!s->fp_excp_el || s->sme_excp_el < s->fp_excp_el) {
64
+ s->fp_access_checked = true;
65
+ return sme_access_check(s);
66
+ }
67
+ return fp_access_check_only(s);
68
+}
69
+
70
+/* Common subroutine for CheckSMEAnd*Enabled. */
71
+bool sme_enabled_check_with_svcr(DisasContext *s, unsigned req)
72
+{
73
+ if (!sme_enabled_check(s)) {
74
+ return false;
75
+ }
76
+ if (FIELD_EX64(req, SVCR, SM) && !s->pstate_sm) {
77
+ gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
78
+ syn_smetrap(SME_ET_NotStreaming, false));
79
+ return false;
80
+ }
81
+ if (FIELD_EX64(req, SVCR, ZA) && !s->pstate_za) {
82
+ gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
83
+ syn_smetrap(SME_ET_InactiveZA, false));
84
+ return false;
85
+ }
86
+ return true;
87
+}
88
+
89
/*
90
* This utility function is for doing register extension with an
91
* optional shift. You will likely want to pass a temporary for the
92
--
155
--
93
2.25.1
156
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Make sure to zero the currently reserved fields.
3
These functions "use the standard load helpers", but
4
fail to clean_data_tbi or populate mtedesc.
4
5
6
Cc: qemu-stable@nongnu.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20220708151540.18136-36-richard.henderson@linaro.org
9
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
10
Message-id: 20240207025210.8837-6-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
12
---
10
linux-user/aarch64/signal.c | 9 ++++++++-
13
target/arm/tcg/translate-sve.c | 15 +++++++++++++--
11
1 file changed, 8 insertions(+), 1 deletion(-)
14
1 file changed, 13 insertions(+), 2 deletions(-)
12
15
13
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
16
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
14
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
15
--- a/linux-user/aarch64/signal.c
18
--- a/target/arm/tcg/translate-sve.c
16
+++ b/linux-user/aarch64/signal.c
19
+++ b/target/arm/tcg/translate-sve.c
17
@@ -XXX,XX +XXX,XX @@ struct target_extra_context {
20
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
18
struct target_sve_context {
21
unsigned vsz = vec_full_reg_size(s);
19
struct target_aarch64_ctx head;
22
TCGv_ptr t_pg;
20
uint16_t vl;
23
int poff;
21
- uint16_t reserved[3];
24
+ uint32_t desc;
22
+ uint16_t flags;
25
23
+ uint16_t reserved[2];
26
/* Load the first quadword using the normal predicated load helpers. */
24
/* The actual SVE data immediately follows. It is laid out
27
+ if (!s->mte_active[0]) {
25
* according to TARGET_SVE_SIG_{Z,P}REG_OFFSET, based off of
28
+ addr = clean_data_tbi(s, addr);
26
* the original struct pointer.
29
+ }
27
@@ -XXX,XX +XXX,XX @@ struct target_sve_context {
28
#define TARGET_SVE_SIG_CONTEXT_SIZE(VQ) \
29
(TARGET_SVE_SIG_PREG_OFFSET(VQ, 17))
30
31
+#define TARGET_SVE_SIG_FLAG_SM 1
32
+
30
+
33
struct target_rt_sigframe {
31
poff = pred_full_reg_offset(s, pg);
34
struct target_siginfo info;
32
if (vsz > 16) {
35
struct target_ucontext uc;
33
/*
36
@@ -XXX,XX +XXX,XX @@ static void target_setup_sve_record(struct target_sve_context *sve,
34
@@ -XXX,XX +XXX,XX @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
37
{
35
38
int i, j;
36
gen_helper_gvec_mem *fn
39
37
= ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
40
+ memset(sve, 0, sizeof(*sve));
38
- fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt)));
41
__put_user(TARGET_SVE_MAGIC, &sve->head.magic);
39
+ desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt);
42
__put_user(size, &sve->head.size);
40
+ fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
43
__put_user(vq * TARGET_SVE_VQ_BYTES, &sve->vl);
41
44
+ if (FIELD_EX64(env->svcr, SVCR, SM)) {
42
/* Replicate that first quadword. */
45
+ __put_user(TARGET_SVE_SIG_FLAG_SM, &sve->flags);
43
if (vsz > 16) {
44
@@ -XXX,XX +XXX,XX @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
45
unsigned vsz_r32;
46
TCGv_ptr t_pg;
47
int poff, doff;
48
+ uint32_t desc;
49
50
if (vsz < 32) {
51
/*
52
@@ -XXX,XX +XXX,XX @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
53
}
54
55
/* Load the first octaword using the normal predicated load helpers. */
56
+ if (!s->mte_active[0]) {
57
+ addr = clean_data_tbi(s, addr);
46
+ }
58
+ }
47
59
48
/* Note that SVE regs are stored as a byte stream, with each byte element
60
poff = pred_full_reg_offset(s, pg);
49
* at a subsequent address. This corresponds to a little-endian store
61
if (vsz > 32) {
62
@@ -XXX,XX +XXX,XX @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
63
64
gen_helper_gvec_mem *fn
65
= ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
66
- fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt)));
67
+ desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt);
68
+ fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
69
70
/*
71
* Replicate that first octaword.
50
--
72
--
51
2.25.1
73
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Set the SM bit in the SVE record on signal delivery, create the ZA record.
3
The TBI and TCMA bits are located within mtedesc, not desc.
4
Restore SM and ZA state according to the records present on return.
5
4
5
Cc: qemu-stable@nongnu.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-41-richard.henderson@linaro.org
8
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
9
Message-id: 20240207025210.8837-7-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
---
11
linux-user/aarch64/signal.c | 167 +++++++++++++++++++++++++++++++++---
12
target/arm/tcg/sme_helper.c | 8 ++++----
12
1 file changed, 154 insertions(+), 13 deletions(-)
13
target/arm/tcg/sve_helper.c | 12 ++++++------
14
2 files changed, 10 insertions(+), 10 deletions(-)
13
15
14
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
16
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/linux-user/aarch64/signal.c
18
--- a/target/arm/tcg/sme_helper.c
17
+++ b/linux-user/aarch64/signal.c
19
+++ b/target/arm/tcg/sme_helper.c
18
@@ -XXX,XX +XXX,XX @@ struct target_sve_context {
20
@@ -XXX,XX +XXX,XX @@ void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg,
19
21
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
20
#define TARGET_SVE_SIG_FLAG_SM 1
22
21
23
/* Perform gross MTE suppression early. */
22
+#define TARGET_ZA_MAGIC 0x54366345
24
- if (!tbi_check(desc, bit55) ||
23
+
25
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
24
+struct target_za_context {
26
+ if (!tbi_check(mtedesc, bit55) ||
25
+ struct target_aarch64_ctx head;
27
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
26
+ uint16_t vl;
28
mtedesc = 0;
27
+ uint16_t reserved[3];
28
+ /* The actual ZA data immediately follows. */
29
+};
30
+
31
+#define TARGET_ZA_SIG_REGS_OFFSET \
32
+ QEMU_ALIGN_UP(sizeof(struct target_za_context), TARGET_SVE_VQ_BYTES)
33
+#define TARGET_ZA_SIG_ZAV_OFFSET(VQ, N) \
34
+ (TARGET_ZA_SIG_REGS_OFFSET + (VQ) * TARGET_SVE_VQ_BYTES * (N))
35
+#define TARGET_ZA_SIG_CONTEXT_SIZE(VQ) \
36
+ TARGET_ZA_SIG_ZAV_OFFSET(VQ, VQ * TARGET_SVE_VQ_BYTES)
37
+
38
struct target_rt_sigframe {
39
struct target_siginfo info;
40
struct target_ucontext uc;
41
@@ -XXX,XX +XXX,XX @@ static void target_setup_end_record(struct target_aarch64_ctx *end)
42
}
43
44
static void target_setup_sve_record(struct target_sve_context *sve,
45
- CPUARMState *env, int vq, int size)
46
+ CPUARMState *env, int size)
47
{
48
- int i, j;
49
+ int i, j, vq = sve_vq(env);
50
51
memset(sve, 0, sizeof(*sve));
52
__put_user(TARGET_SVE_MAGIC, &sve->head.magic);
53
@@ -XXX,XX +XXX,XX @@ static void target_setup_sve_record(struct target_sve_context *sve,
54
}
29
}
55
}
30
56
31
@@ -XXX,XX +XXX,XX @@ void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, target_ulong addr,
57
+static void target_setup_za_record(struct target_za_context *za,
32
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
58
+ CPUARMState *env, int size)
33
59
+{
34
/* Perform gross MTE suppression early. */
60
+ int vq = sme_vq(env);
35
- if (!tbi_check(desc, bit55) ||
61
+ int vl = vq * TARGET_SVE_VQ_BYTES;
36
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
62
+ int i, j;
37
+ if (!tbi_check(mtedesc, bit55) ||
63
+
38
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
64
+ memset(za, 0, sizeof(*za));
39
mtedesc = 0;
65
+ __put_user(TARGET_ZA_MAGIC, &za->head.magic);
66
+ __put_user(size, &za->head.size);
67
+ __put_user(vl, &za->vl);
68
+
69
+ if (size == TARGET_ZA_SIG_CONTEXT_SIZE(0)) {
70
+ return;
71
+ }
72
+ assert(size == TARGET_ZA_SIG_CONTEXT_SIZE(vq));
73
+
74
+ /*
75
+ * Note that ZA vectors are stored as a byte stream,
76
+ * with each byte element at a subsequent address.
77
+ */
78
+ for (i = 0; i < vl; ++i) {
79
+ uint64_t *z = (void *)za + TARGET_ZA_SIG_ZAV_OFFSET(vq, i);
80
+ for (j = 0; j < vq * 2; ++j) {
81
+ __put_user_e(env->zarray[i].d[j], z + j, le);
82
+ }
83
+ }
84
+}
85
+
86
static void target_restore_general_frame(CPUARMState *env,
87
struct target_rt_sigframe *sf)
88
{
89
@@ -XXX,XX +XXX,XX @@ static void target_restore_fpsimd_record(CPUARMState *env,
90
91
static bool target_restore_sve_record(CPUARMState *env,
92
struct target_sve_context *sve,
93
- int size)
94
+ int size, int *svcr)
95
{
96
- int i, j, vl, vq;
97
+ int i, j, vl, vq, flags;
98
+ bool sm;
99
100
- if (!cpu_isar_feature(aa64_sve, env_archcpu(env))) {
101
+ __get_user(vl, &sve->vl);
102
+ __get_user(flags, &sve->flags);
103
+
104
+ sm = flags & TARGET_SVE_SIG_FLAG_SM;
105
+
106
+ /* The cpu must support Streaming or Non-streaming SVE. */
107
+ if (sm
108
+ ? !cpu_isar_feature(aa64_sme, env_archcpu(env))
109
+ : !cpu_isar_feature(aa64_sve, env_archcpu(env))) {
110
return false;
111
}
40
}
112
41
113
- __get_user(vl, &sve->vl);
42
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
114
- vq = sve_vq(env);
43
index XXXXXXX..XXXXXXX 100644
115
+ /*
44
--- a/target/arm/tcg/sve_helper.c
116
+ * Note that we cannot use sve_vq() because that depends on the
45
+++ b/target/arm/tcg/sve_helper.c
117
+ * current setting of PSTATE.SM, not the state to be restored.
46
@@ -XXX,XX +XXX,XX @@ void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
118
+ */
47
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
119
+ vq = sve_vqm1_for_el_sm(env, 0, sm) + 1;
48
120
49
/* Perform gross MTE suppression early. */
121
/* Reject mismatched VL. */
50
- if (!tbi_check(desc, bit55) ||
122
if (vl != vq * TARGET_SVE_VQ_BYTES) {
51
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
123
@@ -XXX,XX +XXX,XX @@ static bool target_restore_sve_record(CPUARMState *env,
52
+ if (!tbi_check(mtedesc, bit55) ||
124
return false;
53
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
54
mtedesc = 0;
125
}
55
}
126
56
127
+ *svcr = FIELD_DP64(*svcr, SVCR, SM, sm);
57
@@ -XXX,XX +XXX,XX @@ void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, target_ulong addr,
128
+
58
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
129
/*
59
130
* Note that SVE regs are stored as a byte stream, with each byte element
60
/* Perform gross MTE suppression early. */
131
* at a subsequent address. This corresponds to a little-endian load
61
- if (!tbi_check(desc, bit55) ||
132
@@ -XXX,XX +XXX,XX @@ static bool target_restore_sve_record(CPUARMState *env,
62
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
133
return true;
63
+ if (!tbi_check(mtedesc, bit55) ||
134
}
64
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
135
65
mtedesc = 0;
136
+static bool target_restore_za_record(CPUARMState *env,
137
+ struct target_za_context *za,
138
+ int size, int *svcr)
139
+{
140
+ int i, j, vl, vq;
141
+
142
+ if (!cpu_isar_feature(aa64_sme, env_archcpu(env))) {
143
+ return false;
144
+ }
145
+
146
+ __get_user(vl, &za->vl);
147
+ vq = sme_vq(env);
148
+
149
+ /* Reject mismatched VL. */
150
+ if (vl != vq * TARGET_SVE_VQ_BYTES) {
151
+ return false;
152
+ }
153
+
154
+ /* Accept empty record -- used to clear PSTATE.ZA. */
155
+ if (size <= TARGET_ZA_SIG_CONTEXT_SIZE(0)) {
156
+ return true;
157
+ }
158
+
159
+ /* Reject non-empty but incomplete record. */
160
+ if (size < TARGET_ZA_SIG_CONTEXT_SIZE(vq)) {
161
+ return false;
162
+ }
163
+
164
+ *svcr = FIELD_DP64(*svcr, SVCR, ZA, 1);
165
+
166
+ for (i = 0; i < vl; ++i) {
167
+ uint64_t *z = (void *)za + TARGET_ZA_SIG_ZAV_OFFSET(vq, i);
168
+ for (j = 0; j < vq * 2; ++j) {
169
+ __get_user_e(env->zarray[i].d[j], z + j, le);
170
+ }
171
+ }
172
+ return true;
173
+}
174
+
175
static int target_restore_sigframe(CPUARMState *env,
176
struct target_rt_sigframe *sf)
177
{
178
struct target_aarch64_ctx *ctx, *extra = NULL;
179
struct target_fpsimd_context *fpsimd = NULL;
180
struct target_sve_context *sve = NULL;
181
+ struct target_za_context *za = NULL;
182
uint64_t extra_datap = 0;
183
bool used_extra = false;
184
int sve_size = 0;
185
+ int za_size = 0;
186
+ int svcr = 0;
187
188
target_restore_general_frame(env, sf);
189
190
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
191
sve_size = size;
192
break;
193
194
+ case TARGET_ZA_MAGIC:
195
+ if (za || size < sizeof(struct target_za_context)) {
196
+ goto err;
197
+ }
198
+ za = (struct target_za_context *)ctx;
199
+ za_size = size;
200
+ break;
201
+
202
case TARGET_EXTRA_MAGIC:
203
if (extra || size != sizeof(struct target_extra_context)) {
204
goto err;
205
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
206
}
66
}
207
67
208
/* SVE data, if present, overwrites FPSIMD data. */
68
@@ -XXX,XX +XXX,XX @@ void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
209
- if (sve && !target_restore_sve_record(env, sve, sve_size)) {
69
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
210
+ if (sve && !target_restore_sve_record(env, sve, sve_size, &svcr)) {
70
211
goto err;
71
/* Perform gross MTE suppression early. */
72
- if (!tbi_check(desc, bit55) ||
73
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
74
+ if (!tbi_check(mtedesc, bit55) ||
75
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
76
mtedesc = 0;
212
}
77
}
213
+ if (za && !target_restore_za_record(env, za, za_size, &svcr)) {
78
214
+ goto err;
215
+ }
216
+ if (env->svcr != svcr) {
217
+ env->svcr = svcr;
218
+ arm_rebuild_hflags(env);
219
+ }
220
unlock_user(extra, extra_datap, 0);
221
return 0;
222
223
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
224
.total_size = offsetof(struct target_rt_sigframe,
225
uc.tuc_mcontext.__reserved),
226
};
227
- int fpsimd_ofs, fr_ofs, sve_ofs = 0, vq = 0, sve_size = 0;
228
+ int fpsimd_ofs, fr_ofs, sve_ofs = 0, za_ofs = 0;
229
+ int sve_size = 0, za_size = 0;
230
struct target_rt_sigframe *frame;
231
struct target_rt_frame_record *fr;
232
abi_ulong frame_addr, return_addr;
233
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
234
&layout);
235
236
/* SVE state needs saving only if it exists. */
237
- if (cpu_isar_feature(aa64_sve, env_archcpu(env))) {
238
- vq = sve_vq(env);
239
- sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
240
+ if (cpu_isar_feature(aa64_sve, env_archcpu(env)) ||
241
+ cpu_isar_feature(aa64_sme, env_archcpu(env))) {
242
+ sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(sve_vq(env)), 16);
243
sve_ofs = alloc_sigframe_space(sve_size, &layout);
244
}
245
+ if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
246
+ /* ZA state needs saving only if it is enabled. */
247
+ if (FIELD_EX64(env->svcr, SVCR, ZA)) {
248
+ za_size = TARGET_ZA_SIG_CONTEXT_SIZE(sme_vq(env));
249
+ } else {
250
+ za_size = TARGET_ZA_SIG_CONTEXT_SIZE(0);
251
+ }
252
+ za_ofs = alloc_sigframe_space(za_size, &layout);
253
+ }
254
255
if (layout.extra_ofs) {
256
/* Reserve space for the extra end marker. The standard end marker
257
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
258
target_setup_end_record((void *)frame + layout.extra_end_ofs);
259
}
260
if (sve_ofs) {
261
- target_setup_sve_record((void *)frame + sve_ofs, env, vq, sve_size);
262
+ target_setup_sve_record((void *)frame + sve_ofs, env, sve_size);
263
+ }
264
+ if (za_ofs) {
265
+ target_setup_za_record((void *)frame + za_ofs, env, za_size);
266
}
267
268
/* Set up the stack frame for unwinding. */
269
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
270
env->btype = 2;
271
}
272
273
+ /*
274
+ * Invoke the signal handler with both SM and ZA disabled.
275
+ * When clearing SM, ResetSVEState, per SMSTOP.
276
+ */
277
+ if (FIELD_EX64(env->svcr, SVCR, SM)) {
278
+ arm_reset_sve_state(env);
279
+ }
280
+ if (env->svcr) {
281
+ env->svcr = 0;
282
+ arm_rebuild_hflags(env);
283
+ }
284
+
285
if (info) {
286
tswap_siginfo(&frame->info, info);
287
env->xregs[1] = frame_addr + offsetof(struct target_rt_sigframe, info);
288
--
79
--
289
2.25.1
80
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The raven_io_ops MemoryRegionOps is the only one in the source tree
2
which sets .valid.unaligned to indicate that it should support
3
unaligned accesses and which does not also set .impl.unaligned to
4
indicate that its read and write functions can do the unaligned
5
handling themselves. This is a problem, because at the moment the
6
core memory system does not implement the support for handling
7
unaligned accesses by doing a series of aligned accesses and
8
combining them (system/memory.c:access_with_adjusted_size() has a
9
TODO comment noting this).
2
10
3
Note that SME remains effectively disabled for user-only,
11
Fortunately raven_io_read() and raven_io_write() will correctly deal
4
because we do not yet set CPACR_EL1.SMEN. This needs to
12
with the case of being passed an unaligned address, so we can fix the
5
wait until the kernel ABI is implemented.
13
missing unaligned access support by setting .impl.unaligned in the
14
MemoryRegionOps struct.
6
15
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
16
Fixes: 9a1839164c9c8f06 ("raven: Implement non-contiguous I/O region")
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20220708151540.18136-33-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Tested-by: Cédric Le Goater <clg@redhat.com>
19
Reviewed-by: Cédric Le Goater <clg@redhat.com>
20
Message-id: 20240112134640.1775041-1-peter.maydell@linaro.org
11
---
21
---
12
docs/system/arm/emulation.rst | 4 ++++
22
hw/pci-host/raven.c | 1 +
13
target/arm/cpu64.c | 11 +++++++++++
23
1 file changed, 1 insertion(+)
14
2 files changed, 15 insertions(+)
15
24
16
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
25
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
17
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
18
--- a/docs/system/arm/emulation.rst
27
--- a/hw/pci-host/raven.c
19
+++ b/docs/system/arm/emulation.rst
28
+++ b/hw/pci-host/raven.c
20
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
29
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps raven_io_ops = {
21
- FEAT_SHA512 (Advanced SIMD SHA512 instructions)
30
.write = raven_io_write,
22
- FEAT_SM3 (Advanced SIMD SM3 instructions)
31
.endianness = DEVICE_LITTLE_ENDIAN,
23
- FEAT_SM4 (Advanced SIMD SM4 instructions)
32
.impl.max_access_size = 4,
24
+- FEAT_SME (Scalable Matrix Extension)
33
+ .impl.unaligned = true,
25
+- FEAT_SME_FA64 (Full A64 instruction set in Streaming SVE mode)
34
.valid.unaligned = true,
26
+- FEAT_SME_F64F64 (Double-precision floating-point outer product instructions)
35
};
27
+- FEAT_SME_I16I64 (16-bit to 64-bit integer widening outer product instructions)
28
- FEAT_SPECRES (Speculation restriction instructions)
29
- FEAT_SSBS (Speculative Store Bypass Safe)
30
- FEAT_TLBIOS (TLB invalidate instructions in Outer Shareable domain)
31
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/cpu64.c
34
+++ b/target/arm/cpu64.c
35
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
36
*/
37
t = FIELD_DP64(t, ID_AA64PFR1, MTE, 3); /* FEAT_MTE3 */
38
t = FIELD_DP64(t, ID_AA64PFR1, RAS_FRAC, 0); /* FEAT_RASv1p1 + FEAT_DoubleFault */
39
+ t = FIELD_DP64(t, ID_AA64PFR1, SME, 1); /* FEAT_SME */
40
t = FIELD_DP64(t, ID_AA64PFR1, CSV2_FRAC, 0); /* FEAT_CSV2_2 */
41
cpu->isar.id_aa64pfr1 = t;
42
43
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
44
t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* FEAT_PMUv3p4 */
45
cpu->isar.id_aa64dfr0 = t;
46
47
+ t = cpu->isar.id_aa64smfr0;
48
+ t = FIELD_DP64(t, ID_AA64SMFR0, F32F32, 1); /* FEAT_SME */
49
+ t = FIELD_DP64(t, ID_AA64SMFR0, B16F32, 1); /* FEAT_SME */
50
+ t = FIELD_DP64(t, ID_AA64SMFR0, F16F32, 1); /* FEAT_SME */
51
+ t = FIELD_DP64(t, ID_AA64SMFR0, I8I32, 0xf); /* FEAT_SME */
52
+ t = FIELD_DP64(t, ID_AA64SMFR0, F64F64, 1); /* FEAT_SME_F64F64 */
53
+ t = FIELD_DP64(t, ID_AA64SMFR0, I16I64, 0xf); /* FEAT_SME_I16I64 */
54
+ t = FIELD_DP64(t, ID_AA64SMFR0, FA64, 1); /* FEAT_SME_FA64 */
55
+ cpu->isar.id_aa64smfr0 = t;
56
+
57
/* Replicate the same data to the 32-bit id registers. */
58
aa32_max_features(cpu);
59
36
60
--
37
--
61
2.25.1
38
2.34.1
39
40
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Suppress the deprecation warning when we're running under qtest,
2
to avoid "make check" including warning messages in its output.
2
3
3
There's no reason to set CPACR_EL1.ZEN if SVE disabled.
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Message-id: 20240206154151.155620-1-peter.maydell@linaro.org
7
---
8
hw/block/tc58128.c | 4 +++-
9
1 file changed, 3 insertions(+), 1 deletion(-)
4
10
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
diff --git a/hw/block/tc58128.c b/hw/block/tc58128.c
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
index XXXXXXX..XXXXXXX 100644
7
Message-id: 20220708151540.18136-44-richard.henderson@linaro.org
13
--- a/hw/block/tc58128.c
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
+++ b/hw/block/tc58128.c
9
---
15
@@ -XXX,XX +XXX,XX @@ static sh7750_io_device tc58128 = {
10
target/arm/cpu.c | 7 +++----
16
11
1 file changed, 3 insertions(+), 4 deletions(-)
17
int tc58128_init(struct SH7750State *s, const char *zone1, const char *zone2)
18
{
19
- warn_report_once("The TC58128 flash device is deprecated");
20
+ if (!qtest_enabled()) {
21
+ warn_report_once("The TC58128 flash device is deprecated");
22
+ }
23
init_dev(&tc58128_devs[0], zone1);
24
init_dev(&tc58128_devs[1], zone2);
25
return sh7750_register_io_device(s, &tc58128);
26
--
27
2.34.1
12
28
13
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
29
14
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/cpu.c
16
+++ b/target/arm/cpu.c
17
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
18
/* and to the FP/Neon instructions */
19
env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
20
CPACR_EL1, FPEN, 3);
21
- /* and to the SVE instructions */
22
- env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
23
- CPACR_EL1, ZEN, 3);
24
- /* with reasonable vector length */
25
+ /* and to the SVE instructions, with default vector length */
26
if (cpu_isar_feature(aa64_sve, cpu)) {
27
+ env->cp15.cpacr_el1 = FIELD_DP64(env->cp15.cpacr_el1,
28
+ CPACR_EL1, ZEN, 3);
29
env->vfp.zcr_el[1] = cpu->sve_default_vq - 1;
30
}
31
/*
32
--
33
2.25.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
We deliberately don't include qtests_npcm7xx in qtests_aarch64,
2
because we already get the coverage of those tests via qtests_arm,
3
and we don't want to use extra CI minutes testing them twice.
2
4
3
Add "sve" to the sve prctl functions, to distinguish
5
In commit 327b680877b79c4b we added it to qtests_aarch64; revert
4
them from the coming "sme" prctls with similar names.
6
that change.
5
7
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Fixes: 327b680877b79c4b ("tests/qtest: Creating qtest for GMAC Module")
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-42-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Message-id: 20240206163043.315535-1-peter.maydell@linaro.org
10
---
12
---
11
linux-user/aarch64/target_prctl.h | 8 ++++----
13
tests/qtest/meson.build | 1 -
12
linux-user/syscall.c | 12 ++++++------
14
1 file changed, 1 deletion(-)
13
2 files changed, 10 insertions(+), 10 deletions(-)
14
15
15
diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
16
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/linux-user/aarch64/target_prctl.h
18
--- a/tests/qtest/meson.build
18
+++ b/linux-user/aarch64/target_prctl.h
19
+++ b/tests/qtest/meson.build
19
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ qtests_aarch64 = \
20
#ifndef AARCH64_TARGET_PRCTL_H
21
(config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) + \
21
#define AARCH64_TARGET_PRCTL_H
22
(config_all_accel.has_key('CONFIG_TCG') and \
22
23
config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : []) + \
23
-static abi_long do_prctl_get_vl(CPUArchState *env)
24
- (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \
24
+static abi_long do_prctl_sve_get_vl(CPUArchState *env)
25
['arm-cpu-features',
25
{
26
'numa-test',
26
ARMCPU *cpu = env_archcpu(env);
27
'boot-serial-test',
27
if (cpu_isar_feature(aa64_sve, cpu)) {
28
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_get_vl(CPUArchState *env)
29
}
30
return -TARGET_EINVAL;
31
}
32
-#define do_prctl_get_vl do_prctl_get_vl
33
+#define do_prctl_sve_get_vl do_prctl_sve_get_vl
34
35
-static abi_long do_prctl_set_vl(CPUArchState *env, abi_long arg2)
36
+static abi_long do_prctl_sve_set_vl(CPUArchState *env, abi_long arg2)
37
{
38
/*
39
* We cannot support either PR_SVE_SET_VL_ONEXEC or PR_SVE_VL_INHERIT.
40
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_set_vl(CPUArchState *env, abi_long arg2)
41
}
42
return -TARGET_EINVAL;
43
}
44
-#define do_prctl_set_vl do_prctl_set_vl
45
+#define do_prctl_sve_set_vl do_prctl_sve_set_vl
46
47
static abi_long do_prctl_reset_keys(CPUArchState *env, abi_long arg2)
48
{
49
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
50
index XXXXXXX..XXXXXXX 100644
51
--- a/linux-user/syscall.c
52
+++ b/linux-user/syscall.c
53
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl_inval1(CPUArchState *env, abi_long arg2)
54
#ifndef do_prctl_set_fp_mode
55
#define do_prctl_set_fp_mode do_prctl_inval1
56
#endif
57
-#ifndef do_prctl_get_vl
58
-#define do_prctl_get_vl do_prctl_inval0
59
+#ifndef do_prctl_sve_get_vl
60
+#define do_prctl_sve_get_vl do_prctl_inval0
61
#endif
62
-#ifndef do_prctl_set_vl
63
-#define do_prctl_set_vl do_prctl_inval1
64
+#ifndef do_prctl_sve_set_vl
65
+#define do_prctl_sve_set_vl do_prctl_inval1
66
#endif
67
#ifndef do_prctl_reset_keys
68
#define do_prctl_reset_keys do_prctl_inval1
69
@@ -XXX,XX +XXX,XX @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
70
case PR_SET_FP_MODE:
71
return do_prctl_set_fp_mode(env, arg2);
72
case PR_SVE_GET_VL:
73
- return do_prctl_get_vl(env);
74
+ return do_prctl_sve_get_vl(env);
75
case PR_SVE_SET_VL:
76
- return do_prctl_set_vl(env, arg2);
77
+ return do_prctl_sve_set_vl(env, arg2);
78
case PR_PAC_RESET_KEYS:
79
if (arg3 || arg4 || arg5) {
80
return -TARGET_EINVAL;
81
--
28
--
82
2.25.1
29
2.34.1
30
31
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Allow changes to the virt GTDT -- we are going to add the IRQ
2
entry for a new timer to it.
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20220708151540.18136-39-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
6
Message-id: 20240122143537.233498-2-peter.maydell@linaro.org
7
---
7
---
8
linux-user/aarch64/signal.c | 3 +++
8
tests/qtest/bios-tables-test-allowed-diff.h | 2 ++
9
1 file changed, 3 insertions(+)
9
1 file changed, 2 insertions(+)
10
10
11
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
11
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
12
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
13
--- a/linux-user/aarch64/signal.c
13
--- a/tests/qtest/bios-tables-test-allowed-diff.h
14
+++ b/linux-user/aarch64/signal.c
14
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
15
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
15
@@ -1 +1,3 @@
16
__get_user(extra_size,
16
/* List of comma-separated changed AML files to ignore */
17
&((struct target_extra_context *)ctx)->size);
17
+"tests/data/acpi/virt/FACP",
18
extra = lock_user(VERIFY_READ, extra_datap, extra_size, 0);
18
+"tests/data/acpi/virt/GTDT",
19
+ if (!extra) {
20
+ return 1;
21
+ }
22
break;
23
24
default:
25
--
19
--
26
2.25.1
20
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Armv8.1+ CPUs have the Virtual Host Extension (VHE) which adds a
2
2
non-secure EL2 virtual timer. We implemented the timer itself in the
3
This is an SVE instruction that operates using the SVE vector
3
CPU model, but never wired up its IRQ line to the GIC.
4
length but that it is present only if SME is implemented.
4
5
5
Wire up the IRQ line (this is always safe whether the CPU has the
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
interrupt or not, since it always creates the outbound IRQ line).
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Report it to the guest via dtb and ACPI if the CPU has the feature.
8
Message-id: 20220708151540.18136-31-richard.henderson@linaro.org
8
9
The DTB binding is documented in the kernel's
10
Documentation/devicetree/bindings/timer/arm\,arch_timer.yaml
11
and the ACPI table entries are documented in the ACPI specification
12
version 6.3 or later.
13
14
Because the IRQ line ACPI binding is new in 6.3, we need to bump the
15
FADT table rev to show that we might be using 6.3 features.
16
17
Note that exposing this IRQ in the DTB will trigger a bug in EDK2
18
versions prior to edk2-stable202311, for users who use the virt board
19
with 'virtualization=on' to enable EL2 emulation and are booting an
20
EDK2 guest BIOS, if that EDK2 has assertions enabled. The effect is
21
that EDK2 will assert on bootup:
22
23
ASSERT [ArmTimerDxe] /home/kraxel/projects/qemu/roms/edk2/ArmVirtPkg/Library/ArmVirtTimerFdtClientLib/ArmVirtTimerFdtClientLib.c(72): PropSize == 36 || PropSize == 48
24
25
If you see that assertion you should do one of:
26
* update your EDK2 binaries to edk2-stable202311 or newer
27
* use the 'virt-8.2' versioned machine type
28
* not use 'virtualization=on'
29
30
(The versions shipped with QEMU itself have the fix.)
31
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
32
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
33
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
34
Message-id: 20240122143537.233498-3-peter.maydell@linaro.org
10
---
35
---
11
target/arm/helper.h | 18 +++++++
36
include/hw/arm/virt.h | 2 ++
12
target/arm/sve.decode | 5 ++
37
hw/arm/virt-acpi-build.c | 20 ++++++++++----
13
target/arm/translate-sve.c | 102 +++++++++++++++++++++++++++++++++++++
38
hw/arm/virt.c | 60 ++++++++++++++++++++++++++++++++++------
14
target/arm/vec_helper.c | 24 +++++++++
39
3 files changed, 67 insertions(+), 15 deletions(-)
15
4 files changed, 149 insertions(+)
40
16
41
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
18
index XXXXXXX..XXXXXXX 100644
42
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper.h
43
--- a/include/hw/arm/virt.h
20
+++ b/target/arm/helper.h
44
+++ b/include/hw/arm/virt.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG,
45
@@ -XXX,XX +XXX,XX @@ struct VirtMachineClass {
22
DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG,
46
/* Machines < 6.2 have no support for describing cpu topology to guest */
23
void, ptr, ptr, ptr, ptr, ptr, i32)
47
bool no_cpu_topology;
24
48
bool no_tcg_lpa2;
25
+DEF_HELPER_FLAGS_5(gvec_sclamp_b, TCG_CALL_NO_RWG,
49
+ bool no_ns_el2_virt_timer_irq;
26
+ void, ptr, ptr, ptr, ptr, i32)
50
};
27
+DEF_HELPER_FLAGS_5(gvec_sclamp_h, TCG_CALL_NO_RWG,
51
28
+ void, ptr, ptr, ptr, ptr, i32)
52
struct VirtMachineState {
29
+DEF_HELPER_FLAGS_5(gvec_sclamp_s, TCG_CALL_NO_RWG,
53
@@ -XXX,XX +XXX,XX @@ struct VirtMachineState {
30
+ void, ptr, ptr, ptr, ptr, i32)
54
PCIBus *bus;
31
+DEF_HELPER_FLAGS_5(gvec_sclamp_d, TCG_CALL_NO_RWG,
55
char *oem_id;
32
+ void, ptr, ptr, ptr, ptr, i32)
56
char *oem_table_id;
33
+
57
+ bool ns_el2_virt_timer_irq;
34
+DEF_HELPER_FLAGS_5(gvec_uclamp_b, TCG_CALL_NO_RWG,
58
};
35
+ void, ptr, ptr, ptr, ptr, i32)
59
36
+DEF_HELPER_FLAGS_5(gvec_uclamp_h, TCG_CALL_NO_RWG,
60
#define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
37
+ void, ptr, ptr, ptr, ptr, i32)
61
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
38
+DEF_HELPER_FLAGS_5(gvec_uclamp_s, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_5(gvec_uclamp_d, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, ptr, i32)
42
+
43
#ifdef TARGET_AARCH64
44
#include "helper-a64.h"
45
#include "helper-sve.h"
46
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
47
index XXXXXXX..XXXXXXX 100644
62
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/sve.decode
63
--- a/hw/arm/virt-acpi-build.c
49
+++ b/target/arm/sve.decode
64
+++ b/hw/arm/virt-acpi-build.c
50
@@ -XXX,XX +XXX,XX @@ PSEL 00100101 .. 1 100 .. 01 .... 0 .... 0 .... \
65
@@ -XXX,XX +XXX,XX @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
51
@psel esz=2 imm=%psel_imm_s
66
}
52
PSEL 00100101 .1 1 000 .. 01 .... 0 .... 0 .... \
67
53
@psel esz=3 imm=%psel_imm_d
68
/*
54
+
69
- * ACPI spec, Revision 5.1
55
+### SVE clamp
70
- * 5.2.24 Generic Timer Description Table (GTDT)
56
+
71
+ * ACPI spec, Revision 6.5
57
+SCLAMP 01000100 .. 0 ..... 110000 ..... ..... @rda_rn_rm
72
+ * 5.2.25 Generic Timer Description Table (GTDT)
58
+UCLAMP 01000100 .. 0 ..... 110001 ..... ..... @rda_rn_rm
73
*/
59
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
74
static void
75
build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
76
@@ -XXX,XX +XXX,XX @@ build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
77
uint32_t irqflags = vmc->claim_edge_triggered_timers ?
78
1 : /* Interrupt is Edge triggered */
79
0; /* Interrupt is Level triggered */
80
- AcpiTable table = { .sig = "GTDT", .rev = 2, .oem_id = vms->oem_id,
81
+ AcpiTable table = { .sig = "GTDT", .rev = 3, .oem_id = vms->oem_id,
82
.oem_table_id = vms->oem_table_id };
83
84
acpi_table_begin(&table, table_data);
85
@@ -XXX,XX +XXX,XX @@ build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
86
build_append_int_noprefix(table_data, 0, 4);
87
/* Platform Timer Offset */
88
build_append_int_noprefix(table_data, 0, 4);
89
-
90
+ if (vms->ns_el2_virt_timer_irq) {
91
+ /* Virtual EL2 Timer GSIV */
92
+ build_append_int_noprefix(table_data, ARCH_TIMER_NS_EL2_VIRT_IRQ, 4);
93
+ /* Virtual EL2 Timer Flags */
94
+ build_append_int_noprefix(table_data, irqflags, 4);
95
+ } else {
96
+ build_append_int_noprefix(table_data, 0, 4);
97
+ build_append_int_noprefix(table_data, 0, 4);
98
+ }
99
acpi_table_end(linker, &table);
100
}
101
102
@@ -XXX,XX +XXX,XX @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
103
static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
104
VirtMachineState *vms, unsigned dsdt_tbl_offset)
105
{
106
- /* ACPI v6.0 */
107
+ /* ACPI v6.3 */
108
AcpiFadtData fadt = {
109
.rev = 6,
110
- .minor_ver = 0,
111
+ .minor_ver = 3,
112
.flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
113
.xdsdt_tbl_offset = &dsdt_tbl_offset,
114
};
115
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
60
index XXXXXXX..XXXXXXX 100644
116
index XXXXXXX..XXXXXXX 100644
61
--- a/target/arm/translate-sve.c
117
--- a/hw/arm/virt.c
62
+++ b/target/arm/translate-sve.c
118
+++ b/hw/arm/virt.c
63
@@ -XXX,XX +XXX,XX @@ static bool trans_PSEL(DisasContext *s, arg_psel *a)
119
@@ -XXX,XX +XXX,XX @@ static void create_randomness(MachineState *ms, const char *node)
64
tcg_temp_free_ptr(ptr);
120
qemu_fdt_setprop(ms->fdt, node, "rng-seed", seed.rng, sizeof(seed.rng));
65
return true;
121
}
66
}
122
67
+
123
+/*
68
+static void gen_sclamp_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_i32 a)
124
+ * The CPU object always exposes the NS EL2 virt timer IRQ line,
125
+ * but we don't want to advertise it to the guest in the dtb or ACPI
126
+ * table unless it's really going to do something.
127
+ */
128
+static bool ns_el2_virt_timer_present(void)
69
+{
129
+{
70
+ tcg_gen_smax_i32(d, a, n);
130
+ ARMCPU *cpu = ARM_CPU(qemu_get_cpu(0));
71
+ tcg_gen_smin_i32(d, d, m);
131
+ CPUARMState *env = &cpu->env;
132
+
133
+ return arm_feature(env, ARM_FEATURE_AARCH64) &&
134
+ arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu);
72
+}
135
+}
73
+
136
+
74
+static void gen_sclamp_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 a)
137
static void create_fdt(VirtMachineState *vms)
75
+{
138
{
76
+ tcg_gen_smax_i64(d, a, n);
139
MachineState *ms = MACHINE(vms);
77
+ tcg_gen_smin_i64(d, d, m);
140
@@ -XXX,XX +XXX,XX @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
78
+}
141
"arm,armv7-timer");
79
+
80
+static void gen_sclamp_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
81
+ TCGv_vec m, TCGv_vec a)
82
+{
83
+ tcg_gen_smax_vec(vece, d, a, n);
84
+ tcg_gen_smin_vec(vece, d, d, m);
85
+}
86
+
87
+static void gen_sclamp(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
88
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
89
+{
90
+ static const TCGOpcode vecop[] = {
91
+ INDEX_op_smin_vec, INDEX_op_smax_vec, 0
92
+ };
93
+ static const GVecGen4 ops[4] = {
94
+ { .fniv = gen_sclamp_vec,
95
+ .fno = gen_helper_gvec_sclamp_b,
96
+ .opt_opc = vecop,
97
+ .vece = MO_8 },
98
+ { .fniv = gen_sclamp_vec,
99
+ .fno = gen_helper_gvec_sclamp_h,
100
+ .opt_opc = vecop,
101
+ .vece = MO_16 },
102
+ { .fni4 = gen_sclamp_i32,
103
+ .fniv = gen_sclamp_vec,
104
+ .fno = gen_helper_gvec_sclamp_s,
105
+ .opt_opc = vecop,
106
+ .vece = MO_32 },
107
+ { .fni8 = gen_sclamp_i64,
108
+ .fniv = gen_sclamp_vec,
109
+ .fno = gen_helper_gvec_sclamp_d,
110
+ .opt_opc = vecop,
111
+ .vece = MO_64,
112
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64 }
113
+ };
114
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &ops[vece]);
115
+}
116
+
117
+TRANS_FEAT(SCLAMP, aa64_sme, gen_gvec_fn_arg_zzzz, gen_sclamp, a)
118
+
119
+static void gen_uclamp_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_i32 a)
120
+{
121
+ tcg_gen_umax_i32(d, a, n);
122
+ tcg_gen_umin_i32(d, d, m);
123
+}
124
+
125
+static void gen_uclamp_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 a)
126
+{
127
+ tcg_gen_umax_i64(d, a, n);
128
+ tcg_gen_umin_i64(d, d, m);
129
+}
130
+
131
+static void gen_uclamp_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
132
+ TCGv_vec m, TCGv_vec a)
133
+{
134
+ tcg_gen_umax_vec(vece, d, a, n);
135
+ tcg_gen_umin_vec(vece, d, d, m);
136
+}
137
+
138
+static void gen_uclamp(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
139
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
140
+{
141
+ static const TCGOpcode vecop[] = {
142
+ INDEX_op_umin_vec, INDEX_op_umax_vec, 0
143
+ };
144
+ static const GVecGen4 ops[4] = {
145
+ { .fniv = gen_uclamp_vec,
146
+ .fno = gen_helper_gvec_uclamp_b,
147
+ .opt_opc = vecop,
148
+ .vece = MO_8 },
149
+ { .fniv = gen_uclamp_vec,
150
+ .fno = gen_helper_gvec_uclamp_h,
151
+ .opt_opc = vecop,
152
+ .vece = MO_16 },
153
+ { .fni4 = gen_uclamp_i32,
154
+ .fniv = gen_uclamp_vec,
155
+ .fno = gen_helper_gvec_uclamp_s,
156
+ .opt_opc = vecop,
157
+ .vece = MO_32 },
158
+ { .fni8 = gen_uclamp_i64,
159
+ .fniv = gen_uclamp_vec,
160
+ .fno = gen_helper_gvec_uclamp_d,
161
+ .opt_opc = vecop,
162
+ .vece = MO_64,
163
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64 }
164
+ };
165
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &ops[vece]);
166
+}
167
+
168
+TRANS_FEAT(UCLAMP, aa64_sme, gen_gvec_fn_arg_zzzz, gen_uclamp, a)
169
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
170
index XXXXXXX..XXXXXXX 100644
171
--- a/target/arm/vec_helper.c
172
+++ b/target/arm/vec_helper.c
173
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_bfmlal_idx)(void *vd, void *vn, void *vm,
174
}
142
}
175
clear_tail(d, opr_sz, simd_maxsz(desc));
143
qemu_fdt_setprop(ms->fdt, "/timer", "always-on", NULL, 0);
176
}
144
- qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
177
+
145
- GIC_FDT_IRQ_TYPE_PPI,
178
+#define DO_CLAMP(NAME, TYPE) \
146
- INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
179
+void HELPER(NAME)(void *d, void *n, void *m, void *a, uint32_t desc) \
147
- GIC_FDT_IRQ_TYPE_PPI,
180
+{ \
148
- INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
181
+ intptr_t i, opr_sz = simd_oprsz(desc); \
149
- GIC_FDT_IRQ_TYPE_PPI,
182
+ for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \
150
- INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
183
+ TYPE aa = *(TYPE *)(a + i); \
151
- GIC_FDT_IRQ_TYPE_PPI,
184
+ TYPE nn = *(TYPE *)(n + i); \
152
- INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags);
185
+ TYPE mm = *(TYPE *)(m + i); \
153
+ if (vms->ns_el2_virt_timer_irq) {
186
+ TYPE dd = MIN(MAX(aa, nn), mm); \
154
+ qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
187
+ *(TYPE *)(d + i) = dd; \
155
+ GIC_FDT_IRQ_TYPE_PPI,
188
+ } \
156
+ INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
189
+ clear_tail(d, opr_sz, simd_maxsz(desc)); \
157
+ GIC_FDT_IRQ_TYPE_PPI,
190
+}
158
+ INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
191
+
159
+ GIC_FDT_IRQ_TYPE_PPI,
192
+DO_CLAMP(gvec_sclamp_b, int8_t)
160
+ INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
193
+DO_CLAMP(gvec_sclamp_h, int16_t)
161
+ GIC_FDT_IRQ_TYPE_PPI,
194
+DO_CLAMP(gvec_sclamp_s, int32_t)
162
+ INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags,
195
+DO_CLAMP(gvec_sclamp_d, int64_t)
163
+ GIC_FDT_IRQ_TYPE_PPI,
196
+
164
+ INTID_TO_PPI(ARCH_TIMER_NS_EL2_VIRT_IRQ), irqflags);
197
+DO_CLAMP(gvec_uclamp_b, uint8_t)
165
+ } else {
198
+DO_CLAMP(gvec_uclamp_h, uint16_t)
166
+ qemu_fdt_setprop_cells(ms->fdt, "/timer", "interrupts",
199
+DO_CLAMP(gvec_uclamp_s, uint32_t)
167
+ GIC_FDT_IRQ_TYPE_PPI,
200
+DO_CLAMP(gvec_uclamp_d, uint64_t)
168
+ INTID_TO_PPI(ARCH_TIMER_S_EL1_IRQ), irqflags,
169
+ GIC_FDT_IRQ_TYPE_PPI,
170
+ INTID_TO_PPI(ARCH_TIMER_NS_EL1_IRQ), irqflags,
171
+ GIC_FDT_IRQ_TYPE_PPI,
172
+ INTID_TO_PPI(ARCH_TIMER_VIRT_IRQ), irqflags,
173
+ GIC_FDT_IRQ_TYPE_PPI,
174
+ INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags);
175
+ }
176
}
177
178
static void fdt_add_cpu_nodes(const VirtMachineState *vms)
179
@@ -XXX,XX +XXX,XX @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
180
[GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
181
[GTIMER_HYP] = ARCH_TIMER_NS_EL2_IRQ,
182
[GTIMER_SEC] = ARCH_TIMER_S_EL1_IRQ,
183
+ [GTIMER_HYPVIRT] = ARCH_TIMER_NS_EL2_VIRT_IRQ,
184
};
185
186
for (unsigned irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
187
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
188
qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
189
object_unref(cpuobj);
190
}
191
+
192
+ /* Now we've created the CPUs we can see if they have the hypvirt timer */
193
+ vms->ns_el2_virt_timer_irq = ns_el2_virt_timer_present() &&
194
+ !vmc->no_ns_el2_virt_timer_irq;
195
+
196
fdt_add_timer_nodes(vms);
197
fdt_add_cpu_nodes(vms);
198
199
@@ -XXX,XX +XXX,XX @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 0)
200
201
static void virt_machine_8_2_options(MachineClass *mc)
202
{
203
+ VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
204
+
205
virt_machine_9_0_options(mc);
206
compat_props_add(mc->compat_props, hw_compat_8_2, hw_compat_8_2_len);
207
+ /*
208
+ * Don't expose NS_EL2_VIRT timer IRQ in DTB on ACPI on 8.2 and
209
+ * earlier machines. (Exposing it tickles a bug in older EDK2
210
+ * guest BIOS binaries.)
211
+ */
212
+ vmc->no_ns_el2_virt_timer_irq = true;
213
}
214
DEFINE_VIRT_MACHINE(8, 2)
215
201
--
216
--
202
2.25.1
217
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Update the virt golden reference files to say that the FACP is ACPI
2
2
v6.3, and the GTDT table is a revision 3 table with space for the
3
In parse_user_sigframe, the kernel rejects duplicate sve records,
3
virtual EL2 timer.
4
or records that are smaller than the header. We were silently
4
5
allowing these cases to pass, dropping the record.
5
Diffs from iasl:
6
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
@@ -XXX,XX +XXX,XX @@
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
/*
9
Message-id: 20220708151540.18136-38-richard.henderson@linaro.org
9
* Intel ACPI Component Architecture
10
* AML/ASL+ Disassembler version 20200925 (64-bit version)
11
* Copyright (c) 2000 - 2020 Intel Corporation
12
*
13
- * Disassembly of tests/data/acpi/virt/FACP, Mon Jan 22 13:48:40 2024
14
+ * Disassembly of /tmp/aml-W8RZH2, Mon Jan 22 13:48:40 2024
15
*
16
* ACPI Data Table [FACP]
17
*
18
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
19
*/
20
21
[000h 0000 4] Signature : "FACP" [Fixed ACPI Description Table (FADT)]
22
[004h 0004 4] Table Length : 00000114
23
[008h 0008 1] Revision : 06
24
-[009h 0009 1] Checksum : 15
25
+[009h 0009 1] Checksum : 12
26
[00Ah 0010 6] Oem ID : "BOCHS "
27
[010h 0016 8] Oem Table ID : "BXPC "
28
[018h 0024 4] Oem Revision : 00000001
29
[01Ch 0028 4] Asl Compiler ID : "BXPC"
30
[020h 0032 4] Asl Compiler Revision : 00000001
31
32
[024h 0036 4] FACS Address : 00000000
33
[028h 0040 4] DSDT Address : 00000000
34
[02Ch 0044 1] Model : 00
35
[02Dh 0045 1] PM Profile : 00 [Unspecified]
36
[02Eh 0046 2] SCI Interrupt : 0000
37
[030h 0048 4] SMI Command Port : 00000000
38
[034h 0052 1] ACPI Enable Value : 00
39
[035h 0053 1] ACPI Disable Value : 00
40
[036h 0054 1] S4BIOS Command : 00
41
[037h 0055 1] P-State Control : 00
42
@@ -XXX,XX +XXX,XX @@
43
Use APIC Physical Destination Mode (V4) : 0
44
Hardware Reduced (V5) : 1
45
Low Power S0 Idle (V5) : 0
46
47
[074h 0116 12] Reset Register : [Generic Address Structure]
48
[074h 0116 1] Space ID : 00 [SystemMemory]
49
[075h 0117 1] Bit Width : 00
50
[076h 0118 1] Bit Offset : 00
51
[077h 0119 1] Encoded Access Width : 00 [Undefined/Legacy]
52
[078h 0120 8] Address : 0000000000000000
53
54
[080h 0128 1] Value to cause reset : 00
55
[081h 0129 2] ARM Flags (decoded below) : 0003
56
PSCI Compliant : 1
57
Must use HVC for PSCI : 1
58
59
-[083h 0131 1] FADT Minor Revision : 00
60
+[083h 0131 1] FADT Minor Revision : 03
61
[084h 0132 8] FACS Address : 0000000000000000
62
[08Ch 0140 8] DSDT Address : 0000000000000000
63
[094h 0148 12] PM1A Event Block : [Generic Address Structure]
64
[094h 0148 1] Space ID : 00 [SystemMemory]
65
[095h 0149 1] Bit Width : 00
66
[096h 0150 1] Bit Offset : 00
67
[097h 0151 1] Encoded Access Width : 00 [Undefined/Legacy]
68
[098h 0152 8] Address : 0000000000000000
69
70
[0A0h 0160 12] PM1B Event Block : [Generic Address Structure]
71
[0A0h 0160 1] Space ID : 00 [SystemMemory]
72
[0A1h 0161 1] Bit Width : 00
73
[0A2h 0162 1] Bit Offset : 00
74
[0A3h 0163 1] Encoded Access Width : 00 [Undefined/Legacy]
75
[0A4h 0164 8] Address : 0000000000000000
76
77
@@ -XXX,XX +XXX,XX @@
78
[0F5h 0245 1] Bit Width : 00
79
[0F6h 0246 1] Bit Offset : 00
80
[0F7h 0247 1] Encoded Access Width : 00 [Undefined/Legacy]
81
[0F8h 0248 8] Address : 0000000000000000
82
83
[100h 0256 12] Sleep Status Register : [Generic Address Structure]
84
[100h 0256 1] Space ID : 00 [SystemMemory]
85
[101h 0257 1] Bit Width : 00
86
[102h 0258 1] Bit Offset : 00
87
[103h 0259 1] Encoded Access Width : 00 [Undefined/Legacy]
88
[104h 0260 8] Address : 0000000000000000
89
90
[10Ch 0268 8] Hypervisor ID : 00000000554D4551
91
92
Raw Table Data: Length 276 (0x114)
93
94
- 0000: 46 41 43 50 14 01 00 00 06 15 42 4F 43 48 53 20 // FACP......BOCHS
95
+ 0000: 46 41 43 50 14 01 00 00 06 12 42 4F 43 48 53 20 // FACP......BOCHS
96
0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43 // BXPC ....BXPC
97
0020: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
98
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
99
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
100
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
101
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
102
0070: 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
103
- 0080: 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
104
+ 0080: 00 03 00 03 00 00 00 00 00 00 00 00 00 00 00 00 // ................
105
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
106
00A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
107
00B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
108
00C0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
109
00D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
110
00E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
111
00F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // ................
112
0100: 00 00 00 00 00 00 00 00 00 00 00 00 51 45 4D 55 // ............QEMU
113
0110: 00 00 00 00 // ....
114
115
@@ -XXX,XX +XXX,XX @@
116
/*
117
* Intel ACPI Component Architecture
118
* AML/ASL+ Disassembler version 20200925 (64-bit version)
119
* Copyright (c) 2000 - 2020 Intel Corporation
120
*
121
- * Disassembly of tests/data/acpi/virt/GTDT, Mon Jan 22 13:48:40 2024
122
+ * Disassembly of /tmp/aml-XDSZH2, Mon Jan 22 13:48:40 2024
123
*
124
* ACPI Data Table [GTDT]
125
*
126
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
127
*/
128
129
[000h 0000 4] Signature : "GTDT" [Generic Timer Description Table]
130
-[004h 0004 4] Table Length : 00000060
131
-[008h 0008 1] Revision : 02
132
-[009h 0009 1] Checksum : 9C
133
+[004h 0004 4] Table Length : 00000068
134
+[008h 0008 1] Revision : 03
135
+[009h 0009 1] Checksum : 93
136
[00Ah 0010 6] Oem ID : "BOCHS "
137
[010h 0016 8] Oem Table ID : "BXPC "
138
[018h 0024 4] Oem Revision : 00000001
139
[01Ch 0028 4] Asl Compiler ID : "BXPC"
140
[020h 0032 4] Asl Compiler Revision : 00000001
141
142
[024h 0036 8] Counter Block Address : FFFFFFFFFFFFFFFF
143
[02Ch 0044 4] Reserved : 00000000
144
145
[030h 0048 4] Secure EL1 Interrupt : 0000001D
146
[034h 0052 4] EL1 Flags (decoded below) : 00000000
147
Trigger Mode : 0
148
Polarity : 0
149
Always On : 0
150
151
[038h 0056 4] Non-Secure EL1 Interrupt : 0000001E
152
@@ -XXX,XX +XXX,XX @@
153
154
[040h 0064 4] Virtual Timer Interrupt : 0000001B
155
[044h 0068 4] VT Flags (decoded below) : 00000000
156
Trigger Mode : 0
157
Polarity : 0
158
Always On : 0
159
160
[048h 0072 4] Non-Secure EL2 Interrupt : 0000001A
161
[04Ch 0076 4] NEL2 Flags (decoded below) : 00000000
162
Trigger Mode : 0
163
Polarity : 0
164
Always On : 0
165
[050h 0080 8] Counter Read Block Address : FFFFFFFFFFFFFFFF
166
167
[058h 0088 4] Platform Timer Count : 00000000
168
[05Ch 0092 4] Platform Timer Offset : 00000000
169
+[060h 0096 4] Virtual EL2 Timer GSIV : 00000000
170
+[064h 0100 4] Virtual EL2 Timer Flags : 00000000
171
172
-Raw Table Data: Length 96 (0x60)
173
+Raw Table Data: Length 104 (0x68)
174
175
- 0000: 47 54 44 54 60 00 00 00 02 9C 42 4F 43 48 53 20 // GTDT`.....BOCHS
176
+ 0000: 47 54 44 54 68 00 00 00 03 93 42 4F 43 48 53 20 // GTDTh.....BOCHS
177
0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43 // BXPC ....BXPC
178
0020: 01 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 00 // ................
179
0030: 1D 00 00 00 00 00 00 00 1E 00 00 00 04 00 00 00 // ................
180
0040: 1B 00 00 00 00 00 00 00 1A 00 00 00 00 00 00 00 // ................
181
0050: FF FF FF FF FF FF FF FF 00 00 00 00 00 00 00 00 // ................
182
+ 0060: 00 00 00 00 00 00 00 00 // ........
183
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
184
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
185
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
186
Message-id: 20240122143537.233498-4-peter.maydell@linaro.org
11
---
187
---
12
linux-user/aarch64/signal.c | 5 ++++-
188
tests/qtest/bios-tables-test-allowed-diff.h | 2 --
13
1 file changed, 4 insertions(+), 1 deletion(-)
189
tests/data/acpi/virt/FACP | Bin 276 -> 276 bytes
14
190
tests/data/acpi/virt/GTDT | Bin 96 -> 104 bytes
15
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
191
3 files changed, 2 deletions(-)
192
193
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
16
index XXXXXXX..XXXXXXX 100644
194
index XXXXXXX..XXXXXXX 100644
17
--- a/linux-user/aarch64/signal.c
195
--- a/tests/qtest/bios-tables-test-allowed-diff.h
18
+++ b/linux-user/aarch64/signal.c
196
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
19
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
197
@@ -1,3 +1 @@
20
break;
198
/* List of comma-separated changed AML files to ignore */
21
199
-"tests/data/acpi/virt/FACP",
22
case TARGET_SVE_MAGIC:
200
-"tests/data/acpi/virt/GTDT",
23
+ if (sve || size < sizeof(struct target_sve_context)) {
201
diff --git a/tests/data/acpi/virt/FACP b/tests/data/acpi/virt/FACP
24
+ goto err;
202
index XXXXXXX..XXXXXXX 100644
25
+ }
203
GIT binary patch
26
if (cpu_isar_feature(aa64_sve, env_archcpu(env))) {
204
delta 25
27
vq = sve_vq(env);
205
gcmbQjG=+)F&CxkPgpq-PO=u!l<;2F$$vli407<0<)c^nh
28
sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
206
29
- if (!sve && size == sve_size) {
207
delta 28
30
+ if (size == sve_size) {
208
kcmbQjG=+)F&CxkPgpq-PO>`nx<-|!<6Akz$^DuG%0AAS!ssI20
31
sve = (struct target_sve_context *)ctx;
209
32
break;
210
diff --git a/tests/data/acpi/virt/GTDT b/tests/data/acpi/virt/GTDT
33
}
211
index XXXXXXX..XXXXXXX 100644
212
GIT binary patch
213
delta 25
214
bcmYeu;BpUf3CUn!U|^m+kt>V?$N&QXMtB4L
215
216
delta 16
217
Xcmc~u;BpUf2}xjJU|^avkt+-UB60)u
218
34
--
219
--
35
2.25.1
220
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The patchset adding the GMAC ethernet to this SoC crossed in the
2
mail with the patchset cleaning up the NIC handling. When we
3
create the GMAC modules we must call qemu_configure_nic_device()
4
so that the user has the opportunity to use the -nic commandline
5
option to create a network backend and connect it to the GMACs.
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Add the missing call.
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
5
Message-id: 20220708151540.18136-35-richard.henderson@linaro.org
9
Fixes: 21e5326a7c ("hw/arm: Add GMAC devices to NPCM7XX SoC")
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
12
Message-id: 20240206171231.396392-2-peter.maydell@linaro.org
7
---
13
---
8
linux-user/aarch64/cpu_loop.c | 9 +++++++++
14
hw/arm/npcm7xx.c | 1 +
9
1 file changed, 9 insertions(+)
15
1 file changed, 1 insertion(+)
10
16
11
diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
17
diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
12
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
13
--- a/linux-user/aarch64/cpu_loop.c
19
--- a/hw/arm/npcm7xx.c
14
+++ b/linux-user/aarch64/cpu_loop.c
20
+++ b/hw/arm/npcm7xx.c
15
@@ -XXX,XX +XXX,XX @@ void cpu_loop(CPUARMState *env)
21
@@ -XXX,XX +XXX,XX @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
16
22
for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
17
switch (trapnr) {
23
SysBusDevice *sbd = SYS_BUS_DEVICE(&s->gmac[i]);
18
case EXCP_SWI:
24
19
+ /*
25
+ qemu_configure_nic_device(DEVICE(sbd), false, NULL);
20
+ * On syscall, PSTATE.ZA is preserved, along with the ZA matrix.
26
/*
21
+ * PSTATE.SM is cleared, per SMSTOP, which does ResetSVEState.
27
* The device exists regardless of whether it's connected to a QEMU
22
+ */
28
* netdev backend. So always instantiate it even if there is no
23
+ if (FIELD_EX64(env->svcr, SVCR, SM)) {
24
+ env->svcr = FIELD_DP64(env->svcr, SVCR, SM, 0);
25
+ arm_rebuild_hflags(env);
26
+ arm_reset_sve_state(env);
27
+ }
28
ret = do_syscall(env,
29
env->xregs[8],
30
env->xregs[0],
31
--
29
--
32
2.25.1
30
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Currently QEMU will warn if there is a NIC on the board that
2
is not connected to a backend. By default the '-nic user' will
3
get used for all NICs, but if you manually connect a specific
4
NIC to a specific backend, then the other NICs on the board
5
have no backend and will be warned about:
2
6
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
qemu-system-arm: warning: nic npcm7xx-emc.1 has no peer
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
qemu-system-arm: warning: nic npcm-gmac.0 has no peer
5
Message-id: 20220708151540.18136-34-richard.henderson@linaro.org
9
qemu-system-arm: warning: nic npcm-gmac.1 has no peer
10
11
So suppress those warnings by manually connecting every NIC
12
on the board to some backend.
13
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
16
Reviewed-by: Thomas Huth <thuth@redhat.com>
17
Message-id: 20240206171231.396392-3-peter.maydell@linaro.org
7
---
18
---
8
linux-user/aarch64/target_cpu.h | 5 ++++-
19
tests/qtest/npcm7xx_emc-test.c | 5 ++++-
9
1 file changed, 4 insertions(+), 1 deletion(-)
20
1 file changed, 4 insertions(+), 1 deletion(-)
10
21
11
diff --git a/linux-user/aarch64/target_cpu.h b/linux-user/aarch64/target_cpu.h
22
diff --git a/tests/qtest/npcm7xx_emc-test.c b/tests/qtest/npcm7xx_emc-test.c
12
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
13
--- a/linux-user/aarch64/target_cpu.h
24
--- a/tests/qtest/npcm7xx_emc-test.c
14
+++ b/linux-user/aarch64/target_cpu.h
25
+++ b/tests/qtest/npcm7xx_emc-test.c
15
@@ -XXX,XX +XXX,XX @@ static inline void cpu_clone_regs_parent(CPUARMState *env, unsigned flags)
26
@@ -XXX,XX +XXX,XX @@ static int *packet_test_init(int module_num, GString *cmd_line)
16
27
* KISS and use -nic. The driver accepts 'emc0' and 'emc1' as aliases
17
static inline void cpu_set_tls(CPUARMState *env, target_ulong newtls)
28
* in the 'model' field to specify the device to match.
18
{
19
- /* Note that AArch64 Linux keeps the TLS pointer in TPIDR; this is
20
+ /*
21
+ * Note that AArch64 Linux keeps the TLS pointer in TPIDR; this is
22
* different from AArch32 Linux, which uses TPIDRRO.
23
*/
29
*/
24
env->cp15.tpidr_el[0] = newtls;
30
- g_string_append_printf(cmd_line, " -nic socket,fd=%d,model=emc%d ",
25
+ /* TPIDR2_EL0 is cleared with CLONE_SETTLS. */
31
+ g_string_append_printf(cmd_line, " -nic socket,fd=%d,model=emc%d "
26
+ env->cp15.tpidr2_el0 = 0;
32
+ "-nic user,model=npcm7xx-emc "
27
}
33
+ "-nic user,model=npcm-gmac "
28
34
+ "-nic user,model=npcm-gmac",
29
static inline abi_ulong get_sp_from_cpustate(CPUARMState *state)
35
test_sockets[1], module_num);
36
37
g_test_queue_destroy(packet_test_clear, test_sockets);
30
--
38
--
31
2.25.1
39
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
It doesn't make sense to read the value of MDCR_EL2 on a non-A-profile
2
CPU, and in fact if you try to do it we will assert:
2
3
3
This new behaviour is in the ARM pseudocode function
4
#6 0x00007ffff4b95e96 in __GI___assert_fail
4
AArch64.CheckFPAdvSIMDEnabled, which applies to AArch32
5
(assertion=0x5555565a8c70 "!arm_feature(env, ARM_FEATURE_M)", file=0x5555565a6e5c "../../target/arm/helper.c", line=12600, function=0x5555565a9560 <__PRETTY_FUNCTION__.0> "arm_security_space_below_el3") at ./assert/assert.c:101
5
via AArch32.CheckAdvSIMDOrFPEnabled when the EL to which
6
#7 0x0000555555ebf412 in arm_security_space_below_el3 (env=0x555557bc8190) at ../../target/arm/helper.c:12600
6
the trap would be delivered is in AArch64 mode.
7
#8 0x0000555555ea6f89 in arm_is_el2_enabled (env=0x555557bc8190) at ../../target/arm/cpu.h:2595
8
#9 0x0000555555ea942f in arm_mdcr_el2_eff (env=0x555557bc8190) at ../../target/arm/internals.h:1512
7
9
8
Given that ARMv9 drops support for AArch32 outside EL0, the trap EL
10
We might call pmu_counter_enabled() on an M-profile CPU (for example
9
detection ought to be trivially true, but the pseudocode still contains
11
from the migration pre/post hooks in machine.c); this should always
10
a number of conditions, and QEMU has not yet committed to dropping A32
12
return false because these CPUs don't set ARM_FEATURE_PMU.
11
support for EL[12] when v9 features are present.
12
13
13
Since the computation of SME_TRAP_NONSTREAMING is necessarily different
14
Avoid the assertion by not calling arm_mdcr_el2_eff() before we
14
for the two modes, we might as well preserve bits within TBFLAG_ANY and
15
have done the early return for "PMU not present".
15
allocate separate bits within TBFLAG_A32 and TBFLAG_A64 instead.
16
16
17
Note that DDI0616A.a has typos for bits [22:21] of LD1RO in the table
17
This fixes an assertion failure if you try to do a loadvm or
18
of instructions illegal in streaming mode.
18
savevm for an M-profile board.
19
19
20
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
20
Cc: qemu-stable@nongnu.org
21
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
21
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2155
22
Message-id: 20220708151540.18136-4-richard.henderson@linaro.org
23
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
24
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
25
Message-id: 20240208153346.970021-1-peter.maydell@linaro.org
24
---
26
---
25
target/arm/cpu.h | 7 +++
27
target/arm/helper.c | 12 ++++++++++--
26
target/arm/translate.h | 4 ++
28
1 file changed, 10 insertions(+), 2 deletions(-)
27
target/arm/sme-fa64.decode | 90 ++++++++++++++++++++++++++++++++++++++
28
target/arm/helper.c | 41 +++++++++++++++++
29
target/arm/translate-a64.c | 40 ++++++++++++++++-
30
target/arm/translate-vfp.c | 12 +++++
31
target/arm/translate.c | 2 +
32
target/arm/meson.build | 1 +
33
8 files changed, 195 insertions(+), 2 deletions(-)
34
create mode 100644 target/arm/sme-fa64.decode
35
29
36
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/cpu.h
39
+++ b/target/arm/cpu.h
40
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, HSTR_ACTIVE, 9, 1)
41
* the same thing as the current security state of the processor!
42
*/
43
FIELD(TBFLAG_A32, NS, 10, 1)
44
+/*
45
+ * Indicates that SME Streaming mode is active, and SMCR_ELx.FA64 is not.
46
+ * This requires an SME trap from AArch32 mode when using NEON.
47
+ */
48
+FIELD(TBFLAG_A32, SME_TRAP_NONSTREAMING, 11, 1)
49
50
/*
51
* Bit usage when in AArch32 state, for M-profile only.
52
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
53
FIELD(TBFLAG_A64, PSTATE_SM, 22, 1)
54
FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
55
FIELD(TBFLAG_A64, SVL, 24, 4)
56
+/* Indicates that SME Streaming mode is active, and SMCR_ELx.FA64 is not. */
57
+FIELD(TBFLAG_A64, SME_TRAP_NONSTREAMING, 28, 1)
58
59
/*
60
* Helpers for using the above.
61
diff --git a/target/arm/translate.h b/target/arm/translate.h
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/translate.h
64
+++ b/target/arm/translate.h
65
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
66
bool pstate_sm;
67
/* True if PSTATE.ZA is set. */
68
bool pstate_za;
69
+ /* True if non-streaming insns should raise an SME Streaming exception. */
70
+ bool sme_trap_nonstreaming;
71
+ /* True if the current instruction is non-streaming. */
72
+ bool is_nonstreaming;
73
/* True if MVE insns are definitely not predicated by VPR or LTPSIZE */
74
bool mve_no_pred;
75
/*
76
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
77
new file mode 100644
78
index XXXXXXX..XXXXXXX
79
--- /dev/null
80
+++ b/target/arm/sme-fa64.decode
81
@@ -XXX,XX +XXX,XX @@
82
+# AArch64 SME allowed instruction decoding
83
+#
84
+# Copyright (c) 2022 Linaro, Ltd
85
+#
86
+# This library is free software; you can redistribute it and/or
87
+# modify it under the terms of the GNU Lesser General Public
88
+# License as published by the Free Software Foundation; either
89
+# version 2.1 of the License, or (at your option) any later version.
90
+#
91
+# This library is distributed in the hope that it will be useful,
92
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
93
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
94
+# Lesser General Public License for more details.
95
+#
96
+# You should have received a copy of the GNU Lesser General Public
97
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
98
+
99
+#
100
+# This file is processed by scripts/decodetree.py
101
+#
102
+
103
+# These patterns are taken from Appendix E1.1 of DDI0616 A.a,
104
+# Arm Architecture Reference Manual Supplement,
105
+# The Scalable Matrix Extension (SME), for Armv9-A
106
+
107
+{
108
+ [
109
+ OK 0-00 1110 0000 0001 0010 11-- ---- ---- # SMOV W|Xd,Vn.B[0]
110
+ OK 0-00 1110 0000 0010 0010 11-- ---- ---- # SMOV W|Xd,Vn.H[0]
111
+ OK 0100 1110 0000 0100 0010 11-- ---- ---- # SMOV Xd,Vn.S[0]
112
+ OK 0000 1110 0000 0001 0011 11-- ---- ---- # UMOV Wd,Vn.B[0]
113
+ OK 0000 1110 0000 0010 0011 11-- ---- ---- # UMOV Wd,Vn.H[0]
114
+ OK 0000 1110 0000 0100 0011 11-- ---- ---- # UMOV Wd,Vn.S[0]
115
+ OK 0100 1110 0000 1000 0011 11-- ---- ---- # UMOV Xd,Vn.D[0]
116
+ ]
117
+ FAIL 0--0 111- ---- ---- ---- ---- ---- ---- # Advanced SIMD vector operations
118
+}
119
+
120
+{
121
+ [
122
+ OK 0101 1110 --1- ---- 11-1 11-- ---- ---- # FMULX/FRECPS/FRSQRTS (scalar)
123
+ OK 0101 1110 -10- ---- 00-1 11-- ---- ---- # FMULX/FRECPS/FRSQRTS (scalar, FP16)
124
+ OK 01-1 1110 1-10 0001 11-1 10-- ---- ---- # FRECPE/FRSQRTE/FRECPX (scalar)
125
+ OK 01-1 1110 1111 1001 11-1 10-- ---- ---- # FRECPE/FRSQRTE/FRECPX (scalar, FP16)
126
+ ]
127
+ FAIL 01-1 111- ---- ---- ---- ---- ---- ---- # Advanced SIMD single-element operations
128
+}
129
+
130
+FAIL 0-00 110- ---- ---- ---- ---- ---- ---- # Advanced SIMD structure load/store
131
+FAIL 1100 1110 ---- ---- ---- ---- ---- ---- # Advanced SIMD cryptography extensions
132
+FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
133
+
134
+# These are the "avoidance of doubt" final table of Illegal Advanced SIMD instructions
135
+# We don't actually need to include these, as the default is OK.
136
+# -001 111- ---- ---- ---- ---- ---- ---- # Scalar floating-point operations
137
+# --10 110- ---- ---- ---- ---- ---- ---- # Load/store pair of FP registers
138
+# --01 1100 ---- ---- ---- ---- ---- ---- # Load FP register (PC-relative literal)
139
+# --11 1100 --0- ---- ---- ---- ---- ---- # Load/store FP register (unscaled imm)
140
+# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
141
+# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
142
+
143
+FAIL 0000 0100 --1- ---- 1010 ---- ---- ---- # ADR
144
+FAIL 0000 0100 --1- ---- 1011 -0-- ---- ---- # FTSSEL, FEXPA
145
+FAIL 0000 0101 --10 0001 100- ---- ---- ---- # COMPACT
146
+FAIL 0010 0101 --01 100- 1111 000- ---0 ---- # RDFFR, RDFFRS
147
+FAIL 0010 0101 --10 1--- 1001 ---- ---- ---- # WRFFR, SETFFR
148
+FAIL 0100 0101 --0- ---- 1011 ---- ---- ---- # BDEP, BEXT, BGRP
149
+FAIL 0100 0101 000- ---- 0110 1--- ---- ---- # PMULLB, PMULLT (128b result)
150
+FAIL 0110 0100 --1- ---- 1110 01-- ---- ---- # FMMLA, BFMMLA
151
+FAIL 0110 0101 --0- ---- 0000 11-- ---- ---- # FTSMUL
152
+FAIL 0110 0101 --01 0--- 100- ---- ---- ---- # FTMAD
153
+FAIL 0110 0101 --01 1--- 001- ---- ---- ---- # FADDA
154
+FAIL 0100 0101 --0- ---- 1001 10-- ---- ---- # SMMLA, UMMLA, USMMLA
155
+FAIL 0100 0101 --1- ---- 1--- ---- ---- ---- # SVE2 string/histo/crypto instructions
156
+FAIL 1000 010- -00- ---- 10-- ---- ---- ---- # SVE2 32-bit gather NT load (vector+scalar)
157
+FAIL 1000 010- -00- ---- 111- ---- ---- ---- # SVE 32-bit gather prefetch (vector+imm)
158
+FAIL 1000 0100 0-1- ---- 0--- ---- ---- ---- # SVE 32-bit gather prefetch (scalar+vector)
159
+FAIL 1000 010- -01- ---- 1--- ---- ---- ---- # SVE 32-bit gather load (vector+imm)
160
+FAIL 1000 0100 0-0- ---- 0--- ---- ---- ---- # SVE 32-bit gather load byte (scalar+vector)
161
+FAIL 1000 0100 1--- ---- 0--- ---- ---- ---- # SVE 32-bit gather load half (scalar+vector)
162
+FAIL 1000 0101 0--- ---- 0--- ---- ---- ---- # SVE 32-bit gather load word (scalar+vector)
163
+FAIL 1010 010- ---- ---- 011- ---- ---- ---- # SVE contiguous FF load (scalar+scalar)
164
+FAIL 1010 010- ---1 ---- 101- ---- ---- ---- # SVE contiguous NF load (scalar+imm)
165
+FAIL 1010 010- -01- ---- 000- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+scalar)
166
+FAIL 1010 010- -010 ---- 001- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+imm)
167
+FAIL 1100 010- ---- ---- ---- ---- ---- ---- # SVE 64-bit gather load/prefetch
168
+FAIL 1110 010- -00- ---- 001- ---- ---- ---- # SVE2 64-bit scatter NT store (vector+scalar)
169
+FAIL 1110 010- -10- ---- 001- ---- ---- ---- # SVE2 32-bit scatter NT store (vector+scalar)
170
+FAIL 1110 010- ---- ---- 1-0- ---- ---- ---- # SVE scatter store (scalar+32-bit vector)
171
+FAIL 1110 010- ---- ---- 101- ---- ---- ---- # SVE scatter store (misc)
172
diff --git a/target/arm/helper.c b/target/arm/helper.c
30
diff --git a/target/arm/helper.c b/target/arm/helper.c
173
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
174
--- a/target/arm/helper.c
32
--- a/target/arm/helper.c
175
+++ b/target/arm/helper.c
33
+++ b/target/arm/helper.c
176
@@ -XXX,XX +XXX,XX @@ int sme_exception_el(CPUARMState *env, int el)
34
@@ -XXX,XX +XXX,XX @@ static bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
177
return 0;
35
bool enabled, prohibited = false, filtered;
178
}
36
bool secure = arm_is_secure(env);
179
37
int el = arm_current_el(env);
180
+/* This corresponds to the ARM pseudocode function IsFullA64Enabled(). */
38
- uint64_t mdcr_el2 = arm_mdcr_el2_eff(env);
181
+static bool sme_fa64(CPUARMState *env, int el)
39
- uint8_t hpmn = mdcr_el2 & MDCR_HPMN;
182
+{
40
+ uint64_t mdcr_el2;
183
+ if (!cpu_isar_feature(aa64_sme_fa64, env_archcpu(env))) {
41
+ uint8_t hpmn;
184
+ return false;
185
+ }
186
+
187
+ if (el <= 1 && !el_is_in_host(env, el)) {
188
+ if (!FIELD_EX64(env->vfp.smcr_el[1], SMCR, FA64)) {
189
+ return false;
190
+ }
191
+ }
192
+ if (el <= 2 && arm_is_el2_enabled(env)) {
193
+ if (!FIELD_EX64(env->vfp.smcr_el[2], SMCR, FA64)) {
194
+ return false;
195
+ }
196
+ }
197
+ if (arm_feature(env, ARM_FEATURE_EL3)) {
198
+ if (!FIELD_EX64(env->vfp.smcr_el[3], SMCR, FA64)) {
199
+ return false;
200
+ }
201
+ }
202
+
203
+ return true;
204
+}
205
+
206
/*
207
* Given that SVE is enabled, return the vector length for EL.
208
*/
209
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, int fp_el,
210
DP_TBFLAG_ANY(flags, PSTATE__IL, 1);
211
}
212
42
213
+ /*
43
+ /*
214
+ * The SME exception we are testing for is raised via
44
+ * We might be called for M-profile cores where MDCR_EL2 doesn't
215
+ * AArch64.CheckFPAdvSIMDEnabled(), as called from
45
+ * exist and arm_mdcr_el2_eff() will assert, so this early-exit check
216
+ * AArch32.CheckAdvSIMDOrFPEnabled().
46
+ * must be before we read that value.
217
+ */
47
+ */
218
+ if (el == 0
48
if (!arm_feature(env, ARM_FEATURE_PMU)) {
219
+ && FIELD_EX64(env->svcr, SVCR, SM)
220
+ && (!arm_is_el2_enabled(env)
221
+ || (arm_el_is_aa64(env, 2) && !(env->cp15.hcr_el2 & HCR_TGE)))
222
+ && arm_el_is_aa64(env, 1)
223
+ && !sme_fa64(env, el)) {
224
+ DP_TBFLAG_A32(flags, SME_TRAP_NONSTREAMING, 1);
225
+ }
226
+
227
return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
228
}
229
230
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
231
}
232
if (FIELD_EX64(env->svcr, SVCR, SM)) {
233
DP_TBFLAG_A64(flags, PSTATE_SM, 1);
234
+ DP_TBFLAG_A64(flags, SME_TRAP_NONSTREAMING, !sme_fa64(env, el));
235
}
236
DP_TBFLAG_A64(flags, PSTATE_ZA, FIELD_EX64(env->svcr, SVCR, ZA));
237
}
238
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
239
index XXXXXXX..XXXXXXX 100644
240
--- a/target/arm/translate-a64.c
241
+++ b/target/arm/translate-a64.c
242
@@ -XXX,XX +XXX,XX @@ static void do_vec_ld(DisasContext *s, int destidx, int element,
243
* unallocated-encoding checks (otherwise the syndrome information
244
* for the resulting exception will be incorrect).
245
*/
246
-static bool fp_access_check(DisasContext *s)
247
+static bool fp_access_check_only(DisasContext *s)
248
{
249
if (s->fp_excp_el) {
250
assert(!s->fp_access_checked);
251
@@ -XXX,XX +XXX,XX @@ static bool fp_access_check(DisasContext *s)
252
return true;
253
}
254
255
+static bool fp_access_check(DisasContext *s)
256
+{
257
+ if (!fp_access_check_only(s)) {
258
+ return false;
259
+ }
260
+ if (s->sme_trap_nonstreaming && s->is_nonstreaming) {
261
+ gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
262
+ syn_smetrap(SME_ET_Streaming, false));
263
+ return false;
264
+ }
265
+ return true;
266
+}
267
+
268
/* Check that SVE access is enabled. If it is, return true.
269
* If not, emit code to generate an appropriate exception and return false.
270
*/
271
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
272
default:
273
g_assert_not_reached();
274
}
275
- if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
276
+ if ((ri->type & ARM_CP_FPU) && !fp_access_check_only(s)) {
277
return;
278
} else if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
279
return;
280
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_simd_fp(DisasContext *s, uint32_t insn)
281
}
282
}
283
284
+/*
285
+ * Include the generated SME FA64 decoder.
286
+ */
287
+
288
+#include "decode-sme-fa64.c.inc"
289
+
290
+static bool trans_OK(DisasContext *s, arg_OK *a)
291
+{
292
+ return true;
293
+}
294
+
295
+static bool trans_FAIL(DisasContext *s, arg_OK *a)
296
+{
297
+ s->is_nonstreaming = true;
298
+ return true;
299
+}
300
+
301
/**
302
* is_guarded_page:
303
* @env: The cpu environment
304
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
305
dc->mte_active[1] = EX_TBFLAG_A64(tb_flags, MTE0_ACTIVE);
306
dc->pstate_sm = EX_TBFLAG_A64(tb_flags, PSTATE_SM);
307
dc->pstate_za = EX_TBFLAG_A64(tb_flags, PSTATE_ZA);
308
+ dc->sme_trap_nonstreaming = EX_TBFLAG_A64(tb_flags, SME_TRAP_NONSTREAMING);
309
dc->vec_len = 0;
310
dc->vec_stride = 0;
311
dc->cp_regs = arm_cpu->cp_regs;
312
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
313
}
314
}
315
316
+ s->is_nonstreaming = false;
317
+ if (s->sme_trap_nonstreaming) {
318
+ disas_sme_fa64(s, insn);
319
+ }
320
+
321
switch (extract32(insn, 25, 4)) {
322
case 0x0:
323
if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
324
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
325
index XXXXXXX..XXXXXXX 100644
326
--- a/target/arm/translate-vfp.c
327
+++ b/target/arm/translate-vfp.c
328
@@ -XXX,XX +XXX,XX @@ static bool vfp_access_check_a(DisasContext *s, bool ignore_vfp_enabled)
329
return false;
49
return false;
330
}
50
}
331
51
332
+ /*
52
+ mdcr_el2 = arm_mdcr_el2_eff(env);
333
+ * Note that rebuild_hflags_a32 has already accounted for being in EL0
53
+ hpmn = mdcr_el2 & MDCR_HPMN;
334
+ * and the higher EL in A64 mode, etc. Unlike A64 mode, there do not
335
+ * appear to be any insns which touch VFP which are allowed.
336
+ */
337
+ if (s->sme_trap_nonstreaming) {
338
+ gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
339
+ syn_smetrap(SME_ET_Streaming,
340
+ s->base.pc_next - s->pc_curr == 2));
341
+ return false;
342
+ }
343
+
54
+
344
if (!s->vfp_enabled && !ignore_vfp_enabled) {
55
if (!arm_feature(env, ARM_FEATURE_EL2) ||
345
assert(!arm_dc_feature(s, ARM_FEATURE_M));
56
(counter < hpmn || counter == 31)) {
346
unallocated_encoding(s);
57
e = env->cp15.c9_pmcr & PMCRE;
347
diff --git a/target/arm/translate.c b/target/arm/translate.c
348
index XXXXXXX..XXXXXXX 100644
349
--- a/target/arm/translate.c
350
+++ b/target/arm/translate.c
351
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
352
dc->vec_len = EX_TBFLAG_A32(tb_flags, VECLEN);
353
dc->vec_stride = EX_TBFLAG_A32(tb_flags, VECSTRIDE);
354
}
355
+ dc->sme_trap_nonstreaming =
356
+ EX_TBFLAG_A32(tb_flags, SME_TRAP_NONSTREAMING);
357
}
358
dc->cp_regs = cpu->cp_regs;
359
dc->features = env->features;
360
diff --git a/target/arm/meson.build b/target/arm/meson.build
361
index XXXXXXX..XXXXXXX 100644
362
--- a/target/arm/meson.build
363
+++ b/target/arm/meson.build
364
@@ -XXX,XX +XXX,XX @@
365
gen = [
366
decodetree.process('sve.decode', extra_args: '--decode=disas_sve'),
367
decodetree.process('sme.decode', extra_args: '--decode=disas_sme'),
368
+ decodetree.process('sme-fa64.decode', extra_args: '--static-decode=disas_sme_fa64'),
369
decodetree.process('neon-shared.decode', extra_args: '--decode=disas_neon_shared'),
370
decodetree.process('neon-dp.decode', extra_args: '--decode=disas_neon_dp'),
371
decodetree.process('neon-ls.decode', extra_args: '--decode=disas_neon_ls'),
372
--
58
--
373
2.25.1
59
2.34.1
60
61
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Nabih Estefan <nabihestefan@google.com>
2
2
3
The pseudocode for CheckSVEEnabled gains a check for Streaming
3
Fix the nocm_gmac-test.c file to run on a nuvoton 7xx machine instead
4
SVE mode, and for SME present but SVE absent.
4
of 8xx. Also fix comments referencing this and values expecting 8xx.
5
5
6
Change-Id: Iabd0fba14910c3f1e883c4a9521350f3db9ffab8
7
Signed-Off-By: Nabih Estefan <nabihestefan@google.com>
8
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
9
Message-id: 20240208194759.2858582-2-nabihestefan@google.com
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
[PMM: commit message tweaks]
8
Message-id: 20220708151540.18136-17-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
13
---
11
target/arm/translate-a64.c | 22 ++++++++++++++++------
14
tests/qtest/npcm_gmac-test.c | 84 +-----------------------------------
12
1 file changed, 16 insertions(+), 6 deletions(-)
15
tests/qtest/meson.build | 3 +-
16
2 files changed, 4 insertions(+), 83 deletions(-)
13
17
14
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
18
diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
15
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/translate-a64.c
20
--- a/tests/qtest/npcm_gmac-test.c
17
+++ b/target/arm/translate-a64.c
21
+++ b/tests/qtest/npcm_gmac-test.c
18
@@ -XXX,XX +XXX,XX @@ static bool fp_access_check(DisasContext *s)
22
@@ -XXX,XX +XXX,XX @@ typedef struct TestData {
19
return true;
23
const GMACModule *module;
24
} TestData;
25
26
-/* Values extracted from hw/arm/npcm8xx.c */
27
+/* Values extracted from hw/arm/npcm7xx.c */
28
static const GMACModule gmac_module_list[] = {
29
{
30
.irq = 14,
31
@@ -XXX,XX +XXX,XX @@ static const GMACModule gmac_module_list[] = {
32
.irq = 15,
33
.base_addr = 0xf0804000
34
},
35
- {
36
- .irq = 16,
37
- .base_addr = 0xf0806000
38
- },
39
- {
40
- .irq = 17,
41
- .base_addr = 0xf0808000
42
- }
43
};
44
45
/* Returns the index of the GMAC module. */
46
@@ -XXX,XX +XXX,XX @@ static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
47
return qtest_readl(qts, mod->base_addr + regno);
20
}
48
}
21
49
22
-/* Check that SVE access is enabled. If it is, return true.
50
-static uint16_t pcs_read(QTestState *qts, const GMACModule *mod,
23
+/*
51
- NPCMRegister regno)
24
+ * Check that SVE access is enabled. If it is, return true.
52
-{
25
* If not, emit code to generate an appropriate exception and return false.
53
- uint32_t write_value = (regno & 0x3ffe00) >> 9;
26
+ * This function corresponds to CheckSVEEnabled().
54
- qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value);
27
*/
55
- uint32_t read_offset = regno & 0x1ff;
28
bool sve_access_check(DisasContext *s)
56
- return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset);
57
-}
58
-
59
/* Check that GMAC registers are reset to default value */
60
static void test_init(gconstpointer test_data)
29
{
61
{
30
- if (s->sve_excp_el) {
62
const TestData *td = test_data;
31
- assert(!s->sve_access_checked);
63
const GMACModule *mod = td->module;
32
- s->sve_access_checked = true;
64
- QTestState *qts = qtest_init("-machine npcm845-evb");
65
+ QTestState *qts = qtest_init("-machine npcm750-evb");
66
67
#define CHECK_REG32(regno, value) \
68
do { \
69
g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \
70
} while (0)
71
72
-#define CHECK_REG_PCS(regno, value) \
73
- do { \
74
- g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \
75
- } while (0)
33
-
76
-
34
+ if (s->pstate_sm || !dc_isar_feature(aa64_sve, s)) {
77
CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100);
35
+ assert(dc_isar_feature(aa64_sme, s));
78
CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0);
36
+ if (!sme_sm_enabled_check(s)) {
79
CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0);
37
+ goto fail_exit;
80
@@ -XXX,XX +XXX,XX @@ static void test_init(gconstpointer test_data)
38
+ }
81
CHECK_REG32(NPCM_GMAC_PTP_TAR, 0);
39
+ } else if (s->sve_excp_el) {
82
CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0);
40
gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
83
41
syn_sve_access_trap(), s->sve_excp_el);
84
- /* TODO Add registers PCS */
42
- return false;
85
- if (mod->base_addr == 0xf0802000) {
43
+ goto fail_exit;
86
- CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e);
44
}
87
- CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0);
45
s->sve_access_checked = true;
88
- CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000);
46
return fp_access_check(s);
89
-
47
+
90
- CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140);
48
+ fail_exit:
91
- CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109);
49
+ /* Assert that we only raise one exception per instruction. */
92
- CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e);
50
+ assert(!s->sve_access_checked);
93
- CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0);
51
+ s->sve_access_checked = true;
94
- CHECK_REG_PCS(NPCM_PCS_SR_MII_AN_ADV, 0x0020);
52
+ return false;
95
- CHECK_REG_PCS(NPCM_PCS_SR_MII_LP_BABL, 0);
96
- CHECK_REG_PCS(NPCM_PCS_SR_MII_AN_EXPN, 0);
97
- CHECK_REG_PCS(NPCM_PCS_SR_MII_EXT_STS, 0xc000);
98
-
99
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_ABL, 0x0003);
100
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR, 0x0038);
101
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR, 0);
102
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR, 0x0038);
103
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR, 0);
104
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR, 0x0058);
105
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR, 0);
106
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR, 0x0048);
107
- CHECK_REG_PCS(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR, 0);
108
-
109
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MMD_DIG_CTRL1, 0x2400);
110
- CHECK_REG_PCS(NPCM_PCS_VR_MII_AN_CTRL, 0);
111
- CHECK_REG_PCS(NPCM_PCS_VR_MII_AN_INTR_STS, 0x000a);
112
- CHECK_REG_PCS(NPCM_PCS_VR_MII_TC, 0);
113
- CHECK_REG_PCS(NPCM_PCS_VR_MII_DBG_CTRL, 0);
114
- CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_MCTRL0, 0x899c);
115
- CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_TXTIMER, 0);
116
- CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_RXTIMER, 0);
117
- CHECK_REG_PCS(NPCM_PCS_VR_MII_LINK_TIMER_CTRL, 0);
118
- CHECK_REG_PCS(NPCM_PCS_VR_MII_EEE_MCTRL1, 0);
119
- CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_STS, 0x0010);
120
- CHECK_REG_PCS(NPCM_PCS_VR_MII_ICG_ERRCNT1, 0);
121
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MISC_STS, 0);
122
- CHECK_REG_PCS(NPCM_PCS_VR_MII_RX_LSTS, 0);
123
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_BSTCTRL0, 0x00a);
124
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_LVLCTRL0, 0x007f);
125
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_GENCTRL0, 0x0001);
126
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_GENCTRL1, 0);
127
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_TX_STS, 0);
128
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_GENCTRL0, 0x0100);
129
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_GENCTRL1, 0x1100);
130
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0, 0x000e);
131
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_CTRL0, 0x0100);
132
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_CTRL1, 0x0032);
133
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MPLL_STS, 0x0001);
134
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL2, 0);
135
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_LVL_CTRL, 0x0019);
136
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL0, 0);
137
- CHECK_REG_PCS(NPCM_PCS_VR_MII_MP_MISC_CTRL1, 0);
138
- CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_CTRL2, 0);
139
- CHECK_REG_PCS(NPCM_PCS_VR_MII_DIG_ERRCNT_SEL, 0);
140
- }
141
-
142
qtest_quit(qts);
53
}
143
}
54
144
55
/*
145
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
146
index XXXXXXX..XXXXXXX 100644
147
--- a/tests/qtest/meson.build
148
+++ b/tests/qtest/meson.build
149
@@ -XXX,XX +XXX,XX @@ qtests_npcm7xx = \
150
'npcm7xx_sdhci-test',
151
'npcm7xx_smbus-test',
152
'npcm7xx_timer-test',
153
- 'npcm7xx_watchdog_timer-test'] + \
154
+ 'npcm7xx_watchdog_timer-test',
155
+ 'npcm_gmac-test'] + \
156
(slirp.found() ? ['npcm7xx_emc-test'] : [])
157
qtests_aspeed = \
158
['aspeed_hace-test',
56
--
159
--
57
2.25.1
160
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Luc Michel <luc.michel@amd.com>
2
2
3
This is an SVE instruction that operates using the SVE vector
3
An access fault is raised when the Access Flag is not set in the
4
length but that it is present only if SME is implemented.
4
looked-up PTE and the AFFD field is not set in the corresponding context
5
descriptor. This was already implemented for stage 2. Implement it for
6
stage 1 as well.
5
7
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Luc Michel <luc.michel@amd.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Mostafa Saleh <smostafa@google.com>
8
Message-id: 20220708151540.18136-30-richard.henderson@linaro.org
10
Reviewed-by: Eric Auger <eric.auger@redhat.com>
11
Tested-by: Mostafa Saleh <smostafa@google.com>
12
Message-id: 20240213082211.3330400-1-luc.michel@amd.com
13
[PMM: tweaked comment text]
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
15
---
11
target/arm/helper-sve.h | 2 ++
16
hw/arm/smmuv3-internal.h | 1 +
12
target/arm/sve.decode | 1 +
17
include/hw/arm/smmu-common.h | 1 +
13
target/arm/sve_helper.c | 16 ++++++++++++++++
18
hw/arm/smmu-common.c | 11 +++++++++++
14
target/arm/translate-sve.c | 2 ++
19
hw/arm/smmuv3.c | 1 +
15
4 files changed, 21 insertions(+)
20
4 files changed, 14 insertions(+)
16
21
17
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
22
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
18
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-sve.h
24
--- a/hw/arm/smmuv3-internal.h
20
+++ b/target/arm/helper-sve.h
25
+++ b/hw/arm/smmuv3-internal.h
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
@@ -XXX,XX +XXX,XX @@ static inline int pa_range(STE *ste)
22
27
#define CD_EPD(x, sel) extract32((x)->word[0], (16 * (sel)) + 14, 1)
23
DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
#define CD_ENDI(x) extract32((x)->word[0], 15, 1)
24
29
#define CD_IPS(x) extract32((x)->word[1], 0 , 3)
25
+DEF_HELPER_FLAGS_4(sme_revd_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+#define CD_AFFD(x) extract32((x)->word[1], 3 , 1)
31
#define CD_TBI(x) extract32((x)->word[1], 6 , 2)
32
#define CD_HD(x) extract32((x)->word[1], 10 , 1)
33
#define CD_HA(x) extract32((x)->word[1], 11 , 1)
34
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
35
index XXXXXXX..XXXXXXX 100644
36
--- a/include/hw/arm/smmu-common.h
37
+++ b/include/hw/arm/smmu-common.h
38
@@ -XXX,XX +XXX,XX @@ typedef struct SMMUTransCfg {
39
bool disabled; /* smmu is disabled */
40
bool bypassed; /* translation is bypassed */
41
bool aborted; /* translation is aborted */
42
+ bool affd; /* AF fault disable */
43
uint32_t iotlb_hits; /* counts IOTLB hits */
44
uint32_t iotlb_misses; /* counts IOTLB misses*/
45
/* Used by stage-1 only. */
46
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/hw/arm/smmu-common.c
49
+++ b/hw/arm/smmu-common.c
50
@@ -XXX,XX +XXX,XX @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg,
51
pte_addr, pte, iova, gpa,
52
block_size >> 20);
53
}
26
+
54
+
27
DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
55
+ /*
28
DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
56
+ * QEMU does not currently implement HTTU, so if AFFD and PTE.AF
29
DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
57
+ * are 0 we take an Access flag fault. (5.4. Context Descriptor)
30
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
58
+ * An Access flag fault takes priority over a Permission fault.
59
+ */
60
+ if (!PTE_AF(pte) && !cfg->affd) {
61
+ info->type = SMMU_PTW_ERR_ACCESS;
62
+ goto error;
63
+ }
64
+
65
ap = PTE_AP(pte);
66
if (is_permission_fault(ap, perm)) {
67
info->type = SMMU_PTW_ERR_PERMISSION;
68
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
31
index XXXXXXX..XXXXXXX 100644
69
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/sve.decode
70
--- a/hw/arm/smmuv3.c
33
+++ b/target/arm/sve.decode
71
+++ b/hw/arm/smmuv3.c
34
@@ -XXX,XX +XXX,XX @@ REVB 00000101 .. 1001 00 100 ... ..... ..... @rd_pg_rn
72
@@ -XXX,XX +XXX,XX @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event)
35
REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
73
cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
36
REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
74
cfg->tbi = CD_TBI(cd);
37
RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
75
cfg->asid = CD_ASID(cd);
38
+REVD 00000101 00 1011 10 100 ... ..... ..... @rd_pg_rn_e0
76
+ cfg->affd = CD_AFFD(cd);
39
77
40
# SVE vector splice (predicated, destructive)
78
trace_smmuv3_decode_cd(cfg->oas);
41
SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
42
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/sve_helper.c
45
+++ b/target/arm/sve_helper.c
46
@@ -XXX,XX +XXX,XX @@ DO_ZPZ_D(sve_revh_d, uint64_t, hswap64)
47
48
DO_ZPZ_D(sve_revw_d, uint64_t, wswap64)
49
50
+void HELPER(sme_revd_q)(void *vd, void *vn, void *vg, uint32_t desc)
51
+{
52
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
53
+ uint64_t *d = vd, *n = vn;
54
+ uint8_t *pg = vg;
55
+
56
+ for (i = 0; i < opr_sz; i += 2) {
57
+ if (pg[H1(i)] & 1) {
58
+ uint64_t n0 = n[i + 0];
59
+ uint64_t n1 = n[i + 1];
60
+ d[i + 0] = n1;
61
+ d[i + 1] = n0;
62
+ }
63
+ }
64
+}
65
+
66
DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8)
67
DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16)
68
DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32)
69
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/translate-sve.c
72
+++ b/target/arm/translate-sve.c
73
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(REVH, aa64_sve, gen_gvec_ool_arg_zpz, revh_fns[a->esz], a, 0)
74
TRANS_FEAT(REVW, aa64_sve, gen_gvec_ool_arg_zpz,
75
a->esz == 3 ? gen_helper_sve_revw_d : NULL, a, 0)
76
77
+TRANS_FEAT(REVD, aa64_sme, gen_gvec_ool_arg_zpz, gen_helper_sme_revd_q, a, 0)
78
+
79
TRANS_FEAT(SPLICE, aa64_sve, gen_gvec_ool_arg_zpzz,
80
gen_helper_sve_splice, a, a->esz)
81
79
82
--
80
--
83
2.25.1
81
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
Move the checks out of the parsing loop and into the
3
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
restore function. This more closely mirrors the code
5
structure in the kernel, and is slightly clearer.
6
7
Reject rather than silently skip incorrect VL and SVE record sizes,
8
bringing our checks in to line with those the kernel does.
9
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20240213155214.13619-2-philmd@linaro.org
12
Message-id: 20220708151540.18136-40-richard.henderson@linaro.org
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
---
7
---
15
linux-user/aarch64/signal.c | 51 +++++++++++++++++++++++++------------
8
hw/arm/stellaris.c | 6 ++++--
16
1 file changed, 35 insertions(+), 16 deletions(-)
9
1 file changed, 4 insertions(+), 2 deletions(-)
17
10
18
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
11
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
19
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
20
--- a/linux-user/aarch64/signal.c
13
--- a/hw/arm/stellaris.c
21
+++ b/linux-user/aarch64/signal.c
14
+++ b/hw/arm/stellaris.c
22
@@ -XXX,XX +XXX,XX @@ static void target_restore_fpsimd_record(CPUARMState *env,
15
@@ -XXX,XX +XXX,XX @@ static void stellaris_adc_trigger(void *opaque, int irq, int level)
23
}
16
}
24
}
17
}
25
18
26
-static void target_restore_sve_record(CPUARMState *env,
19
-static void stellaris_adc_reset(StellarisADCState *s)
27
- struct target_sve_context *sve, int vq)
20
+static void stellaris_adc_reset_hold(Object *obj)
28
+static bool target_restore_sve_record(CPUARMState *env,
29
+ struct target_sve_context *sve,
30
+ int size)
31
{
21
{
32
- int i, j;
22
+ StellarisADCState *s = STELLARIS_ADC(obj);
33
+ int i, j, vl, vq;
23
int n;
34
24
35
- /* Note that SVE regs are stored as a byte stream, with each byte element
25
for (n = 0; n < 4; n++) {
36
+ if (!cpu_isar_feature(aa64_sve, env_archcpu(env))) {
26
@@ -XXX,XX +XXX,XX @@ static void stellaris_adc_init(Object *obj)
37
+ return false;
27
memory_region_init_io(&s->iomem, obj, &stellaris_adc_ops, s,
38
+ }
28
"adc", 0x1000);
39
+
29
sysbus_init_mmio(sbd, &s->iomem);
40
+ __get_user(vl, &sve->vl);
30
- stellaris_adc_reset(s);
41
+ vq = sve_vq(env);
31
qdev_init_gpio_in(dev, stellaris_adc_trigger, 1);
42
+
43
+ /* Reject mismatched VL. */
44
+ if (vl != vq * TARGET_SVE_VQ_BYTES) {
45
+ return false;
46
+ }
47
+
48
+ /* Accept empty record -- used to clear PSTATE.SM. */
49
+ if (size <= sizeof(*sve)) {
50
+ return true;
51
+ }
52
+
53
+ /* Reject non-empty but incomplete record. */
54
+ if (size < TARGET_SVE_SIG_CONTEXT_SIZE(vq)) {
55
+ return false;
56
+ }
57
+
58
+ /*
59
+ * Note that SVE regs are stored as a byte stream, with each byte element
60
* at a subsequent address. This corresponds to a little-endian load
61
* of our 64-bit hunks.
62
*/
63
@@ -XXX,XX +XXX,XX @@ static void target_restore_sve_record(CPUARMState *env,
64
}
65
}
66
}
67
+ return true;
68
}
32
}
69
33
70
static int target_restore_sigframe(CPUARMState *env,
34
@@ -XXX,XX +XXX,XX @@ static const TypeInfo stellaris_i2c_info = {
71
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
35
static void stellaris_adc_class_init(ObjectClass *klass, void *data)
72
struct target_sve_context *sve = NULL;
36
{
73
uint64_t extra_datap = 0;
37
DeviceClass *dc = DEVICE_CLASS(klass);
74
bool used_extra = false;
38
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
75
- int vq = 0, sve_size = 0;
39
76
+ int sve_size = 0;
40
+ rc->phases.hold = stellaris_adc_reset_hold;
77
41
dc->vmsd = &vmstate_stellaris_adc;
78
target_restore_general_frame(env, sf);
42
}
79
43
80
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
81
if (sve || size < sizeof(struct target_sve_context)) {
82
goto err;
83
}
84
- if (cpu_isar_feature(aa64_sve, env_archcpu(env))) {
85
- vq = sve_vq(env);
86
- sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
87
- if (size == sve_size) {
88
- sve = (struct target_sve_context *)ctx;
89
- break;
90
- }
91
- }
92
- goto err;
93
+ sve = (struct target_sve_context *)ctx;
94
+ sve_size = size;
95
+ break;
96
97
case TARGET_EXTRA_MAGIC:
98
if (extra || size != sizeof(struct target_extra_context)) {
99
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
100
}
101
102
/* SVE data, if present, overwrites FPSIMD data. */
103
- if (sve) {
104
- target_restore_sve_record(env, sve, vq);
105
+ if (sve && !target_restore_sve_record(env, sve, sve_size)) {
106
+ goto err;
107
}
108
unlock_user(extra, extra_datap, 0);
109
return 0;
110
--
44
--
111
2.25.1
45
2.34.1
46
47
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
These SME instructions are nominally within the SVE decode space,
3
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
4
so we add them to sve.decode and translate-sve.c.
4
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
5
Message-id: 20240213155214.13619-3-philmd@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-18-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
8
---
11
target/arm/translate-a64.h | 12 ++++++++++++
9
hw/arm/stellaris.c | 26 ++++++++++++++++++++++----
12
target/arm/sve.decode | 5 ++++-
10
1 file changed, 22 insertions(+), 4 deletions(-)
13
target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++++++++++++
14
3 files changed, 54 insertions(+), 1 deletion(-)
15
11
16
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
12
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/translate-a64.h
14
--- a/hw/arm/stellaris.c
19
+++ b/target/arm/translate-a64.h
15
+++ b/hw/arm/stellaris.c
20
@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
16
@@ -XXX,XX +XXX,XX @@ static void stellaris_sys_instance_init(Object *obj)
21
return s->vl;
17
s->sysclk = qdev_init_clock_out(DEVICE(s), "SYSCLK");
22
}
18
}
23
19
24
+/* Return the byte size of the vector register, SVL / 8. */
20
-/* I2C controller. */
25
+static inline int streaming_vec_reg_size(DisasContext *s)
21
+/*
26
+{
22
+ * I2C controller.
27
+ return s->svl;
23
+ * ??? For now we only implement the master interface.
24
+ */
25
26
#define TYPE_STELLARIS_I2C "stellaris-i2c"
27
OBJECT_DECLARE_SIMPLE_TYPE(stellaris_i2c_state, STELLARIS_I2C)
28
@@ -XXX,XX +XXX,XX @@ static void stellaris_i2c_write(void *opaque, hwaddr offset,
29
stellaris_i2c_update(s);
30
}
31
32
-static void stellaris_i2c_reset(stellaris_i2c_state *s)
33
+static void stellaris_i2c_reset_enter(Object *obj, ResetType type)
34
{
35
+ stellaris_i2c_state *s = STELLARIS_I2C(obj);
36
+
37
if (s->mcs & STELLARIS_I2C_MCS_BUSBSY)
38
i2c_end_transfer(s->bus);
28
+}
39
+}
29
+
40
+
30
/*
41
+static void stellaris_i2c_reset_hold(Object *obj)
31
* Return the offset info CPUARMState of the predicate vector register Pn.
32
* Note for this purpose, FFR is P16.
33
@@ -XXX,XX +XXX,XX @@ static inline int pred_full_reg_size(DisasContext *s)
34
return s->vl >> 3;
35
}
36
37
+/* Return the byte size of the predicate register, SVL / 64. */
38
+static inline int streaming_pred_reg_size(DisasContext *s)
39
+{
42
+{
40
+ return s->svl >> 3;
43
+ stellaris_i2c_state *s = STELLARIS_I2C(obj);
44
45
s->msa = 0;
46
s->mcs = 0;
47
@@ -XXX,XX +XXX,XX @@ static void stellaris_i2c_reset(stellaris_i2c_state *s)
48
s->mimr = 0;
49
s->mris = 0;
50
s->mcr = 0;
41
+}
51
+}
42
+
52
+
43
/*
53
+static void stellaris_i2c_reset_exit(Object *obj)
44
* Round up the size of a register to a size allowed by
54
+{
45
* the tcg vector infrastructure. Any operation which uses this
55
+ stellaris_i2c_state *s = STELLARIS_I2C(obj);
46
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
56
+
47
index XXXXXXX..XXXXXXX 100644
57
stellaris_i2c_update(s);
48
--- a/target/arm/sve.decode
49
+++ b/target/arm/sve.decode
50
@@ -XXX,XX +XXX,XX @@ INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5
51
# SVE index generation (register start, register increment)
52
INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm
53
54
-### SVE Stack Allocation Group
55
+### SVE / Streaming SVE Stack Allocation Group
56
57
# SVE stack frame adjustment
58
ADDVL 00000100 001 ..... 01010 ...... ..... @rd_rn_i6
59
+ADDSVL 00000100 001 ..... 01011 ...... ..... @rd_rn_i6
60
ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6
61
+ADDSPL 00000100 011 ..... 01011 ...... ..... @rd_rn_i6
62
63
# SVE stack frame size
64
RDVL 00000100 101 11111 01010 imm:s6 rd:5
65
+RDSVL 00000100 101 11111 01011 imm:s6 rd:5
66
67
### SVE Bitwise Shift - Unpredicated Group
68
69
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/translate-sve.c
72
+++ b/target/arm/translate-sve.c
73
@@ -XXX,XX +XXX,XX @@ static bool trans_ADDVL(DisasContext *s, arg_ADDVL *a)
74
return true;
75
}
58
}
76
59
77
+static bool trans_ADDSVL(DisasContext *s, arg_ADDSVL *a)
60
@@ -XXX,XX +XXX,XX @@ static void stellaris_i2c_init(Object *obj)
78
+{
61
memory_region_init_io(&s->iomem, obj, &stellaris_i2c_ops, s,
79
+ if (!dc_isar_feature(aa64_sme, s)) {
62
"i2c", 0x1000);
80
+ return false;
63
sysbus_init_mmio(sbd, &s->iomem);
81
+ }
64
- /* ??? For now we only implement the master interface. */
82
+ if (sme_enabled_check(s)) {
65
- stellaris_i2c_reset(s);
83
+ TCGv_i64 rd = cpu_reg_sp(s, a->rd);
66
}
84
+ TCGv_i64 rn = cpu_reg_sp(s, a->rn);
67
85
+ tcg_gen_addi_i64(rd, rn, a->imm * streaming_vec_reg_size(s));
68
/* Analogue to Digital Converter. This is only partially implemented,
86
+ }
69
@@ -XXX,XX +XXX,XX @@ type_init(stellaris_machine_init)
87
+ return true;
70
static void stellaris_i2c_class_init(ObjectClass *klass, void *data)
88
+}
89
+
90
static bool trans_ADDPL(DisasContext *s, arg_ADDPL *a)
91
{
71
{
92
if (!dc_isar_feature(aa64_sve, s)) {
72
DeviceClass *dc = DEVICE_CLASS(klass);
93
@@ -XXX,XX +XXX,XX @@ static bool trans_ADDPL(DisasContext *s, arg_ADDPL *a)
73
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
94
return true;
74
75
+ rc->phases.enter = stellaris_i2c_reset_enter;
76
+ rc->phases.hold = stellaris_i2c_reset_hold;
77
+ rc->phases.exit = stellaris_i2c_reset_exit;
78
dc->vmsd = &vmstate_stellaris_i2c;
95
}
79
}
96
80
97
+static bool trans_ADDSPL(DisasContext *s, arg_ADDSPL *a)
98
+{
99
+ if (!dc_isar_feature(aa64_sme, s)) {
100
+ return false;
101
+ }
102
+ if (sme_enabled_check(s)) {
103
+ TCGv_i64 rd = cpu_reg_sp(s, a->rd);
104
+ TCGv_i64 rn = cpu_reg_sp(s, a->rn);
105
+ tcg_gen_addi_i64(rd, rn, a->imm * streaming_pred_reg_size(s));
106
+ }
107
+ return true;
108
+}
109
+
110
static bool trans_RDVL(DisasContext *s, arg_RDVL *a)
111
{
112
if (!dc_isar_feature(aa64_sve, s)) {
113
@@ -XXX,XX +XXX,XX @@ static bool trans_RDVL(DisasContext *s, arg_RDVL *a)
114
return true;
115
}
116
117
+static bool trans_RDSVL(DisasContext *s, arg_RDSVL *a)
118
+{
119
+ if (!dc_isar_feature(aa64_sme, s)) {
120
+ return false;
121
+ }
122
+ if (sme_enabled_check(s)) {
123
+ TCGv_i64 reg = cpu_reg(s, a->rd);
124
+ tcg_gen_movi_i64(reg, a->imm * streaming_vec_reg_size(s));
125
+ }
126
+ return true;
127
+}
128
+
129
/*
130
*** SVE Compute Vector Address Group
131
*/
132
--
81
--
133
2.25.1
82
2.34.1
83
84
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
This is an SVE instruction that operates using the SVE vector
3
QDev objects created with qdev_new() need to manually add
4
length but that it is present only if SME is implemented.
4
their parent relationship with object_property_add_child().
5
5
6
This commit plug the devices which aren't part of the SoC;
7
they will be plugged into a SoC container in the next one.
8
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20240213155214.13619-4-philmd@linaro.org
8
Message-id: 20220708151540.18136-29-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
13
---
11
target/arm/sve.decode | 20 +++++++++++++
14
hw/arm/stellaris.c | 4 ++++
12
target/arm/translate-sve.c | 57 ++++++++++++++++++++++++++++++++++++++
15
1 file changed, 4 insertions(+)
13
2 files changed, 77 insertions(+)
14
16
15
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
17
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sve.decode
19
--- a/hw/arm/stellaris.c
18
+++ b/target/arm/sve.decode
20
+++ b/hw/arm/stellaris.c
19
@@ -XXX,XX +XXX,XX @@ BFMLALT_zzxw 01100100 11 1 ..... 0100.1 ..... ..... @rrxr_3a esz=2
21
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
20
22
&error_fatal);
21
### SVE2 floating-point bfloat16 dot-product (indexed)
23
22
BFDOT_zzxz 01100100 01 1 ..... 010000 ..... ..... @rrxr_2 esz=2
24
ssddev = qdev_new("ssd0323");
23
+
25
+ object_property_add_child(OBJECT(ms), "oled", OBJECT(ssddev));
24
+### SVE broadcast predicate element
26
qdev_prop_set_uint8(ssddev, "cs", 1);
25
+
27
qdev_realize_and_unref(ssddev, bus, &error_fatal);
26
+&psel esz pd pn pm rv imm
28
27
+%psel_rv 16:2 !function=plus_12
29
gpio_d_splitter = qdev_new(TYPE_SPLIT_IRQ);
28
+%psel_imm_b 22:2 19:2
30
+ object_property_add_child(OBJECT(ms), "splitter",
29
+%psel_imm_h 22:2 20:1
31
+ OBJECT(gpio_d_splitter));
30
+%psel_imm_s 22:2
32
qdev_prop_set_uint32(gpio_d_splitter, "num-lines", 2);
31
+%psel_imm_d 23:1
33
qdev_realize_and_unref(gpio_d_splitter, NULL, &error_fatal);
32
+@psel ........ .. . ... .. .. pn:4 . pm:4 . pd:4 \
34
qdev_connect_gpio_out(
33
+ &psel rv=%psel_rv
35
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
34
+
36
DeviceState *gpad;
35
+PSEL 00100101 .. 1 ..1 .. 01 .... 0 .... 0 .... \
37
36
+ @psel esz=0 imm=%psel_imm_b
38
gpad = qdev_new(TYPE_STELLARIS_GAMEPAD);
37
+PSEL 00100101 .. 1 .10 .. 01 .... 0 .... 0 .... \
39
+ object_property_add_child(OBJECT(ms), "gamepad", OBJECT(gpad));
38
+ @psel esz=1 imm=%psel_imm_h
40
for (i = 0; i < ARRAY_SIZE(gpad_keycode); i++) {
39
+PSEL 00100101 .. 1 100 .. 01 .... 0 .... 0 .... \
41
qlist_append_int(gpad_keycode_list, gpad_keycode[i]);
40
+ @psel esz=2 imm=%psel_imm_s
42
}
41
+PSEL 00100101 .1 1 000 .. 01 .... 0 .... 0 .... \
42
+ @psel esz=3 imm=%psel_imm_d
43
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/target/arm/translate-sve.c
46
+++ b/target/arm/translate-sve.c
47
@@ -XXX,XX +XXX,XX @@ static bool do_BFMLAL_zzxw(DisasContext *s, arg_rrxr_esz *a, bool sel)
48
49
TRANS_FEAT(BFMLALB_zzxw, aa64_sve_bf16, do_BFMLAL_zzxw, a, false)
50
TRANS_FEAT(BFMLALT_zzxw, aa64_sve_bf16, do_BFMLAL_zzxw, a, true)
51
+
52
+static bool trans_PSEL(DisasContext *s, arg_psel *a)
53
+{
54
+ int vl = vec_full_reg_size(s);
55
+ int pl = pred_gvec_reg_size(s);
56
+ int elements = vl >> a->esz;
57
+ TCGv_i64 tmp, didx, dbit;
58
+ TCGv_ptr ptr;
59
+
60
+ if (!dc_isar_feature(aa64_sme, s)) {
61
+ return false;
62
+ }
63
+ if (!sve_access_check(s)) {
64
+ return true;
65
+ }
66
+
67
+ tmp = tcg_temp_new_i64();
68
+ dbit = tcg_temp_new_i64();
69
+ didx = tcg_temp_new_i64();
70
+ ptr = tcg_temp_new_ptr();
71
+
72
+ /* Compute the predicate element. */
73
+ tcg_gen_addi_i64(tmp, cpu_reg(s, a->rv), a->imm);
74
+ if (is_power_of_2(elements)) {
75
+ tcg_gen_andi_i64(tmp, tmp, elements - 1);
76
+ } else {
77
+ tcg_gen_remu_i64(tmp, tmp, tcg_constant_i64(elements));
78
+ }
79
+
80
+ /* Extract the predicate byte and bit indices. */
81
+ tcg_gen_shli_i64(tmp, tmp, a->esz);
82
+ tcg_gen_andi_i64(dbit, tmp, 7);
83
+ tcg_gen_shri_i64(didx, tmp, 3);
84
+ if (HOST_BIG_ENDIAN) {
85
+ tcg_gen_xori_i64(didx, didx, 7);
86
+ }
87
+
88
+ /* Load the predicate word. */
89
+ tcg_gen_trunc_i64_ptr(ptr, didx);
90
+ tcg_gen_add_ptr(ptr, ptr, cpu_env);
91
+ tcg_gen_ld8u_i64(tmp, ptr, pred_full_reg_offset(s, a->pm));
92
+
93
+ /* Extract the predicate bit and replicate to MO_64. */
94
+ tcg_gen_shr_i64(tmp, tmp, dbit);
95
+ tcg_gen_andi_i64(tmp, tmp, 1);
96
+ tcg_gen_neg_i64(tmp, tmp);
97
+
98
+ /* Apply to either copy the source, or write zeros. */
99
+ tcg_gen_gvec_ands(MO_64, pred_full_reg_offset(s, a->pd),
100
+ pred_full_reg_offset(s, a->pn), tmp, pl, pl);
101
+
102
+ tcg_temp_free_i64(tmp);
103
+ tcg_temp_free_i64(dbit);
104
+ tcg_temp_free_i64(didx);
105
+ tcg_temp_free_ptr(ptr);
106
+ return true;
107
+}
108
--
43
--
109
2.25.1
44
2.34.1
45
46
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
Add a TCGv_ptr base argument, which will be cpu_env for SVE.
3
QDev objects created with qdev_new() need to manually add
4
We will reuse this for SME save and restore array insns.
4
their parent relationship with object_property_add_child().
5
5
6
Since we don't model the SoC, just use a QOM container.
7
8
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20240213155214.13619-5-philmd@linaro.org
8
Message-id: 20220708151540.18136-22-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
12
---
11
target/arm/translate-a64.h | 3 +++
13
hw/arm/stellaris.c | 11 ++++++++++-
12
target/arm/translate-sve.c | 48 ++++++++++++++++++++++++++++----------
14
1 file changed, 10 insertions(+), 1 deletion(-)
13
2 files changed, 39 insertions(+), 12 deletions(-)
14
15
15
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
16
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-a64.h
18
--- a/hw/arm/stellaris.c
18
+++ b/target/arm/translate-a64.h
19
+++ b/hw/arm/stellaris.c
19
@@ -XXX,XX +XXX,XX @@ void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
20
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
20
uint32_t rm_ofs, int64_t shift,
21
* 400fe000 system control
21
uint32_t opr_sz, uint32_t max_sz);
22
*/
22
23
23
+void gen_sve_ldr(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
24
+ Object *soc_container;
24
+void gen_sve_str(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
25
DeviceState *gpio_dev[7], *nvic;
26
qemu_irq gpio_in[7][8];
27
qemu_irq gpio_out[7][8];
28
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
29
flash_size = (((board->dc0 & 0xffff) + 1) << 1) * 1024;
30
sram_size = ((board->dc0 >> 18) + 1) * 1024;
31
32
+ soc_container = object_new("container");
33
+ object_property_add_child(OBJECT(ms), "soc", soc_container);
25
+
34
+
26
#endif /* TARGET_ARM_TRANSLATE_A64_H */
35
/* Flash programming is done via the SCU, so pretend it is ROM. */
27
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
36
memory_region_init_rom(flash, NULL, "stellaris.flash", flash_size,
28
index XXXXXXX..XXXXXXX 100644
37
&error_fatal);
29
--- a/target/arm/translate-sve.c
38
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
30
+++ b/target/arm/translate-sve.c
39
* need its sysclk output.
31
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(UCVTF_dd, aa64_sve, gen_gvec_fpst_arg_zpz,
40
*/
32
* The load should begin at the address Rn + IMM.
41
ssys_dev = qdev_new(TYPE_STELLARIS_SYS);
33
*/
42
+ object_property_add_child(soc_container, "sys", OBJECT(ssys_dev));
34
35
-static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
36
+void gen_sve_ldr(DisasContext *s, TCGv_ptr base, int vofs,
37
+ int len, int rn, int imm)
38
{
39
int len_align = QEMU_ALIGN_DOWN(len, 8);
40
int len_remain = len % 8;
41
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
42
t0 = tcg_temp_new_i64();
43
for (i = 0; i < len_align; i += 8) {
44
tcg_gen_qemu_ld_i64(t0, clean_addr, midx, MO_LEUQ);
45
- tcg_gen_st_i64(t0, cpu_env, vofs + i);
46
+ tcg_gen_st_i64(t0, base, vofs + i);
47
tcg_gen_addi_i64(clean_addr, clean_addr, 8);
48
}
49
tcg_temp_free_i64(t0);
50
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
51
clean_addr = new_tmp_a64_local(s);
52
tcg_gen_mov_i64(clean_addr, t0);
53
54
+ if (base != cpu_env) {
55
+ TCGv_ptr b = tcg_temp_local_new_ptr();
56
+ tcg_gen_mov_ptr(b, base);
57
+ base = b;
58
+ }
59
+
60
gen_set_label(loop);
61
62
t0 = tcg_temp_new_i64();
63
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
64
tcg_gen_addi_i64(clean_addr, clean_addr, 8);
65
66
tp = tcg_temp_new_ptr();
67
- tcg_gen_add_ptr(tp, cpu_env, i);
68
+ tcg_gen_add_ptr(tp, base, i);
69
tcg_gen_addi_ptr(i, i, 8);
70
tcg_gen_st_i64(t0, tp, vofs);
71
tcg_temp_free_ptr(tp);
72
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
73
74
tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
75
tcg_temp_free_ptr(i);
76
+
77
+ if (base != cpu_env) {
78
+ tcg_temp_free_ptr(base);
79
+ assert(len_remain == 0);
80
+ }
81
}
82
43
83
/*
44
/*
84
@@ -XXX,XX +XXX,XX @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
45
* Most devices come preprogrammed with a MAC address in the user data.
85
default:
46
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
86
g_assert_not_reached();
47
sysbus_realize_and_unref(SYS_BUS_DEVICE(ssys_dev), &error_fatal);
87
}
48
88
- tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
49
nvic = qdev_new(TYPE_ARMV7M);
89
+ tcg_gen_st_i64(t0, base, vofs + len_align);
50
+ object_property_add_child(soc_container, "v7m", OBJECT(nvic));
90
tcg_temp_free_i64(t0);
51
qdev_prop_set_uint32(nvic, "num-irq", NUM_IRQ_LINES);
91
}
52
qdev_prop_set_uint8(nvic, "num-prio-bits", NUM_PRIO_BITS);
92
}
53
qdev_prop_set_string(nvic, "cpu-type", ms->cpu_type);
93
54
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
94
/* Similarly for stores. */
55
95
-static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
56
dev = qdev_new(TYPE_STELLARIS_GPTM);
96
+void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
57
sbd = SYS_BUS_DEVICE(dev);
97
+ int len, int rn, int imm)
58
+ object_property_add_child(soc_container, "gptm[*]", OBJECT(dev));
98
{
59
qdev_connect_clock_in(dev, "clk",
99
int len_align = QEMU_ALIGN_DOWN(len, 8);
60
qdev_get_clock_out(ssys_dev, "SYSCLK"));
100
int len_remain = len % 8;
61
sysbus_realize_and_unref(sbd, &error_fatal);
101
@@ -XXX,XX +XXX,XX @@ static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
62
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
102
63
103
t0 = tcg_temp_new_i64();
64
if (board->dc1 & (1 << 3)) { /* watchdog present */
104
for (i = 0; i < len_align; i += 8) {
65
dev = qdev_new(TYPE_LUMINARY_WATCHDOG);
105
- tcg_gen_ld_i64(t0, cpu_env, vofs + i);
66
-
106
+ tcg_gen_ld_i64(t0, base, vofs + i);
67
+ object_property_add_child(soc_container, "wdg", OBJECT(dev));
107
tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUQ);
68
qdev_connect_clock_in(dev, "WDOGCLK",
108
tcg_gen_addi_i64(clean_addr, clean_addr, 8);
69
qdev_get_clock_out(ssys_dev, "SYSCLK"));
109
}
70
110
@@ -XXX,XX +XXX,XX @@ static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
71
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
111
clean_addr = new_tmp_a64_local(s);
72
SysBusDevice *sbd;
112
tcg_gen_mov_i64(clean_addr, t0);
73
113
74
dev = qdev_new("pl011_luminary");
114
+ if (base != cpu_env) {
75
+ object_property_add_child(soc_container, "uart[*]", OBJECT(dev));
115
+ TCGv_ptr b = tcg_temp_local_new_ptr();
76
sbd = SYS_BUS_DEVICE(dev);
116
+ tcg_gen_mov_ptr(b, base);
77
qdev_prop_set_chr(dev, "chardev", serial_hd(i));
117
+ base = b;
78
sysbus_realize_and_unref(sbd, &error_fatal);
118
+ }
79
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
119
+
80
DeviceState *enet;
120
gen_set_label(loop);
81
121
82
enet = qdev_new("stellaris_enet");
122
t0 = tcg_temp_new_i64();
83
+ object_property_add_child(soc_container, "enet", OBJECT(enet));
123
tp = tcg_temp_new_ptr();
84
if (nd) {
124
- tcg_gen_add_ptr(tp, cpu_env, i);
85
qdev_set_nic_properties(enet, nd);
125
+ tcg_gen_add_ptr(tp, base, i);
86
} else {
126
tcg_gen_ld_i64(t0, tp, vofs);
127
tcg_gen_addi_ptr(i, i, 8);
128
tcg_temp_free_ptr(tp);
129
@@ -XXX,XX +XXX,XX @@ static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
130
131
tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
132
tcg_temp_free_ptr(i);
133
+
134
+ if (base != cpu_env) {
135
+ tcg_temp_free_ptr(base);
136
+ assert(len_remain == 0);
137
+ }
138
}
139
140
/* Predicate register stores can be any multiple of 2. */
141
if (len_remain) {
142
t0 = tcg_temp_new_i64();
143
- tcg_gen_ld_i64(t0, cpu_env, vofs + len_align);
144
+ tcg_gen_ld_i64(t0, base, vofs + len_align);
145
146
switch (len_remain) {
147
case 2:
148
@@ -XXX,XX +XXX,XX @@ static bool trans_LDR_zri(DisasContext *s, arg_rri *a)
149
if (sve_access_check(s)) {
150
int size = vec_full_reg_size(s);
151
int off = vec_full_reg_offset(s, a->rd);
152
- do_ldr(s, off, size, a->rn, a->imm * size);
153
+ gen_sve_ldr(s, cpu_env, off, size, a->rn, a->imm * size);
154
}
155
return true;
156
}
157
@@ -XXX,XX +XXX,XX @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a)
158
if (sve_access_check(s)) {
159
int size = pred_full_reg_size(s);
160
int off = pred_full_reg_offset(s, a->rd);
161
- do_ldr(s, off, size, a->rn, a->imm * size);
162
+ gen_sve_ldr(s, cpu_env, off, size, a->rn, a->imm * size);
163
}
164
return true;
165
}
166
@@ -XXX,XX +XXX,XX @@ static bool trans_STR_zri(DisasContext *s, arg_rri *a)
167
if (sve_access_check(s)) {
168
int size = vec_full_reg_size(s);
169
int off = vec_full_reg_offset(s, a->rd);
170
- do_str(s, off, size, a->rn, a->imm * size);
171
+ gen_sve_str(s, cpu_env, off, size, a->rn, a->imm * size);
172
}
173
return true;
174
}
175
@@ -XXX,XX +XXX,XX @@ static bool trans_STR_pri(DisasContext *s, arg_rri *a)
176
if (sve_access_check(s)) {
177
int size = pred_full_reg_size(s);
178
int off = pred_full_reg_offset(s, a->rd);
179
- do_str(s, off, size, a->rn, a->imm * size);
180
+ gen_sve_str(s, cpu_env, off, size, a->rn, a->imm * size);
181
}
182
return true;
183
}
184
--
87
--
185
2.25.1
88
2.34.1
89
90
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
We support two different encodings for the AArch32 IMPDEF
2
CBAR register -- older cores like the Cortex A9, A7, A15
3
have this at 4, c15, c0, 0; newer cores like the
4
Cortex A35, A53, A57 and A72 have it at 1 c15 c0 0.
2
5
3
We can handle both exception entry and exception return by
6
When we implemented this we picked which encoding to
4
hooking into aarch64_sve_change_el.
7
use based on whether the CPU set ARM_FEATURE_AARCH64.
8
However this isn't right for three cases:
9
* the qemu-system-arm 'max' CPU, which is supposed to be
10
a variant on a Cortex-A57; it ought to use the same
11
encoding the A57 does and which the AArch64 'max'
12
exposes to AArch32 guest code
13
* the Cortex-R52, which is AArch32-only but has the CBAR
14
at the newer encoding (and where we incorrectly are
15
not yet setting ARM_FEATURE_CBAR_RO anyway)
16
* any possible future support for other v8 AArch32
17
only CPUs, or for supporting "boot the CPU into
18
AArch32 mode" on our existing cores like the A57 etc
5
19
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
20
Make the decision of the encoding be based on whether
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
21
the CPU implements the ARM_FEATURE_V8 flag instead.
8
Message-id: 20220708151540.18136-32-richard.henderson@linaro.org
22
23
This changes the behaviour only for the qemu-system-arm
24
'-cpu max'. We don't expect anybody to be relying on the
25
old behaviour because:
26
* it's not what the real hardware Cortex-A57 does
27
(and that's what our ID register claims we are)
28
* we don't implement the memory-mapped GICv3 support
29
which is the only thing that exists at the peripheral
30
base address pointed to by the register
31
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
32
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
33
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
34
Message-id: 20240206132931.38376-2-peter.maydell@linaro.org
10
---
35
---
11
target/arm/helper.c | 15 +++++++++++++--
36
target/arm/helper.c | 2 +-
12
1 file changed, 13 insertions(+), 2 deletions(-)
37
1 file changed, 1 insertion(+), 1 deletion(-)
13
38
14
diff --git a/target/arm/helper.c b/target/arm/helper.c
39
diff --git a/target/arm/helper.c b/target/arm/helper.c
15
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.c
41
--- a/target/arm/helper.c
17
+++ b/target/arm/helper.c
42
+++ b/target/arm/helper.c
18
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
43
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
19
return;
44
* AArch64 cores we might need to add a specific feature flag
20
}
45
* to indicate cores with "flavour 2" CBAR.
21
46
*/
22
+ old_a64 = old_el ? arm_el_is_aa64(env, old_el) : el0_a64;
47
- if (arm_feature(env, ARM_FEATURE_AARCH64)) {
23
+ new_a64 = new_el ? arm_el_is_aa64(env, new_el) : el0_a64;
48
+ if (arm_feature(env, ARM_FEATURE_V8)) {
24
+
49
/* 32 bit view is [31:18] 0...0 [43:32]. */
25
+ /*
50
uint32_t cbar32 = (extract64(cpu->reset_cbar, 18, 14) << 18)
26
+ * Both AArch64.TakeException and AArch64.ExceptionReturn
51
| extract64(cpu->reset_cbar, 32, 12);
27
+ * invoke ResetSVEState when taking an exception from, or
28
+ * returning to, AArch32 state when PSTATE.SM is enabled.
29
+ */
30
+ if (old_a64 != new_a64 && FIELD_EX64(env->svcr, SVCR, SM)) {
31
+ arm_reset_sve_state(env);
32
+ return;
33
+ }
34
+
35
/*
36
* DDI0584A.d sec 3.2: "If SVE instructions are disabled or trapped
37
* at ELx, or not available because the EL is in AArch32 state, then
38
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
39
* we already have the correct register contents when encountering the
40
* vq0->vq0 transition between EL0->EL1.
41
*/
42
- old_a64 = old_el ? arm_el_is_aa64(env, old_el) : el0_a64;
43
old_len = (old_a64 && !sve_exception_el(env, old_el)
44
? sve_vqm1_for_el(env, old_el) : 0);
45
- new_a64 = new_el ? arm_el_is_aa64(env, new_el) : el0_a64;
46
new_len = (new_a64 && !sve_exception_el(env, new_el)
47
? sve_vqm1_for_el(env, new_el) : 0);
48
49
--
52
--
50
2.25.1
53
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The Cortex-R52 implements the Configuration Base Address Register
2
(CBAR), as a read-only register. Add ARM_FEATURE_CBAR_RO to this CPU
3
type, so that our implementation provides the register and the
4
associated qdev property.
2
5
3
This is SMOPA, SUMOPA, USMOPA_s, UMOPA, for both Int8 and Int16.
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20240206132931.38376-3-peter.maydell@linaro.org
9
---
10
target/arm/tcg/cpu32.c | 1 +
11
1 file changed, 1 insertion(+)
4
12
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20220708151540.18136-28-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
target/arm/helper-sme.h | 16 ++++++++
11
target/arm/sme.decode | 10 +++++
12
target/arm/sme_helper.c | 82 ++++++++++++++++++++++++++++++++++++++
13
target/arm/translate-sme.c | 10 +++++
14
4 files changed, 118 insertions(+)
15
16
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
17
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper-sme.h
15
--- a/target/arm/tcg/cpu32.c
19
+++ b/target/arm/helper-sme.h
16
+++ b/target/arm/tcg/cpu32.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
17
@@ -XXX,XX +XXX,XX @@ static void cortex_r52_initfn(Object *obj)
21
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
18
set_feature(&cpu->env, ARM_FEATURE_PMSA);
22
DEF_HELPER_FLAGS_6(sme_bfmopa, TCG_CALL_NO_RWG,
19
set_feature(&cpu->env, ARM_FEATURE_NEON);
23
void, ptr, ptr, ptr, ptr, ptr, i32)
20
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
24
+DEF_HELPER_FLAGS_6(sme_smopa_s, TCG_CALL_NO_RWG,
21
+ set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
25
+ void, ptr, ptr, ptr, ptr, ptr, i32)
22
cpu->midr = 0x411fd133; /* r1p3 */
26
+DEF_HELPER_FLAGS_6(sme_umopa_s, TCG_CALL_NO_RWG,
23
cpu->revidr = 0x00000000;
27
+ void, ptr, ptr, ptr, ptr, ptr, i32)
24
cpu->reset_fpsid = 0x41034023;
28
+DEF_HELPER_FLAGS_6(sme_sumopa_s, TCG_CALL_NO_RWG,
29
+ void, ptr, ptr, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_6(sme_usmopa_s, TCG_CALL_NO_RWG,
31
+ void, ptr, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_6(sme_smopa_d, TCG_CALL_NO_RWG,
33
+ void, ptr, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_6(sme_umopa_d, TCG_CALL_NO_RWG,
35
+ void, ptr, ptr, ptr, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_6(sme_sumopa_d, TCG_CALL_NO_RWG,
37
+ void, ptr, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_6(sme_usmopa_d, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, ptr, ptr, i32)
40
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/sme.decode
43
+++ b/target/arm/sme.decode
44
@@ -XXX,XX +XXX,XX @@ FMOPA_d 10000000 110 ..... ... ... ..... . 0 ... @op_64
45
46
BFMOPA 10000001 100 ..... ... ... ..... . 00 .. @op_32
47
FMOPA_h 10000001 101 ..... ... ... ..... . 00 .. @op_32
48
+
49
+SMOPA_s 1010000 0 10 0 ..... ... ... ..... . 00 .. @op_32
50
+SUMOPA_s 1010000 0 10 1 ..... ... ... ..... . 00 .. @op_32
51
+USMOPA_s 1010000 1 10 0 ..... ... ... ..... . 00 .. @op_32
52
+UMOPA_s 1010000 1 10 1 ..... ... ... ..... . 00 .. @op_32
53
+
54
+SMOPA_d 1010000 0 11 0 ..... ... ... ..... . 0 ... @op_64
55
+SUMOPA_d 1010000 0 11 1 ..... ... ... ..... . 0 ... @op_64
56
+USMOPA_d 1010000 1 11 0 ..... ... ... ..... . 0 ... @op_64
57
+UMOPA_d 1010000 1 11 1 ..... ... ... ..... . 0 ... @op_64
58
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/sme_helper.c
61
+++ b/target/arm/sme_helper.c
62
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
63
} while (row & 15);
64
}
65
}
66
+
67
+typedef uint64_t IMOPFn(uint64_t, uint64_t, uint64_t, uint8_t, bool);
68
+
69
+static inline void do_imopa(uint64_t *za, uint64_t *zn, uint64_t *zm,
70
+ uint8_t *pn, uint8_t *pm,
71
+ uint32_t desc, IMOPFn *fn)
72
+{
73
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 8;
74
+ bool neg = simd_data(desc);
75
+
76
+ for (row = 0; row < oprsz; ++row) {
77
+ uint8_t pa = pn[H1(row)];
78
+ uint64_t *za_row = &za[tile_vslice_index(row)];
79
+ uint64_t n = zn[row];
80
+
81
+ for (col = 0; col < oprsz; ++col) {
82
+ uint8_t pb = pm[H1(col)];
83
+ uint64_t *a = &za_row[col];
84
+
85
+ *a = fn(n, zm[col], *a, pa & pb, neg);
86
+ }
87
+ }
88
+}
89
+
90
+#define DEF_IMOP_32(NAME, NTYPE, MTYPE) \
91
+static uint64_t NAME(uint64_t n, uint64_t m, uint64_t a, uint8_t p, bool neg) \
92
+{ \
93
+ uint32_t sum0 = 0, sum1 = 0; \
94
+ /* Apply P to N as a mask, making the inactive elements 0. */ \
95
+ n &= expand_pred_b(p); \
96
+ sum0 += (NTYPE)(n >> 0) * (MTYPE)(m >> 0); \
97
+ sum0 += (NTYPE)(n >> 8) * (MTYPE)(m >> 8); \
98
+ sum0 += (NTYPE)(n >> 16) * (MTYPE)(m >> 16); \
99
+ sum0 += (NTYPE)(n >> 24) * (MTYPE)(m >> 24); \
100
+ sum1 += (NTYPE)(n >> 32) * (MTYPE)(m >> 32); \
101
+ sum1 += (NTYPE)(n >> 40) * (MTYPE)(m >> 40); \
102
+ sum1 += (NTYPE)(n >> 48) * (MTYPE)(m >> 48); \
103
+ sum1 += (NTYPE)(n >> 56) * (MTYPE)(m >> 56); \
104
+ if (neg) { \
105
+ sum0 = (uint32_t)a - sum0, sum1 = (uint32_t)(a >> 32) - sum1; \
106
+ } else { \
107
+ sum0 = (uint32_t)a + sum0, sum1 = (uint32_t)(a >> 32) + sum1; \
108
+ } \
109
+ return ((uint64_t)sum1 << 32) | sum0; \
110
+}
111
+
112
+#define DEF_IMOP_64(NAME, NTYPE, MTYPE) \
113
+static uint64_t NAME(uint64_t n, uint64_t m, uint64_t a, uint8_t p, bool neg) \
114
+{ \
115
+ uint64_t sum = 0; \
116
+ /* Apply P to N as a mask, making the inactive elements 0. */ \
117
+ n &= expand_pred_h(p); \
118
+ sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0); \
119
+ sum += (NTYPE)(n >> 16) * (MTYPE)(m >> 16); \
120
+ sum += (NTYPE)(n >> 32) * (MTYPE)(m >> 32); \
121
+ sum += (NTYPE)(n >> 48) * (MTYPE)(m >> 48); \
122
+ return neg ? a - sum : a + sum; \
123
+}
124
+
125
+DEF_IMOP_32(smopa_s, int8_t, int8_t)
126
+DEF_IMOP_32(umopa_s, uint8_t, uint8_t)
127
+DEF_IMOP_32(sumopa_s, int8_t, uint8_t)
128
+DEF_IMOP_32(usmopa_s, uint8_t, int8_t)
129
+
130
+DEF_IMOP_64(smopa_d, int16_t, int16_t)
131
+DEF_IMOP_64(umopa_d, uint16_t, uint16_t)
132
+DEF_IMOP_64(sumopa_d, int16_t, uint16_t)
133
+DEF_IMOP_64(usmopa_d, uint16_t, int16_t)
134
+
135
+#define DEF_IMOPH(NAME) \
136
+ void HELPER(sme_##NAME)(void *vza, void *vzn, void *vzm, void *vpn, \
137
+ void *vpm, uint32_t desc) \
138
+ { do_imopa(vza, vzn, vzm, vpn, vpm, desc, NAME); }
139
+
140
+DEF_IMOPH(smopa_s)
141
+DEF_IMOPH(umopa_s)
142
+DEF_IMOPH(sumopa_s)
143
+DEF_IMOPH(usmopa_s)
144
+DEF_IMOPH(smopa_d)
145
+DEF_IMOPH(umopa_d)
146
+DEF_IMOPH(sumopa_d)
147
+DEF_IMOPH(usmopa_d)
148
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
149
index XXXXXXX..XXXXXXX 100644
150
--- a/target/arm/translate-sme.c
151
+++ b/target/arm/translate-sme.c
152
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, MO_64, gen_helper_sme_f
153
154
/* TODO: FEAT_EBF16 */
155
TRANS_FEAT(BFMOPA, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_bfmopa)
156
+
157
+TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)
158
+TRANS_FEAT(UMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_umopa_s)
159
+TRANS_FEAT(SUMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_sumopa_s)
160
+TRANS_FEAT(USMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_usmopa_s)
161
+
162
+TRANS_FEAT(SMOPA_d, aa64_sme_i16i64, do_outprod, a, MO_64, gen_helper_sme_smopa_d)
163
+TRANS_FEAT(UMOPA_d, aa64_sme_i16i64, do_outprod, a, MO_64, gen_helper_sme_umopa_d)
164
+TRANS_FEAT(SUMOPA_d, aa64_sme_i16i64, do_outprod, a, MO_64, gen_helper_sme_sumopa_d)
165
+TRANS_FEAT(USMOPA_d, aa64_sme_i16i64, do_outprod, a, MO_64, gen_helper_sme_usmopa_d)
166
--
25
--
167
2.25.1
26
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Add the Cortex-R52 IMPDEF sysregs, by defining them here and
2
also by enabling the AUXCR feature which defines the ACTLR
3
and HACTLR registers. As is our usual practice, we make these
4
simple reads-as-zero stubs for now.
2
5
3
We can reuse the SVE functions for implementing moves to/from
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
horizontal tile slices, but we need new ones for moves to/from
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
vertical tile slices.
8
Message-id: 20240206132931.38376-4-peter.maydell@linaro.org
9
---
10
target/arm/tcg/cpu32.c | 108 +++++++++++++++++++++++++++++++++++++++++
11
1 file changed, 108 insertions(+)
6
12
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20220708151540.18136-20-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
target/arm/helper-sme.h | 12 +++
13
target/arm/helper-sve.h | 2 +
14
target/arm/translate-a64.h | 8 ++
15
target/arm/translate.h | 5 ++
16
target/arm/sme.decode | 15 ++++
17
target/arm/sme_helper.c | 151 ++++++++++++++++++++++++++++++++++++-
18
target/arm/sve_helper.c | 12 +++
19
target/arm/translate-sme.c | 127 +++++++++++++++++++++++++++++++
20
8 files changed, 331 insertions(+), 1 deletion(-)
21
22
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
23
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
24
--- a/target/arm/helper-sme.h
15
--- a/target/arm/tcg/cpu32.c
25
+++ b/target/arm/helper-sme.h
16
+++ b/target/arm/tcg/cpu32.c
26
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
17
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
27
DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
18
define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
28
29
DEF_HELPER_FLAGS_3(sme_zero, TCG_CALL_NO_RWG, void, env, i32, i32)
30
+
31
+/* Move to/from vertical array slices, i.e. columns, so 'c'. */
32
+DEF_HELPER_FLAGS_4(sme_mova_cz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sme_mova_zc_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(sme_mova_cz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(sme_mova_zc_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_4(sme_mova_cz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_4(sme_mova_zc_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(sme_mova_cz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(sme_mova_zc_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(sme_mova_cz_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
41
+DEF_HELPER_FLAGS_4(sme_mova_zc_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/helper-sve.h
45
+++ b/target/arm/helper-sve.h
46
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
47
void, ptr, ptr, ptr, ptr, i32)
48
DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
49
void, ptr, ptr, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_q, TCG_CALL_NO_RWG,
51
+ void, ptr, ptr, ptr, ptr, i32)
52
53
DEF_HELPER_FLAGS_5(sve2_addp_zpzz_b, TCG_CALL_NO_RWG,
54
void, ptr, ptr, ptr, ptr, i32)
55
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
56
index XXXXXXX..XXXXXXX 100644
57
--- a/target/arm/translate-a64.h
58
+++ b/target/arm/translate-a64.h
59
@@ -XXX,XX +XXX,XX @@ static inline int pred_gvec_reg_size(DisasContext *s)
60
return size_for_gvec(pred_full_reg_size(s));
61
}
19
}
62
20
63
+/* Return a newly allocated pointer to the predicate register. */
21
+static const ARMCPRegInfo cortex_r52_cp_reginfo[] = {
64
+static inline TCGv_ptr pred_full_reg_ptr(DisasContext *s, int regno)
22
+ { .name = "CPUACTLR", .cp = 15, .opc1 = 0, .crm = 15,
65
+{
23
+ .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
66
+ TCGv_ptr ret = tcg_temp_new_ptr();
24
+ { .name = "IMP_ATCMREGIONR",
67
+ tcg_gen_addi_ptr(ret, cpu_env, pred_full_reg_offset(s, regno));
25
+ .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 0,
68
+ return ret;
26
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
69
+}
27
+ { .name = "IMP_BTCMREGIONR",
70
+
28
+ .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 1,
71
bool disas_sve(DisasContext *, uint32_t);
29
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
72
bool disas_sme(DisasContext *, uint32_t);
30
+ { .name = "IMP_CTCMREGIONR",
73
31
+ .cp = 15, .opc1 = 0, .crn = 9, .crm = 1, .opc2 = 2,
74
diff --git a/target/arm/translate.h b/target/arm/translate.h
32
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
75
index XXXXXXX..XXXXXXX 100644
33
+ { .name = "IMP_CSCTLR",
76
--- a/target/arm/translate.h
34
+ .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 0,
77
+++ b/target/arm/translate.h
35
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
78
@@ -XXX,XX +XXX,XX @@ static inline int plus_2(DisasContext *s, int x)
36
+ { .name = "IMP_BPCTLR",
79
return x + 2;
37
+ .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 1,
80
}
38
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
81
39
+ { .name = "IMP_MEMPROTCLR",
82
+static inline int plus_12(DisasContext *s, int x)
40
+ .cp = 15, .opc1 = 1, .crn = 9, .crm = 1, .opc2 = 2,
83
+{
41
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
84
+ return x + 12;
42
+ { .name = "IMP_SLAVEPCTLR",
85
+}
43
+ .cp = 15, .opc1 = 0, .crn = 11, .crm = 0, .opc2 = 0,
86
+
44
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
87
static inline int times_2(DisasContext *s, int x)
45
+ { .name = "IMP_PERIPHREGIONR",
88
{
46
+ .cp = 15, .opc1 = 0, .crn = 15, .crm = 0, .opc2 = 0,
89
return x * 2;
47
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
90
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
48
+ { .name = "IMP_FLASHIFREGIONR",
91
index XXXXXXX..XXXXXXX 100644
49
+ .cp = 15, .opc1 = 0, .crn = 15, .crm = 0, .opc2 = 1,
92
--- a/target/arm/sme.decode
50
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
93
+++ b/target/arm/sme.decode
51
+ { .name = "IMP_BUILDOPTR",
94
@@ -XXX,XX +XXX,XX @@
52
+ .cp = 15, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 0,
95
### SME Misc
53
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
96
54
+ { .name = "IMP_PINOPTR",
97
ZERO 11000000 00 001 00000000000 imm:8
55
+ .cp = 15, .opc1 = 0, .crn = 15, .crm = 2, .opc2 = 7,
98
+
56
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
99
+### SME Move into/from Array
57
+ { .name = "IMP_QOSR",
100
+
58
+ .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 1,
101
+%mova_rs 13:2 !function=plus_12
59
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
102
+&mova esz rs pg zr za_imm v:bool to_vec:bool
60
+ { .name = "IMP_BUSTIMEOUTR",
103
+
61
+ .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 2,
104
+MOVA 11000000 esz:2 00000 0 v:1 .. pg:3 zr:5 0 za_imm:4 \
62
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
105
+ &mova to_vec=0 rs=%mova_rs
63
+ { .name = "IMP_INTMONR",
106
+MOVA 11000000 11 00000 1 v:1 .. pg:3 zr:5 0 za_imm:4 \
64
+ .cp = 15, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 4,
107
+ &mova to_vec=0 rs=%mova_rs esz=4
65
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
108
+
66
+ { .name = "IMP_ICERR0",
109
+MOVA 11000000 esz:2 00001 0 v:1 .. pg:3 0 za_imm:4 zr:5 \
67
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 0, .opc2 = 0,
110
+ &mova to_vec=1 rs=%mova_rs
68
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
111
+MOVA 11000000 11 00001 1 v:1 .. pg:3 0 za_imm:4 zr:5 \
69
+ { .name = "IMP_ICERR1",
112
+ &mova to_vec=1 rs=%mova_rs esz=4
70
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 0, .opc2 = 1,
113
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
71
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
114
index XXXXXXX..XXXXXXX 100644
72
+ { .name = "IMP_DCERR0",
115
--- a/target/arm/sme_helper.c
73
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 1, .opc2 = 0,
116
+++ b/target/arm/sme_helper.c
74
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
117
@@ -XXX,XX +XXX,XX @@
75
+ { .name = "IMP_DCERR1",
118
76
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 1, .opc2 = 1,
119
#include "qemu/osdep.h"
77
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
120
#include "cpu.h"
78
+ { .name = "IMP_TCMERR0",
121
-#include "internals.h"
79
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 0,
122
+#include "tcg/tcg-gvec-desc.h"
80
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
123
#include "exec/helper-proto.h"
81
+ { .name = "IMP_TCMERR1",
124
+#include "qemu/int128.h"
82
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 1,
125
+#include "vec_internal.h"
83
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
126
84
+ { .name = "IMP_TCMSYNDR0",
127
/* ResetSVEState */
85
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 2,
128
void arm_reset_sve_state(CPUARMState *env)
86
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
129
@@ -XXX,XX +XXX,XX @@ void helper_sme_zero(CPUARMState *env, uint32_t imm, uint32_t svl)
87
+ { .name = "IMP_TCMSYNDR1",
130
}
88
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 2, .opc2 = 3,
131
}
89
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
132
}
90
+ { .name = "IMP_FLASHERR0",
91
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 3, .opc2 = 0,
92
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
93
+ { .name = "IMP_FLASHERR1",
94
+ .cp = 15, .opc1 = 2, .crn = 15, .crm = 3, .opc2 = 1,
95
+ .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
96
+ { .name = "IMP_CDBGDR0",
97
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 0, .opc2 = 0,
98
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
99
+ { .name = "IMP_CBDGBR1",
100
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 0, .opc2 = 1,
101
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
102
+ { .name = "IMP_TESTR0",
103
+ .cp = 15, .opc1 = 4, .crn = 15, .crm = 0, .opc2 = 0,
104
+ .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
105
+ { .name = "IMP_TESTR1",
106
+ .cp = 15, .opc1 = 4, .crn = 15, .crm = 0, .opc2 = 1,
107
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
108
+ { .name = "IMP_CDBGDCI",
109
+ .cp = 15, .opc1 = 0, .crn = 15, .crm = 15, .opc2 = 0,
110
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
111
+ { .name = "IMP_CDBGDCT",
112
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 2, .opc2 = 0,
113
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
114
+ { .name = "IMP_CDBGICT",
115
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 2, .opc2 = 1,
116
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
117
+ { .name = "IMP_CDBGDCD",
118
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 4, .opc2 = 0,
119
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
120
+ { .name = "IMP_CDBGICD",
121
+ .cp = 15, .opc1 = 3, .crn = 15, .crm = 4, .opc2 = 1,
122
+ .access = PL1_W, .type = ARM_CP_NOP, .resetvalue = 0 },
123
+};
133
+
124
+
134
+
125
+
135
+/*
126
static void cortex_r52_initfn(Object *obj)
136
+ * When considering the ZA storage as an array of elements of
127
{
137
+ * type T, the index within that array of the Nth element of
128
ARMCPU *cpu = ARM_CPU(obj);
138
+ * a vertical slice of a tile can be calculated like this,
129
@@ -XXX,XX +XXX,XX @@ static void cortex_r52_initfn(Object *obj)
139
+ * regardless of the size of type T. This is because the tiles
130
set_feature(&cpu->env, ARM_FEATURE_NEON);
140
+ * are interleaved, so if type T is size N bytes then row 1 of
131
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
141
+ * the tile is N rows away from row 0. The division by N to
132
set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
142
+ * convert a byte offset into an array index and the multiplication
133
+ set_feature(&cpu->env, ARM_FEATURE_AUXCR);
143
+ * by N to convert from vslice-index-within-the-tile to
134
cpu->midr = 0x411fd133; /* r1p3 */
144
+ * the index within the ZA storage cancel out.
135
cpu->revidr = 0x00000000;
145
+ */
136
cpu->reset_fpsid = 0x41034023;
146
+#define tile_vslice_index(i) ((i) * sizeof(ARMVectorReg))
137
@@ -XXX,XX +XXX,XX @@ static void cortex_r52_initfn(Object *obj)
138
139
cpu->pmsav7_dregion = 16;
140
cpu->pmsav8r_hdregion = 16;
147
+
141
+
148
+/*
142
+ define_arm_cp_regs(cpu, cortex_r52_cp_reginfo);
149
+ * When doing byte arithmetic on the ZA storage, the element
150
+ * byteoff bytes away in a tile vertical slice is always this
151
+ * many bytes away in the ZA storage, regardless of the
152
+ * size of the tile element, assuming that byteoff is a multiple
153
+ * of the element size. Again this is because of the interleaving
154
+ * of the tiles. For instance if we have 1 byte per element then
155
+ * each row of the ZA storage has one byte of the vslice data,
156
+ * and (counting from 0) byte 8 goes in row 8 of the storage
157
+ * at offset (8 * row-size-in-bytes).
158
+ * If we have 8 bytes per element then each row of the ZA storage
159
+ * has 8 bytes of the data, but there are 8 interleaved tiles and
160
+ * so byte 8 of the data goes into row 1 of the tile,
161
+ * which is again row 8 of the storage, so the offset is still
162
+ * (8 * row-size-in-bytes). Similarly for other element sizes.
163
+ */
164
+#define tile_vslice_offset(byteoff) ((byteoff) * sizeof(ARMVectorReg))
165
+
166
+
167
+/*
168
+ * Move Zreg vector to ZArray column.
169
+ */
170
+#define DO_MOVA_C(NAME, TYPE, H) \
171
+void HELPER(NAME)(void *za, void *vn, void *vg, uint32_t desc) \
172
+{ \
173
+ int i, oprsz = simd_oprsz(desc); \
174
+ for (i = 0; i < oprsz; ) { \
175
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
176
+ do { \
177
+ if (pg & 1) { \
178
+ *(TYPE *)(za + tile_vslice_offset(i)) = *(TYPE *)(vn + H(i)); \
179
+ } \
180
+ i += sizeof(TYPE); \
181
+ pg >>= sizeof(TYPE); \
182
+ } while (i & 15); \
183
+ } \
184
+}
185
+
186
+DO_MOVA_C(sme_mova_cz_b, uint8_t, H1)
187
+DO_MOVA_C(sme_mova_cz_h, uint16_t, H1_2)
188
+DO_MOVA_C(sme_mova_cz_s, uint32_t, H1_4)
189
+
190
+void HELPER(sme_mova_cz_d)(void *za, void *vn, void *vg, uint32_t desc)
191
+{
192
+ int i, oprsz = simd_oprsz(desc) / 8;
193
+ uint8_t *pg = vg;
194
+ uint64_t *n = vn;
195
+ uint64_t *a = za;
196
+
197
+ for (i = 0; i < oprsz; i++) {
198
+ if (pg[H1(i)] & 1) {
199
+ a[tile_vslice_index(i)] = n[i];
200
+ }
201
+ }
202
+}
203
+
204
+void HELPER(sme_mova_cz_q)(void *za, void *vn, void *vg, uint32_t desc)
205
+{
206
+ int i, oprsz = simd_oprsz(desc) / 16;
207
+ uint16_t *pg = vg;
208
+ Int128 *n = vn;
209
+ Int128 *a = za;
210
+
211
+ /*
212
+ * Int128 is used here simply to copy 16 bytes, and to simplify
213
+ * the address arithmetic.
214
+ */
215
+ for (i = 0; i < oprsz; i++) {
216
+ if (pg[H2(i)] & 1) {
217
+ a[tile_vslice_index(i)] = n[i];
218
+ }
219
+ }
220
+}
221
+
222
+#undef DO_MOVA_C
223
+
224
+/*
225
+ * Move ZArray column to Zreg vector.
226
+ */
227
+#define DO_MOVA_Z(NAME, TYPE, H) \
228
+void HELPER(NAME)(void *vd, void *za, void *vg, uint32_t desc) \
229
+{ \
230
+ int i, oprsz = simd_oprsz(desc); \
231
+ for (i = 0; i < oprsz; ) { \
232
+ uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
233
+ do { \
234
+ if (pg & 1) { \
235
+ *(TYPE *)(vd + H(i)) = *(TYPE *)(za + tile_vslice_offset(i)); \
236
+ } \
237
+ i += sizeof(TYPE); \
238
+ pg >>= sizeof(TYPE); \
239
+ } while (i & 15); \
240
+ } \
241
+}
242
+
243
+DO_MOVA_Z(sme_mova_zc_b, uint8_t, H1)
244
+DO_MOVA_Z(sme_mova_zc_h, uint16_t, H1_2)
245
+DO_MOVA_Z(sme_mova_zc_s, uint32_t, H1_4)
246
+
247
+void HELPER(sme_mova_zc_d)(void *vd, void *za, void *vg, uint32_t desc)
248
+{
249
+ int i, oprsz = simd_oprsz(desc) / 8;
250
+ uint8_t *pg = vg;
251
+ uint64_t *d = vd;
252
+ uint64_t *a = za;
253
+
254
+ for (i = 0; i < oprsz; i++) {
255
+ if (pg[H1(i)] & 1) {
256
+ d[i] = a[tile_vslice_index(i)];
257
+ }
258
+ }
259
+}
260
+
261
+void HELPER(sme_mova_zc_q)(void *vd, void *za, void *vg, uint32_t desc)
262
+{
263
+ int i, oprsz = simd_oprsz(desc) / 16;
264
+ uint16_t *pg = vg;
265
+ Int128 *d = vd;
266
+ Int128 *a = za;
267
+
268
+ /*
269
+ * Int128 is used here simply to copy 16 bytes, and to simplify
270
+ * the address arithmetic.
271
+ */
272
+ for (i = 0; i < oprsz; i++, za += sizeof(ARMVectorReg)) {
273
+ if (pg[H2(i)] & 1) {
274
+ d[i] = a[tile_vslice_index(i)];
275
+ }
276
+ }
277
+}
278
+
279
+#undef DO_MOVA_Z
280
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
281
index XXXXXXX..XXXXXXX 100644
282
--- a/target/arm/sve_helper.c
283
+++ b/target/arm/sve_helper.c
284
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
285
}
286
}
143
}
287
144
288
+void HELPER(sve_sel_zpzz_q)(void *vd, void *vn, void *vm,
145
static void cortex_r5f_initfn(Object *obj)
289
+ void *vg, uint32_t desc)
290
+{
291
+ intptr_t i, opr_sz = simd_oprsz(desc) / 16;
292
+ Int128 *d = vd, *n = vn, *m = vm;
293
+ uint16_t *pg = vg;
294
+
295
+ for (i = 0; i < opr_sz; i += 1) {
296
+ d[i] = (pg[H2(i)] & 1 ? n : m)[i];
297
+ }
298
+}
299
+
300
/* Two operand comparison controlled by a predicate.
301
* ??? It is very tempting to want to be able to expand this inline
302
* with x86 instructions, e.g.
303
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
304
index XXXXXXX..XXXXXXX 100644
305
--- a/target/arm/translate-sme.c
306
+++ b/target/arm/translate-sme.c
307
@@ -XXX,XX +XXX,XX @@
308
#include "decode-sme.c.inc"
309
310
311
+/*
312
+ * Resolve tile.size[index] to a host pointer, where tile and index
313
+ * are always decoded together, dependent on the element size.
314
+ */
315
+static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
316
+ int tile_index, bool vertical)
317
+{
318
+ int tile = tile_index >> (4 - esz);
319
+ int index = esz == MO_128 ? 0 : extract32(tile_index, 0, 4 - esz);
320
+ int pos, len, offset;
321
+ TCGv_i32 tmp;
322
+ TCGv_ptr addr;
323
+
324
+ /* Compute the final index, which is Rs+imm. */
325
+ tmp = tcg_temp_new_i32();
326
+ tcg_gen_trunc_tl_i32(tmp, cpu_reg(s, rs));
327
+ tcg_gen_addi_i32(tmp, tmp, index);
328
+
329
+ /* Prepare a power-of-two modulo via extraction of @len bits. */
330
+ len = ctz32(streaming_vec_reg_size(s)) - esz;
331
+
332
+ if (vertical) {
333
+ /*
334
+ * Compute the byte offset of the index within the tile:
335
+ * (index % (svl / size)) * size
336
+ * = (index % (svl >> esz)) << esz
337
+ * Perform the power-of-two modulo via extraction of the low @len bits.
338
+ * Perform the multiply by shifting left by @pos bits.
339
+ * Perform these operations simultaneously via deposit into zero.
340
+ */
341
+ pos = esz;
342
+ tcg_gen_deposit_z_i32(tmp, tmp, pos, len);
343
+
344
+ /*
345
+ * For big-endian, adjust the indexed column byte offset within
346
+ * the uint64_t host words that make up env->zarray[].
347
+ */
348
+ if (HOST_BIG_ENDIAN && esz < MO_64) {
349
+ tcg_gen_xori_i32(tmp, tmp, 8 - (1 << esz));
350
+ }
351
+ } else {
352
+ /*
353
+ * Compute the byte offset of the index within the tile:
354
+ * (index % (svl / size)) * (size * sizeof(row))
355
+ * = (index % (svl >> esz)) << (esz + log2(sizeof(row)))
356
+ */
357
+ pos = esz + ctz32(sizeof(ARMVectorReg));
358
+ tcg_gen_deposit_z_i32(tmp, tmp, pos, len);
359
+
360
+ /* Row slices are always aligned and need no endian adjustment. */
361
+ }
362
+
363
+ /* The tile byte offset within env->zarray is the row. */
364
+ offset = tile * sizeof(ARMVectorReg);
365
+
366
+ /* Include the byte offset of zarray to make this relative to env. */
367
+ offset += offsetof(CPUARMState, zarray);
368
+ tcg_gen_addi_i32(tmp, tmp, offset);
369
+
370
+ /* Add the byte offset to env to produce the final pointer. */
371
+ addr = tcg_temp_new_ptr();
372
+ tcg_gen_ext_i32_ptr(addr, tmp);
373
+ tcg_temp_free_i32(tmp);
374
+ tcg_gen_add_ptr(addr, addr, cpu_env);
375
+
376
+ return addr;
377
+}
378
+
379
static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
380
{
381
if (!dc_isar_feature(aa64_sme, s)) {
382
@@ -XXX,XX +XXX,XX @@ static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
383
}
384
return true;
385
}
386
+
387
+static bool trans_MOVA(DisasContext *s, arg_MOVA *a)
388
+{
389
+ static gen_helper_gvec_4 * const h_fns[5] = {
390
+ gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h,
391
+ gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d,
392
+ gen_helper_sve_sel_zpzz_q
393
+ };
394
+ static gen_helper_gvec_3 * const cz_fns[5] = {
395
+ gen_helper_sme_mova_cz_b, gen_helper_sme_mova_cz_h,
396
+ gen_helper_sme_mova_cz_s, gen_helper_sme_mova_cz_d,
397
+ gen_helper_sme_mova_cz_q,
398
+ };
399
+ static gen_helper_gvec_3 * const zc_fns[5] = {
400
+ gen_helper_sme_mova_zc_b, gen_helper_sme_mova_zc_h,
401
+ gen_helper_sme_mova_zc_s, gen_helper_sme_mova_zc_d,
402
+ gen_helper_sme_mova_zc_q,
403
+ };
404
+
405
+ TCGv_ptr t_za, t_zr, t_pg;
406
+ TCGv_i32 t_desc;
407
+ int svl;
408
+
409
+ if (!dc_isar_feature(aa64_sme, s)) {
410
+ return false;
411
+ }
412
+ if (!sme_smza_enabled_check(s)) {
413
+ return true;
414
+ }
415
+
416
+ t_za = get_tile_rowcol(s, a->esz, a->rs, a->za_imm, a->v);
417
+ t_zr = vec_full_reg_ptr(s, a->zr);
418
+ t_pg = pred_full_reg_ptr(s, a->pg);
419
+
420
+ svl = streaming_vec_reg_size(s);
421
+ t_desc = tcg_constant_i32(simd_desc(svl, svl, 0));
422
+
423
+ if (a->v) {
424
+ /* Vertical slice -- use sme mova helpers. */
425
+ if (a->to_vec) {
426
+ zc_fns[a->esz](t_zr, t_za, t_pg, t_desc);
427
+ } else {
428
+ cz_fns[a->esz](t_za, t_zr, t_pg, t_desc);
429
+ }
430
+ } else {
431
+ /* Horizontal slice -- reuse sve sel helpers. */
432
+ if (a->to_vec) {
433
+ h_fns[a->esz](t_zr, t_za, t_zr, t_pg, t_desc);
434
+ } else {
435
+ h_fns[a->esz](t_za, t_zr, t_za, t_pg, t_desc);
436
+ }
437
+ }
438
+
439
+ tcg_temp_free_ptr(t_za);
440
+ tcg_temp_free_ptr(t_zr);
441
+ tcg_temp_free_ptr(t_pg);
442
+
443
+ return true;
444
+}
445
--
146
--
446
2.25.1
147
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Architecturally, the AArch32 MSR/MRS to/from banked register
2
instructions are UNPREDICTABLE for attempts to access a banked
3
register that the guest could access in a more direct way (e.g.
4
using this insn to access r8_fiq when already in FIQ mode). QEMU has
5
chosen to UNDEF on all of these.
2
6
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
However, for the case of accessing SPSR_hyp from hyp mode, it turns
4
Message-id: 20220708151540.18136-26-richard.henderson@linaro.org
8
out that real hardware permits this, with the same effect as if the
9
guest had directly written to SPSR. Further, there is some
10
guest code out there that assumes it can do this, because it
11
happens to work on hardware: an example Cortex-R52 startup code
12
fragment uses this, and it got copied into various other places,
13
including Zephyr. Zephyr was fixed to not use this:
14
https://github.com/zephyrproject-rtos/zephyr/issues/47330
15
but other examples are still out there, like the selftest
16
binary for the MPS3-AN536.
17
18
For convenience of being able to run guest code, permit
19
this UNPREDICTABLE access instead of UNDEFing it.
20
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
22
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
23
Message-id: 20240206132931.38376-5-peter.maydell@linaro.org
7
---
24
---
8
target/arm/helper-sme.h | 2 ++
25
target/arm/tcg/op_helper.c | 43 ++++++++++++++++++++++++++------------
9
target/arm/sme.decode | 2 ++
26
target/arm/tcg/translate.c | 19 +++++++++++------
10
target/arm/sme_helper.c | 56 ++++++++++++++++++++++++++++++++++++++
27
2 files changed, 43 insertions(+), 19 deletions(-)
11
target/arm/translate-sme.c | 30 ++++++++++++++++++++
12
4 files changed, 90 insertions(+)
13
28
14
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
29
diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c
15
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sme.h
31
--- a/target/arm/tcg/op_helper.c
17
+++ b/target/arm/helper-sme.h
32
+++ b/target/arm/tcg/op_helper.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
33
@@ -XXX,XX +XXX,XX @@ static void msr_mrs_banked_exc_checks(CPUARMState *env, uint32_t tgtmode,
19
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
34
*/
20
DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
35
int curmode = env->uncached_cpsr & CPSR_M;
21
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
36
22
+DEF_HELPER_FLAGS_6(sme_bfmopa, TCG_CALL_NO_RWG,
37
- if (regno == 17) {
23
+ void, ptr, ptr, ptr, ptr, ptr, i32)
38
- /* ELR_Hyp: a special case because access from tgtmode is OK */
24
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
39
- if (curmode != ARM_CPU_MODE_HYP && curmode != ARM_CPU_MODE_MON) {
25
index XXXXXXX..XXXXXXX 100644
40
- goto undef;
26
--- a/target/arm/sme.decode
41
+ if (tgtmode == ARM_CPU_MODE_HYP) {
27
+++ b/target/arm/sme.decode
42
+ /*
28
@@ -XXX,XX +XXX,XX @@ ADDVA_d 11000000 11 01000 1 ... ... ..... 00 ... @adda_64
43
+ * Handle Hyp target regs first because some are special cases
29
44
+ * which don't want the usual "not accessible from tgtmode" check.
30
FMOPA_s 10000000 100 ..... ... ... ..... . 00 .. @op_32
45
+ */
31
FMOPA_d 10000000 110 ..... ... ... ..... . 0 ... @op_64
46
+ switch (regno) {
32
+
47
+ case 16 ... 17: /* ELR_Hyp, SPSR_Hyp */
33
+BFMOPA 10000001 100 ..... ... ... ..... . 00 .. @op_32
48
+ if (curmode != ARM_CPU_MODE_HYP && curmode != ARM_CPU_MODE_MON) {
34
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
49
+ goto undef;
35
index XXXXXXX..XXXXXXX 100644
50
+ }
36
--- a/target/arm/sme_helper.c
51
+ break;
37
+++ b/target/arm/sme_helper.c
52
+ case 13:
38
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_fmopa_d)(void *vza, void *vzn, void *vzm, void *vpn,
53
+ if (curmode != ARM_CPU_MODE_MON) {
54
+ goto undef;
55
+ }
56
+ break;
57
+ default:
58
+ g_assert_not_reached();
59
}
60
return;
61
}
62
@@ -XXX,XX +XXX,XX @@ static void msr_mrs_banked_exc_checks(CPUARMState *env, uint32_t tgtmode,
39
}
63
}
40
}
64
}
41
}
65
42
+
66
- if (tgtmode == ARM_CPU_MODE_HYP) {
43
+/*
67
- /* SPSR_Hyp, r13_hyp: accessible from Monitor mode only */
44
+ * Alter PAIR as needed for controlling predicates being false,
68
- if (curmode != ARM_CPU_MODE_MON) {
45
+ * and for NEG on an enabled row element.
69
- goto undef;
46
+ */
70
- }
47
+static inline uint32_t f16mop_adj_pair(uint32_t pair, uint32_t pg, uint32_t neg)
71
- }
48
+{
72
-
49
+ /*
73
return;
50
+ * The pseudocode uses a conditional negate after the conditional zero.
74
51
+ * It is simpler here to unconditionally negate before conditional zero.
75
undef:
52
+ */
76
@@ -XXX,XX +XXX,XX @@ void HELPER(msr_banked)(CPUARMState *env, uint32_t value, uint32_t tgtmode,
53
+ pair ^= neg;
77
54
+ if (!(pg & 1)) {
78
switch (regno) {
55
+ pair &= 0xffff0000u;
79
case 16: /* SPSRs */
56
+ }
80
- env->banked_spsr[bank_number(tgtmode)] = value;
57
+ if (!(pg & 4)) {
81
+ if (tgtmode == (env->uncached_cpsr & CPSR_M)) {
58
+ pair &= 0x0000ffffu;
82
+ /* Only happens for SPSR_Hyp access in Hyp mode */
59
+ }
83
+ env->spsr = value;
60
+ return pair;
84
+ } else {
61
+}
85
+ env->banked_spsr[bank_number(tgtmode)] = value;
62
+
86
+ }
63
+void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
87
break;
64
+ void *vpm, uint32_t desc)
88
case 17: /* ELR_Hyp */
65
+{
89
env->elr_el[2] = value;
66
+ intptr_t row, col, oprsz = simd_maxsz(desc);
90
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mrs_banked)(CPUARMState *env, uint32_t tgtmode, uint32_t regno)
67
+ uint32_t neg = simd_data(desc) * 0x80008000u;
91
68
+ uint16_t *pn = vpn, *pm = vpm;
92
switch (regno) {
69
+
93
case 16: /* SPSRs */
70
+ for (row = 0; row < oprsz; ) {
94
- return env->banked_spsr[bank_number(tgtmode)];
71
+ uint16_t prow = pn[H2(row >> 4)];
95
+ if (tgtmode == (env->uncached_cpsr & CPSR_M)) {
72
+ do {
96
+ /* Only happens for SPSR_Hyp access in Hyp mode */
73
+ void *vza_row = vza + tile_vslice_offset(row);
97
+ return env->spsr;
74
+ uint32_t n = *(uint32_t *)(vzn + H1_4(row));
98
+ } else {
75
+
99
+ return env->banked_spsr[bank_number(tgtmode)];
76
+ n = f16mop_adj_pair(n, prow, neg);
100
+ }
77
+
101
case 17: /* ELR_Hyp */
78
+ for (col = 0; col < oprsz; ) {
102
return env->elr_el[2];
79
+ uint16_t pcol = pm[H2(col >> 4)];
103
case 13:
80
+ do {
104
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
81
+ if (prow & pcol & 0b0101) {
82
+ uint32_t *a = vza_row + H1_4(col);
83
+ uint32_t m = *(uint32_t *)(vzm + H1_4(col));
84
+
85
+ m = f16mop_adj_pair(m, pcol, 0);
86
+ *a = bfdotadd(*a, n, m);
87
+
88
+ col += 4;
89
+ pcol >>= 4;
90
+ }
91
+ } while (col & 15);
92
+ }
93
+ row += 4;
94
+ prow >>= 4;
95
+ } while (row & 15);
96
+ }
97
+}
98
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
99
index XXXXXXX..XXXXXXX 100644
105
index XXXXXXX..XXXXXXX 100644
100
--- a/target/arm/translate-sme.c
106
--- a/target/arm/tcg/translate.c
101
+++ b/target/arm/translate-sme.c
107
+++ b/target/arm/tcg/translate.c
102
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(ADDVA_s, aa64_sme, do_adda, a, MO_32, gen_helper_sme_addva_s)
108
@@ -XXX,XX +XXX,XX @@ static bool msr_banked_access_decode(DisasContext *s, int r, int sysm, int rn,
103
TRANS_FEAT(ADDHA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addha_d)
109
break;
104
TRANS_FEAT(ADDVA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addva_d)
110
case ARM_CPU_MODE_HYP:
105
111
/*
106
+static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz,
112
- * SPSR_hyp and r13_hyp can only be accessed from Monitor mode
107
+ gen_helper_gvec_5 *fn)
113
- * (and so we can forbid accesses from EL2 or below). elr_hyp
108
+{
114
- * can be accessed also from Hyp mode, so forbid accesses from
109
+ int svl = streaming_vec_reg_size(s);
115
- * EL0 or EL1.
110
+ uint32_t desc = simd_desc(svl, svl, a->sub);
116
+ * r13_hyp can only be accessed from Monitor mode, and so we
111
+ TCGv_ptr za, zn, zm, pn, pm;
117
+ * can forbid accesses from EL2 or below.
112
+
118
+ * elr_hyp can be accessed also from Hyp mode, so forbid
113
+ if (!sme_smza_enabled_check(s)) {
119
+ * accesses from EL0 or EL1.
114
+ return true;
120
+ * SPSR_hyp is supposed to be in the same category as r13_hyp
115
+ }
121
+ * and UNPREDICTABLE if accessed from anything except Monitor
116
+
122
+ * mode. However there is some real-world code that will do
117
+ /* Sum XZR+zad to find ZAd. */
123
+ * it because at least some hardware happens to permit the
118
+ za = get_tile_rowcol(s, esz, 31, a->zad, false);
124
+ * access. (Notably a standard Cortex-R52 startup code fragment
119
+ zn = vec_full_reg_ptr(s, a->zn);
125
+ * does this.) So we permit SPSR_hyp from Hyp mode also, to allow
120
+ zm = vec_full_reg_ptr(s, a->zm);
126
+ * this (incorrect) guest code to run.
121
+ pn = pred_full_reg_ptr(s, a->pn);
127
*/
122
+ pm = pred_full_reg_ptr(s, a->pm);
128
- if (!arm_dc_feature(s, ARM_FEATURE_EL2) || s->current_el < 2 ||
123
+
129
- (s->current_el < 3 && *regno != 17)) {
124
+ fn(za, zn, zm, pn, pm, tcg_constant_i32(desc));
130
+ if (!arm_dc_feature(s, ARM_FEATURE_EL2) || s->current_el < 2
125
+
131
+ || (s->current_el < 3 && *regno != 16 && *regno != 17)) {
126
+ tcg_temp_free_ptr(za);
132
goto undef;
127
+ tcg_temp_free_ptr(zn);
133
}
128
+ tcg_temp_free_ptr(pn);
134
break;
129
+ tcg_temp_free_ptr(pm);
130
+ return true;
131
+}
132
+
133
static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
134
gen_helper_gvec_5_ptr *fn)
135
{
136
@@ -XXX,XX +XXX,XX @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
137
138
TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_s)
139
TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, MO_64, gen_helper_sme_fmopa_d)
140
+
141
+/* TODO: FEAT_EBF16 */
142
+TRANS_FEAT(BFMOPA, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_bfmopa)
143
--
135
--
144
2.25.1
136
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
We currently guard the CFG3 register read with
2
(scc_partno(s) == 0x524 && scc_partno(s) == 0x547)
3
which is clearly wrong as it is never true.
2
4
3
We can reuse the SVE functions for LDR and STR, passing in the
5
This register is present on all board types except AN524
4
base of the ZA vector and a zero offset.
6
and AN527; correct the condition.
5
7
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Fixes: 6ac80818941829c0 ("hw/misc/mps2-scc: Implement changes for AN547")
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-23-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20240206132931.38376-6-peter.maydell@linaro.org
10
---
13
---
11
target/arm/sme.decode | 7 +++++++
14
hw/misc/mps2-scc.c | 2 +-
12
target/arm/translate-sme.c | 24 ++++++++++++++++++++++++
15
1 file changed, 1 insertion(+), 1 deletion(-)
13
2 files changed, 31 insertions(+)
14
16
15
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
17
diff --git a/hw/misc/mps2-scc.c b/hw/misc/mps2-scc.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme.decode
19
--- a/hw/misc/mps2-scc.c
18
+++ b/target/arm/sme.decode
20
+++ b/hw/misc/mps2-scc.c
19
@@ -XXX,XX +XXX,XX @@ LDST1 1110000 0 esz:2 st:1 rm:5 v:1 .. pg:3 rn:5 0 za_imm:4 \
21
@@ -XXX,XX +XXX,XX @@ static uint64_t mps2_scc_read(void *opaque, hwaddr offset, unsigned size)
20
&ldst rs=%mova_rs
22
r = s->cfg2;
21
LDST1 1110000 111 st:1 rm:5 v:1 .. pg:3 rn:5 0 za_imm:4 \
23
break;
22
&ldst esz=4 rs=%mova_rs
24
case A_CFG3:
23
+
25
- if (scc_partno(s) == 0x524 && scc_partno(s) == 0x547) {
24
+&ldstr rv rn imm
26
+ if (scc_partno(s) == 0x524 || scc_partno(s) == 0x547) {
25
+@ldstr ....... ... . ...... .. ... rn:5 . imm:4 \
27
/* CFG3 reserved on AN524 */
26
+ &ldstr rv=%mova_rs
28
goto bad_offset;
27
+
29
}
28
+LDR 1110000 100 0 000000 .. 000 ..... 0 .... @ldstr
29
+STR 1110000 100 1 000000 .. 000 ..... 0 .... @ldstr
30
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/translate-sme.c
33
+++ b/target/arm/translate-sme.c
34
@@ -XXX,XX +XXX,XX @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
35
tcg_temp_free_i64(addr);
36
return true;
37
}
38
+
39
+typedef void GenLdStR(DisasContext *, TCGv_ptr, int, int, int, int);
40
+
41
+static bool do_ldst_r(DisasContext *s, arg_ldstr *a, GenLdStR *fn)
42
+{
43
+ int svl = streaming_vec_reg_size(s);
44
+ int imm = a->imm;
45
+ TCGv_ptr base;
46
+
47
+ if (!sme_za_enabled_check(s)) {
48
+ return true;
49
+ }
50
+
51
+ /* ZA[n] equates to ZA0H.B[n]. */
52
+ base = get_tile_rowcol(s, MO_8, a->rv, imm, false);
53
+
54
+ fn(s, base, 0, svl, a->rn, imm * svl);
55
+
56
+ tcg_temp_free_ptr(base);
57
+ return true;
58
+}
59
+
60
+TRANS_FEAT(LDR, aa64_sme, do_ldst_r, a, gen_sve_ldr)
61
+TRANS_FEAT(STR, aa64_sme, do_ldst_r, a, gen_sve_str)
62
--
30
--
63
2.25.1
31
2.34.1
32
33
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The MPS SCC device has a lot of different flavours for the various
2
different MPS FPGA images, which look mostly similar but have
3
differences in how particular registers are handled. Currently we
4
deal with this with a lot of open-coded checks on scc_partno(), but
5
as we add more board types this is getting a bit hard to read.
2
6
3
We cannot reuse the SVE functions for LD[1-4] and ST[1-4],
7
Factor out the conditions into some functions which we can
4
because those functions accept only a Zreg register number.
8
give more descriptive names to.
5
For SME, we want to pass a pointer into ZA storage.
6
9
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20220708151540.18136-21-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20240206132931.38376-7-peter.maydell@linaro.org
11
---
14
---
12
target/arm/helper-sme.h | 82 +++++
15
hw/misc/mps2-scc.c | 45 +++++++++++++++++++++++++++++++--------------
13
target/arm/sme.decode | 9 +
16
1 file changed, 31 insertions(+), 14 deletions(-)
14
target/arm/sme_helper.c | 595 +++++++++++++++++++++++++++++++++++++
15
target/arm/translate-sme.c | 70 +++++
16
4 files changed, 756 insertions(+)
17
17
18
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
18
diff --git a/hw/misc/mps2-scc.c b/hw/misc/mps2-scc.c
19
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-sme.h
20
--- a/hw/misc/mps2-scc.c
21
+++ b/target/arm/helper-sme.h
21
+++ b/hw/misc/mps2-scc.c
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sme_mova_cz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
@@ -XXX,XX +XXX,XX @@ static int scc_partno(MPS2SCC *s)
23
DEF_HELPER_FLAGS_4(sme_mova_zc_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
return extract32(s->id, 4, 8);
24
DEF_HELPER_FLAGS_4(sme_mova_cz_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_4(sme_mova_zc_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+
27
+DEF_HELPER_FLAGS_5(sme_ld1b_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
28
+DEF_HELPER_FLAGS_5(sme_ld1b_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
29
+DEF_HELPER_FLAGS_5(sme_ld1b_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
30
+DEF_HELPER_FLAGS_5(sme_ld1b_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
31
+
32
+DEF_HELPER_FLAGS_5(sme_ld1h_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
33
+DEF_HELPER_FLAGS_5(sme_ld1h_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
34
+DEF_HELPER_FLAGS_5(sme_ld1h_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
35
+DEF_HELPER_FLAGS_5(sme_ld1h_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
36
+DEF_HELPER_FLAGS_5(sme_ld1h_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
37
+DEF_HELPER_FLAGS_5(sme_ld1h_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
38
+DEF_HELPER_FLAGS_5(sme_ld1h_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
39
+DEF_HELPER_FLAGS_5(sme_ld1h_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
40
+
41
+DEF_HELPER_FLAGS_5(sme_ld1s_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
42
+DEF_HELPER_FLAGS_5(sme_ld1s_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
43
+DEF_HELPER_FLAGS_5(sme_ld1s_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
44
+DEF_HELPER_FLAGS_5(sme_ld1s_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
45
+DEF_HELPER_FLAGS_5(sme_ld1s_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
46
+DEF_HELPER_FLAGS_5(sme_ld1s_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
47
+DEF_HELPER_FLAGS_5(sme_ld1s_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
48
+DEF_HELPER_FLAGS_5(sme_ld1s_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
49
+
50
+DEF_HELPER_FLAGS_5(sme_ld1d_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
51
+DEF_HELPER_FLAGS_5(sme_ld1d_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
52
+DEF_HELPER_FLAGS_5(sme_ld1d_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
53
+DEF_HELPER_FLAGS_5(sme_ld1d_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
54
+DEF_HELPER_FLAGS_5(sme_ld1d_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
55
+DEF_HELPER_FLAGS_5(sme_ld1d_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
56
+DEF_HELPER_FLAGS_5(sme_ld1d_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
57
+DEF_HELPER_FLAGS_5(sme_ld1d_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
58
+
59
+DEF_HELPER_FLAGS_5(sme_ld1q_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
60
+DEF_HELPER_FLAGS_5(sme_ld1q_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
61
+DEF_HELPER_FLAGS_5(sme_ld1q_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
62
+DEF_HELPER_FLAGS_5(sme_ld1q_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
63
+DEF_HELPER_FLAGS_5(sme_ld1q_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
64
+DEF_HELPER_FLAGS_5(sme_ld1q_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
65
+DEF_HELPER_FLAGS_5(sme_ld1q_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
66
+DEF_HELPER_FLAGS_5(sme_ld1q_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
67
+
68
+DEF_HELPER_FLAGS_5(sme_st1b_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
69
+DEF_HELPER_FLAGS_5(sme_st1b_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
70
+DEF_HELPER_FLAGS_5(sme_st1b_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
71
+DEF_HELPER_FLAGS_5(sme_st1b_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
72
+
73
+DEF_HELPER_FLAGS_5(sme_st1h_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
74
+DEF_HELPER_FLAGS_5(sme_st1h_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
75
+DEF_HELPER_FLAGS_5(sme_st1h_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
76
+DEF_HELPER_FLAGS_5(sme_st1h_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
77
+DEF_HELPER_FLAGS_5(sme_st1h_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
78
+DEF_HELPER_FLAGS_5(sme_st1h_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
79
+DEF_HELPER_FLAGS_5(sme_st1h_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
80
+DEF_HELPER_FLAGS_5(sme_st1h_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
81
+
82
+DEF_HELPER_FLAGS_5(sme_st1s_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
83
+DEF_HELPER_FLAGS_5(sme_st1s_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
84
+DEF_HELPER_FLAGS_5(sme_st1s_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
85
+DEF_HELPER_FLAGS_5(sme_st1s_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
86
+DEF_HELPER_FLAGS_5(sme_st1s_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
87
+DEF_HELPER_FLAGS_5(sme_st1s_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
88
+DEF_HELPER_FLAGS_5(sme_st1s_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
89
+DEF_HELPER_FLAGS_5(sme_st1s_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
90
+
91
+DEF_HELPER_FLAGS_5(sme_st1d_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
92
+DEF_HELPER_FLAGS_5(sme_st1d_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
93
+DEF_HELPER_FLAGS_5(sme_st1d_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
94
+DEF_HELPER_FLAGS_5(sme_st1d_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
95
+DEF_HELPER_FLAGS_5(sme_st1d_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
96
+DEF_HELPER_FLAGS_5(sme_st1d_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
97
+DEF_HELPER_FLAGS_5(sme_st1d_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
98
+DEF_HELPER_FLAGS_5(sme_st1d_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
99
+
100
+DEF_HELPER_FLAGS_5(sme_st1q_be_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
101
+DEF_HELPER_FLAGS_5(sme_st1q_le_h, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
102
+DEF_HELPER_FLAGS_5(sme_st1q_be_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
103
+DEF_HELPER_FLAGS_5(sme_st1q_le_v, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
104
+DEF_HELPER_FLAGS_5(sme_st1q_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
105
+DEF_HELPER_FLAGS_5(sme_st1q_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
106
+DEF_HELPER_FLAGS_5(sme_st1q_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
107
+DEF_HELPER_FLAGS_5(sme_st1q_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
108
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
109
index XXXXXXX..XXXXXXX 100644
110
--- a/target/arm/sme.decode
111
+++ b/target/arm/sme.decode
112
@@ -XXX,XX +XXX,XX @@ MOVA 11000000 esz:2 00001 0 v:1 .. pg:3 0 za_imm:4 zr:5 \
113
&mova to_vec=1 rs=%mova_rs
114
MOVA 11000000 11 00001 1 v:1 .. pg:3 0 za_imm:4 zr:5 \
115
&mova to_vec=1 rs=%mova_rs esz=4
116
+
117
+### SME Memory
118
+
119
+&ldst esz rs pg rn rm za_imm v:bool st:bool
120
+
121
+LDST1 1110000 0 esz:2 st:1 rm:5 v:1 .. pg:3 rn:5 0 za_imm:4 \
122
+ &ldst rs=%mova_rs
123
+LDST1 1110000 111 st:1 rm:5 v:1 .. pg:3 rn:5 0 za_imm:4 \
124
+ &ldst esz=4 rs=%mova_rs
125
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
126
index XXXXXXX..XXXXXXX 100644
127
--- a/target/arm/sme_helper.c
128
+++ b/target/arm/sme_helper.c
129
@@ -XXX,XX +XXX,XX @@
130
131
#include "qemu/osdep.h"
132
#include "cpu.h"
133
+#include "internals.h"
134
#include "tcg/tcg-gvec-desc.h"
135
#include "exec/helper-proto.h"
136
+#include "exec/cpu_ldst.h"
137
+#include "exec/exec-all.h"
138
#include "qemu/int128.h"
139
#include "vec_internal.h"
140
+#include "sve_ldst_internal.h"
141
142
/* ResetSVEState */
143
void arm_reset_sve_state(CPUARMState *env)
144
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_mova_zc_q)(void *vd, void *za, void *vg, uint32_t desc)
145
}
24
}
146
25
147
#undef DO_MOVA_Z
26
+/* Is CFG_REG2 present? */
148
+
27
+static bool have_cfg2(MPS2SCC *s)
149
+/*
150
+ * Clear elements in a tile slice comprising len bytes.
151
+ */
152
+
153
+typedef void ClearFn(void *ptr, size_t off, size_t len);
154
+
155
+static void clear_horizontal(void *ptr, size_t off, size_t len)
156
+{
28
+{
157
+ memset(ptr + off, 0, len);
29
+ return scc_partno(s) == 0x524 || scc_partno(s) == 0x547;
158
+}
30
+}
159
+
31
+
160
+static void clear_vertical_b(void *vptr, size_t off, size_t len)
32
+/* Is CFG_REG3 present? */
33
+static bool have_cfg3(MPS2SCC *s)
161
+{
34
+{
162
+ for (size_t i = 0; i < len; ++i) {
35
+ return scc_partno(s) != 0x524 && scc_partno(s) != 0x547;
163
+ *(uint8_t *)(vptr + tile_vslice_offset(i + off)) = 0;
164
+ }
165
+}
36
+}
166
+
37
+
167
+static void clear_vertical_h(void *vptr, size_t off, size_t len)
38
+/* Is CFG_REG5 present? */
39
+static bool have_cfg5(MPS2SCC *s)
168
+{
40
+{
169
+ for (size_t i = 0; i < len; i += 2) {
41
+ return scc_partno(s) == 0x524 || scc_partno(s) == 0x547;
170
+ *(uint16_t *)(vptr + tile_vslice_offset(i + off)) = 0;
171
+ }
172
+}
42
+}
173
+
43
+
174
+static void clear_vertical_s(void *vptr, size_t off, size_t len)
44
+/* Is CFG_REG6 present? */
45
+static bool have_cfg6(MPS2SCC *s)
175
+{
46
+{
176
+ for (size_t i = 0; i < len; i += 4) {
47
+ return scc_partno(s) == 0x524;
177
+ *(uint32_t *)(vptr + tile_vslice_offset(i + off)) = 0;
178
+ }
179
+}
48
+}
180
+
49
+
181
+static void clear_vertical_d(void *vptr, size_t off, size_t len)
50
/* Handle a write via the SYS_CFG channel to the specified function/device.
182
+{
51
* Return false on error (reported to guest via SYS_CFGCTRL ERROR bit).
183
+ for (size_t i = 0; i < len; i += 8) {
52
*/
184
+ *(uint64_t *)(vptr + tile_vslice_offset(i + off)) = 0;
53
@@ -XXX,XX +XXX,XX @@ static uint64_t mps2_scc_read(void *opaque, hwaddr offset, unsigned size)
185
+ }
54
r = s->cfg1;
186
+}
55
break;
187
+
56
case A_CFG2:
188
+static void clear_vertical_q(void *vptr, size_t off, size_t len)
57
- if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
189
+{
58
- /* CFG2 reserved on other boards */
190
+ for (size_t i = 0; i < len; i += 16) {
59
+ if (!have_cfg2(s)) {
191
+ memset(vptr + tile_vslice_offset(i + off), 0, 16);
60
goto bad_offset;
192
+ }
61
}
193
+}
62
r = s->cfg2;
194
+
63
break;
195
+/*
64
case A_CFG3:
196
+ * Copy elements from an array into a tile slice comprising len bytes.
65
- if (scc_partno(s) == 0x524 || scc_partno(s) == 0x547) {
197
+ */
66
- /* CFG3 reserved on AN524 */
198
+
67
+ if (!have_cfg3(s)) {
199
+typedef void CopyFn(void *dst, const void *src, size_t len);
68
goto bad_offset;
200
+
69
}
201
+static void copy_horizontal(void *dst, const void *src, size_t len)
70
/* These are user-settable DIP switches on the board. We don't
202
+{
71
@@ -XXX,XX +XXX,XX @@ static uint64_t mps2_scc_read(void *opaque, hwaddr offset, unsigned size)
203
+ memcpy(dst, src, len);
72
r = s->cfg4;
204
+}
73
break;
205
+
74
case A_CFG5:
206
+static void copy_vertical_b(void *vdst, const void *vsrc, size_t len)
75
- if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
207
+{
76
- /* CFG5 reserved on other boards */
208
+ const uint8_t *src = vsrc;
77
+ if (!have_cfg5(s)) {
209
+ uint8_t *dst = vdst;
78
goto bad_offset;
210
+ size_t i;
79
}
211
+
80
r = s->cfg5;
212
+ for (i = 0; i < len; ++i) {
81
break;
213
+ dst[tile_vslice_index(i)] = src[i];
82
case A_CFG6:
214
+ }
83
- if (scc_partno(s) != 0x524) {
215
+}
84
- /* CFG6 reserved on other boards */
216
+
85
+ if (!have_cfg6(s)) {
217
+static void copy_vertical_h(void *vdst, const void *vsrc, size_t len)
86
goto bad_offset;
218
+{
87
}
219
+ const uint16_t *src = vsrc;
88
r = s->cfg6;
220
+ uint16_t *dst = vdst;
89
@@ -XXX,XX +XXX,XX @@ static void mps2_scc_write(void *opaque, hwaddr offset, uint64_t value,
221
+ size_t i;
90
}
222
+
91
break;
223
+ for (i = 0; i < len / 2; ++i) {
92
case A_CFG2:
224
+ dst[tile_vslice_index(i)] = src[i];
93
- if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
225
+ }
94
- /* CFG2 reserved on other boards */
226
+}
95
+ if (!have_cfg2(s)) {
227
+
96
goto bad_offset;
228
+static void copy_vertical_s(void *vdst, const void *vsrc, size_t len)
97
}
229
+{
98
/* AN524: QSPI Select signal */
230
+ const uint32_t *src = vsrc;
99
s->cfg2 = value;
231
+ uint32_t *dst = vdst;
100
break;
232
+ size_t i;
101
case A_CFG5:
233
+
102
- if (scc_partno(s) != 0x524 && scc_partno(s) != 0x547) {
234
+ for (i = 0; i < len / 4; ++i) {
103
- /* CFG5 reserved on other boards */
235
+ dst[tile_vslice_index(i)] = src[i];
104
+ if (!have_cfg5(s)) {
236
+ }
105
goto bad_offset;
237
+}
106
}
238
+
107
/* AN524: ACLK frequency in Hz */
239
+static void copy_vertical_d(void *vdst, const void *vsrc, size_t len)
108
s->cfg5 = value;
240
+{
109
break;
241
+ const uint64_t *src = vsrc;
110
case A_CFG6:
242
+ uint64_t *dst = vdst;
111
- if (scc_partno(s) != 0x524) {
243
+ size_t i;
112
- /* CFG6 reserved on other boards */
244
+
113
+ if (!have_cfg6(s)) {
245
+ for (i = 0; i < len / 8; ++i) {
114
goto bad_offset;
246
+ dst[tile_vslice_index(i)] = src[i];
115
}
247
+ }
116
/* AN524: Clock divider for BRAM */
248
+}
249
+
250
+static void copy_vertical_q(void *vdst, const void *vsrc, size_t len)
251
+{
252
+ for (size_t i = 0; i < len; i += 16) {
253
+ memcpy(vdst + tile_vslice_offset(i), vsrc + i, 16);
254
+ }
255
+}
256
+
257
+/*
258
+ * Host and TLB primitives for vertical tile slice addressing.
259
+ */
260
+
261
+#define DO_LD(NAME, TYPE, HOST, TLB) \
262
+static inline void sme_##NAME##_v_host(void *za, intptr_t off, void *host) \
263
+{ \
264
+ TYPE val = HOST(host); \
265
+ *(TYPE *)(za + tile_vslice_offset(off)) = val; \
266
+} \
267
+static inline void sme_##NAME##_v_tlb(CPUARMState *env, void *za, \
268
+ intptr_t off, target_ulong addr, uintptr_t ra) \
269
+{ \
270
+ TYPE val = TLB(env, useronly_clean_ptr(addr), ra); \
271
+ *(TYPE *)(za + tile_vslice_offset(off)) = val; \
272
+}
273
+
274
+#define DO_ST(NAME, TYPE, HOST, TLB) \
275
+static inline void sme_##NAME##_v_host(void *za, intptr_t off, void *host) \
276
+{ \
277
+ TYPE val = *(TYPE *)(za + tile_vslice_offset(off)); \
278
+ HOST(host, val); \
279
+} \
280
+static inline void sme_##NAME##_v_tlb(CPUARMState *env, void *za, \
281
+ intptr_t off, target_ulong addr, uintptr_t ra) \
282
+{ \
283
+ TYPE val = *(TYPE *)(za + tile_vslice_offset(off)); \
284
+ TLB(env, useronly_clean_ptr(addr), val, ra); \
285
+}
286
+
287
+/*
288
+ * The ARMVectorReg elements are stored in host-endian 64-bit units.
289
+ * For 128-bit quantities, the sequence defined by the Elem[] pseudocode
290
+ * corresponds to storing the two 64-bit pieces in little-endian order.
291
+ */
292
+#define DO_LDQ(HNAME, VNAME, BE, HOST, TLB) \
293
+static inline void HNAME##_host(void *za, intptr_t off, void *host) \
294
+{ \
295
+ uint64_t val0 = HOST(host), val1 = HOST(host + 8); \
296
+ uint64_t *ptr = za + off; \
297
+ ptr[0] = BE ? val1 : val0, ptr[1] = BE ? val0 : val1; \
298
+} \
299
+static inline void VNAME##_v_host(void *za, intptr_t off, void *host) \
300
+{ \
301
+ HNAME##_host(za, tile_vslice_offset(off), host); \
302
+} \
303
+static inline void HNAME##_tlb(CPUARMState *env, void *za, intptr_t off, \
304
+ target_ulong addr, uintptr_t ra) \
305
+{ \
306
+ uint64_t val0 = TLB(env, useronly_clean_ptr(addr), ra); \
307
+ uint64_t val1 = TLB(env, useronly_clean_ptr(addr + 8), ra); \
308
+ uint64_t *ptr = za + off; \
309
+ ptr[0] = BE ? val1 : val0, ptr[1] = BE ? val0 : val1; \
310
+} \
311
+static inline void VNAME##_v_tlb(CPUARMState *env, void *za, intptr_t off, \
312
+ target_ulong addr, uintptr_t ra) \
313
+{ \
314
+ HNAME##_tlb(env, za, tile_vslice_offset(off), addr, ra); \
315
+}
316
+
317
+#define DO_STQ(HNAME, VNAME, BE, HOST, TLB) \
318
+static inline void HNAME##_host(void *za, intptr_t off, void *host) \
319
+{ \
320
+ uint64_t *ptr = za + off; \
321
+ HOST(host, ptr[BE]); \
322
+ HOST(host + 1, ptr[!BE]); \
323
+} \
324
+static inline void VNAME##_v_host(void *za, intptr_t off, void *host) \
325
+{ \
326
+ HNAME##_host(za, tile_vslice_offset(off), host); \
327
+} \
328
+static inline void HNAME##_tlb(CPUARMState *env, void *za, intptr_t off, \
329
+ target_ulong addr, uintptr_t ra) \
330
+{ \
331
+ uint64_t *ptr = za + off; \
332
+ TLB(env, useronly_clean_ptr(addr), ptr[BE], ra); \
333
+ TLB(env, useronly_clean_ptr(addr + 8), ptr[!BE], ra); \
334
+} \
335
+static inline void VNAME##_v_tlb(CPUARMState *env, void *za, intptr_t off, \
336
+ target_ulong addr, uintptr_t ra) \
337
+{ \
338
+ HNAME##_tlb(env, za, tile_vslice_offset(off), addr, ra); \
339
+}
340
+
341
+DO_LD(ld1b, uint8_t, ldub_p, cpu_ldub_data_ra)
342
+DO_LD(ld1h_be, uint16_t, lduw_be_p, cpu_lduw_be_data_ra)
343
+DO_LD(ld1h_le, uint16_t, lduw_le_p, cpu_lduw_le_data_ra)
344
+DO_LD(ld1s_be, uint32_t, ldl_be_p, cpu_ldl_be_data_ra)
345
+DO_LD(ld1s_le, uint32_t, ldl_le_p, cpu_ldl_le_data_ra)
346
+DO_LD(ld1d_be, uint64_t, ldq_be_p, cpu_ldq_be_data_ra)
347
+DO_LD(ld1d_le, uint64_t, ldq_le_p, cpu_ldq_le_data_ra)
348
+
349
+DO_LDQ(sve_ld1qq_be, sme_ld1q_be, 1, ldq_be_p, cpu_ldq_be_data_ra)
350
+DO_LDQ(sve_ld1qq_le, sme_ld1q_le, 0, ldq_le_p, cpu_ldq_le_data_ra)
351
+
352
+DO_ST(st1b, uint8_t, stb_p, cpu_stb_data_ra)
353
+DO_ST(st1h_be, uint16_t, stw_be_p, cpu_stw_be_data_ra)
354
+DO_ST(st1h_le, uint16_t, stw_le_p, cpu_stw_le_data_ra)
355
+DO_ST(st1s_be, uint32_t, stl_be_p, cpu_stl_be_data_ra)
356
+DO_ST(st1s_le, uint32_t, stl_le_p, cpu_stl_le_data_ra)
357
+DO_ST(st1d_be, uint64_t, stq_be_p, cpu_stq_be_data_ra)
358
+DO_ST(st1d_le, uint64_t, stq_le_p, cpu_stq_le_data_ra)
359
+
360
+DO_STQ(sve_st1qq_be, sme_st1q_be, 1, stq_be_p, cpu_stq_be_data_ra)
361
+DO_STQ(sve_st1qq_le, sme_st1q_le, 0, stq_le_p, cpu_stq_le_data_ra)
362
+
363
+#undef DO_LD
364
+#undef DO_ST
365
+#undef DO_LDQ
366
+#undef DO_STQ
367
+
368
+/*
369
+ * Common helper for all contiguous predicated loads.
370
+ */
371
+
372
+static inline QEMU_ALWAYS_INLINE
373
+void sme_ld1(CPUARMState *env, void *za, uint64_t *vg,
374
+ const target_ulong addr, uint32_t desc, const uintptr_t ra,
375
+ const int esz, uint32_t mtedesc, bool vertical,
376
+ sve_ldst1_host_fn *host_fn,
377
+ sve_ldst1_tlb_fn *tlb_fn,
378
+ ClearFn *clr_fn,
379
+ CopyFn *cpy_fn)
380
+{
381
+ const intptr_t reg_max = simd_oprsz(desc);
382
+ const intptr_t esize = 1 << esz;
383
+ intptr_t reg_off, reg_last;
384
+ SVEContLdSt info;
385
+ void *host;
386
+ int flags;
387
+
388
+ /* Find the active elements. */
389
+ if (!sve_cont_ldst_elements(&info, addr, vg, reg_max, esz, esize)) {
390
+ /* The entire predicate was false; no load occurs. */
391
+ clr_fn(za, 0, reg_max);
392
+ return;
393
+ }
394
+
395
+ /* Probe the page(s). Exit with exception for any invalid page. */
396
+ sve_cont_ldst_pages(&info, FAULT_ALL, env, addr, MMU_DATA_LOAD, ra);
397
+
398
+ /* Handle watchpoints for all active elements. */
399
+ sve_cont_ldst_watchpoints(&info, env, vg, addr, esize, esize,
400
+ BP_MEM_READ, ra);
401
+
402
+ /*
403
+ * Handle mte checks for all active elements.
404
+ * Since TBI must be set for MTE, !mtedesc => !mte_active.
405
+ */
406
+ if (mtedesc) {
407
+ sve_cont_ldst_mte_check(&info, env, vg, addr, esize, esize,
408
+ mtedesc, ra);
409
+ }
410
+
411
+ flags = info.page[0].flags | info.page[1].flags;
412
+ if (unlikely(flags != 0)) {
413
+#ifdef CONFIG_USER_ONLY
414
+ g_assert_not_reached();
415
+#else
416
+ /*
417
+ * At least one page includes MMIO.
418
+ * Any bus operation can fail with cpu_transaction_failed,
419
+ * which for ARM will raise SyncExternal. Perform the load
420
+ * into scratch memory to preserve register state until the end.
421
+ */
422
+ ARMVectorReg scratch = { };
423
+
424
+ reg_off = info.reg_off_first[0];
425
+ reg_last = info.reg_off_last[1];
426
+ if (reg_last < 0) {
427
+ reg_last = info.reg_off_split;
428
+ if (reg_last < 0) {
429
+ reg_last = info.reg_off_last[0];
430
+ }
431
+ }
432
+
433
+ do {
434
+ uint64_t pg = vg[reg_off >> 6];
435
+ do {
436
+ if ((pg >> (reg_off & 63)) & 1) {
437
+ tlb_fn(env, &scratch, reg_off, addr + reg_off, ra);
438
+ }
439
+ reg_off += esize;
440
+ } while (reg_off & 63);
441
+ } while (reg_off <= reg_last);
442
+
443
+ cpy_fn(za, &scratch, reg_max);
444
+ return;
445
+#endif
446
+ }
447
+
448
+ /* The entire operation is in RAM, on valid pages. */
449
+
450
+ reg_off = info.reg_off_first[0];
451
+ reg_last = info.reg_off_last[0];
452
+ host = info.page[0].host;
453
+
454
+ if (!vertical) {
455
+ memset(za, 0, reg_max);
456
+ } else if (reg_off) {
457
+ clr_fn(za, 0, reg_off);
458
+ }
459
+
460
+ while (reg_off <= reg_last) {
461
+ uint64_t pg = vg[reg_off >> 6];
462
+ do {
463
+ if ((pg >> (reg_off & 63)) & 1) {
464
+ host_fn(za, reg_off, host + reg_off);
465
+ } else if (vertical) {
466
+ clr_fn(za, reg_off, esize);
467
+ }
468
+ reg_off += esize;
469
+ } while (reg_off <= reg_last && (reg_off & 63));
470
+ }
471
+
472
+ /*
473
+ * Use the slow path to manage the cross-page misalignment.
474
+ * But we know this is RAM and cannot trap.
475
+ */
476
+ reg_off = info.reg_off_split;
477
+ if (unlikely(reg_off >= 0)) {
478
+ tlb_fn(env, za, reg_off, addr + reg_off, ra);
479
+ }
480
+
481
+ reg_off = info.reg_off_first[1];
482
+ if (unlikely(reg_off >= 0)) {
483
+ reg_last = info.reg_off_last[1];
484
+ host = info.page[1].host;
485
+
486
+ do {
487
+ uint64_t pg = vg[reg_off >> 6];
488
+ do {
489
+ if ((pg >> (reg_off & 63)) & 1) {
490
+ host_fn(za, reg_off, host + reg_off);
491
+ } else if (vertical) {
492
+ clr_fn(za, reg_off, esize);
493
+ }
494
+ reg_off += esize;
495
+ } while (reg_off & 63);
496
+ } while (reg_off <= reg_last);
497
+ }
498
+}
499
+
500
+static inline QEMU_ALWAYS_INLINE
501
+void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg,
502
+ target_ulong addr, uint32_t desc, uintptr_t ra,
503
+ const int esz, bool vertical,
504
+ sve_ldst1_host_fn *host_fn,
505
+ sve_ldst1_tlb_fn *tlb_fn,
506
+ ClearFn *clr_fn,
507
+ CopyFn *cpy_fn)
508
+{
509
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
510
+ int bit55 = extract64(addr, 55, 1);
511
+
512
+ /* Remove mtedesc from the normal sve descriptor. */
513
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
514
+
515
+ /* Perform gross MTE suppression early. */
516
+ if (!tbi_check(desc, bit55) ||
517
+ tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
518
+ mtedesc = 0;
519
+ }
520
+
521
+ sme_ld1(env, za, vg, addr, desc, ra, esz, mtedesc, vertical,
522
+ host_fn, tlb_fn, clr_fn, cpy_fn);
523
+}
524
+
525
+#define DO_LD(L, END, ESZ) \
526
+void HELPER(sme_ld1##L##END##_h)(CPUARMState *env, void *za, void *vg, \
527
+ target_ulong addr, uint32_t desc) \
528
+{ \
529
+ sme_ld1(env, za, vg, addr, desc, GETPC(), ESZ, 0, false, \
530
+ sve_ld1##L##L##END##_host, sve_ld1##L##L##END##_tlb, \
531
+ clear_horizontal, copy_horizontal); \
532
+} \
533
+void HELPER(sme_ld1##L##END##_v)(CPUARMState *env, void *za, void *vg, \
534
+ target_ulong addr, uint32_t desc) \
535
+{ \
536
+ sme_ld1(env, za, vg, addr, desc, GETPC(), ESZ, 0, true, \
537
+ sme_ld1##L##END##_v_host, sme_ld1##L##END##_v_tlb, \
538
+ clear_vertical_##L, copy_vertical_##L); \
539
+} \
540
+void HELPER(sme_ld1##L##END##_h_mte)(CPUARMState *env, void *za, void *vg, \
541
+ target_ulong addr, uint32_t desc) \
542
+{ \
543
+ sme_ld1_mte(env, za, vg, addr, desc, GETPC(), ESZ, false, \
544
+ sve_ld1##L##L##END##_host, sve_ld1##L##L##END##_tlb, \
545
+ clear_horizontal, copy_horizontal); \
546
+} \
547
+void HELPER(sme_ld1##L##END##_v_mte)(CPUARMState *env, void *za, void *vg, \
548
+ target_ulong addr, uint32_t desc) \
549
+{ \
550
+ sme_ld1_mte(env, za, vg, addr, desc, GETPC(), ESZ, true, \
551
+ sme_ld1##L##END##_v_host, sme_ld1##L##END##_v_tlb, \
552
+ clear_vertical_##L, copy_vertical_##L); \
553
+}
554
+
555
+DO_LD(b, , MO_8)
556
+DO_LD(h, _be, MO_16)
557
+DO_LD(h, _le, MO_16)
558
+DO_LD(s, _be, MO_32)
559
+DO_LD(s, _le, MO_32)
560
+DO_LD(d, _be, MO_64)
561
+DO_LD(d, _le, MO_64)
562
+DO_LD(q, _be, MO_128)
563
+DO_LD(q, _le, MO_128)
564
+
565
+#undef DO_LD
566
+
567
+/*
568
+ * Common helper for all contiguous predicated stores.
569
+ */
570
+
571
+static inline QEMU_ALWAYS_INLINE
572
+void sme_st1(CPUARMState *env, void *za, uint64_t *vg,
573
+ const target_ulong addr, uint32_t desc, const uintptr_t ra,
574
+ const int esz, uint32_t mtedesc, bool vertical,
575
+ sve_ldst1_host_fn *host_fn,
576
+ sve_ldst1_tlb_fn *tlb_fn)
577
+{
578
+ const intptr_t reg_max = simd_oprsz(desc);
579
+ const intptr_t esize = 1 << esz;
580
+ intptr_t reg_off, reg_last;
581
+ SVEContLdSt info;
582
+ void *host;
583
+ int flags;
584
+
585
+ /* Find the active elements. */
586
+ if (!sve_cont_ldst_elements(&info, addr, vg, reg_max, esz, esize)) {
587
+ /* The entire predicate was false; no store occurs. */
588
+ return;
589
+ }
590
+
591
+ /* Probe the page(s). Exit with exception for any invalid page. */
592
+ sve_cont_ldst_pages(&info, FAULT_ALL, env, addr, MMU_DATA_STORE, ra);
593
+
594
+ /* Handle watchpoints for all active elements. */
595
+ sve_cont_ldst_watchpoints(&info, env, vg, addr, esize, esize,
596
+ BP_MEM_WRITE, ra);
597
+
598
+ /*
599
+ * Handle mte checks for all active elements.
600
+ * Since TBI must be set for MTE, !mtedesc => !mte_active.
601
+ */
602
+ if (mtedesc) {
603
+ sve_cont_ldst_mte_check(&info, env, vg, addr, esize, esize,
604
+ mtedesc, ra);
605
+ }
606
+
607
+ flags = info.page[0].flags | info.page[1].flags;
608
+ if (unlikely(flags != 0)) {
609
+#ifdef CONFIG_USER_ONLY
610
+ g_assert_not_reached();
611
+#else
612
+ /*
613
+ * At least one page includes MMIO.
614
+ * Any bus operation can fail with cpu_transaction_failed,
615
+ * which for ARM will raise SyncExternal. We cannot avoid
616
+ * this fault and will leave with the store incomplete.
617
+ */
618
+ reg_off = info.reg_off_first[0];
619
+ reg_last = info.reg_off_last[1];
620
+ if (reg_last < 0) {
621
+ reg_last = info.reg_off_split;
622
+ if (reg_last < 0) {
623
+ reg_last = info.reg_off_last[0];
624
+ }
625
+ }
626
+
627
+ do {
628
+ uint64_t pg = vg[reg_off >> 6];
629
+ do {
630
+ if ((pg >> (reg_off & 63)) & 1) {
631
+ tlb_fn(env, za, reg_off, addr + reg_off, ra);
632
+ }
633
+ reg_off += esize;
634
+ } while (reg_off & 63);
635
+ } while (reg_off <= reg_last);
636
+ return;
637
+#endif
638
+ }
639
+
640
+ reg_off = info.reg_off_first[0];
641
+ reg_last = info.reg_off_last[0];
642
+ host = info.page[0].host;
643
+
644
+ while (reg_off <= reg_last) {
645
+ uint64_t pg = vg[reg_off >> 6];
646
+ do {
647
+ if ((pg >> (reg_off & 63)) & 1) {
648
+ host_fn(za, reg_off, host + reg_off);
649
+ }
650
+ reg_off += 1 << esz;
651
+ } while (reg_off <= reg_last && (reg_off & 63));
652
+ }
653
+
654
+ /*
655
+ * Use the slow path to manage the cross-page misalignment.
656
+ * But we know this is RAM and cannot trap.
657
+ */
658
+ reg_off = info.reg_off_split;
659
+ if (unlikely(reg_off >= 0)) {
660
+ tlb_fn(env, za, reg_off, addr + reg_off, ra);
661
+ }
662
+
663
+ reg_off = info.reg_off_first[1];
664
+ if (unlikely(reg_off >= 0)) {
665
+ reg_last = info.reg_off_last[1];
666
+ host = info.page[1].host;
667
+
668
+ do {
669
+ uint64_t pg = vg[reg_off >> 6];
670
+ do {
671
+ if ((pg >> (reg_off & 63)) & 1) {
672
+ host_fn(za, reg_off, host + reg_off);
673
+ }
674
+ reg_off += 1 << esz;
675
+ } while (reg_off & 63);
676
+ } while (reg_off <= reg_last);
677
+ }
678
+}
679
+
680
+static inline QEMU_ALWAYS_INLINE
681
+void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, target_ulong addr,
682
+ uint32_t desc, uintptr_t ra, int esz, bool vertical,
683
+ sve_ldst1_host_fn *host_fn,
684
+ sve_ldst1_tlb_fn *tlb_fn)
685
+{
686
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
687
+ int bit55 = extract64(addr, 55, 1);
688
+
689
+ /* Remove mtedesc from the normal sve descriptor. */
690
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
691
+
692
+ /* Perform gross MTE suppression early. */
693
+ if (!tbi_check(desc, bit55) ||
694
+ tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
695
+ mtedesc = 0;
696
+ }
697
+
698
+ sme_st1(env, za, vg, addr, desc, ra, esz, mtedesc,
699
+ vertical, host_fn, tlb_fn);
700
+}
701
+
702
+#define DO_ST(L, END, ESZ) \
703
+void HELPER(sme_st1##L##END##_h)(CPUARMState *env, void *za, void *vg, \
704
+ target_ulong addr, uint32_t desc) \
705
+{ \
706
+ sme_st1(env, za, vg, addr, desc, GETPC(), ESZ, 0, false, \
707
+ sve_st1##L##L##END##_host, sve_st1##L##L##END##_tlb); \
708
+} \
709
+void HELPER(sme_st1##L##END##_v)(CPUARMState *env, void *za, void *vg, \
710
+ target_ulong addr, uint32_t desc) \
711
+{ \
712
+ sme_st1(env, za, vg, addr, desc, GETPC(), ESZ, 0, true, \
713
+ sme_st1##L##END##_v_host, sme_st1##L##END##_v_tlb); \
714
+} \
715
+void HELPER(sme_st1##L##END##_h_mte)(CPUARMState *env, void *za, void *vg, \
716
+ target_ulong addr, uint32_t desc) \
717
+{ \
718
+ sme_st1_mte(env, za, vg, addr, desc, GETPC(), ESZ, false, \
719
+ sve_st1##L##L##END##_host, sve_st1##L##L##END##_tlb); \
720
+} \
721
+void HELPER(sme_st1##L##END##_v_mte)(CPUARMState *env, void *za, void *vg, \
722
+ target_ulong addr, uint32_t desc) \
723
+{ \
724
+ sme_st1_mte(env, za, vg, addr, desc, GETPC(), ESZ, true, \
725
+ sme_st1##L##END##_v_host, sme_st1##L##END##_v_tlb); \
726
+}
727
+
728
+DO_ST(b, , MO_8)
729
+DO_ST(h, _be, MO_16)
730
+DO_ST(h, _le, MO_16)
731
+DO_ST(s, _be, MO_32)
732
+DO_ST(s, _le, MO_32)
733
+DO_ST(d, _be, MO_64)
734
+DO_ST(d, _le, MO_64)
735
+DO_ST(q, _be, MO_128)
736
+DO_ST(q, _le, MO_128)
737
+
738
+#undef DO_ST
739
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
740
index XXXXXXX..XXXXXXX 100644
741
--- a/target/arm/translate-sme.c
742
+++ b/target/arm/translate-sme.c
743
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVA(DisasContext *s, arg_MOVA *a)
744
745
return true;
746
}
747
+
748
+static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
749
+{
750
+ typedef void GenLdSt1(TCGv_env, TCGv_ptr, TCGv_ptr, TCGv, TCGv_i32);
751
+
752
+ /*
753
+ * Indexed by [esz][be][v][mte][st], which is (except for load/store)
754
+ * also the order in which the elements appear in the function names,
755
+ * and so how we must concatenate the pieces.
756
+ */
757
+
758
+#define FN_LS(F) { gen_helper_sme_ld1##F, gen_helper_sme_st1##F }
759
+#define FN_MTE(F) { FN_LS(F), FN_LS(F##_mte) }
760
+#define FN_HV(F) { FN_MTE(F##_h), FN_MTE(F##_v) }
761
+#define FN_END(L, B) { FN_HV(L), FN_HV(B) }
762
+
763
+ static GenLdSt1 * const fns[5][2][2][2][2] = {
764
+ FN_END(b, b),
765
+ FN_END(h_le, h_be),
766
+ FN_END(s_le, s_be),
767
+ FN_END(d_le, d_be),
768
+ FN_END(q_le, q_be),
769
+ };
770
+
771
+#undef FN_LS
772
+#undef FN_MTE
773
+#undef FN_HV
774
+#undef FN_END
775
+
776
+ TCGv_ptr t_za, t_pg;
777
+ TCGv_i64 addr;
778
+ int svl, desc = 0;
779
+ bool be = s->be_data == MO_BE;
780
+ bool mte = s->mte_active[0];
781
+
782
+ if (!dc_isar_feature(aa64_sme, s)) {
783
+ return false;
784
+ }
785
+ if (!sme_smza_enabled_check(s)) {
786
+ return true;
787
+ }
788
+
789
+ t_za = get_tile_rowcol(s, a->esz, a->rs, a->za_imm, a->v);
790
+ t_pg = pred_full_reg_ptr(s, a->pg);
791
+ addr = tcg_temp_new_i64();
792
+
793
+ tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
794
+ tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
795
+
796
+ if (mte) {
797
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
798
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
799
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
800
+ desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
801
+ desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
802
+ desc <<= SVE_MTEDESC_SHIFT;
803
+ } else {
804
+ addr = clean_data_tbi(s, addr);
805
+ }
806
+ svl = streaming_vec_reg_size(s);
807
+ desc = simd_desc(svl, svl, desc);
808
+
809
+ fns[a->esz][be][a->v][mte][a->st](cpu_env, t_za, t_pg, addr,
810
+ tcg_constant_i32(desc));
811
+
812
+ tcg_temp_free_ptr(t_za);
813
+ tcg_temp_free_ptr(t_pg);
814
+ tcg_temp_free_i64(addr);
815
+ return true;
816
+}
817
--
117
--
818
2.25.1
118
2.34.1
119
120
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The MPS2 SCC device is broadly the same for all FPGA images, but has
2
2
minor differences in the behaviour of the CFG registers depending on
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
the image. In many cases we don't really care about the functionality
4
Message-id: 20220708151540.18136-27-richard.henderson@linaro.org
4
controlled by these registers and a reads-as-written or similar
5
behaviour is sufficient for the moment.
6
7
For the AN536 the required behaviour is:
8
9
* A_CFG0 has CPU reset and halt bits
10
- implement as reads-as-written for the moment
11
* A_CFG1 has flash or ATCM address 0 remap handling
12
- QEMU doesn't model this; implement as reads-as-written
13
* A_CFG2 has QSPI select (like AN524)
14
- implemented (no behaviour, as with AN524)
15
* A_CFG3 is MCC_MSB_ADDR "additional MCC addressing bits"
16
- QEMU doesn't care about these, so use the existing
17
RAZ behaviour for convenience
18
* A_CFG4 is board rev (like all other images)
19
- no change needed
20
* A_CFG5 is ACLK frq in hz (like AN524)
21
- implemented as reads-as-written, as for other boards
22
* A_CFG6 is core 0 vector table base address
23
- implemented as reads-as-written for the moment
24
* A_CFG7 is core 1 vector table base address
25
- implemented as reads-as-written for the moment
26
27
Make the changes necessary for this; leave TODO comments where
28
appropriate to indicate where we might want to come back and
29
implement things like CPU reset.
30
31
The other aspects of the device specific to this FPGA image (like the
32
values of the board ID and similar registers) will be set via the
33
device's qdev properties.
34
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
35
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
36
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
37
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
38
Message-id: 20240206132931.38376-8-peter.maydell@linaro.org
7
---
39
---
8
target/arm/helper-sme.h | 2 ++
40
include/hw/misc/mps2-scc.h | 1 +
9
target/arm/sme.decode | 1 +
41
hw/misc/mps2-scc.c | 101 +++++++++++++++++++++++++++++++++----
10
target/arm/sme_helper.c | 74 ++++++++++++++++++++++++++++++++++++++
42
2 files changed, 92 insertions(+), 10 deletions(-)
11
target/arm/translate-sme.c | 1 +
43
12
4 files changed, 78 insertions(+)
44
diff --git a/include/hw/misc/mps2-scc.h b/include/hw/misc/mps2-scc.h
13
14
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
15
index XXXXXXX..XXXXXXX 100644
45
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sme.h
46
--- a/include/hw/misc/mps2-scc.h
17
+++ b/target/arm/helper-sme.h
47
+++ b/include/hw/misc/mps2-scc.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sme_addva_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
@@ -XXX,XX +XXX,XX @@ struct MPS2SCC {
19
DEF_HELPER_FLAGS_5(sme_addha_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
49
uint32_t cfg4;
20
DEF_HELPER_FLAGS_5(sme_addva_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
50
uint32_t cfg5;
21
51
uint32_t cfg6;
22
+DEF_HELPER_FLAGS_7(sme_fmopa_h, TCG_CALL_NO_RWG,
52
+ uint32_t cfg7;
23
+ void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
53
uint32_t cfgdata_rtn;
24
DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
54
uint32_t cfgdata_out;
25
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
55
uint32_t cfgctrl;
26
DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
56
diff --git a/hw/misc/mps2-scc.c b/hw/misc/mps2-scc.c
27
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
28
index XXXXXXX..XXXXXXX 100644
57
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/sme.decode
58
--- a/hw/misc/mps2-scc.c
30
+++ b/target/arm/sme.decode
59
+++ b/hw/misc/mps2-scc.c
31
@@ -XXX,XX +XXX,XX @@ FMOPA_s 10000000 100 ..... ... ... ..... . 00 .. @op_32
60
@@ -XXX,XX +XXX,XX @@ REG32(CFG3, 0xc)
32
FMOPA_d 10000000 110 ..... ... ... ..... . 0 ... @op_64
61
REG32(CFG4, 0x10)
33
62
REG32(CFG5, 0x14)
34
BFMOPA 10000001 100 ..... ... ... ..... . 00 .. @op_32
63
REG32(CFG6, 0x18)
35
+FMOPA_h 10000001 101 ..... ... ... ..... . 00 .. @op_32
64
+REG32(CFG7, 0x1c)
36
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
65
REG32(CFGDATA_RTN, 0xa0)
37
index XXXXXXX..XXXXXXX 100644
66
REG32(CFGDATA_OUT, 0xa4)
38
--- a/target/arm/sme_helper.c
67
REG32(CFGCTRL, 0xa8)
39
+++ b/target/arm/sme_helper.c
68
@@ -XXX,XX +XXX,XX @@ static int scc_partno(MPS2SCC *s)
40
@@ -XXX,XX +XXX,XX @@ static inline uint32_t f16mop_adj_pair(uint32_t pair, uint32_t pg, uint32_t neg)
69
/* Is CFG_REG2 present? */
41
return pair;
70
static bool have_cfg2(MPS2SCC *s)
42
}
71
{
43
72
- return scc_partno(s) == 0x524 || scc_partno(s) == 0x547;
44
+static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
73
+ return scc_partno(s) == 0x524 || scc_partno(s) == 0x547 ||
45
+ float_status *s_std, float_status *s_odd)
74
+ scc_partno(s) == 0x536;
46
+{
75
}
47
+ float64 e1r = float16_to_float64(e1 & 0xffff, true, s_std);
76
48
+ float64 e1c = float16_to_float64(e1 >> 16, true, s_std);
77
/* Is CFG_REG3 present? */
49
+ float64 e2r = float16_to_float64(e2 & 0xffff, true, s_std);
78
static bool have_cfg3(MPS2SCC *s)
50
+ float64 e2c = float16_to_float64(e2 >> 16, true, s_std);
79
{
51
+ float64 t64;
80
- return scc_partno(s) != 0x524 && scc_partno(s) != 0x547;
52
+ float32 t32;
81
+ return scc_partno(s) != 0x524 && scc_partno(s) != 0x547 &&
53
+
82
+ scc_partno(s) != 0x536;
54
+ /*
83
}
55
+ * The ARM pseudocode function FPDot performs both multiplies
84
56
+ * and the add with a single rounding operation. Emulate this
85
/* Is CFG_REG5 present? */
57
+ * by performing the first multiply in round-to-odd, then doing
86
static bool have_cfg5(MPS2SCC *s)
58
+ * the second multiply as fused multiply-add, and rounding to
87
{
59
+ * float32 all in one step.
88
- return scc_partno(s) == 0x524 || scc_partno(s) == 0x547;
60
+ */
89
+ return scc_partno(s) == 0x524 || scc_partno(s) == 0x547 ||
61
+ t64 = float64_mul(e1r, e2r, s_odd);
90
+ scc_partno(s) == 0x536;
62
+ t64 = float64r32_muladd(e1c, e2c, t64, 0, s_std);
91
}
63
+
92
64
+ /* This conversion is exact, because we've already rounded. */
93
/* Is CFG_REG6 present? */
65
+ t32 = float64_to_float32(t64, s_std);
94
static bool have_cfg6(MPS2SCC *s)
66
+
95
{
67
+ /* The final accumulation step is not fused. */
96
- return scc_partno(s) == 0x524;
68
+ return float32_add(sum, t32, s_std);
97
+ return scc_partno(s) == 0x524 || scc_partno(s) == 0x536;
69
+}
98
+}
70
+
99
+
71
+void HELPER(sme_fmopa_h)(void *vza, void *vzn, void *vzm, void *vpn,
100
+/* Is CFG_REG7 present? */
72
+ void *vpm, void *vst, uint32_t desc)
101
+static bool have_cfg7(MPS2SCC *s)
73
+{
102
+{
74
+ intptr_t row, col, oprsz = simd_maxsz(desc);
103
+ return scc_partno(s) == 0x536;
75
+ uint32_t neg = simd_data(desc) * 0x80008000u;
104
+}
76
+ uint16_t *pn = vpn, *pm = vpm;
105
+
77
+ float_status fpst_odd, fpst_std;
106
+/* Does CFG_REG0 drive the 'remap' GPIO output? */
78
+
107
+static bool cfg0_is_remap(MPS2SCC *s)
79
+ /*
108
+{
80
+ * Make a copy of float_status because this operation does not
109
+ return scc_partno(s) != 0x536;
81
+ * update the cumulative fp exception status. It also produces
110
+}
82
+ * default nans. Make a second copy with round-to-odd -- see above.
111
+
83
+ */
112
+/* Is CFG_REG1 driving a set of LEDs? */
84
+ fpst_std = *(float_status *)vst;
113
+static bool cfg1_is_leds(MPS2SCC *s)
85
+ set_default_nan_mode(true, &fpst_std);
114
+{
86
+ fpst_odd = fpst_std;
115
+ return scc_partno(s) != 0x536;
87
+ set_float_rounding_mode(float_round_to_odd, &fpst_odd);
116
}
88
+
117
89
+ for (row = 0; row < oprsz; ) {
118
/* Handle a write via the SYS_CFG channel to the specified function/device.
90
+ uint16_t prow = pn[H2(row >> 4)];
119
@@ -XXX,XX +XXX,XX @@ static uint64_t mps2_scc_read(void *opaque, hwaddr offset, unsigned size)
91
+ do {
120
if (!have_cfg3(s)) {
92
+ void *vza_row = vza + tile_vslice_offset(row);
121
goto bad_offset;
93
+ uint32_t n = *(uint32_t *)(vzn + H1_4(row));
122
}
94
+
123
- /* These are user-settable DIP switches on the board. We don't
95
+ n = f16mop_adj_pair(n, prow, neg);
124
+ /*
96
+
125
+ * These are user-settable DIP switches on the board. We don't
97
+ for (col = 0; col < oprsz; ) {
126
* model that, so just return zeroes.
98
+ uint16_t pcol = pm[H2(col >> 4)];
127
+ *
99
+ do {
128
+ * TODO: for AN536 this is MCC_MSB_ADDR "additional MCC addressing
100
+ if (prow & pcol & 0b0101) {
129
+ * bits". These change which part of the DDR4 the motherboard
101
+ uint32_t *a = vza_row + H1_4(col);
130
+ * configuration controller can see in its memory map (see the
102
+ uint32_t m = *(uint32_t *)(vzm + H1_4(col));
131
+ * appnote section 2.4). QEMU doesn't model the MCC at all, so these
103
+
132
+ * bits are not interesting to us; read-as-zero is as good as anything
104
+ m = f16mop_adj_pair(m, pcol, 0);
133
+ * else.
105
+ *a = f16_dotadd(*a, n, m, &fpst_std, &fpst_odd);
134
*/
106
+
135
r = 0;
107
+ col += 4;
136
break;
108
+ pcol >>= 4;
137
@@ -XXX,XX +XXX,XX @@ static uint64_t mps2_scc_read(void *opaque, hwaddr offset, unsigned size)
109
+ }
138
}
110
+ } while (col & 15);
139
r = s->cfg6;
140
break;
141
+ case A_CFG7:
142
+ if (!have_cfg7(s)) {
143
+ goto bad_offset;
144
+ }
145
+ r = s->cfg7;
146
+ break;
147
case A_CFGDATA_RTN:
148
r = s->cfgdata_rtn;
149
break;
150
@@ -XXX,XX +XXX,XX @@ static void mps2_scc_write(void *opaque, hwaddr offset, uint64_t value,
151
* we always reflect bit 0 in the 'remap' GPIO output line,
152
* and let the board wire it up or not as it chooses.
153
* TODO on some boards bit 1 is CPU_WAIT.
154
+ *
155
+ * TODO: on the AN536 this register controls reset and halt
156
+ * for both CPUs. For the moment we don't implement this, so the
157
+ * register just reads as written.
158
*/
159
s->cfg0 = value;
160
- qemu_set_irq(s->remap, s->cfg0 & 1);
161
+ if (cfg0_is_remap(s)) {
162
+ qemu_set_irq(s->remap, s->cfg0 & 1);
163
+ }
164
break;
165
case A_CFG1:
166
s->cfg1 = value;
167
- for (size_t i = 0; i < ARRAY_SIZE(s->led); i++) {
168
- led_set_state(s->led[i], extract32(value, i, 1));
169
+ /*
170
+ * On most boards this register drives LEDs.
171
+ *
172
+ * TODO: for AN536 this controls whether flash and ATCM are
173
+ * enabled or disabled on reset. QEMU doesn't model this, and
174
+ * always wires up RAM in the ATCM area and ROM in the flash area.
175
+ */
176
+ if (cfg1_is_leds(s)) {
177
+ for (size_t i = 0; i < ARRAY_SIZE(s->led); i++) {
178
+ led_set_state(s->led[i], extract32(value, i, 1));
111
+ }
179
+ }
112
+ row += 4;
180
}
113
+ prow >>= 4;
181
break;
114
+ } while (row & 15);
182
case A_CFG2:
183
if (!have_cfg2(s)) {
184
goto bad_offset;
185
}
186
- /* AN524: QSPI Select signal */
187
+ /* AN524, AN536: QSPI Select signal */
188
s->cfg2 = value;
189
break;
190
case A_CFG5:
191
if (!have_cfg5(s)) {
192
goto bad_offset;
193
}
194
- /* AN524: ACLK frequency in Hz */
195
+ /* AN524, AN536: ACLK frequency in Hz */
196
s->cfg5 = value;
197
break;
198
case A_CFG6:
199
@@ -XXX,XX +XXX,XX @@ static void mps2_scc_write(void *opaque, hwaddr offset, uint64_t value,
200
goto bad_offset;
201
}
202
/* AN524: Clock divider for BRAM */
203
+ /* AN536: Core 0 vector table base address */
204
+ s->cfg6 = value;
205
+ break;
206
+ case A_CFG7:
207
+ if (!have_cfg7(s)) {
208
+ goto bad_offset;
209
+ }
210
+ /* AN536: Core 1 vector table base address */
211
s->cfg6 = value;
212
break;
213
case A_CFGDATA_OUT:
214
@@ -XXX,XX +XXX,XX @@ static void mps2_scc_finalize(Object *obj)
215
g_free(s->oscclk_reset);
216
}
217
218
+static bool cfg7_needed(void *opaque)
219
+{
220
+ MPS2SCC *s = opaque;
221
+
222
+ return have_cfg7(s);
223
+}
224
+
225
+static const VMStateDescription vmstate_cfg7 = {
226
+ .name = "mps2-scc/cfg7",
227
+ .version_id = 1,
228
+ .minimum_version_id = 1,
229
+ .needed = cfg7_needed,
230
+ .fields = (const VMStateField[]) {
231
+ VMSTATE_UINT32(cfg7, MPS2SCC),
232
+ VMSTATE_END_OF_LIST()
115
+ }
233
+ }
116
+}
234
+};
117
+
235
+
118
void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
236
static const VMStateDescription mps2_scc_vmstate = {
119
void *vpm, uint32_t desc)
237
.name = "mps2-scc",
120
{
238
.version_id = 3,
121
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
239
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription mps2_scc_vmstate = {
122
index XXXXXXX..XXXXXXX 100644
240
VMSTATE_VARRAY_UINT32(oscclk, MPS2SCC, num_oscclk,
123
--- a/target/arm/translate-sme.c
241
0, vmstate_info_uint32, uint32_t),
124
+++ b/target/arm/translate-sme.c
242
VMSTATE_END_OF_LIST()
125
@@ -XXX,XX +XXX,XX @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
243
+ },
126
return true;
244
+ .subsections = (const VMStateDescription * const []) {
127
}
245
+ &vmstate_cfg7,
128
246
+ NULL
129
+TRANS_FEAT(FMOPA_h, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_h)
247
}
130
TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_s)
248
};
131
TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, MO_64, gen_helper_sme_fmopa_d)
132
249
133
--
250
--
134
2.25.1
251
2.34.1
252
253
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The AN536 is another FPGA image for the MPS3 development board. Unlike
2
2
the existing FPGA images we already model, this board uses a Cortex-R
3
This includes the build rules for the decoder, and the
3
family CPU, and it does not use any equivalent to the M-profile
4
new file for translation, but excludes any instructions.
4
"Subsystem for Embedded" SoC-equivalent that we model in hw/arm/armsse.c.
5
5
It's therefore more convenient for us to model it as a completely
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
separate C file.
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
8
Message-id: 20220708151540.18136-3-richard.henderson@linaro.org
8
This commit adds the basic skeleton of the board model, and the
9
code to create all the RAM and ROM. We assume that we're probably
10
going to want to add more images in future, so use the same
11
base class/subclass setup that mps2-tz.c uses, even though at
12
the moment there's only a single subclass.
13
14
Following commits will add the CPUs and the peripherals.
15
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
18
Message-id: 20240206132931.38376-9-peter.maydell@linaro.org
10
---
19
---
11
target/arm/translate-a64.h | 1 +
20
MAINTAINERS | 3 +-
12
target/arm/sme.decode | 20 ++++++++++++++++++++
21
configs/devices/arm-softmmu/default.mak | 1 +
13
target/arm/translate-a64.c | 7 ++++++-
22
hw/arm/mps3r.c | 239 ++++++++++++++++++++++++
14
target/arm/translate-sme.c | 35 +++++++++++++++++++++++++++++++++++
23
hw/arm/Kconfig | 5 +
15
target/arm/meson.build | 2 ++
24
hw/arm/meson.build | 1 +
16
5 files changed, 64 insertions(+), 1 deletion(-)
25
5 files changed, 248 insertions(+), 1 deletion(-)
17
create mode 100644 target/arm/sme.decode
26
create mode 100644 hw/arm/mps3r.c
18
create mode 100644 target/arm/translate-sme.c
27
19
28
diff --git a/MAINTAINERS b/MAINTAINERS
20
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
21
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/translate-a64.h
30
--- a/MAINTAINERS
23
+++ b/target/arm/translate-a64.h
31
+++ b/MAINTAINERS
24
@@ -XXX,XX +XXX,XX @@ static inline int pred_gvec_reg_size(DisasContext *s)
32
@@ -XXX,XX +XXX,XX @@ F: include/hw/misc/imx7_*.h
25
}
33
F: hw/pci-host/designware.c
26
34
F: include/hw/pci-host/designware.h
27
bool disas_sve(DisasContext *, uint32_t);
35
28
+bool disas_sme(DisasContext *, uint32_t);
36
-MPS2
29
37
+MPS2 / MPS3
30
void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
38
M: Peter Maydell <peter.maydell@linaro.org>
31
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
39
L: qemu-arm@nongnu.org
32
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
40
S: Maintained
41
F: hw/arm/mps2.c
42
F: hw/arm/mps2-tz.c
43
+F: hw/arm/mps3r.c
44
F: hw/misc/mps2-*.c
45
F: include/hw/misc/mps2-*.h
46
F: hw/arm/armsse.c
47
diff --git a/configs/devices/arm-softmmu/default.mak b/configs/devices/arm-softmmu/default.mak
48
index XXXXXXX..XXXXXXX 100644
49
--- a/configs/devices/arm-softmmu/default.mak
50
+++ b/configs/devices/arm-softmmu/default.mak
51
@@ -XXX,XX +XXX,XX @@ CONFIG_ARM_VIRT=y
52
# CONFIG_INTEGRATOR=n
53
# CONFIG_FSL_IMX31=n
54
# CONFIG_MUSICPAL=n
55
+# CONFIG_MPS3R=n
56
# CONFIG_MUSCA=n
57
# CONFIG_CHEETAH=n
58
# CONFIG_SX1=n
59
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
33
new file mode 100644
60
new file mode 100644
34
index XXXXXXX..XXXXXXX
61
index XXXXXXX..XXXXXXX
35
--- /dev/null
62
--- /dev/null
36
+++ b/target/arm/sme.decode
63
+++ b/hw/arm/mps3r.c
37
@@ -XXX,XX +XXX,XX @@
38
+# AArch64 SME instruction descriptions
39
+#
40
+# Copyright (c) 2022 Linaro, Ltd
41
+#
42
+# This library is free software; you can redistribute it and/or
43
+# modify it under the terms of the GNU Lesser General Public
44
+# License as published by the Free Software Foundation; either
45
+# version 2.1 of the License, or (at your option) any later version.
46
+#
47
+# This library is distributed in the hope that it will be useful,
48
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
49
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
50
+# Lesser General Public License for more details.
51
+#
52
+# You should have received a copy of the GNU Lesser General Public
53
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
54
+
55
+#
56
+# This file is processed by scripts/decodetree.py
57
+#
58
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/translate-a64.c
61
+++ b/target/arm/translate-a64.c
62
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
63
}
64
65
switch (extract32(insn, 25, 4)) {
66
- case 0x0: case 0x1: case 0x3: /* UNALLOCATED */
67
+ case 0x0:
68
+ if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
69
+ unallocated_encoding(s);
70
+ }
71
+ break;
72
+ case 0x1: case 0x3: /* UNALLOCATED */
73
unallocated_encoding(s);
74
break;
75
case 0x2:
76
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
77
new file mode 100644
78
index XXXXXXX..XXXXXXX
79
--- /dev/null
80
+++ b/target/arm/translate-sme.c
81
@@ -XXX,XX +XXX,XX @@
64
@@ -XXX,XX +XXX,XX @@
82
+/*
65
+/*
83
+ * AArch64 SME translation
66
+ * Arm MPS3 board emulation for Cortex-R-based FPGA images.
67
+ * (For M-profile images see mps2.c and mps2tz.c.)
84
+ *
68
+ *
85
+ * Copyright (c) 2022 Linaro, Ltd
69
+ * Copyright (c) 2017 Linaro Limited
70
+ * Written by Peter Maydell
86
+ *
71
+ *
87
+ * This library is free software; you can redistribute it and/or
72
+ * This program is free software; you can redistribute it and/or modify
88
+ * modify it under the terms of the GNU Lesser General Public
73
+ * it under the terms of the GNU General Public License version 2 or
89
+ * License as published by the Free Software Foundation; either
74
+ * (at your option) any later version.
90
+ * version 2.1 of the License, or (at your option) any later version.
75
+ */
76
+
77
+/*
78
+ * The MPS3 is an FPGA based dev board. This file handles FPGA images
79
+ * which use the Cortex-R CPUs. We model these separately from the
80
+ * M-profile images, because on M-profile the FPGA image is based on
81
+ * a "Subsystem for Embedded" which is similar to an SoC, whereas
82
+ * the R-profile FPGA images don't have that abstraction layer.
91
+ *
83
+ *
92
+ * This library is distributed in the hope that it will be useful,
84
+ * We model the following FPGA images here:
93
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
85
+ * "mps3-an536" -- dual Cortex-R52 as documented in Arm Application Note AN536
94
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
95
+ * Lesser General Public License for more details.
96
+ *
86
+ *
97
+ * You should have received a copy of the GNU Lesser General Public
87
+ * Application Note AN536:
98
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
88
+ * https://developer.arm.com/documentation/dai0536/latest/
99
+ */
89
+ */
100
+
90
+
101
+#include "qemu/osdep.h"
91
+#include "qemu/osdep.h"
92
+#include "qemu/units.h"
93
+#include "qapi/error.h"
94
+#include "exec/address-spaces.h"
102
+#include "cpu.h"
95
+#include "cpu.h"
103
+#include "tcg/tcg-op.h"
96
+#include "hw/boards.h"
104
+#include "tcg/tcg-op-gvec.h"
97
+#include "hw/arm/boot.h"
105
+#include "tcg/tcg-gvec-desc.h"
98
+
106
+#include "translate.h"
99
+/* Define the layout of RAM and ROM in a board */
107
+#include "exec/helper-gen.h"
100
+typedef struct RAMInfo {
108
+#include "translate-a64.h"
101
+ const char *name;
109
+#include "fpu/softfloat.h"
102
+ hwaddr base;
110
+
103
+ hwaddr size;
104
+ int mrindex; /* index into rams[]; -1 for the system RAM block */
105
+ int flags;
106
+} RAMInfo;
111
+
107
+
112
+/*
108
+/*
113
+ * Include the generated decoder.
109
+ * The MPS3 DDR is 3GiB, but on a 32-bit host QEMU doesn't permit
110
+ * emulation of that much guest RAM, so artificially make it smaller.
114
+ */
111
+ */
115
+
112
+#if HOST_LONG_BITS == 32
116
+#include "decode-sme.c.inc"
113
+#define MPS3_DDR_SIZE (1 * GiB)
117
diff --git a/target/arm/meson.build b/target/arm/meson.build
114
+#else
115
+#define MPS3_DDR_SIZE (3 * GiB)
116
+#endif
117
+
118
+/*
119
+ * Flag values:
120
+ * IS_MAIN: this is the main machine RAM
121
+ * IS_ROM: this area is read-only
122
+ */
123
+#define IS_MAIN 1
124
+#define IS_ROM 2
125
+
126
+#define MPS3R_RAM_MAX 9
127
+
128
+typedef enum MPS3RFPGAType {
129
+ FPGA_AN536,
130
+} MPS3RFPGAType;
131
+
132
+struct MPS3RMachineClass {
133
+ MachineClass parent;
134
+ MPS3RFPGAType fpga_type;
135
+ const RAMInfo *raminfo;
136
+};
137
+
138
+struct MPS3RMachineState {
139
+ MachineState parent;
140
+ MemoryRegion ram[MPS3R_RAM_MAX];
141
+};
142
+
143
+#define TYPE_MPS3R_MACHINE "mps3r"
144
+#define TYPE_MPS3R_AN536_MACHINE MACHINE_TYPE_NAME("mps3-an536")
145
+
146
+OBJECT_DECLARE_TYPE(MPS3RMachineState, MPS3RMachineClass, MPS3R_MACHINE)
147
+
148
+static const RAMInfo an536_raminfo[] = {
149
+ {
150
+ .name = "ATCM",
151
+ .base = 0x00000000,
152
+ .size = 0x00008000,
153
+ .mrindex = 0,
154
+ }, {
155
+ /* We model the QSPI flash as simple ROM for now */
156
+ .name = "QSPI",
157
+ .base = 0x08000000,
158
+ .size = 0x00800000,
159
+ .flags = IS_ROM,
160
+ .mrindex = 1,
161
+ }, {
162
+ .name = "BRAM",
163
+ .base = 0x10000000,
164
+ .size = 0x00080000,
165
+ .mrindex = 2,
166
+ }, {
167
+ .name = "DDR",
168
+ .base = 0x20000000,
169
+ .size = MPS3_DDR_SIZE,
170
+ .mrindex = -1,
171
+ }, {
172
+ .name = "ATCM0",
173
+ .base = 0xee000000,
174
+ .size = 0x00008000,
175
+ .mrindex = 3,
176
+ }, {
177
+ .name = "BTCM0",
178
+ .base = 0xee100000,
179
+ .size = 0x00008000,
180
+ .mrindex = 4,
181
+ }, {
182
+ .name = "CTCM0",
183
+ .base = 0xee200000,
184
+ .size = 0x00008000,
185
+ .mrindex = 5,
186
+ }, {
187
+ .name = "ATCM1",
188
+ .base = 0xee400000,
189
+ .size = 0x00008000,
190
+ .mrindex = 6,
191
+ }, {
192
+ .name = "BTCM1",
193
+ .base = 0xee500000,
194
+ .size = 0x00008000,
195
+ .mrindex = 7,
196
+ }, {
197
+ .name = "CTCM1",
198
+ .base = 0xee600000,
199
+ .size = 0x00008000,
200
+ .mrindex = 8,
201
+ }, {
202
+ .name = NULL,
203
+ }
204
+};
205
+
206
+static MemoryRegion *mr_for_raminfo(MPS3RMachineState *mms,
207
+ const RAMInfo *raminfo)
208
+{
209
+ /* Return an initialized MemoryRegion for the RAMInfo. */
210
+ MemoryRegion *ram;
211
+
212
+ if (raminfo->mrindex < 0) {
213
+ /* Means this RAMInfo is for QEMU's "system memory" */
214
+ MachineState *machine = MACHINE(mms);
215
+ assert(!(raminfo->flags & IS_ROM));
216
+ return machine->ram;
217
+ }
218
+
219
+ assert(raminfo->mrindex < MPS3R_RAM_MAX);
220
+ ram = &mms->ram[raminfo->mrindex];
221
+
222
+ memory_region_init_ram(ram, NULL, raminfo->name,
223
+ raminfo->size, &error_fatal);
224
+ if (raminfo->flags & IS_ROM) {
225
+ memory_region_set_readonly(ram, true);
226
+ }
227
+ return ram;
228
+}
229
+
230
+static void mps3r_common_init(MachineState *machine)
231
+{
232
+ MPS3RMachineState *mms = MPS3R_MACHINE(machine);
233
+ MPS3RMachineClass *mmc = MPS3R_MACHINE_GET_CLASS(mms);
234
+ MemoryRegion *sysmem = get_system_memory();
235
+
236
+ for (const RAMInfo *ri = mmc->raminfo; ri->name; ri++) {
237
+ MemoryRegion *mr = mr_for_raminfo(mms, ri);
238
+ memory_region_add_subregion(sysmem, ri->base, mr);
239
+ }
240
+}
241
+
242
+static void mps3r_set_default_ram_info(MPS3RMachineClass *mmc)
243
+{
244
+ /*
245
+ * Set mc->default_ram_size and default_ram_id from the
246
+ * information in mmc->raminfo.
247
+ */
248
+ MachineClass *mc = MACHINE_CLASS(mmc);
249
+ const RAMInfo *p;
250
+
251
+ for (p = mmc->raminfo; p->name; p++) {
252
+ if (p->mrindex < 0) {
253
+ /* Found the entry for "system memory" */
254
+ mc->default_ram_size = p->size;
255
+ mc->default_ram_id = p->name;
256
+ return;
257
+ }
258
+ }
259
+ g_assert_not_reached();
260
+}
261
+
262
+static void mps3r_class_init(ObjectClass *oc, void *data)
263
+{
264
+ MachineClass *mc = MACHINE_CLASS(oc);
265
+
266
+ mc->init = mps3r_common_init;
267
+}
268
+
269
+static void mps3r_an536_class_init(ObjectClass *oc, void *data)
270
+{
271
+ MachineClass *mc = MACHINE_CLASS(oc);
272
+ MPS3RMachineClass *mmc = MPS3R_MACHINE_CLASS(oc);
273
+ static const char * const valid_cpu_types[] = {
274
+ ARM_CPU_TYPE_NAME("cortex-r52"),
275
+ NULL
276
+ };
277
+
278
+ mc->desc = "ARM MPS3 with AN536 FPGA image for Cortex-R52";
279
+ mc->default_cpus = 2;
280
+ mc->min_cpus = mc->default_cpus;
281
+ mc->max_cpus = mc->default_cpus;
282
+ mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-r52");
283
+ mc->valid_cpu_types = valid_cpu_types;
284
+ mmc->raminfo = an536_raminfo;
285
+ mps3r_set_default_ram_info(mmc);
286
+}
287
+
288
+static const TypeInfo mps3r_machine_types[] = {
289
+ {
290
+ .name = TYPE_MPS3R_MACHINE,
291
+ .parent = TYPE_MACHINE,
292
+ .abstract = true,
293
+ .instance_size = sizeof(MPS3RMachineState),
294
+ .class_size = sizeof(MPS3RMachineClass),
295
+ .class_init = mps3r_class_init,
296
+ }, {
297
+ .name = TYPE_MPS3R_AN536_MACHINE,
298
+ .parent = TYPE_MPS3R_MACHINE,
299
+ .class_init = mps3r_an536_class_init,
300
+ },
301
+};
302
+
303
+DEFINE_TYPES(mps3r_machine_types);
304
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
118
index XXXXXXX..XXXXXXX 100644
305
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/meson.build
306
--- a/hw/arm/Kconfig
120
+++ b/target/arm/meson.build
307
+++ b/hw/arm/Kconfig
121
@@ -XXX,XX +XXX,XX @@
308
@@ -XXX,XX +XXX,XX @@ config MAINSTONE
122
gen = [
309
select PFLASH_CFI01
123
decodetree.process('sve.decode', extra_args: '--decode=disas_sve'),
310
select SMC91C111
124
+ decodetree.process('sme.decode', extra_args: '--decode=disas_sme'),
311
125
decodetree.process('neon-shared.decode', extra_args: '--decode=disas_neon_shared'),
312
+config MPS3R
126
decodetree.process('neon-dp.decode', extra_args: '--decode=disas_neon_dp'),
313
+ bool
127
decodetree.process('neon-ls.decode', extra_args: '--decode=disas_neon_ls'),
314
+ default y
128
@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
315
+ depends on TCG && ARM
129
'sme_helper.c',
316
+
130
'translate-a64.c',
317
config MUSCA
131
'translate-sve.c',
318
bool
132
+ 'translate-sme.c',
319
default y
133
))
320
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
134
321
index XXXXXXX..XXXXXXX 100644
135
arm_softmmu_ss = ss.source_set()
322
--- a/hw/arm/meson.build
323
+++ b/hw/arm/meson.build
324
@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'CONFIG_HIGHBANK', if_true: files('highbank.c'))
325
arm_ss.add(when: 'CONFIG_INTEGRATOR', if_true: files('integratorcp.c'))
326
arm_ss.add(when: 'CONFIG_MAINSTONE', if_true: files('mainstone.c'))
327
arm_ss.add(when: 'CONFIG_MICROBIT', if_true: files('microbit.c'))
328
+arm_ss.add(when: 'CONFIG_MPS3R', if_true: files('mps3r.c'))
329
arm_ss.add(when: 'CONFIG_MUSICPAL', if_true: files('musicpal.c'))
330
arm_ss.add(when: 'CONFIG_NETDUINOPLUS2', if_true: files('netduinoplus2.c'))
331
arm_ss.add(when: 'CONFIG_OLIMEX_STM32_H405', if_true: files('olimex-stm32-h405.c'))
136
--
332
--
137
2.25.1
333
2.34.1
334
335
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark ADR as a non-streaming instruction, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Removing entries from sme-fa64.decode is an easy way to see
7
what remains to be done.
8
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-id: 20220708151540.18136-5-richard.henderson@linaro.org
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
target/arm/translate.h | 7 +++++++
15
target/arm/sme-fa64.decode | 1 -
16
target/arm/translate-sve.c | 8 ++++----
17
3 files changed, 11 insertions(+), 5 deletions(-)
18
19
diff --git a/target/arm/translate.h b/target/arm/translate.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/translate.h
22
+++ b/target/arm/translate.h
23
@@ -XXX,XX +XXX,XX @@ uint64_t asimd_imm_const(uint32_t imm, int cmode, int op);
24
static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
25
{ return dc_isar_feature(FEAT, s) && FUNC(s, __VA_ARGS__); }
26
27
+#define TRANS_FEAT_NONSTREAMING(NAME, FEAT, FUNC, ...) \
28
+ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
29
+ { \
30
+ s->is_nonstreaming = true; \
31
+ return dc_isar_feature(FEAT, s) && FUNC(s, __VA_ARGS__); \
32
+ }
33
+
34
#endif /* TARGET_ARM_TRANSLATE_H */
35
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
36
index XXXXXXX..XXXXXXX 100644
37
--- a/target/arm/sme-fa64.decode
38
+++ b/target/arm/sme-fa64.decode
39
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
40
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
41
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
42
43
-FAIL 0000 0100 --1- ---- 1010 ---- ---- ---- # ADR
44
FAIL 0000 0100 --1- ---- 1011 -0-- ---- ---- # FTSSEL, FEXPA
45
FAIL 0000 0101 --10 0001 100- ---- ---- ---- # COMPACT
46
FAIL 0010 0101 --01 100- 1111 000- ---0 ---- # RDFFR, RDFFRS
47
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/translate-sve.c
50
+++ b/target/arm/translate-sve.c
51
@@ -XXX,XX +XXX,XX @@ static bool do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn)
52
return gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, a->imm);
53
}
54
55
-TRANS_FEAT(ADR_p32, aa64_sve, do_adr, a, gen_helper_sve_adr_p32)
56
-TRANS_FEAT(ADR_p64, aa64_sve, do_adr, a, gen_helper_sve_adr_p64)
57
-TRANS_FEAT(ADR_s32, aa64_sve, do_adr, a, gen_helper_sve_adr_s32)
58
-TRANS_FEAT(ADR_u32, aa64_sve, do_adr, a, gen_helper_sve_adr_u32)
59
+TRANS_FEAT_NONSTREAMING(ADR_p32, aa64_sve, do_adr, a, gen_helper_sve_adr_p32)
60
+TRANS_FEAT_NONSTREAMING(ADR_p64, aa64_sve, do_adr, a, gen_helper_sve_adr_p64)
61
+TRANS_FEAT_NONSTREAMING(ADR_s32, aa64_sve, do_adr, a, gen_helper_sve_adr_s32)
62
+TRANS_FEAT_NONSTREAMING(ADR_u32, aa64_sve, do_adr, a, gen_helper_sve_adr_u32)
63
64
/*
65
*** SVE Integer Misc - Unpredicated Group
66
--
67
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-6-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 2 --
12
target/arm/translate-sve.c | 9 ++++++---
13
2 files changed, 6 insertions(+), 5 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
21
FAIL 0000 0100 --1- ---- 1011 -0-- ---- ---- # FTSSEL, FEXPA
22
FAIL 0000 0101 --10 0001 100- ---- ---- ---- # COMPACT
23
-FAIL 0010 0101 --01 100- 1111 000- ---0 ---- # RDFFR, RDFFRS
24
-FAIL 0010 0101 --10 1--- 1001 ---- ---- ---- # WRFFR, SETFFR
25
FAIL 0100 0101 --0- ---- 1011 ---- ---- ---- # BDEP, BEXT, BGRP
26
FAIL 0100 0101 000- ---- 0110 1--- ---- ---- # PMULLB, PMULLT (128b result)
27
FAIL 0110 0100 --1- ---- 1110 01-- ---- ---- # FMMLA, BFMMLA
28
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-sve.c
31
+++ b/target/arm/translate-sve.c
32
@@ -XXX,XX +XXX,XX @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
33
TRANS_FEAT(PTRUE, aa64_sve, do_predset, a->esz, a->rd, a->pat, a->s)
34
35
/* Note pat == 31 is #all, to set all elements. */
36
-TRANS_FEAT(SETFFR, aa64_sve, do_predset, 0, FFR_PRED_NUM, 31, false)
37
+TRANS_FEAT_NONSTREAMING(SETFFR, aa64_sve,
38
+ do_predset, 0, FFR_PRED_NUM, 31, false)
39
40
/* Note pat == 32 is #unimp, to set no elements. */
41
TRANS_FEAT(PFALSE, aa64_sve, do_predset, 0, a->rd, 32, false)
42
@@ -XXX,XX +XXX,XX @@ static bool trans_RDFFR_p(DisasContext *s, arg_RDFFR_p *a)
43
.rd = a->rd, .pg = a->pg, .s = a->s,
44
.rn = FFR_PRED_NUM, .rm = FFR_PRED_NUM,
45
};
46
+
47
+ s->is_nonstreaming = true;
48
return trans_AND_pppp(s, &alt_a);
49
}
50
51
-TRANS_FEAT(RDFFR, aa64_sve, do_mov_p, a->rd, FFR_PRED_NUM)
52
-TRANS_FEAT(WRFFR, aa64_sve, do_mov_p, FFR_PRED_NUM, a->rn)
53
+TRANS_FEAT_NONSTREAMING(RDFFR, aa64_sve, do_mov_p, a->rd, FFR_PRED_NUM)
54
+TRANS_FEAT_NONSTREAMING(WRFFR, aa64_sve, do_mov_p, FFR_PRED_NUM, a->rn)
55
56
static bool do_pfirst_pnext(DisasContext *s, arg_rr_esz *a,
57
void (*gen_fn)(TCGv_i32, TCGv_ptr,
58
--
59
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-7-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 3 ---
12
target/arm/translate-sve.c | 22 ++++++++++++----------
13
2 files changed, 12 insertions(+), 13 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 0000 0100 --1- ---- 1011 -0-- ---- ---- # FTSSEL, FEXPA
24
-FAIL 0000 0101 --10 0001 100- ---- ---- ---- # COMPACT
25
-FAIL 0100 0101 --0- ---- 1011 ---- ---- ---- # BDEP, BEXT, BGRP
26
FAIL 0100 0101 000- ---- 0110 1--- ---- ---- # PMULLB, PMULLT (128b result)
27
FAIL 0110 0100 --1- ---- 1110 01-- ---- ---- # FMMLA, BFMMLA
28
FAIL 0110 0101 --0- ---- 0000 11-- ---- ---- # FTSMUL
29
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate-sve.c
32
+++ b/target/arm/translate-sve.c
33
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_2 * const fexpa_fns[4] = {
34
NULL, gen_helper_sve_fexpa_h,
35
gen_helper_sve_fexpa_s, gen_helper_sve_fexpa_d,
36
};
37
-TRANS_FEAT(FEXPA, aa64_sve, gen_gvec_ool_zz,
38
- fexpa_fns[a->esz], a->rd, a->rn, 0)
39
+TRANS_FEAT_NONSTREAMING(FEXPA, aa64_sve, gen_gvec_ool_zz,
40
+ fexpa_fns[a->esz], a->rd, a->rn, 0)
41
42
static gen_helper_gvec_3 * const ftssel_fns[4] = {
43
NULL, gen_helper_sve_ftssel_h,
44
gen_helper_sve_ftssel_s, gen_helper_sve_ftssel_d,
45
};
46
-TRANS_FEAT(FTSSEL, aa64_sve, gen_gvec_ool_arg_zzz, ftssel_fns[a->esz], a, 0)
47
+TRANS_FEAT_NONSTREAMING(FTSSEL, aa64_sve, gen_gvec_ool_arg_zzz,
48
+ ftssel_fns[a->esz], a, 0)
49
50
/*
51
*** SVE Predicate Logical Operations Group
52
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(TRN2_q, aa64_sve_f64mm, gen_gvec_ool_arg_zzz,
53
static gen_helper_gvec_3 * const compact_fns[4] = {
54
NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d
55
};
56
-TRANS_FEAT(COMPACT, aa64_sve, gen_gvec_ool_arg_zpz, compact_fns[a->esz], a, 0)
57
+TRANS_FEAT_NONSTREAMING(COMPACT, aa64_sve, gen_gvec_ool_arg_zpz,
58
+ compact_fns[a->esz], a, 0)
59
60
/* Call the helper that computes the ARM LastActiveElement pseudocode
61
* function, scaled by the element size. This includes the not found
62
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3 * const bext_fns[4] = {
63
gen_helper_sve2_bext_b, gen_helper_sve2_bext_h,
64
gen_helper_sve2_bext_s, gen_helper_sve2_bext_d,
65
};
66
-TRANS_FEAT(BEXT, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
67
- bext_fns[a->esz], a, 0)
68
+TRANS_FEAT_NONSTREAMING(BEXT, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
69
+ bext_fns[a->esz], a, 0)
70
71
static gen_helper_gvec_3 * const bdep_fns[4] = {
72
gen_helper_sve2_bdep_b, gen_helper_sve2_bdep_h,
73
gen_helper_sve2_bdep_s, gen_helper_sve2_bdep_d,
74
};
75
-TRANS_FEAT(BDEP, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
76
- bdep_fns[a->esz], a, 0)
77
+TRANS_FEAT_NONSTREAMING(BDEP, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
78
+ bdep_fns[a->esz], a, 0)
79
80
static gen_helper_gvec_3 * const bgrp_fns[4] = {
81
gen_helper_sve2_bgrp_b, gen_helper_sve2_bgrp_h,
82
gen_helper_sve2_bgrp_s, gen_helper_sve2_bgrp_d,
83
};
84
-TRANS_FEAT(BGRP, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
85
- bgrp_fns[a->esz], a, 0)
86
+TRANS_FEAT_NONSTREAMING(BGRP, aa64_sve2_bitperm, gen_gvec_ool_arg_zzz,
87
+ bgrp_fns[a->esz], a, 0)
88
89
static gen_helper_gvec_3 * const cadd_fns[4] = {
90
gen_helper_sve2_cadd_b, gen_helper_sve2_cadd_h,
91
--
92
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-8-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 2 --
12
target/arm/translate-sve.c | 24 +++++++++++++++---------
13
2 files changed, 15 insertions(+), 11 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 0100 0101 000- ---- 0110 1--- ---- ---- # PMULLB, PMULLT (128b result)
24
-FAIL 0110 0100 --1- ---- 1110 01-- ---- ---- # FMMLA, BFMMLA
25
FAIL 0110 0101 --0- ---- 0000 11-- ---- ---- # FTSMUL
26
FAIL 0110 0101 --01 0--- 100- ---- ---- ---- # FTMAD
27
FAIL 0110 0101 --01 1--- 001- ---- ---- ---- # FADDA
28
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/translate-sve.c
31
+++ b/target/arm/translate-sve.c
32
@@ -XXX,XX +XXX,XX @@ static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel)
33
gen_helper_gvec_pmull_q, gen_helper_sve2_pmull_h,
34
NULL, gen_helper_sve2_pmull_d,
35
};
36
- if (a->esz == 0
37
- ? !dc_isar_feature(aa64_sve2_pmull128, s)
38
- : !dc_isar_feature(aa64_sve, s)) {
39
+
40
+ if (a->esz == 0) {
41
+ if (!dc_isar_feature(aa64_sve2_pmull128, s)) {
42
+ return false;
43
+ }
44
+ s->is_nonstreaming = true;
45
+ } else if (!dc_isar_feature(aa64_sve, s)) {
46
return false;
47
}
48
return gen_gvec_ool_arg_zzz(s, fns[a->esz], a, sel);
49
@@ -XXX,XX +XXX,XX @@ DO_ZPZZ_FP(FMINP, aa64_sve2, sve2_fminp_zpzz)
50
* SVE Integer Multiply-Add (unpredicated)
51
*/
52
53
-TRANS_FEAT(FMMLA_s, aa64_sve_f32mm, gen_gvec_fpst_zzzz, gen_helper_fmmla_s,
54
- a->rd, a->rn, a->rm, a->ra, 0, FPST_FPCR)
55
-TRANS_FEAT(FMMLA_d, aa64_sve_f64mm, gen_gvec_fpst_zzzz, gen_helper_fmmla_d,
56
- a->rd, a->rn, a->rm, a->ra, 0, FPST_FPCR)
57
+TRANS_FEAT_NONSTREAMING(FMMLA_s, aa64_sve_f32mm, gen_gvec_fpst_zzzz,
58
+ gen_helper_fmmla_s, a->rd, a->rn, a->rm, a->ra,
59
+ 0, FPST_FPCR)
60
+TRANS_FEAT_NONSTREAMING(FMMLA_d, aa64_sve_f64mm, gen_gvec_fpst_zzzz,
61
+ gen_helper_fmmla_d, a->rd, a->rn, a->rm, a->ra,
62
+ 0, FPST_FPCR)
63
64
static gen_helper_gvec_4 * const sqdmlal_zzzw_fns[] = {
65
NULL, gen_helper_sve2_sqdmlal_zzzw_h,
66
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
67
TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_ool_arg_zzxz,
68
gen_helper_gvec_bfdot_idx, a)
69
70
-TRANS_FEAT(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
71
- gen_helper_gvec_bfmmla, a, 0)
72
+TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
73
+ gen_helper_gvec_bfmmla, a, 0)
74
75
static bool do_BFMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel)
76
{
77
--
78
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-9-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 3 ---
12
target/arm/translate-sve.c | 15 +++++++++++----
13
2 files changed, 11 insertions(+), 7 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 0110 0101 --0- ---- 0000 11-- ---- ---- # FTSMUL
24
-FAIL 0110 0101 --01 0--- 100- ---- ---- ---- # FTMAD
25
-FAIL 0110 0101 --01 1--- 001- ---- ---- ---- # FADDA
26
FAIL 0100 0101 --0- ---- 1001 10-- ---- ---- # SMMLA, UMMLA, USMMLA
27
FAIL 0100 0101 --1- ---- 1--- ---- ---- ---- # SVE2 string/histo/crypto instructions
28
FAIL 1000 010- -00- ---- 10-- ---- ---- ---- # SVE2 32-bit gather NT load (vector+scalar)
29
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate-sve.c
32
+++ b/target/arm/translate-sve.c
33
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const ftmad_fns[4] = {
34
NULL, gen_helper_sve_ftmad_h,
35
gen_helper_sve_ftmad_s, gen_helper_sve_ftmad_d,
36
};
37
-TRANS_FEAT(FTMAD, aa64_sve, gen_gvec_fpst_zzz,
38
- ftmad_fns[a->esz], a->rd, a->rn, a->rm, a->imm,
39
- a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR)
40
+TRANS_FEAT_NONSTREAMING(FTMAD, aa64_sve, gen_gvec_fpst_zzz,
41
+ ftmad_fns[a->esz], a->rd, a->rn, a->rm, a->imm,
42
+ a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR)
43
44
/*
45
*** SVE Floating Point Accumulating Reduction Group
46
@@ -XXX,XX +XXX,XX @@ static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a)
47
if (a->esz == 0 || !dc_isar_feature(aa64_sve, s)) {
48
return false;
49
}
50
+ s->is_nonstreaming = true;
51
if (!sve_access_check(s)) {
52
return true;
53
}
54
@@ -XXX,XX +XXX,XX @@ static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a)
55
DO_FP3(FADD_zzz, fadd)
56
DO_FP3(FSUB_zzz, fsub)
57
DO_FP3(FMUL_zzz, fmul)
58
-DO_FP3(FTSMUL, ftsmul)
59
DO_FP3(FRECPS, recps)
60
DO_FP3(FRSQRTS, rsqrts)
61
62
#undef DO_FP3
63
64
+static gen_helper_gvec_3_ptr * const ftsmul_fns[4] = {
65
+ NULL, gen_helper_gvec_ftsmul_h,
66
+ gen_helper_gvec_ftsmul_s, gen_helper_gvec_ftsmul_d
67
+};
68
+TRANS_FEAT_NONSTREAMING(FTSMUL, aa64_sve, gen_gvec_fpst_arg_zzz,
69
+ ftsmul_fns[a->esz], a, 0)
70
+
71
/*
72
*** SVE Floating Point Arithmetic - Predicated Group
73
*/
74
--
75
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-10-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 1 -
12
target/arm/translate-sve.c | 12 ++++++------
13
2 files changed, 6 insertions(+), 7 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 0100 0101 --0- ---- 1001 10-- ---- ---- # SMMLA, UMMLA, USMMLA
24
FAIL 0100 0101 --1- ---- 1--- ---- ---- ---- # SVE2 string/histo/crypto instructions
25
FAIL 1000 010- -00- ---- 10-- ---- ---- ---- # SVE2 32-bit gather NT load (vector+scalar)
26
FAIL 1000 010- -00- ---- 111- ---- ---- ---- # SVE 32-bit gather prefetch (vector+imm)
27
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/translate-sve.c
30
+++ b/target/arm/translate-sve.c
31
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(FMLALT_zzxw, aa64_sve2, do_FMLAL_zzxw, a, false, true)
32
TRANS_FEAT(FMLSLB_zzxw, aa64_sve2, do_FMLAL_zzxw, a, true, false)
33
TRANS_FEAT(FMLSLT_zzxw, aa64_sve2, do_FMLAL_zzxw, a, true, true)
34
35
-TRANS_FEAT(SMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
36
- gen_helper_gvec_smmla_b, a, 0)
37
-TRANS_FEAT(USMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
38
- gen_helper_gvec_usmmla_b, a, 0)
39
-TRANS_FEAT(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
40
- gen_helper_gvec_ummla_b, a, 0)
41
+TRANS_FEAT_NONSTREAMING(SMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
42
+ gen_helper_gvec_smmla_b, a, 0)
43
+TRANS_FEAT_NONSTREAMING(USMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
44
+ gen_helper_gvec_usmmla_b, a, 0)
45
+TRANS_FEAT_NONSTREAMING(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
46
+ gen_helper_gvec_ummla_b, a, 0)
47
48
TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
49
gen_helper_gvec_bfdot, a, 0)
50
--
51
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-11-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 1 -
12
target/arm/translate-sve.c | 35 ++++++++++++++++++-----------------
13
2 files changed, 18 insertions(+), 18 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 0100 0101 --1- ---- 1--- ---- ---- ---- # SVE2 string/histo/crypto instructions
24
FAIL 1000 010- -00- ---- 10-- ---- ---- ---- # SVE2 32-bit gather NT load (vector+scalar)
25
FAIL 1000 010- -00- ---- 111- ---- ---- ---- # SVE 32-bit gather prefetch (vector+imm)
26
FAIL 1000 0100 0-1- ---- 0--- ---- ---- ---- # SVE 32-bit gather prefetch (scalar+vector)
27
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/translate-sve.c
30
+++ b/target/arm/translate-sve.c
31
@@ -XXX,XX +XXX,XX @@ DO_SVE2_ZZZ_NARROW(RSUBHNT, rsubhnt)
32
static gen_helper_gvec_flags_4 * const match_fns[4] = {
33
gen_helper_sve2_match_ppzz_b, gen_helper_sve2_match_ppzz_h, NULL, NULL
34
};
35
-TRANS_FEAT(MATCH, aa64_sve2, do_ppzz_flags, a, match_fns[a->esz])
36
+TRANS_FEAT_NONSTREAMING(MATCH, aa64_sve2, do_ppzz_flags, a, match_fns[a->esz])
37
38
static gen_helper_gvec_flags_4 * const nmatch_fns[4] = {
39
gen_helper_sve2_nmatch_ppzz_b, gen_helper_sve2_nmatch_ppzz_h, NULL, NULL
40
};
41
-TRANS_FEAT(NMATCH, aa64_sve2, do_ppzz_flags, a, nmatch_fns[a->esz])
42
+TRANS_FEAT_NONSTREAMING(NMATCH, aa64_sve2, do_ppzz_flags, a, nmatch_fns[a->esz])
43
44
static gen_helper_gvec_4 * const histcnt_fns[4] = {
45
NULL, NULL, gen_helper_sve2_histcnt_s, gen_helper_sve2_histcnt_d
46
};
47
-TRANS_FEAT(HISTCNT, aa64_sve2, gen_gvec_ool_arg_zpzz,
48
- histcnt_fns[a->esz], a, 0)
49
+TRANS_FEAT_NONSTREAMING(HISTCNT, aa64_sve2, gen_gvec_ool_arg_zpzz,
50
+ histcnt_fns[a->esz], a, 0)
51
52
-TRANS_FEAT(HISTSEG, aa64_sve2, gen_gvec_ool_arg_zzz,
53
- a->esz == 0 ? gen_helper_sve2_histseg : NULL, a, 0)
54
+TRANS_FEAT_NONSTREAMING(HISTSEG, aa64_sve2, gen_gvec_ool_arg_zzz,
55
+ a->esz == 0 ? gen_helper_sve2_histseg : NULL, a, 0)
56
57
DO_ZPZZ_FP(FADDP, aa64_sve2, sve2_faddp_zpzz)
58
DO_ZPZZ_FP(FMAXNMP, aa64_sve2, sve2_fmaxnmp_zpzz)
59
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SQRDCMLAH_zzzz, aa64_sve2, gen_gvec_ool_zzzz,
60
TRANS_FEAT(USDOT_zzzz, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
61
a->esz == 2 ? gen_helper_gvec_usdot_b : NULL, a, 0)
62
63
-TRANS_FEAT(AESMC, aa64_sve2_aes, gen_gvec_ool_zz,
64
- gen_helper_crypto_aesmc, a->rd, a->rd, a->decrypt)
65
+TRANS_FEAT_NONSTREAMING(AESMC, aa64_sve2_aes, gen_gvec_ool_zz,
66
+ gen_helper_crypto_aesmc, a->rd, a->rd, a->decrypt)
67
68
-TRANS_FEAT(AESE, aa64_sve2_aes, gen_gvec_ool_arg_zzz,
69
- gen_helper_crypto_aese, a, false)
70
-TRANS_FEAT(AESD, aa64_sve2_aes, gen_gvec_ool_arg_zzz,
71
- gen_helper_crypto_aese, a, true)
72
+TRANS_FEAT_NONSTREAMING(AESE, aa64_sve2_aes, gen_gvec_ool_arg_zzz,
73
+ gen_helper_crypto_aese, a, false)
74
+TRANS_FEAT_NONSTREAMING(AESD, aa64_sve2_aes, gen_gvec_ool_arg_zzz,
75
+ gen_helper_crypto_aese, a, true)
76
77
-TRANS_FEAT(SM4E, aa64_sve2_sm4, gen_gvec_ool_arg_zzz,
78
- gen_helper_crypto_sm4e, a, 0)
79
-TRANS_FEAT(SM4EKEY, aa64_sve2_sm4, gen_gvec_ool_arg_zzz,
80
- gen_helper_crypto_sm4ekey, a, 0)
81
+TRANS_FEAT_NONSTREAMING(SM4E, aa64_sve2_sm4, gen_gvec_ool_arg_zzz,
82
+ gen_helper_crypto_sm4e, a, 0)
83
+TRANS_FEAT_NONSTREAMING(SM4EKEY, aa64_sve2_sm4, gen_gvec_ool_arg_zzz,
84
+ gen_helper_crypto_sm4ekey, a, 0)
85
86
-TRANS_FEAT(RAX1, aa64_sve2_sha3, gen_gvec_fn_arg_zzz, gen_gvec_rax1, a)
87
+TRANS_FEAT_NONSTREAMING(RAX1, aa64_sve2_sha3, gen_gvec_fn_arg_zzz,
88
+ gen_gvec_rax1, a)
89
90
TRANS_FEAT(FCVTNT_sh, aa64_sve2, gen_gvec_fpst_arg_zpz,
91
gen_helper_sve2_fcvtnt_sh, a, 0, FPST_FPCR)
92
--
93
2.25.1
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-12-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 9 ---------
12
target/arm/translate-sve.c | 6 ++++++
13
2 files changed, 6 insertions(+), 9 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 1000 010- -00- ---- 10-- ---- ---- ---- # SVE2 32-bit gather NT load (vector+scalar)
24
FAIL 1000 010- -00- ---- 111- ---- ---- ---- # SVE 32-bit gather prefetch (vector+imm)
25
FAIL 1000 0100 0-1- ---- 0--- ---- ---- ---- # SVE 32-bit gather prefetch (scalar+vector)
26
-FAIL 1000 010- -01- ---- 1--- ---- ---- ---- # SVE 32-bit gather load (vector+imm)
27
-FAIL 1000 0100 0-0- ---- 0--- ---- ---- ---- # SVE 32-bit gather load byte (scalar+vector)
28
-FAIL 1000 0100 1--- ---- 0--- ---- ---- ---- # SVE 32-bit gather load half (scalar+vector)
29
-FAIL 1000 0101 0--- ---- 0--- ---- ---- ---- # SVE 32-bit gather load word (scalar+vector)
30
FAIL 1010 010- ---- ---- 011- ---- ---- ---- # SVE contiguous FF load (scalar+scalar)
31
FAIL 1010 010- ---1 ---- 101- ---- ---- ---- # SVE contiguous NF load (scalar+imm)
32
FAIL 1010 010- -01- ---- 000- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+scalar)
33
FAIL 1010 010- -010 ---- 001- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+imm)
34
FAIL 1100 010- ---- ---- ---- ---- ---- ---- # SVE 64-bit gather load/prefetch
35
-FAIL 1110 010- -00- ---- 001- ---- ---- ---- # SVE2 64-bit scatter NT store (vector+scalar)
36
-FAIL 1110 010- -10- ---- 001- ---- ---- ---- # SVE2 32-bit scatter NT store (vector+scalar)
37
-FAIL 1110 010- ---- ---- 1-0- ---- ---- ---- # SVE scatter store (scalar+32-bit vector)
38
-FAIL 1110 010- ---- ---- 101- ---- ---- ---- # SVE scatter store (misc)
39
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate-sve.c
42
+++ b/target/arm/translate-sve.c
43
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a)
44
if (!dc_isar_feature(aa64_sve, s)) {
45
return false;
46
}
47
+ s->is_nonstreaming = true;
48
if (!sve_access_check(s)) {
49
return true;
50
}
51
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
52
if (!dc_isar_feature(aa64_sve, s)) {
53
return false;
54
}
55
+ s->is_nonstreaming = true;
56
if (!sve_access_check(s)) {
57
return true;
58
}
59
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a)
60
if (!dc_isar_feature(aa64_sve2, s)) {
61
return false;
62
}
63
+ s->is_nonstreaming = true;
64
if (!sve_access_check(s)) {
65
return true;
66
}
67
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a)
68
if (!dc_isar_feature(aa64_sve, s)) {
69
return false;
70
}
71
+ s->is_nonstreaming = true;
72
if (!sve_access_check(s)) {
73
return true;
74
}
75
@@ -XXX,XX +XXX,XX @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
76
if (!dc_isar_feature(aa64_sve, s)) {
77
return false;
78
}
79
+ s->is_nonstreaming = true;
80
if (!sve_access_check(s)) {
81
return true;
82
}
83
@@ -XXX,XX +XXX,XX @@ static bool trans_STNT1_zprz(DisasContext *s, arg_ST1_zprz *a)
84
if (!dc_isar_feature(aa64_sve2, s)) {
85
return false;
86
}
87
+ s->is_nonstreaming = true;
88
if (!sve_access_check(s)) {
89
return true;
90
}
91
--
92
2.25.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Create the CPUs, the GIC, and the per-CPU RAM block for
2
the mps3-an536 board.
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20220708151540.18136-24-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Message-id: 20240206132931.38376-10-peter.maydell@linaro.org
7
---
6
---
8
target/arm/helper-sme.h | 5 +++
7
hw/arm/mps3r.c | 180 ++++++++++++++++++++++++++++++++++++++++++++++++-
9
target/arm/sme.decode | 11 +++++
8
1 file changed, 177 insertions(+), 3 deletions(-)
10
target/arm/sme_helper.c | 90 ++++++++++++++++++++++++++++++++++++++
11
target/arm/translate-sme.c | 31 +++++++++++++
12
4 files changed, 137 insertions(+)
13
9
14
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
10
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sme.h
12
--- a/hw/arm/mps3r.c
17
+++ b/target/arm/helper-sme.h
13
+++ b/hw/arm/mps3r.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sme_st1q_be_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i
14
@@ -XXX,XX +XXX,XX @@
19
DEF_HELPER_FLAGS_5(sme_st1q_le_h_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
15
#include "qemu/osdep.h"
20
DEF_HELPER_FLAGS_5(sme_st1q_be_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
16
#include "qemu/units.h"
21
DEF_HELPER_FLAGS_5(sme_st1q_le_v_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, tl, i32)
17
#include "qapi/error.h"
22
+
18
+#include "qapi/qmp/qlist.h"
23
+DEF_HELPER_FLAGS_5(sme_addha_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
#include "exec/address-spaces.h"
24
+DEF_HELPER_FLAGS_5(sme_addva_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
#include "cpu.h"
25
+DEF_HELPER_FLAGS_5(sme_addha_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
#include "hw/boards.h"
26
+DEF_HELPER_FLAGS_5(sme_addva_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+#include "hw/qdev-properties.h"
27
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
23
#include "hw/arm/boot.h"
28
index XXXXXXX..XXXXXXX 100644
24
+#include "hw/arm/bsa.h"
29
--- a/target/arm/sme.decode
25
+#include "hw/intc/arm_gicv3.h"
30
+++ b/target/arm/sme.decode
26
31
@@ -XXX,XX +XXX,XX @@ LDST1 1110000 111 st:1 rm:5 v:1 .. pg:3 rn:5 0 za_imm:4 \
27
/* Define the layout of RAM and ROM in a board */
32
28
typedef struct RAMInfo {
33
LDR 1110000 100 0 000000 .. 000 ..... 0 .... @ldstr
29
@@ -XXX,XX +XXX,XX @@ typedef struct RAMInfo {
34
STR 1110000 100 1 000000 .. 000 ..... 0 .... @ldstr
30
#define IS_ROM 2
35
+
31
36
+### SME Add Vector to Array
32
#define MPS3R_RAM_MAX 9
37
+
33
+#define MPS3R_CPU_MAX 2
38
+&adda zad zn pm pn
34
+
39
+@adda_32 ........ .. ..... . pm:3 pn:3 zn:5 ... zad:2 &adda
35
+#define PERIPHBASE 0xf0000000
40
+@adda_64 ........ .. ..... . pm:3 pn:3 zn:5 .. zad:3 &adda
36
+#define NUM_SPIS 96
41
+
37
42
+ADDHA_s 11000000 10 01000 0 ... ... ..... 000 .. @adda_32
38
typedef enum MPS3RFPGAType {
43
+ADDVA_s 11000000 10 01000 1 ... ... ..... 000 .. @adda_32
39
FPGA_AN536,
44
+ADDHA_d 11000000 11 01000 0 ... ... ..... 00 ... @adda_64
40
@@ -XXX,XX +XXX,XX @@ struct MPS3RMachineClass {
45
+ADDVA_d 11000000 11 01000 1 ... ... ..... 00 ... @adda_64
41
MachineClass parent;
46
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
42
MPS3RFPGAType fpga_type;
47
index XXXXXXX..XXXXXXX 100644
43
const RAMInfo *raminfo;
48
--- a/target/arm/sme_helper.c
44
+ hwaddr loader_start;
49
+++ b/target/arm/sme_helper.c
45
};
50
@@ -XXX,XX +XXX,XX @@ DO_ST(q, _be, MO_128)
46
51
DO_ST(q, _le, MO_128)
47
struct MPS3RMachineState {
52
48
MachineState parent;
53
#undef DO_ST
49
+ struct arm_boot_info bootinfo;
54
+
50
MemoryRegion ram[MPS3R_RAM_MAX];
55
+void HELPER(sme_addha_s)(void *vzda, void *vzn, void *vpn,
51
+ Object *cpu[MPS3R_CPU_MAX];
56
+ void *vpm, uint32_t desc)
52
+ MemoryRegion cpu_sysmem[MPS3R_CPU_MAX];
53
+ MemoryRegion sysmem_alias[MPS3R_CPU_MAX];
54
+ MemoryRegion cpu_ram[MPS3R_CPU_MAX];
55
+ GICv3State gic;
56
};
57
58
#define TYPE_MPS3R_MACHINE "mps3r"
59
@@ -XXX,XX +XXX,XX @@ static MemoryRegion *mr_for_raminfo(MPS3RMachineState *mms,
60
return ram;
61
}
62
63
+/*
64
+ * There is no defined secondary boot protocol for Linux for the AN536,
65
+ * because real hardware has a restriction that atomic operations between
66
+ * the two CPUs do not function correctly, and so true SMP is not
67
+ * possible. Therefore for cases where the user is directly booting
68
+ * a kernel, we treat the system as essentially uniprocessor, and
69
+ * put the secondary CPU into power-off state (as if the user on the
70
+ * real hardware had configured the secondary to be halted via the
71
+ * SCC config registers).
72
+ *
73
+ * Note that the default secondary boot code would not work here anyway
74
+ * as it assumes a GICv2, and we have a GICv3.
75
+ */
76
+static void mps3r_write_secondary_boot(ARMCPU *cpu,
77
+ const struct arm_boot_info *info)
57
+{
78
+{
58
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 4;
79
+ /*
59
+ uint64_t *pn = vpn, *pm = vpm;
80
+ * Power the secondary CPU off. This means we don't need to write any
60
+ uint32_t *zda = vzda, *zn = vzn;
81
+ * boot code into guest memory. Note that the 'cpu' argument to this
61
+
82
+ * function is the primary CPU we passed to arm_load_kernel(), not
62
+ for (row = 0; row < oprsz; ) {
83
+ * the secondary. Loop around all the other CPUs, as the boot.c
63
+ uint64_t pa = pn[row >> 4];
84
+ * code does for the "disable secondaries if PSCI is enabled" case.
64
+ do {
85
+ */
65
+ if (pa & 1) {
86
+ for (CPUState *cs = first_cpu; cs; cs = CPU_NEXT(cs)) {
66
+ for (col = 0; col < oprsz; ) {
87
+ if (cs != first_cpu) {
67
+ uint64_t pb = pm[col >> 4];
88
+ object_property_set_bool(OBJECT(cs), "start-powered-off", true,
68
+ do {
89
+ &error_abort);
69
+ if (pb & 1) {
70
+ zda[tile_vslice_index(row) + H4(col)] += zn[H4(col)];
71
+ }
72
+ pb >>= 4;
73
+ } while (++col & 15);
74
+ }
75
+ }
76
+ pa >>= 4;
77
+ } while (++row & 15);
78
+ }
79
+}
80
+
81
+void HELPER(sme_addha_d)(void *vzda, void *vzn, void *vpn,
82
+ void *vpm, uint32_t desc)
83
+{
84
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 8;
85
+ uint8_t *pn = vpn, *pm = vpm;
86
+ uint64_t *zda = vzda, *zn = vzn;
87
+
88
+ for (row = 0; row < oprsz; ++row) {
89
+ if (pn[H1(row)] & 1) {
90
+ for (col = 0; col < oprsz; ++col) {
91
+ if (pm[H1(col)] & 1) {
92
+ zda[tile_vslice_index(row) + col] += zn[col];
93
+ }
94
+ }
95
+ }
90
+ }
96
+ }
91
+ }
97
+}
92
+}
98
+
93
+
99
+void HELPER(sme_addva_s)(void *vzda, void *vzn, void *vpn,
94
+static void mps3r_secondary_cpu_reset(ARMCPU *cpu,
100
+ void *vpm, uint32_t desc)
95
+ const struct arm_boot_info *info)
101
+{
96
+{
102
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 4;
97
+ /* We don't need to do anything here because the CPU will be off */
103
+ uint64_t *pn = vpn, *pm = vpm;
98
+}
104
+ uint32_t *zda = vzda, *zn = vzn;
99
+
105
+
100
+static void create_gic(MPS3RMachineState *mms, MemoryRegion *sysmem)
106
+ for (row = 0; row < oprsz; ) {
101
+{
107
+ uint64_t pa = pn[row >> 4];
102
+ MachineState *machine = MACHINE(mms);
108
+ do {
103
+ DeviceState *gicdev;
109
+ if (pa & 1) {
104
+ QList *redist_region_count;
110
+ uint32_t zn_row = zn[H4(row)];
105
+
111
+ for (col = 0; col < oprsz; ) {
106
+ object_initialize_child(OBJECT(mms), "gic", &mms->gic, TYPE_ARM_GICV3);
112
+ uint64_t pb = pm[col >> 4];
107
+ gicdev = DEVICE(&mms->gic);
113
+ do {
108
+ qdev_prop_set_uint32(gicdev, "num-cpu", machine->smp.cpus);
114
+ if (pb & 1) {
109
+ qdev_prop_set_uint32(gicdev, "num-irq", NUM_SPIS + GIC_INTERNAL);
115
+ zda[tile_vslice_index(row) + H4(col)] += zn_row;
110
+ redist_region_count = qlist_new();
116
+ }
111
+ qlist_append_int(redist_region_count, machine->smp.cpus);
117
+ pb >>= 4;
112
+ qdev_prop_set_array(gicdev, "redist-region-count", redist_region_count);
118
+ } while (++col & 15);
113
+ object_property_set_link(OBJECT(&mms->gic), "sysmem",
119
+ }
114
+ OBJECT(sysmem), &error_fatal);
120
+ }
115
+ sysbus_realize(SYS_BUS_DEVICE(&mms->gic), &error_fatal);
121
+ pa >>= 4;
116
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->gic), 0, PERIPHBASE);
122
+ } while (++row & 15);
117
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->gic), 1, PERIPHBASE + 0x100000);
118
+ /*
119
+ * Wire the outputs from each CPU's generic timer and the GICv3
120
+ * maintenance interrupt signal to the appropriate GIC PPI inputs,
121
+ * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
122
+ */
123
+ for (int i = 0; i < machine->smp.cpus; i++) {
124
+ DeviceState *cpudev = DEVICE(mms->cpu[i]);
125
+ SysBusDevice *gicsbd = SYS_BUS_DEVICE(&mms->gic);
126
+ int intidbase = NUM_SPIS + i * GIC_INTERNAL;
127
+ int irq;
128
+ /*
129
+ * Mapping from the output timer irq lines from the CPU to the
130
+ * GIC PPI inputs used for this board. This isn't a BSA board,
131
+ * but it uses the standard convention for the PPI numbers.
132
+ */
133
+ const int timer_irq[] = {
134
+ [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
135
+ [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
136
+ [GTIMER_HYP] = ARCH_TIMER_NS_EL2_IRQ,
137
+ };
138
+
139
+ for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
140
+ qdev_connect_gpio_out(cpudev, irq,
141
+ qdev_get_gpio_in(gicdev,
142
+ intidbase + timer_irq[irq]));
143
+ }
144
+
145
+ qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt", 0,
146
+ qdev_get_gpio_in(gicdev,
147
+ intidbase + ARCH_GIC_MAINT_IRQ));
148
+
149
+ qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
150
+ qdev_get_gpio_in(gicdev,
151
+ intidbase + VIRTUAL_PMU_IRQ));
152
+
153
+ sysbus_connect_irq(gicsbd, i,
154
+ qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
155
+ sysbus_connect_irq(gicsbd, i + machine->smp.cpus,
156
+ qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
157
+ sysbus_connect_irq(gicsbd, i + 2 * machine->smp.cpus,
158
+ qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
159
+ sysbus_connect_irq(gicsbd, i + 3 * machine->smp.cpus,
160
+ qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
123
+ }
161
+ }
124
+}
162
+}
125
+
163
+
126
+void HELPER(sme_addva_d)(void *vzda, void *vzn, void *vpn,
164
static void mps3r_common_init(MachineState *machine)
127
+ void *vpm, uint32_t desc)
165
{
128
+{
166
MPS3RMachineState *mms = MPS3R_MACHINE(machine);
129
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 8;
167
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
130
+ uint8_t *pn = vpn, *pm = vpm;
168
MemoryRegion *mr = mr_for_raminfo(mms, ri);
131
+ uint64_t *zda = vzda, *zn = vzn;
169
memory_region_add_subregion(sysmem, ri->base, mr);
132
+
170
}
133
+ for (row = 0; row < oprsz; ++row) {
171
+
134
+ if (pn[H1(row)] & 1) {
172
+ assert(machine->smp.cpus <= MPS3R_CPU_MAX);
135
+ uint64_t zn_row = zn[row];
173
+ for (int i = 0; i < machine->smp.cpus; i++) {
136
+ for (col = 0; col < oprsz; ++col) {
174
+ g_autofree char *sysmem_name = g_strdup_printf("cpu-%d-memory", i);
137
+ if (pm[H1(col)] & 1) {
175
+ g_autofree char *ramname = g_strdup_printf("cpu-%d-memory", i);
138
+ zda[tile_vslice_index(row) + col] += zn_row;
176
+ g_autofree char *alias_name = g_strdup_printf("sysmem-alias-%d", i);
139
+ }
177
+
140
+ }
178
+ /*
141
+ }
179
+ * Each CPU has some private RAM/peripherals, so create the container
180
+ * which will house those, with the whole-machine system memory being
181
+ * used where there's no CPU-specific device. Note that we need the
182
+ * sysmem_alias aliases because we can't put one MR (the original
183
+ * 'sysmem') into more than one other MR.
184
+ */
185
+ memory_region_init(&mms->cpu_sysmem[i], OBJECT(machine),
186
+ sysmem_name, UINT64_MAX);
187
+ memory_region_init_alias(&mms->sysmem_alias[i], OBJECT(machine),
188
+ alias_name, sysmem, 0, UINT64_MAX);
189
+ memory_region_add_subregion_overlap(&mms->cpu_sysmem[i], 0,
190
+ &mms->sysmem_alias[i], -1);
191
+
192
+ mms->cpu[i] = object_new(machine->cpu_type);
193
+ object_property_set_link(mms->cpu[i], "memory",
194
+ OBJECT(&mms->cpu_sysmem[i]), &error_abort);
195
+ object_property_set_int(mms->cpu[i], "reset-cbar",
196
+ PERIPHBASE, &error_abort);
197
+ qdev_realize(DEVICE(mms->cpu[i]), NULL, &error_fatal);
198
+ object_unref(mms->cpu[i]);
199
+
200
+ /* Per-CPU RAM */
201
+ memory_region_init_ram(&mms->cpu_ram[i], NULL, ramname,
202
+ 0x1000, &error_fatal);
203
+ memory_region_add_subregion(&mms->cpu_sysmem[i], 0xe7c01000,
204
+ &mms->cpu_ram[i]);
142
+ }
205
+ }
143
+}
206
+
144
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
207
+ create_gic(mms, sysmem);
145
index XXXXXXX..XXXXXXX 100644
208
+
146
--- a/target/arm/translate-sme.c
209
+ mms->bootinfo.ram_size = machine->ram_size;
147
+++ b/target/arm/translate-sme.c
210
+ mms->bootinfo.board_id = -1;
148
@@ -XXX,XX +XXX,XX @@ static bool do_ldst_r(DisasContext *s, arg_ldstr *a, GenLdStR *fn)
211
+ mms->bootinfo.loader_start = mmc->loader_start;
149
212
+ mms->bootinfo.write_secondary_boot = mps3r_write_secondary_boot;
150
TRANS_FEAT(LDR, aa64_sme, do_ldst_r, a, gen_sve_ldr)
213
+ mms->bootinfo.secondary_cpu_reset_hook = mps3r_secondary_cpu_reset;
151
TRANS_FEAT(STR, aa64_sme, do_ldst_r, a, gen_sve_str)
214
+ arm_load_kernel(ARM_CPU(mms->cpu[0]), machine, &mms->bootinfo);
152
+
215
}
153
+static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz,
216
154
+ gen_helper_gvec_4 *fn)
217
static void mps3r_set_default_ram_info(MPS3RMachineClass *mmc)
155
+{
218
@@ -XXX,XX +XXX,XX @@ static void mps3r_set_default_ram_info(MPS3RMachineClass *mmc)
156
+ int svl = streaming_vec_reg_size(s);
219
/* Found the entry for "system memory" */
157
+ uint32_t desc = simd_desc(svl, svl, 0);
220
mc->default_ram_size = p->size;
158
+ TCGv_ptr za, zn, pn, pm;
221
mc->default_ram_id = p->name;
159
+
222
+ mmc->loader_start = p->base;
160
+ if (!sme_smza_enabled_check(s)) {
223
return;
161
+ return true;
224
}
162
+ }
225
}
163
+
226
@@ -XXX,XX +XXX,XX @@ static void mps3r_an536_class_init(ObjectClass *oc, void *data)
164
+ /* Sum XZR+zad to find ZAd. */
227
};
165
+ za = get_tile_rowcol(s, esz, 31, a->zad, false);
228
166
+ zn = vec_full_reg_ptr(s, a->zn);
229
mc->desc = "ARM MPS3 with AN536 FPGA image for Cortex-R52";
167
+ pn = pred_full_reg_ptr(s, a->pn);
230
- mc->default_cpus = 2;
168
+ pm = pred_full_reg_ptr(s, a->pm);
231
- mc->min_cpus = mc->default_cpus;
169
+
232
- mc->max_cpus = mc->default_cpus;
170
+ fn(za, zn, pn, pm, tcg_constant_i32(desc));
233
+ /*
171
+
234
+ * In the real FPGA image there are always two cores, but the standard
172
+ tcg_temp_free_ptr(za);
235
+ * initial setting for the SCC SYSCON 0x000 register is 0x21, meaning
173
+ tcg_temp_free_ptr(zn);
236
+ * that the second core is held in reset and halted. Many images built for
174
+ tcg_temp_free_ptr(pn);
237
+ * the board do not expect the second core to run at startup (especially
175
+ tcg_temp_free_ptr(pm);
238
+ * since on the real FPGA image it is not possible to use LDREX/STREX
176
+ return true;
239
+ * in RAM between the two cores, so a true SMP setup isn't supported).
177
+}
240
+ *
178
+
241
+ * As QEMU's equivalent of this, we support both -smp 1 and -smp 2,
179
+TRANS_FEAT(ADDHA_s, aa64_sme, do_adda, a, MO_32, gen_helper_sme_addha_s)
242
+ * with the default being -smp 1. This seems a more intuitive UI for
180
+TRANS_FEAT(ADDVA_s, aa64_sme, do_adda, a, MO_32, gen_helper_sme_addva_s)
243
+ * QEMU users than, for instance, having a machine property to allow
181
+TRANS_FEAT(ADDHA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addha_d)
244
+ * the user to set the initial value of the SYSCON 0x000 register.
182
+TRANS_FEAT(ADDVA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addva_d)
245
+ */
246
+ mc->default_cpus = 1;
247
+ mc->min_cpus = 1;
248
+ mc->max_cpus = 2;
249
mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-r52");
250
mc->valid_cpu_types = valid_cpu_types;
251
mmc->raminfo = an536_raminfo;
183
--
252
--
184
2.25.1
253
2.34.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
This board has a lot of UARTs: there is one UART per CPU in the
2
per-CPU peripheral part of the address map, whose interrupts are
3
connected as per-CPU interrupt lines. Then there are 4 UARTs in the
4
normal part of the peripheral space, whose interrupts are shared
5
peripheral interrupts.
2
6
3
Mark these as a non-streaming instructions, which should trap if full
7
Connect and wire them all up; this involves some OR gates where
4
a64 support is not enabled in streaming mode. In this case, introduce
8
multiple overflow interrupts are wired into one GIC input.
5
PRF_ns (prefetch non-streaming) to handle the checks.
6
9
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20220708151540.18136-13-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
12
Message-id: 20240206132931.38376-11-peter.maydell@linaro.org
11
---
13
---
12
target/arm/sme-fa64.decode | 3 ---
14
hw/arm/mps3r.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++
13
target/arm/sve.decode | 10 +++++-----
15
1 file changed, 94 insertions(+)
14
target/arm/translate-sve.c | 11 +++++++++++
15
3 files changed, 16 insertions(+), 8 deletions(-)
16
16
17
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
17
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
18
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/sme-fa64.decode
19
--- a/hw/arm/mps3r.c
20
+++ b/target/arm/sme-fa64.decode
20
+++ b/hw/arm/mps3r.c
21
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
21
@@ -XXX,XX +XXX,XX @@
22
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
22
#include "qapi/qmp/qlist.h"
23
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
23
#include "exec/address-spaces.h"
24
24
#include "cpu.h"
25
-FAIL 1000 010- -00- ---- 111- ---- ---- ---- # SVE 32-bit gather prefetch (vector+imm)
25
+#include "sysemu/sysemu.h"
26
-FAIL 1000 0100 0-1- ---- 0--- ---- ---- ---- # SVE 32-bit gather prefetch (scalar+vector)
26
#include "hw/boards.h"
27
FAIL 1010 010- ---- ---- 011- ---- ---- ---- # SVE contiguous FF load (scalar+scalar)
27
+#include "hw/or-irq.h"
28
FAIL 1010 010- ---1 ---- 101- ---- ---- ---- # SVE contiguous NF load (scalar+imm)
28
#include "hw/qdev-properties.h"
29
FAIL 1010 010- -01- ---- 000- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+scalar)
29
#include "hw/arm/boot.h"
30
FAIL 1010 010- -010 ---- 001- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+imm)
30
#include "hw/arm/bsa.h"
31
-FAIL 1100 010- ---- ---- ---- ---- ---- ---- # SVE 64-bit gather load/prefetch
31
+#include "hw/char/cmsdk-apb-uart.h"
32
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
32
#include "hw/intc/arm_gicv3.h"
33
index XXXXXXX..XXXXXXX 100644
33
34
--- a/target/arm/sve.decode
34
/* Define the layout of RAM and ROM in a board */
35
+++ b/target/arm/sve.decode
35
@@ -XXX,XX +XXX,XX @@ typedef struct RAMInfo {
36
@@ -XXX,XX +XXX,XX @@ LD1RO_zpri 1010010 .. 01 0.... 001 ... ..... ..... \
36
37
@rpri_load_msz nreg=0
37
#define MPS3R_RAM_MAX 9
38
38
#define MPS3R_CPU_MAX 2
39
# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
39
+#define MPS3R_UART_MAX 4 /* shared UART count */
40
-PRF 1000010 00 -1 ----- 0-- --- ----- 0 ----
40
41
+PRF_ns 1000010 00 -1 ----- 0-- --- ----- 0 ----
41
#define PERIPHBASE 0xf0000000
42
42
#define NUM_SPIS 96
43
# SVE 32-bit gather prefetch (vector plus immediate)
43
@@ -XXX,XX +XXX,XX @@ struct MPS3RMachineState {
44
-PRF 1000010 -- 00 ----- 111 --- ----- 0 ----
44
MemoryRegion sysmem_alias[MPS3R_CPU_MAX];
45
+PRF_ns 1000010 -- 00 ----- 111 --- ----- 0 ----
45
MemoryRegion cpu_ram[MPS3R_CPU_MAX];
46
46
GICv3State gic;
47
# SVE contiguous prefetch (scalar plus immediate)
47
+ /* per-CPU UARTs followed by the shared UARTs */
48
PRF 1000010 11 1- ----- 0-- --- ----- 0 ----
48
+ CMSDKAPBUART uart[MPS3R_CPU_MAX + MPS3R_UART_MAX];
49
@@ -XXX,XX +XXX,XX @@ LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \
49
+ OrIRQState cpu_uart_oflow[MPS3R_CPU_MAX];
50
@rpri_g_load esz=3
50
+ OrIRQState uart_oflow;
51
51
};
52
# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
52
53
-PRF 1100010 00 11 ----- 1-- --- ----- 0 ----
53
#define TYPE_MPS3R_MACHINE "mps3r"
54
+PRF_ns 1100010 00 11 ----- 1-- --- ----- 0 ----
54
@@ -XXX,XX +XXX,XX @@ struct MPS3RMachineState {
55
55
56
# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
56
OBJECT_DECLARE_TYPE(MPS3RMachineState, MPS3RMachineClass, MPS3R_MACHINE)
57
-PRF 1100010 00 -1 ----- 0-- --- ----- 0 ----
57
58
+PRF_ns 1100010 00 -1 ----- 0-- --- ----- 0 ----
58
+/*
59
59
+ * Main clock frequency CLK in Hz (50MHz). In the image there are also
60
# SVE 64-bit gather prefetch (vector plus immediate)
60
+ * ACLK, MCLK, GPUCLK and PERIPHCLK at the same frequency; for our
61
-PRF 1100010 -- 00 ----- 111 --- ----- 0 ----
61
+ * model we just roll them all into one.
62
+PRF_ns 1100010 -- 00 ----- 111 --- ----- 0 ----
62
+ */
63
63
+#define CLK_FRQ 50000000
64
### SVE Memory Store Group
64
+
65
65
static const RAMInfo an536_raminfo[] = {
66
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
66
{
67
index XXXXXXX..XXXXXXX 100644
67
.name = "ATCM",
68
--- a/target/arm/translate-sve.c
68
@@ -XXX,XX +XXX,XX @@ static void create_gic(MPS3RMachineState *mms, MemoryRegion *sysmem)
69
+++ b/target/arm/translate-sve.c
69
}
70
@@ -XXX,XX +XXX,XX @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a)
71
return true;
72
}
70
}
73
71
74
+static bool trans_PRF_ns(DisasContext *s, arg_PRF_ns *a)
72
+/*
73
+ * Create UART uartno, and map it into the MemoryRegion mem at address baseaddr.
74
+ * The qemu_irq arguments are where we connect the various IRQs from the UART.
75
+ */
76
+static void create_uart(MPS3RMachineState *mms, int uartno, MemoryRegion *mem,
77
+ hwaddr baseaddr, qemu_irq txirq, qemu_irq rxirq,
78
+ qemu_irq txoverirq, qemu_irq rxoverirq,
79
+ qemu_irq combirq)
75
+{
80
+{
76
+ if (!dc_isar_feature(aa64_sve, s)) {
81
+ g_autofree char *s = g_strdup_printf("uart%d", uartno);
77
+ return false;
82
+ SysBusDevice *sbd;
78
+ }
83
+
79
+ /* Prefetch is a nop within QEMU. */
84
+ assert(uartno < ARRAY_SIZE(mms->uart));
80
+ s->is_nonstreaming = true;
85
+ object_initialize_child(OBJECT(mms), s, &mms->uart[uartno],
81
+ (void)sve_access_check(s);
86
+ TYPE_CMSDK_APB_UART);
82
+ return true;
87
+ qdev_prop_set_uint32(DEVICE(&mms->uart[uartno]), "pclk-frq", CLK_FRQ);
88
+ qdev_prop_set_chr(DEVICE(&mms->uart[uartno]), "chardev", serial_hd(uartno));
89
+ sbd = SYS_BUS_DEVICE(&mms->uart[uartno]);
90
+ sysbus_realize(sbd, &error_fatal);
91
+ memory_region_add_subregion(mem, baseaddr,
92
+ sysbus_mmio_get_region(sbd, 0));
93
+ sysbus_connect_irq(sbd, 0, txirq);
94
+ sysbus_connect_irq(sbd, 1, rxirq);
95
+ sysbus_connect_irq(sbd, 2, txoverirq);
96
+ sysbus_connect_irq(sbd, 3, rxoverirq);
97
+ sysbus_connect_irq(sbd, 4, combirq);
83
+}
98
+}
84
+
99
+
85
/*
100
static void mps3r_common_init(MachineState *machine)
86
* Move Prefix
101
{
87
*
102
MPS3RMachineState *mms = MPS3R_MACHINE(machine);
103
MPS3RMachineClass *mmc = MPS3R_MACHINE_GET_CLASS(mms);
104
MemoryRegion *sysmem = get_system_memory();
105
+ DeviceState *gicdev;
106
107
for (const RAMInfo *ri = mmc->raminfo; ri->name; ri++) {
108
MemoryRegion *mr = mr_for_raminfo(mms, ri);
109
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
110
}
111
112
create_gic(mms, sysmem);
113
+ gicdev = DEVICE(&mms->gic);
114
+
115
+ /*
116
+ * UARTs 0 and 1 are per-CPU; their interrupts are wired to
117
+ * the relevant CPU's PPI 0..3, aka INTID 16..19
118
+ */
119
+ for (int i = 0; i < machine->smp.cpus; i++) {
120
+ int intidbase = NUM_SPIS + i * GIC_INTERNAL;
121
+ g_autofree char *s = g_strdup_printf("cpu-uart-oflow-orgate%d", i);
122
+ DeviceState *orgate;
123
+
124
+ /* The two overflow IRQs from the UART are ORed together into PPI 3 */
125
+ object_initialize_child(OBJECT(mms), s, &mms->cpu_uart_oflow[i],
126
+ TYPE_OR_IRQ);
127
+ orgate = DEVICE(&mms->cpu_uart_oflow[i]);
128
+ qdev_prop_set_uint32(orgate, "num-lines", 2);
129
+ qdev_realize(orgate, NULL, &error_fatal);
130
+ qdev_connect_gpio_out(orgate, 0,
131
+ qdev_get_gpio_in(gicdev, intidbase + 19));
132
+
133
+ create_uart(mms, i, &mms->cpu_sysmem[i], 0xe7c00000,
134
+ qdev_get_gpio_in(gicdev, intidbase + 17), /* tx */
135
+ qdev_get_gpio_in(gicdev, intidbase + 16), /* rx */
136
+ qdev_get_gpio_in(orgate, 0), /* txover */
137
+ qdev_get_gpio_in(orgate, 1), /* rxover */
138
+ qdev_get_gpio_in(gicdev, intidbase + 18) /* combined */);
139
+ }
140
+ /*
141
+ * UARTs 2 to 5 are whole-system; all overflow IRQs are ORed
142
+ * together into IRQ 17
143
+ */
144
+ object_initialize_child(OBJECT(mms), "uart-oflow-orgate",
145
+ &mms->uart_oflow, TYPE_OR_IRQ);
146
+ qdev_prop_set_uint32(DEVICE(&mms->uart_oflow), "num-lines",
147
+ MPS3R_UART_MAX * 2);
148
+ qdev_realize(DEVICE(&mms->uart_oflow), NULL, &error_fatal);
149
+ qdev_connect_gpio_out(DEVICE(&mms->uart_oflow), 0,
150
+ qdev_get_gpio_in(gicdev, 17));
151
+
152
+ for (int i = 0; i < MPS3R_UART_MAX; i++) {
153
+ hwaddr baseaddr = 0xe0205000 + i * 0x1000;
154
+ int rxirq = 5 + i * 2, txirq = 6 + i * 2, combirq = 13 + i;
155
+
156
+ create_uart(mms, i + MPS3R_CPU_MAX, sysmem, baseaddr,
157
+ qdev_get_gpio_in(gicdev, txirq),
158
+ qdev_get_gpio_in(gicdev, rxirq),
159
+ qdev_get_gpio_in(DEVICE(&mms->uart_oflow), i * 2),
160
+ qdev_get_gpio_in(DEVICE(&mms->uart_oflow), i * 2 + 1),
161
+ qdev_get_gpio_in(gicdev, combirq));
162
+ }
163
164
mms->bootinfo.ram_size = machine->ram_size;
165
mms->bootinfo.board_id = -1;
88
--
166
--
89
2.25.1
167
2.34.1
168
169
diff view generated by jsdifflib
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Mark these as a non-streaming instructions, which should trap
4
if full a64 support is not enabled in streaming mode.
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20220708151540.18136-14-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
target/arm/sme-fa64.decode | 2 --
12
target/arm/translate-sve.c | 2 ++
13
2 files changed, 2 insertions(+), 2 deletions(-)
14
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
21
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
22
23
-FAIL 1010 010- ---- ---- 011- ---- ---- ---- # SVE contiguous FF load (scalar+scalar)
24
-FAIL 1010 010- ---1 ---- 101- ---- ---- ---- # SVE contiguous NF load (scalar+imm)
25
FAIL 1010 010- -01- ---- 000- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+scalar)
26
FAIL 1010 010- -010 ---- 001- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+imm)
27
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/translate-sve.c
30
+++ b/target/arm/translate-sve.c
31
@@ -XXX,XX +XXX,XX @@ static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a)
32
if (!dc_isar_feature(aa64_sve, s)) {
33
return false;
34
}
35
+ s->is_nonstreaming = true;
36
if (sve_access_check(s)) {
37
TCGv_i64 addr = new_tmp_a64(s);
38
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
39
@@ -XXX,XX +XXX,XX @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a)
40
if (!dc_isar_feature(aa64_sve, s)) {
41
return false;
42
}
43
+ s->is_nonstreaming = true;
44
if (sve_access_check(s)) {
45
int vsz = vec_full_reg_size(s);
46
int elements = vsz >> dtype_esz[a->dtype];
47
--
48
2.25.1
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Add the GPIO, watchdog, dual-timer and I2C devices to the mps3-an536
2
board. These are all simple devices that just need to be created and
3
wired up.
2
4
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20220708151540.18136-19-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Message-id: 20240206132931.38376-12-peter.maydell@linaro.org
7
---
8
---
8
target/arm/helper-sme.h | 2 ++
9
hw/arm/mps3r.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++
9
target/arm/sme.decode | 4 ++++
10
1 file changed, 59 insertions(+)
10
target/arm/sme_helper.c | 25 +++++++++++++++++++++++++
11
target/arm/translate-sme.c | 13 +++++++++++++
12
4 files changed, 44 insertions(+)
13
11
14
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
12
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sme.h
14
--- a/hw/arm/mps3r.c
17
+++ b/target/arm/helper-sme.h
15
+++ b/hw/arm/mps3r.c
18
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@
19
17
#include "sysemu/sysemu.h"
20
DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
18
#include "hw/boards.h"
21
DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
19
#include "hw/or-irq.h"
20
+#include "hw/qdev-clock.h"
21
#include "hw/qdev-properties.h"
22
#include "hw/arm/boot.h"
23
#include "hw/arm/bsa.h"
24
#include "hw/char/cmsdk-apb-uart.h"
25
+#include "hw/i2c/arm_sbcon_i2c.h"
26
#include "hw/intc/arm_gicv3.h"
27
+#include "hw/misc/unimp.h"
28
+#include "hw/timer/cmsdk-apb-dualtimer.h"
29
+#include "hw/watchdog/cmsdk-apb-watchdog.h"
30
31
/* Define the layout of RAM and ROM in a board */
32
typedef struct RAMInfo {
33
@@ -XXX,XX +XXX,XX @@ struct MPS3RMachineState {
34
CMSDKAPBUART uart[MPS3R_CPU_MAX + MPS3R_UART_MAX];
35
OrIRQState cpu_uart_oflow[MPS3R_CPU_MAX];
36
OrIRQState uart_oflow;
37
+ CMSDKAPBWatchdog watchdog;
38
+ CMSDKAPBDualTimer dualtimer;
39
+ ArmSbconI2CState i2c[5];
40
+ Clock *clk;
41
};
42
43
#define TYPE_MPS3R_MACHINE "mps3r"
44
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
45
MemoryRegion *sysmem = get_system_memory();
46
DeviceState *gicdev;
47
48
+ mms->clk = clock_new(OBJECT(machine), "CLK");
49
+ clock_set_hz(mms->clk, CLK_FRQ);
22
+
50
+
23
+DEF_HELPER_FLAGS_3(sme_zero, TCG_CALL_NO_RWG, void, env, i32, i32)
51
for (const RAMInfo *ri = mmc->raminfo; ri->name; ri++) {
24
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
52
MemoryRegion *mr = mr_for_raminfo(mms, ri);
25
index XXXXXXX..XXXXXXX 100644
53
memory_region_add_subregion(sysmem, ri->base, mr);
26
--- a/target/arm/sme.decode
54
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
27
+++ b/target/arm/sme.decode
55
qdev_get_gpio_in(gicdev, combirq));
28
@@ -XXX,XX +XXX,XX @@
29
#
30
# This file is processed by scripts/decodetree.py
31
#
32
+
33
+### SME Misc
34
+
35
+ZERO 11000000 00 001 00000000000 imm:8
36
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/sme_helper.c
39
+++ b/target/arm/sme_helper.c
40
@@ -XXX,XX +XXX,XX @@ void helper_set_pstate_za(CPUARMState *env, uint32_t i)
41
memset(env->zarray, 0, sizeof(env->zarray));
42
}
56
}
43
}
57
44
+
58
+ for (int i = 0; i < 4; i++) {
45
+void helper_sme_zero(CPUARMState *env, uint32_t imm, uint32_t svl)
59
+ /* CMSDK GPIO controllers */
46
+{
60
+ g_autofree char *s = g_strdup_printf("gpio%d", i);
47
+ uint32_t i;
61
+ create_unimplemented_device(s, 0xe0000000 + i * 0x1000, 0x1000);
48
+
49
+ /*
50
+ * Special case clearing the entire ZA space.
51
+ * This falls into the CONSTRAINED UNPREDICTABLE zeroing of any
52
+ * parts of the ZA storage outside of SVL.
53
+ */
54
+ if (imm == 0xff) {
55
+ memset(env->zarray, 0, sizeof(env->zarray));
56
+ return;
57
+ }
62
+ }
58
+
63
+
59
+ /*
64
+ object_initialize_child(OBJECT(mms), "watchdog", &mms->watchdog,
60
+ * Recall that ZAnH.D[m] is spread across ZA[n+8*m],
65
+ TYPE_CMSDK_APB_WATCHDOG);
61
+ * so each row is discontiguous within ZA[].
66
+ qdev_connect_clock_in(DEVICE(&mms->watchdog), "WDOGCLK", mms->clk);
62
+ */
67
+ sysbus_realize(SYS_BUS_DEVICE(&mms->watchdog), &error_fatal);
63
+ for (i = 0; i < svl; i++) {
68
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->watchdog), 0,
64
+ if (imm & (1 << (i % 8))) {
69
+ qdev_get_gpio_in(gicdev, 0));
65
+ memset(&env->zarray[i], 0, svl);
70
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->watchdog), 0, 0xe0100000);
71
+
72
+ object_initialize_child(OBJECT(mms), "dualtimer", &mms->dualtimer,
73
+ TYPE_CMSDK_APB_DUALTIMER);
74
+ qdev_connect_clock_in(DEVICE(&mms->dualtimer), "TIMCLK", mms->clk);
75
+ sysbus_realize(SYS_BUS_DEVICE(&mms->dualtimer), &error_fatal);
76
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 0,
77
+ qdev_get_gpio_in(gicdev, 3));
78
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 1,
79
+ qdev_get_gpio_in(gicdev, 1));
80
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->dualtimer), 2,
81
+ qdev_get_gpio_in(gicdev, 2));
82
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->dualtimer), 0, 0xe0101000);
83
+
84
+ for (int i = 0; i < ARRAY_SIZE(mms->i2c); i++) {
85
+ static const hwaddr i2cbase[] = {0xe0102000, /* Touch */
86
+ 0xe0103000, /* Audio */
87
+ 0xe0107000, /* Shield0 */
88
+ 0xe0108000, /* Shield1 */
89
+ 0xe0109000}; /* DDR4 EEPROM */
90
+ g_autofree char *s = g_strdup_printf("i2c%d", i);
91
+
92
+ object_initialize_child(OBJECT(mms), s, &mms->i2c[i],
93
+ TYPE_ARM_SBCON_I2C);
94
+ sysbus_realize(SYS_BUS_DEVICE(&mms->i2c[i]), &error_fatal);
95
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->i2c[i]), 0, i2cbase[i]);
96
+ if (i != 2 && i != 3) {
97
+ /*
98
+ * internal-only bus: mark it full to avoid user-created
99
+ * i2c devices being plugged into it.
100
+ */
101
+ qbus_mark_full(qdev_get_child_bus(DEVICE(&mms->i2c[i]), "i2c"));
66
+ }
102
+ }
67
+ }
103
+ }
68
+}
69
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/translate-sme.c
72
+++ b/target/arm/translate-sme.c
73
@@ -XXX,XX +XXX,XX @@
74
*/
75
76
#include "decode-sme.c.inc"
77
+
104
+
78
+
105
mms->bootinfo.ram_size = machine->ram_size;
79
+static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
106
mms->bootinfo.board_id = -1;
80
+{
107
mms->bootinfo.loader_start = mmc->loader_start;
81
+ if (!dc_isar_feature(aa64_sme, s)) {
82
+ return false;
83
+ }
84
+ if (sme_za_enabled_check(s)) {
85
+ gen_helper_sme_zero(cpu_env, tcg_constant_i32(a->imm),
86
+ tcg_constant_i32(streaming_vec_reg_size(s)));
87
+ }
88
+ return true;
89
+}
90
--
108
--
91
2.25.1
109
2.34.1
110
111
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Add the remaining devices (or unimplemented-device stubs) for
2
this board: SPI controllers, SCC, FPGAIO, I2S, RTC, the
3
QSPI write-config block, and ethernet.
2
4
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20220708151540.18136-25-richard.henderson@linaro.org
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Message-id: 20240206132931.38376-13-peter.maydell@linaro.org
7
---
8
---
8
target/arm/helper-sme.h | 5 +++
9
hw/arm/mps3r.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++
9
target/arm/sme.decode | 9 +++++
10
1 file changed, 74 insertions(+)
10
target/arm/sme_helper.c | 69 ++++++++++++++++++++++++++++++++++++++
11
target/arm/translate-sme.c | 32 ++++++++++++++++++
12
4 files changed, 115 insertions(+)
13
11
14
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
12
diff --git a/hw/arm/mps3r.c b/hw/arm/mps3r.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sme.h
14
--- a/hw/arm/mps3r.c
17
+++ b/target/arm/helper-sme.h
15
+++ b/hw/arm/mps3r.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sme_addha_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@
19
DEF_HELPER_FLAGS_5(sme_addva_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
17
#include "hw/char/cmsdk-apb-uart.h"
20
DEF_HELPER_FLAGS_5(sme_addha_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
#include "hw/i2c/arm_sbcon_i2c.h"
21
DEF_HELPER_FLAGS_5(sme_addva_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
#include "hw/intc/arm_gicv3.h"
20
+#include "hw/misc/mps2-scc.h"
21
+#include "hw/misc/mps2-fpgaio.h"
22
#include "hw/misc/unimp.h"
23
+#include "hw/net/lan9118.h"
24
+#include "hw/rtc/pl031.h"
25
+#include "hw/ssi/pl022.h"
26
#include "hw/timer/cmsdk-apb-dualtimer.h"
27
#include "hw/watchdog/cmsdk-apb-watchdog.h"
28
29
@@ -XXX,XX +XXX,XX @@ struct MPS3RMachineState {
30
CMSDKAPBWatchdog watchdog;
31
CMSDKAPBDualTimer dualtimer;
32
ArmSbconI2CState i2c[5];
33
+ PL022State spi[3];
34
+ MPS2SCC scc;
35
+ MPS2FPGAIO fpgaio;
36
+ UnimplementedDeviceState i2s_audio;
37
+ PL031State rtc;
38
Clock *clk;
39
};
40
41
@@ -XXX,XX +XXX,XX @@ static const RAMInfo an536_raminfo[] = {
42
}
43
};
44
45
+static const int an536_oscclk[] = {
46
+ 24000000, /* 24MHz reference for RTC and timers */
47
+ 50000000, /* 50MHz ACLK */
48
+ 50000000, /* 50MHz MCLK */
49
+ 50000000, /* 50MHz GPUCLK */
50
+ 24576000, /* 24.576MHz AUDCLK */
51
+ 23750000, /* 23.75MHz HDLCDCLK */
52
+ 100000000, /* 100MHz DDR4_REF_CLK */
53
+};
22
+
54
+
23
+DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
55
static MemoryRegion *mr_for_raminfo(MPS3RMachineState *mms,
24
+ void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
56
const RAMInfo *raminfo)
25
+DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
57
{
26
+ void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
58
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
27
diff --git a/target/arm/sme.decode b/target/arm/sme.decode
59
MPS3RMachineClass *mmc = MPS3R_MACHINE_GET_CLASS(mms);
28
index XXXXXXX..XXXXXXX 100644
60
MemoryRegion *sysmem = get_system_memory();
29
--- a/target/arm/sme.decode
61
DeviceState *gicdev;
30
+++ b/target/arm/sme.decode
62
+ QList *oscclk;
31
@@ -XXX,XX +XXX,XX @@ ADDHA_s 11000000 10 01000 0 ... ... ..... 000 .. @adda_32
63
32
ADDVA_s 11000000 10 01000 1 ... ... ..... 000 .. @adda_32
64
mms->clk = clock_new(OBJECT(machine), "CLK");
33
ADDHA_d 11000000 11 01000 0 ... ... ..... 00 ... @adda_64
65
clock_set_hz(mms->clk, CLK_FRQ);
34
ADDVA_d 11000000 11 01000 1 ... ... ..... 00 ... @adda_64
66
@@ -XXX,XX +XXX,XX @@ static void mps3r_common_init(MachineState *machine)
35
+
36
+### SME Outer Product
37
+
38
+&op zad zn zm pm pn sub:bool
39
+@op_32 ........ ... zm:5 pm:3 pn:3 zn:5 sub:1 .. zad:2 &op
40
+@op_64 ........ ... zm:5 pm:3 pn:3 zn:5 sub:1 . zad:3 &op
41
+
42
+FMOPA_s 10000000 100 ..... ... ... ..... . 00 .. @op_32
43
+FMOPA_d 10000000 110 ..... ... ... ..... . 0 ... @op_64
44
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
45
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/sme_helper.c
47
+++ b/target/arm/sme_helper.c
48
@@ -XXX,XX +XXX,XX @@
49
#include "exec/cpu_ldst.h"
50
#include "exec/exec-all.h"
51
#include "qemu/int128.h"
52
+#include "fpu/softfloat.h"
53
#include "vec_internal.h"
54
#include "sve_ldst_internal.h"
55
56
@@ -XXX,XX +XXX,XX @@ void HELPER(sme_addva_d)(void *vzda, void *vzn, void *vpn,
57
}
67
}
58
}
68
}
59
}
69
70
+ for (int i = 0; i < ARRAY_SIZE(mms->spi); i++) {
71
+ g_autofree char *s = g_strdup_printf("spi%d", i);
72
+ hwaddr baseaddr = 0xe0104000 + i * 0x1000;
60
+
73
+
61
+void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn,
74
+ object_initialize_child(OBJECT(mms), s, &mms->spi[i], TYPE_PL022);
62
+ void *vpm, void *vst, uint32_t desc)
75
+ sysbus_realize(SYS_BUS_DEVICE(&mms->spi[i]), &error_fatal);
63
+{
76
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->spi[i]), 0, baseaddr);
64
+ intptr_t row, col, oprsz = simd_maxsz(desc);
77
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->spi[i]), 0,
65
+ uint32_t neg = simd_data(desc) << 31;
78
+ qdev_get_gpio_in(gicdev, 22 + i));
66
+ uint16_t *pn = vpn, *pm = vpm;
79
+ }
67
+ float_status fpst;
80
+
81
+ object_initialize_child(OBJECT(mms), "scc", &mms->scc, TYPE_MPS2_SCC);
82
+ qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-cfg0", 0);
83
+ qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-cfg4", 0x2);
84
+ qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-aid", 0x00200008);
85
+ qdev_prop_set_uint32(DEVICE(&mms->scc), "scc-id", 0x41055360);
86
+ oscclk = qlist_new();
87
+ for (int i = 0; i < ARRAY_SIZE(an536_oscclk); i++) {
88
+ qlist_append_int(oscclk, an536_oscclk[i]);
89
+ }
90
+ qdev_prop_set_array(DEVICE(&mms->scc), "oscclk", oscclk);
91
+ sysbus_realize(SYS_BUS_DEVICE(&mms->scc), &error_fatal);
92
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->scc), 0, 0xe0200000);
93
+
94
+ create_unimplemented_device("i2s-audio", 0xe0201000, 0x1000);
95
+
96
+ object_initialize_child(OBJECT(mms), "fpgaio", &mms->fpgaio,
97
+ TYPE_MPS2_FPGAIO);
98
+ qdev_prop_set_uint32(DEVICE(&mms->fpgaio), "prescale-clk", an536_oscclk[1]);
99
+ qdev_prop_set_uint32(DEVICE(&mms->fpgaio), "num-leds", 10);
100
+ qdev_prop_set_bit(DEVICE(&mms->fpgaio), "has-switches", true);
101
+ qdev_prop_set_bit(DEVICE(&mms->fpgaio), "has-dbgctrl", false);
102
+ sysbus_realize(SYS_BUS_DEVICE(&mms->fpgaio), &error_fatal);
103
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->fpgaio), 0, 0xe0202000);
104
+
105
+ create_unimplemented_device("clcd", 0xe0209000, 0x1000);
106
+
107
+ object_initialize_child(OBJECT(mms), "rtc", &mms->rtc, TYPE_PL031);
108
+ sysbus_realize(SYS_BUS_DEVICE(&mms->rtc), &error_fatal);
109
+ sysbus_mmio_map(SYS_BUS_DEVICE(&mms->rtc), 0, 0xe020a000);
110
+ sysbus_connect_irq(SYS_BUS_DEVICE(&mms->rtc), 0,
111
+ qdev_get_gpio_in(gicdev, 4));
68
+
112
+
69
+ /*
113
+ /*
70
+ * Make a copy of float_status because this operation does not
114
+ * In hardware this is a LAN9220; the LAN9118 is software compatible
71
+ * update the cumulative fp exception status. It also produces
115
+ * except that it doesn't support the checksum-offload feature.
72
+ * default nans.
73
+ */
116
+ */
74
+ fpst = *(float_status *)vst;
117
+ lan9118_init(0xe0300000,
75
+ set_default_nan_mode(true, &fpst);
118
+ qdev_get_gpio_in(gicdev, 18));
76
+
119
+
77
+ for (row = 0; row < oprsz; ) {
120
+ create_unimplemented_device("usb", 0xe0301000, 0x1000);
78
+ uint16_t pa = pn[H2(row >> 4)];
121
+ create_unimplemented_device("qspi-write-config", 0xe0600000, 0x1000);
79
+ do {
80
+ if (pa & 1) {
81
+ void *vza_row = vza + tile_vslice_offset(row);
82
+ uint32_t n = *(uint32_t *)(vzn + H1_4(row)) ^ neg;
83
+
122
+
84
+ for (col = 0; col < oprsz; ) {
123
mms->bootinfo.ram_size = machine->ram_size;
85
+ uint16_t pb = pm[H2(col >> 4)];
124
mms->bootinfo.board_id = -1;
86
+ do {
125
mms->bootinfo.loader_start = mmc->loader_start;
87
+ if (pb & 1) {
88
+ uint32_t *a = vza_row + H1_4(col);
89
+ uint32_t *m = vzm + H1_4(col);
90
+ *a = float32_muladd(n, *m, *a, 0, vst);
91
+ }
92
+ col += 4;
93
+ pb >>= 4;
94
+ } while (col & 15);
95
+ }
96
+ }
97
+ row += 4;
98
+ pa >>= 4;
99
+ } while (row & 15);
100
+ }
101
+}
102
+
103
+void HELPER(sme_fmopa_d)(void *vza, void *vzn, void *vzm, void *vpn,
104
+ void *vpm, void *vst, uint32_t desc)
105
+{
106
+ intptr_t row, col, oprsz = simd_oprsz(desc) / 8;
107
+ uint64_t neg = (uint64_t)simd_data(desc) << 63;
108
+ uint64_t *za = vza, *zn = vzn, *zm = vzm;
109
+ uint8_t *pn = vpn, *pm = vpm;
110
+ float_status fpst = *(float_status *)vst;
111
+
112
+ set_default_nan_mode(true, &fpst);
113
+
114
+ for (row = 0; row < oprsz; ++row) {
115
+ if (pn[H1(row)] & 1) {
116
+ uint64_t *za_row = &za[tile_vslice_index(row)];
117
+ uint64_t n = zn[row] ^ neg;
118
+
119
+ for (col = 0; col < oprsz; ++col) {
120
+ if (pm[H1(col)] & 1) {
121
+ uint64_t *a = &za_row[col];
122
+ *a = float64_muladd(n, zm[col], *a, 0, &fpst);
123
+ }
124
+ }
125
+ }
126
+ }
127
+}
128
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
129
index XXXXXXX..XXXXXXX 100644
130
--- a/target/arm/translate-sme.c
131
+++ b/target/arm/translate-sme.c
132
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(ADDHA_s, aa64_sme, do_adda, a, MO_32, gen_helper_sme_addha_s)
133
TRANS_FEAT(ADDVA_s, aa64_sme, do_adda, a, MO_32, gen_helper_sme_addva_s)
134
TRANS_FEAT(ADDHA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addha_d)
135
TRANS_FEAT(ADDVA_d, aa64_sme_i16i64, do_adda, a, MO_64, gen_helper_sme_addva_d)
136
+
137
+static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
138
+ gen_helper_gvec_5_ptr *fn)
139
+{
140
+ int svl = streaming_vec_reg_size(s);
141
+ uint32_t desc = simd_desc(svl, svl, a->sub);
142
+ TCGv_ptr za, zn, zm, pn, pm, fpst;
143
+
144
+ if (!sme_smza_enabled_check(s)) {
145
+ return true;
146
+ }
147
+
148
+ /* Sum XZR+zad to find ZAd. */
149
+ za = get_tile_rowcol(s, esz, 31, a->zad, false);
150
+ zn = vec_full_reg_ptr(s, a->zn);
151
+ zm = vec_full_reg_ptr(s, a->zm);
152
+ pn = pred_full_reg_ptr(s, a->pn);
153
+ pm = pred_full_reg_ptr(s, a->pm);
154
+ fpst = fpstatus_ptr(FPST_FPCR);
155
+
156
+ fn(za, zn, zm, pn, pm, fpst, tcg_constant_i32(desc));
157
+
158
+ tcg_temp_free_ptr(za);
159
+ tcg_temp_free_ptr(zn);
160
+ tcg_temp_free_ptr(pn);
161
+ tcg_temp_free_ptr(pm);
162
+ tcg_temp_free_ptr(fpst);
163
+ return true;
164
+}
165
+
166
+TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a, MO_32, gen_helper_sme_fmopa_s)
167
+TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a, MO_64, gen_helper_sme_fmopa_d)
168
--
126
--
169
2.25.1
127
2.34.1
128
129
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Add documentation for the mps3-an536 board type.
2
2
3
Mark these as a non-streaming instructions, which should trap
3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
if full a64 support is not enabled in streaming mode.
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Message-id: 20240206132931.38376-14-peter.maydell@linaro.org
6
---
7
docs/system/arm/mps2.rst | 37 ++++++++++++++++++++++++++++++++++---
8
1 file changed, 34 insertions(+), 3 deletions(-)
5
9
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
diff --git a/docs/system/arm/mps2.rst b/docs/system/arm/mps2.rst
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
index XXXXXXX..XXXXXXX 100644
8
Message-id: 20220708151540.18136-15-richard.henderson@linaro.org
12
--- a/docs/system/arm/mps2.rst
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
+++ b/docs/system/arm/mps2.rst
10
---
14
@@ -XXX,XX +XXX,XX @@
11
target/arm/sme-fa64.decode | 3 ---
15
-Arm MPS2 and MPS3 boards (``mps2-an385``, ``mps2-an386``, ``mps2-an500``, ``mps2-an505``, ``mps2-an511``, ``mps2-an521``, ``mps3-an524``, ``mps3-an547``)
12
target/arm/translate-sve.c | 2 ++
16
-=========================================================================================================================================================
13
2 files changed, 2 insertions(+), 3 deletions(-)
17
+Arm MPS2 and MPS3 boards (``mps2-an385``, ``mps2-an386``, ``mps2-an500``, ``mps2-an505``, ``mps2-an511``, ``mps2-an521``, ``mps3-an524``, ``mps3-an536``, ``mps3-an547``)
18
+=========================================================================================================================================================================
19
20
-These board models all use Arm M-profile CPUs.
21
+These board models use Arm M-profile or R-profile CPUs.
22
23
The Arm MPS2, MPS2+ and MPS3 dev boards are FPGA based (the 2+ has a
24
bigger FPGA but is otherwise the same as the 2; the 3 has a bigger
25
@@ -XXX,XX +XXX,XX @@ FPGA image.
26
27
QEMU models the following FPGA images:
28
29
+FPGA images using M-profile CPUs:
30
+
31
``mps2-an385``
32
Cortex-M3 as documented in Arm Application Note AN385
33
``mps2-an386``
34
@@ -XXX,XX +XXX,XX @@ QEMU models the following FPGA images:
35
``mps3-an547``
36
Cortex-M55 on an MPS3, as documented in Arm Application Note AN547
37
38
+FPGA images using R-profile CPUs:
39
+
40
+``mps3-an536``
41
+ Dual Cortex-R52 on an MPS3, as documented in Arm Application Note AN536
42
+
43
Differences between QEMU and real hardware:
44
45
- AN385/AN386 remapping of low 16K of memory to either ZBT SSRAM1 or to
46
@@ -XXX,XX +XXX,XX @@ Differences between QEMU and real hardware:
47
flash, but only as simple ROM, so attempting to rewrite the flash
48
from the guest will fail
49
- QEMU does not model the USB controller in MPS3 boards
50
+- AN536 does not support runtime control of CPU reset and halt via
51
+ the SCC CFG_REG0 register.
52
+- AN536 does not support enabling or disabling the flash and ATCM
53
+ interfaces via the SCC CFG_REG1 register.
54
+- AN536 does not support setting of the initial vector table
55
+ base address via the SCC CFG_REG6 and CFG_REG7 register config,
56
+ and does not provide a mechanism for specifying these values at
57
+ startup, so all guest images must be built to start from TCM
58
+ (i.e. to expect the interrupt vector base at 0 from reset).
59
+- AN536 defaults to only creating a single CPU; this is the equivalent
60
+ of the way the real FPGA image usually runs with the second Cortex-R52
61
+ held in halt via the initial SCC CFG_REG0 register setting. You can
62
+ create the second CPU with ``-smp 2``; both CPUs will then start
63
+ execution immediately on startup.
64
+
65
+Note that for the AN536 the first UART is accessible only by
66
+CPU0, and the second UART is accessible only by CPU1. The
67
+first UART accessible shared between both CPUs is the third
68
+UART. Guest software might therefore be built to use either
69
+the first UART or the third UART; if you don't see any output
70
+from the UART you are looking at, try one of the others.
71
+(Even if the AN536 machine is started with a single CPU and so
72
+no "CPU1-only UART", the UART numbering remains the same,
73
+with the third UART being the first of the shared ones.)
74
75
Machine-specific options
76
""""""""""""""""""""""""
77
--
78
2.34.1
14
79
15
diff --git a/target/arm/sme-fa64.decode b/target/arm/sme-fa64.decode
80
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/sme-fa64.decode
18
+++ b/target/arm/sme-fa64.decode
19
@@ -XXX,XX +XXX,XX @@ FAIL 0001 1110 0111 1110 0000 00-- ---- ---- # FJCVTZS
20
# --11 1100 --0- ---- ---- ---- ---- ---- # Load/store FP register (unscaled imm)
21
# --11 1100 --1- ---- ---- ---- ---- --10 # Load/store FP register (register offset)
22
# --11 1101 ---- ---- ---- ---- ---- ---- # Load/store FP register (scaled imm)
23
-
24
-FAIL 1010 010- -01- ---- 000- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+scalar)
25
-FAIL 1010 010- -010 ---- 001- ---- ---- ---- # SVE load & replicate 32 bytes (scalar+imm)
26
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/translate-sve.c
29
+++ b/target/arm/translate-sve.c
30
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1RO_zprr(DisasContext *s, arg_rprr_load *a)
31
if (a->rm == 31) {
32
return false;
33
}
34
+ s->is_nonstreaming = true;
35
if (sve_access_check(s)) {
36
TCGv_i64 addr = new_tmp_a64(s);
37
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
38
@@ -XXX,XX +XXX,XX @@ static bool trans_LD1RO_zpri(DisasContext *s, arg_rpri_load *a)
39
if (!dc_isar_feature(aa64_sve_f64mm, s)) {
40
return false;
41
}
42
+ s->is_nonstreaming = true;
43
if (sve_access_check(s)) {
44
TCGv_i64 addr = new_tmp_a64(s);
45
tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 32);
46
--
47
2.25.1
diff view generated by jsdifflib