Hi; most of this is the first half of the A64 simd decodetree
conversion; the rest is a mix of fixes from the last couple of weeks.

v2 uses patches from the v2 decodetree series to avoid a few
regressions in some A32 insns.

(Richard: I'm still planning to review the second half of the
v2 decodetree series; I just wanted to get the respin of this
pullreq out today...)

thanks
-- PMM

The following changes since commit 681d61362d3f766a00806b89d6581869041f73cb:

  Merge tag 'pull-error-2024-05-27' of https://repo.or.cz/qemu/armbru into staging (2024-05-27 06:40:42 -0700)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20240528

for you to fetch changes up to f240df3c31b40e4cf1af1f156a88efc1a1df406c:

  target/arm: Convert disas_simd_3same_logic to decodetree (2024-05-28 14:29:01 +0100)

----------------------------------------------------------------
target-arm queue:
 * xlnx_dpdma: fix descriptor endianness bug
 * hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
 * hw/arm/npcm7xx: remove setting of mp-affinity
 * hw/char: Correct STM32L4x5 usart register CR2 field ADD_0 size
 * hw/intc/arm_gic: Fix handling of NS view of GICC_APR<n>
 * hw/input/tsc2005: Fix -Wchar-subscripts warning in tsc2005_txrx()
 * hw: arm: Remove use of tabs in some source files
 * docs/system: Remove ADC from raspi documentation
 * target/arm: Start of the conversion of A64 SIMD to decodetree

----------------------------------------------------------------
Alexandra Diupina (1):
      xlnx_dpdma: fix descriptor endianness bug

Andrey Shumilin (1):
      hw/intc/arm_gic: Fix handling of NS view of GICC_APR<n>

Dorjoy Chowdhury (1):
      hw/arm/npcm7xx: remove setting of mp-affinity

Inès Varhol (1):
      hw/char: Correct STM32L4x5 usart register CR2 field ADD_0 size

Philippe Mathieu-Daudé (1):
      hw/input/tsc2005: Fix -Wchar-subscripts warning in tsc2005_txrx()

Rayhan Faizel (1):
      docs/system: Remove ADC from raspi documentation

Richard Henderson (34):
      target/arm: Use PLD, PLDW, PLI not NOP for t32
      target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
      target/arm: Fix decode of FMOV (hp) vs MOVI
      target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
      target/arm: Split out gengvec.c
      target/arm: Split out gengvec64.c
      target/arm: Convert Cryptographic AES to decodetree
      target/arm: Convert Cryptographic 3-register SHA to decodetree
      target/arm: Convert Cryptographic 2-register SHA to decodetree
      target/arm: Convert Cryptographic 3-register SHA512 to decodetree
      target/arm: Convert Cryptographic 2-register SHA512 to decodetree
      target/arm: Convert Cryptographic 4-register to decodetree
      target/arm: Convert Cryptographic 3-register, imm2 to decodetree
      target/arm: Convert XAR to decodetree
      target/arm: Convert Advanced SIMD copy to decodetree
      target/arm: Convert FMULX to decodetree
      target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
      target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
      target/arm: Introduce vfp_load_reg16
      target/arm: Expand vfp neg and abs inline
      target/arm: Convert FNMUL to decodetree
      target/arm: Convert FMLA, FMLS to decodetree
      target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
      target/arm: Convert FABD to decodetree
      target/arm: Convert FRECPS, FRSQRTS to decodetree
      target/arm: Convert FADDP to decodetree
      target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
      target/arm: Use gvec for neon faddp, fmaxp, fminp
      target/arm: Convert ADDP to decodetree
      target/arm: Use gvec for neon padd
      target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
      target/arm: Use gvec for neon pmax, pmin
      target/arm: Convert FMLAL, FMLSL to decodetree
      target/arm: Convert disas_simd_3same_logic to decodetree

Tanmay Patil (1):
      hw: arm: Remove use of tabs in some source files

Zenghui Yu (1):
      hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers

 docs/system/arm/raspi.rst       |    1 -
 target/arm/helper.h             |   68 +-
 target/arm/tcg/helper-a64.h     |   12 +
 target/arm/tcg/translate-a64.h  |    4 +
 target/arm/tcg/translate.h      |   51 +
 target/arm/tcg/a64.decode       |  315 +++-
 target/arm/tcg/t32.decode       |   25 +-
 hw/arm/boot.c                   |    8 +-
 hw/arm/npcm7xx.c                |    3 -
 hw/char/omap_uart.c             |   49 +-
 hw/char/stm32l4x5_usart.c       |    2 +-
 hw/dma/xlnx_dpdma.c             |   68 +-
 hw/gpio/zaurus.c                |   59 +-
 hw/input/tsc2005.c              |  135 +-
 hw/intc/arm_gic.c               |    4 +-
 target/arm/hvf/hvf.c            |  130 +-
 target/arm/tcg/gengvec.c        | 1672 +++++++++++++++++++++
 target/arm/tcg/gengvec64.c      |  190 +++
 target/arm/tcg/neon_helper.c    |    5 -
 target/arm/tcg/translate-a64.c  | 3137 +++++++++++++--------------------------
 target/arm/tcg/translate-neon.c |  136 +-
 target/arm/tcg/translate-sve.c  |  145 +-
 target/arm/tcg/translate-vfp.c  |   93 +-
 target/arm/tcg/translate.c      | 1592 +-------------------
 target/arm/tcg/vec_helper.c     |  221 ++-
 target/arm/vfp_helper.c         |   30 -
 target/arm/tcg/meson.build      |    2 +
 27 files changed, 3860 insertions(+), 4297 deletions(-)
 create mode 100644 target/arm/tcg/gengvec.c
 create mode 100644 target/arm/tcg/gengvec64.c
From: Alexandra Diupina <adiupina@astralinux.ru>

Add xlnx_dpdma_read_descriptor() and
xlnx_dpdma_write_descriptor() functions.
xlnx_dpdma_read_descriptor() combines reading a
descriptor from desc_addr by calling dma_memory_read()
and swapping the desc fields from guest memory order
to host memory order. xlnx_dpdma_write_descriptor()
performs similar actions when writing a descriptor.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
[PMM: tweaked indent, dropped behaviour change for write-failure case]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/dma/xlnx_dpdma.c | 68 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 64 insertions(+), 4 deletions(-)

diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/dma/xlnx_dpdma.c
+++ b/hw/dma/xlnx_dpdma.c
@@ -XXX,XX +XXX,XX @@ static void xlnx_dpdma_register_types(void)
     type_register_static(&xlnx_dpdma_info);
 }
 
+static MemTxResult xlnx_dpdma_read_descriptor(XlnxDPDMAState *s,
+                                              uint64_t desc_addr,
+                                              DPDMADescriptor *desc)
+{
+    MemTxResult res = dma_memory_read(&address_space_memory, desc_addr,
+                                      desc, sizeof(DPDMADescriptor),
+                                      MEMTXATTRS_UNSPECIFIED);
+    if (res) {
+        return res;
+    }
+
+    /* Convert from LE into host endianness. */
+    desc->control = le32_to_cpu(desc->control);
+    desc->descriptor_id = le32_to_cpu(desc->descriptor_id);
+    desc->xfer_size = le32_to_cpu(desc->xfer_size);
+    desc->line_size_stride = le32_to_cpu(desc->line_size_stride);
+    desc->timestamp_lsb = le32_to_cpu(desc->timestamp_lsb);
+    desc->timestamp_msb = le32_to_cpu(desc->timestamp_msb);
+    desc->address_extension = le32_to_cpu(desc->address_extension);
+    desc->next_descriptor = le32_to_cpu(desc->next_descriptor);
+    desc->source_address = le32_to_cpu(desc->source_address);
+    desc->address_extension_23 = le32_to_cpu(desc->address_extension_23);
+    desc->address_extension_45 = le32_to_cpu(desc->address_extension_45);
+    desc->source_address2 = le32_to_cpu(desc->source_address2);
+    desc->source_address3 = le32_to_cpu(desc->source_address3);
+    desc->source_address4 = le32_to_cpu(desc->source_address4);
+    desc->source_address5 = le32_to_cpu(desc->source_address5);
+    desc->crc = le32_to_cpu(desc->crc);
+
+    return res;
+}
+
+static MemTxResult xlnx_dpdma_write_descriptor(uint64_t desc_addr,
+                                               DPDMADescriptor *desc)
+{
+    DPDMADescriptor tmp_desc = *desc;
+
+    /* Convert from host endianness into LE. */
+    tmp_desc.control = cpu_to_le32(tmp_desc.control);
+    tmp_desc.descriptor_id = cpu_to_le32(tmp_desc.descriptor_id);
+    tmp_desc.xfer_size = cpu_to_le32(tmp_desc.xfer_size);
+    tmp_desc.line_size_stride = cpu_to_le32(tmp_desc.line_size_stride);
+    tmp_desc.timestamp_lsb = cpu_to_le32(tmp_desc.timestamp_lsb);
+    tmp_desc.timestamp_msb = cpu_to_le32(tmp_desc.timestamp_msb);
+    tmp_desc.address_extension = cpu_to_le32(tmp_desc.address_extension);
+    tmp_desc.next_descriptor = cpu_to_le32(tmp_desc.next_descriptor);
+    tmp_desc.source_address = cpu_to_le32(tmp_desc.source_address);
+    tmp_desc.address_extension_23 = cpu_to_le32(tmp_desc.address_extension_23);
+    tmp_desc.address_extension_45 = cpu_to_le32(tmp_desc.address_extension_45);
+    tmp_desc.source_address2 = cpu_to_le32(tmp_desc.source_address2);
+    tmp_desc.source_address3 = cpu_to_le32(tmp_desc.source_address3);
+    tmp_desc.source_address4 = cpu_to_le32(tmp_desc.source_address4);
+    tmp_desc.source_address5 = cpu_to_le32(tmp_desc.source_address5);
+    tmp_desc.crc = cpu_to_le32(tmp_desc.crc);
+
+    return dma_memory_write(&address_space_memory, desc_addr, &tmp_desc,
+                            sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED);
+}
+
 size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
                                   bool one_desc)
 {
@@ -XXX,XX +XXX,XX @@ size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
             desc_addr = xlnx_dpdma_descriptor_next_address(s, channel);
         }
 
-        if (dma_memory_read(&address_space_memory, desc_addr, &desc,
-                            sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED)) {
+        if (xlnx_dpdma_read_descriptor(s, desc_addr, &desc)) {
             s->registers[DPDMA_EISR] |= ((1 << 1) << channel);
             xlnx_dpdma_update_irq(s);
             s->operation_finished[channel] = true;
@@ -XXX,XX +XXX,XX @@ size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
         /* The descriptor need to be updated when it's completed. */
         DPRINTF("update the descriptor with the done flag set.\n");
         xlnx_dpdma_desc_set_done(&desc);
-        dma_memory_write(&address_space_memory, desc_addr, &desc,
-                         sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED);
+        if (xlnx_dpdma_write_descriptor(desc_addr, &desc)) {
+            DPRINTF("Can't write the descriptor.\n");
+            /* TODO: check hardware behaviour for memory write failure */
+        }
     }
 
     if (xlnx_dpdma_desc_completion_interrupt(&desc)) {
--
2.34.1
From: Zenghui Yu <zenghui.yu@linux.dev>

We wrongly encoded ID_AA64PFR1_EL1 using {3,0,0,4,2} in hvf_sreg_match[] so
we fail to get the expected ARMCPRegInfo from cp_regs hash table with the
wrong key.

Fix it with the correct encoding {3,0,0,4,1}. With that fixed, the Linux
guest can properly detect FEAT_SSBS2 on my M1 HW.

All DBG{B,W}{V,C}R_EL1 registers are also wrongly encoded with op0 == 14.
It happens to work because HVF_SYSREG(CRn, CRm, 14, op1, op2) equals to
HVF_SYSREG(CRn, CRm, 2, op1, op2), by definition. But we shouldn't rely on
it.

Cc: qemu-stable@nongnu.org
Fixes: a1477da3ddeb ("hvf: Add Apple Silicon support")
Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Reviewed-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20240503153453.54389-1-zenghui.yu@linux.dev
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/hvf/hvf.c | 130 +++++++++++++++++++++----------------------
 1 file changed, 65 insertions(+), 65 deletions(-)

diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@ struct hvf_sreg_match {
 };
 
 static struct hvf_sreg_match hvf_sreg_match[] = {
-    { HV_SYS_REG_DBGBVR0_EL1, HVF_SYSREG(0, 0, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR0_EL1, HVF_SYSREG(0, 0, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR0_EL1, HVF_SYSREG(0, 0, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR0_EL1, HVF_SYSREG(0, 0, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR0_EL1, HVF_SYSREG(0, 0, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR0_EL1, HVF_SYSREG(0, 0, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR0_EL1, HVF_SYSREG(0, 0, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR0_EL1, HVF_SYSREG(0, 0, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR1_EL1, HVF_SYSREG(0, 1, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR1_EL1, HVF_SYSREG(0, 1, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR1_EL1, HVF_SYSREG(0, 1, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR1_EL1, HVF_SYSREG(0, 1, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR1_EL1, HVF_SYSREG(0, 1, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR1_EL1, HVF_SYSREG(0, 1, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR1_EL1, HVF_SYSREG(0, 1, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR1_EL1, HVF_SYSREG(0, 1, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR2_EL1, HVF_SYSREG(0, 2, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR2_EL1, HVF_SYSREG(0, 2, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR2_EL1, HVF_SYSREG(0, 2, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR2_EL1, HVF_SYSREG(0, 2, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR2_EL1, HVF_SYSREG(0, 2, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR2_EL1, HVF_SYSREG(0, 2, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR2_EL1, HVF_SYSREG(0, 2, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR2_EL1, HVF_SYSREG(0, 2, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR3_EL1, HVF_SYSREG(0, 3, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR3_EL1, HVF_SYSREG(0, 3, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR3_EL1, HVF_SYSREG(0, 3, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR3_EL1, HVF_SYSREG(0, 3, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR3_EL1, HVF_SYSREG(0, 3, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR3_EL1, HVF_SYSREG(0, 3, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR3_EL1, HVF_SYSREG(0, 3, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR3_EL1, HVF_SYSREG(0, 3, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR4_EL1, HVF_SYSREG(0, 4, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR4_EL1, HVF_SYSREG(0, 4, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR4_EL1, HVF_SYSREG(0, 4, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR4_EL1, HVF_SYSREG(0, 4, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR4_EL1, HVF_SYSREG(0, 4, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR4_EL1, HVF_SYSREG(0, 4, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR4_EL1, HVF_SYSREG(0, 4, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR4_EL1, HVF_SYSREG(0, 4, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR5_EL1, HVF_SYSREG(0, 5, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR5_EL1, HVF_SYSREG(0, 5, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR5_EL1, HVF_SYSREG(0, 5, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR5_EL1, HVF_SYSREG(0, 5, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR5_EL1, HVF_SYSREG(0, 5, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR5_EL1, HVF_SYSREG(0, 5, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR5_EL1, HVF_SYSREG(0, 5, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR5_EL1, HVF_SYSREG(0, 5, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR6_EL1, HVF_SYSREG(0, 6, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR6_EL1, HVF_SYSREG(0, 6, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR6_EL1, HVF_SYSREG(0, 6, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR6_EL1, HVF_SYSREG(0, 6, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR6_EL1, HVF_SYSREG(0, 6, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR6_EL1, HVF_SYSREG(0, 6, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR6_EL1, HVF_SYSREG(0, 6, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR6_EL1, HVF_SYSREG(0, 6, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR7_EL1, HVF_SYSREG(0, 7, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR7_EL1, HVF_SYSREG(0, 7, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR7_EL1, HVF_SYSREG(0, 7, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR7_EL1, HVF_SYSREG(0, 7, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR7_EL1, HVF_SYSREG(0, 7, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR7_EL1, HVF_SYSREG(0, 7, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR7_EL1, HVF_SYSREG(0, 7, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR7_EL1, HVF_SYSREG(0, 7, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR8_EL1, HVF_SYSREG(0, 8, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR8_EL1, HVF_SYSREG(0, 8, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR8_EL1, HVF_SYSREG(0, 8, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR8_EL1, HVF_SYSREG(0, 8, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR8_EL1, HVF_SYSREG(0, 8, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR8_EL1, HVF_SYSREG(0, 8, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR8_EL1, HVF_SYSREG(0, 8, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR8_EL1, HVF_SYSREG(0, 8, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR9_EL1, HVF_SYSREG(0, 9, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR9_EL1, HVF_SYSREG(0, 9, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR9_EL1, HVF_SYSREG(0, 9, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR9_EL1, HVF_SYSREG(0, 9, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR9_EL1, HVF_SYSREG(0, 9, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR9_EL1, HVF_SYSREG(0, 9, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR9_EL1, HVF_SYSREG(0, 9, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR9_EL1, HVF_SYSREG(0, 9, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR10_EL1, HVF_SYSREG(0, 10, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR10_EL1, HVF_SYSREG(0, 10, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR10_EL1, HVF_SYSREG(0, 10, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR10_EL1, HVF_SYSREG(0, 10, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR10_EL1, HVF_SYSREG(0, 10, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR10_EL1, HVF_SYSREG(0, 10, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR10_EL1, HVF_SYSREG(0, 10, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR10_EL1, HVF_SYSREG(0, 10, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR11_EL1, HVF_SYSREG(0, 11, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR11_EL1, HVF_SYSREG(0, 11, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR11_EL1, HVF_SYSREG(0, 11, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR11_EL1, HVF_SYSREG(0, 11, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR11_EL1, HVF_SYSREG(0, 11, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR11_EL1, HVF_SYSREG(0, 11, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR11_EL1, HVF_SYSREG(0, 11, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR11_EL1, HVF_SYSREG(0, 11, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR12_EL1, HVF_SYSREG(0, 12, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR12_EL1, HVF_SYSREG(0, 12, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR12_EL1, HVF_SYSREG(0, 12, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR12_EL1, HVF_SYSREG(0, 12, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR12_EL1, HVF_SYSREG(0, 12, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR12_EL1, HVF_SYSREG(0, 12, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR12_EL1, HVF_SYSREG(0, 12, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR12_EL1, HVF_SYSREG(0, 12, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR13_EL1, HVF_SYSREG(0, 13, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR13_EL1, HVF_SYSREG(0, 13, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR13_EL1, HVF_SYSREG(0, 13, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR13_EL1, HVF_SYSREG(0, 13, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR13_EL1, HVF_SYSREG(0, 13, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR13_EL1, HVF_SYSREG(0, 13, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR13_EL1, HVF_SYSREG(0, 13, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR13_EL1, HVF_SYSREG(0, 13, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR14_EL1, HVF_SYSREG(0, 14, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR14_EL1, HVF_SYSREG(0, 14, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR14_EL1, HVF_SYSREG(0, 14, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR14_EL1, HVF_SYSREG(0, 14, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR14_EL1, HVF_SYSREG(0, 14, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR14_EL1, HVF_SYSREG(0, 14, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR14_EL1, HVF_SYSREG(0, 14, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR14_EL1, HVF_SYSREG(0, 14, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR15_EL1, HVF_SYSREG(0, 15, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR15_EL1, HVF_SYSREG(0, 15, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR15_EL1, HVF_SYSREG(0, 15, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR15_EL1, HVF_SYSREG(0, 15, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR15_EL1, HVF_SYSREG(0, 15, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR15_EL1, HVF_SYSREG(0, 15, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR15_EL1, HVF_SYSREG(0, 15, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR15_EL1, HVF_SYSREG(0, 15, 2, 0, 7) },
 
 #ifdef SYNC_NO_RAW_REGS
 /*
@@ -XXX,XX +XXX,XX @@ static struct hvf_sreg_match hvf_sreg_match[] = {
     { HV_SYS_REG_MPIDR_EL1, HVF_SYSREG(0, 0, 3, 0, 5) },
     { HV_SYS_REG_ID_AA64PFR0_EL1, HVF_SYSREG(0, 4, 3, 0, 0) },
 #endif
-    { HV_SYS_REG_ID_AA64PFR1_EL1, HVF_SYSREG(0, 4, 3, 0, 2) },
+    { HV_SYS_REG_ID_AA64PFR1_EL1, HVF_SYSREG(0, 4, 3, 0, 1) },
     { HV_SYS_REG_ID_AA64DFR0_EL1, HVF_SYSREG(0, 5, 3, 0, 0) },
     { HV_SYS_REG_ID_AA64DFR1_EL1, HVF_SYSREG(0, 5, 3, 0, 1) },
     { HV_SYS_REG_ID_AA64ISAR0_EL1, HVF_SYSREG(0, 6, 3, 0, 0) },
--
2.34.1
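
The "by definition" equivalence in the commit message is easy to see if the
encoding macro keeps only the two architectural bits of op0, which is what
the message implies. A self-contained sketch (hypothetical pack function,
not QEMU's actual HVF_SYSREG macro):

    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical illustration: if the encoder masks op0 to a 2-bit
     * field, then op0 = 14 (0b1110) and op0 = 2 (0b10) produce the
     * same key, so the wrong encoding happened to work. */
    static uint32_t pack_op0(uint32_t op0)
    {
        return (op0 & 0x3) << 14;   /* only two bits of op0 survive */
    }

    int main(void)
    {
        assert(pack_op0(14) == pack_op0(2));
        return 0;
    }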
From: Dorjoy Chowdhury <dorjoychy111@gmail.com>

The value of the mp-affinity property being set in npcm7xx_realize is
always the same as the default value it would have when arm_cpu_realizefn
is called if the property is not set here. So there is no need to set
the property value in npcm7xx_realize function.

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240504141733.14813-1-dorjoychy111@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/npcm7xx.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -XXX,XX +XXX,XX @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 
     /* CPUs */
     for (i = 0; i < nc->num_cpus; i++) {
-        object_property_set_int(OBJECT(&s->cpu[i]), "mp-affinity",
-                                arm_build_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS),
-                                &error_abort);
         object_property_set_int(OBJECT(&s->cpu[i]), "reset-cbar",
                                 NPCM7XX_GIC_CPU_IF_ADDR, &error_abort);
         object_property_set_bool(OBJECT(&s->cpu[i]), "reset-hivecs", true,
--
2.34.1
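
For reference, the default comes out the same because of how the affinity
value is derived from the CPU index. A simplified sketch of that computation
(modelled on QEMU's arm_build_mp_affinity(); shown here as an illustration,
assuming both cluster sizes exceed the two cores in question):

    #include <stdint.h>

    #define ARM_AFF1_SHIFT 8

    /* Simplified sketch: Aff1 is the cluster number, Aff0 the core
     * number within the cluster. */
    static uint64_t build_mp_affinity(int idx, uint8_t clustersz)
    {
        return ((uint64_t)(idx / clustersz) << ARM_AFF1_SHIFT)
               | (uint32_t)(idx % clustersz);
    }

    /* For the NPCM7xx's two cores (idx 0 and 1), any cluster size >= 2
     * yields Aff1 = 0 and Aff0 = idx, so the explicit property write
     * and the realize-time default agree. */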
From: Inès Varhol <ines.varhol@telecom-paris.fr>

Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240505141613.387508-1-ines.varhol@telecom-paris.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/char/stm32l4x5_usart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/char/stm32l4x5_usart.c
+++ b/hw/char/stm32l4x5_usart.c
@@ -XXX,XX +XXX,XX @@ REG32(CR1, 0x00)
     FIELD(CR1, UE, 0, 1)      /* USART enable */
 REG32(CR2, 0x04)
     FIELD(CR2, ADD_1, 28, 4)  /* ADD[7:4] */
-    FIELD(CR2, ADD_0, 24, 1)  /* ADD[3:0] */
+    FIELD(CR2, ADD_0, 24, 4)  /* ADD[3:0] */
     FIELD(CR2, RTOEN, 23, 1)  /* Receiver timeout enable */
     FIELD(CR2, ABRMOD, 21, 2) /* Auto baud rate mode */
     FIELD(CR2, ABREN, 20, 1)  /* Auto baud rate enable */
--
2.34.1
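
The fix matters because the last argument of FIELD() is the field width in
bits: ADD[3:0] is a 4-bit slice of CR2, so declaring it 1 bit wide made the
generated mask cover only bit 24 instead of bits 27:24. Roughly how the
declaration expands (a sketch of the registerfields.h idea, not the exact
QEMU macro text):

    /* FIELD(CR2, ADD_0, 24, 4) conceptually defines: */
    #define R_CR2_ADD_0_SHIFT  24
    #define R_CR2_ADD_0_LENGTH 4
    #define R_CR2_ADD_0_MASK \
        (((1u << R_CR2_ADD_0_LENGTH) - 1) << R_CR2_ADD_0_SHIFT)
    /* With LENGTH 1, the mask covered only bit 24, silently dropping
     * three bits of the address-match value. */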
From: Andrey Shumilin <shum.sdl@nppct.ru>

In gic_cpu_read() and gic_cpu_write(), we delegate the handling of
reading and writing the Non-Secure view of the GICC_APR<n> registers
to functions gic_apr_ns_view() and gic_apr_write_ns_view().
Unfortunately we got the order of the arguments wrong, swapping the
CPU number and the register number (which the compiler doesn't catch
because they're both integers).

Most guests probably didn't notice this bug because directly
accessing the APR registers is typically something only done by
firmware when it is doing state save for going into a sleep mode.

Correct the mismatched call arguments.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Cc: qemu-stable@nongnu.org
Fixes: 51fd06e0ee ("hw/intc/arm_gic: Fix handling of GICC_APR<n>, GICC_NSAPR<n> registers")
Signed-off-by: Andrey Shumilin <shum.sdl@nppct.ru>
[PMM: Rewrote commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 hw/intc/arm_gic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -XXX,XX +XXX,XX @@ static MemTxResult gic_cpu_read(GICState *s, int cpu, int offset,
             *data = s->h_apr[gic_get_vcpu_real_id(cpu)];
         } else if (gic_cpu_ns_access(s, cpu, attrs)) {
             /* NS view of GICC_APR<n> is the top half of GIC_NSAPR<n> */
-            *data = gic_apr_ns_view(s, regno, cpu);
+            *data = gic_apr_ns_view(s, cpu, regno);
         } else {
             *data = s->apr[regno][cpu];
         }
@@ -XXX,XX +XXX,XX @@ static MemTxResult gic_cpu_write(GICState *s, int cpu, int offset,
             s->h_apr[gic_get_vcpu_real_id(cpu)] = value;
         } else if (gic_cpu_ns_access(s, cpu, attrs)) {
             /* NS view of GICC_APR<n> is the top half of GIC_NSAPR<n> */
-            gic_apr_write_ns_view(s, regno, cpu, value);
+            gic_apr_write_ns_view(s, cpu, regno, value);
         } else {
             s->apr[regno][cpu] = value;
         }
--
2.34.1
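
A generic illustration of why the compiler stayed silent here (not QEMU
code): two parameters of the same integer type can be swapped at a call
site without any diagnostic, whereas distinct wrapper types make the
mistake a compile error. A minimal sketch with hypothetical names:

    /* Two int parameters: swapping them compiles without a warning. */
    static int apr_read_ints(int cpu, int regno)
    {
        return cpu * 100 + regno;
    }

    /* Distinct single-member structs let the compiler enforce order. */
    typedef struct { int v; } CpuNum;
    typedef struct { int v; } RegNum;

    static int apr_read_typed(CpuNum cpu, RegNum regno)
    {
        return cpu.v * 100 + regno.v;
    }

    int demo(void)
    {
        int cpu = 1, regno = 2;
        int a = apr_read_ints(regno, cpu); /* wrong order, accepted silently */
        int b = apr_read_typed((CpuNum){cpu}, (RegNum){regno}); /* checked */
        return a + b;
    }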
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
2
2
3
This commit doesn't add any supported events, but provides the framework
3
Check the function index is in range and use an unsigned
4
for adding them. We store the pm_event structs in a simple array, and
4
variable to avoid the following warning with GCC 13.2.0:
5
provide the mapping from the event numbers to array indexes in the
6
supported_event_map array. Because the value of PMCEID[01] depends upon
7
which events are supported at runtime, generate it dynamically.
8
5
9
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
6
[666/5358] Compiling C object libcommon.fa.p/hw_input_tsc2005.c.o
7
hw/input/tsc2005.c: In function 'tsc2005_timer_tick':
8
hw/input/tsc2005.c:416:26: warning: array subscript has type 'char' [-Wchar-subscripts]
9
416 | s->dav |= mode_regs[s->function];
10
| ~^~~~~~~~~~
11
12
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
13
Message-id: 20240508143513.44996-1-philmd@linaro.org
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20181211151945.29137-10-aaron@os.amperecomputing.com
15
[PMM: fixed missing ')']
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
17
---
14
target/arm/cpu.h | 10 ++++++++
18
hw/input/tsc2005.c | 5 ++++-
15
target/arm/cpu.c | 19 +++++++++------
19
1 file changed, 4 insertions(+), 1 deletion(-)
16
target/arm/cpu64.c | 4 ----
17
target/arm/helper.c | 57 +++++++++++++++++++++++++++++++++++++++++++++
18
4 files changed, 79 insertions(+), 11 deletions(-)
19
20
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
21
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
21
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/cpu.h
23
--- a/hw/input/tsc2005.c
23
+++ b/target/arm/cpu.h
24
+++ b/hw/input/tsc2005.c
24
@@ -XXX,XX +XXX,XX @@ void pmu_op_finish(CPUARMState *env);
25
@@ -XXX,XX +XXX,XX @@ uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len)
25
void pmu_pre_el_change(ARMCPU *cpu, void *ignored);
26
static void tsc2005_timer_tick(void *opaque)
26
void pmu_post_el_change(ARMCPU *cpu, void *ignored);
27
{
27
28
TSC2005State *s = opaque;
28
+/*
29
+ unsigned int function = s->function;
29
+ * get_pmceid
30
+ * @env: CPUARMState
31
+ * @which: which PMCEID register to return (0 or 1)
32
+ *
33
+ * Return the PMCEID[01]_EL0 register values corresponding to the counters
34
+ * which are supported given the current configuration
35
+ */
36
+uint64_t get_pmceid(CPUARMState *env, unsigned which);
37
+
30
+
38
/* SCTLR bit meanings. Several bits have been reused in newer
31
+ assert(function < ARRAY_SIZE(mode_regs));
39
* versions of the architecture; in that case we define constants
32
40
* for both old and new bit meanings. Code which tests against those
33
/* Timer ticked -- a set of conversions has been finished. */
41
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
34
42
index XXXXXXX..XXXXXXX 100644
35
@@ -XXX,XX +XXX,XX @@ static void tsc2005_timer_tick(void *opaque)
43
--- a/target/arm/cpu.c
36
return;
44
+++ b/target/arm/cpu.c
37
45
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
38
s->busy = false;
46
39
- s->dav |= mode_regs[s->function];
47
if (!cpu->has_pmu) {
40
+ s->dav |= mode_regs[function];
48
unset_feature(env, ARM_FEATURE_PMU);
41
s->function = -1;
49
+ }
42
tsc2005_pin_update(s);
50
+ if (arm_feature(env, ARM_FEATURE_PMU)) {
51
+ cpu->pmceid0 = get_pmceid(&cpu->env, 0);
52
+ cpu->pmceid1 = get_pmceid(&cpu->env, 1);
53
+
54
+ if (!kvm_enabled()) {
55
+ arm_register_pre_el_change_hook(cpu, &pmu_pre_el_change, 0);
56
+ arm_register_el_change_hook(cpu, &pmu_post_el_change, 0);
57
+ }
58
+ } else {
59
cpu->id_aa64dfr0 &= ~0xf00;
60
- } else if (!kvm_enabled()) {
61
- arm_register_pre_el_change_hook(cpu, &pmu_pre_el_change, 0);
62
- arm_register_el_change_hook(cpu, &pmu_post_el_change, 0);
63
+ cpu->pmceid0 = 0;
64
+ cpu->pmceid1 = 0;
65
}
66
67
if (!arm_feature(env, ARM_FEATURE_EL2)) {
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
cpu->id_pfr0 = 0x00001131;
cpu->id_pfr1 = 0x00011011;
cpu->id_dfr0 = 0x02010555;
- cpu->pmceid0 = 0x00000000;
- cpu->pmceid1 = 0x00000000;
cpu->id_afr0 = 0x00000000;
cpu->id_mmfr0 = 0x10101105;
cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
cpu->id_pfr0 = 0x00001131;
cpu->id_pfr1 = 0x00011011;
cpu->id_dfr0 = 0x02010555;
- cpu->pmceid0 = 0x0000000;
- cpu->pmceid1 = 0x00000000;
cpu->id_afr0 = 0x00000000;
cpu->id_mmfr0 = 0x10201105;
cpu->id_mmfr1 = 0x20000000;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
cpu->isar.id_isar6 = 0;
cpu->isar.id_aa64pfr0 = 0x00002222;
cpu->id_aa64dfr0 = 0x10305106;
- cpu->pmceid0 = 0x00000000;
- cpu->pmceid1 = 0x00000000;
cpu->isar.id_aa64isar0 = 0x00011120;
cpu->isar.id_aa64mmfr0 = 0x00001124;
cpu->dbgdidr = 0x3516d000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
cpu->isar.id_isar5 = 0x00011121;
cpu->isar.id_aa64pfr0 = 0x00002222;
cpu->id_aa64dfr0 = 0x10305106;
- cpu->pmceid0 = 0x00000000;
- cpu->pmceid1 = 0x00000000;
cpu->isar.id_aa64isar0 = 0x00011120;
cpu->isar.id_aa64mmfr0 = 0x00001124;
cpu->dbgdidr = 0x3516d000;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static inline uint64_t pmu_counter_mask(CPUARMState *env)
return (1 << 31) | ((1 << pmu_num_counters(env)) - 1);
}

+typedef struct pm_event {
+ uint16_t number; /* PMEVTYPER.evtCount is 16 bits wide */
+ /* If the event is supported on this CPU (used to generate PMCEID[01]) */
+ bool (*supported)(CPUARMState *);
+ /*
+ * Retrieve the current count of the underlying event. The programmed
+ * counters hold a difference from the return value from this function
+ */
+ uint64_t (*get_count)(CPUARMState *);
+} pm_event;
+
+static const pm_event pm_events[] = {
+};
+
+/*
+ * Note: Before increasing MAX_EVENT_ID beyond 0x3f into the 0x40xx range of
+ * events (i.e. the statistical profiling extension), this implementation
+ * should first be updated to something sparse instead of the current
+ * supported_event_map[] array.
+ */
+#define MAX_EVENT_ID 0x0
+#define UNSUPPORTED_EVENT UINT16_MAX
+static uint16_t supported_event_map[MAX_EVENT_ID + 1];
+
+/*
+ * Called upon initialization to build PMCEID0_EL0 or PMCEID1_EL0 (indicated by
+ * 'which'). We also use it to build a map of ARM event numbers to indices in
+ * our pm_events array.
+ *
+ * Note: Events in the 0x40XX range are not currently supported.
+ */
+uint64_t get_pmceid(CPUARMState *env, unsigned which)
+{
+ uint64_t pmceid = 0;
+ unsigned int i;
+
+ assert(which <= 1);
+
+ for (i = 0; i < ARRAY_SIZE(supported_event_map); i++) {
+ supported_event_map[i] = UNSUPPORTED_EVENT;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(pm_events); i++) {
+ const pm_event *cnt = &pm_events[i];
+ assert(cnt->number <= MAX_EVENT_ID);
+ /* We do not currently support events in the 0x40xx range */
+ assert(cnt->number <= 0x3f);
+
+ if ((cnt->number & 0x20) == (which << 6) &&
+ cnt->supported(env)) {
+ pmceid |= (1 << (cnt->number & 0x1f));
+ supported_event_map[cnt->number] = i;
+ }
+ }
+ return pmceid;
+}
+
static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
bool isread)
{
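For illustration, here is how an entry is meant to plug into the pm_events[]
table that this patch deliberately leaves empty. This is a sketch only: the
SW_INCR event number (0x000) is the architectural one, but the helper names
and the counter variable are hypothetical, not part of this patch:

    static uint64_t sw_incr_total; /* hypothetical running total */

    static bool event_always_supported(CPUARMState *env)
    {
        return true;
    }

    static uint64_t sw_incr_get_count(CPUARMState *env)
    {
        return sw_incr_total;
    }

    static const pm_event pm_events[] = {
        { .number = 0x000, /* SW_INCR */
          .supported = event_always_supported,
          .get_count = sw_incr_get_count },
    };

get_pmceid() would then advertise the event (numbers 0x00-0x1f land in
PMCEID0, 0x20-0x3f in PMCEID1) and record the table index in
supported_event_map[].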
--
2.20.1

--
2.34.1
From: Aaron Lindsay <aaron@os.amperecomputing.com>

Because of the PMU's design, many register accesses have side effects
which are inter-related, meaning that the normal method of saving CP
registers can result in inconsistent state. These side-effects are
largely handled in pmu_op_start/finish functions which can be called
before and after the state is saved/restored. By doing this and adding
raw read/write functions for the affected registers, we avoid
migration-related inconsistencies.

Signed-off-by: Aaron Lindsay <aclindsa@gmail.com>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20181211151945.29137-4-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 6 ++++--
target/arm/machine.c | 24 ++++++++++++++++++++++++
2 files changed, 28 insertions(+), 2 deletions(-)

From: Tanmay Patil <tanmaynpatil105@gmail.com>

Some of the source files for older devices use hardcoded tabs
instead of our current coding standard's required spaces.
Fix these in the following files:
    - hw/arm/boot.c
    - hw/char/omap_uart.c
    - hw/gpio/zaurus.c
    - hw/input/tsc2005.c

This commit consists mostly of whitespace-only changes; it also
adds curly braces to some 'if' statements.

This addresses part of https://gitlab.com/qemu-project/qemu/-/issues/373
but some other files remain to be handled.

Signed-off-by: Tanmay Patil <tanmaynpatil105@gmail.com>
Message-id: 20240508081502.88375-1-tanmaynpatil105@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/boot.c | 8 +--
hw/char/omap_uart.c | 49 +++++++++--------
hw/gpio/zaurus.c | 59 ++++++++++----------
hw/input/tsc2005.c | 130 ++++++++++++++++++++++++--------------------
4 files changed, 130 insertions(+), 116 deletions(-)
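The "raw" accessors mentioned above are what migration uses to move a
register value without triggering the guest-visible side effects of the
normal readfn/writefn. A minimal sketch of the idea (the real raw_read
and raw_write in helper.c also cope with 32-bit backing fields, so treat
this as a simplification, not the actual implementation):

    /* Migration path: copy the backing storage directly, no side effects. */
    static uint64_t raw_read(CPUARMState *env, const ARMCPRegInfo *ri)
    {
        return *(uint64_t *)((char *)env + ri->fieldoffset);
    }

    static void raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
                          uint64_t value)
    {
        *(uint64_t *)((char *)env + ri->fieldoffset) = value;
    }

This is why the PMCCNTR reginfo in the hunk below gains a fieldoffset
alongside its existing readfn/writefn.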
diff --git a/target/arm/helper.c b/target/arm/helper.c
29
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
22
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/helper.c
31
--- a/hw/arm/boot.c
24
+++ b/target/arm/helper.c
32
+++ b/hw/arm/boot.c
25
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
33
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args_old(const struct arm_boot_info *info,
26
.opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 0,
34
WRITE_WORD(p, info->ram_size / 4096);
27
.access = PL0_RW, .accessfn = pmreg_access_ccntr,
35
/* ramdisk_size */
28
.type = ARM_CP_IO,
36
WRITE_WORD(p, 0);
29
- .readfn = pmccntr_read, .writefn = pmccntr_write, },
37
-#define FLAG_READONLY    1
30
+ .fieldoffset = offsetof(CPUARMState, cp15.c15_ccnt),
38
-#define FLAG_RDLOAD    4
31
+ .readfn = pmccntr_read, .writefn = pmccntr_write,
39
-#define FLAG_RDPROMPT    8
32
+ .raw_readfn = raw_read, .raw_writefn = raw_write, },
40
+#define FLAG_READONLY 1
33
#endif
41
+#define FLAG_RDLOAD 4
34
{ .name = "PMCCFILTR_EL0", .state = ARM_CP_STATE_AA64,
42
+#define FLAG_RDPROMPT 8
35
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 15, .opc2 = 7,
43
/* flags */
36
- .writefn = pmccfiltr_write,
44
WRITE_WORD(p, FLAG_READONLY | FLAG_RDLOAD | FLAG_RDPROMPT);
37
+ .writefn = pmccfiltr_write, .raw_writefn = raw_write,
45
/* rootdev */
38
.access = PL0_RW, .accessfn = pmreg_access,
46
- WRITE_WORD(p, (31 << 8) | 0);    /* /dev/mtdblock0 */
39
.type = ARM_CP_IO,
47
+ WRITE_WORD(p, (31 << 8) | 0); /* /dev/mtdblock0 */
40
.fieldoffset = offsetof(CPUARMState, cp15.pmccfiltr_el0),
48
/* video_num_cols */
41
diff --git a/target/arm/machine.c b/target/arm/machine.c
49
WRITE_WORD(p, 0);
50
/* video_num_rows */
51
diff --git a/hw/char/omap_uart.c b/hw/char/omap_uart.c
42
index XXXXXXX..XXXXXXX 100644
52
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/machine.c
53
--- a/hw/char/omap_uart.c
44
+++ b/target/arm/machine.c
54
+++ b/hw/char/omap_uart.c
45
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
55
@@ -XXX,XX +XXX,XX @@ struct omap_uart_s *omap_uart_init(hwaddr base,
56
s->fclk = fclk;
57
s->irq = irq;
58
s->serial = serial_mm_init(get_system_memory(), base, 2, irq,
59
- omap_clk_getrate(fclk)/16,
60
+ omap_clk_getrate(fclk) / 16,
61
chr ?: qemu_chr_new(label, "null", NULL),
62
DEVICE_NATIVE_ENDIAN);
63
return s;
64
@@ -XXX,XX +XXX,XX @@ static uint64_t omap_uart_read(void *opaque, hwaddr addr, unsigned size)
65
}
66
67
switch (addr) {
68
- case 0x20:    /* MDR1 */
69
+ case 0x20: /* MDR1 */
70
return s->mdr[0];
71
- case 0x24:    /* MDR2 */
72
+ case 0x24: /* MDR2 */
73
return s->mdr[1];
74
- case 0x40:    /* SCR */
75
+ case 0x40: /* SCR */
76
return s->scr;
77
- case 0x44:    /* SSR */
78
+ case 0x44: /* SSR */
79
return 0x0;
80
- case 0x48:    /* EBLR (OMAP2) */
81
+ case 0x48: /* EBLR (OMAP2) */
82
return s->eblr;
83
- case 0x4C:    /* OSC_12M_SEL (OMAP1) */
84
+ case 0x4C: /* OSC_12M_SEL (OMAP1) */
85
return s->clksel;
86
- case 0x50:    /* MVR */
87
+ case 0x50: /* MVR */
88
return 0x30;
89
- case 0x54:    /* SYSC (OMAP2) */
90
+ case 0x54: /* SYSC (OMAP2) */
91
return s->syscontrol;
92
- case 0x58:    /* SYSS (OMAP2) */
93
+ case 0x58: /* SYSS (OMAP2) */
94
return 1;
95
- case 0x5c:    /* WER (OMAP2) */
96
+ case 0x5c: /* WER (OMAP2) */
97
return s->wkup;
98
- case 0x60:    /* CFPS (OMAP2) */
99
+ case 0x60: /* CFPS (OMAP2) */
100
return s->cfps;
101
}
102
103
@@ -XXX,XX +XXX,XX @@ static void omap_uart_write(void *opaque, hwaddr addr,
104
}
105
106
switch (addr) {
107
- case 0x20:    /* MDR1 */
108
+ case 0x20: /* MDR1 */
109
s->mdr[0] = value & 0x7f;
110
break;
111
- case 0x24:    /* MDR2 */
112
+ case 0x24: /* MDR2 */
113
s->mdr[1] = value & 0xff;
114
break;
115
- case 0x40:    /* SCR */
116
+ case 0x40: /* SCR */
117
s->scr = value & 0xff;
118
break;
119
- case 0x48:    /* EBLR (OMAP2) */
120
+ case 0x48: /* EBLR (OMAP2) */
121
s->eblr = value & 0xff;
122
break;
123
- case 0x4C:    /* OSC_12M_SEL (OMAP1) */
124
+ case 0x4C: /* OSC_12M_SEL (OMAP1) */
125
s->clksel = value & 1;
126
break;
127
- case 0x44:    /* SSR */
128
- case 0x50:    /* MVR */
129
- case 0x58:    /* SYSS (OMAP2) */
130
+ case 0x44: /* SSR */
131
+ case 0x50: /* MVR */
132
+ case 0x58: /* SYSS (OMAP2) */
133
OMAP_RO_REG(addr);
134
break;
135
- case 0x54:    /* SYSC (OMAP2) */
136
+ case 0x54: /* SYSC (OMAP2) */
137
s->syscontrol = value & 0x1d;
138
- if (value & 2)
139
+ if (value & 2) {
140
omap_uart_reset(s);
141
+ }
142
break;
143
- case 0x5c:    /* WER (OMAP2) */
144
+ case 0x5c: /* WER (OMAP2) */
145
s->wkup = value & 0x7f;
146
break;
147
- case 0x60:    /* CFPS (OMAP2) */
148
+ case 0x60: /* CFPS (OMAP2) */
149
s->cfps = value & 0xff;
150
break;
151
default:
152
diff --git a/hw/gpio/zaurus.c b/hw/gpio/zaurus.c
153
index XXXXXXX..XXXXXXX 100644
154
--- a/hw/gpio/zaurus.c
155
+++ b/hw/gpio/zaurus.c
156
@@ -XXX,XX +XXX,XX @@ struct ScoopInfo {
157
uint16_t isr;
158
};
159
160
-#define SCOOP_MCR    0x00
161
-#define SCOOP_CDR    0x04
162
-#define SCOOP_CSR    0x08
163
-#define SCOOP_CPR    0x0c
164
-#define SCOOP_CCR    0x10
165
-#define SCOOP_IRR_IRM    0x14
166
-#define SCOOP_IMR    0x18
167
-#define SCOOP_ISR    0x1c
168
-#define SCOOP_GPCR    0x20
169
-#define SCOOP_GPWR    0x24
170
-#define SCOOP_GPRR    0x28
171
+#define SCOOP_MCR 0x00
172
+#define SCOOP_CDR 0x04
173
+#define SCOOP_CSR 0x08
174
+#define SCOOP_CPR 0x0c
175
+#define SCOOP_CCR 0x10
176
+#define SCOOP_IRR_IRM 0x14
177
+#define SCOOP_IMR 0x18
178
+#define SCOOP_ISR 0x1c
179
+#define SCOOP_GPCR 0x20
180
+#define SCOOP_GPWR 0x24
181
+#define SCOOP_GPRR 0x28
182
183
-static inline void scoop_gpio_handler_update(ScoopInfo *s) {
184
+static inline void scoop_gpio_handler_update(ScoopInfo *s)
185
+{
186
uint32_t level, diff;
187
int bit;
188
level = s->gpio_level & s->gpio_dir;
189
@@ -XXX,XX +XXX,XX @@ static void scoop_write(void *opaque, hwaddr addr,
190
break;
191
case SCOOP_CPR:
192
s->power = value;
193
- if (value & 0x80)
194
+ if (value & 0x80) {
195
s->power |= 0x8040;
196
+ }
197
break;
198
case SCOOP_CCR:
199
s->ccr = value;
200
@@ -XXX,XX +XXX,XX @@ static void scoop_write(void *opaque, hwaddr addr,
201
scoop_gpio_handler_update(s);
202
break;
203
case SCOOP_GPWR:
204
- case SCOOP_GPRR:    /* GPRR is probably R/O in real HW */
205
+ case SCOOP_GPRR: /* GPRR is probably R/O in real HW */
206
s->gpio_level = value & s->gpio_dir;
207
scoop_gpio_handler_update(s);
208
break;
209
@@ -XXX,XX +XXX,XX @@ static void scoop_gpio_set(void *opaque, int line, int level)
46
{
210
{
47
ARMCPU *cpu = opaque;
211
ScoopInfo *s = (ScoopInfo *) opaque;
48
212
49
+ if (!kvm_enabled()) {
213
- if (level)
50
+ pmu_op_start(&cpu->env);
214
+ if (level) {
215
s->gpio_level |= (1 << line);
216
- else
217
+ } else {
218
s->gpio_level &= ~(1 << line);
51
+ }
219
+ }
52
+
220
}
53
if (kvm_enabled()) {
221
54
if (!write_kvmstate_to_list(cpu)) {
222
static void scoop_init(Object *obj)
55
/* This should never fail */
223
@@ -XXX,XX +XXX,XX @@ static int scoop_post_load(void *opaque, int version_id)
56
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
57
return 0;
224
return 0;
58
}
225
}
59
226
60
+static int cpu_post_save(void *opaque)
227
-static bool is_version_0 (void *opaque, int version_id)
61
+{
228
+static bool is_version_0(void *opaque, int version_id)
62
+ ARMCPU *cpu = opaque;
229
{
63
+
230
return version_id == 0;
64
+ if (!kvm_enabled()) {
231
}
65
+ pmu_op_finish(&cpu->env);
232
@@ -XXX,XX +XXX,XX @@ type_init(scoop_register_types)
233
234
/* Write the bootloader parameters memory area. */
235
236
-#define MAGIC_CHG(a, b, c, d)    ((d << 24) | (c << 16) | (b << 8) | a)
237
+#define MAGIC_CHG(a, b, c, d) ((d << 24) | (c << 16) | (b << 8) | a)
238
239
static struct QEMU_PACKED sl_param_info {
240
uint32_t comadj_keyword;
241
@@ -XXX,XX +XXX,XX @@ static struct QEMU_PACKED sl_param_info {
242
uint32_t phad_keyword;
243
int32_t phadadj;
244
} zaurus_bootparam = {
245
- .comadj_keyword    = MAGIC_CHG('C', 'M', 'A', 'D'),
246
- .comadj        = 125,
247
- .uuid_keyword    = MAGIC_CHG('U', 'U', 'I', 'D'),
248
- .uuid        = { -1 },
249
- .touch_keyword    = MAGIC_CHG('T', 'U', 'C', 'H'),
250
- .touch_xp        = -1,
251
- .adadj_keyword    = MAGIC_CHG('B', 'V', 'A', 'D'),
252
- .adadj        = -1,
253
- .phad_keyword    = MAGIC_CHG('P', 'H', 'A', 'D'),
254
- .phadadj        = 0x01,
255
+ .comadj_keyword = MAGIC_CHG('C', 'M', 'A', 'D'),
256
+ .comadj = 125,
257
+ .uuid_keyword = MAGIC_CHG('U', 'U', 'I', 'D'),
258
+ .uuid = { -1 },
259
+ .touch_keyword = MAGIC_CHG('T', 'U', 'C', 'H'),
260
+ .touch_xp = -1,
261
+ .adadj_keyword = MAGIC_CHG('B', 'V', 'A', 'D'),
262
+ .adadj = -1,
263
+ .phad_keyword = MAGIC_CHG('P', 'H', 'A', 'D'),
264
+ .phadadj = 0x01,
265
};
266
267
void sl_bootparam_write(hwaddr ptr)
268
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
269
index XXXXXXX..XXXXXXX 100644
270
--- a/hw/input/tsc2005.c
271
+++ b/hw/input/tsc2005.c
272
@@ -XXX,XX +XXX,XX @@
273
#include "migration/vmstate.h"
274
#include "trace.h"
275
276
-#define TSC_CUT_RESOLUTION(value, p)    ((value) >> (16 - (p ? 12 : 10)))
277
+#define TSC_CUT_RESOLUTION(value, p) ((value) >> (16 - (p ? 12 : 10)))
278
279
typedef struct {
280
- qemu_irq pint;    /* Combination of the nPENIRQ and DAV signals */
281
+ qemu_irq pint; /* Combination of the nPENIRQ and DAV signals */
282
QEMUTimer *timer;
283
uint16_t model;
284
285
@@ -XXX,XX +XXX,XX @@ typedef struct {
286
} TSC2005State;
287
288
enum {
289
- TSC_MODE_XYZ_SCAN    = 0x0,
290
+ TSC_MODE_XYZ_SCAN = 0x0,
291
TSC_MODE_XY_SCAN,
292
TSC_MODE_X,
293
TSC_MODE_Y,
294
@@ -XXX,XX +XXX,XX @@ enum {
295
};
296
297
static const uint16_t mode_regs[16] = {
298
- 0xf000,    /* X, Y, Z scan */
299
- 0xc000,    /* X, Y scan */
300
- 0x8000,    /* X */
301
- 0x4000,    /* Y */
302
- 0x3000,    /* Z */
303
- 0x0800,    /* AUX */
304
- 0x0400,    /* TEMP1 */
305
- 0x0200,    /* TEMP2 */
306
- 0x0800,    /* AUX scan */
307
- 0x0040,    /* X test */
308
- 0x0020,    /* Y test */
309
- 0x0080,    /* Short-circuit test */
310
- 0x0000,    /* Reserved */
311
- 0x0000,    /* X+, X- drivers */
312
- 0x0000,    /* Y+, Y- drivers */
313
- 0x0000,    /* Y+, X- drivers */
314
+ 0xf000, /* X, Y, Z scan */
315
+ 0xc000, /* X, Y scan */
316
+ 0x8000, /* X */
317
+ 0x4000, /* Y */
318
+ 0x3000, /* Z */
319
+ 0x0800, /* AUX */
320
+ 0x0400, /* TEMP1 */
321
+ 0x0200, /* TEMP2 */
322
+ 0x0800, /* AUX scan */
323
+ 0x0040, /* X test */
324
+ 0x0020, /* Y test */
325
+ 0x0080, /* Short-circuit test */
326
+ 0x0000, /* Reserved */
327
+ 0x0000, /* X+, X- drivers */
328
+ 0x0000, /* Y+, Y- drivers */
329
+ 0x0000, /* Y+, X- drivers */
330
};
331
332
-#define X_TRANSFORM(s)            \
333
+#define X_TRANSFORM(s) \
334
((s->y * s->tr[0] - s->x * s->tr[1]) / s->tr[2] + s->tr[3])
335
-#define Y_TRANSFORM(s)            \
336
+#define Y_TRANSFORM(s) \
337
((s->y * s->tr[4] - s->x * s->tr[5]) / s->tr[6] + s->tr[7])
338
-#define Z1_TRANSFORM(s)            \
339
+#define Z1_TRANSFORM(s) \
340
((400 - ((s)->x >> 7) + ((s)->pressure << 10)) << 4)
341
-#define Z2_TRANSFORM(s)            \
342
+#define Z2_TRANSFORM(s) \
343
((4000 + ((s)->y >> 7) - ((s)->pressure << 10)) << 4)
344
345
-#define AUX_VAL                (700 << 4)    /* +/- 3 at 12-bit */
346
-#define TEMP1_VAL            (1264 << 4)    /* +/- 5 at 12-bit */
347
-#define TEMP2_VAL            (1531 << 4)    /* +/- 5 at 12-bit */
348
+#define AUX_VAL (700 << 4) /* +/- 3 at 12-bit */
349
+#define TEMP1_VAL (1264 << 4) /* +/- 5 at 12-bit */
350
+#define TEMP2_VAL (1531 << 4) /* +/- 5 at 12-bit */
351
352
static uint16_t tsc2005_read(TSC2005State *s, int reg)
353
{
354
uint16_t ret;
355
356
switch (reg) {
357
- case 0x0:    /* X */
358
+ case 0x0: /* X */
359
s->dav &= ~mode_regs[TSC_MODE_X];
360
return TSC_CUT_RESOLUTION(X_TRANSFORM(s), s->precision) +
361
(s->noise & 3);
362
- case 0x1:    /* Y */
363
+ case 0x1: /* Y */
364
s->dav &= ~mode_regs[TSC_MODE_Y];
365
- s->noise ++;
366
+ s->noise++;
367
return TSC_CUT_RESOLUTION(Y_TRANSFORM(s), s->precision) ^
368
(s->noise & 3);
369
- case 0x2:    /* Z1 */
370
+ case 0x2: /* Z1 */
371
s->dav &= 0xdfff;
372
return TSC_CUT_RESOLUTION(Z1_TRANSFORM(s), s->precision) -
373
(s->noise & 3);
374
- case 0x3:    /* Z2 */
375
+ case 0x3: /* Z2 */
376
s->dav &= 0xefff;
377
return TSC_CUT_RESOLUTION(Z2_TRANSFORM(s), s->precision) |
378
(s->noise & 3);
379
380
- case 0x4:    /* AUX */
381
+ case 0x4: /* AUX */
382
s->dav &= ~mode_regs[TSC_MODE_AUX];
383
return TSC_CUT_RESOLUTION(AUX_VAL, s->precision);
384
385
- case 0x5:    /* TEMP1 */
386
+ case 0x5: /* TEMP1 */
387
s->dav &= ~mode_regs[TSC_MODE_TEMP1];
388
return TSC_CUT_RESOLUTION(TEMP1_VAL, s->precision) -
389
(s->noise & 5);
390
- case 0x6:    /* TEMP2 */
391
+ case 0x6: /* TEMP2 */
392
s->dav &= 0xdfff;
393
s->dav &= ~mode_regs[TSC_MODE_TEMP2];
394
return TSC_CUT_RESOLUTION(TEMP2_VAL, s->precision) ^
395
(s->noise & 3);
396
397
- case 0x7:    /* Status */
398
+ case 0x7: /* Status */
399
ret = s->dav | (s->reset << 7) | (s->pdst << 2) | 0x0;
400
s->dav &= ~(mode_regs[TSC_MODE_X_TEST] | mode_regs[TSC_MODE_Y_TEST] |
401
mode_regs[TSC_MODE_TS_TEST]);
402
s->reset = true;
403
return ret;
404
405
- case 0x8: /* AUX high threshold */
406
+ case 0x8: /* AUX high threshold */
407
return s->aux_thr[1];
408
- case 0x9: /* AUX low threshold */
409
+ case 0x9: /* AUX low threshold */
410
return s->aux_thr[0];
411
412
- case 0xa: /* TEMP high threshold */
413
+ case 0xa: /* TEMP high threshold */
414
return s->temp_thr[1];
415
- case 0xb: /* TEMP low threshold */
416
+ case 0xb: /* TEMP low threshold */
417
return s->temp_thr[0];
418
419
- case 0xc:    /* CFR0 */
420
+ case 0xc: /* CFR0 */
421
return (s->pressure << 15) | ((!s->busy) << 14) |
422
- (s->nextprecision << 13) | s->timing[0];
423
- case 0xd:    /* CFR1 */
424
+ (s->nextprecision << 13) | s->timing[0];
425
+ case 0xd: /* CFR1 */
426
return s->timing[1];
427
- case 0xe:    /* CFR2 */
428
+ case 0xe: /* CFR2 */
429
return (s->pin_func << 14) | s->filter;
430
431
- case 0xf:    /* Function select status */
432
+ case 0xf: /* Function select status */
433
return s->function >= 0 ? 1 << s->function : 0;
434
}
435
436
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
437
s->temp_thr[0] = data;
438
break;
439
440
- case 0xc:    /* CFR0 */
441
+ case 0xc: /* CFR0 */
442
s->host_mode = (data >> 15) != 0;
443
if (s->enabled != !(data & 0x4000)) {
444
s->enabled = !(data & 0x4000);
445
trace_tsc2005_sense(s->enabled ? "enabled" : "disabled");
446
- if (s->busy && !s->enabled)
447
+ if (s->busy && !s->enabled) {
448
timer_del(s->timer);
449
+ }
450
s->busy = s->busy && s->enabled;
451
}
452
s->nextprecision = (data >> 13) & 1;
453
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
454
"tsc2005_write: illegal conversion clock setting\n");
455
}
456
break;
457
- case 0xd:    /* CFR1 */
458
+ case 0xd: /* CFR1 */
459
s->timing[1] = data & 0xf07;
460
break;
461
- case 0xe:    /* CFR2 */
462
+ case 0xe: /* CFR2 */
463
s->pin_func = (data >> 14) & 3;
464
s->filter = data & 0x3fff;
465
break;
466
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
467
switch (s->nextfunction) {
468
case TSC_MODE_XYZ_SCAN:
469
case TSC_MODE_XY_SCAN:
470
- if (!s->host_mode && s->dav)
471
+ if (!s->host_mode && s->dav) {
472
s->enabled = false;
473
- if (!s->pressure)
474
+ }
475
+ if (!s->pressure) {
476
return;
477
+ }
478
/* Fall through */
479
case TSC_MODE_AUX_SCAN:
480
break;
481
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
482
case TSC_MODE_X:
483
case TSC_MODE_Y:
484
case TSC_MODE_Z:
485
- if (!s->pressure)
486
+ if (!s->pressure) {
487
return;
488
+ }
489
/* Fall through */
490
case TSC_MODE_AUX:
491
case TSC_MODE_TEMP1:
492
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
493
case TSC_MODE_X_TEST:
494
case TSC_MODE_Y_TEST:
495
case TSC_MODE_TS_TEST:
496
- if (s->dav)
497
+ if (s->dav) {
498
s->enabled = false;
499
+ }
500
break;
501
502
case TSC_MODE_RESERVED:
503
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
504
return;
505
}
506
507
- if (!s->enabled || s->busy)
508
+ if (!s->enabled || s->busy) {
509
return;
66
+ }
510
+ }
67
+
511
68
+ return 0;
512
s->busy = true;
69
+}
513
s->precision = s->nextprecision;
70
+
514
s->function = s->nextfunction;
71
static int cpu_pre_load(void *opaque)
515
- s->pdst = !s->pnd0;    /* Synchronised on internal clock */
72
{
516
+ s->pdst = !s->pnd0; /* Synchronised on internal clock */
73
ARMCPU *cpu = opaque;
517
expires = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
74
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_load(void *opaque)
518
(NANOSECONDS_PER_SECOND >> 7);
519
timer_mod(s->timer, expires);
520
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
521
TSC2005State *s = opaque;
522
uint32_t ret = 0;
523
524
- switch (s->state ++) {
525
+ switch (s->state++) {
526
case 0:
527
if (value & 0x80) {
528
/* Command */
529
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
530
if (s->enabled != !(value & 1)) {
531
s->enabled = !(value & 1);
532
trace_tsc2005_sense(s->enabled ? "enabled" : "disabled");
533
- if (s->busy && !s->enabled)
534
+ if (s->busy && !s->enabled) {
535
timer_del(s->timer);
536
+ }
537
s->busy = s->busy && s->enabled;
538
}
539
tsc2005_pin_update(s);
540
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
541
break;
542
543
case 1:
544
- if (s->command)
545
+ if (s->command) {
546
ret = (s->data >> 8) & 0xff;
547
- else
548
+ } else {
549
s->data |= value << 8;
550
+ }
551
break;
552
553
case 2:
554
@@ -XXX,XX +XXX,XX @@ static void tsc2005_timer_tick(void *opaque)
555
556
/* Timer ticked -- a set of conversions has been finished. */
557
558
- if (!s->busy)
559
+ if (!s->busy) {
560
return;
561
+ }
562
563
s->busy = false;
564
s->dav |= mode_regs[function];
565
@@ -XXX,XX +XXX,XX @@ static void tsc2005_touchscreen_event(void *opaque,
566
* signaling TS events immediately, but for now we simulate
567
* the first conversion delay for sake of correctness.
75
*/
568
*/
76
env->irq_line_state = UINT32_MAX;
569
- if (p != s->pressure)
77
570
+ if (p != s->pressure) {
78
+ if (!kvm_enabled()) {
571
tsc2005_pin_update(s);
79
+ pmu_op_start(&cpu->env);
80
+ }
572
+ }
81
+
82
return 0;
83
}
573
}
84
574
85
@@ -XXX,XX +XXX,XX @@ static int cpu_post_load(void *opaque, int version_id)
575
static int tsc2005_post_load(void *opaque, int version_id)
86
hw_breakpoint_update_all(cpu);
87
hw_watchpoint_update_all(cpu);
88
89
+ if (!kvm_enabled()) {
90
+ pmu_op_finish(&cpu->env);
91
+ }
92
+
93
return 0;
94
}
95
96
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
97
.version_id = 22,
98
.minimum_version_id = 22,
99
.pre_save = cpu_pre_save,
100
+ .post_save = cpu_post_save,
101
.pre_load = cpu_pre_load,
102
.post_load = cpu_post_load,
103
.fields = (VMStateField[]) {
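A sketch of the resulting ordering on the outgoing side, assuming the
usual VMState flow in which pre_save runs before the fields are
serialized and the new post_save hook runs after:

    /* per-CPU outgoing migration:
     *   cpu_pre_save()  -> pmu_op_start()   counters frozen, state consistent
     *   ...VMStateField array written out...
     *   cpu_post_save() -> pmu_op_finish()  counters resume on the source
     */

so the PMU is only stopped for the duration of the snapshot.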
--
2.20.1

--
2.34.1
From: Aaron Lindsay <aaron@os.amperecomputing.com>

Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20181211151945.29137-9-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/cpu.h | 4 ++--
target/arm/helper.c | 19 +++++++++++++++++--
2 files changed, 19 insertions(+), 4 deletions(-)
14
diff --git a/docs/system/arm/raspi.rst b/docs/system/arm/raspi.rst
13
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/cpu.h
16
--- a/docs/system/arm/raspi.rst
15
+++ b/target/arm/cpu.h
17
+++ b/docs/system/arm/raspi.rst
16
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
18
@@ -XXX,XX +XXX,XX @@ Implemented devices
17
uint32_t id_pfr0;
19
Missing devices
18
uint32_t id_pfr1;
20
---------------
19
uint32_t id_dfr0;
21
20
- uint32_t pmceid0;
22
- * Analog to Digital Converter (ADC)
21
- uint32_t pmceid1;
23
* Pulse Width Modulation (PWM)
22
+ uint64_t pmceid0;
24
* PCIE Root Port (raspi4b)
23
+ uint64_t pmceid1;
25
* GENET Ethernet Controller (raspi4b)
24
uint32_t id_afr0;
25
uint32_t id_mmfr0;
26
uint32_t id_mmfr1;
27
diff --git a/target/arm/helper.c b/target/arm/helper.c
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/helper.c
30
+++ b/target/arm/helper.c
31
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
32
} else {
33
define_arm_cp_regs(cpu, not_v7_cp_reginfo);
34
}
35
+ if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
36
+ FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
37
+ ARMCPRegInfo v81_pmu_regs[] = {
38
+ { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
39
+ .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
40
+ .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
41
+ .resetvalue = extract64(cpu->pmceid0, 32, 32) },
42
+ { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
43
+ .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
44
+ .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
45
+ .resetvalue = extract64(cpu->pmceid1, 32, 32) },
46
+ REGINFO_SENTINEL
47
+ };
48
+ define_arm_cp_regs(cpu, v81_pmu_regs);
49
+ }
50
if (arm_feature(env, ARM_FEATURE_V8)) {
51
/* AArch64 ID registers, which all have impdef reset values.
52
* Note that within the ID register ranges the unused slots
53
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
54
{ .name = "PMCEID0", .state = ARM_CP_STATE_AA32,
55
.cp = 15, .opc1 = 0, .crn = 9, .crm = 12, .opc2 = 6,
56
.access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
57
- .resetvalue = cpu->pmceid0 },
58
+ .resetvalue = extract64(cpu->pmceid0, 0, 32) },
59
{ .name = "PMCEID0_EL0", .state = ARM_CP_STATE_AA64,
60
.opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 6,
61
.access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
62
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
63
{ .name = "PMCEID1", .state = ARM_CP_STATE_AA32,
64
.cp = 15, .opc1 = 0, .crn = 9, .crm = 12, .opc2 = 7,
65
.access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
66
- .resetvalue = cpu->pmceid1 },
67
+ .resetvalue = extract64(cpu->pmceid1, 0, 32) },
68
{ .name = "PMCEID1_EL0", .state = ARM_CP_STATE_AA64,
69
.opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 7,
70
.access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
71
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

We will want to check TBI for I and D simultaneously.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/internals.h | 15 ++++++++++++---
target/arm/helper.c | 10 ++++++++--
2 files changed, 20 insertions(+), 5 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

This fixes a bug: neither PLI nor PLDW is present in ARMv6T2; they
are introduced with ARMv7 and ARMv7MP respectively.
For clarity, do not use NOP for PLD.

Note that there is no PLDW (literal). Architecturally in the
T1 encoding of "PLD (literal)" bit 5 is "(0)", which means
that it should be zero and if it is not then the behaviour
is CONSTRAINED UNPREDICTABLE (might UNDEF, NOP, or ignore the
value of the bit).

In our implementation we have patterns for both:

+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)

and so we effectively ignore the value of bit 5. (This is a
permitted option for this CONSTRAINED UNPREDICTABLE.) This isn't a
behaviour change in this commit, since we previously had NOP lines
for both those patterns.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-3-richard.henderson@linaro.org
[PMM: adjusted commit message to note that PLD (lit) T1 bit 5
being 1 is an UNPREDICTABLE case.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/t32.decode | 25 ++++++++++++-------------
target/arm/tcg/translate.c | 4 ++--
2 files changed, 14 insertions(+), 15 deletions(-)
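For reference, giving these hint encodings named patterns means each one
now lands in its own trans_* hook instead of the catch-all NOP line; the
hooks perform only the architecture-version check, since a hint generates
no code. A sketch of the shape (trans_PLD with this body already exists
in translate.c; the PLDW and PLI hooks in the diff below differ only in
the feature they test):

    static bool trans_PLD(DisasContext *s, arg_PLD *a)
    {
        /* Decode-only: PLD is a hint, nothing to generate. */
        return ENABLE_ARCH_5TE;
    }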
diff --git a/target/arm/internals.h b/target/arm/internals.h
34
diff --git a/target/arm/tcg/t32.decode b/target/arm/tcg/t32.decode
15
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/internals.h
36
--- a/target/arm/tcg/t32.decode
17
+++ b/target/arm/internals.h
37
+++ b/target/arm/tcg/t32.decode
18
@@ -XXX,XX +XXX,XX @@ typedef struct ARMVAParameters {
38
@@ -XXX,XX +XXX,XX @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
19
} ARMVAParameters;
39
# Note that Load, unsigned (literal) overlaps all other load encodings.
20
21
#ifdef CONFIG_USER_ONLY
22
-static inline ARMVAParameters aa64_va_parameters(CPUARMState *env,
23
- uint64_t va,
24
- ARMMMUIdx mmu_idx, bool data)
25
+static inline ARMVAParameters aa64_va_parameters_both(CPUARMState *env,
26
+ uint64_t va,
27
+ ARMMMUIdx mmu_idx)
28
{
40
{
29
return (ARMVAParameters) {
41
{
30
/* 48-bit address space */
42
- NOP 1111 1000 -001 1111 1111 ------------ # PLD
31
@@ -XXX,XX +XXX,XX @@ static inline ARMVAParameters aa64_va_parameters(CPUARMState *env,
43
+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
32
.tbi = false,
44
LDRB_ri 1111 1000 .001 1111 .... ............ @ldst_ri_lit
33
};
45
}
46
{
47
- NOP 1111 1000 1001 ---- 1111 ------------ # PLD
48
+ PLD 1111 1000 1001 ---- 1111 ------------ # (immediate T1)
49
LDRB_ri 1111 1000 1001 .... .... ............ @ldst_ri_pos
50
}
51
LDRB_ri 1111 1000 0001 .... .... 1..1 ........ @ldst_ri_idx
52
{
53
- NOP 1111 1000 0001 ---- 1111 1100 -------- # PLD
54
+ PLD 1111 1000 0001 ---- 1111 1100 -------- # (immediate T2)
55
LDRB_ri 1111 1000 0001 .... .... 1100 ........ @ldst_ri_neg
56
}
57
LDRBT_ri 1111 1000 0001 .... .... 1110 ........ @ldst_ri_unp
58
{
59
- NOP 1111 1000 0001 ---- 1111 000000 -- ---- # PLD
60
+ PLD 1111 1000 0001 ---- 1111 000000 -- ---- # (register)
61
LDRB_rr 1111 1000 0001 .... .... 000000 .. .... @ldst_rr
62
}
34
}
63
}
35
+
64
{
36
+static inline ARMVAParameters aa64_va_parameters(CPUARMState *env,
65
{
37
+ uint64_t va,
66
- NOP 1111 1000 -011 1111 1111 ------------ # PLD
38
+ ARMMMUIdx mmu_idx, bool data)
67
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)
39
+{
68
LDRH_ri 1111 1000 .011 1111 .... ............ @ldst_ri_lit
40
+ return aa64_va_parameters_both(env, va, mmu_idx);
69
}
41
+}
70
{
42
#else
71
- NOP 1111 1000 1011 ---- 1111 ------------ # PLDW
43
+ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
72
+ PLDW 1111 1000 1011 ---- 1111 ------------ # (immediate T1)
44
+ ARMMMUIdx mmu_idx);
73
LDRH_ri 1111 1000 1011 .... .... ............ @ldst_ri_pos
45
ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
74
}
46
ARMMMUIdx mmu_idx, bool data);
75
LDRH_ri 1111 1000 0011 .... .... 1..1 ........ @ldst_ri_idx
47
#endif
76
{
48
diff --git a/target/arm/helper.c b/target/arm/helper.c
77
- NOP 1111 1000 0011 ---- 1111 1100 -------- # PLDW
78
+ PLDW 1111 1000 0011 ---- 1111 1100 -------- # (immediate T2)
79
LDRH_ri 1111 1000 0011 .... .... 1100 ........ @ldst_ri_neg
80
}
81
LDRHT_ri 1111 1000 0011 .... .... 1110 ........ @ldst_ri_unp
82
{
83
- NOP 1111 1000 0011 ---- 1111 000000 -- ---- # PLDW
84
+ PLDW 1111 1000 0011 ---- 1111 000000 -- ---- # (register)
85
LDRH_rr 1111 1000 0011 .... .... 000000 .. .... @ldst_rr
86
}
87
}
88
@@ -XXX,XX +XXX,XX @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
89
LDRT_ri 1111 1000 0101 .... .... 1110 ........ @ldst_ri_unp
90
LDR_rr 1111 1000 0101 .... .... 000000 .. .... @ldst_rr
91
}
92
-# NOPs here are PLI.
93
{
94
{
95
- NOP 1111 1001 -001 1111 1111 ------------
96
+ PLI 1111 1001 -001 1111 1111 ------------ # (literal T3)
97
LDRSB_ri 1111 1001 .001 1111 .... ............ @ldst_ri_lit
98
}
99
{
100
- NOP 1111 1001 1001 ---- 1111 ------------
101
+ PLI 1111 1001 1001 ---- 1111 ------------ # (immediate T1)
102
LDRSB_ri 1111 1001 1001 .... .... ............ @ldst_ri_pos
103
}
104
LDRSB_ri 1111 1001 0001 .... .... 1..1 ........ @ldst_ri_idx
105
{
106
- NOP 1111 1001 0001 ---- 1111 1100 --------
107
+ PLI 1111 1001 0001 ---- 1111 1100 -------- # (immediate T2)
108
LDRSB_ri 1111 1001 0001 .... .... 1100 ........ @ldst_ri_neg
109
}
110
LDRSBT_ri 1111 1001 0001 .... .... 1110 ........ @ldst_ri_unp
111
{
112
- NOP 1111 1001 0001 ---- 1111 000000 -- ----
113
+ PLI 1111 1001 0001 ---- 1111 000000 -- ---- # (register)
114
LDRSB_rr 1111 1001 0001 .... .... 000000 .. .... @ldst_rr
115
}
116
}
117
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
49
index XXXXXXX..XXXXXXX 100644
118
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/helper.c
119
--- a/target/arm/tcg/translate.c
51
+++ b/target/arm/helper.c
120
+++ b/target/arm/tcg/translate.c
52
@@ -XXX,XX +XXX,XX @@ static uint8_t convert_stage2_attrs(CPUARMState *env, uint8_t s2attrs)
121
@@ -XXX,XX +XXX,XX @@ static bool trans_PLD(DisasContext *s, arg_PLD *a)
53
return (hiattr << 6) | (hihint << 4) | (loattr << 2) | lohint;
122
return ENABLE_ARCH_5TE;
54
}
123
}
55
124
56
-ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
125
-static bool trans_PLDW(DisasContext *s, arg_PLD *a)
57
- ARMMMUIdx mmu_idx, bool data)
126
+static bool trans_PLDW(DisasContext *s, arg_PLDW *a)
58
+ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
59
+ ARMMMUIdx mmu_idx)
60
{
127
{
61
uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
128
return arm_dc_feature(s, ARM_FEATURE_V7MP);
62
uint32_t el = regime_el(env, mmu_idx);
63
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
64
};
65
}
129
}
66
130
67
+ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
131
-static bool trans_PLI(DisasContext *s, arg_PLD *a)
68
+ ARMMMUIdx mmu_idx, bool data)
132
+static bool trans_PLI(DisasContext *s, arg_PLI *a)
69
+{
70
+ return aa64_va_parameters_both(env, va, mmu_idx);
71
+}
72
+
73
static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
74
ARMMMUIdx mmu_idx)
75
{
133
{
134
return ENABLE_ARCH_7;
135
}
76
--
136
--
77
2.20.1
137
2.34.1
78
79
From: Richard Henderson <richard.henderson@linaro.org>

We will shortly want to talk about TBI as it relates to data.
Passing around a pair of variables is less convenient than a
single variable.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/cpu.h | 3 +--
target/arm/translate.h | 3 +--
target/arm/helper.c | 5 ++---
target/arm/translate-a64.c | 13 +++++++------
4 files changed, 11 insertions(+), 13 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Fixes RISU mismatch for "fcvtzs h31, h0, #14".

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-a64.c | 3 +++
1 file changed, 3 insertions(+)
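The merged TBII field keeps TBI1 in bit 1 and TBI0 in bit 0, so the three
cases gen_a64_set_pc() distinguishes fall out of a single integer compare;
a sketch of the resulting logic:

    int tbii = (tbi1 << 1) | tbi0;  /* matches FIELD(TBFLAG_A64, TBII, 0, 2) */
    if (tbii == 0) {
        /* neither bit set: use the address as-is */
    } else if (tbii == 3) {
        /* both bits set: sign extension from bit 55 covers bits [63:56] */
    } else {
        /* mixed: the generated code must test bit 55 at runtime */
    }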
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
13
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
19
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/cpu.h
15
--- a/target/arm/tcg/translate-a64.c
21
+++ b/target/arm/cpu.h
16
+++ b/target/arm/tcg/translate-a64.c
22
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, HANDLER, 21, 1)
17
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
23
FIELD(TBFLAG_A32, STACKCHECK, 22, 1)
18
read_vec_element_i32(s, tcg_op, rn, pass, size);
24
19
fn(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
25
/* Bit usage when in AArch64 state */
20
if (is_scalar) {
26
-FIELD(TBFLAG_A64, TBI0, 0, 1)
21
+ if (size == MO_16 && !is_u) {
27
-FIELD(TBFLAG_A64, TBI1, 1, 1)
22
+ tcg_gen_ext16u_i32(tcg_op, tcg_op);
28
+FIELD(TBFLAG_A64, TBII, 0, 2)
23
+ }
29
FIELD(TBFLAG_A64, SVEEXC_EL, 2, 2)
24
write_fp_sreg(s, rd, tcg_op);
30
FIELD(TBFLAG_A64, ZCR_LEN, 4, 4)
25
} else {
31
FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1)
26
write_vec_element_i32(s, tcg_op, rd, pass, size);
32
diff --git a/target/arm/translate.h b/target/arm/translate.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/translate.h
35
+++ b/target/arm/translate.h
36
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
37
int user;
38
#endif
39
ARMMMUIdx mmu_idx; /* MMU index to use for normal loads/stores */
40
- bool tbi0; /* TBI0 for EL0/1 or TBI for EL2/3 */
41
- bool tbi1; /* TBI1 for EL0/1, not used for EL2/3 */
42
+ uint8_t tbii; /* TBI1|TBI0 for EL0/1 or TBI for EL2/3 */
43
bool ns; /* Use non-secure CPREG bank on access */
44
int fp_excp_el; /* FP exception EL or 0 if enabled */
45
int sve_excp_el; /* SVE exception EL or 0 if enabled */
46
diff --git a/target/arm/helper.c b/target/arm/helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/helper.c
49
+++ b/target/arm/helper.c
50
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
51
*pc = env->pc;
52
flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
53
/* Get control bits for tagged addresses */
54
- flags = FIELD_DP32(flags, TBFLAG_A64, TBI0,
55
+ flags = FIELD_DP32(flags, TBFLAG_A64, TBII,
56
+ (arm_regime_tbi1(env, mmu_idx) << 1) |
57
arm_regime_tbi0(env, mmu_idx));
58
- flags = FIELD_DP32(flags, TBFLAG_A64, TBI1,
59
- arm_regime_tbi1(env, mmu_idx));
60
61
if (cpu_isar_feature(aa64_sve, cpu)) {
62
int sve_el = sve_exception_el(env, current_el);
63
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/translate-a64.c
66
+++ b/target/arm/translate-a64.c
67
@@ -XXX,XX +XXX,XX @@ void gen_a64_set_pc_im(uint64_t val)
68
*/
69
static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
70
{
71
+ /* Note that TBII is TBI1:TBI0. */
72
+ int tbi = s->tbii;
73
74
if (s->current_el <= 1) {
75
/* Test if NEITHER or BOTH TBI values are set. If so, no need to
76
* examine bit 55 of address, can just generate code.
77
* If mixed, then test via generated code
78
*/
79
- if (s->tbi0 && s->tbi1) {
80
+ if (tbi == 3) {
81
TCGv_i64 tmp_reg = tcg_temp_new_i64();
82
/* Both bits set, sign extension from bit 55 into [63:56] will
83
* cover both cases
84
@@ -XXX,XX +XXX,XX @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
85
tcg_gen_shli_i64(tmp_reg, src, 8);
86
tcg_gen_sari_i64(cpu_pc, tmp_reg, 8);
87
tcg_temp_free_i64(tmp_reg);
88
- } else if (!s->tbi0 && !s->tbi1) {
89
+ } else if (tbi == 0) {
90
/* Neither bit set, just load it as-is */
91
tcg_gen_mov_i64(cpu_pc, src);
92
} else {
93
@@ -XXX,XX +XXX,XX @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
94
95
tcg_gen_andi_i64(tcg_bit55, src, (1ull << 55));
96
97
- if (s->tbi0) {
98
+ if (tbi == 1) {
99
/* tbi0==1, tbi1==0, so 0-fill upper byte if bit 55 = 0 */
100
tcg_gen_andi_i64(tcg_tmpval, src,
101
0x00FFFFFFFFFFFFFFull);
102
@@ -XXX,XX +XXX,XX @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
103
tcg_temp_free_i64(tcg_tmpval);
104
}
105
} else { /* EL > 1 */
106
- if (s->tbi0) {
107
+ if (tbi != 0) {
108
/* Force tag byte to all zero */
109
tcg_gen_andi_i64(cpu_pc, src, 0x00FFFFFFFFFFFFFFull);
110
} else {
111
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
112
dc->condexec_cond = 0;
113
core_mmu_idx = FIELD_EX32(tb_flags, TBFLAG_ANY, MMUIDX);
114
dc->mmu_idx = core_to_arm_mmu_idx(env, core_mmu_idx);
115
- dc->tbi0 = FIELD_EX32(tb_flags, TBFLAG_A64, TBI0);
116
- dc->tbi1 = FIELD_EX32(tb_flags, TBFLAG_A64, TBI1);
117
+ dc->tbii = FIELD_EX32(tb_flags, TBFLAG_A64, TBII);
118
dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
119
#if !defined(CONFIG_USER_ONLY)
120
dc->user = (dc->current_el == 0);
121
--
27
--
122
2.20.1
28
2.34.1
123
124
From: Richard Henderson <richard.henderson@linaro.org>

Not that there are any stores involved, but why argue with ARM's
naming convention.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-15-richard.henderson@linaro.org
[fixed trivial comment nit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-a64.c | 61 ++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)

From: Richard Henderson <richard.henderson@linaro.org>

The decode of FMOV (vector, immediate, half-precision) vs the
invalid cases of MOVI is incorrect.

Fixes RISU mismatch for invalid insn 0x2f01fd31.

Fixes: 70b4e6a4457 ("arm/translate-a64: add FP16 FMOV to simd_mod_imm")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-a64.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
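The new PAC load form (the LDRAA/LDRAB encodings) builds its immediate
from S:imm9 scaled by the access size. A worked example of the two lines
in disas_ldst_pac() below, taking S=1, imm9=0x1ff, size=3:

    offset = (1 << 9) | 0x1ff;               /* 0x3ff                      */
    offset = sextract32(0x3ff << 3, 0, 13);  /* 10+size = 13 bits -> -8    */

so the all-ones pattern is a byte offset of -8, not +0x1ff8.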
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
17
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
16
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-a64.c
19
--- a/target/arm/tcg/translate-a64.c
18
+++ b/target/arm/translate-a64.c
20
+++ b/target/arm/tcg/translate-a64.c
19
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
21
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
20
s->be_data | size | MO_ALIGN);
22
bool is_q = extract32(insn, 30, 1);
21
}
23
uint64_t imm = 0;
22
24
23
+/*
25
- if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
24
+ * PAC memory operations
26
- /* Check for FMOV (vector, immediate) - half-precision */
25
+ *
27
- if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) {
26
+ * 31 30 27 26 24 22 21 12 11 10 5 0
28
+ if (o2) {
27
+ * +------+-------+---+-----+-----+---+--------+---+---+----+-----+
29
+ if (cmode != 0xf || is_neg) {
28
+ * | size | 1 1 1 | V | 0 0 | M S | 1 | imm9 | W | 1 | Rn | Rt |
30
unallocated_encoding(s);
29
+ * +------+-------+---+-----+-----+---+--------+---+---+----+-----+
31
return;
30
+ *
32
}
31
+ * Rt: the result register
33
- }
32
+ * Rn: base address or SP
34
-
33
+ * V: vector flag (always 0 as of v8.3)
35
- if (!fp_access_check(s)) {
34
+ * M: clear for key DA, set for key DB
36
- return;
35
+ * W: pre-indexing flag
37
- }
36
+ * S: sign for imm9.
38
-
37
+ */
39
- if (cmode == 15 && o2 && !is_neg) {
38
+static void disas_ldst_pac(DisasContext *s, uint32_t insn,
40
/* FMOV (vector, immediate) - half-precision */
39
+ int size, int rt, bool is_vector)
41
+ if (!dc_isar_feature(aa64_fp16, s)) {
40
+{
42
+ unallocated_encoding(s);
41
+ int rn = extract32(insn, 5, 5);
43
+ return;
42
+ bool is_wback = extract32(insn, 11, 1);
44
+ }
43
+ bool use_key_a = !extract32(insn, 23, 1);
45
imm = vfp_expand_imm(MO_16, abcdefgh);
44
+ int offset;
46
/* now duplicate across the lanes */
45
+ TCGv_i64 tcg_addr, tcg_rt;
47
imm = dup_const(MO_16, imm);
46
+
48
} else {
47
+ if (size != 3 || is_vector || !dc_isar_feature(aa64_pauth, s)) {
49
+ if (cmode == 0xf && is_neg && !is_q) {
48
+ unallocated_encoding(s);
50
+ unallocated_encoding(s);
51
+ return;
52
+ }
53
imm = asimd_imm_const(abcdefgh, cmode, is_neg);
54
}
55
56
+ if (!fp_access_check(s)) {
49
+ return;
57
+ return;
50
+ }
58
+ }
51
+
59
+
52
+ if (rn == 31) {
60
if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) {
53
+ gen_check_sp_alignment(s);
61
/* MOVI or MVNI, with MVNI negation handled above. */
54
+ }
62
tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), is_q ? 16 : 8,
55
+ tcg_addr = read_cpu_reg_sp(s, rn, 1);
56
+
57
+ if (s->pauth_active) {
58
+ if (use_key_a) {
59
+ gen_helper_autda(tcg_addr, cpu_env, tcg_addr, cpu_X[31]);
60
+ } else {
61
+ gen_helper_autdb(tcg_addr, cpu_env, tcg_addr, cpu_X[31]);
62
+ }
63
+ }
64
+
65
+ /* Form the 10-bit signed, scaled offset. */
66
+ offset = (extract32(insn, 22, 1) << 9) | extract32(insn, 12, 9);
67
+ offset = sextract32(offset << size, 0, 10 + size);
68
+ tcg_gen_addi_i64(tcg_addr, tcg_addr, offset);
69
+
70
+ tcg_rt = cpu_reg(s, rt);
71
+
72
+ do_gpr_ld(s, tcg_rt, tcg_addr, size, /* is_signed */ false,
73
+ /* extend */ false, /* iss_valid */ !is_wback,
74
+ /* iss_srt */ rt, /* iss_sf */ true, /* iss_ar */ false);
75
+
76
+ if (is_wback) {
77
+ tcg_gen_mov_i64(cpu_reg_sp(s, rn), tcg_addr);
78
+ }
79
+}
80
+
81
/* Load/store register (all forms) */
82
static void disas_ldst_reg(DisasContext *s, uint32_t insn)
83
{
84
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_reg(DisasContext *s, uint32_t insn)
85
case 2:
86
disas_ldst_reg_roffset(s, insn, opc, size, rt, is_vector);
87
return;
88
+ default:
89
+ disas_ldst_pac(s, insn, size, rt, is_vector);
90
+ return;
91
}
92
break;
93
case 1:
94
--
63
--
95
2.20.1
64
2.34.1
96
97
From: Richard Henderson <richard.henderson@linaro.org>

Use TBID in aa64_va_parameters depending on the data parameter.
This automatically updates all existing users of the function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/internals.h | 1 +
target/arm/helper.c | 14 +++++++++++---
2 files changed, 12 insertions(+), 3 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

All of these insns have "if sz == '1' then UNDEFINED" in their pseudocode.
Fixes a RISU miscompare for invalid insn 0x5ef0c87a.

Fixes: 5c36d89567c ("arm/translate-a64: add all FP16 ops in simd_scalar_pairwise")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-a64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
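The composition rule the helper.c hunk below implements (TBID confines
tag-byte-ignore to data accesses) can be read as a one-line predicate;
a sketch:

    /* TBI as callers see it: always for data, for insn fetch only if !TBID */
    static inline bool effective_tbi(bool tbi, bool tbid, bool data)
    {
        return tbi && (data || !tbid);
    }

which matches "ret.tbi &= (data || !ret.tbid)" in aa64_va_parameters().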
diff --git a/target/arm/internals.h b/target/arm/internals.h
15
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
16
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/internals.h
17
--- a/target/arm/tcg/translate-a64.c
18
+++ b/target/arm/internals.h
18
+++ b/target/arm/tcg/translate-a64.c
19
@@ -XXX,XX +XXX,XX @@ typedef struct ARMVAParameters {
19
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
20
unsigned tsz : 8;
20
case 0x2f: /* FMINP */
21
unsigned select : 1;
21
/* FP op, size[0] is 32 or 64 bit*/
22
bool tbi : 1;
22
if (!u) {
23
+ bool tbid : 1;
23
- if (!dc_isar_feature(aa64_fp16, s)) {
24
bool epd : 1;
24
+ if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) {
25
bool hpd : 1;
25
unallocated_encoding(s);
26
bool using16k : 1;
26
return;
27
diff --git a/target/arm/helper.c b/target/arm/helper.c
27
} else {
28
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/helper.c
30
+++ b/target/arm/helper.c
31
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
32
{
33
uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
34
uint32_t el = regime_el(env, mmu_idx);
35
- bool tbi, epd, hpd, using16k, using64k;
36
+ bool tbi, tbid, epd, hpd, using16k, using64k;
37
int select, tsz;
38
39
/*
40
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
41
using16k = extract32(tcr, 15, 1);
42
if (mmu_idx == ARMMMUIdx_S2NS) {
43
/* VTCR_EL2 */
44
- tbi = hpd = false;
45
+ tbi = tbid = hpd = false;
46
} else {
47
tbi = extract32(tcr, 20, 1);
48
hpd = extract32(tcr, 24, 1);
49
+ tbid = extract32(tcr, 29, 1);
50
}
51
epd = false;
52
} else if (!select) {
53
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
54
using16k = extract32(tcr, 15, 1);
55
tbi = extract64(tcr, 37, 1);
56
hpd = extract64(tcr, 41, 1);
57
+ tbid = extract64(tcr, 51, 1);
58
} else {
59
int tg = extract32(tcr, 30, 2);
60
using16k = tg == 1;
61
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
62
epd = extract32(tcr, 23, 1);
63
tbi = extract64(tcr, 38, 1);
64
hpd = extract64(tcr, 42, 1);
65
+ tbid = extract64(tcr, 52, 1);
66
}
67
tsz = MIN(tsz, 39); /* TODO: ARMv8.4-TTST */
68
tsz = MAX(tsz, 16); /* TODO: ARMv8.2-LVA */
69
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
70
.tsz = tsz,
71
.select = select,
72
.tbi = tbi,
73
+ .tbid = tbid,
74
.epd = epd,
75
.hpd = hpd,
76
.using16k = using16k,
77
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
78
ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
79
ARMMMUIdx mmu_idx, bool data)
80
{
81
- return aa64_va_parameters_both(env, va, mmu_idx);
82
+ ARMVAParameters ret = aa64_va_parameters_both(env, va, mmu_idx);
83
+
84
+ /* Present TBI as a composite with TBID. */
85
+ ret.tbi &= (data || !ret.tbid);
86
+ return ret;
87
}
88
89
static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
90
--
28
--
91
2.20.1
29
2.34.1
92
93
From: Richard Henderson <richard.henderson@linaro.org>

The cryptographic internals are stubbed out for now,
but the enable and trap bits are checked.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/Makefile.objs | 1 +
target/arm/helper-a64.h | 12 +++
target/arm/internals.h | 6 ++
target/arm/pauth_helper.c | 186 ++++++++++++++++++++++++++++++++++++++
4 files changed, 205 insertions(+)
create mode 100644 target/arm/pauth_helper.c

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate.h | 5 +
target/arm/tcg/gengvec.c | 1612 ++++++++++++++++++++++++++++++++++++
target/arm/tcg/translate.c | 1588 -----------------------------------
target/arm/tcg/meson.build | 1 +
4 files changed, 1618 insertions(+), 1588 deletions(-)
create mode 100644 target/arm/tcg/gengvec.c
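One idiom worth calling out in the gengvec.c code moved below: the
GEN_CMP0 macro stamps out one compare-with-zero expander per condition.
For instance, GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ) expands to (sketch):

    void gen_gvec_ceq0(unsigned vece, uint32_t d, uint32_t m,
                       uint32_t opr_sz, uint32_t max_sz)
    {
        tcg_gen_gvec_cmpi(TCG_COND_EQ, vece, d, m, 0, opr_sz, max_sz);
    }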
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
16
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/Makefile.objs
18
--- a/target/arm/tcg/translate.h
21
+++ b/target/arm/Makefile.objs
19
+++ b/target/arm/tcg/translate.h
22
@@ -XXX,XX +XXX,XX @@ obj-y += translate.o op_helper.o helper.o cpu.o
20
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
23
obj-y += neon_helper.o iwmmxt_helper.o vec_helper.o
21
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
24
obj-y += gdbstub.o
22
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
25
obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
23
26
+obj-$(TARGET_AARCH64) += pauth_helper.o
24
+void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh);
27
obj-y += crypto_helper.o
25
+void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh);
28
obj-$(CONFIG_SOFTMMU) += arm-powerctl.o
26
+void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh);
29
27
+void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh);
30
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
28
+
31
index XXXXXXX..XXXXXXX 100644
29
void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
32
--- a/target/arm/helper-a64.h
30
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
33
+++ b/target/arm/helper-a64.h
31
void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
34
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(advsimd_rinth, f16, f16, ptr)
32
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
35
DEF_HELPER_2(advsimd_f16tosinth, i32, f16, ptr)
36
DEF_HELPER_2(advsimd_f16touinth, i32, f16, ptr)
37
DEF_HELPER_2(sqrt_f16, f16, f16, ptr)
38
+
39
+DEF_HELPER_FLAGS_3(pacia, TCG_CALL_NO_WG, i64, env, i64, i64)
40
+DEF_HELPER_FLAGS_3(pacib, TCG_CALL_NO_WG, i64, env, i64, i64)
41
+DEF_HELPER_FLAGS_3(pacda, TCG_CALL_NO_WG, i64, env, i64, i64)
42
+DEF_HELPER_FLAGS_3(pacdb, TCG_CALL_NO_WG, i64, env, i64, i64)
43
+DEF_HELPER_FLAGS_3(pacga, TCG_CALL_NO_WG, i64, env, i64, i64)
44
+DEF_HELPER_FLAGS_3(autia, TCG_CALL_NO_WG, i64, env, i64, i64)
45
+DEF_HELPER_FLAGS_3(autib, TCG_CALL_NO_WG, i64, env, i64, i64)
46
+DEF_HELPER_FLAGS_3(autda, TCG_CALL_NO_WG, i64, env, i64, i64)
47
+DEF_HELPER_FLAGS_3(autdb, TCG_CALL_NO_WG, i64, env, i64, i64)
48
+DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
49
+DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
50
diff --git a/target/arm/internals.h b/target/arm/internals.h
51
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/internals.h
53
+++ b/target/arm/internals.h
54
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
55
EC_CP14DTTRAP = 0x06,
56
EC_ADVSIMDFPACCESSTRAP = 0x07,
57
EC_FPIDTRAP = 0x08,
58
+ EC_PACTRAP = 0x09,
59
EC_CP14RRTTRAP = 0x0c,
60
EC_ILLEGALSTATE = 0x0e,
61
EC_AA32_SVC = 0x11,
62
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_sve_access_trap(void)
63
return EC_SVEACCESSTRAP << ARM_EL_EC_SHIFT;
64
}
65
66
+static inline uint32_t syn_pactrap(void)
67
+{
68
+ return EC_PACTRAP << ARM_EL_EC_SHIFT;
69
+}
70
+
71
static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
72
{
73
return (EC_INSNABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
74
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/pauth_helper.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM v8.3-PAuth Operations
+ *
+ * Copyright (c) 2019 Linaro, Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "internals.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/helper-proto.h"
+#include "tcg/tcg-gvec-desc.h"
+
+
+static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
+ ARMPACKey key)
+{
+ g_assert_not_reached(); /* FIXME */
+}
+
+static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
+ ARMPACKey *key, bool data)
+{
+ g_assert_not_reached(); /* FIXME */
+}
+
+static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
+ ARMPACKey *key, bool data, int keynumber)
+{
+ g_assert_not_reached(); /* FIXME */
+}
+
+static uint64_t pauth_strip(CPUARMState *env, uint64_t ptr, bool data)
+{
+ g_assert_not_reached(); /* FIXME */
+}
+
+static void QEMU_NORETURN pauth_trap(CPUARMState *env, int target_el,
+ uintptr_t ra)
+{
+ raise_exception_ra(env, EXCP_UDEF, syn_pactrap(), target_el, ra);
+}
+
+static void pauth_check_trap(CPUARMState *env, int el, uintptr_t ra)
+{
+ if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+ uint64_t hcr = arm_hcr_el2_eff(env);
+ bool trap = !(hcr & HCR_API);
+ /* FIXME: ARMv8.1-VHE: trap only applies to EL1&0 regime. */
+ /* FIXME: ARMv8.3-NV: HCR_NV trap takes precedence for ERETA[AB]. */
+ if (trap) {
+ pauth_trap(env, 2, ra);
+ }
+ }
+ if (el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
+ if (!(env->cp15.scr_el3 & SCR_API)) {
+ pauth_trap(env, 3, ra);
+ }
+ }
+}
+
+static bool pauth_key_enabled(CPUARMState *env, int el, uint32_t bit)
+{
+ uint32_t sctlr;
+ if (el == 0) {
+ /* FIXME: ARMv8.1-VHE S2 translation regime. */
+ sctlr = env->cp15.sctlr_el[1];
+ } else {
+ sctlr = env->cp15.sctlr_el[el];
+ }
+ return (sctlr & bit) != 0;
+}
+
+uint64_t HELPER(pacia)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnIA)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_addpac(env, x, y, &env->apia_key, false);
+}
+
+uint64_t HELPER(pacib)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnIB)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_addpac(env, x, y, &env->apib_key, false);
+}
+
+uint64_t HELPER(pacda)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnDA)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_addpac(env, x, y, &env->apda_key, true);
+}
+
+uint64_t HELPER(pacdb)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnDB)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_addpac(env, x, y, &env->apdb_key, true);
+}
+
+uint64_t HELPER(pacga)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ uint64_t pac;
+
+ pauth_check_trap(env, arm_current_el(env), GETPC());
+ pac = pauth_computepac(x, y, env->apga_key);
+
+ return pac & 0xffffffff00000000ull;
+}
+
+uint64_t HELPER(autia)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnIA)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_auth(env, x, y, &env->apia_key, false, 0);
+}
+
+uint64_t HELPER(autib)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnIB)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_auth(env, x, y, &env->apib_key, false, 1);
+}
+
+uint64_t HELPER(autda)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnDA)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_auth(env, x, y, &env->apda_key, true, 0);
+}
+
+uint64_t HELPER(autdb)(CPUARMState *env, uint64_t x, uint64_t y)
+{
+ int el = arm_current_el(env);
+ if (!pauth_key_enabled(env, el, SCTLR_EnDB)) {
+ return x;
+ }
+ pauth_check_trap(env, el, GETPC());
+ return pauth_auth(env, x, y, &env->apdb_key, true, 1);
+}
+
+uint64_t HELPER(xpaci)(CPUARMState *env, uint64_t a)
+{
+ return pauth_strip(env, a, false);
+}
+
+uint64_t HELPER(xpacd)(CPUARMState *env, uint64_t a)
+{
+ return pauth_strip(env, a, true);
+}
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM generic vector expansion
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2005-2007 CodeSourcery
+ * Copyright (c) 2007 OpenedHand, Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "translate.h"
+
+
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
+ uint32_t opr_sz, uint32_t max_sz,
+ gen_helper_gvec_3_ptr *fn)
+{
+ TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+ tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc));
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
+ opr_sz, max_sz, 0, fn);
+}
+
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static gen_helper_gvec_3_ptr * const fns[2] = {
+ gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
+ };
+ tcg_debug_assert(vece >= 1 && vece <= 2);
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
+}
+
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static gen_helper_gvec_3_ptr * const fns[2] = {
+ gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
+ };
+ tcg_debug_assert(vece >= 1 && vece <= 2);
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
+}
+
+#define GEN_CMP0(NAME, COND) \
+ void NAME(unsigned vece, uint32_t d, uint32_t m, \
+ uint32_t opr_sz, uint32_t max_sz) \
+ { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); }
+
+GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ)
+GEN_CMP0(gen_gvec_cle0, TCG_COND_LE)
+GEN_CMP0(gen_gvec_cge0, TCG_COND_GE)
+GEN_CMP0(gen_gvec_clt0, TCG_COND_LT)
+GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT)
+
+#undef GEN_CMP0
+
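
The GVecGen2i/GVecGen3 tables used throughout the rest of this file all
follow the same three-tier expansion: tcg-gvec prefers the .fniv host-vector
expansion when every opcode in .opt_opc is supported, falls back to the
inline .fni4/.fni8 integer expansion, and otherwise calls the out-of-line
.fno helper. A minimal sketch of that dispatch idea in plain C (hypothetical
types and names, not the real tcg-gvec API, whose selection logic in
tcg/tcg-op-gvec.c is considerably smarter):

    #include <stdint.h>

    typedef void mini_fn(uint64_t *d, const uint64_t *n, const uint64_t *m);

    typedef struct {
        mini_fn *fniv;   /* host-vector expansion (.fniv)       */
        mini_fn *fni;    /* inline integer expansion (.fni8)    */
        mini_fn *fno;    /* out-of-line helper (.fno)           */
        int vec_ops_ok;  /* stands in for .opt_opc being usable */
    } MiniGen3;

    static void mini_expand(const MiniGen3 *op,
                            uint64_t *d, const uint64_t *n, const uint64_t *m)
    {
        if (op->vec_ops_ok && op->fniv) {
            op->fniv(d, n, m);      /* best case: host vector ops */
        } else if (op->fni) {
            op->fni(d, n, m);       /* inline integer fallback */
        } else {
            op->fno(d, n, m);       /* last resort: call a helper */
        }
    }
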
+static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
108
+{
109
+ tcg_gen_vec_sar8i_i64(a, a, shift);
110
+ tcg_gen_vec_add8_i64(d, d, a);
111
+}
112
+
113
+static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
114
+{
115
+ tcg_gen_vec_sar16i_i64(a, a, shift);
116
+ tcg_gen_vec_add16_i64(d, d, a);
117
+}
118
+
119
+static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
120
+{
121
+ tcg_gen_sari_i32(a, a, shift);
122
+ tcg_gen_add_i32(d, d, a);
123
+}
124
+
125
+static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
126
+{
127
+ tcg_gen_sari_i64(a, a, shift);
128
+ tcg_gen_add_i64(d, d, a);
129
+}
130
+
131
+static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
132
+{
133
+ tcg_gen_sari_vec(vece, a, a, sh);
134
+ tcg_gen_add_vec(vece, d, d, a);
135
+}
136
+
137
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
138
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
139
+{
140
+ static const TCGOpcode vecop_list[] = {
141
+ INDEX_op_sari_vec, INDEX_op_add_vec, 0
142
+ };
143
+ static const GVecGen2i ops[4] = {
144
+ { .fni8 = gen_ssra8_i64,
145
+ .fniv = gen_ssra_vec,
146
+ .fno = gen_helper_gvec_ssra_b,
147
+ .load_dest = true,
148
+ .opt_opc = vecop_list,
149
+ .vece = MO_8 },
150
+ { .fni8 = gen_ssra16_i64,
151
+ .fniv = gen_ssra_vec,
152
+ .fno = gen_helper_gvec_ssra_h,
153
+ .load_dest = true,
154
+ .opt_opc = vecop_list,
155
+ .vece = MO_16 },
156
+ { .fni4 = gen_ssra32_i32,
157
+ .fniv = gen_ssra_vec,
158
+ .fno = gen_helper_gvec_ssra_s,
159
+ .load_dest = true,
160
+ .opt_opc = vecop_list,
161
+ .vece = MO_32 },
162
+ { .fni8 = gen_ssra64_i64,
163
+ .fniv = gen_ssra_vec,
164
+ .fno = gen_helper_gvec_ssra_d,
165
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
166
+ .opt_opc = vecop_list,
167
+ .load_dest = true,
168
+ .vece = MO_64 },
169
+ };
170
+
171
+ /* tszimm encoding produces immediates in the range [1..esize]. */
172
+ tcg_debug_assert(shift > 0);
173
+ tcg_debug_assert(shift <= (8 << vece));
174
+
175
+ /*
176
+ * Shifts larger than the element size are architecturally valid.
177
+ * Signed results in all sign bits.
178
+ */
179
+ shift = MIN(shift, (8 << vece) - 1);
180
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
181
+}
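
The MIN() clamp above works because an arithmetic shift by esize and by
esize-1 both yield all sign bits. A scalar sketch of the 8-bit lane
semantics (illustrative only, not part of the patch):

    #include <stdint.h>

    /* d += (a >> shift), shift in [1..8] for 8-bit elements.
     * shift == 8 is architecturally valid and behaves like shift == 7:
     * the addend is 0x00 or 0xff depending on the sign of a. */
    static uint8_t ssra8(uint8_t d, int8_t a, int shift)
    {
        if (shift > 7) {
            shift = 7;
        }
        return d + (uint8_t)(a >> shift);
    }
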
182
+
183
+static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
184
+{
185
+ tcg_gen_vec_shr8i_i64(a, a, shift);
186
+ tcg_gen_vec_add8_i64(d, d, a);
187
+}
188
+
189
+static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
190
+{
191
+ tcg_gen_vec_shr16i_i64(a, a, shift);
192
+ tcg_gen_vec_add16_i64(d, d, a);
193
+}
194
+
195
+static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
196
+{
197
+ tcg_gen_shri_i32(a, a, shift);
198
+ tcg_gen_add_i32(d, d, a);
199
+}
200
+
201
+static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
202
+{
203
+ tcg_gen_shri_i64(a, a, shift);
204
+ tcg_gen_add_i64(d, d, a);
205
+}
206
+
207
+static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
208
+{
209
+ tcg_gen_shri_vec(vece, a, a, sh);
210
+ tcg_gen_add_vec(vece, d, d, a);
211
+}
212
+
213
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
214
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
215
+{
216
+ static const TCGOpcode vecop_list[] = {
217
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
218
+ };
219
+ static const GVecGen2i ops[4] = {
220
+ { .fni8 = gen_usra8_i64,
221
+ .fniv = gen_usra_vec,
222
+ .fno = gen_helper_gvec_usra_b,
223
+ .load_dest = true,
224
+ .opt_opc = vecop_list,
225
+ .vece = MO_8, },
226
+ { .fni8 = gen_usra16_i64,
227
+ .fniv = gen_usra_vec,
228
+ .fno = gen_helper_gvec_usra_h,
229
+ .load_dest = true,
230
+ .opt_opc = vecop_list,
231
+ .vece = MO_16, },
232
+ { .fni4 = gen_usra32_i32,
233
+ .fniv = gen_usra_vec,
234
+ .fno = gen_helper_gvec_usra_s,
235
+ .load_dest = true,
236
+ .opt_opc = vecop_list,
237
+ .vece = MO_32, },
238
+ { .fni8 = gen_usra64_i64,
239
+ .fniv = gen_usra_vec,
240
+ .fno = gen_helper_gvec_usra_d,
241
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
242
+ .load_dest = true,
243
+ .opt_opc = vecop_list,
244
+ .vece = MO_64, },
245
+ };
246
+
247
+ /* tszimm encoding produces immediates in the range [1..esize]. */
248
+ tcg_debug_assert(shift > 0);
249
+ tcg_debug_assert(shift <= (8 << vece));
250
+
251
+ /*
252
+ * Shifts larger than the element size are architecturally valid.
253
+ * Unsigned results in all zeros as input to accumulate: nop.
254
+ */
255
+ if (shift < (8 << vece)) {
256
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
257
+ } else {
258
+ /* Nop, but we do need to clear the tail. */
259
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+ }
+}
+
+/*
+ * Shift one less than the requested amount, and the low bit is
265
+ * the rounding bit. For the 8 and 16-bit operations, because we
266
+ * mask the low bit, we can perform a normal integer shift instead
267
+ * of a vector shift.
268
+ */
269
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
270
+{
271
+ TCGv_i64 t = tcg_temp_new_i64();
272
+
273
+ tcg_gen_shri_i64(t, a, sh - 1);
274
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
275
+ tcg_gen_vec_sar8i_i64(d, a, sh);
276
+ tcg_gen_vec_add8_i64(d, d, t);
277
+}
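
In other words the rounding is "shift, then add back the last bit shifted
out", which equals (x + (1 << (sh - 1))) >> sh without risking overflow in
the intermediate addition. A scalar reference (illustrative only, not part
of the patch):

    #include <stdint.h>

    /* Rounding shift right for 1 <= sh <= 63; sh == 64 is the
     * special case handled separately by the generators above. */
    static int64_t srshr64(int64_t x, int sh)
    {
        int64_t round = (x >> (sh - 1)) & 1;  /* last bit shifted out */
        return (x >> sh) + round;
    }
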
278
+
279
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
280
+{
281
+ TCGv_i64 t = tcg_temp_new_i64();
282
+
283
+ tcg_gen_shri_i64(t, a, sh - 1);
284
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
285
+ tcg_gen_vec_sar16i_i64(d, a, sh);
286
+ tcg_gen_vec_add16_i64(d, d, t);
287
+}
288
+
289
+void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
290
+{
291
+ TCGv_i32 t;
292
+
293
+ /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
294
+ if (sh == 32) {
295
+ tcg_gen_movi_i32(d, 0);
296
+ return;
+ }
+ t = tcg_temp_new_i32();
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
+ tcg_gen_sari_i32(d, a, sh);
+ tcg_gen_add_i32(d, d, t);
+}
+
+void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
306
+ TCGv_i64 t = tcg_temp_new_i64();
307
+
308
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
309
+ tcg_gen_sari_i64(d, a, sh);
310
+ tcg_gen_add_i64(d, d, t);
311
+}
312
+
313
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
314
+{
315
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
316
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
317
+
318
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
319
+ tcg_gen_dupi_vec(vece, ones, 1);
320
+ tcg_gen_and_vec(vece, t, t, ones);
321
+ tcg_gen_sari_vec(vece, d, a, sh);
322
+ tcg_gen_add_vec(vece, d, d, t);
323
+}
324
+
325
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
326
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
327
+{
328
+ static const TCGOpcode vecop_list[] = {
329
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
330
+ };
331
+ static const GVecGen2i ops[4] = {
332
+ { .fni8 = gen_srshr8_i64,
333
+ .fniv = gen_srshr_vec,
334
+ .fno = gen_helper_gvec_srshr_b,
335
+ .opt_opc = vecop_list,
336
+ .vece = MO_8 },
337
+ { .fni8 = gen_srshr16_i64,
338
+ .fniv = gen_srshr_vec,
339
+ .fno = gen_helper_gvec_srshr_h,
340
+ .opt_opc = vecop_list,
341
+ .vece = MO_16 },
342
+ { .fni4 = gen_srshr32_i32,
343
+ .fniv = gen_srshr_vec,
344
+ .fno = gen_helper_gvec_srshr_s,
345
+ .opt_opc = vecop_list,
346
+ .vece = MO_32 },
347
+ { .fni8 = gen_srshr64_i64,
348
+ .fniv = gen_srshr_vec,
349
+ .fno = gen_helper_gvec_srshr_d,
350
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
351
+ .opt_opc = vecop_list,
352
+ .vece = MO_64 },
353
+ };
354
+
355
+ /* tszimm encoding produces immediates in the range [1..esize] */
356
+ tcg_debug_assert(shift > 0);
357
+ tcg_debug_assert(shift <= (8 << vece));
358
+
359
+ if (shift == (8 << vece)) {
360
+ /*
361
+ * Shifts larger than the element size are architecturally valid.
362
+ * Signed results in all sign bits. With rounding, this produces
363
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
364
+ * I.e. always zero.
365
+ */
366
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
+ } else {
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+ }
+}
+
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ gen_srshr8_i64(t, a, sh);
+ tcg_gen_vec_add8_i64(d, d, t);
378
+}
379
+
380
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
381
+{
382
+ TCGv_i64 t = tcg_temp_new_i64();
383
+
384
+ gen_srshr16_i64(t, a, sh);
385
+ tcg_gen_vec_add16_i64(d, d, t);
386
+}
387
+
388
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
389
+{
390
+ TCGv_i32 t = tcg_temp_new_i32();
391
+
392
+ gen_srshr32_i32(t, a, sh);
393
+ tcg_gen_add_i32(d, d, t);
394
+}
395
+
396
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
397
+{
398
+ TCGv_i64 t = tcg_temp_new_i64();
399
+
400
+ gen_srshr64_i64(t, a, sh);
401
+ tcg_gen_add_i64(d, d, t);
402
+}
403
+
404
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
405
+{
406
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
407
+
408
+ gen_srshr_vec(vece, t, a, sh);
409
+ tcg_gen_add_vec(vece, d, d, t);
410
+}
411
+
412
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
413
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
414
+{
415
+ static const TCGOpcode vecop_list[] = {
416
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
417
+ };
418
+ static const GVecGen2i ops[4] = {
419
+ { .fni8 = gen_srsra8_i64,
420
+ .fniv = gen_srsra_vec,
421
+ .fno = gen_helper_gvec_srsra_b,
422
+ .opt_opc = vecop_list,
423
+ .load_dest = true,
424
+ .vece = MO_8 },
425
+ { .fni8 = gen_srsra16_i64,
426
+ .fniv = gen_srsra_vec,
427
+ .fno = gen_helper_gvec_srsra_h,
428
+ .opt_opc = vecop_list,
429
+ .load_dest = true,
430
+ .vece = MO_16 },
431
+ { .fni4 = gen_srsra32_i32,
432
+ .fniv = gen_srsra_vec,
433
+ .fno = gen_helper_gvec_srsra_s,
434
+ .opt_opc = vecop_list,
435
+ .load_dest = true,
436
+ .vece = MO_32 },
437
+ { .fni8 = gen_srsra64_i64,
438
+ .fniv = gen_srsra_vec,
439
+ .fno = gen_helper_gvec_srsra_d,
440
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
441
+ .opt_opc = vecop_list,
442
+ .load_dest = true,
443
+ .vece = MO_64 },
444
+ };
445
+
446
+ /* tszimm encoding produces immediates in the range [1..esize] */
447
+ tcg_debug_assert(shift > 0);
448
+ tcg_debug_assert(shift <= (8 << vece));
449
+
450
+ /*
451
+ * Shifts larger than the element size are architecturally valid.
452
+ * Signed results in all sign bits. With rounding, this produces
453
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
454
+ * I.e. always zero. With accumulation, this leaves D unchanged.
455
+ */
456
+ if (shift == (8 << vece)) {
457
+ /* Nop, but we do need to clear the tail. */
458
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
459
+ } else {
460
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+ }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ tcg_gen_shri_i64(t, a, sh - 1);
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+ tcg_gen_vec_shr8i_i64(d, a, sh);
471
+ tcg_gen_vec_add8_i64(d, d, t);
472
+}
473
+
474
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
475
+{
476
+ TCGv_i64 t = tcg_temp_new_i64();
477
+
478
+ tcg_gen_shri_i64(t, a, sh - 1);
479
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
480
+ tcg_gen_vec_shr16i_i64(d, a, sh);
481
+ tcg_gen_vec_add16_i64(d, d, t);
482
+}
483
+
484
+void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
485
+{
486
+ TCGv_i32 t;
487
+
488
+ /* Handle shift by the input size for the benefit of trans_URSHR_ri */
489
+ if (sh == 32) {
490
+ tcg_gen_extract_i32(d, a, sh - 1, 1);
491
+ return;
+ }
+ t = tcg_temp_new_i32();
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
+ tcg_gen_shri_i32(d, a, sh);
+ tcg_gen_add_i32(d, d, t);
+}
+
+void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
502
+
503
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
504
+ tcg_gen_shri_i64(d, a, sh);
505
+ tcg_gen_add_i64(d, d, t);
506
+}
507
+
508
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
509
+{
510
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
511
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
512
+
513
+ tcg_gen_shri_vec(vece, t, a, shift - 1);
514
+ tcg_gen_dupi_vec(vece, ones, 1);
515
+ tcg_gen_and_vec(vece, t, t, ones);
516
+ tcg_gen_shri_vec(vece, d, a, shift);
517
+ tcg_gen_add_vec(vece, d, d, t);
518
+}
519
+
520
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
521
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
522
+{
523
+ static const TCGOpcode vecop_list[] = {
524
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
525
+ };
526
+ static const GVecGen2i ops[4] = {
527
+ { .fni8 = gen_urshr8_i64,
528
+ .fniv = gen_urshr_vec,
529
+ .fno = gen_helper_gvec_urshr_b,
530
+ .opt_opc = vecop_list,
531
+ .vece = MO_8 },
532
+ { .fni8 = gen_urshr16_i64,
533
+ .fniv = gen_urshr_vec,
534
+ .fno = gen_helper_gvec_urshr_h,
535
+ .opt_opc = vecop_list,
536
+ .vece = MO_16 },
537
+ { .fni4 = gen_urshr32_i32,
538
+ .fniv = gen_urshr_vec,
539
+ .fno = gen_helper_gvec_urshr_s,
540
+ .opt_opc = vecop_list,
541
+ .vece = MO_32 },
542
+ { .fni8 = gen_urshr64_i64,
543
+ .fniv = gen_urshr_vec,
544
+ .fno = gen_helper_gvec_urshr_d,
545
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
546
+ .opt_opc = vecop_list,
547
+ .vece = MO_64 },
548
+ };
549
+
550
+ /* tszimm encoding produces immediates in the range [1..esize] */
551
+ tcg_debug_assert(shift > 0);
552
+ tcg_debug_assert(shift <= (8 << vece));
553
+
554
+ if (shift == (8 << vece)) {
555
+ /*
556
+ * Shifts larger than the element size are architecturally valid.
557
+ * Unsigned results in zero. With rounding, this produces a
558
+ * copy of the most significant bit.
559
+ */
560
+ tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
561
+ } else {
562
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+ }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ if (sh == 8) {
+ tcg_gen_vec_shr8i_i64(t, a, 7);
+ } else {
+ gen_urshr8_i64(t, a, sh);
+ }
+ tcg_gen_vec_add8_i64(d, d, t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ if (sh == 16) {
+ tcg_gen_vec_shr16i_i64(t, a, 15);
+ } else {
+ gen_urshr16_i64(t, a, sh);
+ }
+ tcg_gen_vec_add16_i64(d, d, t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+
+ if (sh == 32) {
+ tcg_gen_shri_i32(t, a, 31);
+ } else {
+ gen_urshr32_i32(t, a, sh);
+ }
+ tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ if (sh == 64) {
+ tcg_gen_shri_i64(t, a, 63);
+ } else {
+ gen_urshr64_i64(t, a, sh);
+ }
+ tcg_gen_add_i64(d, d, t);
+}
+
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+ if (sh == (8 << vece)) {
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
+ } else {
+ gen_urshr_vec(vece, t, a, sh);
+ }
+ tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
+ };
+ static const GVecGen2i ops[4] = {
+ { .fni8 = gen_ursra8_i64,
+ .fniv = gen_ursra_vec,
+ .fno = gen_helper_gvec_ursra_b,
636
+ .opt_opc = vecop_list,
637
+ .load_dest = true,
638
+ .vece = MO_8 },
639
+ { .fni8 = gen_ursra16_i64,
640
+ .fniv = gen_ursra_vec,
641
+ .fno = gen_helper_gvec_ursra_h,
642
+ .opt_opc = vecop_list,
643
+ .load_dest = true,
644
+ .vece = MO_16 },
645
+ { .fni4 = gen_ursra32_i32,
646
+ .fniv = gen_ursra_vec,
647
+ .fno = gen_helper_gvec_ursra_s,
648
+ .opt_opc = vecop_list,
649
+ .load_dest = true,
650
+ .vece = MO_32 },
651
+ { .fni8 = gen_ursra64_i64,
652
+ .fniv = gen_ursra_vec,
653
+ .fno = gen_helper_gvec_ursra_d,
654
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
655
+ .opt_opc = vecop_list,
656
+ .load_dest = true,
657
+ .vece = MO_64 },
658
+ };
659
+
660
+ /* tszimm encoding produces immediates in the range [1..esize] */
661
+ tcg_debug_assert(shift > 0);
662
+ tcg_debug_assert(shift <= (8 << vece));
663
+
664
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
665
+}
666
+
667
+static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
668
+{
669
+ uint64_t mask = dup_const(MO_8, 0xff >> shift);
670
+ TCGv_i64 t = tcg_temp_new_i64();
671
+
672
+ tcg_gen_shri_i64(t, a, shift);
673
+ tcg_gen_andi_i64(t, t, mask);
674
+ tcg_gen_andi_i64(d, d, ~mask);
675
+ tcg_gen_or_i64(d, d, t);
676
+}
677
+
678
+static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
679
+{
680
+ uint64_t mask = dup_const(MO_16, 0xffff >> shift);
681
+ TCGv_i64 t = tcg_temp_new_i64();
682
+
683
+ tcg_gen_shri_i64(t, a, shift);
684
+ tcg_gen_andi_i64(t, t, mask);
685
+ tcg_gen_andi_i64(d, d, ~mask);
686
+ tcg_gen_or_i64(d, d, t);
687
+}
688
+
689
+static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
690
+{
691
+ tcg_gen_shri_i32(a, a, shift);
692
+ tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
693
+}
694
+
695
+static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
696
+{
697
+ tcg_gen_shri_i64(a, a, shift);
698
+ tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
699
+}
700
+
701
+static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
702
+{
703
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
704
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
705
+
706
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
707
+ tcg_gen_shri_vec(vece, t, a, sh);
708
+ tcg_gen_and_vec(vece, d, d, m);
709
+ tcg_gen_or_vec(vece, d, d, t);
710
+}
711
+
712
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
713
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
714
+{
715
+ static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
716
+ const GVecGen2i ops[4] = {
717
+ { .fni8 = gen_shr8_ins_i64,
718
+ .fniv = gen_shr_ins_vec,
719
+ .fno = gen_helper_gvec_sri_b,
720
+ .load_dest = true,
721
+ .opt_opc = vecop_list,
722
+ .vece = MO_8 },
723
+ { .fni8 = gen_shr16_ins_i64,
724
+ .fniv = gen_shr_ins_vec,
725
+ .fno = gen_helper_gvec_sri_h,
726
+ .load_dest = true,
727
+ .opt_opc = vecop_list,
728
+ .vece = MO_16 },
729
+ { .fni4 = gen_shr32_ins_i32,
730
+ .fniv = gen_shr_ins_vec,
731
+ .fno = gen_helper_gvec_sri_s,
732
+ .load_dest = true,
733
+ .opt_opc = vecop_list,
734
+ .vece = MO_32 },
735
+ { .fni8 = gen_shr64_ins_i64,
736
+ .fniv = gen_shr_ins_vec,
737
+ .fno = gen_helper_gvec_sri_d,
738
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
739
+ .load_dest = true,
740
+ .opt_opc = vecop_list,
741
+ .vece = MO_64 },
742
+ };
743
+
744
+ /* tszimm encoding produces immediates in the range [1..esize]. */
745
+ tcg_debug_assert(shift > 0);
746
+ tcg_debug_assert(shift <= (8 << vece));
747
+
748
+ /* Shift of esize leaves destination unchanged. */
749
+ if (shift < (8 << vece)) {
750
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
751
+ } else {
752
+ /* Nop, but we do need to clear the tail. */
753
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
754
+ }
755
+}
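
SRI only replaces the low esize-shift bits of each destination element,
which is why every variant above masks and merges instead of plain
shifting. A scalar reference for the 8-bit case (illustrative only, not
part of the patch):

    #include <stdint.h>

    /* Shift-right-insert, 1 <= shift <= 8; shift == 8 gives mask 0,
     * i.e. the destination is left unchanged, as the comment says. */
    static uint8_t sri8(uint8_t d, uint8_t a, int shift)
    {
        uint8_t mask = 0xff >> shift;          /* bits that get replaced */
        return (d & ~mask) | ((a >> shift) & mask);
    }
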
756
+
757
+static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
758
+{
759
+ uint64_t mask = dup_const(MO_8, 0xff << shift);
760
+ TCGv_i64 t = tcg_temp_new_i64();
761
+
762
+ tcg_gen_shli_i64(t, a, shift);
763
+ tcg_gen_andi_i64(t, t, mask);
764
+ tcg_gen_andi_i64(d, d, ~mask);
765
+ tcg_gen_or_i64(d, d, t);
766
+}
767
+
768
+static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
769
+{
770
+ uint64_t mask = dup_const(MO_16, 0xffff << shift);
771
+ TCGv_i64 t = tcg_temp_new_i64();
772
+
773
+ tcg_gen_shli_i64(t, a, shift);
774
+ tcg_gen_andi_i64(t, t, mask);
775
+ tcg_gen_andi_i64(d, d, ~mask);
776
+ tcg_gen_or_i64(d, d, t);
777
+}
778
+
779
+static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
780
+{
781
+ tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
782
+}
783
+
784
+static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
785
+{
786
+ tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
787
+}
788
+
789
+static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
790
+{
791
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
792
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
793
+
794
+ tcg_gen_shli_vec(vece, t, a, sh);
795
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
796
+ tcg_gen_and_vec(vece, d, d, m);
797
+ tcg_gen_or_vec(vece, d, d, t);
798
+}
799
+
800
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
801
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
802
+{
803
+ static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
804
+ const GVecGen2i ops[4] = {
805
+ { .fni8 = gen_shl8_ins_i64,
806
+ .fniv = gen_shl_ins_vec,
807
+ .fno = gen_helper_gvec_sli_b,
808
+ .load_dest = true,
809
+ .opt_opc = vecop_list,
810
+ .vece = MO_8 },
811
+ { .fni8 = gen_shl16_ins_i64,
812
+ .fniv = gen_shl_ins_vec,
813
+ .fno = gen_helper_gvec_sli_h,
814
+ .load_dest = true,
815
+ .opt_opc = vecop_list,
816
+ .vece = MO_16 },
817
+ { .fni4 = gen_shl32_ins_i32,
818
+ .fniv = gen_shl_ins_vec,
819
+ .fno = gen_helper_gvec_sli_s,
820
+ .load_dest = true,
821
+ .opt_opc = vecop_list,
822
+ .vece = MO_32 },
823
+ { .fni8 = gen_shl64_ins_i64,
824
+ .fniv = gen_shl_ins_vec,
825
+ .fno = gen_helper_gvec_sli_d,
826
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
827
+ .load_dest = true,
828
+ .opt_opc = vecop_list,
829
+ .vece = MO_64 },
830
+ };
831
+
832
+ /* tszimm encoding produces immediates in the range [0..esize-1]. */
833
+ tcg_debug_assert(shift >= 0);
834
+ tcg_debug_assert(shift < (8 << vece));
835
+
836
+ if (shift == 0) {
837
+ tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
838
+ } else {
839
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
840
+ }
841
+}
842
+
843
+static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
844
+{
845
+ gen_helper_neon_mul_u8(a, a, b);
846
+ gen_helper_neon_add_u8(d, d, a);
847
+}
848
+
849
+static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
850
+{
851
+ gen_helper_neon_mul_u8(a, a, b);
852
+ gen_helper_neon_sub_u8(d, d, a);
853
+}
854
+
855
+static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
856
+{
857
+ gen_helper_neon_mul_u16(a, a, b);
858
+ gen_helper_neon_add_u16(d, d, a);
859
+}
860
+
861
+static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
862
+{
863
+ gen_helper_neon_mul_u16(a, a, b);
864
+ gen_helper_neon_sub_u16(d, d, a);
865
+}
866
+
867
+static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
868
+{
869
+ tcg_gen_mul_i32(a, a, b);
870
+ tcg_gen_add_i32(d, d, a);
871
+}
872
+
873
+static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
874
+{
875
+ tcg_gen_mul_i32(a, a, b);
876
+ tcg_gen_sub_i32(d, d, a);
877
+}
878
+
879
+static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
880
+{
881
+ tcg_gen_mul_i64(a, a, b);
882
+ tcg_gen_add_i64(d, d, a);
883
+}
884
+
885
+static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
886
+{
887
+ tcg_gen_mul_i64(a, a, b);
888
+ tcg_gen_sub_i64(d, d, a);
889
+}
890
+
891
+static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
892
+{
893
+ tcg_gen_mul_vec(vece, a, a, b);
894
+ tcg_gen_add_vec(vece, d, d, a);
895
+}
896
+
897
+static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
898
+{
899
+ tcg_gen_mul_vec(vece, a, a, b);
900
+ tcg_gen_sub_vec(vece, d, d, a);
901
+}
902
+
903
+/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
904
+ * these tables are shared with AArch64 which does support them.
905
+ */
906
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
907
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
908
+{
909
+ static const TCGOpcode vecop_list[] = {
910
+ INDEX_op_mul_vec, INDEX_op_add_vec, 0
911
+ };
912
+ static const GVecGen3 ops[4] = {
913
+ { .fni4 = gen_mla8_i32,
914
+ .fniv = gen_mla_vec,
915
+ .load_dest = true,
916
+ .opt_opc = vecop_list,
917
+ .vece = MO_8 },
918
+ { .fni4 = gen_mla16_i32,
919
+ .fniv = gen_mla_vec,
920
+ .load_dest = true,
921
+ .opt_opc = vecop_list,
922
+ .vece = MO_16 },
923
+ { .fni4 = gen_mla32_i32,
924
+ .fniv = gen_mla_vec,
925
+ .load_dest = true,
926
+ .opt_opc = vecop_list,
927
+ .vece = MO_32 },
928
+ { .fni8 = gen_mla64_i64,
929
+ .fniv = gen_mla_vec,
930
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
931
+ .load_dest = true,
932
+ .opt_opc = vecop_list,
933
+ .vece = MO_64 },
934
+ };
935
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
936
+}
937
+
938
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
939
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
940
+{
941
+ static const TCGOpcode vecop_list[] = {
942
+ INDEX_op_mul_vec, INDEX_op_sub_vec, 0
943
+ };
944
+ static const GVecGen3 ops[4] = {
945
+ { .fni4 = gen_mls8_i32,
946
+ .fniv = gen_mls_vec,
947
+ .load_dest = true,
948
+ .opt_opc = vecop_list,
949
+ .vece = MO_8 },
950
+ { .fni4 = gen_mls16_i32,
951
+ .fniv = gen_mls_vec,
952
+ .load_dest = true,
953
+ .opt_opc = vecop_list,
954
+ .vece = MO_16 },
955
+ { .fni4 = gen_mls32_i32,
956
+ .fniv = gen_mls_vec,
957
+ .load_dest = true,
958
+ .opt_opc = vecop_list,
959
+ .vece = MO_32 },
960
+ { .fni8 = gen_mls64_i64,
961
+ .fniv = gen_mls_vec,
962
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
963
+ .load_dest = true,
964
+ .opt_opc = vecop_list,
965
+ .vece = MO_64 },
966
+ };
967
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
968
+}
969
+
970
+/* CMTST : test is "if (X & Y != 0)". */
971
+static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
972
+{
973
+ tcg_gen_and_i32(d, a, b);
974
+ tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
975
+}
976
+
977
+void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
978
+{
979
+ tcg_gen_and_i64(d, a, b);
980
+ tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
981
+}
982
+
983
+static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
984
+{
985
+ tcg_gen_and_vec(vece, d, a, b);
986
+ tcg_gen_dupi_vec(vece, a, 0);
987
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
988
+}
989
+
990
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
991
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
992
+{
993
+ static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
994
+ static const GVecGen3 ops[4] = {
995
+ { .fni4 = gen_helper_neon_tst_u8,
996
+ .fniv = gen_cmtst_vec,
997
+ .opt_opc = vecop_list,
998
+ .vece = MO_8 },
999
+ { .fni4 = gen_helper_neon_tst_u16,
1000
+ .fniv = gen_cmtst_vec,
1001
+ .opt_opc = vecop_list,
1002
+ .vece = MO_16 },
1003
+ { .fni4 = gen_cmtst_i32,
1004
+ .fniv = gen_cmtst_vec,
1005
+ .opt_opc = vecop_list,
1006
+ .vece = MO_32 },
1007
+ { .fni8 = gen_cmtst_i64,
1008
+ .fniv = gen_cmtst_vec,
1009
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
1010
+ .opt_opc = vecop_list,
1011
+ .vece = MO_64 },
1012
+ };
1013
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1014
+}
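
Per element this is "all ones if (a & b) != 0, else zero"; as a one-line
scalar reference (illustrative only, not part of the patch):

    #include <stdint.h>

    static uint32_t cmtst32(uint32_t a, uint32_t b)
    {
        return (a & b) != 0 ? UINT32_MAX : 0;
    }
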
1015
+
1016
+void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
1017
+{
1018
+ TCGv_i32 lval = tcg_temp_new_i32();
1019
+ TCGv_i32 rval = tcg_temp_new_i32();
1020
+ TCGv_i32 lsh = tcg_temp_new_i32();
1021
+ TCGv_i32 rsh = tcg_temp_new_i32();
1022
+ TCGv_i32 zero = tcg_constant_i32(0);
1023
+ TCGv_i32 max = tcg_constant_i32(32);
1024
+
1025
+ /*
1026
+ * Rely on the TCG guarantee that out of range shifts produce
1027
+ * unspecified results, not undefined behaviour (i.e. no trap).
1028
+ * Discard out-of-range results after the fact.
1029
+ */
1030
+ tcg_gen_ext8s_i32(lsh, shift);
1031
+ tcg_gen_neg_i32(rsh, lsh);
1032
+ tcg_gen_shl_i32(lval, src, lsh);
1033
+ tcg_gen_shr_i32(rval, src, rsh);
1034
+ tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
1035
+ tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
1036
+}
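
What this computes is the USHL rule: the shift count is the signed low byte
of the shift operand, a negative count shifts right, and any count whose
magnitude reaches the element size yields zero. A scalar reference
(illustrative only, not part of the patch):

    #include <stdint.h>

    static uint32_t ushl32(uint32_t src, int8_t sh)
    {
        if (sh <= -32 || sh >= 32) {
            return 0;                /* out-of-range shifts give zero */
        }
        return sh >= 0 ? src << sh : src >> -sh;
    }
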
1037
+
1038
+void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
1039
+{
1040
+ TCGv_i64 lval = tcg_temp_new_i64();
1041
+ TCGv_i64 rval = tcg_temp_new_i64();
1042
+ TCGv_i64 lsh = tcg_temp_new_i64();
1043
+ TCGv_i64 rsh = tcg_temp_new_i64();
1044
+ TCGv_i64 zero = tcg_constant_i64(0);
1045
+ TCGv_i64 max = tcg_constant_i64(64);
1046
+
1047
+ /*
1048
+ * Rely on the TCG guarantee that out of range shifts produce
1049
+ * unspecified results, not undefined behaviour (i.e. no trap).
1050
+ * Discard out-of-range results after the fact.
1051
+ */
1052
+ tcg_gen_ext8s_i64(lsh, shift);
1053
+ tcg_gen_neg_i64(rsh, lsh);
1054
+ tcg_gen_shl_i64(lval, src, lsh);
1055
+ tcg_gen_shr_i64(rval, src, rsh);
1056
+ tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
1057
+ tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
1058
+}
1059
+
1060
+static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
1061
+ TCGv_vec src, TCGv_vec shift)
1062
+{
1063
+ TCGv_vec lval = tcg_temp_new_vec_matching(dst);
1064
+ TCGv_vec rval = tcg_temp_new_vec_matching(dst);
1065
+ TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
1066
+ TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
1067
+ TCGv_vec msk, max;
1068
+
1069
+ tcg_gen_neg_vec(vece, rsh, shift);
1070
+ if (vece == MO_8) {
1071
+ tcg_gen_mov_vec(lsh, shift);
1072
+ } else {
1073
+ msk = tcg_temp_new_vec_matching(dst);
1074
+ tcg_gen_dupi_vec(vece, msk, 0xff);
1075
+ tcg_gen_and_vec(vece, lsh, shift, msk);
1076
+ tcg_gen_and_vec(vece, rsh, rsh, msk);
1077
+ }
1078
+
1079
+ /*
1080
+ * Rely on the TCG guarantee that out of range shifts produce
1081
+ * unspecified results, not undefined behaviour (i.e. no trap).
1082
+ * Discard out-of-range results after the fact.
1083
+ */
1084
+ tcg_gen_shlv_vec(vece, lval, src, lsh);
1085
+ tcg_gen_shrv_vec(vece, rval, src, rsh);
1086
+
1087
+ max = tcg_temp_new_vec_matching(dst);
1088
+ tcg_gen_dupi_vec(vece, max, 8 << vece);
1089
+
1090
+ /*
1091
+ * The choice of LT (signed) and GEU (unsigned) are biased toward
1092
+ * the instructions of the x86_64 host. For MO_8, the whole byte
1093
+ * is significant so we must use an unsigned compare; otherwise we
1094
+ * have already masked to a byte and so a signed compare works.
1095
+ * Other tcg hosts have a full set of comparisons and do not care.
1096
+ */
1097
+ if (vece == MO_8) {
1098
+ tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
1099
+ tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
1100
+ tcg_gen_andc_vec(vece, lval, lval, lsh);
1101
+ tcg_gen_andc_vec(vece, rval, rval, rsh);
1102
+ } else {
1103
+ tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
1104
+ tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
1105
+ tcg_gen_and_vec(vece, lval, lval, lsh);
1106
+ tcg_gen_and_vec(vece, rval, rval, rsh);
1107
+ }
1108
+ tcg_gen_or_vec(vece, dst, lval, rval);
1109
+}
1110
+
1111
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1112
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1113
+{
1114
+ static const TCGOpcode vecop_list[] = {
1115
+ INDEX_op_neg_vec, INDEX_op_shlv_vec,
1116
+ INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
1117
+ };
1118
+ static const GVecGen3 ops[4] = {
1119
+ { .fniv = gen_ushl_vec,
1120
+ .fno = gen_helper_gvec_ushl_b,
1121
+ .opt_opc = vecop_list,
1122
+ .vece = MO_8 },
1123
+ { .fniv = gen_ushl_vec,
1124
+ .fno = gen_helper_gvec_ushl_h,
1125
+ .opt_opc = vecop_list,
1126
+ .vece = MO_16 },
1127
+ { .fni4 = gen_ushl_i32,
1128
+ .fniv = gen_ushl_vec,
1129
+ .opt_opc = vecop_list,
1130
+ .vece = MO_32 },
1131
+ { .fni8 = gen_ushl_i64,
1132
+ .fniv = gen_ushl_vec,
1133
+ .opt_opc = vecop_list,
1134
+ .vece = MO_64 },
1135
+ };
1136
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1137
+}
1138
+
1139
+void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
1140
+{
1141
+ TCGv_i32 lval = tcg_temp_new_i32();
1142
+ TCGv_i32 rval = tcg_temp_new_i32();
1143
+ TCGv_i32 lsh = tcg_temp_new_i32();
1144
+ TCGv_i32 rsh = tcg_temp_new_i32();
1145
+ TCGv_i32 zero = tcg_constant_i32(0);
1146
+ TCGv_i32 max = tcg_constant_i32(31);
1147
+
1148
+ /*
1149
+ * Rely on the TCG guarantee that out of range shifts produce
1150
+ * unspecified results, not undefined behaviour (i.e. no trap).
1151
+ * Discard out-of-range results after the fact.
1152
+ */
1153
+ tcg_gen_ext8s_i32(lsh, shift);
1154
+ tcg_gen_neg_i32(rsh, lsh);
1155
+ tcg_gen_shl_i32(lval, src, lsh);
1156
+ tcg_gen_umin_i32(rsh, rsh, max);
1157
+ tcg_gen_sar_i32(rval, src, rsh);
1158
+ tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
1159
+ tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
1160
+}
1161
+
1162
+void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
1163
+{
1164
+ TCGv_i64 lval = tcg_temp_new_i64();
1165
+ TCGv_i64 rval = tcg_temp_new_i64();
1166
+ TCGv_i64 lsh = tcg_temp_new_i64();
1167
+ TCGv_i64 rsh = tcg_temp_new_i64();
1168
+ TCGv_i64 zero = tcg_constant_i64(0);
1169
+ TCGv_i64 max = tcg_constant_i64(63);
1170
+
1171
+ /*
1172
+ * Rely on the TCG guarantee that out of range shifts produce
1173
+ * unspecified results, not undefined behaviour (i.e. no trap).
1174
+ * Discard out-of-range results after the fact.
1175
+ */
1176
+ tcg_gen_ext8s_i64(lsh, shift);
1177
+ tcg_gen_neg_i64(rsh, lsh);
1178
+ tcg_gen_shl_i64(lval, src, lsh);
1179
+ tcg_gen_umin_i64(rsh, rsh, max);
1180
+ tcg_gen_sar_i64(rval, src, rsh);
1181
+ tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
1182
+ tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
1183
+}
1184
+
1185
+static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
1186
+ TCGv_vec src, TCGv_vec shift)
1187
+{
1188
+ TCGv_vec lval = tcg_temp_new_vec_matching(dst);
1189
+ TCGv_vec rval = tcg_temp_new_vec_matching(dst);
1190
+ TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
1191
+ TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
1192
+ TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
1193
+
1194
+ /*
1195
+ * Rely on the TCG guarantee that out of range shifts produce
1196
+ * unspecified results, not undefined behaviour (i.e. no trap).
1197
+ * Discard out-of-range results after the fact.
1198
+ */
1199
+ tcg_gen_neg_vec(vece, rsh, shift);
1200
+ if (vece == MO_8) {
1201
+ tcg_gen_mov_vec(lsh, shift);
1202
+ } else {
1203
+ tcg_gen_dupi_vec(vece, tmp, 0xff);
1204
+ tcg_gen_and_vec(vece, lsh, shift, tmp);
1205
+ tcg_gen_and_vec(vece, rsh, rsh, tmp);
1206
+ }
1207
+
1208
+ /* Bound rsh so out of bound right shift gets -1. */
1209
+ tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
1210
+ tcg_gen_umin_vec(vece, rsh, rsh, tmp);
1211
+ tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
1212
+
1213
+ tcg_gen_shlv_vec(vece, lval, src, lsh);
1214
+ tcg_gen_sarv_vec(vece, rval, src, rsh);
1215
+
1216
+ /* Select in-bound left shift. */
1217
+ tcg_gen_andc_vec(vece, lval, lval, tmp);
1218
+
1219
+ /* Select between left and right shift. */
1220
+ if (vece == MO_8) {
1221
+ tcg_gen_dupi_vec(vece, tmp, 0);
1222
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
1223
+ } else {
1224
+ tcg_gen_dupi_vec(vece, tmp, 0x80);
1225
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
1226
+ }
1227
+}
1228
+
1229
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1230
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1231
+{
1232
+ static const TCGOpcode vecop_list[] = {
1233
+ INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
1234
+ INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
1235
+ };
1236
+ static const GVecGen3 ops[4] = {
1237
+ { .fniv = gen_sshl_vec,
1238
+ .fno = gen_helper_gvec_sshl_b,
1239
+ .opt_opc = vecop_list,
1240
+ .vece = MO_8 },
1241
+ { .fniv = gen_sshl_vec,
1242
+ .fno = gen_helper_gvec_sshl_h,
1243
+ .opt_opc = vecop_list,
1244
+ .vece = MO_16 },
1245
+ { .fni4 = gen_sshl_i32,
1246
+ .fniv = gen_sshl_vec,
1247
+ .opt_opc = vecop_list,
1248
+ .vece = MO_32 },
1249
+ { .fni8 = gen_sshl_i64,
1250
+ .fniv = gen_sshl_vec,
1251
+ .opt_opc = vecop_list,
1252
+ .vece = MO_64 },
1253
+ };
1254
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1255
+}
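
SSHL differs from USHL only in the right-shift leg: a negative count
becomes an arithmetic shift, clamped to esize-1 so that large negative
counts return all sign bits rather than zero. A scalar reference
(illustrative only, not part of the patch):

    #include <stdint.h>

    static int32_t sshl32(int32_t src, int8_t sh8)
    {
        int sh = sh8;
        if (sh >= 32) {
            return 0;                         /* left shift out of range */
        }
        if (sh >= 0) {
            return (int32_t)((uint32_t)src << sh);
        }
        sh = (-sh > 31) ? 31 : -sh;           /* >>31 already gives sign bits */
        return src >> sh;
    }
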
1256
+
1257
+static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
1258
+ TCGv_vec a, TCGv_vec b)
1259
+{
1260
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
1261
+ tcg_gen_add_vec(vece, x, a, b);
1262
+ tcg_gen_usadd_vec(vece, t, a, b);
1263
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
1264
+ tcg_gen_or_vec(vece, sat, sat, x);
1265
+}
1266
+
1267
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1268
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1269
+{
1270
+ static const TCGOpcode vecop_list[] = {
1271
+ INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
1272
+ };
1273
+ static const GVecGen4 ops[4] = {
1274
+ { .fniv = gen_uqadd_vec,
1275
+ .fno = gen_helper_gvec_uqadd_b,
1276
+ .write_aofs = true,
1277
+ .opt_opc = vecop_list,
1278
+ .vece = MO_8 },
1279
+ { .fniv = gen_uqadd_vec,
1280
+ .fno = gen_helper_gvec_uqadd_h,
1281
+ .write_aofs = true,
1282
+ .opt_opc = vecop_list,
1283
+ .vece = MO_16 },
1284
+ { .fniv = gen_uqadd_vec,
1285
+ .fno = gen_helper_gvec_uqadd_s,
1286
+ .write_aofs = true,
1287
+ .opt_opc = vecop_list,
1288
+ .vece = MO_32 },
1289
+ { .fniv = gen_uqadd_vec,
1290
+ .fno = gen_helper_gvec_uqadd_d,
1291
+ .write_aofs = true,
1292
+ .opt_opc = vecop_list,
1293
+ .vece = MO_64 },
1294
+ };
1295
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
1296
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1297
+}
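
The QC trick used by all four saturating expanders here is the same:
compute both the wrapped and the saturated result, and accumulate any
difference into vfp.qc, so QC ends up non-zero iff some element saturated.
A scalar sketch of the idea (illustrative only, not part of the patch):

    #include <stdint.h>

    static uint8_t uqadd8(uint8_t a, uint8_t b, uint8_t *qc)
    {
        uint8_t wrapped = a + b;
        uint8_t sat = (wrapped < a) ? UINT8_MAX : wrapped; /* saturating add */
        *qc |= wrapped ^ sat;   /* non-zero iff saturation occurred */
        return sat;
    }
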
1298
+
1299
+static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
1300
+ TCGv_vec a, TCGv_vec b)
1301
+{
1302
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
1303
+ tcg_gen_add_vec(vece, x, a, b);
1304
+ tcg_gen_ssadd_vec(vece, t, a, b);
1305
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
1306
+ tcg_gen_or_vec(vece, sat, sat, x);
1307
+}
1308
+
1309
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1310
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1311
+{
1312
+ static const TCGOpcode vecop_list[] = {
1313
+ INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
1314
+ };
1315
+ static const GVecGen4 ops[4] = {
1316
+ { .fniv = gen_sqadd_vec,
1317
+ .fno = gen_helper_gvec_sqadd_b,
1318
+ .opt_opc = vecop_list,
1319
+ .write_aofs = true,
1320
+ .vece = MO_8 },
1321
+ { .fniv = gen_sqadd_vec,
1322
+ .fno = gen_helper_gvec_sqadd_h,
1323
+ .opt_opc = vecop_list,
1324
+ .write_aofs = true,
1325
+ .vece = MO_16 },
1326
+ { .fniv = gen_sqadd_vec,
1327
+ .fno = gen_helper_gvec_sqadd_s,
1328
+ .opt_opc = vecop_list,
1329
+ .write_aofs = true,
1330
+ .vece = MO_32 },
1331
+ { .fniv = gen_sqadd_vec,
1332
+ .fno = gen_helper_gvec_sqadd_d,
1333
+ .opt_opc = vecop_list,
1334
+ .write_aofs = true,
1335
+ .vece = MO_64 },
1336
+ };
1337
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
1338
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1339
+}
1340
+
1341
+static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
1342
+ TCGv_vec a, TCGv_vec b)
1343
+{
1344
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
1345
+ tcg_gen_sub_vec(vece, x, a, b);
1346
+ tcg_gen_ussub_vec(vece, t, a, b);
1347
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
1348
+ tcg_gen_or_vec(vece, sat, sat, x);
1349
+}
1350
+
1351
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1352
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1353
+{
1354
+ static const TCGOpcode vecop_list[] = {
1355
+ INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
1356
+ };
1357
+ static const GVecGen4 ops[4] = {
1358
+ { .fniv = gen_uqsub_vec,
1359
+ .fno = gen_helper_gvec_uqsub_b,
1360
+ .opt_opc = vecop_list,
1361
+ .write_aofs = true,
1362
+ .vece = MO_8 },
1363
+ { .fniv = gen_uqsub_vec,
1364
+ .fno = gen_helper_gvec_uqsub_h,
1365
+ .opt_opc = vecop_list,
1366
+ .write_aofs = true,
1367
+ .vece = MO_16 },
1368
+ { .fniv = gen_uqsub_vec,
1369
+ .fno = gen_helper_gvec_uqsub_s,
1370
+ .opt_opc = vecop_list,
1371
+ .write_aofs = true,
1372
+ .vece = MO_32 },
1373
+ { .fniv = gen_uqsub_vec,
1374
+ .fno = gen_helper_gvec_uqsub_d,
1375
+ .opt_opc = vecop_list,
1376
+ .write_aofs = true,
1377
+ .vece = MO_64 },
1378
+ };
1379
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
1380
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1381
+}
1382
+
1383
+static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
1384
+ TCGv_vec a, TCGv_vec b)
1385
+{
1386
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
1387
+ tcg_gen_sub_vec(vece, x, a, b);
1388
+ tcg_gen_sssub_vec(vece, t, a, b);
1389
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
1390
+ tcg_gen_or_vec(vece, sat, sat, x);
1391
+}
1392
+
1393
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1394
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1395
+{
1396
+ static const TCGOpcode vecop_list[] = {
1397
+ INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
1398
+ };
1399
+ static const GVecGen4 ops[4] = {
1400
+ { .fniv = gen_sqsub_vec,
1401
+ .fno = gen_helper_gvec_sqsub_b,
1402
+ .opt_opc = vecop_list,
1403
+ .write_aofs = true,
1404
+ .vece = MO_8 },
1405
+ { .fniv = gen_sqsub_vec,
1406
+ .fno = gen_helper_gvec_sqsub_h,
1407
+ .opt_opc = vecop_list,
1408
+ .write_aofs = true,
1409
+ .vece = MO_16 },
1410
+ { .fniv = gen_sqsub_vec,
1411
+ .fno = gen_helper_gvec_sqsub_s,
1412
+ .opt_opc = vecop_list,
1413
+ .write_aofs = true,
1414
+ .vece = MO_32 },
1415
+ { .fniv = gen_sqsub_vec,
1416
+ .fno = gen_helper_gvec_sqsub_d,
1417
+ .opt_opc = vecop_list,
1418
+ .write_aofs = true,
1419
+ .vece = MO_64 },
1420
+ };
1421
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
1422
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1423
+}
1424
+
1425
+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
1426
+{
1427
+ TCGv_i32 t = tcg_temp_new_i32();
1428
+
1429
+ tcg_gen_sub_i32(t, a, b);
1430
+ tcg_gen_sub_i32(d, b, a);
1431
+ tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
1432
+}
1433
+
1434
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
1435
+{
1436
+ TCGv_i64 t = tcg_temp_new_i64();
1437
+
1438
+ tcg_gen_sub_i64(t, a, b);
1439
+ tcg_gen_sub_i64(d, b, a);
1440
+ tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
1441
+}
1442
+
1443
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
1444
+{
1445
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
1446
+
1447
+ tcg_gen_smin_vec(vece, t, a, b);
1448
+ tcg_gen_smax_vec(vece, d, a, b);
1449
+ tcg_gen_sub_vec(vece, d, d, t);
1450
+}
1451
+
1452
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1453
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1454
+{
1455
+ static const TCGOpcode vecop_list[] = {
1456
+ INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
1457
+ };
1458
+ static const GVecGen3 ops[4] = {
1459
+ { .fniv = gen_sabd_vec,
1460
+ .fno = gen_helper_gvec_sabd_b,
1461
+ .opt_opc = vecop_list,
1462
+ .vece = MO_8 },
1463
+ { .fniv = gen_sabd_vec,
1464
+ .fno = gen_helper_gvec_sabd_h,
1465
+ .opt_opc = vecop_list,
1466
+ .vece = MO_16 },
1467
+ { .fni4 = gen_sabd_i32,
1468
+ .fniv = gen_sabd_vec,
1469
+ .fno = gen_helper_gvec_sabd_s,
1470
+ .opt_opc = vecop_list,
1471
+ .vece = MO_32 },
1472
+ { .fni8 = gen_sabd_i64,
1473
+ .fniv = gen_sabd_vec,
1474
+ .fno = gen_helper_gvec_sabd_d,
1475
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
1476
+ .opt_opc = vecop_list,
1477
+ .vece = MO_64 },
1478
+ };
1479
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1480
+}
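
Both expansions compute |a - b|: the scalar path via two subtractions and a
movcond, the vector path as max(a,b) - min(a,b). A scalar reference
(illustrative only, not part of the patch):

    #include <stdint.h>

    static uint32_t sabd32(int32_t a, int32_t b)
    {
        /* difference taken modulo 2^32, as the instruction does */
        return (a > b) ? (uint32_t)a - (uint32_t)b
                       : (uint32_t)b - (uint32_t)a;
    }
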
1481
+
1482
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
1483
+{
1484
+ TCGv_i32 t = tcg_temp_new_i32();
1485
+
1486
+ tcg_gen_sub_i32(t, a, b);
1487
+ tcg_gen_sub_i32(d, b, a);
1488
+ tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
1489
+}
1490
+
1491
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
1492
+{
1493
+ TCGv_i64 t = tcg_temp_new_i64();
1494
+
1495
+ tcg_gen_sub_i64(t, a, b);
1496
+ tcg_gen_sub_i64(d, b, a);
1497
+ tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
1498
+}
1499
+
1500
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
1501
+{
1502
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
1503
+
1504
+ tcg_gen_umin_vec(vece, t, a, b);
1505
+ tcg_gen_umax_vec(vece, d, a, b);
1506
+ tcg_gen_sub_vec(vece, d, d, t);
1507
+}
1508
+
1509
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1510
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1511
+{
1512
+ static const TCGOpcode vecop_list[] = {
1513
+ INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
1514
+ };
1515
+ static const GVecGen3 ops[4] = {
1516
+ { .fniv = gen_uabd_vec,
1517
+ .fno = gen_helper_gvec_uabd_b,
1518
+ .opt_opc = vecop_list,
1519
+ .vece = MO_8 },
1520
+ { .fniv = gen_uabd_vec,
1521
+ .fno = gen_helper_gvec_uabd_h,
1522
+ .opt_opc = vecop_list,
1523
+ .vece = MO_16 },
1524
+ { .fni4 = gen_uabd_i32,
1525
+ .fniv = gen_uabd_vec,
1526
+ .fno = gen_helper_gvec_uabd_s,
1527
+ .opt_opc = vecop_list,
1528
+ .vece = MO_32 },
1529
+ { .fni8 = gen_uabd_i64,
1530
+ .fniv = gen_uabd_vec,
1531
+ .fno = gen_helper_gvec_uabd_d,
1532
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
1533
+ .opt_opc = vecop_list,
1534
+ .vece = MO_64 },
1535
+ };
1536
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
1537
+}
1538
+
1539
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
1540
+{
1541
+ TCGv_i32 t = tcg_temp_new_i32();
1542
+ gen_sabd_i32(t, a, b);
1543
+ tcg_gen_add_i32(d, d, t);
1544
+}
1545
+
1546
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
1547
+{
1548
+ TCGv_i64 t = tcg_temp_new_i64();
1549
+ gen_sabd_i64(t, a, b);
1550
+ tcg_gen_add_i64(d, d, t);
1551
+}
1552
+
1553
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
1554
+{
1555
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
1556
+ gen_sabd_vec(vece, t, a, b);
1557
+ tcg_gen_add_vec(vece, d, d, t);
1558
+}
1559
+
1560
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
1561
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
1562
+{
1563
+ static const TCGOpcode vecop_list[] = {
1564
+ INDEX_op_sub_vec, INDEX_op_add_vec,
1565
+ INDEX_op_smin_vec, INDEX_op_smax_vec, 0
1566
+ };
1567
+ static const GVecGen3 ops[4] = {
1568
+ { .fniv = gen_saba_vec,
1569
+ .fno = gen_helper_gvec_saba_b,
1570
+ .opt_opc = vecop_list,
1571
+ .load_dest = true,
1572
+ .vece = MO_8 },
1573
+ { .fniv = gen_saba_vec,
1574
+ .fno = gen_helper_gvec_saba_h,
1575
+ .opt_opc = vecop_list,
1576
+ .load_dest = true,
1577
+ .vece = MO_16 },
1578
+ { .fni4 = gen_saba_i32,
+ .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_s,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32 },
+ { .fni8 = gen_saba_i64,
+ .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+ gen_uabd_i32(t, a, b);
+ tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+ gen_uabd_i64(t, a, b);
+ tcg_gen_add_i64(d, d, t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+ gen_uabd_vec(vece, t, a, b);
+ tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sub_vec, INDEX_op_add_vec,
+ INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_b,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_8 },
+ { .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_h,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_16 },
+ { .fni4 = gen_uaba_i32,
+ .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_s,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32 },
+ { .fni8 = gen_uaba_i64,
+ .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
 gen_rfe(s, pc, load_cpu_field(spsr));
 }

-static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
- uint32_t opr_sz, uint32_t max_sz,
- gen_helper_gvec_3_ptr *fn)
-{
- TCGv_ptr qc_ptr = tcg_temp_new_ptr();
-
- tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc));
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
- opr_sz, max_sz, 0, fn);
-}
-
-void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static gen_helper_gvec_3_ptr * const fns[2] = {
- gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
- };
- tcg_debug_assert(vece >= 1 && vece <= 2);
- gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-}
-
-void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static gen_helper_gvec_3_ptr * const fns[2] = {
- gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
- };
- tcg_debug_assert(vece >= 1 && vece <= 2);
- gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-}
-
-#define GEN_CMP0(NAME, COND) \
- void NAME(unsigned vece, uint32_t d, uint32_t m, \
- uint32_t opr_sz, uint32_t max_sz) \
- { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); }
-
-GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ)
-GEN_CMP0(gen_gvec_cle0, TCG_COND_LE)
-GEN_CMP0(gen_gvec_cge0, TCG_COND_GE)
-GEN_CMP0(gen_gvec_clt0, TCG_COND_LT)
-GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT)
-
-#undef GEN_CMP0
-
-static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_sar8i_i64(a, a, shift);
- tcg_gen_vec_add8_i64(d, d, a);
-}
-
-static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_sar16i_i64(a, a, shift);
- tcg_gen_vec_add16_i64(d, d, a);
-}
-
-static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_sari_i32(a, a, shift);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_sari_i64(a, a, shift);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- tcg_gen_sari_vec(vece, a, a, sh);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_ssra8_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_ssra16_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_ssra32_i32,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_ssra64_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits.
- */
- shift = MIN(shift, (8 << vece) - 1);
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
-}
-
-static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_shr8i_i64(a, a, shift);
- tcg_gen_vec_add8_i64(d, d, a);
-}
-
-static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_shr16i_i64(a, a, shift);
- tcg_gen_vec_add16_i64(d, d, a);
-}
-
-static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_shri_i32(a, a, shift);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_shri_i64(a, a, shift);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- tcg_gen_shri_vec(vece, a, a, sh);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_usra8_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8, },
- { .fni8 = gen_usra16_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16, },
- { .fni4 = gen_usra32_i32,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32, },
- { .fni8 = gen_usra64_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64, },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Unsigned results in all zeros as input to accumulate: nop.
- */
- if (shift < (8 << vece)) {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- } else {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- }
-}
-
-/*
- * Shift one less than the requested amount, and the low bit is
- * the rounding bit. For the 8 and 16-bit operations, because we
- * mask the low bit, we can perform a normal integer shift instead
- * of a vector shift.
- */
-static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
- tcg_gen_vec_sar8i_i64(d, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
- tcg_gen_vec_sar16i_i64(d, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t;
-
- /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
- if (sh == 32) {
- tcg_gen_movi_i32(d, 0);
- return;
- }
- t = tcg_temp_new_i32();
- tcg_gen_extract_i32(t, a, sh - 1, 1);
- tcg_gen_sari_i32(d, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_extract_i64(t, a, sh - 1, 1);
- tcg_gen_sari_i64(d, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shri_vec(vece, t, a, sh - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
- tcg_gen_and_vec(vece, t, t, ones);
- tcg_gen_sari_vec(vece, d, a, sh);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_srshr8_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_srshr16_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_srshr32_i32,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_srshr64_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- if (shift == (8 << vece)) {
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits. With rounding, this produces
- * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
- * I.e. always zero.
- */
- tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr8_i64(t, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr16_i64(t, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- gen_srshr32_i32(t, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr64_i64(t, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- gen_srshr_vec(vece, t, a, sh);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_srsra8_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fni8 = gen_srsra16_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_srsra32_i32,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_srsra64_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits. With rounding, this produces
- * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
- * I.e. always zero. With accumulation, this leaves D unchanged.
- */
- if (shift == (8 << vece)) {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
- tcg_gen_vec_shr8i_i64(d, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
- tcg_gen_vec_shr16i_i64(d, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t;
-
- /* Handle shift by the input size for the benefit of trans_URSHR_ri */
- if (sh == 32) {
- tcg_gen_extract_i32(d, a, sh - 1, 1);
- return;
- }
- t = tcg_temp_new_i32();
- tcg_gen_extract_i32(t, a, sh - 1, 1);
- tcg_gen_shri_i32(d, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_extract_i64(t, a, sh - 1, 1);
- tcg_gen_shri_i64(d, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shri_vec(vece, t, a, shift - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
- tcg_gen_and_vec(vece, t, t, ones);
- tcg_gen_shri_vec(vece, d, a, shift);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_urshr8_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_urshr16_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_urshr32_i32,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_urshr64_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- if (shift == (8 << vece)) {
- /*
- * Shifts larger than the element size are architecturally valid.
- * Unsigned results in zero. With rounding, this produces a
- * copy of the most significant bit.
- */
- tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 8) {
- tcg_gen_vec_shr8i_i64(t, a, 7);
- } else {
- gen_urshr8_i64(t, a, sh);
- }
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 16) {
- tcg_gen_vec_shr16i_i64(t, a, 15);
- } else {
- gen_urshr16_i64(t, a, sh);
- }
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- if (sh == 32) {
- tcg_gen_shri_i32(t, a, 31);
- } else {
- gen_urshr32_i32(t, a, sh);
- }
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 64) {
- tcg_gen_shri_i64(t, a, 63);
- } else {
- gen_urshr64_i64(t, a, sh);
- }
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- if (sh == (8 << vece)) {
- tcg_gen_shri_vec(vece, t, a, sh - 1);
- } else {
- gen_urshr_vec(vece, t, a, sh);
- }
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_ursra8_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fni8 = gen_ursra16_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_ursra32_i32,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_ursra64_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
-}
-
-static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_8, 0xff >> shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_16, 0xffff >> shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_shri_i32(a, a, shift);
- tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
-}
-
-static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_shri_i64(a, a, shift);
- tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
-}
-
-static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
-
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
- tcg_gen_shri_vec(vece, t, a, sh);
- tcg_gen_and_vec(vece, d, d, m);
- tcg_gen_or_vec(vece, d, d, t);
-}
-
-void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
- const GVecGen2i ops[4] = {
- { .fni8 = gen_shr8_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_shr16_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_shr32_ins_i32,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_shr64_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /* Shift of esize leaves destination unchanged. */
- if (shift < (8 << vece)) {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- } else {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- }
-}
-
-static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_8, 0xff << shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shli_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_16, 0xffff << shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shli_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
-}
-
-static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
-}
-
-static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shli_vec(vece, t, a, sh);
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
- tcg_gen_and_vec(vece, d, d, m);
- tcg_gen_or_vec(vece, d, d, t);
-}
-
-void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
- const GVecGen2i ops[4] = {
- { .fni8 = gen_shl8_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_shl16_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_shl32_ins_i32,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_shl64_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [0..esize-1]. */
- tcg_debug_assert(shift >= 0);
- tcg_debug_assert(shift < (8 << vece));
-
- if (shift == 0) {
- tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u8(a, a, b);
- gen_helper_neon_add_u8(d, d, a);
-}
-
-static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u8(a, a, b);
- gen_helper_neon_sub_u8(d, d, a);
-}
-
-static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u16(a, a, b);
- gen_helper_neon_add_u16(d, d, a);
-}
-
-static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u16(a, a, b);
- gen_helper_neon_sub_u16(d, d, a);
-}
-
-static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_mul_i32(a, a, b);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_mul_i32(a, a, b);
- tcg_gen_sub_i32(d, d, a);
-}
-
-static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_mul_i64(a, a, b);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_mul_i64(a, a, b);
- tcg_gen_sub_i64(d, d, a);
-}
-
-static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_mul_vec(vece, a, a, b);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_mul_vec(vece, a, a, b);
- tcg_gen_sub_vec(vece, d, d, a);
-}
-
-/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
- * these tables are shared with AArch64 which does support them.
- */
-void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_mul_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_mla8_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_mla16_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_mla32_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_mla64_i64,
- .fniv = gen_mla_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_mul_vec, INDEX_op_sub_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_mls8_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_mls16_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_mls32_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_mls64_i64,
- .fniv = gen_mls_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-/* CMTST : test is "if (X & Y != 0)". */
-static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_and_i32(d, a, b);
- tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
-}
-
-void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_and_i64(d, a, b);
- tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
-}
-
-static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_and_vec(vece, d, a, b);
- tcg_gen_dupi_vec(vece, a, 0);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
-}
-
-void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_helper_neon_tst_u8,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_helper_neon_tst_u16,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_cmtst_i32,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_cmtst_i64,
- .fniv = gen_cmtst_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-{
- TCGv_i32 lval = tcg_temp_new_i32();
- TCGv_i32 rval = tcg_temp_new_i32();
- TCGv_i32 lsh = tcg_temp_new_i32();
- TCGv_i32 rsh = tcg_temp_new_i32();
- TCGv_i32 zero = tcg_constant_i32(0);
- TCGv_i32 max = tcg_constant_i32(32);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i32(lsh, shift);
- tcg_gen_neg_i32(rsh, lsh);
- tcg_gen_shl_i32(lval, src, lsh);
- tcg_gen_shr_i32(rval, src, rsh);
- tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
- tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
-}
-
-void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
-{
- TCGv_i64 lval = tcg_temp_new_i64();
- TCGv_i64 rval = tcg_temp_new_i64();
- TCGv_i64 lsh = tcg_temp_new_i64();
- TCGv_i64 rsh = tcg_temp_new_i64();
- TCGv_i64 zero = tcg_constant_i64(0);
- TCGv_i64 max = tcg_constant_i64(64);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i64(lsh, shift);
- tcg_gen_neg_i64(rsh, lsh);
- tcg_gen_shl_i64(lval, src, lsh);
- tcg_gen_shr_i64(rval, src, rsh);
- tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
- tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
-}
-
-static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
- TCGv_vec src, TCGv_vec shift)
-{
- TCGv_vec lval = tcg_temp_new_vec_matching(dst);
- TCGv_vec rval = tcg_temp_new_vec_matching(dst);
- TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec msk, max;
-
- tcg_gen_neg_vec(vece, rsh, shift);
- if (vece == MO_8) {
- tcg_gen_mov_vec(lsh, shift);
- } else {
- msk = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, msk, 0xff);
- tcg_gen_and_vec(vece, lsh, shift, msk);
- tcg_gen_and_vec(vece, rsh, rsh, msk);
- }
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_shlv_vec(vece, lval, src, lsh);
- tcg_gen_shrv_vec(vece, rval, src, rsh);
-
- max = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, max, 8 << vece);
-
- /*
- * The choice of LT (signed) and GEU (unsigned) are biased toward
- * the instructions of the x86_64 host. For MO_8, the whole byte
- * is significant so we must use an unsigned compare; otherwise we
- * have already masked to a byte and so a signed compare works.
- * Other tcg hosts have a full set of comparisons and do not care.
- */
- if (vece == MO_8) {
- tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
- tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
- tcg_gen_andc_vec(vece, lval, lval, lsh);
- tcg_gen_andc_vec(vece, rval, rval, rsh);
- } else {
- tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
- tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
- tcg_gen_and_vec(vece, lval, lval, lsh);
- tcg_gen_and_vec(vece, rval, rval, rsh);
- }
- tcg_gen_or_vec(vece, dst, lval, rval);
-}
-
-void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_neg_vec, INDEX_op_shlv_vec,
- INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_ushl_vec,
- .fno = gen_helper_gvec_ushl_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_ushl_vec,
- .fno = gen_helper_gvec_ushl_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_ushl_i32,
- .fniv = gen_ushl_vec,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_ushl_i64,
- .fniv = gen_ushl_vec,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-{
- TCGv_i32 lval = tcg_temp_new_i32();
- TCGv_i32 rval = tcg_temp_new_i32();
- TCGv_i32 lsh = tcg_temp_new_i32();
- TCGv_i32 rsh = tcg_temp_new_i32();
- TCGv_i32 zero = tcg_constant_i32(0);
- TCGv_i32 max = tcg_constant_i32(31);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i32(lsh, shift);
- tcg_gen_neg_i32(rsh, lsh);
- tcg_gen_shl_i32(lval, src, lsh);
- tcg_gen_umin_i32(rsh, rsh, max);
- tcg_gen_sar_i32(rval, src, rsh);
- tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
- tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
-}
-
-void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
-{
- TCGv_i64 lval = tcg_temp_new_i64();
- TCGv_i64 rval = tcg_temp_new_i64();
- TCGv_i64 lsh = tcg_temp_new_i64();
- TCGv_i64 rsh = tcg_temp_new_i64();
- TCGv_i64 zero = tcg_constant_i64(0);
- TCGv_i64 max = tcg_constant_i64(63);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i64(lsh, shift);
- tcg_gen_neg_i64(rsh, lsh);
- tcg_gen_shl_i64(lval, src, lsh);
- tcg_gen_umin_i64(rsh, rsh, max);
- tcg_gen_sar_i64(rval, src, rsh);
- tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
- tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
-}
-
-static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
- TCGv_vec src, TCGv_vec shift)
-{
- TCGv_vec lval = tcg_temp_new_vec_matching(dst);
- TCGv_vec rval = tcg_temp_new_vec_matching(dst);
- TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_neg_vec(vece, rsh, shift);
- if (vece == MO_8) {
- tcg_gen_mov_vec(lsh, shift);
- } else {
- tcg_gen_dupi_vec(vece, tmp, 0xff);
- tcg_gen_and_vec(vece, lsh, shift, tmp);
- tcg_gen_and_vec(vece, rsh, rsh, tmp);
- }
-
- /* Bound rsh so out of bound right shift gets -1. */
- tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
- tcg_gen_umin_vec(vece, rsh, rsh, tmp);
- tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
-
- tcg_gen_shlv_vec(vece, lval, src, lsh);
- tcg_gen_sarv_vec(vece, rval, src, rsh);
-
- /* Select in-bound left shift. */
- tcg_gen_andc_vec(vece, lval, lval, tmp);
-
- /* Select between left and right shift. */
- if (vece == MO_8) {
- tcg_gen_dupi_vec(vece, tmp, 0);
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
- } else {
- tcg_gen_dupi_vec(vece, tmp, 0x80);
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
- }
-}
-
-void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
- INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_sshl_vec,
- .fno = gen_helper_gvec_sshl_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_sshl_vec,
- .fno = gen_helper_gvec_sshl_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_sshl_i32,
- .fniv = gen_sshl_vec,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_sshl_i64,
- .fniv = gen_sshl_vec,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec x = tcg_temp_new_vec_matching(t);
- tcg_gen_add_vec(vece, x, a, b);
- tcg_gen_usadd_vec(vece, t, a, b);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
- tcg_gen_or_vec(vece, sat, sat, x);
-}
-
-void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen4 ops[4] = {
- { .fniv = gen_uqadd_vec,
- .fno = gen_helper_gvec_uqadd_b,
- .write_aofs = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_uqadd_vec,
- .fno = gen_helper_gvec_uqadd_h,
- .write_aofs = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fniv = gen_uqadd_vec,
- .fno = gen_helper_gvec_uqadd_s,
- .write_aofs = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fniv = gen_uqadd_vec,
- .fno = gen_helper_gvec_uqadd_d,
- .write_aofs = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec x = tcg_temp_new_vec_matching(t);
- tcg_gen_add_vec(vece, x, a, b);
- tcg_gen_ssadd_vec(vece, t, a, b);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
- tcg_gen_or_vec(vece, sat, sat, x);
-}
-
-void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen4 ops[4] = {
- { .fniv = gen_sqadd_vec,
- .fno = gen_helper_gvec_sqadd_b,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_8 },
- { .fniv = gen_sqadd_vec,
- .fno = gen_helper_gvec_sqadd_h,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_16 },
- { .fniv = gen_sqadd_vec,
- .fno = gen_helper_gvec_sqadd_s,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_32 },
- { .fniv = gen_sqadd_vec,
- .fno = gen_helper_gvec_sqadd_d,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_64 },
- };
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec x = tcg_temp_new_vec_matching(t);
- tcg_gen_sub_vec(vece, x, a, b);
- tcg_gen_ussub_vec(vece, t, a, b);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
- tcg_gen_or_vec(vece, sat, sat, x);
-}
-
-void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
- };
- static const GVecGen4 ops[4] = {
- { .fniv = gen_uqsub_vec,
- .fno = gen_helper_gvec_uqsub_b,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_8 },
- { .fniv = gen_uqsub_vec,
- .fno = gen_helper_gvec_uqsub_h,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_16 },
- { .fniv = gen_uqsub_vec,
- .fno = gen_helper_gvec_uqsub_s,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_32 },
- { .fniv = gen_uqsub_vec,
- .fno = gen_helper_gvec_uqsub_d,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_64 },
- };
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec x = tcg_temp_new_vec_matching(t);
- tcg_gen_sub_vec(vece, x, a, b);
- tcg_gen_sssub_vec(vece, t, a, b);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
- tcg_gen_or_vec(vece, sat, sat, x);
-}
-
-void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
- };
- static const GVecGen4 ops[4] = {
- { .fniv = gen_sqsub_vec,
- .fno = gen_helper_gvec_sqsub_b,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_8 },
- { .fniv = gen_sqsub_vec,
- .fno = gen_helper_gvec_sqsub_h,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_16 },
- { .fniv = gen_sqsub_vec,
- .fno = gen_helper_gvec_sqsub_s,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_32 },
- { .fniv = gen_sqsub_vec,
- .fno = gen_helper_gvec_sqsub_d,
- .opt_opc = vecop_list,
- .write_aofs = true,
- .vece = MO_64 },
- };
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- tcg_gen_sub_i32(t, a, b);
- tcg_gen_sub_i32(d, b, a);
- tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
-}
-
-static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_sub_i64(t, a, b);
- tcg_gen_sub_i64(d, b, a);
- tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
-}
-
-static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- tcg_gen_smin_vec(vece, t, a, b);
- tcg_gen_smax_vec(vece, d, a, b);
- tcg_gen_sub_vec(vece, d, d, t);
-}
-
-void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_sabd_vec,
- .fno = gen_helper_gvec_sabd_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_sabd_vec,
- .fno = gen_helper_gvec_sabd_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_sabd_i32,
- .fniv = gen_sabd_vec,
- .fno = gen_helper_gvec_sabd_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_sabd_i64,
- .fniv = gen_sabd_vec,
- .fno = gen_helper_gvec_sabd_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- tcg_gen_sub_i32(t, a, b);
- tcg_gen_sub_i32(d, b, a);
- tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
-}
-
-static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_sub_i64(t, a, b);
- tcg_gen_sub_i64(d, b, a);
- tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
-}
-
-static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- tcg_gen_umin_vec(vece, t, a, b);
- tcg_gen_umax_vec(vece, d, a, b);
- tcg_gen_sub_vec(vece, d, d, t);
-}
-
-void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_uabd_vec,
- .fno = gen_helper_gvec_uabd_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_uabd_vec,
- .fno = gen_helper_gvec_uabd_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_uabd_i32,
- .fniv = gen_uabd_vec,
- .fno = gen_helper_gvec_uabd_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_uabd_i64,
- .fniv = gen_uabd_vec,
- .fno = gen_helper_gvec_uabd_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- TCGv_i32 t = tcg_temp_new_i32();
- gen_sabd_i32(t, a, b);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- TCGv_i64 t = tcg_temp_new_i64();
- gen_sabd_i64(t, a, b);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- gen_sabd_vec(vece, t, a, b);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sub_vec, INDEX_op_add_vec,
- INDEX_op_smin_vec, INDEX_op_smax_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_saba_vec,
- .fno = gen_helper_gvec_saba_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fniv = gen_saba_vec,
- .fno = gen_helper_gvec_saba_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_saba_i32,
- .fniv = gen_saba_vec,
- .fno = gen_helper_gvec_saba_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_saba_i64,
- .fniv = gen_saba_vec,
- .fno = gen_helper_gvec_saba_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- TCGv_i32 t = tcg_temp_new_i32();
- gen_uabd_i32(t, a, b);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- TCGv_i64 t = tcg_temp_new_i64();
- gen_uabd_i64(t, a, b);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- gen_uabd_vec(vece, t, a, b);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sub_vec, INDEX_op_add_vec,
- INDEX_op_umin_vec, INDEX_op_umax_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_uaba_vec,
- .fno = gen_helper_gvec_uaba_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fniv = gen_uaba_vec,
- .fno = gen_helper_gvec_uaba_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_uaba_i32,
- .fniv = gen_uaba_vec,
- .fno = gen_helper_gvec_uaba_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_uaba_i64,
- .fniv = gen_uaba_vec,
- .fno = gen_helper_gvec_uaba_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
 static bool aa32_cpreg_encoding_in_impdef_space(uint8_t crn, uint8_t crm)
 {
 static const uint16_t mask[3] = {
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: gen_a64)

 arm_ss.add(files(
 'cpu32.c',
+ 'gengvec.c',
 'translate.c',
 'translate-m-nocp.c',
 'translate-mve.c',
--
2.34.1

1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Stripping out the authentication data does not require any crypto,
3
Split some routines out of translate-a64.c and translate-sve.c
4
it merely requires the virtual address parameters.
4
that are used by both.
5
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20190108223129.5570-25-richard.henderson@linaro.org
9
Message-id: 20240524232121.284515-9-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
---
11
target/arm/pauth_helper.c | 14 +++++++++++++-
12
target/arm/tcg/translate-a64.h | 4 +
12
1 file changed, 13 insertions(+), 1 deletion(-)
13
target/arm/tcg/gengvec64.c | 190 +++++++++++++++++++++++++++++++++
14
target/arm/tcg/translate-a64.c | 26 -----
15
target/arm/tcg/translate-sve.c | 145 +------------------------
16
target/arm/tcg/meson.build | 1 +
17
5 files changed, 197 insertions(+), 169 deletions(-)
18
create mode 100644 target/arm/tcg/gengvec64.c
13
19
14
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
20
diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
15
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/pauth_helper.c
22
--- a/target/arm/tcg/translate-a64.h
17
+++ b/target/arm/pauth_helper.c
23
+++ b/target/arm/tcg/translate-a64.h
18
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
24
@@ -XXX,XX +XXX,XX @@ void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
19
g_assert_not_reached(); /* FIXME */
25
void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
26
uint32_t rm_ofs, int64_t shift,
27
uint32_t opr_sz, uint32_t max_sz);
28
+void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
29
+ uint32_t a, uint32_t oprsz, uint32_t maxsz);
30
+void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
31
+ uint32_t a, uint32_t oprsz, uint32_t maxsz);
32
33
void gen_sve_ldr(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
34
void gen_sve_str(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
35
diff --git a/target/arm/tcg/gengvec64.c b/target/arm/tcg/gengvec64.c
36
new file mode 100644
37
index XXXXXXX..XXXXXXX
38
--- /dev/null
39
+++ b/target/arm/tcg/gengvec64.c
40
@@ -XXX,XX +XXX,XX @@
41
+/*
42
+ * AArch64 generic vector expansion
43
+ *
44
+ * Copyright (c) 2013 Alexander Graf <agraf@suse.de>
45
+ *
46
+ * This library is free software; you can redistribute it and/or
47
+ * modify it under the terms of the GNU Lesser General Public
48
+ * License as published by the Free Software Foundation; either
49
+ * version 2.1 of the License, or (at your option) any later version.
50
+ *
51
+ * This library is distributed in the hope that it will be useful,
52
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
53
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
54
+ * Lesser General Public License for more details.
55
+ *
56
+ * You should have received a copy of the GNU Lesser General Public
57
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
58
+ */
59
+
60
+#include "qemu/osdep.h"
61
+#include "translate.h"
62
+#include "translate-a64.h"
63
+
64
+
65
+static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
66
+{
67
+ tcg_gen_rotli_i64(d, m, 1);
68
+ tcg_gen_xor_i64(d, d, n);
69
+}
70
+
71
+static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m)
72
+{
73
+ tcg_gen_rotli_vec(vece, d, m, 1);
74
+ tcg_gen_xor_vec(vece, d, d, n);
75
+}
76
+
77
+void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
78
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
79
+{
80
+ static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 };
81
+ static const GVecGen3 op = {
82
+ .fni8 = gen_rax1_i64,
83
+ .fniv = gen_rax1_vec,
84
+ .opt_opc = vecop_list,
85
+ .fno = gen_helper_crypto_rax1,
86
+ .vece = MO_64,
87
+ };
88
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op);
89
+}
90
+
91
+static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
92
+{
93
+ TCGv_i64 t = tcg_temp_new_i64();
94
+ uint64_t mask = dup_const(MO_8, 0xff >> sh);
95
+
96
+ tcg_gen_xor_i64(t, n, m);
97
+ tcg_gen_shri_i64(d, t, sh);
98
+ tcg_gen_shli_i64(t, t, 8 - sh);
99
+ tcg_gen_andi_i64(d, d, mask);
100
+ tcg_gen_andi_i64(t, t, ~mask);
101
+ tcg_gen_or_i64(d, d, t);
102
+}
103
+
104
+static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
105
+{
106
+ TCGv_i64 t = tcg_temp_new_i64();
107
+ uint64_t mask = dup_const(MO_16, 0xffff >> sh);
108
+
109
+ tcg_gen_xor_i64(t, n, m);
110
+ tcg_gen_shri_i64(d, t, sh);
111
+ tcg_gen_shli_i64(t, t, 16 - sh);
112
+ tcg_gen_andi_i64(d, d, mask);
113
+ tcg_gen_andi_i64(t, t, ~mask);
114
+ tcg_gen_or_i64(d, d, t);
115
+}
116
+
117
+static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh)
118
+{
119
+ tcg_gen_xor_i32(d, n, m);
120
+ tcg_gen_rotri_i32(d, d, sh);
121
+}
122
+
123
+static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
124
+{
125
+ tcg_gen_xor_i64(d, n, m);
126
+ tcg_gen_rotri_i64(d, d, sh);
127
+}
128
+
129
+static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
130
+ TCGv_vec m, int64_t sh)
131
+{
132
+ tcg_gen_xor_vec(vece, d, n, m);
133
+ tcg_gen_rotri_vec(vece, d, d, sh);
134
+}
135
+
136
+void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
137
+ uint32_t rm_ofs, int64_t shift,
138
+ uint32_t opr_sz, uint32_t max_sz)
139
+{
140
+ static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 };
141
+ static const GVecGen3i ops[4] = {
142
+ { .fni8 = gen_xar8_i64,
143
+ .fniv = gen_xar_vec,
144
+ .fno = gen_helper_sve2_xar_b,
145
+ .opt_opc = vecop,
146
+ .vece = MO_8 },
147
+ { .fni8 = gen_xar16_i64,
148
+ .fniv = gen_xar_vec,
149
+ .fno = gen_helper_sve2_xar_h,
150
+ .opt_opc = vecop,
151
+ .vece = MO_16 },
152
+ { .fni4 = gen_xar_i32,
153
+ .fniv = gen_xar_vec,
154
+ .fno = gen_helper_sve2_xar_s,
155
+ .opt_opc = vecop,
156
+ .vece = MO_32 },
157
+ { .fni8 = gen_xar_i64,
158
+ .fniv = gen_xar_vec,
159
+ .fno = gen_helper_gvec_xar_d,
160
+ .opt_opc = vecop,
161
+ .vece = MO_64 }
162
+ };
163
+ int esize = 8 << vece;
164
+
165
+ /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */
166
+ tcg_debug_assert(shift >= 0);
167
+ tcg_debug_assert(shift <= esize);
168
+ shift &= esize - 1;
169
+
170
+ if (shift == 0) {
171
+ /* xar with no rotate devolves to xor. */
172
+ tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz);
173
+ } else {
174
+ tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz,
175
+ shift, &ops[vece]);
176
+ }
177
+}
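For reference, a minimal standalone sketch (illustrative, not patch code) of the
byte-size case that gen_xar8_i64() implements: XOR the inputs, then rotate each
8-bit lane right by sh, assuming 1 <= sh <= 7.

#include <stdint.h>

/* Rotate each byte of (n ^ m) right by sh; the mask mirrors
 * dup_const(MO_8, 0xff >> sh) in the patch. */
static uint64_t xar8_sketch(uint64_t n, uint64_t m, unsigned sh)
{
    uint64_t t = n ^ m;
    uint64_t mask = 0x0101010101010101ull * (0xffu >> sh);
    return ((t >> sh) & mask) | ((t << (8 - sh)) & ~mask);
}

The mask keeps the bits that stayed inside a lane after the right shift; ~mask
keeps the bits that wrapped around from the left shift.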
178
+
179
+static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
180
+{
181
+ tcg_gen_xor_i64(d, n, m);
182
+ tcg_gen_xor_i64(d, d, k);
183
+}
184
+
185
+static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
186
+ TCGv_vec m, TCGv_vec k)
187
+{
188
+ tcg_gen_xor_vec(vece, d, n, m);
189
+ tcg_gen_xor_vec(vece, d, d, k);
190
+}
191
+
192
+void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
193
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
194
+{
195
+ static const GVecGen4 op = {
196
+ .fni8 = gen_eor3_i64,
197
+ .fniv = gen_eor3_vec,
198
+ .fno = gen_helper_sve2_eor3,
199
+ .vece = MO_64,
200
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
201
+ };
202
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
203
+}
204
+
205
+static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
206
+{
207
+ tcg_gen_andc_i64(d, m, k);
208
+ tcg_gen_xor_i64(d, d, n);
209
+}
210
+
211
+static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
212
+ TCGv_vec m, TCGv_vec k)
213
+{
214
+ tcg_gen_andc_vec(vece, d, m, k);
215
+ tcg_gen_xor_vec(vece, d, d, n);
216
+}
217
+
218
+void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
219
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
220
+{
221
+ static const GVecGen4 op = {
222
+ .fni8 = gen_bcax_i64,
223
+ .fniv = gen_bcax_vec,
224
+ .fno = gen_helper_sve2_bcax,
225
+ .vece = MO_64,
226
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
227
+ };
228
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
229
+}
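As a sanity check, the scalar semantics of the two expanders above, written
standalone (illustrative names, not patch code):

#include <stdint.h>

/* EOR3: three-way exclusive OR. */
static uint64_t eor3_sketch(uint64_t n, uint64_t m, uint64_t k)
{
    return n ^ m ^ k;
}

/* BCAX: bit clear and XOR, i.e. n ^ (m & ~k), matching andc-then-xor. */
static uint64_t bcax_sketch(uint64_t n, uint64_t m, uint64_t k)
{
    return n ^ (m & ~k);
}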
230
+
231
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
232
index XXXXXXX..XXXXXXX 100644
233
--- a/target/arm/tcg/translate-a64.c
234
+++ b/target/arm/tcg/translate-a64.c
235
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
236
gen_gvec_op2_ool(s, true, rd, rn, 0, genfn);
20
}
237
}
21
238
22
+static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
239
-static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
23
+{
240
-{
24
+ uint64_t extfield = -param.select;
241
- tcg_gen_rotli_i64(d, m, 1);
25
+ int bot_pac_bit = 64 - param.tsz;
242
- tcg_gen_xor_i64(d, d, n);
26
+ int top_pac_bit = 64 - 8 * param.tbi;
243
-}
27
+
244
-
28
+ return deposit64(ptr, bot_pac_bit, top_pac_bit - bot_pac_bit, extfield);
245
-static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m)
29
+}
246
-{
30
+
247
- tcg_gen_rotli_vec(vece, d, m, 1);
31
static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
248
- tcg_gen_xor_vec(vece, d, d, n);
32
ARMPACKey *key, bool data, int keynumber)
249
-}
250
-
251
-void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
252
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
253
-{
254
- static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 };
255
- static const GVecGen3 op = {
256
- .fni8 = gen_rax1_i64,
257
- .fniv = gen_rax1_vec,
258
- .opt_opc = vecop_list,
259
- .fno = gen_helper_crypto_rax1,
260
- .vece = MO_64,
261
- };
262
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op);
263
-}
264
-
265
/* Crypto three-reg SHA512
266
* 31 21 20 16 15 14 13 12 11 10 9 5 4 0
267
* +-----------------------+------+---+---+-----+--------+------+------+
268
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
269
index XXXXXXX..XXXXXXX 100644
270
--- a/target/arm/tcg/translate-sve.c
271
+++ b/target/arm/tcg/translate-sve.c
272
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(ORR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_or, a)
273
TRANS_FEAT(EOR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_xor, a)
274
TRANS_FEAT(BIC_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_andc, a)
275
276
-static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
277
-{
278
- TCGv_i64 t = tcg_temp_new_i64();
279
- uint64_t mask = dup_const(MO_8, 0xff >> sh);
280
-
281
- tcg_gen_xor_i64(t, n, m);
282
- tcg_gen_shri_i64(d, t, sh);
283
- tcg_gen_shli_i64(t, t, 8 - sh);
284
- tcg_gen_andi_i64(d, d, mask);
285
- tcg_gen_andi_i64(t, t, ~mask);
286
- tcg_gen_or_i64(d, d, t);
287
-}
288
-
289
-static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
290
-{
291
- TCGv_i64 t = tcg_temp_new_i64();
292
- uint64_t mask = dup_const(MO_16, 0xffff >> sh);
293
-
294
- tcg_gen_xor_i64(t, n, m);
295
- tcg_gen_shri_i64(d, t, sh);
296
- tcg_gen_shli_i64(t, t, 16 - sh);
297
- tcg_gen_andi_i64(d, d, mask);
298
- tcg_gen_andi_i64(t, t, ~mask);
299
- tcg_gen_or_i64(d, d, t);
300
-}
301
-
302
-static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh)
303
-{
304
- tcg_gen_xor_i32(d, n, m);
305
- tcg_gen_rotri_i32(d, d, sh);
306
-}
307
-
308
-static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
309
-{
310
- tcg_gen_xor_i64(d, n, m);
311
- tcg_gen_rotri_i64(d, d, sh);
312
-}
313
-
314
-static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
315
- TCGv_vec m, int64_t sh)
316
-{
317
- tcg_gen_xor_vec(vece, d, n, m);
318
- tcg_gen_rotri_vec(vece, d, d, sh);
319
-}
320
-
321
-void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
322
- uint32_t rm_ofs, int64_t shift,
323
- uint32_t opr_sz, uint32_t max_sz)
324
-{
325
- static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 };
326
- static const GVecGen3i ops[4] = {
327
- { .fni8 = gen_xar8_i64,
328
- .fniv = gen_xar_vec,
329
- .fno = gen_helper_sve2_xar_b,
330
- .opt_opc = vecop,
331
- .vece = MO_8 },
332
- { .fni8 = gen_xar16_i64,
333
- .fniv = gen_xar_vec,
334
- .fno = gen_helper_sve2_xar_h,
335
- .opt_opc = vecop,
336
- .vece = MO_16 },
337
- { .fni4 = gen_xar_i32,
338
- .fniv = gen_xar_vec,
339
- .fno = gen_helper_sve2_xar_s,
340
- .opt_opc = vecop,
341
- .vece = MO_32 },
342
- { .fni8 = gen_xar_i64,
343
- .fniv = gen_xar_vec,
344
- .fno = gen_helper_gvec_xar_d,
345
- .opt_opc = vecop,
346
- .vece = MO_64 }
347
- };
348
- int esize = 8 << vece;
349
-
350
- /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */
351
- tcg_debug_assert(shift >= 0);
352
- tcg_debug_assert(shift <= esize);
353
- shift &= esize - 1;
354
-
355
- if (shift == 0) {
356
- /* xar with no rotate devolves to xor. */
357
- tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz);
358
- } else {
359
- tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz,
360
- shift, &ops[vece]);
361
- }
362
-}
363
-
364
static bool trans_XAR(DisasContext *s, arg_rrri_esz *a)
33
{
365
{
34
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
366
if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) {
35
367
@@ -XXX,XX +XXX,XX @@ static bool trans_XAR(DisasContext *s, arg_rrri_esz *a)
36
static uint64_t pauth_strip(CPUARMState *env, uint64_t ptr, bool data)
368
return true;
37
{
38
- g_assert_not_reached(); /* FIXME */
39
+ ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
40
+ ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data);
41
+
42
+ return pauth_original_ptr(ptr, param);
43
}
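For reference, pauth_original_ptr() amounts to the following standalone
computation (illustrative names; bot/top correspond to the patch's
bot_pac_bit/top_pac_bit):

#include <stdint.h>

/* Replace the PAC field ptr<top-1:bot> with copies of the select bit,
 * as deposit64(ptr, bot, top - bot, -select) does; assumes top - bot < 64. */
static uint64_t strip_pac_sketch(uint64_t ptr, int bot, int top, int select)
{
    uint64_t ext = select ? ~(uint64_t)0 : 0;
    uint64_t field = (((uint64_t)1 << (top - bot)) - 1) << bot;
    return (ptr & ~field) | (ext & field);
}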
369
}
44
370
45
static void QEMU_NORETURN pauth_trap(CPUARMState *env, int target_el,
371
-static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
372
-{
373
- tcg_gen_xor_i64(d, n, m);
374
- tcg_gen_xor_i64(d, d, k);
375
-}
376
-
377
-static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
378
- TCGv_vec m, TCGv_vec k)
379
-{
380
- tcg_gen_xor_vec(vece, d, n, m);
381
- tcg_gen_xor_vec(vece, d, d, k);
382
-}
383
-
384
-static void gen_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
385
- uint32_t a, uint32_t oprsz, uint32_t maxsz)
386
-{
387
- static const GVecGen4 op = {
388
- .fni8 = gen_eor3_i64,
389
- .fniv = gen_eor3_vec,
390
- .fno = gen_helper_sve2_eor3,
391
- .vece = MO_64,
392
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
393
- };
394
- tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
395
-}
396
-
397
-TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_eor3, a)
398
-
399
-static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
400
-{
401
- tcg_gen_andc_i64(d, m, k);
402
- tcg_gen_xor_i64(d, d, n);
403
-}
404
-
405
-static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
406
- TCGv_vec m, TCGv_vec k)
407
-{
408
- tcg_gen_andc_vec(vece, d, m, k);
409
- tcg_gen_xor_vec(vece, d, d, n);
410
-}
411
-
412
-static void gen_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
413
- uint32_t a, uint32_t oprsz, uint32_t maxsz)
414
-{
415
- static const GVecGen4 op = {
416
- .fni8 = gen_bcax_i64,
417
- .fniv = gen_bcax_vec,
418
- .fno = gen_helper_sve2_bcax,
419
- .vece = MO_64,
420
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
421
- };
422
- tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
423
-}
424
-
425
-TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_bcax, a)
426
+TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_eor3, a)
427
+TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_bcax, a)
428
429
static void gen_bsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
430
uint32_t a, uint32_t oprsz, uint32_t maxsz)
431
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
432
index XXXXXXX..XXXXXXX 100644
433
--- a/target/arm/tcg/meson.build
434
+++ b/target/arm/tcg/meson.build
435
@@ -XXX,XX +XXX,XX @@ arm_ss.add(files(
436
437
arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
438
'cpu64.c',
439
+ 'gengvec64.c',
440
'translate-a64.c',
441
'translate-sve.c',
442
'translate-sme.c',
46
--
443
--
47
2.20.1
444
2.34.1
48
445
49
446
1
From: Cédric Le Goater <clg@kaod.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The PHY behind the MAC of an Aspeed SoC can be controlled using two
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
different MDC/MDIO interfaces. The same registers PHYCR (MAC60) and
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
PHYDATA (MAC64) are involved but they have a different layout.
5
Message-id: 20240524232121.284515-10-richard.henderson@linaro.org
6
7
BIT31 of the Feature Register (MAC40) controls which MDC/MDIO
8
interface is active.
9
10
Signed-off-by: Cédric Le Goater <clg@kaod.org>
11
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
12
Reviewed-by: Joel Stanley <joel@jms.id.au>
13
Message-id: 20190111125759.31577-1-clg@kaod.org
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
7
---
16
hw/net/ftgmac100.c | 80 +++++++++++++++++++++++++++++++++++++++-------
8
target/arm/tcg/a64.decode | 21 +++++++--
17
1 file changed, 68 insertions(+), 12 deletions(-)
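Illustrative only (not from the patch): with REVR bit 31 set to select the new
interface, a guest driver would compose a PHY register read on PHYCR roughly as
follows, using the FTGMAC100_PHYCR_NEW_* field layout the patch defines.

#include <stdint.h>

#define PHYCR_NEW_FIRE     (1u << 15)   /* start the MDIO transaction */
#define PHYCR_NEW_ST_22    (1u << 12)   /* clause-22 start code */
#define PHYCR_NEW_OP_READ  (0x2u << 10) /* OP field: read */

static uint32_t phycr_new_read_cmd(uint32_t dev, uint32_t reg)
{
    return PHYCR_NEW_FIRE | PHYCR_NEW_ST_22 | PHYCR_NEW_OP_READ |
           ((dev & 0x1f) << 5) | (reg & 0x1f);
}

The model clears the FIRE bit once the transaction completes, and a read
result lands in the low 16 bits of PHYDATA.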
9
target/arm/tcg/translate-a64.c | 86 +++++++++++++++-------------------
10
2 files changed, 54 insertions(+), 53 deletions(-)
18
11
19
diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
20
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
21
--- a/hw/net/ftgmac100.c
14
--- a/target/arm/tcg/a64.decode
22
+++ b/hw/net/ftgmac100.c
15
+++ b/target/arm/tcg/a64.decode
23
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@
24
#define FTGMAC100_PHYDATA_MIIWDATA(x) ((x) & 0xffff)
17
# This file is processed by scripts/decodetree.py
25
#define FTGMAC100_PHYDATA_MIIRDATA(x) (((x) >> 16) & 0xffff)
18
#
19
20
-&r rn
21
-&ri rd imm
22
-&rri_sf rd rn imm sf
23
-&i imm
24
+%rd 0:5
25
26
+&r rn
27
+&ri rd imm
28
+&rri_sf rd rn imm sf
29
+&i imm
30
+&qrr_e q rd rn esz
31
+&qrrr_e q rd rn rm esz
32
+
33
+@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
34
+@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
35
36
### Data Processing - Immediate
37
38
@@ -XXX,XX +XXX,XX @@ CPYFE 00 011 0 01100 ..... .... 01 ..... ..... @cpy
39
CPYP 00 011 1 01000 ..... .... 01 ..... ..... @cpy
40
CPYM 00 011 1 01010 ..... .... 01 ..... ..... @cpy
41
CPYE 00 011 1 01100 ..... .... 01 ..... ..... @cpy
42
+
43
+### Cryptographic AES
44
+
45
+AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0
46
+AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0
47
+AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0
48
+AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0
49
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
50
index XXXXXXX..XXXXXXX 100644
51
--- a/target/arm/tcg/translate-a64.c
52
+++ b/target/arm/tcg/translate-a64.c
53
@@ -XXX,XX +XXX,XX @@ bool sme_enabled_check_with_svcr(DisasContext *s, unsigned req)
54
return true;
55
}
26
56
27
+/*
57
+/*
28
+ * PHY control register - New MDC/MDIO interface
58
+ * Expanders for AdvSIMD translation functions.
29
+ */
59
+ */
30
+#define FTGMAC100_PHYCR_NEW_DATA(x) (((x) >> 16) & 0xffff)
60
+
31
+#define FTGMAC100_PHYCR_NEW_FIRE (1 << 15)
61
+static bool do_gvec_op2_ool(DisasContext *s, arg_qrr_e *a, int data,
32
+#define FTGMAC100_PHYCR_NEW_ST_22 (1 << 12)
62
+ gen_helper_gvec_2 *fn)
33
+#define FTGMAC100_PHYCR_NEW_OP(x) (((x) >> 10) & 3)
63
+{
34
+#define FTGMAC100_PHYCR_NEW_OP_WRITE 0x1
64
+ if (!a->q && a->esz == MO_64) {
35
+#define FTGMAC100_PHYCR_NEW_OP_READ 0x2
65
+ return false;
36
+#define FTGMAC100_PHYCR_NEW_DEV(x) (((x) >> 5) & 0x1f)
66
+ }
37
+#define FTGMAC100_PHYCR_NEW_REG(x) ((x) & 0x1f)
67
+ if (fp_access_check(s)) {
68
+ gen_gvec_op2_ool(s, a->q, a->rd, a->rn, data, fn);
69
+ }
70
+ return true;
71
+}
72
+
73
+static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data,
74
+ gen_helper_gvec_3 *fn)
75
+{
76
+ if (!a->q && a->esz == MO_64) {
77
+ return false;
78
+ }
79
+ if (fp_access_check(s)) {
80
+ gen_gvec_op3_ool(s, a->q, a->rd, a->rn, a->rm, data, fn);
81
+ }
82
+ return true;
83
+}
38
+
84
+
39
/*
85
/*
40
* Feature Register
86
* This utility function is for doing register extension with an
41
*/
87
* optional shift. You will likely want to pass a temporary for the
42
@@ -XXX,XX +XXX,XX @@ static void phy_reset(FTGMAC100State *s)
88
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
43
s->phy_int = 0;
89
return true;
44
}
90
}
45
91
46
-static uint32_t do_phy_read(FTGMAC100State *s, int reg)
92
+/*
47
+static uint16_t do_phy_read(FTGMAC100State *s, uint8_t reg)
93
+ * Cryptographic AES
48
{
94
+ */
49
- uint32_t val;
95
+
50
+ uint16_t val;
96
+TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
51
97
+TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd)
52
switch (reg) {
98
+TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc)
53
case MII_BMCR: /* Basic Control */
99
+TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc)
54
@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(FTGMAC100State *s, int reg)
100
+
55
MII_BMCR_FD | MII_BMCR_CTST)
101
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
56
#define MII_ANAR_MASK 0x2d7f
102
* Note that it is the caller's responsibility to ensure that the
57
103
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
58
-static void do_phy_write(FTGMAC100State *s, int reg, uint32_t val)
104
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
59
+static void do_phy_write(FTGMAC100State *s, uint8_t reg, uint16_t val)
60
{
61
switch (reg) {
62
case MII_BMCR: /* Basic Control */
63
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(FTGMAC100State *s, int reg, uint32_t val)
64
}
105
}
65
}
106
}
66
107
67
+static void do_phy_new_ctl(FTGMAC100State *s)
108
-/* Crypto AES
68
+{
109
- * 31 24 23 22 21 17 16 12 11 10 9 5 4 0
69
+ uint8_t reg;
110
- * +-----------------+------+-----------+--------+-----+------+------+
70
+ uint16_t data;
111
- * | 0 1 0 0 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 | Rn | Rd |
71
+
112
- * +-----------------+------+-----------+--------+-----+------+------+
72
+ if (!(s->phycr & FTGMAC100_PHYCR_NEW_ST_22)) {
113
- */
73
+ qemu_log_mask(LOG_UNIMP, "%s: unsupported ST code\n", __func__);
114
-static void disas_crypto_aes(DisasContext *s, uint32_t insn)
74
+ return;
115
-{
75
+ }
116
- int size = extract32(insn, 22, 2);
76
+
117
- int opcode = extract32(insn, 12, 5);
77
+ /* Nothing to do */
118
- int rn = extract32(insn, 5, 5);
78
+ if (!(s->phycr & FTGMAC100_PHYCR_NEW_FIRE)) {
119
- int rd = extract32(insn, 0, 5);
79
+ return;
120
- gen_helper_gvec_2 *genfn2 = NULL;
80
+ }
121
- gen_helper_gvec_3 *genfn3 = NULL;
81
+
122
-
82
+ reg = FTGMAC100_PHYCR_NEW_REG(s->phycr);
123
- if (!dc_isar_feature(aa64_aes, s) || size != 0) {
83
+ data = FTGMAC100_PHYCR_NEW_DATA(s->phycr);
124
- unallocated_encoding(s);
84
+
125
- return;
85
+ switch (FTGMAC100_PHYCR_NEW_OP(s->phycr)) {
126
- }
86
+ case FTGMAC100_PHYCR_NEW_OP_WRITE:
127
-
87
+ do_phy_write(s, reg, data);
128
- switch (opcode) {
88
+ break;
129
- case 0x4: /* AESE */
89
+ case FTGMAC100_PHYCR_NEW_OP_READ:
130
- genfn3 = gen_helper_crypto_aese;
90
+ s->phydata = do_phy_read(s, reg) & 0xffff;
131
- break;
91
+ break;
132
- case 0x6: /* AESMC */
92
+ default:
133
- genfn2 = gen_helper_crypto_aesmc;
93
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid OP code %08x\n",
134
- break;
94
+ __func__, s->phycr);
135
- case 0x5: /* AESD */
95
+ }
136
- genfn3 = gen_helper_crypto_aesd;
96
+
137
- break;
97
+ s->phycr &= ~FTGMAC100_PHYCR_NEW_FIRE;
138
- case 0x7: /* AESIMC */
98
+}
139
- genfn2 = gen_helper_crypto_aesimc;
99
+
140
- break;
100
+static void do_phy_ctl(FTGMAC100State *s)
141
- default:
101
+{
142
- unallocated_encoding(s);
102
+ uint8_t reg = FTGMAC100_PHYCR_REG(s->phycr);
143
- return;
103
+
144
- }
104
+ if (s->phycr & FTGMAC100_PHYCR_MIIWR) {
145
-
105
+ do_phy_write(s, reg, s->phydata & 0xffff);
146
- if (!fp_access_check(s)) {
106
+ s->phycr &= ~FTGMAC100_PHYCR_MIIWR;
147
- return;
107
+ } else if (s->phycr & FTGMAC100_PHYCR_MIIRD) {
148
- }
108
+ s->phydata = do_phy_read(s, reg) << 16;
149
- if (genfn2) {
109
+ s->phycr &= ~FTGMAC100_PHYCR_MIIRD;
150
- gen_gvec_op2_ool(s, true, rd, rn, 0, genfn2);
110
+ } else {
151
- } else {
111
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: no OP code %08x\n",
152
- gen_gvec_op3_ool(s, true, rd, rd, rn, 0, genfn3);
112
+ __func__, s->phycr);
153
- }
113
+ }
154
-}
114
+}
155
-
115
+
156
/* Crypto three-reg SHA
116
static int ftgmac100_read_bd(FTGMAC100Desc *bd, dma_addr_t addr)
157
* 31 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
117
{
158
* +-----------------+------+---+------+---+--------+-----+------+------+
118
if (dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd))) {
159
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
119
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
160
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
120
uint64_t value, unsigned size)
161
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
121
{
162
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
122
FTGMAC100State *s = FTGMAC100(opaque);
163
- { 0x4e280800, 0xff3e0c00, disas_crypto_aes },
123
- int reg;
164
{ 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
124
165
{ 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
125
switch (addr & 0xff) {
166
{ 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
126
case FTGMAC100_ISR: /* Interrupt status */
127
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
128
break;
129
130
case FTGMAC100_PHYCR: /* PHY Device control */
131
- reg = FTGMAC100_PHYCR_REG(value);
132
s->phycr = value;
133
- if (value & FTGMAC100_PHYCR_MIIWR) {
134
- do_phy_write(s, reg, s->phydata & 0xffff);
135
- s->phycr &= ~FTGMAC100_PHYCR_MIIWR;
136
+ if (s->revr & FTGMAC100_REVR_NEW_MDIO_INTERFACE) {
137
+ do_phy_new_ctl(s);
138
} else {
139
- s->phydata = do_phy_read(s, reg) << 16;
140
- s->phycr &= ~FTGMAC100_PHYCR_MIIRD;
141
+ do_phy_ctl(s);
142
}
143
break;
144
case FTGMAC100_PHYDATA:
145
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
146
s->dblac = value;
147
break;
148
case FTGMAC100_REVR: /* Feature Register */
149
- /* TODO: Only Old MDIO interface is supported */
150
- s->revr = value & ~FTGMAC100_REVR_NEW_MDIO_INTERFACE;
151
+ s->revr = value;
152
break;
153
case FTGMAC100_FEAR1: /* Feature Register 1 */
154
s->fear1 = value;
155
--
167
--
156
2.20.1
168
2.34.1
157
158
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
We can perform this with fewer operations.
4
2
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20190108223129.5570-32-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-11-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
7
---
10
target/arm/translate-a64.c | 62 +++++++++++++-------------------------
8
target/arm/tcg/a64.decode | 11 +++++
11
1 file changed, 21 insertions(+), 41 deletions(-)
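A standalone sketch of the both-TBI-bits-set case after this patch: a single
sign extension from bit 55 replaces the old shift pair, since copying bit 55
into [63:56] is correct whether the tagged address was positive or negative.
sextract64() is written out here the same way QEMU's include/qemu/bitops.h
defines it.

#include <stdint.h>

static int64_t sextract64(uint64_t value, int start, int length)
{
    /* The arithmetic right shift performs the sign extension. */
    return ((int64_t)(value << (64 - length - start))) >> (64 - length);
}

/* tbi == 3: bits [63:56] become copies of bit 55, clearing any tag. */
static uint64_t pc_with_tag_cleared(uint64_t src)
{
    return (uint64_t)sextract64(src, 0, 56);
}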
9
target/arm/tcg/translate-a64.c | 78 +++++-----------------------------
10
2 files changed, 21 insertions(+), 68 deletions(-)
12
11
13
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-a64.c
14
--- a/target/arm/tcg/a64.decode
16
+++ b/target/arm/translate-a64.c
15
+++ b/target/arm/tcg/a64.decode
17
@@ -XXX,XX +XXX,XX @@ void gen_a64_set_pc_im(uint64_t val)
16
@@ -XXX,XX +XXX,XX @@
18
/* Load the PC from a generic TCG variable.
17
19
*
18
@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
20
* If address tagging is enabled via the TCR TBI bits, then loading
19
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
21
- * an address into the PC will clear out any tag in the it:
20
+@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
22
+ * an address into the PC will clear out any tag in it:
21
23
* + for EL2 and EL3 there is only one TBI bit, and if it is set
22
### Data Processing - Immediate
24
* then the address is zero-extended, clearing bits [63:56]
23
25
* + for EL0 and EL1, TBI0 controls addresses with bit 55 == 0
24
@@ -XXX,XX +XXX,XX @@ AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0
26
@@ -XXX,XX +XXX,XX @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
25
AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0
27
int tbi = s->tbii;
26
AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0
28
27
AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0
29
if (s->current_el <= 1) {
28
+
30
- /* Test if NEITHER or BOTH TBI values are set. If so, no need to
29
+### Cryptographic three-register SHA
31
- * examine bit 55 of address, can just generate code.
30
+
32
- * If mixed, then test via generated code
31
+SHA1C 0101 1110 000 ..... 000000 ..... ..... @rrr_q1e0
33
- */
32
+SHA1P 0101 1110 000 ..... 000100 ..... ..... @rrr_q1e0
34
- if (tbi == 3) {
33
+SHA1M 0101 1110 000 ..... 001000 ..... ..... @rrr_q1e0
35
- TCGv_i64 tmp_reg = tcg_temp_new_i64();
34
+SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0
36
- /* Both bits set, sign extension from bit 55 into [63:56] will
35
+SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0
37
- * cover both cases
36
+SHA256H2 0101 1110 000 ..... 010100 ..... ..... @rrr_q1e0
38
- */
37
+SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
39
- tcg_gen_shli_i64(tmp_reg, src, 8);
38
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
40
- tcg_gen_sari_i64(cpu_pc, tmp_reg, 8);
39
index XXXXXXX..XXXXXXX 100644
41
- tcg_temp_free_i64(tmp_reg);
40
--- a/target/arm/tcg/translate-a64.c
42
- } else if (tbi == 0) {
41
+++ b/target/arm/tcg/translate-a64.c
43
- /* Neither bit set, just load it as-is */
42
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
44
- tcg_gen_mov_i64(cpu_pc, src);
43
}
45
- } else {
44
46
- TCGv_i64 tcg_tmpval = tcg_temp_new_i64();
45
/*
47
- TCGv_i64 tcg_bit55 = tcg_temp_new_i64();
46
- * Cryptographic AES
48
- TCGv_i64 tcg_zero = tcg_const_i64(0);
47
+ * Cryptographic AES, SHA
49
+ if (tbi != 0) {
48
*/
50
+ /* Sign-extend from bit 55. */
49
51
+ tcg_gen_sextract_i64(cpu_pc, src, 0, 56);
50
TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
52
51
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd)
53
- tcg_gen_andi_i64(tcg_bit55, src, (1ull << 55));
52
TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc)
54
+ if (tbi != 3) {
53
TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc)
55
+ TCGv_i64 tcg_zero = tcg_const_i64(0);
54
56
55
+TRANS_FEAT(SHA1C, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1c)
57
- if (tbi == 1) {
56
+TRANS_FEAT(SHA1P, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1p)
58
- /* tbi0==1, tbi1==0, so 0-fill upper byte if bit 55 = 0 */
57
+TRANS_FEAT(SHA1M, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1m)
59
- tcg_gen_andi_i64(tcg_tmpval, src,
58
+TRANS_FEAT(SHA1SU0, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1su0)
60
- 0x00FFFFFFFFFFFFFFull);
59
+
61
- tcg_gen_movcond_i64(TCG_COND_EQ, cpu_pc, tcg_bit55, tcg_zero,
60
+TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h)
62
- tcg_tmpval, src);
61
+TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2)
63
- } else {
62
+TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1)
64
- /* tbi0==0, tbi1==1, so 1-fill upper byte if bit 55 = 1 */
63
+
65
- tcg_gen_ori_i64(tcg_tmpval, src,
64
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
66
- 0xFF00000000000000ull);
65
* Note that it is the caller's responsibility to ensure that the
67
- tcg_gen_movcond_i64(TCG_COND_NE, cpu_pc, tcg_bit55, tcg_zero,
66
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
68
- tcg_tmpval, src);
67
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
69
+ /*
70
+ * The two TBI bits differ.
71
+ * If tbi0, then !tbi1: only use the extension if positive.
72
+ * if !tbi0, then tbi1: only use the extension if negative.
73
+ */
74
+ tcg_gen_movcond_i64(tbi == 1 ? TCG_COND_GE : TCG_COND_LT,
75
+ cpu_pc, cpu_pc, tcg_zero, cpu_pc, src);
76
+ tcg_temp_free_i64(tcg_zero);
77
}
78
- tcg_temp_free_i64(tcg_zero);
79
- tcg_temp_free_i64(tcg_bit55);
80
- tcg_temp_free_i64(tcg_tmpval);
81
+ return;
82
}
83
- } else { /* EL > 1 */
84
+ } else {
85
if (tbi != 0) {
86
/* Force tag byte to all zero */
87
- tcg_gen_andi_i64(cpu_pc, src, 0x00FFFFFFFFFFFFFFull);
88
- } else {
89
- /* Load unmodified address */
90
- tcg_gen_mov_i64(cpu_pc, src);
91
+ tcg_gen_extract_i64(cpu_pc, src, 0, 56);
92
+ return;
93
}
94
}
68
}
95
+
96
+ /* Load unmodified address */
97
+ tcg_gen_mov_i64(cpu_pc, src);
98
}
69
}
99
70
100
typedef struct DisasCompare64 {
71
-/* Crypto three-reg SHA
72
- * 31 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
73
- * +-----------------+------+---+------+---+--------+-----+------+------+
74
- * | 0 1 0 1 1 1 1 0 | size | 0 | Rm | 0 | opcode | 0 0 | Rn | Rd |
75
- * +-----------------+------+---+------+---+--------+-----+------+------+
76
- */
77
-static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
78
-{
79
- int size = extract32(insn, 22, 2);
80
- int opcode = extract32(insn, 12, 3);
81
- int rm = extract32(insn, 16, 5);
82
- int rn = extract32(insn, 5, 5);
83
- int rd = extract32(insn, 0, 5);
84
- gen_helper_gvec_3 *genfn;
85
- bool feature;
86
-
87
- if (size != 0) {
88
- unallocated_encoding(s);
89
- return;
90
- }
91
-
92
- switch (opcode) {
93
- case 0: /* SHA1C */
94
- genfn = gen_helper_crypto_sha1c;
95
- feature = dc_isar_feature(aa64_sha1, s);
96
- break;
97
- case 1: /* SHA1P */
98
- genfn = gen_helper_crypto_sha1p;
99
- feature = dc_isar_feature(aa64_sha1, s);
100
- break;
101
- case 2: /* SHA1M */
102
- genfn = gen_helper_crypto_sha1m;
103
- feature = dc_isar_feature(aa64_sha1, s);
104
- break;
105
- case 3: /* SHA1SU0 */
106
- genfn = gen_helper_crypto_sha1su0;
107
- feature = dc_isar_feature(aa64_sha1, s);
108
- break;
109
- case 4: /* SHA256H */
110
- genfn = gen_helper_crypto_sha256h;
111
- feature = dc_isar_feature(aa64_sha256, s);
112
- break;
113
- case 5: /* SHA256H2 */
114
- genfn = gen_helper_crypto_sha256h2;
115
- feature = dc_isar_feature(aa64_sha256, s);
116
- break;
117
- case 6: /* SHA256SU1 */
118
- genfn = gen_helper_crypto_sha256su1;
119
- feature = dc_isar_feature(aa64_sha256, s);
120
- break;
121
- default:
122
- unallocated_encoding(s);
123
- return;
124
- }
125
-
126
- if (!feature) {
127
- unallocated_encoding(s);
128
- return;
129
- }
130
-
131
- if (!fp_access_check(s)) {
132
- return;
133
- }
134
- gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn);
135
-}
136
-
137
/* Crypto two-reg SHA
138
* 31 24 23 22 21 17 16 12 11 10 9 5 4 0
139
* +-----------------+------+-----------+--------+-----+------+------+
140
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
141
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
142
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
143
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
144
- { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
145
{ 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
146
{ 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
147
{ 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
101
--
148
--
102
2.20.1
149
2.34.1
103
104
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
This is not really functional yet, because the crypto is not yet
4
implemented. This, however follows the AddPAC pseudo function.
5
2
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20190108223129.5570-27-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-12-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
7
---
11
target/arm/pauth_helper.c | 42 ++++++++++++++++++++++++++++++++++++++-
8
target/arm/tcg/a64.decode | 6 ++++
12
1 file changed, 41 insertions(+), 1 deletion(-)
9
target/arm/tcg/translate-a64.c | 54 +++-------------------------------
10
2 files changed, 10 insertions(+), 50 deletions(-)
13
11
14
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/pauth_helper.c
14
--- a/target/arm/tcg/a64.decode
17
+++ b/target/arm/pauth_helper.c
15
+++ b/target/arm/tcg/a64.decode
18
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
16
@@ -XXX,XX +XXX,XX @@ SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0
19
static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
17
SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0
20
ARMPACKey *key, bool data)
18
SHA256H2 0101 1110 000 ..... 010100 ..... ..... @rrr_q1e0
21
{
19
SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
22
- g_assert_not_reached(); /* FIXME */
23
+ ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
24
+ ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data);
25
+ uint64_t pac, ext_ptr, ext, test;
26
+ int bot_bit, top_bit;
27
+
20
+
28
+ /* If tagged pointers are in use, use ptr<55>, otherwise ptr<63>. */
21
+### Cryptographic two-register SHA
29
+ if (param.tbi) {
30
+ ext = sextract64(ptr, 55, 1);
31
+ } else {
32
+ ext = sextract64(ptr, 63, 1);
33
+ }
34
+
22
+
35
+ /* Build a pointer with known good extension bits. */
23
+SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0
36
+ top_bit = 64 - 8 * param.tbi;
24
+SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0
37
+ bot_bit = 64 - param.tsz;
25
+SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0
38
+ ext_ptr = deposit64(ptr, bot_bit, top_bit - bot_bit, ext);
26
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/tcg/translate-a64.c
29
+++ b/target/arm/tcg/translate-a64.c
30
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256
31
TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2)
32
TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1)
33
34
+TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h)
35
+TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1)
36
+TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0)
39
+
37
+
40
+ pac = pauth_computepac(ext_ptr, modifier, *key);
38
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
41
+
39
* Note that it is the caller's responsibility to ensure that the
42
+ /*
40
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
43
+ * Check if the ptr has good extension bits and corrupt the
41
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
44
+ * pointer authentication code if not.
42
}
45
+ */
46
+ test = sextract64(ptr, bot_bit, top_bit - bot_bit);
47
+ if (test != 0 && test != -1) {
48
+ pac ^= MAKE_64BIT_MASK(top_bit - 1, 1);
49
+ }
50
+
51
+ /*
52
+ * Preserve the determination between upper and lower at bit 55,
53
+ * and insert pointer authentication code.
54
+ */
55
+ if (param.tbi) {
56
+ ptr &= ~MAKE_64BIT_MASK(bot_bit, 55 - bot_bit + 1);
57
+ pac &= MAKE_64BIT_MASK(bot_bit, 54 - bot_bit + 1);
58
+ } else {
59
+ ptr &= MAKE_64BIT_MASK(0, bot_bit);
60
+ pac &= ~(MAKE_64BIT_MASK(55, 1) | MAKE_64BIT_MASK(0, bot_bit));
61
+ }
62
+ ext &= MAKE_64BIT_MASK(55, 1);
63
+ return pac | ext | ptr;
64
}
43
}
65
44
66
static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
45
-/* Crypto two-reg SHA
46
- * 31 24 23 22 21 17 16 12 11 10 9 5 4 0
47
- * +-----------------+------+-----------+--------+-----+------+------+
48
- * | 0 1 0 1 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 | Rn | Rd |
49
- * +-----------------+------+-----------+--------+-----+------+------+
50
- */
51
-static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
52
-{
53
- int size = extract32(insn, 22, 2);
54
- int opcode = extract32(insn, 12, 5);
55
- int rn = extract32(insn, 5, 5);
56
- int rd = extract32(insn, 0, 5);
57
- gen_helper_gvec_2 *genfn;
58
- bool feature;
59
-
60
- if (size != 0) {
61
- unallocated_encoding(s);
62
- return;
63
- }
64
-
65
- switch (opcode) {
66
- case 0: /* SHA1H */
67
- feature = dc_isar_feature(aa64_sha1, s);
68
- genfn = gen_helper_crypto_sha1h;
69
- break;
70
- case 1: /* SHA1SU1 */
71
- feature = dc_isar_feature(aa64_sha1, s);
72
- genfn = gen_helper_crypto_sha1su1;
73
- break;
74
- case 2: /* SHA256SU0 */
75
- feature = dc_isar_feature(aa64_sha256, s);
76
- genfn = gen_helper_crypto_sha256su0;
77
- break;
78
- default:
79
- unallocated_encoding(s);
80
- return;
81
- }
82
-
83
- if (!feature) {
84
- unallocated_encoding(s);
85
- return;
86
- }
87
-
88
- if (!fp_access_check(s)) {
89
- return;
90
- }
91
- gen_gvec_op2_ool(s, true, rd, rn, 0, genfn);
92
-}
93
-
94
/* Crypto three-reg SHA512
95
* 31 21 20 16 15 14 13 12 11 10 9 5 4 0
96
* +-----------------------+------+---+---+-----+--------+------+------+
97
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
98
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
99
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
100
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
101
- { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
102
{ 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
103
{ 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
104
{ 0xce000000, 0xff808000, disas_crypto_four_reg },
67
--
105
--
68
2.20.1
106
2.34.1
69
70
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20181211151945.29137-14-aaron@os.amperecomputing.com
5
Message-id: 20240524232121.284515-13-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/helper.c | 39 +++++++++++++++++++++++++++++++++++++--
8
target/arm/tcg/a64.decode | 11 ++++
9
1 file changed, 37 insertions(+), 2 deletions(-)
9
target/arm/tcg/translate-a64.c | 97 ++++++++--------------------------
10
2 files changed, 32 insertions(+), 76 deletions(-)
10
11
11
diff --git a/target/arm/helper.c b/target/arm/helper.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
12
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/helper.c
14
--- a/target/arm/tcg/a64.decode
14
+++ b/target/arm/helper.c
15
+++ b/target/arm/tcg/a64.decode
15
@@ -XXX,XX +XXX,XX @@ static bool event_always_supported(CPUARMState *env)
16
@@ -XXX,XX +XXX,XX @@
17
@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
18
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
19
@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
20
+@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
21
22
### Data Processing - Immediate
23
24
@@ -XXX,XX +XXX,XX @@ SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
25
SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0
26
SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0
27
SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0
28
+
29
+### Cryptographic three-register SHA512
30
+
31
+SHA512H 1100 1110 011 ..... 100000 ..... ..... @rrr_q1e0
32
+SHA512H2 1100 1110 011 ..... 100001 ..... ..... @rrr_q1e0
33
+SHA512SU1 1100 1110 011 ..... 100010 ..... ..... @rrr_q1e0
34
+RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3
35
+SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0
36
+SM3PARTW2 1100 1110 011 ..... 110001 ..... ..... @rrr_q1e0
37
+SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
38
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/tcg/translate-a64.c
41
+++ b/target/arm/tcg/translate-a64.c
42
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data,
16
return true;
43
return true;
17
}
44
}
18
45
19
+static uint64_t swinc_get_count(CPUARMState *env)
46
+static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
20
+{
47
+{
21
+ /*
48
+ if (!a->q && a->esz == MO_64) {
22
+ * SW_INCR events are written directly to the pmevcntr's by writes to
49
+ return false;
23
+ * PMSWINC, so there is no underlying count maintained by the PMU itself
50
+ }
24
+ */
51
+ if (fp_access_check(s)) {
25
+ return 0;
52
+ gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz);
53
+ }
54
+ return true;
26
+}
55
+}
27
+
56
+
28
/*
57
/*
29
* Return the underlying cycle count for the PMU cycle counters. If we're in
58
* This utility function is for doing register extension with an
30
* usermode, simply return 0.
59
* optional shift. You will likely want to pass a temporary for the
31
@@ -XXX,XX +XXX,XX @@ static uint64_t instructions_get_count(CPUARMState *env)
60
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
32
#endif
33
34
static const pm_event pm_events[] = {
35
+ { .number = 0x000, /* SW_INCR */
36
+ .supported = event_always_supported,
37
+ .get_count = swinc_get_count,
38
+ },
39
#ifndef CONFIG_USER_ONLY
40
{ .number = 0x008, /* INST_RETIRED, Instruction architecturally executed */
41
.supported = instructions_supported,
42
@@ -XXX,XX +XXX,XX @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
43
pmu_op_finish(env);
44
}
61
}
45
62
46
+static void pmswinc_write(CPUARMState *env, const ARMCPRegInfo *ri,
63
/*
47
+ uint64_t value)
64
- * Cryptographic AES, SHA
48
+{
65
+ * Cryptographic AES, SHA, SHA512
49
+ unsigned int i;
66
*/
50
+ for (i = 0; i < pmu_num_counters(env); i++) {
67
51
+ /* Increment a counter's count iff: */
68
TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
52
+ if ((value & (1 << i)) && /* counter's bit is set */
69
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h)
53
+ /* counter is enabled and not filtered */
70
TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1)
54
+ pmu_counter_enabled(env, i) &&
71
TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0)
55
+ /* counter is SW_INCR */
72
56
+ (env->cp15.c14_pmevtyper[i] & PMXEVTYPER_EVTCOUNT) == 0x0) {
73
+TRANS_FEAT(SHA512H, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h)
57
+ pmevcntr_op_start(env, i);
74
+TRANS_FEAT(SHA512H2, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h2)
58
+ env->cp15.c14_pmevcntr[i]++;
75
+TRANS_FEAT(SHA512SU1, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512su1)
59
+ pmevcntr_op_finish(env, i);
76
+TRANS_FEAT(RAX1, aa64_sha3, do_gvec_fn3, a, gen_gvec_rax1)
60
+ }
77
+TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw1)
61
+ }
78
+TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2)
62
+}
79
+TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
63
+
80
+
64
static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
81
+
65
{
82
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
66
uint64_t ret;
83
* Note that it is the caller's responsibility to ensure that the
67
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
84
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
68
.fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
85
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
69
.writefn = pmovsr_write,
86
}
70
.raw_writefn = raw_write },
87
}
71
- /* Unimplemented so WI. */
88
72
{ .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
89
-/* Crypto three-reg SHA512
73
- .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NOP },
90
- * 31 21 20 16 15 14 13 12 11 10 9 5 4 0
74
+ .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NO_RAW,
91
- * +-----------------------+------+---+---+-----+--------+------+------+
75
+ .writefn = pmswinc_write },
92
- * | 1 1 0 0 1 1 1 0 0 1 1 | Rm | 1 | O | 0 0 | opcode | Rn | Rd |
76
+ { .name = "PMSWINC_EL0", .state = ARM_CP_STATE_AA64,
93
- * +-----------------------+------+---+---+-----+--------+------+------+
77
+ .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 4,
94
- */
78
+ .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NO_RAW,
95
-static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
79
+ .writefn = pmswinc_write },
96
-{
80
{ .name = "PMSELR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 5,
97
- int opcode = extract32(insn, 10, 2);
81
.access = PL0_RW, .type = ARM_CP_ALIAS,
98
- int o = extract32(insn, 14, 1);
82
.fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmselr),
99
- int rm = extract32(insn, 16, 5);
100
- int rn = extract32(insn, 5, 5);
101
- int rd = extract32(insn, 0, 5);
102
- bool feature;
103
- gen_helper_gvec_3 *oolfn = NULL;
104
- GVecGen3Fn *gvecfn = NULL;
105
-
106
- if (o == 0) {
107
- switch (opcode) {
108
- case 0: /* SHA512H */
109
- feature = dc_isar_feature(aa64_sha512, s);
110
- oolfn = gen_helper_crypto_sha512h;
111
- break;
112
- case 1: /* SHA512H2 */
113
- feature = dc_isar_feature(aa64_sha512, s);
114
- oolfn = gen_helper_crypto_sha512h2;
115
- break;
116
- case 2: /* SHA512SU1 */
117
- feature = dc_isar_feature(aa64_sha512, s);
118
- oolfn = gen_helper_crypto_sha512su1;
119
- break;
120
- case 3: /* RAX1 */
121
- feature = dc_isar_feature(aa64_sha3, s);
122
- gvecfn = gen_gvec_rax1;
123
- break;
124
- default:
125
- g_assert_not_reached();
126
- }
127
- } else {
128
- switch (opcode) {
129
- case 0: /* SM3PARTW1 */
130
- feature = dc_isar_feature(aa64_sm3, s);
131
- oolfn = gen_helper_crypto_sm3partw1;
132
- break;
133
- case 1: /* SM3PARTW2 */
134
- feature = dc_isar_feature(aa64_sm3, s);
135
- oolfn = gen_helper_crypto_sm3partw2;
136
- break;
137
- case 2: /* SM4EKEY */
138
- feature = dc_isar_feature(aa64_sm4, s);
139
- oolfn = gen_helper_crypto_sm4ekey;
140
- break;
141
- default:
142
- unallocated_encoding(s);
143
- return;
144
- }
145
- }
146
-
147
- if (!feature) {
148
- unallocated_encoding(s);
149
- return;
150
- }
151
-
152
- if (!fp_access_check(s)) {
153
- return;
154
- }
155
-
156
- if (oolfn) {
157
- gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn);
158
- } else {
159
- gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64);
160
- }
161
-}
162
-
163
/* Crypto two-reg SHA512
164
* 31 12 11 10 9 5 4 0
165
* +-----------------------------------------+--------+------+------+
166
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
167
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
168
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
169
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
170
- { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
171
{ 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
172
{ 0xce000000, 0xff808000, disas_crypto_four_reg },
173
{ 0xce800000, 0xffe00000, disas_crypto_xar },
83
--
174
--
84
2.20.1
175
2.34.1
85
86
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20190108223129.5570-12-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-14-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/helper-a64.h | 2 +-
8
target/arm/tcg/a64.decode | 5 ++++
9
target/arm/helper-a64.c | 10 +++++-----
9
target/arm/tcg/translate-a64.c | 50 ++--------------------------------
10
target/arm/translate-a64.c | 7 ++++++-
10
2 files changed, 8 insertions(+), 47 deletions(-)
11
3 files changed, 12 insertions(+), 7 deletions(-)
12
11
13
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/helper-a64.h
14
--- a/target/arm/tcg/a64.decode
16
+++ b/target/arm/helper-a64.h
15
+++ b/target/arm/tcg/a64.decode
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(advsimd_f16tosinth, i32, f16, ptr)
16
@@ -XXX,XX +XXX,XX @@ RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3
18
DEF_HELPER_2(advsimd_f16touinth, i32, f16, ptr)
17
SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0
19
DEF_HELPER_2(sqrt_f16, f16, f16, ptr)
18
SM3PARTW2 1100 1110 011 ..... 110001 ..... ..... @rrr_q1e0
20
19
SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
21
-DEF_HELPER_1(exception_return, void, env)
20
+
22
+DEF_HELPER_2(exception_return, void, env, i64)
21
+### Cryptographic two-register SHA512
23
22
+
24
DEF_HELPER_FLAGS_3(pacia, TCG_CALL_NO_WG, i64, env, i64, i64)
23
+SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0
25
DEF_HELPER_FLAGS_3(pacib, TCG_CALL_NO_WG, i64, env, i64, i64)
24
+SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
26
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
25
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
27
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/helper-a64.c
27
--- a/target/arm/tcg/translate-a64.c
29
+++ b/target/arm/helper-a64.c
28
+++ b/target/arm/tcg/translate-a64.c
30
@@ -XXX,XX +XXX,XX @@ static int el_from_spsr(uint32_t spsr)
29
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3part
30
TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2)
31
TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
32
33
+TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0)
34
+TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e)
35
+
36
37
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
38
* Note that it is the caller's responsibility to ensure that the
39
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
31
}
40
}
32
}
41
}
33
42
34
-void HELPER(exception_return)(CPUARMState *env)
43
-/* Crypto two-reg SHA512
35
+void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
44
- * 31 12 11 10 9 5 4 0
36
{
45
- * +-----------------------------------------+--------+------+------+
37
int cur_el = arm_current_el(env);
46
- * | 1 1 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 | opcode | Rn | Rd |
38
unsigned int spsr_idx = aarch64_banked_spsr_index(cur_el);
47
- * +-----------------------------------------+--------+------+------+
39
@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env)
48
- */
40
aarch64_sync_64_to_32(env);
49
-static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
41
50
-{
42
if (spsr & CPSR_T) {
51
- int opcode = extract32(insn, 10, 2);
43
- env->regs[15] = env->elr_el[cur_el] & ~0x1;
52
- int rn = extract32(insn, 5, 5);
44
+ env->regs[15] = new_pc & ~0x1;
53
- int rd = extract32(insn, 0, 5);
45
} else {
54
- bool feature;
46
- env->regs[15] = env->elr_el[cur_el] & ~0x3;
55
-
47
+ env->regs[15] = new_pc & ~0x3;
56
- switch (opcode) {
48
}
57
- case 0: /* SHA512SU0 */
49
qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
58
- feature = dc_isar_feature(aa64_sha512, s);
50
"AArch32 EL%d PC 0x%" PRIx32 "\n",
59
- break;
51
@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env)
60
- case 1: /* SM4E */
52
env->pstate &= ~PSTATE_SS;
61
- feature = dc_isar_feature(aa64_sm4, s);
53
}
62
- break;
54
aarch64_restore_sp(env, new_el);
63
- default:
55
- env->pc = env->elr_el[cur_el];
64
- unallocated_encoding(s);
56
+ env->pc = new_pc;
65
- return;
57
qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
66
- }
58
"AArch64 EL%d PC 0x%" PRIx64 "\n",
67
-
59
cur_el, new_el, env->pc);
68
- if (!feature) {
60
@@ -XXX,XX +XXX,XX @@ illegal_return:
69
- unallocated_encoding(s);
61
* no change to exception level, execution state or stack pointer
70
- return;
62
*/
71
- }
63
env->pstate |= PSTATE_IL;
72
-
64
- env->pc = env->elr_el[cur_el];
73
- if (!fp_access_check(s)) {
65
+ env->pc = new_pc;
74
- return;
66
spsr &= PSTATE_NZCV | PSTATE_DAIF;
75
- }
67
spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF);
76
-
68
pstate_write(env, spsr);
77
- switch (opcode) {
69
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
78
- case 0: /* SHA512SU0 */
70
index XXXXXXX..XXXXXXX 100644
79
- gen_gvec_op2_ool(s, true, rd, rn, 0, gen_helper_crypto_sha512su0);
71
--- a/target/arm/translate-a64.c
80
- break;
72
+++ b/target/arm/translate-a64.c
81
- case 1: /* SM4E */
73
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
82
- gen_gvec_op3_ool(s, true, rd, rd, rn, 0, gen_helper_crypto_sm4e);
74
static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
83
- break;
75
{
84
- default:
76
unsigned int opc, op2, op3, rn, op4;
85
- g_assert_not_reached();
77
+ TCGv_i64 dst;
86
- }
78
87
-}
79
opc = extract32(insn, 21, 4);
88
-
80
op2 = extract32(insn, 16, 5);
89
/* Crypto four-register
81
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
90
* 31 23 22 21 20 16 15 14 10 9 5 4 0
82
if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
91
* +-------------------+-----+------+---+------+------+------+
83
gen_io_start();
92
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
84
}
93
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
85
- gen_helper_exception_return(cpu_env);
94
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
86
+ dst = tcg_temp_new_i64();
95
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
87
+ tcg_gen_ld_i64(dst, cpu_env,
96
- { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
88
+ offsetof(CPUARMState, elr_el[s->current_el]));
97
{ 0xce000000, 0xff808000, disas_crypto_four_reg },
89
+ gen_helper_exception_return(cpu_env, dst);
98
{ 0xce800000, 0xffe00000, disas_crypto_xar },
90
+ tcg_temp_free_i64(dst);
99
{ 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
91
if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
92
gen_io_end();
93
}
94
--
100
--
95
2.20.1
101
2.34.1
96
97
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
3
Add storage space for the 5 encryption keys.
4
2
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20190108223129.5570-2-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-15-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
7
---
10
target/arm/cpu.h | 30 +++++++++++++++++++++++++++++-
8
target/arm/tcg/a64.decode | 8 ++
11
1 file changed, 29 insertions(+), 1 deletion(-)
9
target/arm/tcg/translate-a64.c | 132 +++++++++++----------------------
10
2 files changed, 51 insertions(+), 89 deletions(-)
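The isar_feature_aa64_pauth() predicate added below reduces to an OR across
four ID_AA64ISAR1 fields. Written standalone with explicit field positions
(APA [7:4], API [11:8], GPA [27:24], GPI [31:28], per the Arm ARM), for
illustration only:

#include <stdbool.h>
#include <stdint.h>

static bool aa64_pauth_sketch(uint64_t id_aa64isar1)
{
    const uint64_t mask = (0xfull << 4) | (0xfull << 8) |
                          (0xfull << 24) | (0xfull << 28);
    return (id_aa64isar1 & mask) != 0;
}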
12
11
13
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/cpu.h
14
--- a/target/arm/tcg/a64.decode
16
+++ b/target/arm/cpu.h
15
+++ b/target/arm/tcg/a64.decode
17
@@ -XXX,XX +XXX,XX @@ typedef struct ARMVectorReg {
16
@@ -XXX,XX +XXX,XX @@
18
uint64_t d[2 * ARM_MAX_VQ] QEMU_ALIGNED(16);
17
&i imm
19
} ARMVectorReg;
18
&qrr_e q rd rn esz
20
19
&qrrr_e q rd rn rm esz
21
-/* In AArch32 mode, predicate registers do not exist at all. */
20
+&qrrrr_e q rd rn rm ra esz
22
#ifdef TARGET_AARCH64
21
23
+/* In AArch32 mode, predicate registers do not exist at all. */
22
@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
24
typedef struct ARMPredicateReg {
23
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
25
uint64_t p[2 * ARM_MAX_VQ / 8] QEMU_ALIGNED(16);
24
@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
26
} ARMPredicateReg;
25
@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
27
+
26
+@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
28
+/* In AArch32 mode, PAC keys do not exist at all. */
27
29
+typedef struct ARMPACKey {
28
### Data Processing - Immediate
30
+ uint64_t lo, hi;
29
31
+} ARMPACKey;
30
@@ -XXX,XX +XXX,XX @@ SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
32
#endif
31
33
32
SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0
34
33
SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
35
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
34
+
36
uint32_t cregs[16];
35
+### Cryptographic four-register
37
} iwmmxt;
36
+
38
37
+EOR3 1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3
39
+#ifdef TARGET_AARCH64
38
+BCAX 1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3
40
+ ARMPACKey apia_key;
39
+SM3SS1 1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3
41
+ ARMPACKey apib_key;
40
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
42
+ ARMPACKey apda_key;
41
index XXXXXXX..XXXXXXX 100644
43
+ ARMPACKey apdb_key;
42
--- a/target/arm/tcg/translate-a64.c
44
+ ARMPACKey apga_key;
43
+++ b/target/arm/tcg/translate-a64.c
45
+#endif
44
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
46
+
45
return true;
47
#if defined(CONFIG_USER_ONLY)
48
/* For usermode syscall translation. */
49
int eabi;
50
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
51
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
52
}
46
}
53
47
54
+static inline bool isar_feature_aa64_pauth(const ARMISARegisters *id)
48
+static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn)
55
+{
49
+{
56
+ /*
50
+ if (!a->q && a->esz == MO_64) {
57
+ * Note that while QEMU will only implement the architected algorithm
51
+ return false;
58
+ * QARMA, and thus APA+GPA, the host cpu for kvm may use implementation
52
+ }
59
+ * defined algorithms, and thus API+GPI, and this predicate controls
53
+ if (fp_access_check(s)) {
60
+ * migration of the 128-bit keys.
54
+ gen_gvec_fn4(s, a->q, a->rd, a->rn, a->rm, a->ra, fn, a->esz);
61
+ */
55
+ }
62
+ return (id->id_aa64isar1 &
56
+ return true;
63
+ (FIELD_DP64(0, ID_AA64ISAR1, APA, -1) |
64
+ FIELD_DP64(0, ID_AA64ISAR1, API, -1) |
65
+ FIELD_DP64(0, ID_AA64ISAR1, GPA, -1) |
66
+ FIELD_DP64(0, ID_AA64ISAR1, GPI, -1))) != 0;
67
+}
57
+}
68
+
58
+
69
static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
59
/*
70
{
60
* This utility function is for doing register extension with an
71
/* We always set the AdvSIMD and FP fields identically wrt FP16. */
61
* optional shift. You will likely want to pass a temporary for the
62
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
63
TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0)
64
TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e)
65
66
+TRANS_FEAT(EOR3, aa64_sha3, do_gvec_fn4, a, gen_gvec_eor3)
67
+TRANS_FEAT(BCAX, aa64_sha3, do_gvec_fn4, a, gen_gvec_bcax)
68
+
69
+static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a)
70
+{
71
+ if (!dc_isar_feature(aa64_sm3, s)) {
72
+ return false;
73
+ }
74
+ if (fp_access_check(s)) {
75
+ TCGv_i32 tcg_op1 = tcg_temp_new_i32();
76
+ TCGv_i32 tcg_op2 = tcg_temp_new_i32();
77
+ TCGv_i32 tcg_op3 = tcg_temp_new_i32();
78
+ TCGv_i32 tcg_res = tcg_temp_new_i32();
79
+ unsigned vsz, dofs;
80
+
81
+ read_vec_element_i32(s, tcg_op1, a->rn, 3, MO_32);
82
+ read_vec_element_i32(s, tcg_op2, a->rm, 3, MO_32);
83
+ read_vec_element_i32(s, tcg_op3, a->ra, 3, MO_32);
84
+
85
+ tcg_gen_rotri_i32(tcg_res, tcg_op1, 20);
86
+ tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2);
87
+ tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3);
88
+ tcg_gen_rotri_i32(tcg_res, tcg_res, 25);
89
+
90
+ /* Clear the whole register first, then store bits [127:96]. */
91
+ vsz = vec_full_reg_size(s);
92
+ dofs = vec_full_reg_offset(s, a->rd);
93
+ tcg_gen_gvec_dup_imm(MO_64, dofs, vsz, vsz, 0);
94
+ write_vec_element_i32(s, tcg_res, a->rd, 3, MO_32);
95
+ }
96
+ return true;
97
+}
98
99
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
100
* Note that it is the caller's responsibility to ensure that the
101
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
102
}
103
}
104
105
-/* Crypto four-register
106
- * 31 23 22 21 20 16 15 14 10 9 5 4 0
107
- * +-------------------+-----+------+---+------+------+------+
108
- * | 1 1 0 0 1 1 1 0 0 | Op0 | Rm | 0 | Ra | Rn | Rd |
109
- * +-------------------+-----+------+---+------+------+------+
110
- */
111
-static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
112
-{
113
- int op0 = extract32(insn, 21, 2);
114
- int rm = extract32(insn, 16, 5);
115
- int ra = extract32(insn, 10, 5);
116
- int rn = extract32(insn, 5, 5);
117
- int rd = extract32(insn, 0, 5);
118
- bool feature;
119
-
120
- switch (op0) {
121
- case 0: /* EOR3 */
122
- case 1: /* BCAX */
123
- feature = dc_isar_feature(aa64_sha3, s);
124
- break;
125
- case 2: /* SM3SS1 */
126
- feature = dc_isar_feature(aa64_sm3, s);
127
- break;
128
- default:
129
- unallocated_encoding(s);
130
- return;
131
- }
132
-
133
- if (!feature) {
134
- unallocated_encoding(s);
135
- return;
136
- }
137
-
138
- if (!fp_access_check(s)) {
139
- return;
140
- }
141
-
142
- if (op0 < 2) {
143
- TCGv_i64 tcg_op1, tcg_op2, tcg_op3, tcg_res[2];
144
- int pass;
145
-
146
- tcg_op1 = tcg_temp_new_i64();
147
- tcg_op2 = tcg_temp_new_i64();
148
- tcg_op3 = tcg_temp_new_i64();
149
- tcg_res[0] = tcg_temp_new_i64();
150
- tcg_res[1] = tcg_temp_new_i64();
151
-
152
- for (pass = 0; pass < 2; pass++) {
153
- read_vec_element(s, tcg_op1, rn, pass, MO_64);
154
- read_vec_element(s, tcg_op2, rm, pass, MO_64);
155
- read_vec_element(s, tcg_op3, ra, pass, MO_64);
156
-
157
- if (op0 == 0) {
158
- /* EOR3 */
159
- tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op3);
160
- } else {
161
- /* BCAX */
162
- tcg_gen_andc_i64(tcg_res[pass], tcg_op2, tcg_op3);
163
- }
164
- tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
165
- }
166
- write_vec_element(s, tcg_res[0], rd, 0, MO_64);
167
- write_vec_element(s, tcg_res[1], rd, 1, MO_64);
168
- } else {
169
- TCGv_i32 tcg_op1, tcg_op2, tcg_op3, tcg_res, tcg_zero;
170
-
171
- tcg_op1 = tcg_temp_new_i32();
172
- tcg_op2 = tcg_temp_new_i32();
173
- tcg_op3 = tcg_temp_new_i32();
174
- tcg_res = tcg_temp_new_i32();
175
- tcg_zero = tcg_constant_i32(0);
176
-
177
- read_vec_element_i32(s, tcg_op1, rn, 3, MO_32);
178
- read_vec_element_i32(s, tcg_op2, rm, 3, MO_32);
179
- read_vec_element_i32(s, tcg_op3, ra, 3, MO_32);
180
-
181
- tcg_gen_rotri_i32(tcg_res, tcg_op1, 20);
182
- tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2);
183
- tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3);
184
- tcg_gen_rotri_i32(tcg_res, tcg_res, 25);
185
-
186
- write_vec_element_i32(s, tcg_zero, rd, 0, MO_32);
187
- write_vec_element_i32(s, tcg_zero, rd, 1, MO_32);
188
- write_vec_element_i32(s, tcg_zero, rd, 2, MO_32);
189
- write_vec_element_i32(s, tcg_res, rd, 3, MO_32);
190
- }
191
-}
192
-
193
/* Crypto XAR
194
* 31 21 20 16 15 10 9 5 4 0
195
* +-----------------------+------+--------+------+------+
196
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
197
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
198
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
199
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
200
- { 0xce000000, 0xff808000, disas_crypto_four_reg },
201
{ 0xce800000, 0xffe00000, disas_crypto_xar },
202
{ 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
203
{ 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
72
--
204
--
73
2.20.1
205
2.34.1
74
75
diff view generated by jsdifflib
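The isar_feature_aa64_pauth predicate above ORs together all-ones masks for
the APA, API, GPA and GPI fields of ID_AA64ISAR1 and tests the register
against them. A standalone illustration of the same shift-and-mask test; the
field offsets used here (APA at bit 4, API at 8, GPA at 24, GPI at 28, each
4 bits wide) are assumptions taken from the ARM ARM layout, not from this
patch:

    #include <stdint.h>
    #include <stdio.h>

    /* Assumed ID_AA64ISAR1 field positions; see the ARM ARM for the
     * authoritative layout. */
    static int aa64_pauth_present(uint64_t id_aa64isar1)
    {
        uint64_t mask = (0xfull << 4)  |   /* APA */
                        (0xfull << 8)  |   /* API */
                        (0xfull << 24) |   /* GPA */
                        (0xfull << 28);    /* GPI */
        return (id_aa64isar1 & mask) != 0;
    }

    int main(void)
    {
        printf("%d\n", aa64_pauth_present(1ull << 4)); /* APA=1 -> 1 */
        printf("%d\n", aa64_pauth_present(0));         /* -> 0 */
        return 0;
    }
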
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Post v8.4 bits taken from SysReg_v85_xml-00bet8.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 45 +++++++++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void pmccntr_sync(CPUARMState *env);
 #define SCTLR_A       (1U << 1)
 #define SCTLR_C       (1U << 2)
 #define SCTLR_W       (1U << 3) /* up to v6; RAO in v7 */
-#define SCTLR_SA      (1U << 3)
+#define SCTLR_nTLSMD_32 (1U << 3) /* v8.2-LSMAOC, AArch32 only */
+#define SCTLR_SA      (1U << 3) /* AArch64 only */
 #define SCTLR_P       (1U << 4) /* up to v5; RAO in v6 and v7 */
+#define SCTLR_LSMAOE_32 (1U << 4) /* v8.2-LSMAOC, AArch32 only */
 #define SCTLR_SA0     (1U << 4) /* v8 onward, AArch64 only */
 #define SCTLR_D       (1U << 5) /* up to v5; RAO in v6 */
 #define SCTLR_CP15BEN (1U << 5) /* v7 onward */
 #define SCTLR_L       (1U << 6) /* up to v5; RAO in v6 and v7; RAZ in v8 */
+#define SCTLR_nAA     (1U << 6) /* when v8.4-LSE is implemented */
 #define SCTLR_B       (1U << 7) /* up to v6; RAZ in v7 */
 #define SCTLR_ITD     (1U << 7) /* v8 onward */
 #define SCTLR_S       (1U << 8) /* up to v6; RAZ in v7 */
@@ -XXX,XX +XXX,XX @@ void pmccntr_sync(CPUARMState *env);
 #define SCTLR_R       (1U << 9) /* up to v6; RAZ in v7 */
 #define SCTLR_UMA     (1U << 9) /* v8 onward, AArch64 only */
 #define SCTLR_F       (1U << 10) /* up to v6 */
-#define SCTLR_SW      (1U << 10) /* v7 onward */
-#define SCTLR_Z       (1U << 11)
+#define SCTLR_SW      (1U << 10) /* v7, RES0 in v8 */
+#define SCTLR_Z       (1U << 11) /* in v7, RES1 in v8 */
+#define SCTLR_EOS     (1U << 11) /* v8.5-ExS */
 #define SCTLR_I       (1U << 12)
-#define SCTLR_V       (1U << 13)
+#define SCTLR_V       (1U << 13) /* AArch32 only */
+#define SCTLR_EnDB    (1U << 13) /* v8.3, AArch64 only */
 #define SCTLR_RR      (1U << 14) /* up to v7 */
 #define SCTLR_DZE     (1U << 14) /* v8 onward, AArch64 only */
 #define SCTLR_L4      (1U << 15) /* up to v6; RAZ in v7 */
 #define SCTLR_UCT     (1U << 15) /* v8 onward, AArch64 only */
 #define SCTLR_DT      (1U << 16) /* up to ??, RAO in v6 and v7 */
 #define SCTLR_nTWI    (1U << 16) /* v8 onward */
-#define SCTLR_HA      (1U << 17)
+#define SCTLR_HA      (1U << 17) /* up to v7, RES0 in v8 */
 #define SCTLR_BR      (1U << 17) /* PMSA only */
 #define SCTLR_IT      (1U << 18) /* up to ??, RAO in v6 and v7 */
 #define SCTLR_nTWE    (1U << 18) /* v8 onward */
 #define SCTLR_WXN     (1U << 19)
 #define SCTLR_ST      (1U << 20) /* up to ??, RAZ in v6 */
-#define SCTLR_UWXN    (1U << 20) /* v7 onward */
-#define SCTLR_FI      (1U << 21)
-#define SCTLR_U       (1U << 22)
+#define SCTLR_UWXN    (1U << 20) /* v7 onward, AArch32 only */
+#define SCTLR_FI      (1U << 21) /* up to v7, v8 RES0 */
+#define SCTLR_IESB    (1U << 21) /* v8.2-IESB, AArch64 only */
+#define SCTLR_U       (1U << 22) /* up to v6, RAO in v7 */
+#define SCTLR_EIS     (1U << 22) /* v8.5-ExS */
 #define SCTLR_XP      (1U << 23) /* up to v6; v7 onward RAO */
+#define SCTLR_SPAN    (1U << 23) /* v8.1-PAN */
 #define SCTLR_VE      (1U << 24) /* up to v7 */
 #define SCTLR_E0E     (1U << 24) /* v8 onward, AArch64 only */
 #define SCTLR_EE      (1U << 25)
 #define SCTLR_L2      (1U << 26) /* up to v6, RAZ in v7 */
 #define SCTLR_UCI     (1U << 26) /* v8 onward, AArch64 only */
-#define SCTLR_NMFI    (1U << 27)
-#define SCTLR_TRE     (1U << 28)
-#define SCTLR_AFE     (1U << 29)
-#define SCTLR_TE      (1U << 30)
+#define SCTLR_NMFI    (1U << 27) /* up to v7, RAZ in v7VE and v8 */
+#define SCTLR_EnDA    (1U << 27) /* v8.3, AArch64 only */
+#define SCTLR_TRE     (1U << 28) /* AArch32 only */
+#define SCTLR_nTLSMD_64 (1U << 28) /* v8.2-LSMAOC, AArch64 only */
+#define SCTLR_AFE     (1U << 29) /* AArch32 only */
+#define SCTLR_LSMAOE_64 (1U << 29) /* v8.2-LSMAOC, AArch64 only */
+#define SCTLR_TE      (1U << 30) /* AArch32 only */
+#define SCTLR_EnIB    (1U << 30) /* v8.3, AArch64 only */
+#define SCTLR_EnIA    (1U << 31) /* v8.3, AArch64 only */
+#define SCTLR_BT0     (1ULL << 35) /* v8.5-BTI */
+#define SCTLR_BT1     (1ULL << 36) /* v8.5-BTI */
+#define SCTLR_ITFSB   (1ULL << 37) /* v8.5-MemTag */
+#define SCTLR_TCF0    (3ULL << 38) /* v8.5-MemTag */
+#define SCTLR_TCF     (3ULL << 40) /* v8.5-MemTag */
+#define SCTLR_ATA0    (1ULL << 42) /* v8.5-MemTag */
+#define SCTLR_ATA     (1ULL << 43) /* v8.5-MemTag */
+#define SCTLR_DSSBS   (1ULL << 44) /* v8.5 */
 
 #define CPTR_TCPAC    (1U << 31)
 #define CPTR_TTA      (1U << 20)
--
2.20.1

diff view generated by jsdifflib
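One detail worth noting in the patch above: the new v8.5 bits sit above bit
31, so they are built with 1ULL rather than 1U. A standalone reminder of why
(plain C, not QEMU code):

    #include <stdint.h>

    int main(void)
    {
        /* uint64_t bad = 1U << 35;  -- undefined: 1U is only 32 bits wide */
        uint64_t bt0 = 1ULL << 35;   /* SCTLR_BT0: widen before shifting */
        return bt0 != 0 ? 0 : 1;
    }
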
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

There are 5 bits of state that could be added, but to save
space within tbflags, add only a single enable bit.
Helpers will determine the rest of the state at runtime.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           |  1 +
 target/arm/translate.h     |  2 ++
 target/arm/helper.c        | 19 +++++++++++++++++++
 target/arm/translate-a64.c |  1 +
 4 files changed, 23 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, TBI0, 0, 1)
 FIELD(TBFLAG_A64, TBI1, 1, 1)
 FIELD(TBFLAG_A64, SVEEXC_EL, 2, 2)
 FIELD(TBFLAG_A64, ZCR_LEN, 4, 4)
+FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1)
 
 static inline bool bswap_code(bool sctlr_b)
 {
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool is_ldex;
     /* True if a single-step exception will be taken to the current EL */
     bool ss_same_el;
+    /* True if v8.3-PAuth is active. */
+    bool pauth_active;
     /* Bottom two bits of XScale c15_cpar coprocessor access control reg */
     int c15_cpar;
     /* TCG op of the current insn_start. */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A64, SVEEXC_EL, sve_el);
         flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
     }
+
+    if (cpu_isar_feature(aa64_pauth, cpu)) {
+        /*
+         * In order to save space in flags, we record only whether
+         * pauth is "inactive", meaning all insns are implemented as
+         * a nop, or "active" when some action must be performed.
+         * The decision of which action to take is left to a helper.
+         */
+        uint64_t sctlr;
+        if (current_el == 0) {
+            /* FIXME: ARMv8.1-VHE S2 translation regime. */
+            sctlr = env->cp15.sctlr_el[1];
+        } else {
+            sctlr = env->cp15.sctlr_el[current_el];
+        }
+        if (sctlr & (SCTLR_EnIA | SCTLR_EnIB | SCTLR_EnDA | SCTLR_EnDB)) {
+            flags = FIELD_DP32(flags, TBFLAG_A64, PAUTH_ACTIVE, 1);
+        }
+    }
 } else {
     *pc = env->regs[15];
     flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->fp_excp_el = FIELD_EX32(tb_flags, TBFLAG_ANY, FPEXC_EL);
     dc->sve_excp_el = FIELD_EX32(tb_flags, TBFLAG_A64, SVEEXC_EL);
     dc->sve_len = (FIELD_EX32(tb_flags, TBFLAG_A64, ZCR_LEN) + 1) * 16;
+    dc->pauth_active = FIELD_EX32(tb_flags, TBFLAG_A64, PAUTH_ACTIVE);
     dc->vec_len = 0;
     dc->vec_stride = 0;
     dc->cp_regs = arm_cpu->cp_regs;
--
2.20.1

diff view generated by jsdifflib
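FIELD_DP32/FIELD_EX32 above are QEMU's registerfields.h deposit/extract
helpers; PAUTH_ACTIVE is just one bit deposited at position 8 on the helper
side and extracted again in the translator. A standalone sketch of the round
trip, with hand-rolled dp32/ex32 standing in for the real macros:

    #include <stdint.h>
    #include <assert.h>

    /* Simplified deposit/extract; the field is (shift=8, len=1) as declared
     * by the FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1) line in the hunk above. */
    static uint32_t dp32(uint32_t x, int shift, int len, uint32_t v)
    {
        uint32_t mask = ((1u << len) - 1) << shift;
        return (x & ~mask) | ((v << shift) & mask);
    }

    static uint32_t ex32(uint32_t x, int shift, int len)
    {
        return (x >> shift) & ((1u << len) - 1);
    }

    int main(void)
    {
        uint32_t flags = 0;
        flags = dp32(flags, 8, 1, 1);    /* cpu_get_tb_cpu_state() side */
        assert(ex32(flags, 8, 1) == 1);  /* aarch64_tr_init_disas_context() side */
        return 0;
    }
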
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 70 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_lor_other(CPUARMState *env,
     return access_lor_ns(env);
 }
 
+#ifdef TARGET_AARCH64
+static CPAccessResult access_pauth(CPUARMState *env, const ARMCPRegInfo *ri,
+                                   bool isread)
+{
+    int el = arm_current_el(env);
+
+    if (el < 2 &&
+        arm_feature(env, ARM_FEATURE_EL2) &&
+        !(arm_hcr_el2_eff(env) & HCR_APK)) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+    if (el < 3 &&
+        arm_feature(env, ARM_FEATURE_EL3) &&
+        !(env->cp15.scr_el3 & SCR_APK)) {
+        return CP_ACCESS_TRAP_EL3;
+    }
+    return CP_ACCESS_OK;
+}
+
+static const ARMCPRegInfo pauth_reginfo[] = {
+    { .name = "APDAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 0,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apda_key.lo) },
+    { .name = "APDAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 1,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apda_key.hi) },
+    { .name = "APDBKEYLO_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 2,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apdb_key.lo) },
+    { .name = "APDBKEYHI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 3,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apdb_key.hi) },
+    { .name = "APGAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 3, .opc2 = 0,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apga_key.lo) },
+    { .name = "APGAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 3, .opc2 = 1,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apga_key.hi) },
+    { .name = "APIAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 0,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apia_key.lo) },
+    { .name = "APIAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apia_key.hi) },
+    { .name = "APIBKEYLO_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 2,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apib_key.lo) },
+    { .name = "APIBKEYHI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 3,
+      .access = PL1_RW, .accessfn = access_pauth,
+      .fieldoffset = offsetof(CPUARMState, apib_key.hi) },
+    REGINFO_SENTINEL
+};
+#endif
+
 void register_cp_regs_for_features(ARMCPU *cpu)
 {
     /* Register all the coprocessor registers based on feature bits */
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             define_one_arm_cp_reg(cpu, &zcr_el3_reginfo);
         }
     }
+
+#ifdef TARGET_AARCH64
+    if (cpu_isar_feature(aa64_pauth, cpu)) {
+        define_arm_cp_regs(cpu, pauth_reginfo);
+    }
+#endif
 }
 
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 10 ++++++++
 target/arm/tcg/translate-a64.c | 43 ++++++++++------------------------
 2 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM4E            1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
 EOR3            1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3
 BCAX            1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3
 SM3SS1          1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3
+
+### Cryptographic three-register, imm2
+
+&crypto3i       rd rn rm imm
+@crypto3i       ........ ... rm:5 .. imm:2 .. rn:5 rd:5 &crypto3i
+
+SM3TT1A         11001110 010 ..... 10 .. 00 ..... ..... @crypto3i
+SM3TT1B         11001110 010 ..... 10 .. 01 ..... ..... @crypto3i
+SM3TT2A         11001110 010 ..... 10 .. 10 ..... ..... @crypto3i
+SM3TT2B         11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a)
     return true;
 }
 
+static bool do_crypto3i(DisasContext *s, arg_crypto3i *a, gen_helper_gvec_3 *fn)
+{
+    if (fp_access_check(s)) {
+        gen_gvec_op3_ool(s, true, a->rd, a->rn, a->rm, a->imm, fn);
+    }
+    return true;
+}
+TRANS_FEAT(SM3TT1A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1a)
+TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b)
+TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a)
+TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b)
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
                  vec_full_reg_size(s));
 }
 
-/* Crypto three-reg imm2
- *  31                   21 20  16 15  14 13 12 11  10  9    5 4    0
- * +-----------------------+------+-----+------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 0 1 0 |  Rm  | 1 0 | imm2 | opcode |  Rn  |  Rd  |
- * +-----------------------+------+-----+------+--------+------+------+
- */
-static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
-{
-    static gen_helper_gvec_3 * const fns[4] = {
-        gen_helper_crypto_sm3tt1a, gen_helper_crypto_sm3tt1b,
-        gen_helper_crypto_sm3tt2a, gen_helper_crypto_sm3tt2b,
-    };
-    int opcode = extract32(insn, 10, 2);
-    int imm2 = extract32(insn, 12, 2);
-    int rm = extract32(insn, 16, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (!dc_isar_feature(aa64_sm3, s)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    gen_gvec_op3_ool(s, true, rd, rn, rm, imm2, fns[opcode]);
-}
-
 /* C3.6 Data processing - SIMD, inc Crypto
  *
  * As the decode gets a little complex we are using a table based
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0xce800000, 0xffe00000, disas_crypto_xar },
-    { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
     { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
--
2.34.1

diff view generated by jsdifflib
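The TRANS_FEAT lines above hide the feature check inside the generated
trans_* function. Roughly, from memory of the macro in
target/arm/tcg/translate.h (a sketch, not a verbatim expansion):

    /* TRANS_FEAT(SM3TT1A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1a)
     * expands to approximately: */
    static bool trans_SM3TT1A(DisasContext *s, arg_SM3TT1A *a)
    {
        /* decode fails (UNDEF) unless the SM3 feature is present */
        return dc_isar_feature(aa64_sm3, s)
            && do_crypto3i(s, a, gen_helper_crypto_sm3tt1a);
    }

This is why do_crypto3i itself only needs the fp_access_check: the feature
gate has already run by the time it is called.
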
From: Aaron Lindsay <aaron@os.amperecomputing.com>

The instruction event is only enabled when icount is used; cycles are
always supported. Always defining get_cycle_count (but altering its
behavior depending on CONFIG_USER_ONLY) allows us to remove some
CONFIG_USER_ONLY #defines throughout the rest of the code.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20181211151945.29137-12-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 90 ++++++++++++++++++++++-----------------------
 1 file changed, 44 insertions(+), 46 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 #include "arm_ldst.h"
 #include <zlib.h> /* For crc32 */
 #include "exec/semihost.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "fpu/softfloat.h"
 #include "qemu/range.h"
@@ -XXX,XX +XXX,XX @@ typedef struct pm_event {
     uint64_t (*get_count)(CPUARMState *);
 } pm_event;
 
+static bool event_always_supported(CPUARMState *env)
+{
+    return true;
+}
+
+/*
+ * Return the underlying cycle count for the PMU cycle counters. If we're in
+ * usermode, simply return 0.
+ */
+static uint64_t cycles_get_count(CPUARMState *env)
+{
+#ifndef CONFIG_USER_ONLY
+    return muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+                    ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
+#else
+    return cpu_get_host_ticks();
+#endif
+}
+
+#ifndef CONFIG_USER_ONLY
+static bool instructions_supported(CPUARMState *env)
+{
+    return use_icount == 1 /* Precise instruction counting */;
+}
+
+static uint64_t instructions_get_count(CPUARMState *env)
+{
+    return (uint64_t)cpu_get_icount_raw();
+}
+#endif
+
 static const pm_event pm_events[] = {
+#ifndef CONFIG_USER_ONLY
+    { .number = 0x008, /* INST_RETIRED, Instruction architecturally executed */
+      .supported = instructions_supported,
+      .get_count = instructions_get_count,
+    },
+    { .number = 0x011, /* CPU_CYCLES, Cycle */
+      .supported = event_always_supported,
+      .get_count = cycles_get_count,
+    }
+#endif
 };
 
 /*
@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
  * should first be updated to something sparse instead of the current
  * supported_event_map[] array.
  */
-#define MAX_EVENT_ID 0x0
+#define MAX_EVENT_ID 0x11
 #define UNSUPPORTED_EVENT UINT16_MAX
 static uint16_t supported_event_map[MAX_EVENT_ID + 1];
 
@@ -XXX,XX +XXX,XX @@ static CPAccessResult pmreg_access_swinc(CPUARMState *env,
     return pmreg_access(env, ri, isread);
 }
 
-#ifndef CONFIG_USER_ONLY
-
 static CPAccessResult pmreg_access_selr(CPUARMState *env,
                                         const ARMCPRegInfo *ri,
                                         bool isread)
@@ -XXX,XX +XXX,XX @@ static bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
  */
 void pmccntr_op_start(CPUARMState *env)
 {
-    uint64_t cycles = 0;
-    cycles = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                      ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
+    uint64_t cycles = cycles_get_count(env);
 
     if (pmu_counter_enabled(env, 31)) {
         uint64_t eff_cycles = cycles;
@@ -XXX,XX +XXX,XX @@ static void pmccntr_write32(CPUARMState *env, const ARMCPRegInfo *ri,
     pmccntr_write(env, ri, deposit64(cur_val, 0, 32, value));
 }
 
-#else /* CONFIG_USER_ONLY */
-
-void pmccntr_op_start(CPUARMState *env)
-{
-}
-
-void pmccntr_op_finish(CPUARMState *env)
-{
-}
-
-void pmevcntr_op_start(CPUARMState *env, uint8_t i)
-{
-}
-
-void pmevcntr_op_finish(CPUARMState *env, uint8_t i)
-{
-}
-
-void pmu_op_start(CPUARMState *env)
-{
-}
-
-void pmu_op_finish(CPUARMState *env)
-{
-}
-
-void pmu_pre_el_change(ARMCPU *cpu, void *ignored)
-{
-}
-
-void pmu_post_el_change(ARMCPU *cpu, void *ignored)
-{
-}
-
-#endif
-
 static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
     /* Unimplemented so WI. */
     { .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
       .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NOP },
-#ifndef CONFIG_USER_ONLY
     { .name = "PMSELR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 5,
       .access = PL0_RW, .type = ARM_CP_ALIAS,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmselr),
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c15_ccnt),
       .readfn = pmccntr_read, .writefn = pmccntr_write,
       .raw_readfn = raw_read, .raw_writefn = raw_write, },
-#endif
     { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
       .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
       .access = PL0_RW, .accessfn = pmreg_access,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          * count register.
          */
         unsigned int i, pmcrn = 0;
-#ifndef CONFIG_USER_ONLY
         ARMCPRegInfo pmcr = {
             .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
             .access = PL0_RW,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             g_free(pmevtyper_name);
             g_free(pmevtyper_el0_name);
         }
-#endif
         ARMCPRegInfo clidr = {
             .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  4 ++++
 target/arm/tcg/translate-a64.c | 43 +++++++++++-----------------------
 2 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM3TT1A         11001110 010 ..... 10 .. 00 ..... ..... @crypto3i
 SM3TT1B         11001110 010 ..... 10 .. 01 ..... ..... @crypto3i
 SM3TT2A         11001110 010 ..... 10 .. 10 ..... ..... @crypto3i
 SM3TT2B         11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
+
+### Cryptographic XAR
+
+XAR             1100 1110 100 rm:5 imm:6 rn:5 rd:5
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b)
 TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a)
 TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b)
 
+static bool trans_XAR(DisasContext *s, arg_XAR *a)
+{
+    if (!dc_isar_feature(aa64_sha3, s)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_xar(MO_64, vec_full_reg_offset(s, a->rd),
+                     vec_full_reg_offset(s, a->rn),
+                     vec_full_reg_offset(s, a->rm), a->imm, 16,
+                     vec_full_reg_size(s));
+    }
+    return true;
+}
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto XAR
- *  31                   21 20  16 15    10 9    5 4    0
- * +-----------------------+------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 1 0 0 |  Rm  |  imm6  |  Rn  |  Rd  |
- * +-----------------------+------+--------+------+------+
- */
-static void disas_crypto_xar(DisasContext *s, uint32_t insn)
-{
-    int rm = extract32(insn, 16, 5);
-    int imm6 = extract32(insn, 10, 6);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (!dc_isar_feature(aa64_sha3, s)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd),
-                 vec_full_reg_offset(s, rn),
-                 vec_full_reg_offset(s, rm), imm6, 16,
-                 vec_full_reg_size(s));
-}
-
 /* C3.6 Data processing - SIMD, inc Crypto
  *
  * As the decode gets a little complex we are using a table based
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xce800000, 0xffe00000, disas_crypto_xar },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
     { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
--
2.34.1

diff view generated by jsdifflib
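cycles_get_count() above scales the virtual clock (in nanoseconds) by
ARM_CPU_FREQ over NANOSECONDS_PER_SECOND. A standalone sketch of the
arithmetic, with a simplified 128-bit-intermediate muldiv64 and an assumed
1 GHz ARM_CPU_FREQ (the __uint128_t shortcut is GCC/Clang-specific; QEMU's
real helper avoids it):

    #include <stdint.h>
    #include <stdio.h>

    /* Simplified stand-in for QEMU's overflow-safe muldiv64(). */
    static uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
    {
        return (uint64_t)((__uint128_t)a * b / c);
    }

    int main(void)
    {
        uint64_t ns = 2500000;  /* 2.5 ms of virtual time */
        /* At 1 GHz, 2.5 ms maps to 2,500,000 cycles. */
        printf("%llu\n", (unsigned long long)
               muldiv64(ns, 1000000000u, 1000000000u));
        return 0;
    }
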
From: Aaron Lindsay <aaron@os.amperecomputing.com>

Add arrays to hold the registers, the definitions themselves, access
functions, and logic to reset counters when PMCR.P is set. Update
filtering code to support counters other than PMCCNTR. Support migration
with raw read/write functions.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181211151945.29137-11-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |   3 +
 target/arm/helper.c | 296 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 282 insertions(+), 17 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
          * pmccntr_op_finish.
          */
         uint64_t c15_ccnt_delta;
+        uint64_t c14_pmevcntr[31];
+        uint64_t c14_pmevcntr_delta[31];
+        uint64_t c14_pmevtyper[31];
         uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
         uint64_t vpidr_el2; /* Virtualization Processor ID Register */
         uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 #define PMCRDP  0x10
 #define PMCRD   0x8
 #define PMCRC   0x4
+#define PMCRP   0x2
 #define PMCRE   0x1
 
 #define PMXEVTYPER_P          0x80000000
@@ -XXX,XX +XXX,XX @@ uint64_t get_pmceid(CPUARMState *env, unsigned which)
     return pmceid;
 }
 
+/*
+ * Check at runtime whether a PMU event is supported for the current machine
+ */
+static bool event_supported(uint16_t number)
+{
+    if (number > MAX_EVENT_ID) {
+        return false;
+    }
+    return supported_event_map[number] != UNSUPPORTED_EVENT;
+}
+
 static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                    bool isread)
 {
@@ -XXX,XX +XXX,XX @@ static bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
         prohibited = env->cp15.c9_pmcr & PMCRDP;
     }
 
-    /* TODO Remove assert, set filter to correct PMEVTYPER */
-    assert(counter == 31);
-    filter = env->cp15.pmccfiltr_el0;
+    if (counter == 31) {
+        filter = env->cp15.pmccfiltr_el0;
+    } else {
+        filter = env->cp15.c14_pmevtyper[counter];
+    }
 
     p = filter & PMXEVTYPER_P;
     u = filter & PMXEVTYPER_U;
@@ -XXX,XX +XXX,XX @@ static bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
         filtered = m != p;
     }
 
+    if (counter != 31) {
+        /*
+         * If not checking PMCCNTR, ensure the counter is setup to an event we
+         * support
+         */
+        uint16_t event = filter & PMXEVTYPER_EVTCOUNT;
+        if (!event_supported(event)) {
+            return false;
+        }
+    }
+
     return enabled && !prohibited && !filtered;
 }
 
@@ -XXX,XX +XXX,XX @@ void pmccntr_op_finish(CPUARMState *env)
     }
 }
 
+static void pmevcntr_op_start(CPUARMState *env, uint8_t counter)
+{
+    uint16_t event = env->cp15.c14_pmevtyper[counter] & PMXEVTYPER_EVTCOUNT;
+    uint64_t count = 0;
+    if (event_supported(event)) {
+        uint16_t event_idx = supported_event_map[event];
+        count = pm_events[event_idx].get_count(env);
+    }
+
+    if (pmu_counter_enabled(env, counter)) {
+        env->cp15.c14_pmevcntr[counter] =
+            count - env->cp15.c14_pmevcntr_delta[counter];
+    }
+    env->cp15.c14_pmevcntr_delta[counter] = count;
+}
+
+static void pmevcntr_op_finish(CPUARMState *env, uint8_t counter)
+{
+    if (pmu_counter_enabled(env, counter)) {
+        env->cp15.c14_pmevcntr_delta[counter] -=
+            env->cp15.c14_pmevcntr[counter];
+    }
+}
+
 void pmu_op_start(CPUARMState *env)
 {
+    unsigned int i;
     pmccntr_op_start(env);
+    for (i = 0; i < pmu_num_counters(env); i++) {
+        pmevcntr_op_start(env, i);
+    }
 }
 
 void pmu_op_finish(CPUARMState *env)
 {
+    unsigned int i;
     pmccntr_op_finish(env);
+    for (i = 0; i < pmu_num_counters(env); i++) {
+        pmevcntr_op_finish(env, i);
+    }
 }
 
 void pmu_pre_el_change(ARMCPU *cpu, void *ignored)
@@ -XXX,XX +XXX,XX @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         env->cp15.c15_ccnt = 0;
     }
 
+    if (value & PMCRP) {
+        unsigned int i;
+        for (i = 0; i < pmu_num_counters(env); i++) {
+            env->cp15.c14_pmevcntr[i] = 0;
+        }
+    }
+
     /* only the DP, X, D and E bits are writable */
     env->cp15.c9_pmcr &= ~0x39;
     env->cp15.c9_pmcr |= (value & 0x39);
@@ -XXX,XX +XXX,XX @@ void pmccntr_op_finish(CPUARMState *env)
 {
 }
 
+void pmevcntr_op_start(CPUARMState *env, uint8_t i)
+{
+}
+
+void pmevcntr_op_finish(CPUARMState *env, uint8_t i)
+{
+}
+
 void pmu_op_start(CPUARMState *env)
 {
 }
@@ -XXX,XX +XXX,XX @@ static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->cp15.c9_pmovsr |= value;
 }
 
-static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                             uint64_t value)
+static void pmevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                            uint64_t value, const uint8_t counter)
 {
+    if (counter == 31) {
+        pmccfiltr_write(env, ri, value);
+    } else if (counter < pmu_num_counters(env)) {
+        pmevcntr_op_start(env, counter);
+
+        /*
+         * If this counter's event type is changing, store the current
+         * underlying count for the new type in c14_pmevcntr_delta[counter] so
+         * pmevcntr_op_finish has the correct baseline when it converts back to
+         * a delta.
+         */
+        uint16_t old_event = env->cp15.c14_pmevtyper[counter] &
+            PMXEVTYPER_EVTCOUNT;
+        uint16_t new_event = value & PMXEVTYPER_EVTCOUNT;
+        if (old_event != new_event) {
+            uint64_t count = 0;
+            if (event_supported(new_event)) {
+                uint16_t event_idx = supported_event_map[new_event];
+                count = pm_events[event_idx].get_count(env);
+            }
+            env->cp15.c14_pmevcntr_delta[counter] = count;
+        }
+
+        env->cp15.c14_pmevtyper[counter] = value & PMXEVTYPER_MASK;
+        pmevcntr_op_finish(env, counter);
+    }
     /* Attempts to access PMXEVTYPER are CONSTRAINED UNPREDICTABLE when
      * PMSELR value is equal to or greater than the number of implemented
      * counters, but not equal to 0x1f. We opt to behave as a RAZ/WI.
      */
-    if (env->cp15.c9_pmselr == 0x1f) {
-        pmccfiltr_write(env, ri, value);
+}
+
+static uint64_t pmevtyper_read(CPUARMState *env, const ARMCPRegInfo *ri,
+                               const uint8_t counter)
+{
+    if (counter == 31) {
+        return env->cp15.pmccfiltr_el0;
+    } else if (counter < pmu_num_counters(env)) {
+        return env->cp15.c14_pmevtyper[counter];
+    } else {
+        /*
+         * We opt to behave as a RAZ/WI when attempts to access PMXEVTYPER
+         * are CONSTRAINED UNPREDICTABLE. See comments in pmevtyper_write().
+         */
+        return 0;
     }
 }
 
+static void pmevtyper_writefn(CPUARMState *env, const ARMCPRegInfo *ri,
+                              uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    pmevtyper_write(env, ri, value, counter);
+}
+
+static void pmevtyper_rawwrite(CPUARMState *env, const ARMCPRegInfo *ri,
+                               uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    env->cp15.c14_pmevtyper[counter] = value;
+
+    /*
+     * pmevtyper_rawwrite is called between a pair of pmu_op_start and
+     * pmu_op_finish calls when loading saved state for a migration. Because
+     * we're potentially updating the type of event here, the value written to
+     * c14_pmevcntr_delta by the preceding pmu_op_start call may be for a
+     * different counter type. Therefore, we need to set this value to the
+     * current count for the counter type we're writing so that pmu_op_finish
+     * has the correct count for its calculation.
+     */
+    uint16_t event = value & PMXEVTYPER_EVTCOUNT;
+    if (event_supported(event)) {
+        uint16_t event_idx = supported_event_map[event];
+        env->cp15.c14_pmevcntr_delta[counter] =
+            pm_events[event_idx].get_count(env);
+    }
+}
+
+static uint64_t pmevtyper_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    return pmevtyper_read(env, ri, counter);
+}
+
+static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
+{
+    pmevtyper_write(env, ri, value, env->cp15.c9_pmselr & 31);
+}
+
 static uint64_t pmxevtyper_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-    /* We opt to behave as a RAZ/WI when attempts to access PMXEVTYPER
-     * are CONSTRAINED UNPREDICTABLE. See comments in pmxevtyper_write().
+    return pmevtyper_read(env, ri, env->cp15.c9_pmselr & 31);
+}
+
+static void pmevcntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                           uint64_t value, uint8_t counter)
+{
+    if (counter < pmu_num_counters(env)) {
+        pmevcntr_op_start(env, counter);
+        env->cp15.c14_pmevcntr[counter] = value;
+        pmevcntr_op_finish(env, counter);
+    }
+    /*
+     * We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
+     * are CONSTRAINED UNPREDICTABLE.
      */
-    if (env->cp15.c9_pmselr == 0x1f) {
-        return env->cp15.pmccfiltr_el0;
+}
+
+static uint64_t pmevcntr_read(CPUARMState *env, const ARMCPRegInfo *ri,
+                              uint8_t counter)
+{
+    if (counter < pmu_num_counters(env)) {
+        uint64_t ret;
+        pmevcntr_op_start(env, counter);
+        ret = env->cp15.c14_pmevcntr[counter];
+        pmevcntr_op_finish(env, counter);
+        return ret;
     } else {
+        /* We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
+         * are CONSTRAINED UNPREDICTABLE. */
         return 0;
     }
 }
 
+static void pmevcntr_writefn(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    pmevcntr_write(env, ri, value, counter);
+}
+
+static uint64_t pmevcntr_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    return pmevcntr_read(env, ri, counter);
+}
+
+static void pmevcntr_rawwrite(CPUARMState *env, const ARMCPRegInfo *ri,
+                              uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    assert(counter < pmu_num_counters(env));
+    env->cp15.c14_pmevcntr[counter] = value;
+    pmevcntr_write(env, ri, value, counter);
+}
+
+static uint64_t pmevcntr_rawread(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    assert(counter < pmu_num_counters(env));
+    return env->cp15.c14_pmevcntr[counter];
+}
+
+static void pmxevcntr_write(CPUARMState *env, const ARMCPRegInfo *ri,

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  13 +
 target/arm/tcg/translate-a64.c | 426 +++++++++++----------------------
 2 files changed, 152 insertions(+), 287 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM3TT2B         11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
 ### Cryptographic XAR
 
 XAR             1100 1110 100 rm:5 imm:6 rn:5 rd:5
+
+### Advanced SIMD scalar copy
+
+DUP_element_s   0101 1110 000 imm:5 0 0000 1 rn:5 rd:5
+
+### Advanced SIMD copy
+
+DUP_element_v   0 q:1 00 1110 000 imm:5 0 0000 1 rn:5 rd:5
+DUP_general     0 q:1 00 1110 000 imm:5 0 0001 1 rn:5 rd:5
+INS_general     0 1   00 1110 000 imm:5 0 0011 1 rn:5 rd:5
+SMOV            0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
+UMOV            0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
+INS_element     0 1   10 1110 000 di:5  0 si:4 1 rn:5 rd:5
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_XAR(DisasContext *s, arg_XAR *a)
     return true;
 }
 
+/*
+ * Advanced SIMD copy
+ */
+
+static bool decode_esz_idx(int imm, MemOp *pesz, unsigned *pidx)
+{
+    unsigned esz = ctz32(imm);
+    if (esz <= MO_64) {
+        *pesz = esz;
+        *pidx = imm >> (esz + 1);
+        return true;
+    }
+    return false;
+}
+
+static bool trans_DUP_element_s(DisasContext *s, arg_DUP_element_s *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        /*
+         * This instruction just extracts the specified element and
+         * zero-extends it into the bottom of the destination register.
+         */
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        read_vec_element(s, tmp, a->rn, idx, esz);
+        write_fp_dreg(s, a->rd, tmp);
+    }
+    return true;
+}
+
+static bool trans_DUP_element_v(DisasContext *s, arg_DUP_element_v *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_mem(esz, vec_full_reg_offset(s, a->rd),
+                             vec_reg_offset(s, a->rn, idx, esz),
+                             a->q ? 16 : 8, vec_full_reg_size(s));
+    }
+    return true;
+}
+
+static bool trans_DUP_general(DisasContext *s, arg_DUP_general *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd),
+                             a->q ? 16 : 8, vec_full_reg_size(s),
+                             cpu_reg(s, a->rn));
+    }
+    return true;
+}
+
+static bool do_smov_umov(DisasContext *s, arg_SMOV *a, MemOp is_signed)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (is_signed) {
+        if (esz == MO_64 || (esz == MO_32 && !a->q)) {
+            return false;
+        }
+    } else {
+        if (esz == MO_64 ? !a->q : a->q) {
+            return false;
+        }
+    }
+    if (fp_access_check(s)) {
+        TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
+        read_vec_element(s, tcg_rd, a->rn, idx, esz | is_signed);
+        if (is_signed && !a->q) {
+            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+        }
+    }
+    return true;
+}
+
+TRANS(SMOV, do_smov_umov, a, MO_SIGN)
+TRANS(UMOV, do_smov_umov, a, 0)
+
+static bool trans_INS_general(DisasContext *s, arg_INS_general *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        write_vec_element(s, cpu_reg(s, a->rn), a->rd, idx, esz);
+        clear_vec_high(s, true, a->rd);
+    }
+    return true;
+}
+
+static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
+{
+    MemOp esz;
+    unsigned didx, sidx;
+
+    if (!decode_esz_idx(a->di, &esz, &didx)) {
+        return false;
+    }
+    sidx = a->si >> esz;
+    if (fp_access_check(s)) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+
+        read_vec_element(s, tmp, a->rn, sidx, esz);
+        write_vec_element(s, tmp, a->rd, didx, esz);
+
+        /* INS is considered a 128-bit write for SVE. */
+        clear_vec_high(s, true, a->rd);
+    }
+    return true;
+}
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     write_fp_dreg(s, rd, tcg_res);
 }
 
-/* DUP (Element, Vector)
- *
- *  31  30   29              21 20    16 15        10  9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
-                             int imm5)
-{
-    int size = ctz32(imm5);
-    int index;
-
-    if (size > 3 || (size == 3 && !is_q)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    index = imm5 >> (size + 1);
-    tcg_gen_gvec_dup_mem(size, vec_full_reg_offset(s, rd),
-                         vec_reg_offset(s, rn, index, size),
-                         is_q ? 16 : 8, vec_full_reg_size(s));
-}
-
-/* DUP (element, scalar)
- *  31                   21 20    16 15        10  9    5 4    0
- * +-----------------------+--------+-------------+------+------+
- * | 0 1 0 1 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
- * +-----------------------+--------+-------------+------+------+
- */
-static void handle_simd_dupes(DisasContext *s, int rd, int rn,
-                              int imm5)
-{
-    int size = ctz32(imm5);
-    int index;
-    TCGv_i64 tmp;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    index = imm5 >> (size + 1);
-
-    /* This instruction just extracts the specified element and
-     * zero-extends it into the bottom of the destination register.
-     */
-    tmp = tcg_temp_new_i64();
-    read_vec_element(s, tmp, rn, index, size);
-    write_fp_dreg(s, rd, tmp);
-}
-
-/* DUP (General)
- *
- *  31  30   29              21 20    16 15        10  9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 1 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_dupg(DisasContext *s, int is_q, int rd, int rn,
-                             int imm5)
-{
-    int size = ctz32(imm5);
-    uint32_t dofs, oprsz, maxsz;
-
-    if (size > 3 || ((size == 3) && !is_q)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    dofs = vec_full_reg_offset(s, rd);
-    oprsz = is_q ? 16 : 8;
-    maxsz = vec_full_reg_size(s);
-
-    tcg_gen_gvec_dup_i64(size, dofs, oprsz, maxsz, cpu_reg(s, rn));
-}
-
-/* INS (Element)
- *
- *  31                   21 20    16 15  14    11  10 9    5 4    0
- * +-----------------------+--------+------------+---+------+------+
- * | 0 1 1 0 1 1 1 0 0 0 0 |  imm5  | 0 |  imm4  | 1 |  Rn  |  Rd  |
- * +-----------------------+--------+------------+---+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- * index: encoded in imm5<4:size+1>
- */
-static void handle_simd_inse(DisasContext *s, int rd, int rn,
-                             int imm4, int imm5)
-{
-    int size = ctz32(imm5);
-    int src_index, dst_index;
-    TCGv_i64 tmp;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    dst_index = extract32(imm5, 1+size, 5);
-    src_index = extract32(imm4, size, 4);
-
-    tmp = tcg_temp_new_i64();
-
-    read_vec_element(s, tmp, rn, src_index, size);
-    write_vec_element(s, tmp, rd, dst_index, size);
-
-    /* INS is considered a 128-bit write for SVE. */
-    clear_vec_high(s, true, rd);
-}
-
-
-/* INS (General)
- *
- *  31                   21 20    16 15        10  9    5 4    0
- * +-----------------------+--------+-------------+------+------+
- * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 1 1 1 |  Rn  |  Rd  |
- * +-----------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- * index: encoded in imm5<4:size+1>
- */
-static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
-{
-    int size = ctz32(imm5);
-    int idx;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    idx = extract32(imm5, 1 + size, 4 - size);
-    write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
-
-    /* INS is considered a 128-bit write for SVE. */
-    clear_vec_high(s, true, rd);
-}
-
-/*
- * UMOV (General)
- * SMOV (General)
- *
- *  31  30   29              21 20    16 15 12   10 9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 1 U 1 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * U: unsigned when set
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_umov_smov(DisasContext *s, int is_q, int is_signed,
-                                  int rn, int rd, int imm5)
-{
-    int size = ctz32(imm5);
-    int element;
-    TCGv_i64 tcg_rd;
-
-    /* Check for UnallocatedEncodings */
-    if (is_signed) {
-        if (size > 2 || (size == 2 && !is_q)) {
-            unallocated_encoding(s);
-            return;
-        }
-    } else {
-        if (size > 3
-            || (size < 3 && is_q)
-            || (size == 3 && !is_q)) {
-            unallocated_encoding(s);
-            return;
-        }
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    element = extract32(imm5, 1+size, 4);
-
-    tcg_rd = cpu_reg(s, rd);
-    read_vec_element(s, tcg_rd, rn, element, size | (is_signed ? MO_SIGN : 0));
-    if (is_signed && !is_q) {
-        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
-    }
-}
-
-/* AdvSIMD copy
- *   31  30  29  28             21 20  16 15  14  11 10  9    5 4    0
- * +---+---+----+-----------------+------+---+------+---+------+------+
- * | 0 | Q | op | 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
- * +---+---+----+-----------------+------+---+------+---+------+------+
- */
-static void disas_simd_copy(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int imm4 = extract32(insn, 11, 4);
-    int op = extract32(insn, 29, 1);
-    int is_q = extract32(insn, 30, 1);
-    int imm5 = extract32(insn, 16, 5);
-
-    if (op) {
-        if (is_q) {
-            /* INS (element) */
-            handle_simd_inse(s, rd, rn, imm4, imm5);
-        } else {
-            unallocated_encoding(s);
-        }
-    } else {
-        switch (imm4) {
-        case 0:
-            /* DUP (element - vector) */
-            handle_simd_dupe(s, is_q, rd, rn, imm5);
-            break;
-        case 1:
-            /* DUP (general) */
-            handle_simd_dupg(s, is_q, rd, rn, imm5);
-            break;
-        case 3:
-            if (is_q) {
-                /* INS (general) */
-                handle_simd_insg(s, rd, rn, imm5);
-            } else {
-                unallocated_encoding(s);
-            }
-            break;
-        case 5:
-        case 7:
-            /* UMOV/SMOV (is_q indicates 32/64; imm4 indicates signedness) */
-            handle_simd_umov_smov(s, is_q, (imm4 == 5), rn, rd, imm5);
-            break;
-        default:
-            unallocated_encoding(s);
-            break;
-        }
-    }
-}
-
 /* AdvSIMD modified immediate
  *  31  30   29  28                 19 18 16 15   12  11  10  9     5 4    0
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
         }
     }
 }
 
-/* AdvSIMD scalar copy
- *  31 30  29  28             21 20  16 15  14  11 10  9    5 4    0
- * +-----+----+-----------------+------+---+------+---+------+------+
- * | 0 1 | op | 1 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
- * +-----+----+-----------------+------+---+------+---+------+------+
- */
-static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int imm4 = extract32(insn, 11, 4);
-    int imm5 = extract32(insn, 16, 5);
-    int op = extract32(insn, 29, 1);
-
-    if (op != 0 || imm4 != 0) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    /* DUP (element, scalar) */
-    handle_simd_dupes(s, rd, rn, imm5);
-}
-
 /* AdvSIMD scalar pairwise
  *  31 30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
  * +-----+---+-----------+------+-----------+--------+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
     { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
     { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
-    { 0x0e000400, 0x9fe08400, disas_simd_copy },
     { 0x0f000000, 0x9f000400, disas_simd_indexed }, /* vector indexed */
     /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
     { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
     { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
-    { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
342
+ uint64_t value)
343
+{
344
+ pmevcntr_write(env, ri, value, env->cp15.c9_pmselr & 31);
345
+}
346
+
347
+static uint64_t pmxevcntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
348
+{
349
+ return pmevcntr_read(env, ri, env->cp15.c9_pmselr & 31);
350
+}
351
+
352
static void pmuserenr_write(CPUARMState *env, const ARMCPRegInfo *ri,
353
uint64_t value)
354
{
355
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
356
.fieldoffset = offsetof(CPUARMState, cp15.pmccfiltr_el0),
357
.resetvalue = 0, },
358
{ .name = "PMXEVTYPER", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 1,
359
- .access = PL0_RW, .type = ARM_CP_NO_RAW, .accessfn = pmreg_access,
360
+ .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
361
+ .accessfn = pmreg_access,
362
.writefn = pmxevtyper_write, .readfn = pmxevtyper_read },
363
{ .name = "PMXEVTYPER_EL0", .state = ARM_CP_STATE_AA64,
364
.opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 1,
365
- .access = PL0_RW, .type = ARM_CP_NO_RAW, .accessfn = pmreg_access,
366
+ .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
367
+ .accessfn = pmreg_access,
368
.writefn = pmxevtyper_write, .readfn = pmxevtyper_read },
369
- /* Unimplemented, RAZ/WI. */
370
{ .name = "PMXEVCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 2,
371
- .access = PL0_RW, .type = ARM_CP_CONST, .resetvalue = 0,
372
- .accessfn = pmreg_access_xevcntr },
373
+ .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
374
+ .accessfn = pmreg_access_xevcntr,
375
+ .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
376
+ { .name = "PMXEVCNTR_EL0", .state = ARM_CP_STATE_AA64,
377
+ .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 2,
378
+ .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
379
+ .accessfn = pmreg_access_xevcntr,
380
+ .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
381
{ .name = "PMUSERENR", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 0,
382
.access = PL0_R | PL1_RW, .accessfn = access_tpm,
383
.fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmuserenr),
384
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
385
#endif
386
/* The only field of MDCR_EL2 that has a defined architectural reset value
387
* is MDCR_EL2.HPMN which should reset to the value of PMCR_EL0.N; but we
388
- * don't impelment any PMU event counters, so using zero as a reset
389
+ * don't implement any PMU event counters, so using zero as a reset
390
* value for MDCR_EL2 is okay
391
*/
392
{ .name = "MDCR_EL2", .state = ARM_CP_STATE_BOTH,
393
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
394
* field as main ID register, and we implement only the cycle
395
* count register.
396
*/
397
+ unsigned int i, pmcrn = 0;
398
#ifndef CONFIG_USER_ONLY
399
ARMCPRegInfo pmcr = {
400
.name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
401
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
402
};
403
define_one_arm_cp_reg(cpu, &pmcr);
404
define_one_arm_cp_reg(cpu, &pmcr64);
405
+ for (i = 0; i < pmcrn; i++) {
406
+ char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
407
+ char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
408
+ char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
409
+ char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
410
+ ARMCPRegInfo pmev_regs[] = {
411
+ { .name = pmevcntr_name, .cp = 15, .crn = 15,
412
+ .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
413
+ .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
414
+ .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
415
+ .accessfn = pmreg_access },
416
+ { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
417
+ .opc0 = 3, .opc1 = 3, .crn = 15, .crm = 8 | (3 & (i >> 3)),
418
+ .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
419
+ .type = ARM_CP_IO,
420
+ .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
421
+ .raw_readfn = pmevcntr_rawread,
422
+ .raw_writefn = pmevcntr_rawwrite },
423
+ { .name = pmevtyper_name, .cp = 15, .crn = 15,
424
+ .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
425
+ .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
426
+ .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
427
+ .accessfn = pmreg_access },
428
+ { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
429
+ .opc0 = 3, .opc1 = 3, .crn = 15, .crm = 12 | (3 & (i >> 3)),
430
+ .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
431
+ .type = ARM_CP_IO,
432
+ .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
433
+ .raw_writefn = pmevtyper_rawwrite },
434
+ REGINFO_SENTINEL
435
+ };
436
+ define_arm_cp_regs(cpu, pmev_regs);
437
+ g_free(pmevcntr_name);
438
+ g_free(pmevcntr_el0_name);
439
+ g_free(pmevtyper_name);
440
+ g_free(pmevtyper_el0_name);
441
+ }
442
#endif
443
ARMCPRegInfo clidr = {
444
.name = "CLIDR", .state = ARM_CP_STATE_BOTH,
445
--
498
--
446
2.20.1
499
2.34.1
447
448
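For readers tracing the PMEVCNTR<n>/PMEVTYPER<n> registration loop above:
the counter index is split across the CRm and opc2 fields of the
system-register encoding, and the readfn/writefn pairs recover it as
((crm & 3) << 3) | (opc2 & 7). A minimal standalone sketch of that round
trip, in plain C; the names and the main() harness are illustrative only,
not the QEMU definitions:

#include <assert.h>

int main(void)
{
    /* Event counters 0..30; counter 31 is handled separately as the
     * cycle counter, so it never reaches these accessors. */
    for (unsigned i = 0; i < 31; i++) {
        unsigned crm_cntr = 8 | (3 & (i >> 3));   /* PMEVCNTR<n>: CRm = 0b10xx */
        unsigned crm_type = 12 | (3 & (i >> 3));  /* PMEVTYPER<n>: CRm = 0b11xx */
        unsigned opc2 = i & 7;

        /* The accessors recover the counter number from the encoding. */
        assert((((crm_cntr & 3) << 3) | (opc2 & 7)) == i);
        assert((((crm_type & 3) << 3) | (opc2 & 7)) == i);
    }
    return 0;
}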
From: Richard Henderson <richard.henderson@linaro.org>

Now properly signals unallocated for REV64 with SF=0.
Allows for the opcode2 field to be decoded shortly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_rev16(DisasContext *s, unsigned int sf,
  */
 static void disas_data_proc_1src(DisasContext *s, uint32_t insn)
 {
-    unsigned int sf, opcode, rn, rd;
+    unsigned int sf, opcode, opcode2, rn, rd;
 
-    if (extract32(insn, 29, 1) || extract32(insn, 16, 5)) {
+    if (extract32(insn, 29, 1)) {
         unallocated_encoding(s);
         return;
     }
 
     sf = extract32(insn, 31, 1);
     opcode = extract32(insn, 10, 6);
+    opcode2 = extract32(insn, 16, 5);
     rn = extract32(insn, 5, 5);
     rd = extract32(insn, 0, 5);
 
-    switch (opcode) {
-    case 0: /* RBIT */
+#define MAP(SF, O2, O1) ((SF) | (O1 << 1) | (O2 << 7))
+
+    switch (MAP(sf, opcode2, opcode)) {
+    case MAP(0, 0x00, 0x00): /* RBIT */
+    case MAP(1, 0x00, 0x00):
         handle_rbit(s, sf, rn, rd);
         break;
-    case 1: /* REV16 */
+    case MAP(0, 0x00, 0x01): /* REV16 */
+    case MAP(1, 0x00, 0x01):
         handle_rev16(s, sf, rn, rd);
         break;
-    case 2: /* REV32 */
+    case MAP(0, 0x00, 0x02): /* REV/REV32 */
+    case MAP(1, 0x00, 0x02):
         handle_rev32(s, sf, rn, rd);
         break;
-    case 3: /* REV64 */
+    case MAP(1, 0x00, 0x03): /* REV64 */
         handle_rev64(s, sf, rn, rd);
         break;
-    case 4: /* CLZ */
+    case MAP(0, 0x00, 0x04): /* CLZ */
+    case MAP(1, 0x00, 0x04):
         handle_clz(s, sf, rn, rd);
         break;
-    case 5: /* CLS */
+    case MAP(0, 0x00, 0x05): /* CLS */
+    case MAP(1, 0x00, 0x05):
         handle_cls(s, sf, rn, rd);
         break;
+    default:
+        unallocated_encoding(s);
+        break;
     }
+
+#undef MAP
 }
 
 static void handle_div(DisasContext *s, bool is_signed, unsigned int sf,
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert all forms (scalar, vector, scalar indexed, vector indexed),
which allows us to remove switch table entries elsewhere.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-a64.h | 8 ++
 target/arm/tcg/a64.decode | 45 +++++++
 target/arm/tcg/translate-a64.c | 221 +++++++++++++++++++++++++++------
 target/arm/tcg/vec_helper.c | 39 +++---
 4 files changed, 259 insertions(+), 54 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(cpye, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 #
 
 %rd 0:5
+%esz_sd 22:1 !function=plus_2
+%hl 11:1 21:1
+%hlm 11:1 20:2
 
 &r rn
 &ri rd imm
 &rri_sf rd rn imm sf
 &i imm
+&rrr_e rd rn rm esz
+&rrx_e rd rn rm idx esz
 &qrr_e q rd rn esz
 &qrrr_e q rd rn rm esz
+&qrrx_e q rd rn rm idx esz
 &qrrrr_e q rd rn rm ra esz
 
+@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
+@rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
+
+@rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm
+@rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl
+@rrx_d ........ .. . rm:5 .... idx:1 . rn:5 rd:5 &rrx_e esz=3
+
 @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
 @rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
 @rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
 @rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
 
+@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
+@qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd
+
+@qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
+ &qrrx_e esz=1 idx=%hlm
+@qrrx_s . q:1 .. .... .. . rm:5 .... . . rn:5 rd:5 \
+ &qrrx_e esz=2 idx=%hl
+@qrrx_d . q:1 .. .... .. . rm:5 .... idx:1 . rn:5 rd:5 \
+ &qrrx_e esz=3
+
 ### Data Processing - Immediate
 
 # PC-rel addressing
@@ -XXX,XX +XXX,XX @@ INS_general 0 1 00 1110 000 imm:5 0 0011 1 rn:5 rd:5
 SMOV 0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
 UMOV 0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
 INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
+
+### Advanced SIMD scalar three same
+
+FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
+FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
+
+### Advanced SIMD three same
+
+FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
+FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
+
+### Advanced SIMD scalar x indexed element
+
+FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
+FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
+FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
+
+### Advanced SIMD vector x indexed element
+
+FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
+FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
+FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
     return true;
 }
 
+/*
+ * Advanced SIMD three same
+ */
+
+typedef struct FPScalar {
+    void (*gen_h)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_s)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_d)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+} FPScalar;
+
+static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t1 = read_fp_dreg(s, a->rm);
+            f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t1 = read_fp_sreg(s, a->rm);
+            f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t1 = read_fp_hreg(s, a->rm);
+            f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        return false;
+    }
+    return true;
+}
+
+static const FPScalar f_scalar_fmulx = {
+    gen_helper_advsimd_mulxh,
+    gen_helper_vfp_mulxs,
+    gen_helper_vfp_mulxd,
+};
+TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx)
+
+static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
+                          gen_helper_gvec_3_ptr * const fns[3])
+{
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
+                          esz == MO_16, 0, fns[esz - 1]);
+    }
+    return true;
+}
+
+static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
+    gen_helper_gvec_fmulx_h,
+    gen_helper_gvec_fmulx_s,
+    gen_helper_gvec_fmulx_d,
+};
+TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx)
+
+/*
+ * Advanced SIMD scalar/vector x indexed element
+ */
+
+static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t1 = tcg_temp_new_i64();
+
+            read_vec_element(s, t1, a->rm, a->idx, MO_64);
+            f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t1, a->rm, a->idx, MO_32);
+            f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t1, a->rm, a->idx, MO_16);
+            f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)
+
+static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
+                              gen_helper_gvec_3_ptr * const fns[3])
+{
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
+                          esz == MO_16, a->idx, fns[esz - 1]);
+    }
+    return true;
+}
+
+static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
+    gen_helper_gvec_fmulx_idx_h,
+    gen_helper_gvec_fmulx_idx_s,
+    gen_helper_gvec_fmulx_idx_d,
+};
+TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx)
+
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         case 0x1a: /* FADD */
             gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x1b: /* FMULX */
-            gen_helper_vfp_mulxd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         default:
+        case 0x1b: /* FMULX */
             g_assert_not_reached();
         }
 
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         case 0x1a: /* FADD */
             gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x1b: /* FMULX */
-            gen_helper_vfp_mulxs(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         default:
+        case 0x1b: /* FMULX */
             g_assert_not_reached();
         }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         /* Floating point: U, size[1] and opcode indicate operation */
         int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
         switch (fpopcode) {
-        case 0x1b: /* FMULX */
         case 0x1f: /* FRECPS */
         case 0x3f: /* FRSQRTS */
        case 0x5d: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x7a: /* FABD */
             break;
         default:
+        case 0x1b: /* FMULX */
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     TCGv_i32 tcg_res;
 
     switch (fpopcode) {
-    case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x07: /* FRECPS */
     case 0x0f: /* FRSQRTS */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     case 0x1d: /* FACGT */
         break;
     default:
+    case 0x03: /* FMULX */
        unallocated_encoding(s);
        return;
    }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
    tcg_res = tcg_temp_new_i32();
 
    switch (fpopcode) {
-    case 0x03: /* FMULX */
-        gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
    case 0x04: /* FCMEQ (reg) */
        gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
        break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
        gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
        break;
    default:
+    case 0x03: /* FMULX */
        g_assert_not_reached();
    }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
        handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
                               rn, rm, rd);
        return;
-    case 0x1b: /* FMULX */
    case 0x1f: /* FRECPS */
    case 0x3f: /* FRSQRTS */
    case 0x5d: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
        return;
 
    default:
+    case 0x1b: /* FMULX */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
    case 0x0: /* FMAXNM */
    case 0x1: /* FMLA */
    case 0x2: /* FADD */
-    case 0x3: /* FMULX */
    case 0x4: /* FCMEQ */
    case 0x6: /* FMAX */
    case 0x7: /* FRECPS */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
        pairwise = true;
        break;
    default:
+    case 0x3: /* FMULX */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
        case 0x2: /* FADD */
            gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
            break;
-        case 0x3: /* FMULX */
-            gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
        case 0x4: /* FCMEQ */
            gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
            break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
            gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
            break;
        default:
+        case 0x3: /* FMULX */
            g_assert_not_reached();
        }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
    case 0x01: /* FMLA */
    case 0x05: /* FMLS */
    case 0x09: /* FMUL */
-    case 0x19: /* FMULX */
        is_fp = 1;
        break;
    case 0x1d: /* SQRDMLAH */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
        /* is_fp, but we pass tcg_env not fp_status. */
        break;
    default:
+    case 0x19: /* FMULX */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
        case 0x09: /* FMUL */
            gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
            break;
-        case 0x19: /* FMULX */
-            gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
-            break;
        default:
+        case 0x19: /* FMULX */
            g_assert_not_reached();
        }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
            g_assert_not_reached();
        }
        break;
-    case 0x19: /* FMULX */
-        switch (size) {
-        case 1:
-            if (is_scalar) {
-                gen_helper_advsimd_mulxh(tcg_res, tcg_op,
-                                         tcg_idx, fpst);
-            } else {
-                gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
-                                          tcg_idx, fpst);
-            }
-            break;
-        case 2:
-            gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
-            break;
-        default:
-            g_assert_not_reached();
-        }
-        break;
    case 0x0c: /* SQDMULH */
        if (size == 1) {
            gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
        }
        break;
    default:
+    case 0x19: /* FMULX */
        g_assert_not_reached();
    }
 
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
 DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)
 
 #ifdef TARGET_AARCH64
+DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16)
+DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32)
+DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64)
 
 DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
 DO_3OP(gvec_recps_s, helper_recpsf_f32, float32)
@@ -XXX,XX +XXX,XX @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, H8)
 
 #undef DO_MLA_IDX
 
-#define DO_FMUL_IDX(NAME, ADD, TYPE, H) \
+#define DO_FMUL_IDX(NAME, ADD, MUL, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 { \
     intptr_t i, j, oprsz = simd_oprsz(desc); \
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
     for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
         TYPE mm = m[H(i + idx)]; \
         for (j = 0; j < segment; j++) { \
-            d[i + j] = TYPE##_##ADD(d[i + j], \
-                                    TYPE##_mul(n[i + j], mm, stat), stat); \
+            d[i + j] = ADD(d[i + j], MUL(n[i + j], mm, stat), stat); \
         } \
     } \
     clear_tail(d, oprsz, simd_maxsz(desc)); \
 }
 
-#define float16_nop(N, M, S) (M)
-#define float32_nop(N, M, S) (M)
-#define float64_nop(N, M, S) (M)
+#define nop(N, M, S) (M)
 
-DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16, H2)
-DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32, H4)
-DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64, H8)
+DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16_mul, float16, H2)
+DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32_mul, float32, H4)
+DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64_mul, float64, H8)
+
+#ifdef TARGET_AARCH64
+
+DO_FMUL_IDX(gvec_fmulx_idx_h, nop, helper_advsimd_mulxh, float16, H2)
+DO_FMUL_IDX(gvec_fmulx_idx_s, nop, helper_vfp_mulxs, float32, H4)
+DO_FMUL_IDX(gvec_fmulx_idx_d, nop, helper_vfp_mulxd, float64, H8)
+
+#endif
+
+#undef nop
 
 /*
  * Non-fused multiply-accumulate operations, for Neon. NB that unlike
  * the fused ops below they assume accumulate both from and into Vd.
  */
-DO_FMUL_IDX(gvec_fmla_nf_idx_h, add, float16, H2)
-DO_FMUL_IDX(gvec_fmla_nf_idx_s, add, float32, H4)
-DO_FMUL_IDX(gvec_fmls_nf_idx_h, sub, float16, H2)
-DO_FMUL_IDX(gvec_fmls_nf_idx_s, sub, float32, H4)
+DO_FMUL_IDX(gvec_fmla_nf_idx_h, float16_add, float16_mul, float16, H2)
+DO_FMUL_IDX(gvec_fmla_nf_idx_s, float32_add, float32_mul, float32, H4)
+DO_FMUL_IDX(gvec_fmls_nf_idx_h, float16_sub, float16_mul, float16, H2)
+DO_FMUL_IDX(gvec_fmls_nf_idx_s, float32_sub, float32_mul, float32, H4)
 
-#undef float16_nop
-#undef float32_nop
-#undef float64_nop
 #undef DO_FMUL_IDX
 
 #define DO_FMLA_IDX(NAME, TYPE, H) \
--
2.34.1
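To see the combined-key decode from the disas_data_proc_1src patch above in
isolation: folding sf, opcode2 and opcode into one integer lets a single
switch both dispatch the valid encodings and reject unallocated combinations
such as REV64 with SF=0. A compilable sketch of the same idea, with
placeholder print statements standing in for the real QEMU handlers:

#include <stdio.h>

/* Same field packing as the patch: SF | (O1 << 1) | (O2 << 7). */
#define MAP(SF, O2, O1) ((SF) | ((O1) << 1) | ((O2) << 7))

static void decode_1src(unsigned sf, unsigned opcode2, unsigned opcode)
{
    switch (MAP(sf, opcode2, opcode)) {
    case MAP(0, 0x00, 0x00): /* RBIT, 32-bit */
    case MAP(1, 0x00, 0x00): /* RBIT, 64-bit */
        printf("RBIT sf=%u\n", sf);
        break;
    case MAP(1, 0x00, 0x03): /* REV64 is only allocated with SF=1 */
        printf("REV64\n");
        break;
    default: /* REV64 with SF=0 lands here, among others */
        printf("unallocated\n");
        break;
    }
}

int main(void)
{
    decode_1src(1, 0, 3); /* REV64 */
    decode_1src(0, 0, 3); /* unallocated */
    return 0;
}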
From: Richard Henderson <richard.henderson@linaro.org>

Split out functions to extract the virtual address parameters.
Let the functions choose T0 or T1 address space half, if present.
Extract (most of) the control bits that vary between EL or Tx.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-19-richard.henderson@linaro.org
[PMM: fixed minor checkpatch comment nits]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h | 14 +++
 target/arm/helper.c | 278 ++++++++++++++++++++++-------------
 2 files changed, 164 insertions(+), 128 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
 ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env);
 #endif
 
+/*
+ * Parameters of a given virtual address, as extracted from the
+ * translation control register (TCR) for a given regime.
+ */
+typedef struct ARMVAParameters {
+    unsigned tsz    : 8;
+    unsigned select : 1;
+    bool tbi        : 1;
+    bool epd        : 1;
+    bool hpd        : 1;
+    bool using16k   : 1;
+    bool using64k   : 1;
+} ARMVAParameters;
+
 #endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint8_t convert_stage2_attrs(CPUARMState *env, uint8_t s2attrs)
     return (hiattr << 6) | (hihint << 4) | (loattr << 2) | lohint;
 }
 
+static ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+                                          ARMMMUIdx mmu_idx, bool data)
+{
+    uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
+    uint32_t el = regime_el(env, mmu_idx);
+    bool tbi, epd, hpd, using16k, using64k;
+    int select, tsz;
+
+    /*
+     * Bit 55 is always between the two regions, and is canonical for
+     * determining if address tagging is enabled.
+     */
+    select = extract64(va, 55, 1);
+
+    if (el > 1) {
+        tsz = extract32(tcr, 0, 6);
+        using64k = extract32(tcr, 14, 1);
+        using16k = extract32(tcr, 15, 1);
+        if (mmu_idx == ARMMMUIdx_S2NS) {
+            /* VTCR_EL2 */
+            tbi = hpd = false;
+        } else {
+            tbi = extract32(tcr, 20, 1);
+            hpd = extract32(tcr, 24, 1);
+        }
+        epd = false;
+    } else if (!select) {
+        tsz = extract32(tcr, 0, 6);
+        epd = extract32(tcr, 7, 1);
+        using64k = extract32(tcr, 14, 1);
+        using16k = extract32(tcr, 15, 1);
+        tbi = extract64(tcr, 37, 1);
+        hpd = extract64(tcr, 41, 1);
+    } else {
+        int tg = extract32(tcr, 30, 2);
+        using16k = tg == 1;
+        using64k = tg == 3;
+        tsz = extract32(tcr, 16, 6);
+        epd = extract32(tcr, 23, 1);
+        tbi = extract64(tcr, 38, 1);
+        hpd = extract64(tcr, 42, 1);
+    }
+    tsz = MIN(tsz, 39);  /* TODO: ARMv8.4-TTST */
+    tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA */
+
+    return (ARMVAParameters) {
+        .tsz = tsz,
+        .select = select,
+        .tbi = tbi,
+        .epd = epd,
+        .hpd = hpd,
+        .using16k = using16k,
+        .using64k = using64k,
+    };
+}
+
+static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
+                                          ARMMMUIdx mmu_idx)
+{
+    uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
+    uint32_t el = regime_el(env, mmu_idx);
+    int select, tsz;
+    bool epd, hpd;
+
+    if (mmu_idx == ARMMMUIdx_S2NS) {
+        /* VTCR */
+        bool sext = extract32(tcr, 4, 1);
+        bool sign = extract32(tcr, 3, 1);
+
+        /*
+         * If the sign-extend bit is not the same as t0sz[3], the result
+         * is unpredictable. Flag this as a guest error.
+         */
+        if (sign != sext) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "AArch32: VTCR.S / VTCR.T0SZ[3] mismatch\n");
+        }
+        tsz = sextract32(tcr, 0, 4) + 8;
+        select = 0;
+        hpd = false;
+        epd = false;
+    } else if (el == 2) {
+        /* HTCR */
+        tsz = extract32(tcr, 0, 3);
+        select = 0;
+        hpd = extract64(tcr, 24, 1);
+        epd = false;
+    } else {
+        int t0sz = extract32(tcr, 0, 3);
+        int t1sz = extract32(tcr, 16, 3);
+
+        if (t1sz == 0) {
+            select = va > (0xffffffffu >> t0sz);
+        } else {
+            /* Note that we will detect errors later. */
+            select = va >= ~(0xffffffffu >> t1sz);
+        }
+        if (!select) {
+            tsz = t0sz;
+            epd = extract32(tcr, 7, 1);
+            hpd = extract64(tcr, 41, 1);
+        } else {
+            tsz = t1sz;
+            epd = extract32(tcr, 23, 1);
+            hpd = extract64(tcr, 42, 1);
+        }
+        /* For aarch32, hpd0 is not enabled without t2e as well. */
+        hpd &= extract32(tcr, 6, 1);
+    }
+
+    return (ARMVAParameters) {
+        .tsz = tsz,
+        .select = select,
+        .epd = epd,
+        .hpd = hpd,
+    };
+}
+
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     /* Read an LPAE long-descriptor translation table. */
     ARMFaultType fault_type = ARMFault_Translation;
     uint32_t level;
-    uint32_t epd = 0;
-    int32_t t0sz, t1sz;
-    uint32_t tg;
+    ARMVAParameters param;
     uint64_t ttbr;
-    int ttbr_select;
     hwaddr descaddr, indexmask, indexmask_grainsize;
     uint32_t tableattrs;
-    target_ulong page_size;
+    target_ulong page_size, top_bits;
     uint32_t attrs;
-    int32_t stride = 9;
-    int32_t addrsize;
-    int inputsize;
-    int32_t tbi = 0;
+    int32_t stride;
+    int addrsize, inputsize;
     TCR *tcr = regime_tcr(env, mmu_idx);
     int ap, ns, xn, pxn;
     uint32_t el = regime_el(env, mmu_idx);
-    bool ttbr1_valid = true;
+    bool ttbr1_valid;
     uint64_t descaddrmask;
     bool aarch64 = arm_el_is_aa64(env, el);
-    bool hpd = false;
 
     /* TODO:
      * This code does not handle the different format TCR for VTCR_EL2.
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
      * support for those page table walks.
      */
     if (aarch64) {
+        param = aa64_va_parameters(env, address, mmu_idx,
+                                   access_type != MMU_INST_FETCH);
         level = 0;
-        addrsize = 64;
-        if (el > 1) {
-            if (mmu_idx != ARMMMUIdx_S2NS) {
-                tbi = extract64(tcr->raw_tcr, 20, 1);
-            }
-        } else {
-            if (extract64(address, 55, 1)) {
-                tbi = extract64(tcr->raw_tcr, 38, 1);
-            } else {
-                tbi = extract64(tcr->raw_tcr, 37, 1);
-            }
-        }
-        tbi *= 8;
-
         /* If we are in 64-bit EL2 or EL3 then there is no TTBR1, so mark it
          * invalid.
          */
-        if (el > 1) {
-            ttbr1_valid = false;
-        }
+        ttbr1_valid = (el < 2);
+        addrsize = 64 - 8 * param.tbi;
+        inputsize = 64 - param.tsz;
     } else {
+        param = aa32_va_parameters(env, address, mmu_idx);
         level = 1;
-        addrsize = 32;
         /* There is no TTBR1 for EL2 */
-        if (el == 2) {
-            ttbr1_valid = false;
-        }
+        ttbr1_valid = (el != 2);
+        addrsize = (mmu_idx == ARMMMUIdx_S2NS ? 40 : 32);
+        inputsize = addrsize - param.tsz;
     }
 
-    /* Determine whether this address is in the region controlled by
-     * TTBR0 or TTBR1 (or if it is in neither region and should fault).
-     * This is a Non-secure PL0/1 stage 1 translation, so controlled by
-     * TTBCR/TTBR0/TTBR1 in accordance with ARM ARM DDI0406C table B-32:
+    /*
+     * We determined the region when collecting the parameters, but we
+     * have not yet validated that the address is valid for the region.
+     * Extract the top bits and verify that they all match select.
      */
-    if (aarch64) {
-        /* AArch64 translation. */
-        t0sz = extract32(tcr->raw_tcr, 0, 6);
-        t0sz = MIN(t0sz, 39);
-        t0sz = MAX(t0sz, 16);
-    } else if (mmu_idx != ARMMMUIdx_S2NS) {
-        /* AArch32 stage 1 translation. */
-        t0sz = extract32(tcr->raw_tcr, 0, 3);
-    } else {
-        /* AArch32 stage 2 translation. */
-        bool sext = extract32(tcr->raw_tcr, 4, 1);
-        bool sign = extract32(tcr->raw_tcr, 3, 1);
-        /* Address size is 40-bit for a stage 2 translation,
-         * and t0sz can be negative (from -8 to 7),
-         * so we need to adjust it to use the TTBR selecting logic below.
-         */
-        addrsize = 40;
-        t0sz = sextract32(tcr->raw_tcr, 0, 4) + 8;
-
-        /* If the sign-extend bit is not the same as t0sz[3], the result
-         * is unpredictable. Flag this as a guest error. */
-        if (sign != sext) {
-            qemu_log_mask(LOG_GUEST_ERROR,
-                          "AArch32: VTCR.S / VTCR.T0SZ[3] mismatch\n");
-        }
-    }
-    t1sz = extract32(tcr->raw_tcr, 16, 6);
-    if (aarch64) {
-        t1sz = MIN(t1sz, 39);
-        t1sz = MAX(t1sz, 16);
-    }
-    if (t0sz && !extract64(address, addrsize - t0sz, t0sz - tbi)) {
-        /* there is a ttbr0 region and we are in it (high bits all zero) */
-        ttbr_select = 0;
-    } else if (ttbr1_valid && t1sz &&
-               !extract64(~address, addrsize - t1sz, t1sz - tbi)) {
-        /* there is a ttbr1 region and we are in it (high bits all one) */
-        ttbr_select = 1;
-    } else if (!t0sz) {
-        /* ttbr0 region is "everything not in the ttbr1 region" */
-        ttbr_select = 0;
-    } else if (!t1sz && ttbr1_valid) {
-        /* ttbr1 region is "everything not in the ttbr0 region" */
-        ttbr_select = 1;
-    } else {
-        /* in the gap between the two regions, this is a Translation fault */
+    top_bits = sextract64(address, inputsize, addrsize - inputsize);
+    if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
+        /* In the gap between the two regions, this is a Translation fault */
         fault_type = ARMFault_Translation;
         goto do_fault;
     }
 
+    if (param.using64k) {
+        stride = 13;
+    } else if (param.using16k) {
+        stride = 11;
+    } else {
+        stride = 9;
+    }
+
     /* Note that QEMU ignores shareability and cacheability attributes,
      * so we don't need to do anything with the SH, ORGN, IRGN fields
      * in the TTBCR. Similarly, TTBCR:A1 selects whether we get the
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
      * implement any ASID-like capability so we can ignore it (instead
      * we will always flush the TLB any time the ASID is changed).
      */
-    if (ttbr_select == 0) {
-        ttbr = regime_ttbr(env, mmu_idx, 0);
-        if (el < 2) {
-            epd = extract32(tcr->raw_tcr, 7, 1);
-        }
-        inputsize = addrsize - t0sz;
-
-        tg = extract32(tcr->raw_tcr, 14, 2);
-        if (tg == 1) { /* 64KB pages */
-            stride = 13;
-        }
-        if (tg == 2) { /* 16KB pages */
-            stride = 11;
-        }
-        if (aarch64 && el > 1) {
-            hpd = extract64(tcr->raw_tcr, 24, 1);
-        } else {
-            hpd = extract64(tcr->raw_tcr, 41, 1);
-        }
-        if (!aarch64) {
-            /* For aarch32, hpd0 is not enabled without t2e as well. */
-            hpd &= extract64(tcr->raw_tcr, 6, 1);
-        }
-    } else {
-        /* We should only be here if TTBR1 is valid */
-        assert(ttbr1_valid);
-
-        ttbr = regime_ttbr(env, mmu_idx, 1);
-        epd = extract32(tcr->raw_tcr, 23, 1);
-        inputsize = addrsize - t1sz;
-
-        tg = extract32(tcr->raw_tcr, 30, 2);
-        if (tg == 3)  { /* 64KB pages */
-            stride = 13;
-        }
-        if (tg == 1) { /* 16KB pages */
-            stride = 11;
-        }
-        hpd = extract64(tcr->raw_tcr, 42, 1);
-        if (!aarch64) {
-            /* For aarch32, hpd1 is not enabled without t2e as well. */
-            hpd &= extract64(tcr->raw_tcr, 6, 1);
-        }
-    }
+    ttbr = regime_ttbr(env, mmu_idx, param.select);
 
     /* Here we should have set up all the parameters for the translation:
      * inputsize, ttbr, epd, stride, tbi
      */
 
-    if (epd) {
+    if (param.epd) {
         /* Translation table walk disabled => Translation fault on TLB miss
          * Note: This is always 0 on 64-bit EL2 and EL3.
          */
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     }
     /* Merge in attributes from table descriptors */
     attrs |= nstable << 3; /* NS */
-        if (hpd) {
+        if (param.hpd) {
             /* HPD disables all the table attributes except NSTable. */
             break;
         }

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-a64.h | 4 +
 target/arm/tcg/translate.h | 5 +
 target/arm/tcg/a64.decode | 27 +++++
 target/arm/tcg/translate-a64.c | 205 +++++++++++++++++----------------
 target/arm/tcg/vec_helper.c | 4 +
 5 files changed, 143 insertions(+), 102 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
 
+DEF_HELPER_FLAGS_5(gvec_fdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ static inline int shl_12(DisasContext *s, int x)
     return x << 12;
 }
 
+static inline int xor_2(DisasContext *s, int x)
+{
+    return x ^ 2;
+}
+
 static inline int neon_3same_fp_size(DisasContext *s, int x)
 {
     /* Convert 0==fp32, 1==fp16 into a MO_* value */
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 
 %rd 0:5
 %esz_sd 22:1 !function=plus_2
+%esz_hsd 22:2 !function=xor_2
 %hl 11:1 21:1
 %hlm 11:1 20:2
 
@@ -XXX,XX +XXX,XX @@
 
 @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
+@rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd
 
 @rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm
 @rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl
@@ -XXX,XX +XXX,XX @@ INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
 
 ### Advanced SIMD scalar three same
 
+FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd
+FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
+FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
+FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd
+
 FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
 FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
 
 ### Advanced SIMD three same
 
+FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
+FADD_v 0.00 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
+
+FSUB_v 0.00 1110 110 ..... 00010 1 ..... ..... @qrrr_h
+FSUB_v 0.00 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
+
+FDIV_v 0.10 1110 010 ..... 00111 1 ..... ..... @qrrr_h
+FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
+
+FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h
+FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
+
 FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
 FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
 
 ### Advanced SIMD scalar x indexed element
 
+FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
+FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
+FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
+
 FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
 FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
 FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
 
 ### Advanced SIMD vector x indexed element
 
+FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
+FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
+FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
+
 FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
 FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
 FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
     return true;
 }
 
+static const FPScalar f_scalar_fadd = {
+    gen_helper_vfp_addh,
+    gen_helper_vfp_adds,
+    gen_helper_vfp_addd,
+};
+TRANS(FADD_s, do_fp3_scalar, a, &f_scalar_fadd)
+
+static const FPScalar f_scalar_fsub = {
+    gen_helper_vfp_subh,
+    gen_helper_vfp_subs,
+    gen_helper_vfp_subd,
+};
+TRANS(FSUB_s, do_fp3_scalar, a, &f_scalar_fsub)
+
+static const FPScalar f_scalar_fdiv = {
+    gen_helper_vfp_divh,
+    gen_helper_vfp_divs,
+    gen_helper_vfp_divd,
+};
+TRANS(FDIV_s, do_fp3_scalar, a, &f_scalar_fdiv)
+
+static const FPScalar f_scalar_fmul = {
+    gen_helper_vfp_mulh,
+    gen_helper_vfp_muls,
+    gen_helper_vfp_muld,
+};
+TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul)
+
 static const FPScalar f_scalar_fmulx = {
     gen_helper_advsimd_mulxh,
     gen_helper_vfp_mulxs,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
     return true;
 }
 
+static gen_helper_gvec_3_ptr * const f_vector_fadd[3] = {
+    gen_helper_gvec_fadd_h,
+    gen_helper_gvec_fadd_s,
+    gen_helper_gvec_fadd_d,
+};
+TRANS(FADD_v, do_fp3_vector, a, f_vector_fadd)
+
+static gen_helper_gvec_3_ptr * const f_vector_fsub[3] = {
+    gen_helper_gvec_fsub_h,
+    gen_helper_gvec_fsub_s,
+    gen_helper_gvec_fsub_d,
+};
+TRANS(FSUB_v, do_fp3_vector, a, f_vector_fsub)
+
+static gen_helper_gvec_3_ptr * const f_vector_fdiv[3] = {
+    gen_helper_gvec_fdiv_h,
+    gen_helper_gvec_fdiv_s,
+    gen_helper_gvec_fdiv_d,
+};
+TRANS(FDIV_v, do_fp3_vector, a, f_vector_fdiv)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = {
+    gen_helper_gvec_fmul_h,
+    gen_helper_gvec_fmul_s,
+    gen_helper_gvec_fmul_d,
+};
+TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul)
+
 static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
     gen_helper_gvec_fmulx_h,
     gen_helper_gvec_fmulx_s,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
     return true;
 }
 
+TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul)
 TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)
 
 static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
     return true;
 }
 
+static gen_helper_gvec_3_ptr * const f_vector_idx_fmul[3] = {
+    gen_helper_gvec_fmul_idx_h,
+    gen_helper_gvec_fmul_idx_s,
+    gen_helper_gvec_fmul_idx_d,
+};
+TRANS(FMUL_vi, do_fp3_vector_idx, a, f_vector_idx_fmul)
+
 static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
     gen_helper_gvec_fmulx_idx_h,
     gen_helper_gvec_fmulx_idx_s,
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
     tcg_op2 = read_fp_sreg(s, rm);
 
     switch (opcode) {
-    case 0x0: /* FMUL */
-        gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x1: /* FDIV */
-        gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x2: /* FADD */
-        gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x3: /* FSUB */
-        gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x4: /* FMAX */
         gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
         gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
         gen_helper_vfp_negs(tcg_res, tcg_res);
         break;
+    default:
+    case 0x0: /* FMUL */
+    case 0x1: /* FDIV */
+    case 0x2: /* FADD */
+    case 0x3: /* FSUB */
+        g_assert_not_reached();
     }
 
     write_fp_sreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
     tcg_op2 = read_fp_dreg(s, rm);
 
     switch (opcode) {
-    case 0x0: /* FMUL */
-        gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x1: /* FDIV */
-        gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x2: /* FADD */
-        gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x3: /* FSUB */
-        gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x4: /* FMAX */
         gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
         gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
         gen_helper_vfp_negd(tcg_res, tcg_res);
         break;
+    default:
+    case 0x0: /* FMUL */
+    case 0x1: /* FDIV */
+    case 0x2: /* FADD */
+    case 0x3: /* FSUB */
+        g_assert_not_reached();
     }
 
     write_fp_dreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
     tcg_op2 = read_fp_hreg(s, rm);
 
     switch (opcode) {
-    case 0x0: /* FMUL */
-        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x1: /* FDIV */
-        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x2: /* FADD */
-        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x3: /* FSUB */
-        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x4: /* FMAX */
         gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
         tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
         break;
     default:
+    case 0x0: /* FMUL */
+    case 0x1: /* FDIV */
+    case 0x2: /* FADD */
+    case 0x3: /* FSUB */
         g_assert_not_reached();
     }
 
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         case 0x18: /* FMAXNM */
             gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x1a: /* FADD */
-            gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         case 0x38: /* FMINNM */
             gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x3a: /* FSUB */
-            gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x3e: /* FMIN */
             gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         case 0x3f: /* FRSQRTS */
             gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x5b: /* FMUL */
-            gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x5c: /* FCMGE */
             gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         case 0x5d: /* FACGE */
             gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x5f: /* FDIV */
-            gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x7a: /* FABD */
             gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
             gen_helper_vfp_absd(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         default:
+        case 0x1a: /* FADD */
         case 0x1b: /* FMULX */
+        case 0x3a: /* FSUB */
+        case 0x5b: /* FMUL */
+        case 0x5f: /* FDIV */
             g_assert_not_reached();
         }
 
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2,
                                    tcg_res, fpst);
             break;
-        case 0x1a: /* FADD */
-            gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         case 0x38: /* FMINNM */
             gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x3a: /* FSUB */
-            gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x3e: /* FMIN */
             gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         case 0x3f: /* FRSQRTS */
             gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x5b: /* FMUL */
-            gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x5c: /* FCMGE */
             gen_helper_neon_cge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         case 0x5d: /* FACGE */
             gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x5f: /* FDIV */
-            gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0x7a: /* FABD */
             gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
             gen_helper_vfp_abss(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         default:
+        case 0x1a: /* FADD */
         case 0x1b: /* FMULX */
+        case 0x3a: /* FSUB */
+        case 0x5b: /* FMUL */
+        case 0x5f: /* FDIV */
             g_assert_not_reached();
         }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x19: /* FMLA */
     case 0x39: /* FMLS */
     case 0x18: /* FMAXNM */
-    case 0x1a: /* FADD */
     case 0x1c: /* FCMEQ */
     case 0x1e: /* FMAX */
     case 0x38: /* FMINNM */
-    case 0x3a: /* FSUB */
     case 0x3e: /* FMIN */
-    case 0x5b: /* FMUL */
     case 0x5c: /* FCMGE */
-    case 0x5f: /* FDIV */
     case 0x7a: /* FABD */
     case 0x7c: /* FCMGT */
         if (!fp_access_check(s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         return;
 
     default:
+    case 0x1a: /* FADD */
     case 0x1b: /* FMULX */
+    case 0x3a: /* FSUB */
+    case 0x5b: /* FMUL */
+    case 0x5f: /* FDIV */
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     switch (fpopcode) {
     case 0x0: /* FMAXNM */
     case 0x1: /* FMLA */
-    case 0x2: /* FADD */
     case 0x4: /* FCMEQ */
     case 0x6: /* FMAX */
     case 0x7: /* FRECPS */
     case 0x8: /* FMINNM */
     case 0x9: /* FMLS */
-    case 0xa: /* FSUB */
     case 0xe: /* FMIN */
     case 0xf: /* FRSQRTS */
-    case 0x13: /* FMUL */
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
-    case 0x17: /* FDIV */
     case 0x1a: /* FABD */
     case 0x1c: /* FCMGT */
     case 0x1d: /* FACGT */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         pairwise = true;
         break;
     default:
+    case 0x2: /* FADD */
     case 0x3: /* FMULX */
+    case 0xa: /* FSUB */
+    case 0x13: /* FMUL */
+    case 0x17: /* FDIV */
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
                                    fpst);
         break;
-    case 0x2: /* FADD */
-        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x4: /* FCMEQ */
         gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
                                    fpst);
         break;
-    case 0xa: /* FSUB */
-        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0xe: /* FMIN */
         gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
     case 0xf: /* FRSQRTS */
         gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
-    case 0x13: /* FMUL */
-        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x14: /* FCMGE */
         gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
     case 0x15: /* FACGE */
         gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
-    case 0x17: /* FDIV */
-        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x1a: /* FABD */
         gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
         tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
     default:
+    case 0x2: /* FADD */
     case 0x3: /* FMULX */
+    case 0xa: /* FSUB */
+    case 0x13: /* FMUL */
+    case 0x17: /* FDIV */
         g_assert_not_reached();
     }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         break;
     case 0x01: /* FMLA */
     case 0x05: /* FMLS */
-    case 0x09: /* FMUL */
         is_fp = 1;
         break;
     case 0x1d: /* SQRDMLAH */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
return;
463
}
464
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
465
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
466
fpst);
467
break;
468
- case 0x2: /* FADD */
469
- gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
470
- break;
471
case 0x4: /* FCMEQ */
472
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
473
break;
474
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
475
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
476
fpst);
477
break;
478
- case 0xa: /* FSUB */
479
- gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
480
- break;
481
case 0xe: /* FMIN */
482
gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
483
break;
484
case 0xf: /* FRSQRTS */
485
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
486
break;
487
- case 0x13: /* FMUL */
488
- gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
489
- break;
490
case 0x14: /* FCMGE */
491
gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
492
break;
493
case 0x15: /* FACGE */
494
gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
495
break;
496
- case 0x17: /* FDIV */
497
- gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
498
- break;
499
case 0x1a: /* FABD */
500
gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
501
tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
502
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
503
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
504
break;
505
default:
506
+ case 0x2: /* FADD */
507
case 0x3: /* FMULX */
508
+ case 0xa: /* FSUB */
509
+ case 0x13: /* FMUL */
510
+ case 0x17: /* FDIV */
511
g_assert_not_reached();
512
}
513
514
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
515
break;
516
case 0x01: /* FMLA */
517
case 0x05: /* FMLS */
518
- case 0x09: /* FMUL */
519
is_fp = 1;
520
break;
521
case 0x1d: /* SQRDMLAH */
522
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
523
/* is_fp, but we pass tcg_env not fp_status. */
524
break;
525
default:
526
+ case 0x09: /* FMUL */
527
case 0x19: /* FMULX */
528
unallocated_encoding(s);
529
return;
530
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
531
read_vec_element(s, tcg_res, rd, pass, MO_64);
532
gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
533
break;
534
- case 0x09: /* FMUL */
535
- gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
536
- break;
537
default:
538
+ case 0x09: /* FMUL */
539
case 0x19: /* FMULX */
540
g_assert_not_reached();
541
}
542
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
543
g_assert_not_reached();
544
}
545
break;
546
- case 0x09: /* FMUL */
547
- switch (size) {
548
- case 1:
549
- if (is_scalar) {
550
- gen_helper_advsimd_mulh(tcg_res, tcg_op,
551
- tcg_idx, fpst);
552
- } else {
553
- gen_helper_advsimd_mul2h(tcg_res, tcg_op,
554
- tcg_idx, fpst);
555
- }
556
- break;
557
- case 2:
558
- gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
559
- break;
560
- default:
561
- g_assert_not_reached();
562
- }
563
- break;
564
case 0x0c: /* SQDMULH */
565
if (size == 1) {
566
gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
567
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
568
}
569
break;
570
default:
571
+ case 0x09: /* FMUL */
572
case 0x19: /* FMULX */
573
g_assert_not_reached();
574
}
575
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
576
index XXXXXXX..XXXXXXX 100644
577
--- a/target/arm/tcg/vec_helper.c
578
+++ b/target/arm/tcg/vec_helper.c
579
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
580
DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)
581
582
#ifdef TARGET_AARCH64
583
+DO_3OP(gvec_fdiv_h, float16_div, float16)
584
+DO_3OP(gvec_fdiv_s, float32_div, float32)
585
+DO_3OP(gvec_fdiv_d, float64_div, float64)
586
+
587
DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16)
588
DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32)
589
DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64)
385
--
590
--
386
2.20.1
591
2.34.1
387
388
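As background for the vec_helper.c hunk above: DO_3OP is the macro QEMU
uses to stamp out lane-wise gvec helpers. Roughly, and as a sketch only
(the in-tree macro is equivalent in spirit), DO_3OP(gvec_fdiv_s,
float32_div, float32) expands to:

    void HELPER(gvec_fdiv_s)(void *vd, void *vn, void *vm,
                             void *stat, uint32_t desc)
    {
        intptr_t i, oprsz = simd_oprsz(desc);
        float32 *d = vd, *n = vn, *m = vm;

        /* One softfloat division per active lane. */
        for (i = 0; i < oprsz / sizeof(float32); i++) {
            d[i] = float32_div(n[i], m[i], stat);
        }
        /* Zero any bytes beyond the operation size, up to max vector size. */
        clear_tail(d, oprsz, simd_maxsz(desc));
    }

so adding the _h/_s/_d FDIV helpers is a one-line-per-type affair.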
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20190108223129.5570-30-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-21-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/cpu64.c | 4 ++++
8
target/arm/helper.h | 4 +
9
1 file changed, 4 insertions(+)
9
target/arm/tcg/a64.decode | 17 ++++
10
target/arm/tcg/translate-a64.c | 168 +++++++++++++++++----------------
11
target/arm/tcg/vec_helper.c | 4 +
12
4 files changed, 113 insertions(+), 80 deletions(-)
10
13
11
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
12
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/cpu64.c
16
--- a/target/arm/helper.h
14
+++ b/target/arm/cpu64.c
17
+++ b/target/arm/helper.h
15
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
16
19
17
t = cpu->isar.id_aa64isar1;
20
DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
21
DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
+ t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
22
+DEF_HELPER_FLAGS_5(gvec_fmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
20
+ t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
23
21
+ t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 1);
24
DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+ t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
25
DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
cpu->isar.id_aa64isar1 = t;
26
+DEF_HELPER_FLAGS_5(gvec_fmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
24
27
25
t = cpu->isar.id_aa64pfr0;
28
DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
32
DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(gvec_fminnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
36
DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
39
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/tcg/a64.decode
41
+++ b/target/arm/tcg/a64.decode
42
@@ -XXX,XX +XXX,XX @@ FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
43
FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
44
FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd
45
46
+FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd
47
+FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd
48
+FMAXNM_s 0001 1110 ..1 ..... 0110 10 ..... ..... @rrr_hsd
49
+FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd
50
+
51
FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
52
FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
53
54
@@ -XXX,XX +XXX,XX @@ FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
55
FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h
56
FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
57
58
+FMAX_v 0.00 1110 010 ..... 00110 1 ..... ..... @qrrr_h
59
+FMAX_v 0.00 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd
60
+
61
+FMIN_v 0.00 1110 110 ..... 00110 1 ..... ..... @qrrr_h
62
+FMIN_v 0.00 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd
63
+
64
+FMAXNM_v 0.00 1110 010 ..... 00000 1 ..... ..... @qrrr_h
65
+FMAXNM_v 0.00 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
66
+
67
+FMINNM_v 0.00 1110 110 ..... 00000 1 ..... ..... @qrrr_h
68
+FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
69
+
70
FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
71
FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
72
73
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/target/arm/tcg/translate-a64.c
76
+++ b/target/arm/tcg/translate-a64.c
77
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fmul = {
78
};
79
TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul)
80
81
+static const FPScalar f_scalar_fmax = {
82
+ gen_helper_advsimd_maxh,
83
+ gen_helper_vfp_maxs,
84
+ gen_helper_vfp_maxd,
85
+};
86
+TRANS(FMAX_s, do_fp3_scalar, a, &f_scalar_fmax)
87
+
88
+static const FPScalar f_scalar_fmin = {
89
+ gen_helper_advsimd_minh,
90
+ gen_helper_vfp_mins,
91
+ gen_helper_vfp_mind,
92
+};
93
+TRANS(FMIN_s, do_fp3_scalar, a, &f_scalar_fmin)
94
+
95
+static const FPScalar f_scalar_fmaxnm = {
96
+ gen_helper_advsimd_maxnumh,
97
+ gen_helper_vfp_maxnums,
98
+ gen_helper_vfp_maxnumd,
99
+};
100
+TRANS(FMAXNM_s, do_fp3_scalar, a, &f_scalar_fmaxnm)
101
+
102
+static const FPScalar f_scalar_fminnm = {
103
+ gen_helper_advsimd_minnumh,
104
+ gen_helper_vfp_minnums,
105
+ gen_helper_vfp_minnumd,
106
+};
107
+TRANS(FMINNM_s, do_fp3_scalar, a, &f_scalar_fminnm)
108
+
109
static const FPScalar f_scalar_fmulx = {
110
gen_helper_advsimd_mulxh,
111
gen_helper_vfp_mulxs,
112
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = {
113
};
114
TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul)
115
116
+static gen_helper_gvec_3_ptr * const f_vector_fmax[3] = {
117
+ gen_helper_gvec_fmax_h,
118
+ gen_helper_gvec_fmax_s,
119
+ gen_helper_gvec_fmax_d,
120
+};
121
+TRANS(FMAX_v, do_fp3_vector, a, f_vector_fmax)
122
+
123
+static gen_helper_gvec_3_ptr * const f_vector_fmin[3] = {
124
+ gen_helper_gvec_fmin_h,
125
+ gen_helper_gvec_fmin_s,
126
+ gen_helper_gvec_fmin_d,
127
+};
128
+TRANS(FMIN_v, do_fp3_vector, a, f_vector_fmin)
129
+
130
+static gen_helper_gvec_3_ptr * const f_vector_fmaxnm[3] = {
131
+ gen_helper_gvec_fmaxnum_h,
132
+ gen_helper_gvec_fmaxnum_s,
133
+ gen_helper_gvec_fmaxnum_d,
134
+};
135
+TRANS(FMAXNM_v, do_fp3_vector, a, f_vector_fmaxnm)
136
+
137
+static gen_helper_gvec_3_ptr * const f_vector_fminnm[3] = {
138
+ gen_helper_gvec_fminnum_h,
139
+ gen_helper_gvec_fminnum_s,
140
+ gen_helper_gvec_fminnum_d,
141
+};
142
+TRANS(FMINNM_v, do_fp3_vector, a, f_vector_fminnm)
143
+
144
static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
145
gen_helper_gvec_fmulx_h,
146
gen_helper_gvec_fmulx_s,
147
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
148
tcg_op2 = read_fp_sreg(s, rm);
149
150
switch (opcode) {
151
- case 0x4: /* FMAX */
152
- gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
153
- break;
154
- case 0x5: /* FMIN */
155
- gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
156
- break;
157
- case 0x6: /* FMAXNM */
158
- gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
159
- break;
160
- case 0x7: /* FMINNM */
161
- gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
162
- break;
163
case 0x8: /* FNMUL */
164
gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
165
gen_helper_vfp_negs(tcg_res, tcg_res);
166
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
167
case 0x1: /* FDIV */
168
case 0x2: /* FADD */
169
case 0x3: /* FSUB */
170
+ case 0x4: /* FMAX */
171
+ case 0x5: /* FMIN */
172
+ case 0x6: /* FMAXNM */
173
+ case 0x7: /* FMINNM */
174
g_assert_not_reached();
175
}
176
177
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
178
tcg_op2 = read_fp_dreg(s, rm);
179
180
switch (opcode) {
181
- case 0x4: /* FMAX */
182
- gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
183
- break;
184
- case 0x5: /* FMIN */
185
- gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
186
- break;
187
- case 0x6: /* FMAXNM */
188
- gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
189
- break;
190
- case 0x7: /* FMINNM */
191
- gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
192
- break;
193
case 0x8: /* FNMUL */
194
gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
195
gen_helper_vfp_negd(tcg_res, tcg_res);
196
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
197
case 0x1: /* FDIV */
198
case 0x2: /* FADD */
199
case 0x3: /* FSUB */
200
+ case 0x4: /* FMAX */
201
+ case 0x5: /* FMIN */
202
+ case 0x6: /* FMAXNM */
203
+ case 0x7: /* FMINNM */
204
g_assert_not_reached();
205
}
206
207
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
208
tcg_op2 = read_fp_hreg(s, rm);
209
210
switch (opcode) {
211
- case 0x4: /* FMAX */
212
- gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
213
- break;
214
- case 0x5: /* FMIN */
215
- gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
216
- break;
217
- case 0x6: /* FMAXNM */
218
- gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
219
- break;
220
- case 0x7: /* FMINNM */
221
- gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
222
- break;
223
case 0x8: /* FNMUL */
224
gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
225
tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
226
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
227
case 0x1: /* FDIV */
228
case 0x2: /* FADD */
229
case 0x3: /* FSUB */
230
+ case 0x4: /* FMAX */
231
+ case 0x5: /* FMIN */
232
+ case 0x6: /* FMAXNM */
233
+ case 0x7: /* FMINNM */
234
g_assert_not_reached();
235
}
236
237
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
238
gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2,
239
tcg_res, fpst);
240
break;
241
- case 0x18: /* FMAXNM */
242
- gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
243
- break;
244
case 0x1c: /* FCMEQ */
245
gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
246
break;
247
- case 0x1e: /* FMAX */
248
- gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
249
- break;
250
case 0x1f: /* FRECPS */
251
gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
252
break;
253
- case 0x38: /* FMINNM */
254
- gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
255
- break;
256
- case 0x3e: /* FMIN */
257
- gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
258
- break;
259
case 0x3f: /* FRSQRTS */
260
gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
261
break;
262
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
263
gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
264
break;
265
default:
266
+ case 0x18: /* FMAXNM */
267
case 0x1a: /* FADD */
268
case 0x1b: /* FMULX */
269
+ case 0x1e: /* FMAX */
270
+ case 0x38: /* FMINNM */
271
case 0x3a: /* FSUB */
272
+ case 0x3e: /* FMIN */
273
case 0x5b: /* FMUL */
274
case 0x5f: /* FDIV */
275
g_assert_not_reached();
276
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
277
case 0x1c: /* FCMEQ */
278
gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
279
break;
280
- case 0x1e: /* FMAX */
281
- gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
282
- break;
283
case 0x1f: /* FRECPS */
284
gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
285
break;
286
- case 0x18: /* FMAXNM */
287
- gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
288
- break;
289
- case 0x38: /* FMINNM */
290
- gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
291
- break;
292
- case 0x3e: /* FMIN */
293
- gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
294
- break;
295
case 0x3f: /* FRSQRTS */
296
gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
297
break;
298
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
299
gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
300
break;
301
default:
302
+ case 0x18: /* FMAXNM */
303
case 0x1a: /* FADD */
304
case 0x1b: /* FMULX */
305
+ case 0x1e: /* FMAX */
306
+ case 0x38: /* FMINNM */
307
case 0x3a: /* FSUB */
308
+ case 0x3e: /* FMIN */
309
case 0x5b: /* FMUL */
310
case 0x5f: /* FDIV */
311
g_assert_not_reached();
312
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
313
case 0x7d: /* FACGT */
314
case 0x19: /* FMLA */
315
case 0x39: /* FMLS */
316
- case 0x18: /* FMAXNM */
317
case 0x1c: /* FCMEQ */
318
- case 0x1e: /* FMAX */
319
- case 0x38: /* FMINNM */
320
- case 0x3e: /* FMIN */
321
case 0x5c: /* FCMGE */
322
case 0x7a: /* FABD */
323
case 0x7c: /* FCMGT */
324
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
325
return;
326
327
default:
328
+ case 0x18: /* FMAXNM */
329
case 0x1a: /* FADD */
330
case 0x1b: /* FMULX */
331
+ case 0x1e: /* FMAX */
332
+ case 0x38: /* FMINNM */
333
case 0x3a: /* FSUB */
334
+ case 0x3e: /* FMIN */
335
case 0x5b: /* FMUL */
336
case 0x5f: /* FDIV */
337
unallocated_encoding(s);
338
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
339
int pass;
340
341
switch (fpopcode) {
342
- case 0x0: /* FMAXNM */
343
case 0x1: /* FMLA */
344
case 0x4: /* FCMEQ */
345
- case 0x6: /* FMAX */
346
case 0x7: /* FRECPS */
347
- case 0x8: /* FMINNM */
348
case 0x9: /* FMLS */
349
- case 0xe: /* FMIN */
350
case 0xf: /* FRSQRTS */
351
case 0x14: /* FCMGE */
352
case 0x15: /* FACGE */
353
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
354
pairwise = true;
355
break;
356
default:
357
+ case 0x0: /* FMAXNM */
358
case 0x2: /* FADD */
359
case 0x3: /* FMULX */
360
+ case 0x6: /* FMAX */
361
+ case 0x8: /* FMINNM */
362
case 0xa: /* FSUB */
363
+ case 0xe: /* FMIN */
364
case 0x13: /* FMUL */
365
case 0x17: /* FDIV */
366
unallocated_encoding(s);
367
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
368
read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);
369
370
switch (fpopcode) {
371
- case 0x0: /* FMAXNM */
372
- gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
373
- break;
374
case 0x1: /* FMLA */
375
read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
376
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
377
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
378
case 0x4: /* FCMEQ */
379
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
380
break;
381
- case 0x6: /* FMAX */
382
- gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
383
- break;
384
case 0x7: /* FRECPS */
385
gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
386
break;
387
- case 0x8: /* FMINNM */
388
- gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
389
- break;
390
case 0x9: /* FMLS */
391
/* As usual for ARM, separate negation for fused multiply-add */
392
tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
393
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
394
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
395
fpst);
396
break;
397
- case 0xe: /* FMIN */
398
- gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
399
- break;
400
case 0xf: /* FRSQRTS */
401
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
402
break;
403
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
404
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
405
break;
406
default:
407
+ case 0x0: /* FMAXNM */
408
case 0x2: /* FADD */
409
case 0x3: /* FMULX */
410
+ case 0x6: /* FMAX */
411
+ case 0x8: /* FMINNM */
412
case 0xa: /* FSUB */
413
+ case 0xe: /* FMIN */
414
case 0x13: /* FMUL */
415
case 0x17: /* FDIV */
416
g_assert_not_reached();
417
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
418
index XXXXXXX..XXXXXXX 100644
419
--- a/target/arm/tcg/vec_helper.c
420
+++ b/target/arm/tcg/vec_helper.c
421
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_facgt_s, float32_acgt, float32)
422
423
DO_3OP(gvec_fmax_h, float16_max, float16)
424
DO_3OP(gvec_fmax_s, float32_max, float32)
425
+DO_3OP(gvec_fmax_d, float64_max, float64)
426
427
DO_3OP(gvec_fmin_h, float16_min, float16)
428
DO_3OP(gvec_fmin_s, float32_min, float32)
429
+DO_3OP(gvec_fmin_d, float64_min, float64)
430
431
DO_3OP(gvec_fmaxnum_h, float16_maxnum, float16)
432
DO_3OP(gvec_fmaxnum_s, float32_maxnum, float32)
433
+DO_3OP(gvec_fmaxnum_d, float64_maxnum, float64)
434
435
DO_3OP(gvec_fminnum_h, float16_minnum, float16)
436
DO_3OP(gvec_fminnum_s, float32_minnum, float32)
437
+DO_3OP(gvec_fminnum_d, float64_minnum, float64)
438
439
DO_3OP(gvec_recps_nf_h, float16_recps_nf, float16)
440
DO_3OP(gvec_recps_nf_s, float32_recps_nf, float32)
26
--
441
--
27
2.20.1
442
2.34.1
28
29
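A quick gloss on the a64.decode patterns added above, as a reading aid
(my annotation, not part of the patch): each pattern spells out the
32-bit encoding msb-first, with fixed bits written as 0/1 and '.' bits
captured by the @format. Taking one of the new lines:

    FMAX_v 0.00 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd

the leading '.' is Q (bit 30, choosing a 64- or 128-bit vector), the
'.' inside '0.1' is the size bit selecting single versus double
precision, and the three '.....' groups are Rm, Rn and Rd. The
@qrrr_sd format packs those fields into the argument structure that
the generated decoder hands to trans_FMAX_v().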
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Add 4 attributes that control the EL1 enable bits, as we may not
3
Load and zero-extend float16 into a TCGv_i32 before
4
always want to turn on pointer authentication with -cpu max.
4
all scalar operations.
5
However, by default they are enabled.
6
5
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20190108223129.5570-31-richard.henderson@linaro.org
8
Message-id: 20240524232121.284515-22-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/cpu.c | 3 +++
11
target/arm/tcg/translate-vfp.c | 39 +++++++++++++++++++---------------
13
target/arm/cpu64.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++
12
1 file changed, 22 insertions(+), 17 deletions(-)
14
2 files changed, 63 insertions(+)
15
13
16
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
14
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
17
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/cpu.c
16
--- a/target/arm/tcg/translate-vfp.c
19
+++ b/target/arm/cpu.c
17
+++ b/target/arm/tcg/translate-vfp.c
20
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
18
@@ -XXX,XX +XXX,XX @@ static inline void vfp_store_reg32(TCGv_i32 var, int reg)
21
env->pstate = PSTATE_MODE_EL0t;
19
tcg_gen_st_i32(var, tcg_env, vfp_reg_offset(false, reg));
22
/* Userspace expects access to DC ZVA, CTL_EL0 and the cache ops */
23
env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
24
+ /* Enable all PAC instructions */
25
+ env->cp15.hcr_el2 |= HCR_API;
26
+ env->cp15.scr_el3 |= SCR_API;
27
/* and to the FP/Neon instructions */
28
env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
29
/* and to the SVE instructions */
30
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/cpu64.c
33
+++ b/target/arm/cpu64.c
34
@@ -XXX,XX +XXX,XX @@ static void cpu_max_set_sve_vq(Object *obj, Visitor *v, const char *name,
35
error_propagate(errp, err);
36
}
20
}
37
21
38
+#ifdef CONFIG_USER_ONLY
22
+static inline void vfp_load_reg16(TCGv_i32 var, int reg)
39
+static void cpu_max_get_packey(Object *obj, Visitor *v, const char *name,
40
+ void *opaque, Error **errp)
41
+{
23
+{
42
+ ARMCPU *cpu = ARM_CPU(obj);
24
+ tcg_gen_ld16u_i32(var, tcg_env,
43
+ const uint64_t *bit = opaque;
25
+ vfp_reg_offset(false, reg) + HOST_BIG_ENDIAN * 2);
44
+ bool enabled = (cpu->env.cp15.sctlr_el[1] & *bit) != 0;
45
+
46
+ visit_type_bool(v, name, &enabled, errp);
47
+}
26
+}
48
+
27
+
49
+static void cpu_max_set_packey(Object *obj, Visitor *v, const char *name,
28
/*
50
+ void *opaque, Error **errp)
29
* The imm8 encodes the sign bit, enough bits to represent an exponent in
51
+{
30
* the range 01....1xx to 10....0xx, and the most significant 4 bits of
52
+ ARMCPU *cpu = ARM_CPU(obj);
31
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
53
+ Error *err = NULL;
32
if (a->l) {
54
+ const uint64_t *bit = opaque;
33
/* VFP to general purpose register */
55
+ bool enabled;
34
tmp = tcg_temp_new_i32();
56
+
35
- vfp_load_reg32(tmp, a->vn);
57
+ visit_type_bool(v, name, &enabled, errp);
36
- tcg_gen_andi_i32(tmp, tmp, 0xffff);
58
+
37
+ vfp_load_reg16(tmp, a->vn);
59
+ if (!err) {
38
store_reg(s, a->rt, tmp);
60
+ if (enabled) {
39
} else {
61
+ cpu->env.cp15.sctlr_el[1] |= *bit;
40
/* general purpose register to VFP */
62
+ } else {
41
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn,
63
+ cpu->env.cp15.sctlr_el[1] &= ~*bit;
42
fd = tcg_temp_new_i32();
64
+ }
43
fpst = fpstatus_ptr(FPST_FPCR_F16);
65
+ }
44
66
+ error_propagate(errp, err);
45
- vfp_load_reg32(f0, vn);
67
+}
46
- vfp_load_reg32(f1, vm);
68
+#endif
47
+ vfp_load_reg16(f0, vn);
69
+
48
+ vfp_load_reg16(f1, vm);
70
/* -cpu max: if KVM is enabled, like -cpu host (best possible with this host);
49
71
* otherwise, a CPU with as many features enabled as our emulation supports.
50
if (reads_vd) {
72
* The version of '-cpu max' for qemu-system-arm is defined in cpu.c;
51
- vfp_load_reg32(fd, vd);
73
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
52
+ vfp_load_reg16(fd, vd);
74
*/
53
}
75
cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
54
fn(fd, f0, f1, fpst);
76
cpu->dcz_blocksize = 7; /* 512 bytes */
55
vfp_store_reg32(fd, vd);
77
+
56
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
78
+ /*
57
}
79
+ * Note that Linux will enable all of the keys at once.
58
80
+ * But doing it this way will allow experimentation beyond that.
59
f0 = tcg_temp_new_i32();
81
+ */
60
- vfp_load_reg32(f0, vm);
82
+ {
61
+ vfp_load_reg16(f0, vm);
83
+ static const uint64_t apia_bit = SCTLR_EnIA;
62
fn(f0, f0);
84
+ static const uint64_t apib_bit = SCTLR_EnIB;
63
vfp_store_reg32(f0, vd);
85
+ static const uint64_t apda_bit = SCTLR_EnDA;
64
86
+ static const uint64_t apdb_bit = SCTLR_EnDB;
65
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
87
+
66
vm = tcg_temp_new_i32();
88
+ object_property_add(obj, "apia", "bool", cpu_max_get_packey,
67
vd = tcg_temp_new_i32();
89
+ cpu_max_set_packey, NULL,
68
90
+ (void *)&apia_bit, &error_fatal);
69
- vfp_load_reg32(vn, a->vn);
91
+ object_property_add(obj, "apib", "bool", cpu_max_get_packey,
70
- vfp_load_reg32(vm, a->vm);
92
+ cpu_max_set_packey, NULL,
71
+ vfp_load_reg16(vn, a->vn);
93
+ (void *)&apib_bit, &error_fatal);
72
+ vfp_load_reg16(vm, a->vm);
94
+ object_property_add(obj, "apda", "bool", cpu_max_get_packey,
73
if (neg_n) {
95
+ cpu_max_set_packey, NULL,
74
/* VFNMS, VFMS */
96
+ (void *)&apda_bit, &error_fatal);
75
gen_helper_vfp_negh(vn, vn);
97
+ object_property_add(obj, "apdb", "bool", cpu_max_get_packey,
76
}
98
+ cpu_max_set_packey, NULL,
77
- vfp_load_reg32(vd, a->vd);
99
+ (void *)&apdb_bit, &error_fatal);
78
+ vfp_load_reg16(vd, a->vd);
100
+
79
if (neg_d) {
101
+ /* Enable all PAC keys by default. */
80
/* VFNMA, VFNMS */
102
+ cpu->env.cp15.sctlr_el[1] |= SCTLR_EnIA | SCTLR_EnIB;
81
gen_helper_vfp_negh(vd, vd);
103
+ cpu->env.cp15.sctlr_el[1] |= SCTLR_EnDA | SCTLR_EnDB;
82
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a)
104
+ }
83
vd = tcg_temp_new_i32();
105
#endif
84
vm = tcg_temp_new_i32();
106
85
107
cpu->sve_max_vq = ARM_MAX_VQ;
86
- vfp_load_reg32(vd, a->vd);
87
+ vfp_load_reg16(vd, a->vd);
88
if (a->z) {
89
tcg_gen_movi_i32(vm, 0);
90
} else {
91
- vfp_load_reg32(vm, a->vm);
92
+ vfp_load_reg16(vm, a->vm);
93
}
94
95
if (a->e) {
96
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a)
97
}
98
99
tmp = tcg_temp_new_i32();
100
- vfp_load_reg32(tmp, a->vm);
101
+ vfp_load_reg16(tmp, a->vm);
102
fpst = fpstatus_ptr(FPST_FPCR_F16);
103
gen_helper_rinth(tmp, tmp, fpst);
104
vfp_store_reg32(tmp, a->vd);
105
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a)
106
}
107
108
tmp = tcg_temp_new_i32();
109
- vfp_load_reg32(tmp, a->vm);
110
+ vfp_load_reg16(tmp, a->vm);
111
fpst = fpstatus_ptr(FPST_FPCR_F16);
112
tcg_rmode = gen_set_rmode(FPROUNDING_ZERO, fpst);
113
gen_helper_rinth(tmp, tmp, fpst);
114
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a)
115
}
116
117
tmp = tcg_temp_new_i32();
118
- vfp_load_reg32(tmp, a->vm);
119
+ vfp_load_reg16(tmp, a->vm);
120
fpst = fpstatus_ptr(FPST_FPCR_F16);
121
gen_helper_rinth_exact(tmp, tmp, fpst);
122
vfp_store_reg32(tmp, a->vd);
123
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a)
124
125
fpst = fpstatus_ptr(FPST_FPCR_F16);
126
vm = tcg_temp_new_i32();
127
- vfp_load_reg32(vm, a->vm);
128
+ vfp_load_reg16(vm, a->vm);
129
130
if (a->s) {
131
if (a->rz) {
132
@@ -XXX,XX +XXX,XX @@ static bool trans_VINS(DisasContext *s, arg_VINS *a)
133
/* Insert low half of Vm into high half of Vd */
134
rm = tcg_temp_new_i32();
135
rd = tcg_temp_new_i32();
136
- vfp_load_reg32(rm, a->vm);
137
- vfp_load_reg32(rd, a->vd);
138
+ vfp_load_reg16(rm, a->vm);
139
+ vfp_load_reg16(rd, a->vd);
140
tcg_gen_deposit_i32(rd, rd, rm, 16, 16);
141
vfp_store_reg32(rd, a->vd);
142
return true;
108
--
143
--
109
2.20.1
144
2.34.1
110
111
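One subtlety in the vfp_load_reg16() helper added above deserves a
note: each single-precision slot of the register file is a 32-bit word
in host byte order, so the low 16 bits live at byte offset 0 on a
little-endian host but at offset 2 on a big-endian one, hence the
HOST_BIG_ENDIAN * 2 term. A standalone illustration of the same idea
(my sketch, not QEMU code):

    #include <stdint.h>
    #include <string.h>

    /* Fetch the low 16 bits of a 32-bit word stored in host byte order. */
    static uint16_t low16_of_word(const uint32_t *slot)
    {
        const uint8_t *p = (const uint8_t *)slot;
    #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        p += 2;    /* big-endian: the low half is the second byte pair */
    #endif
        uint16_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }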
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20190108223129.5570-14-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-23-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/translate-a64.c | 82 +++++++++++++++++++++++++++++++++++++-
8
target/arm/helper.h | 6 ----
9
1 file changed, 81 insertions(+), 1 deletion(-)
9
target/arm/tcg/translate.h | 30 +++++++++++++++++++
10
target/arm/tcg/translate-a64.c | 44 +++++++++++++--------------
11
target/arm/tcg/translate-vfp.c | 54 +++++++++++++++++-----------------
12
target/arm/vfp_helper.c | 30 -------------------
13
5 files changed, 79 insertions(+), 85 deletions(-)
10
14
11
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
15
diff --git a/target/arm/helper.h b/target/arm/helper.h
12
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-a64.c
17
--- a/target/arm/helper.h
14
+++ b/target/arm/translate-a64.c
18
+++ b/target/arm/helper.h
15
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr)
20
DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr)
21
DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr)
22
DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
23
-DEF_HELPER_1(vfp_negh, f16, f16)
24
-DEF_HELPER_1(vfp_negs, f32, f32)
25
-DEF_HELPER_1(vfp_negd, f64, f64)
26
-DEF_HELPER_1(vfp_absh, f16, f16)
27
-DEF_HELPER_1(vfp_abss, f32, f32)
28
-DEF_HELPER_1(vfp_absd, f64, f64)
29
DEF_HELPER_2(vfp_sqrth, f16, f16, env)
30
DEF_HELPER_2(vfp_sqrts, f32, f32, env)
31
DEF_HELPER_2(vfp_sqrtd, f64, f64, env)
32
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/tcg/translate.h
35
+++ b/target/arm/tcg/translate.h
36
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
37
*/
38
uint64_t vfp_expand_imm(int size, uint8_t imm8);
39
40
+static inline void gen_vfp_absh(TCGv_i32 d, TCGv_i32 s)
41
+{
42
+ tcg_gen_andi_i32(d, s, INT16_MAX);
43
+}
44
+
45
+static inline void gen_vfp_abss(TCGv_i32 d, TCGv_i32 s)
46
+{
47
+ tcg_gen_andi_i32(d, s, INT32_MAX);
48
+}
49
+
50
+static inline void gen_vfp_absd(TCGv_i64 d, TCGv_i64 s)
51
+{
52
+ tcg_gen_andi_i64(d, s, INT64_MAX);
53
+}
54
+
55
+static inline void gen_vfp_negh(TCGv_i32 d, TCGv_i32 s)
56
+{
57
+ tcg_gen_xori_i32(d, s, 1u << 15);
58
+}
59
+
60
+static inline void gen_vfp_negs(TCGv_i32 d, TCGv_i32 s)
61
+{
62
+ tcg_gen_xori_i32(d, s, 1u << 31);
63
+}
64
+
65
+static inline void gen_vfp_negd(TCGv_i64 d, TCGv_i64 s)
66
+{
67
+ tcg_gen_xori_i64(d, s, 1ull << 63);
68
+}
69
+
70
/* Vector operations shared between ARM and AArch64. */
71
void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
72
uint32_t opr_sz, uint32_t max_sz);
73
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/target/arm/tcg/translate-a64.c
76
+++ b/target/arm/tcg/translate-a64.c
77
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
78
tcg_gen_mov_i32(tcg_res, tcg_op);
79
break;
80
case 0x1: /* FABS */
81
- tcg_gen_andi_i32(tcg_res, tcg_op, 0x7fff);
82
+ gen_vfp_absh(tcg_res, tcg_op);
83
break;
84
case 0x2: /* FNEG */
85
- tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
86
+ gen_vfp_negh(tcg_res, tcg_op);
87
break;
88
case 0x3: /* FSQRT */
89
fpst = fpstatus_ptr(FPST_FPCR_F16);
90
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_single(DisasContext *s, int opcode, int rd, int rn)
91
tcg_gen_mov_i32(tcg_res, tcg_op);
92
goto done;
93
case 0x1: /* FABS */
94
- gen_helper_vfp_abss(tcg_res, tcg_op);
95
+ gen_vfp_abss(tcg_res, tcg_op);
96
goto done;
97
case 0x2: /* FNEG */
98
- gen_helper_vfp_negs(tcg_res, tcg_op);
99
+ gen_vfp_negs(tcg_res, tcg_op);
100
goto done;
101
case 0x3: /* FSQRT */
102
gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env);
103
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_double(DisasContext *s, int opcode, int rd, int rn)
104
105
switch (opcode) {
106
case 0x1: /* FABS */
107
- gen_helper_vfp_absd(tcg_res, tcg_op);
108
+ gen_vfp_absd(tcg_res, tcg_op);
109
goto done;
110
case 0x2: /* FNEG */
111
- gen_helper_vfp_negd(tcg_res, tcg_op);
112
+ gen_vfp_negd(tcg_res, tcg_op);
113
goto done;
114
case 0x3: /* FSQRT */
115
gen_helper_vfp_sqrtd(tcg_res, tcg_op, tcg_env);
116
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
117
switch (opcode) {
118
case 0x8: /* FNMUL */
119
gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
120
- gen_helper_vfp_negs(tcg_res, tcg_res);
121
+ gen_vfp_negs(tcg_res, tcg_res);
122
break;
123
default:
124
case 0x0: /* FMUL */
125
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
126
switch (opcode) {
127
case 0x8: /* FNMUL */
128
gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
129
- gen_helper_vfp_negd(tcg_res, tcg_res);
130
+ gen_vfp_negd(tcg_res, tcg_res);
131
break;
132
default:
133
case 0x0: /* FMUL */
134
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
135
switch (opcode) {
136
case 0x8: /* FNMUL */
137
gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
138
- tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
139
+ gen_vfp_negh(tcg_res, tcg_res);
140
break;
141
default:
142
case 0x0: /* FMUL */
143
@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1,
144
* flipped if it is a negated-input.
145
*/
146
if (o1 == true) {
147
- gen_helper_vfp_negs(tcg_op3, tcg_op3);
148
+ gen_vfp_negs(tcg_op3, tcg_op3);
149
}
150
151
if (o0 != o1) {
152
- gen_helper_vfp_negs(tcg_op1, tcg_op1);
153
+ gen_vfp_negs(tcg_op1, tcg_op1);
154
}
155
156
gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
157
@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
158
* flipped if it is a negated-input.
159
*/
160
if (o1 == true) {
161
- gen_helper_vfp_negd(tcg_op3, tcg_op3);
162
+ gen_vfp_negd(tcg_op3, tcg_op3);
163
}
164
165
if (o0 != o1) {
166
- gen_helper_vfp_negd(tcg_op1, tcg_op1);
167
+ gen_vfp_negd(tcg_op1, tcg_op1);
168
}
169
170
gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
171
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
172
switch (fpopcode) {
173
case 0x39: /* FMLS */
174
/* As usual for ARM, separate negation for fused multiply-add */
175
- gen_helper_vfp_negd(tcg_op1, tcg_op1);
176
+ gen_vfp_negd(tcg_op1, tcg_op1);
177
/* fall through */
178
case 0x19: /* FMLA */
179
read_vec_element(s, tcg_res, rd, pass, MO_64);
180
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
181
break;
182
case 0x7a: /* FABD */
183
gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
184
- gen_helper_vfp_absd(tcg_res, tcg_res);
185
+ gen_vfp_absd(tcg_res, tcg_res);
186
break;
187
case 0x7c: /* FCMGT */
188
gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
189
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
190
switch (fpopcode) {
191
case 0x39: /* FMLS */
192
/* As usual for ARM, separate negation for fused multiply-add */
193
- gen_helper_vfp_negs(tcg_op1, tcg_op1);
194
+ gen_vfp_negs(tcg_op1, tcg_op1);
195
/* fall through */
196
case 0x19: /* FMLA */
197
read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
198
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
199
break;
200
case 0x7a: /* FABD */
201
gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
202
- gen_helper_vfp_abss(tcg_res, tcg_res);
203
+ gen_vfp_abss(tcg_res, tcg_res);
204
break;
205
case 0x7c: /* FCMGT */
206
gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
207
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u,
208
}
209
break;
210
case 0x2f: /* FABS */
211
- gen_helper_vfp_absd(tcg_rd, tcg_rn);
212
+ gen_vfp_absd(tcg_rd, tcg_rn);
213
break;
214
case 0x6f: /* FNEG */
215
- gen_helper_vfp_negd(tcg_rd, tcg_rn);
216
+ gen_vfp_negd(tcg_rd, tcg_rn);
217
break;
218
case 0x7f: /* FSQRT */
219
gen_helper_vfp_sqrtd(tcg_rd, tcg_rn, tcg_env);
220
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
221
}
222
break;
223
case 0x2f: /* FABS */
224
- gen_helper_vfp_abss(tcg_res, tcg_op);
225
+ gen_vfp_abss(tcg_res, tcg_op);
226
break;
227
case 0x6f: /* FNEG */
228
- gen_helper_vfp_negs(tcg_res, tcg_op);
229
+ gen_vfp_negs(tcg_res, tcg_op);
230
break;
231
case 0x7f: /* FSQRT */
232
gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env);
233
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
234
switch (16 * u + opcode) {
235
case 0x05: /* FMLS */
236
/* As usual for ARM, separate negation for fused multiply-add */
237
- gen_helper_vfp_negd(tcg_op, tcg_op);
238
+ gen_vfp_negd(tcg_op, tcg_op);
239
/* fall through */
240
case 0x01: /* FMLA */
241
read_vec_element(s, tcg_res, rd, pass, MO_64);
242
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
243
index XXXXXXX..XXXXXXX 100644
244
--- a/target/arm/tcg/translate-vfp.c
245
+++ b/target/arm/tcg/translate-vfp.c
246
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
247
TCGv_i32 tmp = tcg_temp_new_i32();
248
249
gen_helper_vfp_mulh(tmp, vn, vm, fpst);
250
- gen_helper_vfp_negh(tmp, tmp);
251
+ gen_vfp_negh(tmp, tmp);
252
gen_helper_vfp_addh(vd, vd, tmp, fpst);
253
}
254
255
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
256
TCGv_i32 tmp = tcg_temp_new_i32();
257
258
gen_helper_vfp_muls(tmp, vn, vm, fpst);
259
- gen_helper_vfp_negs(tmp, tmp);
260
+ gen_vfp_negs(tmp, tmp);
261
gen_helper_vfp_adds(vd, vd, tmp, fpst);
262
}
263
264
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
265
TCGv_i64 tmp = tcg_temp_new_i64();
266
267
gen_helper_vfp_muld(tmp, vn, vm, fpst);
268
- gen_helper_vfp_negd(tmp, tmp);
269
+ gen_vfp_negd(tmp, tmp);
270
gen_helper_vfp_addd(vd, vd, tmp, fpst);
271
}
272
273
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
274
TCGv_i32 tmp = tcg_temp_new_i32();
275
276
gen_helper_vfp_mulh(tmp, vn, vm, fpst);
277
- gen_helper_vfp_negh(vd, vd);
278
+ gen_vfp_negh(vd, vd);
279
gen_helper_vfp_addh(vd, vd, tmp, fpst);
280
}
281
282
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
283
TCGv_i32 tmp = tcg_temp_new_i32();
284
285
gen_helper_vfp_muls(tmp, vn, vm, fpst);
286
- gen_helper_vfp_negs(vd, vd);
287
+ gen_vfp_negs(vd, vd);
288
gen_helper_vfp_adds(vd, vd, tmp, fpst);
289
}
290
291
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
292
TCGv_i64 tmp = tcg_temp_new_i64();
293
294
gen_helper_vfp_muld(tmp, vn, vm, fpst);
295
- gen_helper_vfp_negd(vd, vd);
296
+ gen_vfp_negd(vd, vd);
297
gen_helper_vfp_addd(vd, vd, tmp, fpst);
298
}
299
300
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
301
TCGv_i32 tmp = tcg_temp_new_i32();
302
303
gen_helper_vfp_mulh(tmp, vn, vm, fpst);
304
- gen_helper_vfp_negh(tmp, tmp);
305
- gen_helper_vfp_negh(vd, vd);
306
+ gen_vfp_negh(tmp, tmp);
307
+ gen_vfp_negh(vd, vd);
308
gen_helper_vfp_addh(vd, vd, tmp, fpst);
309
}
310
311
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
312
TCGv_i32 tmp = tcg_temp_new_i32();
313
314
gen_helper_vfp_muls(tmp, vn, vm, fpst);
315
- gen_helper_vfp_negs(tmp, tmp);
316
- gen_helper_vfp_negs(vd, vd);
317
+ gen_vfp_negs(tmp, tmp);
318
+ gen_vfp_negs(vd, vd);
319
gen_helper_vfp_adds(vd, vd, tmp, fpst);
320
}
321
322
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
323
TCGv_i64 tmp = tcg_temp_new_i64();
324
325
gen_helper_vfp_muld(tmp, vn, vm, fpst);
326
- gen_helper_vfp_negd(tmp, tmp);
327
- gen_helper_vfp_negd(vd, vd);
328
+ gen_vfp_negd(tmp, tmp);
329
+ gen_vfp_negd(vd, vd);
330
gen_helper_vfp_addd(vd, vd, tmp, fpst);
331
}
332
333
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
16
{
334
{
17
unsigned int opc, op2, op3, rn, op4;
335
/* VNMUL: -(fn * fm) */
18
TCGv_i64 dst;
336
gen_helper_vfp_mulh(vd, vn, vm, fpst);
19
+ TCGv_i64 modifier;
337
- gen_helper_vfp_negh(vd, vd);
20
338
+ gen_vfp_negh(vd, vd);
21
opc = extract32(insn, 21, 4);
339
}
22
op2 = extract32(insn, 16, 5);
340
23
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
341
static bool trans_VNMUL_hp(DisasContext *s, arg_VNMUL_sp *a)
24
case 2: /* RET */
342
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
25
switch (op3) {
343
{
26
case 0:
344
/* VNMUL: -(fn * fm) */
27
+ /* BR, BLR, RET */
345
gen_helper_vfp_muls(vd, vn, vm, fpst);
28
if (op4 != 0) {
346
- gen_helper_vfp_negs(vd, vd);
29
goto do_unallocated;
347
+ gen_vfp_negs(vd, vd);
30
}
348
}
31
dst = cpu_reg(s, rn);
349
32
break;
350
static bool trans_VNMUL_sp(DisasContext *s, arg_VNMUL_sp *a)
33
351
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
34
+ case 2:
352
{
35
+ case 3:
353
/* VNMUL: -(fn * fm) */
36
+ if (!dc_isar_feature(aa64_pauth, s)) {
354
gen_helper_vfp_muld(vd, vn, vm, fpst);
37
+ goto do_unallocated;
355
- gen_helper_vfp_negd(vd, vd);
38
+ }
356
+ gen_vfp_negd(vd, vd);
39
+ if (opc == 2) {
357
}
40
+ /* RETAA, RETAB */
358
41
+ if (rn != 0x1f || op4 != 0x1f) {
359
static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_dp *a)
42
+ goto do_unallocated;
360
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
43
+ }
361
vfp_load_reg16(vm, a->vm);
44
+ rn = 30;
362
if (neg_n) {
45
+ modifier = cpu_X[31];
363
/* VFNMS, VFMS */
46
+ } else {
364
- gen_helper_vfp_negh(vn, vn);
47
+ /* BRAAZ, BRABZ, BLRAAZ, BLRABZ */
365
+ gen_vfp_negh(vn, vn);
48
+ if (op4 != 0x1f) {
366
}
49
+ goto do_unallocated;
367
vfp_load_reg16(vd, a->vd);
50
+ }
368
if (neg_d) {
51
+ modifier = new_tmp_a64_zero(s);
369
/* VFNMA, VFNMS */
52
+ }
370
- gen_helper_vfp_negh(vd, vd);
53
+ if (s->pauth_active) {
371
+ gen_vfp_negh(vd, vd);
54
+ dst = new_tmp_a64(s);
372
}
55
+ if (op3 == 2) {
373
fpst = fpstatus_ptr(FPST_FPCR_F16);
56
+ gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
374
gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst);
57
+ } else {
375
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
58
+ gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
376
vfp_load_reg32(vm, a->vm);
59
+ }
377
if (neg_n) {
60
+ } else {
378
/* VFNMS, VFMS */
61
+ dst = cpu_reg(s, rn);
379
- gen_helper_vfp_negs(vn, vn);
62
+ }
380
+ gen_vfp_negs(vn, vn);
63
+ break;
381
}
64
+
382
vfp_load_reg32(vd, a->vd);
65
default:
383
if (neg_d) {
66
goto do_unallocated;
384
/* VFNMA, VFNMS */
67
}
385
- gen_helper_vfp_negs(vd, vd);
68
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
386
+ gen_vfp_negs(vd, vd);
69
}
387
}
70
break;
388
fpst = fpstatus_ptr(FPST_FPCR);
71
389
gen_helper_vfp_muladds(vd, vn, vm, vd, fpst);
72
+ case 8: /* BRAA */
390
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
73
+ case 9: /* BLRAA */
391
vfp_load_reg64(vm, a->vm);
74
+ if (!dc_isar_feature(aa64_pauth, s)) {
392
if (neg_n) {
75
+ goto do_unallocated;
393
/* VFNMS, VFMS */
76
+ }
394
- gen_helper_vfp_negd(vn, vn);
77
+ if (op3 != 2 && op3 != 3) {
395
+ gen_vfp_negd(vn, vn);
78
+ goto do_unallocated;
396
}
79
+ }
397
vfp_load_reg64(vd, a->vd);
80
+ if (s->pauth_active) {
398
if (neg_d) {
81
+ dst = new_tmp_a64(s);
399
/* VFNMA, VFNMS */
82
+ modifier = cpu_reg_sp(s, op4);
400
- gen_helper_vfp_negd(vd, vd);
83
+ if (op3 == 2) {
401
+ gen_vfp_negd(vd, vd);
84
+ gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
402
}
85
+ } else {
403
fpst = fpstatus_ptr(FPST_FPCR);
86
+ gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
404
gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst);
87
+ }
405
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
88
+ } else {
406
DO_VFP_VMOV(VMOV_reg, sp, tcg_gen_mov_i32)
89
+ dst = cpu_reg(s, rn);
407
DO_VFP_VMOV(VMOV_reg, dp, tcg_gen_mov_i64)
90
+ }
408
91
+ gen_a64_set_pc(s, dst);
409
-DO_VFP_2OP(VABS, hp, gen_helper_vfp_absh, aa32_fp16_arith)
92
+ /* BLRAA also needs to load return address */
410
-DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss, aa32_fpsp_v2)
93
+ if (opc == 9) {
411
-DO_VFP_2OP(VABS, dp, gen_helper_vfp_absd, aa32_fpdp_v2)
94
+ tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
412
+DO_VFP_2OP(VABS, hp, gen_vfp_absh, aa32_fp16_arith)
95
+ }
413
+DO_VFP_2OP(VABS, sp, gen_vfp_abss, aa32_fpsp_v2)
96
+ break;
414
+DO_VFP_2OP(VABS, dp, gen_vfp_absd, aa32_fpdp_v2)
97
+
415
98
case 4: /* ERET */
416
-DO_VFP_2OP(VNEG, hp, gen_helper_vfp_negh, aa32_fp16_arith)
99
if (s->current_el == 0) {
417
-DO_VFP_2OP(VNEG, sp, gen_helper_vfp_negs, aa32_fpsp_v2)
100
goto do_unallocated;
418
-DO_VFP_2OP(VNEG, dp, gen_helper_vfp_negd, aa32_fpdp_v2)
101
}
419
+DO_VFP_2OP(VNEG, hp, gen_vfp_negh, aa32_fp16_arith)
102
switch (op3) {
420
+DO_VFP_2OP(VNEG, sp, gen_vfp_negs, aa32_fpsp_v2)
103
- case 0:
421
+DO_VFP_2OP(VNEG, dp, gen_vfp_negd, aa32_fpdp_v2)
104
+ case 0: /* ERET */
422
105
if (op4 != 0) {
423
static void gen_VSQRT_hp(TCGv_i32 vd, TCGv_i32 vm)
106
goto do_unallocated;
424
{
107
}
425
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
108
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
426
index XXXXXXX..XXXXXXX 100644
109
offsetof(CPUARMState, elr_el[s->current_el]));
427
--- a/target/arm/vfp_helper.c
110
break;
428
+++ b/target/arm/vfp_helper.c
111
429
@@ -XXX,XX +XXX,XX @@ VFP_BINOP(minnum)
112
+ case 2: /* ERETAA */
430
VFP_BINOP(maxnum)
113
+ case 3: /* ERETAB */
431
#undef VFP_BINOP
114
+ if (!dc_isar_feature(aa64_pauth, s)) {
432
115
+ goto do_unallocated;
433
-dh_ctype_f16 VFP_HELPER(neg, h)(dh_ctype_f16 a)
116
+ }
434
-{
117
+ if (rn != 0x1f || op4 != 0x1f) {
435
- return float16_chs(a);
118
+ goto do_unallocated;
436
-}
119
+ }
437
-
120
+ dst = tcg_temp_new_i64();
438
-float32 VFP_HELPER(neg, s)(float32 a)
121
+ tcg_gen_ld_i64(dst, cpu_env,
439
-{
122
+ offsetof(CPUARMState, elr_el[s->current_el]));
440
- return float32_chs(a);
123
+ if (s->pauth_active) {
441
-}
124
+ modifier = cpu_X[31];
442
-
125
+ if (op3 == 2) {
443
-float64 VFP_HELPER(neg, d)(float64 a)
126
+ gen_helper_autia(dst, cpu_env, dst, modifier);
444
-{
127
+ } else {
445
- return float64_chs(a);
128
+ gen_helper_autib(dst, cpu_env, dst, modifier);
446
-}
129
+ }
447
-
130
+ }
448
-dh_ctype_f16 VFP_HELPER(abs, h)(dh_ctype_f16 a)
131
+ break;
449
-{
132
+
450
- return float16_abs(a);
133
default:
451
-}
134
goto do_unallocated;
452
-
135
}
453
-float32 VFP_HELPER(abs, s)(float32 a)
454
-{
455
- return float32_abs(a);
456
-}
457
-
458
-float64 VFP_HELPER(abs, d)(float64 a)
459
-{
460
- return float64_abs(a);
461
-}
462
-
463
dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, CPUARMState *env)
464
{
465
return float16_sqrt(a, &env->vfp.fp_status_f16);
136
--
466
--
137
2.20.1
467
2.34.1
138
139
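The gen_vfp_neg*/gen_vfp_abs* inlines introduced above work because
IEEE-754 negation and absolute value are pure sign-bit manipulations:
FNEG and FABS never raise floating-point exceptions and need no
rounding state, so no float_status pointer (and hence no helper call)
is required. The same identities in plain C (my illustration):

    #include <stdint.h>

    /* binary32: negate flips bit 31, abs clears it; no FP flags touched. */
    static uint32_t f32_neg_bits(uint32_t f) { return f ^ (1u << 31); }
    static uint32_t f32_abs_bits(uint32_t f) { return f & INT32_MAX; }

    /* binary16 values sit in the low half of an i32 in the translator,
     * so the same trick uses bit 15 and INT16_MAX.
     */
    static uint32_t f16_neg_bits(uint32_t f) { return f ^ (1u << 15); }
    static uint32_t f16_abs_bits(uint32_t f) { return f & INT16_MAX; }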
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This is not really functional yet, because the crypto is not yet
3
This is the last instruction within disas_fp_2src,
4
implemented. This, however, follows the Auth pseudo function.
4
so remove that and its subroutines.
5
5
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20190108223129.5570-26-richard.henderson@linaro.org
8
Message-id: 20240524232121.284515-24-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
10
---
11
target/arm/pauth_helper.c | 21 ++++++++++++++++++++-
11
target/arm/tcg/a64.decode | 1 +
12
1 file changed, 20 insertions(+), 1 deletion(-)
12
target/arm/tcg/translate-a64.c | 177 +++++----------------------------
13
2 files changed, 27 insertions(+), 151 deletions(-)
13
14
14
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
15
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
15
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/pauth_helper.c
17
--- a/target/arm/tcg/a64.decode
17
+++ b/target/arm/pauth_helper.c
18
+++ b/target/arm/tcg/a64.decode
18
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
19
@@ -XXX,XX +XXX,XX @@ FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd
19
static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
                           ARMPACKey *key, bool data, int keynumber)
{
-    g_assert_not_reached(); /* FIXME */
+    ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
+    ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data);
+    int bot_bit, top_bit;
+    uint64_t pac, orig_ptr, test;
+
+    orig_ptr = pauth_original_ptr(ptr, param);
+    pac = pauth_computepac(orig_ptr, modifier, *key);
+    bot_bit = 64 - param.tsz;
+    top_bit = 64 - 8 * param.tbi;
+
+    test = (pac ^ ptr) & ~MAKE_64BIT_MASK(55, 1);
+    if (unlikely(extract64(test, bot_bit, top_bit - bot_bit))) {
+        int error_code = (keynumber << 1) | (keynumber ^ 1);
+        if (param.tbi) {
+            return deposit64(ptr, 53, 2, error_code);
+        } else {
+            return deposit64(ptr, 61, 2, error_code);
+        }
+    }
+    return orig_ptr;
}

static uint64_t pauth_strip(CPUARMState *env, uint64_t ptr, bool data)
--
2.20.1

FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd
+FNMUL_s 0001 1110 ..1 ..... 1000 10 ..... ..... @rrr_hsd

FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd
FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fmulx = {
};
TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx)

+static void gen_fnmul_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_mulh(d, n, m, s);
+    gen_vfp_negh(d, d);
+}
+
+static void gen_fnmul_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_muls(d, n, m, s);
+    gen_vfp_negs(d, d);
+}
+
+static void gen_fnmul_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s)
+{
+    gen_helper_vfp_muld(d, n, m, s);
+    gen_vfp_negd(d, d);
+}
+
+static const FPScalar f_scalar_fnmul = {
+    gen_fnmul_h,
+    gen_fnmul_s,
+    gen_fnmul_d,
+};
+TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul)
+
static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                          gen_helper_gvec_3_ptr * const fns[3])
{
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
}

-/* Floating-point data-processing (2 source) - single precision */
-static void handle_fp_2src_single(DisasContext *s, int opcode,
-                                  int rd, int rn, int rm)
-{
-    TCGv_i32 tcg_op1;
-    TCGv_i32 tcg_op2;
-    TCGv_i32 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i32();
-    fpst = fpstatus_ptr(FPST_FPCR);
-    tcg_op1 = read_fp_sreg(s, rn);
-    tcg_op2 = read_fp_sreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negs(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_sreg(s, rd, tcg_res);
-}
-
-/* Floating-point data-processing (2 source) - double precision */
-static void handle_fp_2src_double(DisasContext *s, int opcode,
-                                  int rd, int rn, int rm)
-{
-    TCGv_i64 tcg_op1;
-    TCGv_i64 tcg_op2;
-    TCGv_i64 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i64();
-    fpst = fpstatus_ptr(FPST_FPCR);
-    tcg_op1 = read_fp_dreg(s, rn);
-    tcg_op2 = read_fp_dreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negd(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_dreg(s, rd, tcg_res);
-}
-
-/* Floating-point data-processing (2 source) - half precision */
-static void handle_fp_2src_half(DisasContext *s, int opcode,
-                                int rd, int rn, int rm)
-{
-    TCGv_i32 tcg_op1;
-    TCGv_i32 tcg_op2;
-    TCGv_i32 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i32();
-    fpst = fpstatus_ptr(FPST_FPCR_F16);
-    tcg_op1 = read_fp_hreg(s, rn);
-    tcg_op2 = read_fp_hreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negh(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_sreg(s, rd, tcg_res);
-}
-
-/* Floating point data-processing (2 source)
- *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
- * +---+---+---+-----------+------+---+------+--------+-----+------+------+
- * | M | 0 | S | 1 1 1 1 0 | type | 1 |  Rm  | opcode | 1 0 |  Rn  |  Rd  |
- * +---+---+---+-----------+------+---+------+--------+-----+------+------+
- */
-static void disas_fp_2src(DisasContext *s, uint32_t insn)
-{
-    int mos = extract32(insn, 29, 3);
-    int type = extract32(insn, 22, 2);
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int rm = extract32(insn, 16, 5);
-    int opcode = extract32(insn, 12, 4);
-
-    if (opcode > 8 || mos) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (type) {
-    case 0:
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_single(s, opcode, rd, rn, rm);
-        break;
-    case 1:
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_double(s, opcode, rd, rn, rm);
-        break;
-    case 3:
-        if (!dc_isar_feature(aa64_fp16, s)) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_half(s, opcode, rd, rn, rm);
-        break;
-    default:
-        unallocated_encoding(s);
-    }
-}
-
/* Floating-point data-processing (3 source) - single precision */
static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1,
                                  int rd, int rn, int rm, int ra)
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
        break;
    case 2:
        /* Floating point data-processing (2 source) */
-        disas_fp_2src(s, insn);
+        unallocated_encoding(s); /* in decodetree */
        break;
    case 3:
        /* Floating point conditional select */
--
2.34.1
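[Note: a minimal standalone model of the FNMUL semantics the conversion above implements, not from the patch; it uses plain C doubles instead of the softfloat helpers, purely for illustration. FNMUL computes the product and then negates the result, which is exactly the gen_helper_vfp_mul* followed by gen_vfp_neg* sequence emitted by the new gen_fnmul_* functions.]

#include <stdio.h>

/* FNMUL: negate the rounded product, i.e. -(n * m). */
static double fnmul(double n, double m)
{
    return -(n * m);
}

int main(void)
{
    printf("fnmul(2.0, 3.0) = %f\n", fnmul(2.0, 3.0)); /* -6.000000 */
    return 0;
}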
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 93 +++++++++++++++++++++++++++++++++-----
 1 file changed, 81 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_hint(DisasContext *s, uint32_t insn,
    }

    switch (selector) {
-    case 0: /* NOP */
-        return;
-    case 3: /* WFI */
+    case 0b00000: /* NOP */
+        break;
+    case 0b00011: /* WFI */
        s->base.is_jmp = DISAS_WFI;
-        return;
+        break;
+    case 0b00001: /* YIELD */
        /* When running in MTTCG we don't generate jumps to the yield and
         * WFE helpers as it won't affect the scheduling of other vCPUs.
         * If we wanted to more completely model WFE/SEV so we don't busy
         * spin unnecessarily we would need to do something more involved.
         */
-    case 1: /* YIELD */
        if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
            s->base.is_jmp = DISAS_YIELD;
        }
-        return;
-    case 2: /* WFE */
+        break;
+    case 0b00010: /* WFE */
        if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
            s->base.is_jmp = DISAS_WFE;
        }
-        return;
-    case 4: /* SEV */
-    case 5: /* SEVL */
+        break;
+    case 0b00100: /* SEV */
+    case 0b00101: /* SEVL */
        /* we treat all as NOP at least for now */
-        return;
+        break;
+    case 0b00111: /* XPACLRI */
+        if (s->pauth_active) {
+            gen_helper_xpaci(cpu_X[30], cpu_env, cpu_X[30]);
+        }
+        break;
+    case 0b01000: /* PACIA1716 */
+        if (s->pauth_active) {
+            gen_helper_pacia(cpu_X[17], cpu_env, cpu_X[17], cpu_X[16]);
+        }
+        break;
+    case 0b01010: /* PACIB1716 */
+        if (s->pauth_active) {
+            gen_helper_pacib(cpu_X[17], cpu_env, cpu_X[17], cpu_X[16]);
+        }
+        break;
+    case 0b01100: /* AUTIA1716 */
+        if (s->pauth_active) {
+            gen_helper_autia(cpu_X[17], cpu_env, cpu_X[17], cpu_X[16]);
+        }
+        break;
+    case 0b01110: /* AUTIB1716 */
+        if (s->pauth_active) {
+            gen_helper_autib(cpu_X[17], cpu_env, cpu_X[17], cpu_X[16]);
+        }
+        break;
+    case 0b11000: /* PACIAZ */
+        if (s->pauth_active) {
+            gen_helper_pacia(cpu_X[30], cpu_env, cpu_X[30],
+                             new_tmp_a64_zero(s));
+        }
+        break;
+    case 0b11001: /* PACIASP */
+        if (s->pauth_active) {
+            gen_helper_pacia(cpu_X[30], cpu_env, cpu_X[30], cpu_X[31]);
+        }
+        break;
+    case 0b11010: /* PACIBZ */
+        if (s->pauth_active) {
+            gen_helper_pacib(cpu_X[30], cpu_env, cpu_X[30],
+                             new_tmp_a64_zero(s));
+        }
+        break;
+    case 0b11011: /* PACIBSP */
+        if (s->pauth_active) {
+            gen_helper_pacib(cpu_X[30], cpu_env, cpu_X[30], cpu_X[31]);
+        }
+        break;
+    case 0b11100: /* AUTIAZ */
+        if (s->pauth_active) {
+            gen_helper_autia(cpu_X[30], cpu_env, cpu_X[30],
+                             new_tmp_a64_zero(s));
+        }
+        break;
+    case 0b11101: /* AUTIASP */
+        if (s->pauth_active) {
+            gen_helper_autia(cpu_X[30], cpu_env, cpu_X[30], cpu_X[31]);
+        }
+        break;
+    case 0b11110: /* AUTIBZ */
+        if (s->pauth_active) {
+            gen_helper_autib(cpu_X[30], cpu_env, cpu_X[30],
+                             new_tmp_a64_zero(s));
+        }
+        break;
+    case 0b11111: /* AUTIBSP */
+        if (s->pauth_active) {
+            gen_helper_autib(cpu_X[30], cpu_env, cpu_X[30], cpu_X[31]);
+        }
+        break;
    default:
        /* default specified as NOP equivalent */
-        return;
+        break;
    }
}

--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 2 +
 target/arm/tcg/a64.decode | 22 +++
 target/arm/tcg/translate-a64.c | 241 +++++++++++++++++----------------
 target/arm/tcg/vec_helper.c | 14 ++
 4 files changed, 163 insertions(+), 116 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_vfma_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_vfms_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
                   void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd

+FMLA_v 0.00 1110 010 ..... 00001 1 ..... ..... @qrrr_h
+FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
+
+FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
+FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd
+
### Advanced SIMD scalar x indexed element

FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d

+FMLA_si 0101 1111 00 .. .... 0001 . 0 ..... ..... @rrx_h
+FMLA_si 0101 1111 10 .. .... 0001 . 0 ..... ..... @rrx_s
+FMLA_si 0101 1111 11 0. .... 0001 . 0 ..... ..... @rrx_d
+
+FMLS_si 0101 1111 00 .. .... 0101 . 0 ..... ..... @rrx_h
+FMLS_si 0101 1111 10 .. .... 0101 . 0 ..... ..... @rrx_s
+FMLS_si 0101 1111 11 0. .... 0101 . 0 ..... ..... @rrx_d
+
FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
@@ -XXX,XX +XXX,XX @@ FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d

+FMLA_vi 0.00 1111 00 .. .... 0001 . 0 ..... ..... @qrrx_h
+FMLA_vi 0.00 1111 10 . ..... 0001 . 0 ..... ..... @qrrx_s
+FMLA_vi 0.00 1111 11 0 ..... 0001 . 0 ..... ..... @qrrx_d
+
+FMLS_vi 0.00 1111 00 .. .... 0101 . 0 ..... ..... @qrrx_h
+FMLS_vi 0.00 1111 10 . ..... 0101 . 0 ..... ..... @qrrx_s
+FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d
+
FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
};
TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx)

+static gen_helper_gvec_3_ptr * const f_vector_fmla[3] = {
+    gen_helper_gvec_vfma_h,
+    gen_helper_gvec_vfma_s,
+    gen_helper_gvec_vfma_d,
+};
+TRANS(FMLA_v, do_fp3_vector, a, f_vector_fmla)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = {
+    gen_helper_gvec_vfms_h,
+    gen_helper_gvec_vfms_s,
+    gen_helper_gvec_vfms_d,
+};
+TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls)
+
/*
 * Advanced SIMD scalar/vector x indexed element
 */
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul)
TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)

+static bool do_fmla_scalar_idx(DisasContext *s, arg_rrx_e *a, bool neg)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rd);
+            TCGv_i64 t1 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t2 = tcg_temp_new_i64();
+
+            read_vec_element(s, t2, a->rm, a->idx, MO_64);
+            if (neg) {
+                gen_vfp_negd(t1, t1);
+            }
+            gen_helper_vfp_muladdd(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rd);
+            TCGv_i32 t1 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t2 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t2, a->rm, a->idx, MO_32);
+            if (neg) {
+                gen_vfp_negs(t1, t1);
+            }
+            gen_helper_vfp_muladds(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rd);
+            TCGv_i32 t1 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t2 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t2, a->rm, a->idx, MO_16);
+            if (neg) {
+                gen_vfp_negh(t1, t1);
+            }
+            gen_helper_advsimd_muladdh(t0, t1, t2, t0,
+                                       fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+TRANS(FMLA_si, do_fmla_scalar_idx, a, false)
+TRANS(FMLS_si, do_fmla_scalar_idx, a, true)
+
static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
                              gen_helper_gvec_3_ptr * const fns[3])
{
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
};
TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx)

+static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
+{
+    static gen_helper_gvec_4_ptr * const fns[3] = {
+        gen_helper_gvec_fmla_idx_h,
+        gen_helper_gvec_fmla_idx_s,
+        gen_helper_gvec_fmla_idx_d,
+    };
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd,
+                          esz == MO_16, (a->idx << 1) | neg,
+                          fns[esz - 1]);
+    }
+    return true;
+}
+
+TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
+TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
+

/* Shift a TCGv src by TCGv shift_amount, put result in dst.
 * Note that it is the caller's responsibility to ensure that the
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
            read_vec_element(s, tcg_op2, rm, pass, MO_64);

            switch (fpopcode) {
-            case 0x39: /* FMLS */
-                /* As usual for ARM, separate negation for fused multiply-add */
-                gen_vfp_negd(tcg_op1, tcg_op1);
-                /* fall through */
-            case 0x19: /* FMLA */
-                read_vec_element(s, tcg_res, rd, pass, MO_64);
-                gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2,
-                                       tcg_res, fpst);
-                break;
            case 0x1c: /* FCMEQ */
                gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
                break;
            default:
            case 0x18: /* FMAXNM */
+            case 0x19: /* FMLA */
            case 0x1a: /* FADD */
            case 0x1b: /* FMULX */
            case 0x1e: /* FMAX */
            case 0x38: /* FMINNM */
+            case 0x39: /* FMLS */
            case 0x3a: /* FSUB */
            case 0x3e: /* FMIN */
            case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);

            switch (fpopcode) {
-            case 0x39: /* FMLS */
-                /* As usual for ARM, separate negation for fused multiply-add */
-                gen_vfp_negs(tcg_op1, tcg_op1);
-                /* fall through */
-            case 0x19: /* FMLA */
-                read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
-                gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2,
-                                       tcg_res, fpst);
-                break;
            case 0x1c: /* FCMEQ */
                gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
                break;
            default:
            case 0x18: /* FMAXNM */
+            case 0x19: /* FMLA */
            case 0x1a: /* FADD */
            case 0x1b: /* FMULX */
            case 0x1e: /* FMAX */
            case 0x38: /* FMINNM */
+            case 0x39: /* FMLS */
            case 0x3a: /* FSUB */
            case 0x3e: /* FMIN */
            case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
    case 0x3f: /* FRSQRTS */
    case 0x5d: /* FACGE */
    case 0x7d: /* FACGT */
-    case 0x19: /* FMLA */
-    case 0x39: /* FMLS */
    case 0x1c: /* FCMEQ */
    case 0x5c: /* FCMGE */
    case 0x7a: /* FABD */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
    default:
    case 0x18: /* FMAXNM */
+    case 0x19: /* FMLA */
    case 0x1a: /* FADD */
    case 0x1b: /* FMULX */
    case 0x1e: /* FMAX */
    case 0x38: /* FMINNM */
+    case 0x39: /* FMLS */
    case 0x3a: /* FSUB */
    case 0x3e: /* FMIN */
    case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
    int pass;

    switch (fpopcode) {
-    case 0x1: /* FMLA */
    case 0x4: /* FCMEQ */
    case 0x7: /* FRECPS */
-    case 0x9: /* FMLS */
    case 0xf: /* FRSQRTS */
    case 0x14: /* FCMGE */
    case 0x15: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
        break;
    default:
    case 0x0: /* FMAXNM */
+    case 0x1: /* FMLA */
    case 0x2: /* FADD */
    case 0x3: /* FMULX */
    case 0x6: /* FMAX */
    case 0x8: /* FMINNM */
+    case 0x9: /* FMLS */
    case 0xa: /* FSUB */
    case 0xe: /* FMIN */
    case 0x13: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
            read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);

            switch (fpopcode) {
-            case 0x1: /* FMLA */
-                read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
-                gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
-                                           fpst);
-                break;
            case 0x4: /* FCMEQ */
                gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
            case 0x7: /* FRECPS */
                gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
-            case 0x9: /* FMLS */
-                /* As usual for ARM, separate negation for fused multiply-add */
-                tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
-                read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
-                gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
-                                           fpst);
-                break;
            case 0xf: /* FRSQRTS */
                gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                break;
            default:
            case 0x0: /* FMAXNM */
+            case 0x1: /* FMLA */
            case 0x2: /* FADD */
            case 0x3: /* FMULX */
            case 0x6: /* FMAX */
            case 0x8: /* FMINNM */
+            case 0x9: /* FMLS */
            case 0xa: /* FSUB */
            case 0xe: /* FMIN */
            case 0x13: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
    case 0x0c: /* SQDMULH */
    case 0x0d: /* SQRDMULH */
        break;
-    case 0x01: /* FMLA */
-    case 0x05: /* FMLS */
-        is_fp = 1;
-        break;
    case 0x1d: /* SQRDMLAH */
    case 0x1f: /* SQRDMLSH */
        if (!dc_isar_feature(aa64_rdm, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
        /* is_fp, but we pass tcg_env not fp_status. */
        break;
    default:
+    case 0x01: /* FMLA */
+    case 0x05: /* FMLS */
    case 0x09: /* FMUL */
    case 0x19: /* FMULX */
        unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)

    switch (is_fp) {
    case 1: /* normal fp */
-        /* convert insn encoded size to MemOp size */
-        switch (size) {
-        case 0: /* half-precision */
-            size = MO_16;
-            is_fp16 = true;
-            break;
-        case MO_32: /* single precision */
-        case MO_64: /* double precision */
-            break;
-        default:
-            unallocated_encoding(s);
-            return;
-        }
-        break;
+        unallocated_encoding(s); /* in decodetree */
+        return;

    case 2: /* complex fp */
        /* Each indexable element is a complex pair. */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
    }

    if (size == 3) {
-        TCGv_i64 tcg_idx = tcg_temp_new_i64();
-        int pass;
-
-        assert(is_fp && is_q && !is_long);
-
-        read_vec_element(s, tcg_idx, rm, index, MO_64);
-
-        for (pass = 0; pass < (is_scalar ? 1 : 2); pass++) {
-            TCGv_i64 tcg_op = tcg_temp_new_i64();
-            TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-            read_vec_element(s, tcg_op, rn, pass, MO_64);
-
-            switch (16 * u + opcode) {
-            case 0x05: /* FMLS */
-                /* As usual for ARM, separate negation for fused multiply-add */
-                gen_vfp_negd(tcg_op, tcg_op);
-                /* fall through */
-            case 0x01: /* FMLA */
-                read_vec_element(s, tcg_res, rd, pass, MO_64);
-                gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
-                break;
-            default:
-            case 0x09: /* FMUL */
-            case 0x19: /* FMULX */
-                g_assert_not_reached();
-            }
-
-            write_vec_element(s, tcg_res, rd, pass, MO_64);
-        }
-
-        clear_vec_high(s, !is_scalar, rd);
+        g_assert_not_reached();
    } else if (!is_long) {
        /* 32 bit floating point, or 16 or 32 bit integer.
         * For the 16 bit scalar case we use the usual Neon helpers and
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                genfn(tcg_res, tcg_op, tcg_res);
                break;
            }
-            case 0x05: /* FMLS */
-            case 0x01: /* FMLA */
-                read_vec_element_i32(s, tcg_res, rd, pass,
-                                     is_scalar ? size : MO_32);
-                switch (size) {
-                case 1:
-                    if (opcode == 0x5) {
-                        /* As usual for ARM, separate negation for fused
-                         * multiply-add */
-                        tcg_gen_xori_i32(tcg_op, tcg_op, 0x80008000);
-                    }
-                    if (is_scalar) {
-                        gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
-                                                   tcg_res, fpst);
-                    } else {
-                        gen_helper_advsimd_muladd2h(tcg_res, tcg_op, tcg_idx,
-                                                    tcg_res, fpst);
-                    }
-                    break;
-                case 2:
-                    if (opcode == 0x5) {
-                        /* As usual for ARM, separate negation for
-                         * fused multiply-add */
-                        tcg_gen_xori_i32(tcg_op, tcg_op, 0x80000000);
-                    }
-                    gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx,
-                                           tcg_res, fpst);
-                    break;
-                default:
-                    g_assert_not_reached();
-                }
-                break;
            case 0x0c: /* SQDMULH */
                if (size == 1) {
                    gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                }
                break;
            default:
+            case 0x01: /* FMLA */
+            case 0x05: /* FMLS */
            case 0x09: /* FMUL */
            case 0x19: /* FMULX */
                g_assert_not_reached();
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static float32 float32_muladd_f(float32 dest, float32 op1, float32 op2,
    return float32_muladd(op1, op2, dest, 0, stat);
}

+static float64 float64_muladd_f(float64 dest, float64 op1, float64 op2,
+                                float_status *stat)
+{
+    return float64_muladd(op1, op2, dest, 0, stat);
+}
+
static float16 float16_mulsub_f(float16 dest, float16 op1, float16 op2,
                                float_status *stat)
{
@@ -XXX,XX +XXX,XX @@ static float32 float32_mulsub_f(float32 dest, float32 op1, float32 op2,
    return float32_muladd(float32_chs(op1), op2, dest, 0, stat);
}

+static float64 float64_mulsub_f(float64 dest, float64 op1, float64 op2,
+                                float_status *stat)
+{
+    return float64_muladd(float64_chs(op1), op2, dest, 0, stat);
+}
+
#define DO_MULADD(NAME, FUNC, TYPE) \
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
{ \
@@ -XXX,XX +XXX,XX @@ DO_MULADD(gvec_fmls_s, float32_mulsub_nf, float32)

DO_MULADD(gvec_vfma_h, float16_muladd_f, float16)
DO_MULADD(gvec_vfma_s, float32_muladd_f, float32)
+DO_MULADD(gvec_vfma_d, float64_muladd_f, float64)

DO_MULADD(gvec_vfms_h, float16_mulsub_f, float16)
DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
+DO_MULADD(gvec_vfms_d, float64_mulsub_f, float64)

/* For the indexed ops, SVE applies the index per 128-bit vector segment.
 * For AdvSIMD, there is of course only one such vector segment.
--
2.34.1
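[Note: a short aside on the "/* As usual for ARM, separate negation for fused multiply-add */" comments in the FMLA/FMLS patch above. FMLS is specified as a fused d + (-n) * m with a single rounding, so the code negates an *operand* before the fused multiply-add; computing the product, rounding it, negating and then adding would take two rounding steps and give subtly different results. A minimal standalone illustration using C's fma(), not from the patch:]

#include <math.h>
#include <stdio.h>

int main(void)
{
    double d = 1.0, n = 2.0, m = 3.0;

    /* FMLS: fused d + (-n) * m, one rounding step. */
    printf("fmls = %f\n", fma(-n, m, d));   /* 1 - 2*3 = -5.000000 */
    return 0;
}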
From: Richard Henderson <richard.henderson@linaro.org>

This is the main crypto routine, an implementation of QARMA.
This matches, as much as possible, ARM pseudocode.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190108223129.5570-28-richard.henderson@linaro.org
[PMM: fixed minor checkpatch nits]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/pauth_helper.c | 242 +++++++++++++++++++++++++++++++++-
 1 file changed, 241 insertions(+), 1 deletion(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -XXX,XX +XXX,XX @@
#include "tcg/tcg-gvec-desc.h"


+static uint64_t pac_cell_shuffle(uint64_t i)
+{
+    uint64_t o = 0;
+
+    o |= extract64(i, 52, 4);
+    o |= extract64(i, 24, 4) << 4;
+    o |= extract64(i, 44, 4) << 8;
+    o |= extract64(i, 0, 4) << 12;
+
+    o |= extract64(i, 28, 4) << 16;
+    o |= extract64(i, 48, 4) << 20;
+    o |= extract64(i, 4, 4) << 24;
+    o |= extract64(i, 40, 4) << 28;
+
+    o |= extract64(i, 32, 4) << 32;
+    o |= extract64(i, 12, 4) << 36;
+    o |= extract64(i, 56, 4) << 40;
+    o |= extract64(i, 20, 4) << 44;
+
+    o |= extract64(i, 8, 4) << 48;
+    o |= extract64(i, 36, 4) << 52;
+    o |= extract64(i, 16, 4) << 56;
+    o |= extract64(i, 60, 4) << 60;
+
+    return o;
+}
+
+static uint64_t pac_cell_inv_shuffle(uint64_t i)
+{
+    uint64_t o = 0;
+
+    o |= extract64(i, 12, 4);
+    o |= extract64(i, 24, 4) << 4;
+    o |= extract64(i, 48, 4) << 8;
+    o |= extract64(i, 36, 4) << 12;
+
+    o |= extract64(i, 56, 4) << 16;
+    o |= extract64(i, 44, 4) << 20;
+    o |= extract64(i, 4, 4) << 24;
+    o |= extract64(i, 16, 4) << 28;
+
+    o |= i & MAKE_64BIT_MASK(32, 4);
+    o |= extract64(i, 52, 4) << 36;
+    o |= extract64(i, 28, 4) << 40;
+    o |= extract64(i, 8, 4) << 44;
+
+    o |= extract64(i, 20, 4) << 48;
+    o |= extract64(i, 0, 4) << 52;
+    o |= extract64(i, 40, 4) << 56;
+    o |= i & MAKE_64BIT_MASK(60, 4);
+
+    return o;
+}
+
+static uint64_t pac_sub(uint64_t i)
+{
+    static const uint8_t sub[16] = {
+        0xb, 0x6, 0x8, 0xf, 0xc, 0x0, 0x9, 0xe,
+        0x3, 0x7, 0x4, 0x5, 0xd, 0x2, 0x1, 0xa,
+    };
+    uint64_t o = 0;
+    int b;
+
+    for (b = 0; b < 64; b += 4) {
+        o |= (uint64_t)sub[(i >> b) & 0xf] << b;
+    }
+    return o;
+}
+
+static uint64_t pac_inv_sub(uint64_t i)
+{
+    static const uint8_t inv_sub[16] = {
+        0x5, 0xe, 0xd, 0x8, 0xa, 0xb, 0x1, 0x9,
+        0x2, 0x6, 0xf, 0x0, 0x4, 0xc, 0x7, 0x3,
+    };
+    uint64_t o = 0;
+    int b;
+
+    for (b = 0; b < 64; b += 4) {
+        o |= (uint64_t)inv_sub[(i >> b) & 0xf] << b;
+    }
+    return o;
+}
+
+static int rot_cell(int cell, int n)
+{
+    /* 4-bit rotate left by n. */
+    cell |= cell << 4;
+    return extract32(cell, 4 - n, 4);
+}
+
+static uint64_t pac_mult(uint64_t i)
+{
+    uint64_t o = 0;
+    int b;
+
+    for (b = 0; b < 4 * 4; b += 4) {
+        int i0, i4, i8, ic, t0, t1, t2, t3;
+
+        i0 = extract64(i, b, 4);
+        i4 = extract64(i, b + 4 * 4, 4);
+        i8 = extract64(i, b + 8 * 4, 4);
+        ic = extract64(i, b + 12 * 4, 4);
+
+        t0 = rot_cell(i8, 1) ^ rot_cell(i4, 2) ^ rot_cell(i0, 1);
+        t1 = rot_cell(ic, 1) ^ rot_cell(i4, 1) ^ rot_cell(i0, 2);
+        t2 = rot_cell(ic, 2) ^ rot_cell(i8, 1) ^ rot_cell(i0, 1);
+        t3 = rot_cell(ic, 1) ^ rot_cell(i8, 2) ^ rot_cell(i4, 1);
+
+        o |= (uint64_t)t3 << b;
+        o |= (uint64_t)t2 << (b + 4 * 4);
+        o |= (uint64_t)t1 << (b + 8 * 4);
+        o |= (uint64_t)t0 << (b + 12 * 4);
+    }
+    return o;
+}
+
+static uint64_t tweak_cell_rot(uint64_t cell)
+{
+    return (cell >> 1) | (((cell ^ (cell >> 1)) & 1) << 3);
+}
+
+static uint64_t tweak_shuffle(uint64_t i)
+{
+    uint64_t o = 0;
+
+    o |= extract64(i, 16, 4) << 0;
+    o |= extract64(i, 20, 4) << 4;
+    o |= tweak_cell_rot(extract64(i, 24, 4)) << 8;
+    o |= extract64(i, 28, 4) << 12;
+
+    o |= tweak_cell_rot(extract64(i, 44, 4)) << 16;
+    o |= extract64(i, 8, 4) << 20;
+    o |= extract64(i, 12, 4) << 24;
+    o |= tweak_cell_rot(extract64(i, 32, 4)) << 28;
+
+    o |= extract64(i, 48, 4) << 32;
+    o |= extract64(i, 52, 4) << 36;
+    o |= extract64(i, 56, 4) << 40;
+    o |= tweak_cell_rot(extract64(i, 60, 4)) << 44;
+
+    o |= tweak_cell_rot(extract64(i, 0, 4)) << 48;
+    o |= extract64(i, 4, 4) << 52;
+    o |= tweak_cell_rot(extract64(i, 40, 4)) << 56;
+    o |= tweak_cell_rot(extract64(i, 36, 4)) << 60;
+
+    return o;
+}
+
+static uint64_t tweak_cell_inv_rot(uint64_t cell)
+{
+    return ((cell << 1) & 0xf) | ((cell & 1) ^ (cell >> 3));
+}
+
+static uint64_t tweak_inv_shuffle(uint64_t i)
+{
+    uint64_t o = 0;
+
+    o |= tweak_cell_inv_rot(extract64(i, 48, 4));
+    o |= extract64(i, 52, 4) << 4;
+    o |= extract64(i, 20, 4) << 8;
+    o |= extract64(i, 24, 4) << 12;
+
+    o |= extract64(i, 0, 4) << 16;
+    o |= extract64(i, 4, 4) << 20;
+    o |= tweak_cell_inv_rot(extract64(i, 8, 4)) << 24;
+    o |= extract64(i, 12, 4) << 28;
+
+    o |= tweak_cell_inv_rot(extract64(i, 28, 4)) << 32;
+    o |= tweak_cell_inv_rot(extract64(i, 60, 4)) << 36;
+    o |= tweak_cell_inv_rot(extract64(i, 56, 4)) << 40;
+    o |= tweak_cell_inv_rot(extract64(i, 16, 4)) << 44;
+
+    o |= extract64(i, 32, 4) << 48;
+    o |= extract64(i, 36, 4) << 52;
+    o |= extract64(i, 40, 4) << 56;
+    o |= tweak_cell_inv_rot(extract64(i, 44, 4)) << 60;
+
+    return o;
+}
+
static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
                                 ARMPACKey key)
{
-    g_assert_not_reached(); /* FIXME */
+    static const uint64_t RC[5] = {
+        0x0000000000000000ull,
+        0x13198A2E03707344ull,
+        0xA4093822299F31D0ull,
+        0x082EFA98EC4E6C89ull,
+        0x452821E638D01377ull,
+    };
+    const uint64_t alpha = 0xC0AC29B7C97C50DDull;
+    /*
+     * Note that in the ARM pseudocode, key0 contains bits <127:64>
+     * and key1 contains bits <63:0> of the 128-bit key.
+     */
+    uint64_t key0 = key.hi, key1 = key.lo;
+    uint64_t workingval, runningmod, roundkey, modk0;
+    int i;
+
+    modk0 = (key0 << 63) | ((key0 >> 1) ^ (key0 >> 63));
+    runningmod = modifier;
+    workingval = data ^ key0;
+
+    for (i = 0; i <= 4; ++i) {
+        roundkey = key1 ^ runningmod;
+        workingval ^= roundkey;
+        workingval ^= RC[i];
+        if (i > 0) {
+            workingval = pac_cell_shuffle(workingval);
+            workingval = pac_mult(workingval);
+        }
+        workingval = pac_sub(workingval);
+        runningmod = tweak_shuffle(runningmod);
+    }
+    roundkey = modk0 ^ runningmod;
+    workingval ^= roundkey;
+    workingval = pac_cell_shuffle(workingval);
+    workingval = pac_mult(workingval);
+    workingval = pac_sub(workingval);
+    workingval = pac_cell_shuffle(workingval);
+    workingval = pac_mult(workingval);
+    workingval ^= key1;
+    workingval = pac_cell_inv_shuffle(workingval);
+    workingval = pac_inv_sub(workingval);
+    workingval = pac_mult(workingval);
+    workingval = pac_cell_inv_shuffle(workingval);
+    workingval ^= key0;
+    workingval ^= runningmod;
+    for (i = 0; i <= 4; ++i) {
+        workingval = pac_inv_sub(workingval);
+        if (i < 4) {
+            workingval = pac_mult(workingval);
+            workingval = pac_cell_inv_shuffle(workingval);
+        }
+        runningmod = tweak_inv_shuffle(runningmod);
+        roundkey = key1 ^ runningmod;
+        workingval ^= RC[4 - i];
+        workingval ^= roundkey;
+        workingval ^= alpha;
+    }
+    workingval ^= modk0;
+
+    return workingval;
}

static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 5 +
 target/arm/tcg/a64.decode | 30 ++++++
 target/arm/tcg/translate-a64.c | 188 +++++++++++++++++++--------------
 target/arm/tcg/vec_helper.c | 30 ++++++
 4 files changed, 174 insertions(+), 79 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fceq_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fcge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fcgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_facge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd
FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd

+FCMEQ_s 0101 1110 010 ..... 00100 1 ..... ..... @rrr_h
+FCMEQ_s 0101 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd
+
+FCMGE_s 0111 1110 010 ..... 00100 1 ..... ..... @rrr_h
+FCMGE_s 0111 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd
+
+FCMGT_s 0111 1110 110 ..... 00100 1 ..... ..... @rrr_h
+FCMGT_s 0111 1110 1.1 ..... 11100 1 ..... ..... @rrr_sd
+
+FACGE_s 0111 1110 010 ..... 00101 1 ..... ..... @rrr_h
+FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd
+
+FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h
+FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
+
### Advanced SIMD three same

FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd

+FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h
+FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
+
+FCMGE_v 0.10 1110 010 ..... 00100 1 ..... ..... @qrrr_h
+FCMGE_v 0.10 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
+
+FCMGT_v 0.10 1110 110 ..... 00100 1 ..... ..... @qrrr_h
+FCMGT_v 0.10 1110 1.1 ..... 11100 1 ..... ..... @qrrr_sd
+
+FACGE_v 0.10 1110 010 ..... 00101 1 ..... ..... @qrrr_h
+FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd
+
+FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h
+FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
+
### Advanced SIMD scalar x indexed element

FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fnmul = {
};
TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul)

+static const FPScalar f_scalar_fcmeq = {
+    gen_helper_advsimd_ceq_f16,
+    gen_helper_neon_ceq_f32,
+    gen_helper_neon_ceq_f64,
+};
+TRANS(FCMEQ_s, do_fp3_scalar, a, &f_scalar_fcmeq)
+
+static const FPScalar f_scalar_fcmge = {
+    gen_helper_advsimd_cge_f16,
+    gen_helper_neon_cge_f32,
+    gen_helper_neon_cge_f64,
+};
+TRANS(FCMGE_s, do_fp3_scalar, a, &f_scalar_fcmge)
+
+static const FPScalar f_scalar_fcmgt = {
+    gen_helper_advsimd_cgt_f16,
+    gen_helper_neon_cgt_f32,
+    gen_helper_neon_cgt_f64,
+};
+TRANS(FCMGT_s, do_fp3_scalar, a, &f_scalar_fcmgt)
+
+static const FPScalar f_scalar_facge = {
+    gen_helper_advsimd_acge_f16,
+    gen_helper_neon_acge_f32,
+    gen_helper_neon_acge_f64,
+};
+TRANS(FACGE_s, do_fp3_scalar, a, &f_scalar_facge)
+
+static const FPScalar f_scalar_facgt = {
+    gen_helper_advsimd_acgt_f16,
+    gen_helper_neon_acgt_f32,
+    gen_helper_neon_acgt_f64,
+};
+TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt)
+
static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                          gen_helper_gvec_3_ptr * const fns[3])
{
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = {
};
TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls)

+static gen_helper_gvec_3_ptr * const f_vector_fcmeq[3] = {
+    gen_helper_gvec_fceq_h,
+    gen_helper_gvec_fceq_s,
+    gen_helper_gvec_fceq_d,
+};
+TRANS(FCMEQ_v, do_fp3_vector, a, f_vector_fcmeq)
+
+static gen_helper_gvec_3_ptr * const f_vector_fcmge[3] = {
+    gen_helper_gvec_fcge_h,
+    gen_helper_gvec_fcge_s,
+    gen_helper_gvec_fcge_d,
+};
+TRANS(FCMGE_v, do_fp3_vector, a, f_vector_fcmge)
+
+static gen_helper_gvec_3_ptr * const f_vector_fcmgt[3] = {
+    gen_helper_gvec_fcgt_h,
+    gen_helper_gvec_fcgt_s,
+    gen_helper_gvec_fcgt_d,
+};
+TRANS(FCMGT_v, do_fp3_vector, a, f_vector_fcmgt)
+
+static gen_helper_gvec_3_ptr * const f_vector_facge[3] = {
+    gen_helper_gvec_facge_h,
+    gen_helper_gvec_facge_s,
+    gen_helper_gvec_facge_d,
+};
+TRANS(FACGE_v, do_fp3_vector, a, f_vector_facge)
+
+static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = {
+    gen_helper_gvec_facgt_h,
+    gen_helper_gvec_facgt_s,
+    gen_helper_gvec_facgt_d,
+};
+TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt)
+
/*
 * Advanced SIMD scalar/vector x indexed element
 */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
            read_vec_element(s, tcg_op2, rm, pass, MO_64);

            switch (fpopcode) {
-            case 0x1c: /* FCMEQ */
-                gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x1f: /* FRECPS */
                gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
            case 0x3f: /* FRSQRTS */
                gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
-            case 0x5c: /* FCMGE */
-                gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x5d: /* FACGE */
-                gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x7a: /* FABD */
                gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
                gen_vfp_absd(tcg_res, tcg_res);
                break;
-            case 0x7c: /* FCMGT */
-                gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x7d: /* FACGT */
-                gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            default:
            case 0x18: /* FMAXNM */
            case 0x19: /* FMLA */
            case 0x1a: /* FADD */
            case 0x1b: /* FMULX */
+            case 0x1c: /* FCMEQ */
            case 0x1e: /* FMAX */
            case 0x38: /* FMINNM */
            case 0x39: /* FMLS */
            case 0x3a: /* FSUB */
            case 0x3e: /* FMIN */
            case 0x5b: /* FMUL */
+            case 0x5c: /* FCMGE */
+            case 0x5d: /* FACGE */
            case 0x5f: /* FDIV */
+            case 0x7c: /* FCMGT */
+            case 0x7d: /* FACGT */
                g_assert_not_reached();
            }

@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);

            switch (fpopcode) {
-            case 0x1c: /* FCMEQ */
-                gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x1f: /* FRECPS */
                gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
            case 0x3f: /* FRSQRTS */
                gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
-            case 0x5c: /* FCMGE */
-                gen_helper_neon_cge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x5d: /* FACGE */
-                gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x7a: /* FABD */
                gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
                gen_vfp_abss(tcg_res, tcg_res);
                break;
-            case 0x7c: /* FCMGT */
-                gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x7d: /* FACGT */
-                gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            default:
            case 0x18: /* FMAXNM */
            case 0x19: /* FMLA */
            case 0x1a: /* FADD */
            case 0x1b: /* FMULX */
+            case 0x1c: /* FCMEQ */
            case 0x1e: /* FMAX */
            case 0x38: /* FMINNM */
            case 0x39: /* FMLS */
            case 0x3a: /* FSUB */
            case 0x3e: /* FMIN */
            case 0x5b: /* FMUL */
+            case 0x5c: /* FCMGE */
+            case 0x5d: /* FACGE */
            case 0x5f: /* FDIV */
+            case 0x7c: /* FCMGT */
+            case 0x7d: /* FACGT */
                g_assert_not_reached();
            }

@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
    switch (fpopcode) {
    case 0x1f: /* FRECPS */
    case 0x3f: /* FRSQRTS */
+    case 0x7a: /* FABD */
+        break;
+    default:
+    case 0x1b: /* FMULX */
    case 0x5d: /* FACGE */
    case 0x7d: /* FACGT */
    case 0x1c: /* FCMEQ */
    case 0x5c: /* FCMGE */
    case 0x7c: /* FCMGT */
-    case 0x7a: /* FABD */
-        break;
-    default:
-    case 0x1b: /* FMULX */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
    TCGv_i32 tcg_res;

    switch (fpopcode) {
-    case 0x04: /* FCMEQ (reg) */
    case 0x07: /* FRECPS */
    case 0x0f: /* FRSQRTS */
-    case 0x14: /* FCMGE (reg) */
-    case 0x15: /* FACGE */
    case 0x1a: /* FABD */
-    case 0x1c: /* FCMGT (reg) */
-    case 0x1d: /* FACGT */
        break;
    default:
    case 0x03: /* FMULX */
+    case 0x04: /* FCMEQ (reg) */
+    case 0x14: /* FCMGE (reg) */
+    case 0x15: /* FACGE */
+    case 0x1c: /* FCMGT (reg) */
+    case 0x1d: /* FACGT */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
    tcg_res = tcg_temp_new_i32();

    switch (fpopcode) {
-    case 0x04: /* FCMEQ (reg) */
-        gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
    case 0x07: /* FRECPS */
        gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
        break;
    case 0x0f: /* FRSQRTS */
        gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
        break;
-    case 0x14: /* FCMGE (reg) */
-        gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x15: /* FACGE */
-        gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
    case 0x1a: /* FABD */
        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
        tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
        break;
-    case 0x1c: /* FCMGT (reg) */
-        gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x1d: /* FACGT */
-        gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
    default:
    case 0x03: /* FMULX */
+    case 0x04: /* FCMEQ (reg) */
+    case 0x14: /* FCMGE (reg) */
+    case 0x15: /* FACGE */
+    case 0x1c: /* FCMGT (reg) */
+    case 0x1d: /* FACGT */
        g_assert_not_reached();
    }

@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
        return;
    case 0x1f: /* FRECPS */
    case 0x3f: /* FRSQRTS */
-    case 0x5d: /* FACGE */
-    case 0x7d: /* FACGT */
-    case 0x1c: /* FCMEQ */
-    case 0x5c: /* FCMGE */
    case 0x7a: /* FABD */
-    case 0x7c: /* FCMGT */
        if (!fp_access_check(s)) {
            return;
        }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
    case 0x19: /* FMLA */
    case 0x1a: /* FADD */
    case 0x1b: /* FMULX */
+    case 0x1c: /* FCMEQ */
    case 0x1e: /* FMAX */
    case 0x38: /* FMINNM */
    case 0x39: /* FMLS */
    case 0x3a: /* FSUB */
    case 0x3e: /* FMIN */
    case 0x5b: /* FMUL */
+    case 0x5c: /* FCMGE */
+    case 0x5d: /* FACGE */
    case 0x5f: /* FDIV */
+    case 0x7d: /* FACGT */
+    case 0x7c: /* FCMGT */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
    int pass;

    switch (fpopcode) {
-    case 0x4: /* FCMEQ */
    case 0x7: /* FRECPS */
    case 0xf: /* FRSQRTS */
-    case 0x14: /* FCMGE */
-    case 0x15: /* FACGE */
    case 0x1a: /* FABD */
-    case 0x1c: /* FCMGT */
-    case 0x1d: /* FACGT */
        pairwise = false;
        break;
    case 0x10: /* FMAXNMP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
    case 0x1: /* FMLA */
    case 0x2: /* FADD */
    case 0x3: /* FMULX */
+    case 0x4: /* FCMEQ */
    case 0x6: /* FMAX */
    case 0x8: /* FMINNM */
    case 0x9: /* FMLS */
    case 0xa: /* FSUB */
    case 0xe: /* FMIN */
    case 0x13: /* FMUL */
+    case 0x14: /* FCMGE */
+    case 0x15: /* FACGE */
    case 0x17: /* FDIV */
+    case 0x1c: /* FCMGT */
+    case 0x1d: /* FACGT */
        unallocated_encoding(s);
        return;
    }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
            read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);

            switch (fpopcode) {
-            case 0x4: /* FCMEQ */
-                gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x7: /* FRECPS */
                gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
            case 0xf: /* FRSQRTS */
                gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                break;
-            case 0x14: /* FCMGE */
-                gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x15: /* FACGE */
-                gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            case 0x1a: /* FABD */
                gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
                tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
                break;
-            case 0x1c: /* FCMGT */
-                gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x1d: /* FACGT */
-                gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
            default:
            case 0x0: /* FMAXNM */
            case 0x1: /* FMLA */
            case 0x2: /* FADD */
            case 0x3: /* FMULX */
+            case 0x4: /* FCMEQ */
            case 0x6: /* FMAX */
            case 0x8: /* FMINNM */
            case 0x9: /* FMLS */
            case 0xa: /* FSUB */
            case 0xe: /* FMIN */
            case 0x13: /* FMUL */
+            case 0x14: /* FCMGE */
+            case 0x15: /* FACGE */
            case 0x17: /* FDIV */
+            case 0x1c: /* FCMGT */
+            case 0x1d: /* FACGT */
                g_assert_not_reached();
            }

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_ceq(float32 op1, float32 op2, float_status *stat)
    return -float32_eq_quiet(op1, op2, stat);
}

+static uint64_t float64_ceq(float64 op1, float64 op2, float_status *stat)
+{
+    return -float64_eq_quiet(op1, op2, stat);
+}
+
static uint16_t float16_cge(float16 op1, float16 op2, float_status *stat)
{
    return -float16_le(op2, op1, stat);
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_cge(float32 op1, float32 op2, float_status *stat)
    return -float32_le(op2, op1, stat);
}

+static uint64_t float64_cge(float64 op1, float64 op2, float_status *stat)
+{
+    return -float64_le(op2, op1, stat);
+}
+
static uint16_t float16_cgt(float16 op1, float16 op2, float_status *stat)
{
    return -float16_lt(op2, op1, stat);
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_cgt(float32 op1, float32 op2, float_status *stat)
    return -float32_lt(op2, op1, stat);
}

+static uint64_t float64_cgt(float64 op1, float64 op2, float_status *stat)
+{
+    return -float64_lt(op2, op1, stat);
+}
+
static uint16_t float16_acge(float16 op1, float16 op2, float_status *stat)
{
    return -float16_le(float16_abs(op2), float16_abs(op1), stat);
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_acge(float32 op1, float32 op2, float_status *stat)
    return -float32_le(float32_abs(op2), float32_abs(op1), stat);
}

+static uint64_t float64_acge(float64 op1, float64 op2, float_status *stat)
+{
+    return -float64_le(float64_abs(op2), float64_abs(op1), stat);
+}
+
static uint16_t float16_acgt(float16 op1, float16 op2, float_status *stat)
{
    return -float16_lt(float16_abs(op2), float16_abs(op1), stat);
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_acgt(float32 op1, float32 op2, float_status *stat)
    return -float32_lt(float32_abs(op2), float32_abs(op1), stat);
}

+static uint64_t float64_acgt(float64 op1, float64 op2, float_status *stat)
+{
+    return -float64_lt(float64_abs(op2), float64_abs(op1), stat);
+}
+
static int16_t vfp_tosszh(float16 x, void *fpstp)
{
    float_status *fpst = fpstp;
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fabd_s, float32_abd, float32)

DO_3OP(gvec_fceq_h, float16_ceq, float16)
DO_3OP(gvec_fceq_s, float32_ceq, float32)
+DO_3OP(gvec_fceq_d, float64_ceq, float64)

DO_3OP(gvec_fcge_h, float16_cge, float16)
DO_3OP(gvec_fcge_s, float32_cge, float32)
+DO_3OP(gvec_fcge_d, float64_cge, float64)

DO_3OP(gvec_fcgt_h, float16_cgt, float16)
DO_3OP(gvec_fcgt_s, float32_cgt, float32)
+DO_3OP(gvec_fcgt_d, float64_cgt, float64)

DO_3OP(gvec_facge_h, float16_acge, float16)
DO_3OP(gvec_facge_s, float32_acge, float32)
+DO_3OP(gvec_facge_d, float64_acge, float64)

DO_3OP(gvec_facgt_h, float16_acgt, float16)
DO_3OP(gvec_facgt_s, float32_acgt, float32)
+DO_3OP(gvec_facgt_d, float64_acgt, float64)

DO_3OP(gvec_fmax_h, float16_max, float16)
DO_3OP(gvec_fmax_s, float32_max, float32)
--
2.34.1
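[Note: an aside on the new float64_ceq()/float64_cge()/... helpers in the second patch above: they return the *negated* boolean so that a true comparison yields an all-ones mask of the element width, which is what the AdvSIMD compare instructions are architecturally defined to produce. A quick standalone check of the idiom, not from the patch:]

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int equal = 1;                     /* pretend float64_eq_quiet() said "equal" */
    uint64_t mask = -(uint64_t)equal;  /* negating 1 gives the all-ones mask */

    printf("mask = %016" PRIx64 "\n", mask);  /* ffffffffffffffff */
    return 0;
}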
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This path uses cpu_loop_exit_restore to unwind current processor state.
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
5
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-27-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h            |  1 +
 target/arm/tcg/a64.decode      |  6 ++++
 target/arm/tcg/translate-a64.c | 60 ++++++++++++++++++++++------------
 target/arm/tcg/vec_helper.c    |  6 ++++
 4 files changed, 53 insertions(+), 20 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd
 FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h
 FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
 
+FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h
+FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd
+
 ### Advanced SIMD three same
 
 FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd
 FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h
 FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
 
+FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h
+FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_facgt = {
 };
 TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt)
 
+static void gen_fabd_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subh(d, n, m, s);
+    gen_vfp_absh(d, d);
+}
+
+static void gen_fabd_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subs(d, n, m, s);
+    gen_vfp_abss(d, d);
+}
+
+static void gen_fabd_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subd(d, n, m, s);
+    gen_vfp_absd(d, d);
+}
+
+static const FPScalar f_scalar_fabd = {
+    gen_fabd_h,
+    gen_fabd_s,
+    gen_fabd_d,
+};
+TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = {
 };
 TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt)
 
+static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = {
+    gen_helper_gvec_fabd_h,
+    gen_helper_gvec_fabd_s,
+    gen_helper_gvec_fabd_d,
+};
+TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x3f: /* FRSQRTS */
                 gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x7a: /* FABD */
-                gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
-                gen_vfp_absd(tcg_res, tcg_res);
-                break;
             default:
             case 0x18: /* FMAXNM */
             case 0x19: /* FMLA */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x5c: /* FCMGE */
             case 0x5d: /* FACGE */
             case 0x5f: /* FDIV */
+            case 0x7a: /* FABD */
            case 0x7c: /* FCMGT */
             case 0x7d: /* FACGT */
                 g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x3f: /* FRSQRTS */
                 gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x7a: /* FABD */
-                gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
-                gen_vfp_abss(tcg_res, tcg_res);
-                break;
             default:
             case 0x18: /* FMAXNM */
             case 0x19: /* FMLA */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x5c: /* FCMGE */
             case 0x5d: /* FACGE */
             case 0x5f: /* FDIV */
+            case 0x7a: /* FABD */
             case 0x7c: /* FCMGT */
             case 0x7d: /* FACGT */
                 g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         switch (fpopcode) {
         case 0x1f: /* FRECPS */
         case 0x3f: /* FRSQRTS */
-        case 0x7a: /* FABD */
             break;
         default:
         case 0x1b: /* FMULX */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x7d: /* FACGT */
         case 0x1c: /* FCMEQ */
         case 0x5c: /* FCMGE */
+        case 0x7a: /* FABD */
         case 0x7c: /* FCMGT */
             unallocated_encoding(s);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     switch (fpopcode) {
     case 0x07: /* FRECPS */
     case 0x0f: /* FRSQRTS */
-    case 0x1a: /* FABD */
         break;
     default:
     case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x14: /* FCMGE (reg) */
     case 0x15: /* FACGE */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT (reg) */
     case 0x1d: /* FACGT */
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     case 0x0f: /* FRSQRTS */
         gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
-    case 0x1a: /* FABD */
-        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-        tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
-        break;
     default:
     case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x14: /* FCMGE (reg) */
     case 0x15: /* FACGE */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT (reg) */
     case 0x1d: /* FACGT */
         g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         return;
     case 0x1f: /* FRECPS */
     case 0x3f: /* FRSQRTS */
-    case 0x7a: /* FABD */
         if (!fp_access_check(s)) {
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x5c: /* FCMGE */
     case 0x5d: /* FACGE */
     case 0x5f: /* FDIV */
+    case 0x7a: /* FABD */
     case 0x7d: /* FACGT */
     case 0x7c: /* FCMGT */
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     switch (fpopcode) {
     case 0x7: /* FRECPS */
     case 0xf: /* FRSQRTS */
-    case 0x1a: /* FABD */
         pairwise = false;
         break;
     case 0x10: /* FMAXNMP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
     case 0x17: /* FDIV */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT */
     case 0x1d: /* FACGT */
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0xf: /* FRSQRTS */
         gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
-    case 0x1a: /* FABD */
-        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-        tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
-        break;
     default:
     case 0x0: /* FMAXNM */
     case 0x1: /* FMLA */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
     case 0x17: /* FDIV */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT */
     case 0x1d: /* FACGT */
         g_assert_not_reached();
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
     return float32_abs(float32_sub(op1, op2, stat));
 }
 
+static float64 float64_abd(float64 op1, float64 op2, float_status *stat)
+{
+    return float64_abs(float64_sub(op1, op2, stat));
+}
+
 /*
  * Reciprocal step. These are the AArch32 version which uses a
  * non-fused multiply-and-subtract.
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
 
 DO_3OP(gvec_fabd_h, float16_abd, float16)
 DO_3OP(gvec_fabd_s, float32_abd, float32)
+DO_3OP(gvec_fabd_d, float64_abd, float64)
 
 DO_3OP(gvec_fceq_h, float16_ceq, float16)
 DO_3OP(gvec_fceq_s, float32_ceq, float32)
--
2.34.1
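A note on the conversion pattern above, for readers skimming the series: the a64.decode fragment declares the instruction encodings, the TRANS() lines bind each decoded pattern to an expander, and the FABD expanders themselves are just "subtract, then clear the sign bit". A minimal reference model of one lane, assuming plain IEEE single precision (the name fabd_ref is hypothetical, for illustration only):

    #include <math.h>

    /* One FABD lane: the difference is rounded first, then the
     * sign bit is cleared, mirroring gen_fabd_s above. */
    static float fabd_ref(float a, float b)
    {
        return fabsf(a - b);
    }

Because both steps in the real code go through the vfp helpers, the subtraction picks up the guest's FPCR rounding mode and can raise the usual IEEE exception flags, which a plain host-float sketch like this cannot show.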
From: Richard Henderson <richard.henderson@linaro.org>

These are the last instructions within handle_3same_float
and disas_simd_scalar_three_reg_same_fp16 so remove them.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-28-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  12 ++
 target/arm/tcg/translate-a64.c | 293 ++++-----------------------------
 2 files changed, 46 insertions(+), 259 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
 FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h
 FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd
 
+FRECPS_s 0101 1110 010 ..... 00111 1 ..... ..... @rrr_h
+FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd
+
+FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h
+FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
+
 ### Advanced SIMD three same
 
 FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
 FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h
 FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
 
+FRECPS_v 0.00 1110 010 ..... 00111 1 ..... ..... @qrrr_h
+FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
+
+FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h
+FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fabd = {
 };
 TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd)
 
+static const FPScalar f_scalar_frecps = {
+    gen_helper_recpsf_f16,
+    gen_helper_recpsf_f32,
+    gen_helper_recpsf_f64,
+};
+TRANS(FRECPS_s, do_fp3_scalar, a, &f_scalar_frecps)
+
+static const FPScalar f_scalar_frsqrts = {
+    gen_helper_rsqrtsf_f16,
+    gen_helper_rsqrtsf_f32,
+    gen_helper_rsqrtsf_f64,
+};
+TRANS(FRSQRTS_s, do_fp3_scalar, a, &f_scalar_frsqrts)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = {
 };
 TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd)
 
+static gen_helper_gvec_3_ptr * const f_vector_frecps[3] = {
+    gen_helper_gvec_recps_h,
+    gen_helper_gvec_recps_s,
+    gen_helper_gvec_recps_d,
+};
+TRANS(FRECPS_v, do_fp3_vector, a, f_vector_frecps)
+
+static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = {
+    gen_helper_gvec_rsqrts_h,
+    gen_helper_gvec_rsqrts_s,
+    gen_helper_gvec_rsqrts_d,
+};
+TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
     }
 }
 
-/* Handle the 3-same-operands float operations; shared by the scalar
- * and vector encodings. The caller must filter out any encodings
- * not allocated for the encoding it is dealing with.
- */
-static void handle_3same_float(DisasContext *s, int size, int elements,
-                               int fpopcode, int rd, int rn, int rm)
-{
-    int pass;
-    TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
-
-    for (pass = 0; pass < elements; pass++) {
-        if (size) {
-            /* Double */
-            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
-            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
-            TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-            read_vec_element(s, tcg_op1, rn, pass, MO_64);
-            read_vec_element(s, tcg_op2, rm, pass, MO_64);
-
-            switch (fpopcode) {
-            case 0x1f: /* FRECPS */
-                gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x3f: /* FRSQRTS */
-                gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            default:
-            case 0x18: /* FMAXNM */
-            case 0x19: /* FMLA */
-            case 0x1a: /* FADD */
-            case 0x1b: /* FMULX */
-            case 0x1c: /* FCMEQ */
-            case 0x1e: /* FMAX */
-            case 0x38: /* FMINNM */
-            case 0x39: /* FMLS */
-            case 0x3a: /* FSUB */
-            case 0x3e: /* FMIN */
-            case 0x5b: /* FMUL */
-            case 0x5c: /* FCMGE */
-            case 0x5d: /* FACGE */
-            case 0x5f: /* FDIV */
-            case 0x7a: /* FABD */
-            case 0x7c: /* FCMGT */
-            case 0x7d: /* FACGT */
-                g_assert_not_reached();
-            }
-
-            write_vec_element(s, tcg_res, rd, pass, MO_64);
-        } else {
-            /* Single */
-            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
-            TCGv_i32 tcg_res = tcg_temp_new_i32();
-
-            read_vec_element_i32(s, tcg_op1, rn, pass, MO_32);
-            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
-
-            switch (fpopcode) {
-            case 0x1f: /* FRECPS */
-                gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x3f: /* FRSQRTS */
-                gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            default:
-            case 0x18: /* FMAXNM */
-            case 0x19: /* FMLA */
-            case 0x1a: /* FADD */
-            case 0x1b: /* FMULX */
-            case 0x1c: /* FCMEQ */
-            case 0x1e: /* FMAX */
-            case 0x38: /* FMINNM */
-            case 0x39: /* FMLS */
-            case 0x3a: /* FSUB */
-            case 0x3e: /* FMIN */
-            case 0x5b: /* FMUL */
-            case 0x5c: /* FCMGE */
-            case 0x5d: /* FACGE */
-            case 0x5f: /* FDIV */
-            case 0x7a: /* FABD */
-            case 0x7c: /* FCMGT */
-            case 0x7d: /* FACGT */
-                g_assert_not_reached();
-            }
-
-            if (elements == 1) {
-                /* scalar single so clear high part */
-                TCGv_i64 tcg_tmp = tcg_temp_new_i64();
-
-                tcg_gen_extu_i32_i64(tcg_tmp, tcg_res);
-                write_vec_element(s, tcg_tmp, rd, pass, MO_64);
-            } else {
-                write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
-            }
-        }
-    }
-
-    clear_vec_high(s, elements * (size ? 8 : 4) > 8, rd);
-}
-
 /* AdvSIMD scalar three same
  * 31 30 29 28 24 23 22 21 20 16 15 11 10 9 5 4 0
  * +-----+---+-----------+------+---+------+--------+---+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     bool u = extract32(insn, 29, 1);
     TCGv_i64 tcg_rd;
 
-    if (opcode >= 0x18) {
-        /* Floating point: U, size[1] and opcode indicate operation */
-        int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
-        switch (fpopcode) {
-        case 0x1f: /* FRECPS */
-        case 0x3f: /* FRSQRTS */
-            break;
-        default:
-        case 0x1b: /* FMULX */
-        case 0x5d: /* FACGE */
-        case 0x7d: /* FACGT */
-        case 0x1c: /* FCMEQ */
-        case 0x5c: /* FCMGE */
-        case 0x7a: /* FABD */
-        case 0x7c: /* FCMGT */
-            unallocated_encoding(s);
-            return;
-        }
-
-        if (!fp_access_check(s)) {
-            return;
-        }
-
-        handle_3same_float(s, extract32(size, 0, 1), 1, fpopcode, rd, rn, rm);
-        return;
-    }
-
     switch (opcode) {
     case 0x1: /* SQADD, UQADD */
     case 0x5: /* SQSUB, UQSUB */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     write_fp_dreg(s, rd, tcg_rd);
 }
 
-/* AdvSIMD scalar three same FP16
- * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0
- * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+
- * | 0 1 | U | 1 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd |
- * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+
- * v: 0101 1110 0100 0000 0000 0100 0000 0000 => 5e400400
- * m: 1101 1111 0110 0000 1100 0100 0000 0000 => df60c400
- */
-static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
-                                                  uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int opcode = extract32(insn, 11, 3);
-    int rm = extract32(insn, 16, 5);
-    bool u = extract32(insn, 29, 1);
-    bool a = extract32(insn, 23, 1);
-    int fpopcode = opcode | (a << 3) | (u << 4);
-    TCGv_ptr fpst;
-    TCGv_i32 tcg_op1;
-    TCGv_i32 tcg_op2;
-    TCGv_i32 tcg_res;
-
-    switch (fpopcode) {
-    case 0x07: /* FRECPS */
-    case 0x0f: /* FRSQRTS */
-        break;
-    default:
-    case 0x03: /* FMULX */
-    case 0x04: /* FCMEQ (reg) */
-    case 0x14: /* FCMGE (reg) */
-    case 0x15: /* FACGE */
-    case 0x1a: /* FABD */
-    case 0x1c: /* FCMGT (reg) */
-    case 0x1d: /* FACGT */
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!dc_isar_feature(aa64_fp16, s)) {
-        unallocated_encoding(s);
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    fpst = fpstatus_ptr(FPST_FPCR_F16);
-
-    tcg_op1 = read_fp_hreg(s, rn);
-    tcg_op2 = read_fp_hreg(s, rm);
-    tcg_res = tcg_temp_new_i32();
-
-    switch (fpopcode) {
-    case 0x07: /* FRECPS */
-        gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    case 0x0f: /* FRSQRTS */
-        gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
-    default:
-    case 0x03: /* FMULX */
-    case 0x04: /* FCMEQ (reg) */
-    case 0x14: /* FCMGE (reg) */
-    case 0x15: /* FACGE */
-    case 0x1a: /* FABD */
-    case 0x1c: /* FCMGT (reg) */
-    case 0x1d: /* FACGT */
-        g_assert_not_reached();
-    }
-
-    write_fp_sreg(s, rd, tcg_res);
-}
-
 /* AdvSIMD scalar three same extra
  * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0
  * +-----+---+-----------+------+---+------+---+--------+---+----+----+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
 
 /* Pairwise op subgroup of C3.6.16.
  *
- * This is called directly or via the handle_3same_float for float pairwise
+ * This is called directly for float pairwise
  * operations where the opcode and size are calculated differently.
  */
 static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
 
-    int datasize = is_q ? 128 : 64;
-    int esize = 32 << size;
-    int elements = datasize / esize;
-
     if (size == 1 && !is_q) {
         unallocated_encoding(s);
         return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
                                rn, rm, rd);
         return;
-    case 0x1f: /* FRECPS */
-    case 0x3f: /* FRSQRTS */
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_3same_float(s, size, elements, fpopcode, rd, rn, rm);
-        return;
 
     case 0x1d: /* FMLAL */
     case 0x3d: /* FMLSL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x1b: /* FMULX */
     case 0x1c: /* FCMEQ */
     case 0x1e: /* FMAX */
+    case 0x1f: /* FRECPS */
     case 0x38: /* FMINNM */
     case 0x39: /* FMLS */
     case 0x3a: /* FSUB */
     case 0x3e: /* FMIN */
+    case 0x3f: /* FRSQRTS */
     case 0x5b: /* FMUL */
     case 0x5c: /* FCMGE */
     case 0x5d: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
      * together indicate the operation.
      */
     int fpopcode = opcode | (a << 3) | (u << 4);
-    int datasize = is_q ? 128 : 64;
-    int elements = datasize / 16;
     bool pairwise;
     TCGv_ptr fpst;
     int pass;
 
     switch (fpopcode) {
-    case 0x7: /* FRECPS */
-    case 0xf: /* FRSQRTS */
-        pairwise = false;
-        break;
     case 0x10: /* FMAXNMP */
     case 0x12: /* FADDP */
     case 0x16: /* FMAXP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0x3: /* FMULX */
     case 0x4: /* FCMEQ */
     case 0x6: /* FMAX */
+    case 0x7: /* FRECPS */
     case 0x8: /* FMINNM */
     case 0x9: /* FMLS */
     case 0xa: /* FSUB */
     case 0xe: /* FMIN */
+    case 0xf: /* FRSQRTS */
     case 0x13: /* FMUL */
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
             write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16);
         }
     } else {
-        for (pass = 0; pass < elements; pass++) {
-            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
-            TCGv_i32 tcg_res = tcg_temp_new_i32();
-
-            read_vec_element_i32(s, tcg_op1, rn, pass, MO_16);
-            read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);
-
-            switch (fpopcode) {
-            case 0x7: /* FRECPS */
-                gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0xf: /* FRSQRTS */
-                gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            default:
-            case 0x0: /* FMAXNM */
-            case 0x1: /* FMLA */
-            case 0x2: /* FADD */
-            case 0x3: /* FMULX */
-            case 0x4: /* FCMEQ */
-            case 0x6: /* FMAX */
-            case 0x8: /* FMINNM */
-            case 0x9: /* FMLS */
-            case 0xa: /* FSUB */
-            case 0xe: /* FMIN */
-            case 0x13: /* FMUL */
-            case 0x14: /* FCMGE */
-            case 0x15: /* FACGE */
-            case 0x17: /* FDIV */
-            case 0x1a: /* FABD */
-            case 0x1c: /* FCMGT */
-            case 0x1d: /* FACGT */
-                g_assert_not_reached();
-            }
-
-            write_vec_element_i32(s, tcg_res, rd, pass, MO_16);
-        }
+        g_assert_not_reached();
     }
 
     clear_vec_high(s, is_q, rd);
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
-    { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
     { 0x00000000, 0x00000000, NULL }
 };
 
--
2.34.1
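For context on what the tables above dispatch to: the recpsf_* and rsqrtsf_* helpers implement the Arm reciprocal and reciprocal-square-root step operations, which produce 2.0 - a*b and (3.0 - a*b) / 2.0 respectively (computed fused, with architected special-casing of the 0 * infinity combinations). A rough reference model under those caveats, using hypothetical names and plain, non-fused host arithmetic:

    /* One FRECPS lane; refining r toward 1/x uses r_new = r * frecps_ref(x, r). */
    static double frecps_ref(double a, double b)
    {
        return 2.0 - a * b;
    }

    /* One FRSQRTS lane; refining r toward 1/sqrt(x) uses
     * r_new = r * frsqrts_ref(x * r, r). */
    static double frsqrts_ref(double a, double b)
    {
        return (3.0 - a * b) / 2.0;
    }

Guest code pairs these with FRECPE/FRSQRTE estimates to build Newton-Raphson iterations, so the step must stay bit-exact to the architecture rather than being folded into a single division.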
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h            |  4 ++
 target/arm/tcg/a64.decode      | 12 +++++
 target/arm/tcg/translate-a64.c | 87 ++++++++++++++++++++++++++--------
 target/arm/tcg/vec_helper.c    | 23 +++++++++
 4 files changed, 105 insertions(+), 21 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_uclamp_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_uclamp_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "tcg/helper-a64.h"
 #include "tcg/helper-sve.h"
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 &ri rd imm
 &rri_sf rd rn imm sf
 &i imm
+&rr_e rd rn esz
 &rrr_e rd rn rm esz
 &rrx_e rd rn rm idx esz
 &qrr_e q rd rn esz
@@ -XXX,XX +XXX,XX @@
 &qrrx_e q rd rn rm idx esz
 &qrrrr_e q rd rn rm ra esz
 
+@rr_h ........ ... ..... ...... rn:5 rd:5 &rr_e esz=1
+@rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd
+
 @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
 @rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd
@@ -XXX,XX +XXX,XX @@ FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd
 FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h
 FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
 
+### Advanced SIMD scalar pairwise
+
+FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h
+FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd
+
 ### Advanced SIMD three same
 
 FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
 FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h
 FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
 
+FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h
+FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = {
 };
 TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts)
 
+static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = {
+    gen_helper_gvec_faddp_h,
+    gen_helper_gvec_faddp_s,
+    gen_helper_gvec_faddp_d,
+};
+TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
 TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
 TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
 
+/*
+ * Advanced SIMD scalar pairwise
+ */
+
+static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = tcg_temp_new_i64();
+            TCGv_i64 t1 = tcg_temp_new_i64();
+
+            read_vec_element(s, t0, a->rn, 0, MO_64);
+            read_vec_element(s, t1, a->rn, 1, MO_64);
+            f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = tcg_temp_new_i32();
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t0, a->rn, 0, MO_32);
+            read_vec_element_i32(s, t1, a->rn, 1, MO_32);
+            f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = tcg_temp_new_i32();
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t0, a->rn, 0, MO_16);
+            read_vec_element_i32(s, t1, a->rn, 1, MO_16);
+            f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+TRANS(FADDP_s, do_fp3_scalar_pair, a, &f_scalar_fadd)
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
         fpst = NULL;
         break;
     case 0xc: /* FMAXNMP */
-    case 0xd: /* FADDP */
     case 0xf: /* FMAXP */
     case 0x2c: /* FMINNMP */
     case 0x2f: /* FMINP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
         fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
         break;
     default:
+    case 0xd: /* FADDP */
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
         case 0xc: /* FMAXNMP */
             gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0xd: /* FADDP */
-            gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         case 0xf: /* FMAXP */
             gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
             gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         default:
+        case 0xd: /* FADDP */
            g_assert_not_reached();
        }
 
        write_fp_dreg(s, rd, tcg_res);
    } else {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
             case 0xc: /* FMAXNMP */
                 gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0xd: /* FADDP */
-                gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
             case 0xf: /* FMAXP */
                 gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
                 gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0xd: /* FADDP */
                 g_assert_not_reached();
             }
         } else {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
             case 0xc: /* FMAXNMP */
                 gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0xd: /* FADDP */
-                gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
             case 0xf: /* FMAXP */
                 gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
                 gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0xd: /* FADDP */
                 g_assert_not_reached();
             }
         }
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
             case 0x58: /* FMAXNMP */
                 gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x5a: /* FADDP */
-                gen_helper_vfp_addd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
             case 0x5e: /* FMAXP */
                 gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
                 gen_helper_vfp_mind(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x5a: /* FADDP */
                 g_assert_not_reached();
             }
         }
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
             case 0x58: /* FMAXNMP */
                 gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x5a: /* FADDP */
-                gen_helper_vfp_adds(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
             case 0x5e: /* FMAXP */
                 gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
                 gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x5a: /* FADDP */
                 g_assert_not_reached();
             }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
 
     switch (fpopcode) {
     case 0x58: /* FMAXNMP */
-    case 0x5a: /* FADDP */
     case 0x5e: /* FMAXP */
     case 0x78: /* FMINNMP */
     case 0x7e: /* FMINP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x3a: /* FSUB */
     case 0x3e: /* FMIN */
     case 0x3f: /* FRSQRTS */
+    case 0x5a: /* FADDP */
     case 0x5b: /* FMUL */
     case 0x5c: /* FCMGE */
     case 0x5d: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
 
     switch (fpopcode) {
     case 0x10: /* FMAXNMP */
-    case 0x12: /* FADDP */
     case 0x16: /* FMAXP */
     case 0x18: /* FMINNMP */
     case 0x1e: /* FMINP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0xa: /* FSUB */
     case 0xe: /* FMIN */
     case 0xf: /* FRSQRTS */
+    case 0x12: /* FADDP */
     case 0x13: /* FMUL */
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                 gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2,
                                            fpst);
                 break;
-            case 0x12: /* FADDP */
-                gen_helper_advsimd_addh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
             case 0x16: /* FMAXP */
                 gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                 gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x12: /* FADDP */
                 g_assert_not_reached();
             }
         }
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_NEON_PAIRWISE(neon_pmin, min)
 
 #undef DO_NEON_PAIRWISE
 
+#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+{ \
+    ARMVectorReg scratch; \
+    intptr_t oprsz = simd_oprsz(desc); \
+    intptr_t half = oprsz / sizeof(TYPE) / 2; \
+    TYPE *d = vd, *n = vn, *m = vm; \
+    if (unlikely(d == m)) { \
+        m = memcpy(&scratch, m, oprsz); \
+    } \
+    for (intptr_t i = 0; i < half; ++i) { \
+        d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)], stat); \
+    } \
+    for (intptr_t i = 0; i < half; ++i) { \
+        d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)], stat); \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2)
+DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4)
+DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, )
+
 #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
 void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
 { \
--
2.34.1
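The DO_3OP_PAIR macro introduced above fixes the operand layout shared by all the vector pairwise operations: the low half of the destination is built from adjacent pairs of the first source, the high half from adjacent pairs of the second. A self-contained sketch of that layout for a four-lane single-precision FADDP, with a hypothetical name and plain host floats standing in for the softfloat calls:

    /* d = { n[0]+n[1], n[2]+n[3], m[0]+m[1], m[2]+m[3] } */
    static void faddp4_ref(float d[4], const float n[4], const float m[4])
    {
        float tmp[4];

        tmp[0] = n[0] + n[1];
        tmp[1] = n[2] + n[3];
        tmp[2] = m[0] + m[1];
        tmp[3] = m[2] + m[3];
        /* Write through a temporary so that d aliasing n or m is safe,
         * mirroring the memcpy-to-scratch in DO_3OP_PAIR. */
        for (int i = 0; i < 4; i++) {
            d[i] = tmp[i];
        }
    }

In the real helper only the d == m case needs the scratch copy: the first loop's writes to d[i] trail its reads of n[2*i], but the second loop's writes to d[half + i] can overtake its reads of m[2*i].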
From: Richard Henderson <richard.henderson@linaro.org>

These are the last instructions within disas_simd_three_reg_same_fp16,
so remove it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h            |  16 ++
 target/arm/tcg/a64.decode      |  24 +++
 target/arm/tcg/translate-a64.c | 296 ++++++---------------------------
 target/arm/tcg/vec_helper.c    |  16 ++
 4 files changed, 107 insertions(+), 245 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_fmaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmaxp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fminp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "tcg/helper-a64.h"
 #include "tcg/helper-sve.h"
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
 FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h
 FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd
 
+FMAXP_s 0101 1110 0011 0000 1111 10 ..... ..... @rr_h
+FMAXP_s 0111 1110 0.11 0000 1111 10 ..... ..... @rr_sd
+
+FMINP_s 0101 1110 1011 0000 1111 10 ..... ..... @rr_h
+FMINP_s 0111 1110 1.11 0000 1111 10 ..... ..... @rr_sd
+
+FMAXNMP_s 0101 1110 0011 0000 1100 10 ..... ..... @rr_h
+FMAXNMP_s 0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd
+
+FMINNMP_s 0101 1110 1011 0000 1100 10 ..... ..... @rr_h
+FMINNMP_s 0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd
+
 ### Advanced SIMD three same
 
 FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
 FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h
 FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
 
+FMAXP_v 0.10 1110 010 ..... 00110 1 ..... ..... @qrrr_h
+FMAXP_v 0.10 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd
+
+FMINP_v 0.10 1110 110 ..... 00110 1 ..... ..... @qrrr_h
+FMINP_v 0.10 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd
+
+FMAXNMP_v 0.10 1110 010 ..... 00000 1 ..... ..... @qrrr_h
+FMAXNMP_v 0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
+
+FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
+FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = {
 };
 TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp)
 
+static gen_helper_gvec_3_ptr * const f_vector_fmaxp[3] = {
+    gen_helper_gvec_fmaxp_h,
+    gen_helper_gvec_fmaxp_s,
+    gen_helper_gvec_fmaxp_d,
+};
+TRANS(FMAXP_v, do_fp3_vector, a, f_vector_fmaxp)
+
+static gen_helper_gvec_3_ptr * const f_vector_fminp[3] = {
+    gen_helper_gvec_fminp_h,
+    gen_helper_gvec_fminp_s,
+    gen_helper_gvec_fminp_d,
+};
+TRANS(FMINP_v, do_fp3_vector, a, f_vector_fminp)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmaxnmp[3] = {
+    gen_helper_gvec_fmaxnump_h,
+    gen_helper_gvec_fmaxnump_s,
+    gen_helper_gvec_fmaxnump_d,
+};
+TRANS(FMAXNMP_v, do_fp3_vector, a, f_vector_fmaxnmp)
+
+static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
+    gen_helper_gvec_fminnump_h,
+    gen_helper_gvec_fminnump_s,
+    gen_helper_gvec_fminnump_d,
+};
+TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f)
 }
 
 TRANS(FADDP_s, do_fp3_scalar_pair, a, &f_scalar_fadd)
+TRANS(FMAXP_s, do_fp3_scalar_pair, a, &f_scalar_fmax)
+TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin)
+TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm)
+TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm)
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
     int opcode = extract32(insn, 12, 5);
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
-    TCGv_ptr fpst;
 
     /* For some ops (the FP ones), size[1] is part of the encoding.
      * For ADDP strictly it is not but size[1] is always 1 for valid
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
         if (!fp_access_check(s)) {
             return;
         }
-
-        fpst = NULL;
         break;
-    case 0xc: /* FMAXNMP */
-    case 0xf: /* FMAXP */
-    case 0x2c: /* FMINNMP */
-    case 0x2f: /* FMINP */
-        /* FP op, size[0] is 32 or 64 bit*/
-        if (!u) {
-            if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) {
-                unallocated_encoding(s);
-                return;
-            } else {
-                size = MO_16;
-            }
-        } else {
-            size = extract32(size, 0, 1) ? MO_64 : MO_32;
-        }
-
-        if (!fp_access_check(s)) {
-            return;
-        }
-
-        fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
-        break;
     default:
+    case 0xc: /* FMAXNMP */
     case 0xd: /* FADDP */
+    case 0xf: /* FMAXP */
+    case 0x2c: /* FMINNMP */
+    case 0x2f: /* FMINP */
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
         case 0x3b: /* ADDP */
             tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2);
             break;
-        case 0xc: /* FMAXNMP */
-            gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
-        case 0xf: /* FMAXP */
-            gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
-        case 0x2c: /* FMINNMP */
-            gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
-        case 0x2f: /* FMINP */
-            gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
-            break;
         default:
+        case 0xc: /* FMAXNMP */
         case 0xd: /* FADDP */
+        case 0xf: /* FMAXP */
+        case 0x2c: /* FMINNMP */
+        case 0x2f: /* FMINP */
             g_assert_not_reached();
         }
 
         write_fp_dreg(s, rd, tcg_res);
     } else {
-        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
-        TCGv_i32 tcg_res = tcg_temp_new_i32();
-
-        read_vec_element_i32(s, tcg_op1, rn, 0, size);
-        read_vec_element_i32(s, tcg_op2, rn, 1, size);
-
-        if (size == MO_16) {
-            switch (opcode) {
-            case 0xc: /* FMAXNMP */
-                gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0xf: /* FMAXP */
-                gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x2c: /* FMINNMP */
-                gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x2f: /* FMINP */
-                gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            default:
-            case 0xd: /* FADDP */
-                g_assert_not_reached();
-            }
-        } else {
-            switch (opcode) {
-            case 0xc: /* FMAXNMP */
-                gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0xf: /* FMAXP */
-                gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x2c: /* FMINNMP */
-                gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x2f: /* FMINP */
-                gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
-            default:
-            case 0xd: /* FADDP */
-                g_assert_not_reached();
-            }
-        }
-
-        write_fp_sreg(s, rd, tcg_res);
+        g_assert_not_reached();
     }
 }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
 static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
                                    int size, int rn, int rm, int rd)
 {
-    TCGv_ptr fpst;
     int pass;
 
-    /* Floating point operations need fpst */
-    if (opcode >= 0x58) {
-        fpst = fpstatus_ptr(FPST_FPCR);
-    } else {
-        fpst = NULL;
-    }
-
     if (!fp_access_check(s)) {
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
             case 0x17: /* ADDP */
                 tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2);
                 break;
-            case 0x58: /* FMAXNMP */
-                gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x5e: /* FMAXP */
-                gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x78: /* FMINNMP */
-                gen_helper_vfp_minnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
-                break;
-            case 0x7e: /* FMINP */
-                gen_helper_vfp_mind(
289
- break;
290
default:
291
+ case 0x58: /* FMAXNMP */
292
case 0x5a: /* FADDP */
293
+ case 0x5e: /* FMAXP */
294
+ case 0x78: /* FMINNMP */
295
+ case 0x7e: /* FMINP */
296
g_assert_not_reached();
297
}
298
}
299
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
300
genfn = fns[size][u];
301
break;
302
}
303
- /* The FP operations are all on single floats (32 bit) */
304
- case 0x58: /* FMAXNMP */
305
- gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
306
- break;
307
- case 0x5e: /* FMAXP */
308
- gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst);
309
- break;
310
- case 0x78: /* FMINNMP */
311
- gen_helper_vfp_minnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
312
- break;
313
- case 0x7e: /* FMINP */
314
- gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst);
315
- break;
316
default:
317
+ case 0x58: /* FMAXNMP */
318
case 0x5a: /* FADDP */
319
+ case 0x5e: /* FMAXP */
320
+ case 0x78: /* FMINNMP */
321
+ case 0x7e: /* FMINP */
322
g_assert_not_reached();
323
}
324
325
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
326
}
327
328
switch (fpopcode) {
329
- case 0x58: /* FMAXNMP */
330
- case 0x5e: /* FMAXP */
331
- case 0x78: /* FMINNMP */
332
- case 0x7e: /* FMINP */
333
- if (size && !is_q) {
334
- unallocated_encoding(s);
335
- return;
336
- }
337
- handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
338
- rn, rm, rd);
339
- return;
340
-
341
case 0x1d: /* FMLAL */
342
case 0x3d: /* FMLSL */
343
case 0x59: /* FMLAL2 */
344
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
345
case 0x3a: /* FSUB */
346
case 0x3e: /* FMIN */
347
case 0x3f: /* FRSQRTS */
348
+ case 0x58: /* FMAXNMP */
349
case 0x5a: /* FADDP */
350
case 0x5b: /* FMUL */
351
case 0x5c: /* FCMGE */
352
case 0x5d: /* FACGE */
353
+ case 0x5e: /* FMAXP */
354
case 0x5f: /* FDIV */
355
+ case 0x78: /* FMINNMP */
356
case 0x7a: /* FABD */
357
case 0x7d: /* FACGT */
358
case 0x7c: /* FCMGT */
359
+ case 0x7e: /* FMINP */
360
unallocated_encoding(s);
361
return;
362
}
363
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
364
}
365
}
366
367
-/*
368
- * Advanced SIMD three same (ARMv8.2 FP16 variants)
369
- *
370
- * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0
371
- * +---+---+---+-----------+---------+------+-----+--------+---+------+------+
372
- * | 0 | Q | U | 0 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd |
373
- * +---+---+---+-----------+---------+------+-----+--------+---+------+------+
374
- *
375
- * This includes FMULX, FCMEQ (register), FRECPS, FRSQRTS, FCMGE
376
- * (register), FACGE, FABD, FCMGT (register) and FACGT.
377
- *
378
- */
379
-static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
216
-{
380
-{
217
- /* Return the exception level that this SPSR is requesting a return to,
381
- int opcode = extract32(insn, 11, 3);
218
- * or -1 if it is invalid (an illegal return)
382
- int u = extract32(insn, 29, 1);
383
- int a = extract32(insn, 23, 1);
384
- int is_q = extract32(insn, 30, 1);
385
- int rm = extract32(insn, 16, 5);
386
- int rn = extract32(insn, 5, 5);
387
- int rd = extract32(insn, 0, 5);
388
- /*
389
- * For these floating point ops, the U, a and opcode bits
390
- * together indicate the operation.
219
- */
391
- */
220
- if (spsr & PSTATE_nRW) {
392
- int fpopcode = opcode | (a << 3) | (u << 4);
221
- switch (spsr & CPSR_M) {
393
- bool pairwise;
222
- case ARM_CPU_MODE_USR:
394
- TCGv_ptr fpst;
223
- return 0;
395
- int pass;
224
- case ARM_CPU_MODE_HYP:
396
-
225
- return 2;
397
- switch (fpopcode) {
226
- case ARM_CPU_MODE_FIQ:
398
- case 0x10: /* FMAXNMP */
227
- case ARM_CPU_MODE_IRQ:
399
- case 0x16: /* FMAXP */
228
- case ARM_CPU_MODE_SVC:
400
- case 0x18: /* FMINNMP */
229
- case ARM_CPU_MODE_ABT:
401
- case 0x1e: /* FMINP */
230
- case ARM_CPU_MODE_UND:
402
- pairwise = true;
231
- case ARM_CPU_MODE_SYS:
403
- break;
232
- return 1;
404
- default:
233
- case ARM_CPU_MODE_MON:
405
- case 0x0: /* FMAXNM */
234
- /* Returning to Mon from AArch64 is never possible,
406
- case 0x1: /* FMLA */
235
- * so this is an illegal return.
407
- case 0x2: /* FADD */
236
- */
408
- case 0x3: /* FMULX */
237
- default:
409
- case 0x4: /* FCMEQ */
238
- return -1;
410
- case 0x6: /* FMAX */
411
- case 0x7: /* FRECPS */
412
- case 0x8: /* FMINNM */
413
- case 0x9: /* FMLS */
414
- case 0xa: /* FSUB */
415
- case 0xe: /* FMIN */
416
- case 0xf: /* FRSQRTS */
417
- case 0x12: /* FADDP */
418
- case 0x13: /* FMUL */
419
- case 0x14: /* FCMGE */
420
- case 0x15: /* FACGE */
421
- case 0x17: /* FDIV */
422
- case 0x1a: /* FABD */
423
- case 0x1c: /* FCMGT */
424
- case 0x1d: /* FACGT */
425
- unallocated_encoding(s);
426
- return;
427
- }
428
-
429
- if (!dc_isar_feature(aa64_fp16, s)) {
430
- unallocated_encoding(s);
431
- return;
432
- }
433
-
434
- if (!fp_access_check(s)) {
435
- return;
436
- }
437
-
438
- fpst = fpstatus_ptr(FPST_FPCR_F16);
439
-
440
- if (pairwise) {
441
- int maxpass = is_q ? 8 : 4;
442
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
443
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
444
- TCGv_i32 tcg_res[8];
445
-
446
- for (pass = 0; pass < maxpass; pass++) {
447
- int passreg = pass < (maxpass / 2) ? rn : rm;
448
- int passelt = (pass << 1) & (maxpass - 1);
449
-
450
- read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_16);
451
- read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_16);
452
- tcg_res[pass] = tcg_temp_new_i32();
453
-
454
- switch (fpopcode) {
455
- case 0x10: /* FMAXNMP */
456
- gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2,
457
- fpst);
458
- break;
459
- case 0x16: /* FMAXP */
460
- gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
461
- break;
462
- case 0x18: /* FMINNMP */
463
- gen_helper_advsimd_minnumh(tcg_res[pass], tcg_op1, tcg_op2,
464
- fpst);
465
- break;
466
- case 0x1e: /* FMINP */
467
- gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
468
- break;
469
- default:
470
- case 0x12: /* FADDP */
471
- g_assert_not_reached();
472
- }
473
- }
474
-
475
- for (pass = 0; pass < maxpass; pass++) {
476
- write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16);
239
- }
477
- }
240
- } else {
478
- } else {
241
- if (extract32(spsr, 1, 1)) {
479
- g_assert_not_reached();
242
- /* Return with reserved M[1] bit set */
243
- return -1;
244
- }
245
- if (extract32(spsr, 0, 4) == 1) {
246
- /* return to EL0 with M[0] bit set */
247
- return -1;
248
- }
249
- return extract32(spsr, 2, 2);
250
- }
480
- }
481
-
482
- clear_vec_high(s, is_q, rd);
251
-}
483
-}
252
-
484
-
253
-void HELPER(exception_return)(CPUARMState *env)
485
/* AdvSIMD three same extra
254
-{
486
* 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0
255
- int cur_el = arm_current_el(env);
487
* +---+---+---+-----------+------+---+------+---+--------+---+----+----+
256
- unsigned int spsr_idx = aarch64_banked_spsr_index(cur_el);
488
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
257
- uint32_t spsr = env->banked_spsr[spsr_idx];
489
{ 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
258
- int new_el;
490
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
259
- bool return_to_aa64 = (spsr & PSTATE_nRW) == 0;
491
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
260
-
492
- { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
261
- aarch64_save_sp(env, cur_el);
493
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
262
-
494
{ 0x00000000, 0x00000000, NULL }
263
- arm_clear_exclusive(env);
495
};
264
-
496
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
265
- /* We must squash the PSTATE.SS bit to zero unless both of the
497
index XXXXXXX..XXXXXXX 100644
266
- * following hold:
498
--- a/target/arm/tcg/vec_helper.c
267
- * 1. debug exceptions are currently disabled
499
+++ b/target/arm/tcg/vec_helper.c
268
- * 2. singlestep will be active in the EL we return to
500
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2)
269
- * We check 1 here and 2 after we've done the pstate/cpsr write() to
501
DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4)
270
- * transition to the EL we're going to.
502
DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, )
271
- */
503
272
- if (arm_generate_debug_exceptions(env)) {
504
+DO_3OP_PAIR(gvec_fmaxp_h, float16_max, float16, H2)
273
- spsr &= ~PSTATE_SS;
505
+DO_3OP_PAIR(gvec_fmaxp_s, float32_max, float32, H4)
274
- }
506
+DO_3OP_PAIR(gvec_fmaxp_d, float64_max, float64, )
275
-
507
+
276
- new_el = el_from_spsr(spsr);
508
+DO_3OP_PAIR(gvec_fminp_h, float16_min, float16, H2)
277
- if (new_el == -1) {
509
+DO_3OP_PAIR(gvec_fminp_s, float32_min, float32, H4)
278
- goto illegal_return;
510
+DO_3OP_PAIR(gvec_fminp_d, float64_min, float64, )
279
- }
511
+
280
- if (new_el > cur_el
512
+DO_3OP_PAIR(gvec_fmaxnump_h, float16_maxnum, float16, H2)
281
- || (new_el == 2 && !arm_feature(env, ARM_FEATURE_EL2))) {
513
+DO_3OP_PAIR(gvec_fmaxnump_s, float32_maxnum, float32, H4)
282
- /* Disallow return to an EL which is unimplemented or higher
514
+DO_3OP_PAIR(gvec_fmaxnump_d, float64_maxnum, float64, )
283
- * than the current one.
515
+
284
- */
516
+DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2)
285
- goto illegal_return;
517
+DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4)
286
- }
518
+DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, )
287
-
519
+
288
- if (new_el != 0 && arm_el_is_aa64(env, new_el) != return_to_aa64) {
520
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
289
- /* Return to an EL which is configured for a different register width */
521
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
290
- goto illegal_return;
522
{ \
291
- }
292
-
293
- if (new_el == 2 && arm_is_secure_below_el3(env)) {
294
- /* Return to the non-existent secure-EL2 */
295
- goto illegal_return;
296
- }
297
-
298
- if (new_el == 1 && (arm_hcr_el2_eff(env) & HCR_TGE)) {
299
- goto illegal_return;
300
- }
301
-
302
- qemu_mutex_lock_iothread();
303
- arm_call_pre_el_change_hook(arm_env_get_cpu(env));
304
- qemu_mutex_unlock_iothread();
305
-
306
- if (!return_to_aa64) {
307
- env->aarch64 = 0;
308
- /* We do a raw CPSR write because aarch64_sync_64_to_32()
309
- * will sort the register banks out for us, and we've already
310
- * caught all the bad-mode cases in el_from_spsr().
311
- */
312
- cpsr_write(env, spsr, ~0, CPSRWriteRaw);
313
- if (!arm_singlestep_active(env)) {
314
- env->uncached_cpsr &= ~PSTATE_SS;
315
- }
316
- aarch64_sync_64_to_32(env);
317
-
318
- if (spsr & CPSR_T) {
319
- env->regs[15] = env->elr_el[cur_el] & ~0x1;
320
- } else {
321
- env->regs[15] = env->elr_el[cur_el] & ~0x3;
322
- }
323
- qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
324
- "AArch32 EL%d PC 0x%" PRIx32 "\n",
325
- cur_el, new_el, env->regs[15]);
326
- } else {
327
- env->aarch64 = 1;
328
- pstate_write(env, spsr);
329
- if (!arm_singlestep_active(env)) {
330
- env->pstate &= ~PSTATE_SS;
331
- }
332
- aarch64_restore_sp(env, new_el);
333
- env->pc = env->elr_el[cur_el];
334
- qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
335
- "AArch64 EL%d PC 0x%" PRIx64 "\n",
336
- cur_el, new_el, env->pc);
337
- }
338
- /*
339
- * Note that cur_el can never be 0. If new_el is 0, then
340
- * el0_a64 is return_to_aa64, else el0_a64 is ignored.
341
- */
342
- aarch64_sve_change_el(env, cur_el, new_el, return_to_aa64);
343
-
344
- qemu_mutex_lock_iothread();
345
- arm_call_el_change_hook(arm_env_get_cpu(env));
346
- qemu_mutex_unlock_iothread();
347
-
348
- return;
349
-
350
-illegal_return:
351
- /* Illegal return events of various kinds have architecturally
352
- * mandated behaviour:
353
- * restore NZCV and DAIF from SPSR_ELx
354
- * set PSTATE.IL
355
- * restore PC from ELR_ELx
356
- * no change to exception level, execution state or stack pointer
357
- */
358
- env->pstate |= PSTATE_IL;
359
- env->pc = env->elr_el[cur_el];
360
- spsr &= PSTATE_NZCV | PSTATE_DAIF;
361
- spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF);
362
- pstate_write(env, spsr);
363
- if (!arm_singlestep_active(env)) {
364
- env->pstate &= ~PSTATE_SS;
365
- }
366
- qemu_log_mask(LOG_GUEST_ERROR, "Illegal exception return at EL%d: "
367
- "resuming execution at 0x%" PRIx64 "\n", cur_el, env->pc);
368
-}
369
-
370
/* Return true if the linked breakpoint entry lbn passes its checks */
371
static bool linked_bp_matches(ARMCPU *cpu, int lbn)
372
{
373
--
523
--
374
2.20.1
524
2.34.1
375
376
diff view generated by jsdifflib
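(For readers following along: every DO_3OP_PAIR instance above expands to the
same AdvSIMD pairwise shape -- the low half of the destination is
pairwise-combined from Vn, the high half from Vm. Below is a minimal
standalone C model of that semantics, not QEMU code; the names pairwise_f32
and combine_fn and the fixed scratch size are invented for this sketch, and
float_status/NaN handling is deliberately omitted.)

    #include <math.h>
    #include <stddef.h>
    #include <string.h>

    typedef float (*combine_fn)(float, float);

    static float max_f32(float a, float b) { return fmaxf(a, b); }

    /* d[i] = fn(n[2i], n[2i+1]); d[i+half] = fn(m[2i], m[2i+1]).
     * Copy m first in case d aliases m: writes to d's low half would
     * otherwise clobber m's pairs before they are read. Writes into d
     * never outrun reads from n, so d == n needs no copy. */
    static void pairwise_f32(float *d, const float *n, const float *m,
                             size_t elems /* assumed <= 16 */, combine_fn fn)
    {
        float scratch[16];
        size_t half = elems / 2;

        if (d == m) {
            memcpy(scratch, m, elems * sizeof(float));
            m = scratch;
        }
        for (size_t i = 0; i < half; i++) {
            d[i] = fn(n[2 * i], n[2 * i + 1]);
        }
        for (size_t i = 0; i < half; i++) {
            d[i + half] = fn(m[2 * i], m[2 * i + 1]);
        }
    }

With fn = max_f32 this models FMAXP over a 4 x float vector; FADDP, FMINP and
the FMAXNMP/FMINNMP variants differ only in the combining function (and, in
QEMU proper, in the softfloat status handling the sketch leaves out).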
From: Richard Henderson <richard.henderson@linaro.org>

This function is, or will shortly become, too big to inline.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 48 +++++----------------------------------------
 target/arm/helper.c | 44 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
 }
 
 /* Return the MMU index for a v7M CPU in the specified security and
- * privilege state
+ * privilege state.
  */
-static inline ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
-                                                              bool secstate,
-                                                              bool priv)
-{
-    ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
-
-    if (priv) {
-        mmu_idx |= ARM_MMU_IDX_M_PRIV;
-    }
-
-    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
-        mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
-    }
-
-    if (secstate) {
-        mmu_idx |= ARM_MMU_IDX_M_S;
-    }
-
-    return mmu_idx;
-}
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                                                bool secstate, bool priv);
 
 /* Return the MMU index for a v7M CPU in the specified security state */
-static inline ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env,
-                                                     bool secstate)
-{
-    bool priv = arm_current_el(env) != 0;
-
-    return arm_v7m_mmu_idx_for_secstate_and_priv(env, secstate, priv);
-}
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate);
 
 /* Determine the current mmu_idx to use for normal loads/stores */
-static inline int cpu_mmu_index(CPUARMState *env, bool ifetch)
-{
-    int el = arm_current_el(env);
-
-    if (arm_feature(env, ARM_FEATURE_M)) {
-        ARMMMUIdx mmu_idx = arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
-
-        return arm_to_core_mmu_idx(mmu_idx);
-    }
-
-    if (el < 2 && arm_is_secure_below_el3(env)) {
-        return arm_to_core_mmu_idx(ARMMMUIdx_S1SE0 + el);
-    }
-    return el;
-}
+int cpu_mmu_index(CPUARMState *env, bool ifetch);
 
 /* Indexes used when registering address spaces with cpu_address_space_init */
 typedef enum ARMASIdx {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
     return 0;
 }
 
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                                                bool secstate, bool priv)
+{
+    ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
+
+    if (priv) {
+        mmu_idx |= ARM_MMU_IDX_M_PRIV;
+    }
+
+    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
+        mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
+    }
+
+    if (secstate) {
+        mmu_idx |= ARM_MMU_IDX_M_S;
+    }
+
+    return mmu_idx;
+}
+
+/* Return the MMU index for a v7M CPU in the specified security state */
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
+{
+    bool priv = arm_current_el(env) != 0;
+
+    return arm_v7m_mmu_idx_for_secstate_and_priv(env, secstate, priv);
+}
+
+int cpu_mmu_index(CPUARMState *env, bool ifetch)
+{
+    int el = arm_current_el(env);
+
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        ARMMMUIdx mmu_idx = arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
+
+        return arm_to_core_mmu_idx(mmu_idx);
+    }
+
+    if (el < 2 && arm_is_secure_below_el3(env)) {
+        return arm_to_core_mmu_idx(ARMMMUIdx_S1SE0 + el);
+    }
+    return el;
+}
+
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-31-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 7 -----
 target/arm/tcg/translate-neon.c | 55 ++-------------------------------
 target/arm/tcg/vec_helper.c | 45 ---------------------------
 3 files changed, 3 insertions(+), 104 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)
 
-DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_pminh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_padds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_pmaxs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_pmins, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-
 DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h)
 DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h)
 DO_3S_FP_GVEC(VRECPS, gen_helper_gvec_recps_nf_s, gen_helper_gvec_recps_nf_h)
 DO_3S_FP_GVEC(VRSQRTS, gen_helper_gvec_rsqrts_nf_s, gen_helper_gvec_rsqrts_nf_h)
+DO_3S_FP_GVEC(VPADD, gen_helper_gvec_faddp_s, gen_helper_gvec_faddp_h)
+DO_3S_FP_GVEC(VPMAX, gen_helper_gvec_fmaxp_s, gen_helper_gvec_fmaxp_h)
+DO_3S_FP_GVEC(VPMIN, gen_helper_gvec_fminp_s, gen_helper_gvec_fminp_h)
 
 WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
 WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
     return do_3same(s, a, gen_VMINNM_fp32_3s);
 }
 
-static bool do_3same_fp_pair(DisasContext *s, arg_3same *a,
-                             gen_helper_gvec_3_ptr *fn)
-{
-    /* FP pairwise operations */
-    TCGv_ptr fpstatus;
-
-    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-        return false;
-    }
-
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vd | a->vn | a->vm) & 0x10)) {
-        return false;
-    }
-
-    if (!vfp_access_check(s)) {
-        return true;
-    }
-
-    assert(a->q == 0); /* enforced by decode patterns */
-
-    fpstatus = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD);
-    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
-                       vfp_reg_offset(1, a->vn),
-                       vfp_reg_offset(1, a->vm),
-                       fpstatus, 8, 8, 0, fn);
-
-    return true;
-}
-
-/*
- * For all the functions using this macro, size == 1 means fp16,
- * which is an architecture extension we don't implement yet.
- */
-#define DO_3S_FP_PAIR(INSN,FUNC) \
-    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
-    { \
-        if (a->size == MO_16) { \
-            if (!dc_isar_feature(aa32_fp16_arith, s)) { \
-                return false; \
-            } \
-            return do_3same_fp_pair(s, a, FUNC##h); \
-        } \
-        return do_3same_fp_pair(s, a, FUNC##s); \
-    }
-
-DO_3S_FP_PAIR(VPADD, gen_helper_neon_padd)
-DO_3S_FP_PAIR(VPMAX, gen_helper_neon_pmax)
-DO_3S_FP_PAIR(VPMIN, gen_helper_neon_pmin)
-
 static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
 {
     /* Handle a 2-reg-shift insn which can be vectorized. */
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ABA(gvec_uaba_d, uint64_t)
 
 #undef DO_ABA
 
-#define DO_NEON_PAIRWISE(NAME, OP) \
-    void HELPER(NAME##s)(void *vd, void *vn, void *vm, \
-                         void *stat, uint32_t oprsz) \
-    { \
-        float_status *fpst = stat; \
-        float32 *d = vd; \
-        float32 *n = vn; \
-        float32 *m = vm; \
-        float32 r0, r1; \
-        \
-        /* Read all inputs before writing outputs in case vm == vd */ \
-        r0 = float32_##OP(n[H4(0)], n[H4(1)], fpst); \
-        r1 = float32_##OP(m[H4(0)], m[H4(1)], fpst); \
-        \
-        d[H4(0)] = r0; \
-        d[H4(1)] = r1; \
-    } \
-    \
-    void HELPER(NAME##h)(void *vd, void *vn, void *vm, \
-                         void *stat, uint32_t oprsz) \
-    { \
-        float_status *fpst = stat; \
-        float16 *d = vd; \
-        float16 *n = vn; \
-        float16 *m = vm; \
-        float16 r0, r1, r2, r3; \
-        \
-        /* Read all inputs before writing outputs in case vm == vd */ \
-        r0 = float16_##OP(n[H2(0)], n[H2(1)], fpst); \
-        r1 = float16_##OP(n[H2(2)], n[H2(3)], fpst); \
-        r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst); \
-        r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst); \
-        \
-        d[H2(0)] = r0; \
-        d[H2(1)] = r1; \
-        d[H2(2)] = r2; \
-        d[H2(3)] = r3; \
-    }
-
-DO_NEON_PAIRWISE(neon_padd, add)
-DO_NEON_PAIRWISE(neon_pmax, max)
-DO_NEON_PAIRWISE(neon_pmin, min)
-
-#undef DO_NEON_PAIRWISE
-
 #define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 { \
--
2.34.1
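(A side note on the cpu_mmu_index move above: the M-profile index is just a
base value with orthogonal privilege/negative-priority/security bits OR'd in,
so nothing about it depends on translation-time state. A standalone sketch of
that composition -- the constants below are invented stand-ins, not QEMU's
real ARM_MMU_IDX_M* values:)

    #include <stdbool.h>

    enum {
        MMU_IDX_M_BASE   = 0x40,   /* stand-in for ARM_MMU_IDX_M */
        MMU_IDX_M_PRIV   = 1 << 0, /* privileged execution */
        MMU_IDX_M_NEGPRI = 1 << 1, /* negative execution priority */
        MMU_IDX_M_SECURE = 1 << 2, /* secure state */
    };

    /* Compose an M-profile translation-regime index from its inputs. */
    static int m_profile_mmu_idx(bool priv, bool negpri, bool secure)
    {
        int idx = MMU_IDX_M_BASE;

        if (priv) {
            idx |= MMU_IDX_M_PRIV;
        }
        if (negpri) {
            idx |= MMU_IDX_M_NEGPRI;
        }
        if (secure) {
            idx |= MMU_IDX_M_SECURE;
        }
        return idx;
    }

Since the eight possible results are cheap to compute, the cost of moving the
function out of line is just the call itself, which is the trade the commit
message is making against code size.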
From: Aaron Lindsay <aaron@os.amperecomputing.com>

pmccntr_read and pmccntr_write contained duplicate code that was already
being handled by pmccntr_sync. Consolidate the duplicated code into two
functions: pmccntr_op_start and pmccntr_op_finish. Add a companion to
c15_ccnt in CPUARMState so that we can simultaneously save both the
architectural register value and the last underlying cycle count - this
ensures time isn't lost and will also allow us to access the 'old'
architectural register value in order to detect overflows in later
patches.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Signed-off-by: Aaron Lindsay <aclindsa@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20181211151945.29137-3-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 37 +++++++++++---
 target/arm/helper.c | 118 ++++++++++++++++++++++++++------------------
 2 files changed, 100 insertions(+), 55 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
         uint64_t oslsr_el1; /* OS Lock Status */
         uint64_t mdcr_el2;
         uint64_t mdcr_el3;
-        /* If the counter is enabled, this stores the last time the counter
-         * was reset. Otherwise it stores the counter value
+        /* Stores the architectural value of the counter *the last time it was
+         * updated* by pmccntr_op_start. Accesses should always be surrounded
+         * by pmccntr_op_start/pmccntr_op_finish to guarantee the latest
+         * architecturally-correct value is being read/set.
          */
         uint64_t c15_ccnt;
+        /* Stores the delta between the architectural value and the underlying
+         * cycle count during normal operation. It is used to update c15_ccnt
+         * to be the correct architectural value before accesses. During
+         * accesses, c15_ccnt_delta contains the underlying count being used
+         * for the access, after which it reverts to the delta value in
+         * pmccntr_op_finish.
+         */
+        uint64_t c15_ccnt_delta;
         uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
         uint64_t vpidr_el2; /* Virtualization Processor ID Register */
         uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */
@@ -XXX,XX +XXX,XX @@ int cpu_arm_signal_handler(int host_signum, void *pinfo,
                            void *puc);
 
 /**
- * pmccntr_sync
+ * pmccntr_op_start/finish
  * @env: CPUARMState
  *
- * Synchronises the counter in the PMCCNTR. This must always be called twice,
- * once before any action that might affect the timer and again afterwards.
- * The function is used to swap the state of the register if required.
- * This only happens when not in user mode (!CONFIG_USER_ONLY)
+ * Convert the counter in the PMCCNTR between its delta form (the typical mode
+ * when it's enabled) and the guest-visible value. These two calls must always
+ * surround any action which might affect the counter.
  */
-void pmccntr_sync(CPUARMState *env);
+void pmccntr_op_start(CPUARMState *env);
+void pmccntr_op_finish(CPUARMState *env);
+
+/**
+ * pmu_op_start/finish
+ * @env: CPUARMState
+ *
+ * Convert all PMU counters between their delta form (the typical mode when
+ * they are enabled) and the guest-visible values. These two calls must
+ * surround any action which might affect the counters.
+ */
+void pmu_op_start(CPUARMState *env);
+void pmu_op_finish(CPUARMState *env);
 
 /* SCTLR bit meanings. Several bits have been reused in newer
  * versions of the architecture; in that case we define constants
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
 
     return true;
 }
-
-void pmccntr_sync(CPUARMState *env)
+
+/*
+ * Ensure c15_ccnt is the guest-visible count so that operations such as
+ * enabling/disabling the counter or filtering, modifying the count itself,
+ * etc. can be done logically. This is essentially a no-op if the counter is
+ * not enabled at the time of the call.
+ */
+void pmccntr_op_start(CPUARMState *env)
 {
-    uint64_t temp_ticks;
-
-    temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+    uint64_t cycles = 0;
+    cycles = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
 
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        temp_ticks /= 64;
-    }
-
     if (arm_ccnt_enabled(env)) {
-        env->cp15.c15_ccnt = temp_ticks - env->cp15.c15_ccnt;
+        uint64_t eff_cycles = cycles;
+        if (env->cp15.c9_pmcr & PMCRD) {
+            /* Increment once every 64 processor clock cycles */
+            eff_cycles /= 64;
+        }
+
+        env->cp15.c15_ccnt = eff_cycles - env->cp15.c15_ccnt_delta;
     }
+    env->cp15.c15_ccnt_delta = cycles;
+}
+
+/*
+ * If PMCCNTR is enabled, recalculate the delta between the clock and the
+ * guest-visible count. A call to pmccntr_op_finish should follow every call to
+ * pmccntr_op_start.
+ */
+void pmccntr_op_finish(CPUARMState *env)
+{
+    if (arm_ccnt_enabled(env)) {
+        uint64_t prev_cycles = env->cp15.c15_ccnt_delta;
+
+        if (env->cp15.c9_pmcr & PMCRD) {
+            /* Increment once every 64 processor clock cycles */
+            prev_cycles /= 64;
+        }
+
+        env->cp15.c15_ccnt_delta = prev_cycles - env->cp15.c15_ccnt;
+    }
+}
+
+void pmu_op_start(CPUARMState *env)
+{
+    pmccntr_op_start(env);
+}
+
+void pmu_op_finish(CPUARMState *env)
+{
+    pmccntr_op_finish(env);
 }
 
 static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
-    pmccntr_sync(env);
+    pmu_op_start(env);
 
     if (value & PMCRC) {
         /* The counter has been reset */
@@ -XXX,XX +XXX,XX @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->cp15.c9_pmcr &= ~0x39;
     env->cp15.c9_pmcr |= (value & 0x39);
 
-    pmccntr_sync(env);
+    pmu_op_finish(env);
 }
 
 static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-    uint64_t total_ticks;
-
-    if (!arm_ccnt_enabled(env)) {
-        /* Counter is disabled, do not change value */
-        return env->cp15.c15_ccnt;
-    }
-
-    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
-
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        total_ticks /= 64;
-    }
-    return total_ticks - env->cp15.c15_ccnt;
+    uint64_t ret;
+    pmccntr_op_start(env);
+    ret = env->cp15.c15_ccnt;
+    pmccntr_op_finish(env);
+    return ret;
 }
 
 static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void pmccntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                           uint64_t value)
 {
-    uint64_t total_ticks;
-
-    if (!arm_ccnt_enabled(env)) {
-        /* Counter is disabled, set the absolute value */
-        env->cp15.c15_ccnt = value;
-        return;
-    }
-
-    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
-
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        total_ticks /= 64;
-    }
-    env->cp15.c15_ccnt = total_ticks - value;
+    pmccntr_op_start(env);
+    env->cp15.c15_ccnt = value;
+    pmccntr_op_finish(env);
 }
 
 static void pmccntr_write32(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void pmccntr_write32(CPUARMState *env, const ARMCPRegInfo *ri,
 
 #else /* CONFIG_USER_ONLY */
 
-void pmccntr_sync(CPUARMState *env)
+void pmccntr_op_start(CPUARMState *env)
+{
+}
+
+void pmccntr_op_finish(CPUARMState *env)
+{
+}
+
+void pmu_op_start(CPUARMState *env)
+{
+}
+
+void pmu_op_finish(CPUARMState *env)
 {
 }
 
@@ -XXX,XX +XXX,XX @@ void pmccntr_sync(CPUARMState *env)
 static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
-    pmccntr_sync(env);
+    pmccntr_op_start(env);
     env->cp15.pmccfiltr_el0 = value & 0xfc000000;
-    pmccntr_sync(env);
+    pmccntr_op_finish(env);
 }
 
 static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-32-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 5 ++
 target/arm/tcg/translate.h | 3 +
 target/arm/tcg/a64.decode | 6 ++
 target/arm/tcg/gengvec.c | 12 ++++
 target/arm/tcg/translate-a64.c | 128 ++++++---------------------------
 target/arm/tcg/vec_helper.c | 30 ++++++++
 6 files changed, 77 insertions(+), 107 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_addp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "tcg/helper-a64.h"
 #include "tcg/helper-sve.h"
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 &qrrrr_e        q rd rn rm ra esz
 
 @rr_h           ........ ... ..... ...... rn:5 rd:5  &rr_e esz=1
+@rr_d           ........ ... ..... ...... rn:5 rd:5  &rr_e esz=3
 @rr_sd          ........ ... ..... ...... rn:5 rd:5  &rr_e esz=%esz_sd
 
 @rrr_h          ........ ... rm:5 ...... rn:5 rd:5   &rrr_e esz=1
@@ -XXX,XX +XXX,XX @@
 
 @qrrr_h         . q:1 ...... ... rm:5 ...... rn:5 rd:5  &qrrr_e esz=1
 @qrrr_sd        . q:1 ...... ... rm:5 ...... rn:5 rd:5  &qrrr_e esz=%esz_sd
+@qrrr_e         . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e
 
 @qrrx_h         . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
                 &qrrx_e esz=1 idx=%hlm
@@ -XXX,XX +XXX,XX @@ FMAXNMP_s       0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd
 FMINNMP_s       0101 1110 1011 0000 1100 10 ..... ..... @rr_h
 FMINNMP_s       0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd
 
+ADDP_s          0101 1110 1111 0001 1011 10 ..... ..... @rr_d
+
 ### Advanced SIMD three same
 
 FADD_v          0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -XXX,XX +XXX,XX @@ FMAXNMP_v       0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
 FMINNMP_v       0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
 FMINNMP_v       0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
 
+ADDP_v          0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si         0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     };
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
+
+void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_addp_b,
+        gen_helper_gvec_addp_h,
+        gen_helper_gvec_addp_s,
+        gen_helper_gvec_addp_d,
+    };
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
+}
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
 };
 TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
 
+TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin)
 TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm)
 TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm)
 
+static bool trans_ADDP_s(DisasContext *s, arg_rr_e *a)
+{
+    if (fp_access_check(s)) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+
+        read_vec_element(s, t0, a->rn, 0, MO_64);
+        read_vec_element(s, t1, a->rn, 1, MO_64);
+        tcg_gen_add_i64(t0, t0, t1);
+        write_fp_dreg(s, a->rd, t0);
+    }
+    return true;
+}
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
     }
 }
 
-/* AdvSIMD scalar pairwise
- *  31 30  29 28       24 23 22 21       17 16    12 11 10 9    5 4    0
- * +-----+---+-----------+------+-----------+--------+-----+------+------+
- * | 0 1 | U | 1 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
- * +-----+---+-----------+------+-----------+--------+-----+------+------+
- */
-static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
-{
-    int u = extract32(insn, 29, 1);
-    int size = extract32(insn, 22, 2);
-    int opcode = extract32(insn, 12, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    /* For some ops (the FP ones), size[1] is part of the encoding.
-     * For ADDP strictly it is not but size[1] is always 1 for valid
-     * encodings.
-     */
-    opcode |= (extract32(size, 1, 1) << 5);
-
-    switch (opcode) {
-    case 0x3b: /* ADDP */
-        if (u || size != 3) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (!fp_access_check(s)) {
-            return;
-        }
-        break;
-    default:
-    case 0xc: /* FMAXNMP */
-    case 0xd: /* FADDP */
-    case 0xf: /* FMAXP */
-    case 0x2c: /* FMINNMP */
-    case 0x2f: /* FMINP */
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (size == MO_64) {
-        TCGv_i64 tcg_op1 = tcg_temp_new_i64();
-        TCGv_i64 tcg_op2 = tcg_temp_new_i64();
-        TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-        read_vec_element(s, tcg_op1, rn, 0, MO_64);
-        read_vec_element(s, tcg_op2, rn, 1, MO_64);
-
-        switch (opcode) {
-        case 0x3b: /* ADDP */
-            tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2);
-            break;
-        default:
-        case 0xc: /* FMAXNMP */
-        case 0xd: /* FADDP */
-        case 0xf: /* FMAXP */
-        case 0x2c: /* FMINNMP */
-        case 0x2f: /* FMINP */
-            g_assert_not_reached();
-        }
-
-        write_fp_dreg(s, rd, tcg_res);
-    } else {
-        g_assert_not_reached();
-    }
-}
-
 /*
  * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
  *
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
      * adjacent elements being operated on to produce an element in the result.
      */
     if (size == 3) {
-        TCGv_i64 tcg_res[2];
-
-        for (pass = 0; pass < 2; pass++) {
-            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
-            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
-            int passreg = (pass == 0) ? rn : rm;
-
-            read_vec_element(s, tcg_op1, passreg, 0, MO_64);
-            read_vec_element(s, tcg_op2, passreg, 1, MO_64);
-            tcg_res[pass] = tcg_temp_new_i64();
-
-            switch (opcode) {
-            case 0x17: /* ADDP */
-                tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            default:
-            case 0x58: /* FMAXNMP */
-            case 0x5a: /* FADDP */
-            case 0x5e: /* FMAXP */
-            case 0x78: /* FMINNMP */
-            case 0x7e: /* FMINP */
-                g_assert_not_reached();
-            }
-        }
-
-        for (pass = 0; pass < 2; pass++) {
-            write_vec_element(s, tcg_res[pass], rd, pass, MO_64);
-        }
+        g_assert_not_reached();
     } else {
         int maxpass = is_q ? 4 : 2;
         TCGv_i32 tcg_res[4];
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
             tcg_res[pass] = tcg_temp_new_i32();
 
             switch (opcode) {
-            case 0x17: /* ADDP */
-            {
-                static NeonGenTwoOpFn * const fns[3] = {
-                    gen_helper_neon_padd_u8,
-                    gen_helper_neon_padd_u16,
-                    tcg_gen_add_i32,
-                };
-                genfn = fns[size];
-                break;
-            }
             case 0x14: /* SMAXP, UMAXP */
             {
                 static NeonGenTwoOpFn * const fns[3][2] = {
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
                 break;
             }
             default:
+            case 0x17: /* ADDP */
             case 0x58: /* FMAXNMP */
             case 0x5a: /* FADDP */
             case 0x5e: /* FMAXP */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
     case 0x3: /* logic ops */
         disas_simd_3same_logic(s, insn);
         break;
-    case 0x17: /* ADDP */
     case 0x14: /* SMAXP, UMAXP */
     case 0x15: /* SMINP, UMINP */
     {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
     default:
         disas_simd_3same_int(s, insn);
         break;
+    case 0x17: /* ADDP */
+        unallocated_encoding(s);
+        break;
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra },
     { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
-    { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2)
 DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4)
 DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, )
 
+#undef DO_3OP_PAIR
+
+#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    ARMVectorReg scratch; \
+    intptr_t oprsz = simd_oprsz(desc); \
+    intptr_t half = oprsz / sizeof(TYPE) / 2; \
+    TYPE *d = vd, *n = vn, *m = vm; \
+    if (unlikely(d == m)) { \
+        m = memcpy(&scratch, m, oprsz); \
+    } \
+    for (intptr_t i = 0; i < half; ++i) { \
+        d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)]); \
+    } \
+    for (intptr_t i = 0; i < half; ++i) { \
+        d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)]); \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+#define ADD(A, B) (A + B)
+DO_3OP_PAIR(gvec_addp_b, ADD, uint8_t, H1)
+DO_3OP_PAIR(gvec_addp_h, ADD, uint16_t, H2)
+DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4)
+DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, )
+#undef ADD
+
+#undef DO_3OP_PAIR
+
 #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
 void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
 { \
--
2.34.1
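(The delta representation described in Aaron's commit message above is
easiest to see in isolation. Here is a standalone model of the start/finish
round-trip -- the struct and function names are stand-ins, the raw cycle
source is abstracted to a parameter, and the PMCRD divide-by-64 prescaler is
omitted:)

    #include <stdbool.h>
    #include <stdint.h>

    struct ccnt {
        uint64_t value; /* c15_ccnt: guest-visible between start and finish */
        uint64_t delta; /* c15_ccnt_delta */
        bool enabled;
    };

    /* Convert to the guest-visible value. While idle, 'delta' holds
     * (cycles_then - value_then), so subtracting it from the current
     * cycle count advances the counter by exactly the elapsed cycles. */
    static void ccnt_op_start(struct ccnt *c, uint64_t cycles)
    {
        if (c->enabled) {
            c->value = cycles - c->delta;
        }
        c->delta = cycles; /* remember the raw count used for this access */
    }

    /* Convert back to delta form; must follow every ccnt_op_start. */
    static void ccnt_op_finish(struct ccnt *c)
    {
        if (c->enabled) {
            c->delta = c->delta - c->value;
        }
    }

A PMCCNTR read then becomes "ccnt_op_start(c, now); r = c->value;
ccnt_op_finish(c);", and a write stores to c->value between the same two
calls -- the same shape pmccntr_read and pmccntr_write take in the patch.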
From: Richard Henderson <richard.henderson@linaro.org>

We need to reuse this from helper-a64.c.  Provide a stub
definition for CONFIG_USER_ONLY.  This matches the stub
definitions that we removed for arm_regime_tbi{0,1} before.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h | 17 +++++++++++++++++
 target/arm/helper.c | 4 ++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ typedef struct ARMVAParameters {
     bool using64k   : 1;
 } ARMVAParameters;
 
+#ifdef CONFIG_USER_ONLY
+static inline ARMVAParameters aa64_va_parameters(CPUARMState *env,
+                                                 uint64_t va,
+                                                 ARMMMUIdx mmu_idx, bool data)
+{
+    return (ARMVAParameters) {
+        /* 48-bit address space */
+        .tsz = 16,
+        /* We can't handle tagged addresses properly in user-only mode */
+        .tbi = false,
+    };
+}
+#else
+ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+                                   ARMMMUIdx mmu_idx, bool data);
+#endif
+
 #endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint8_t convert_stage2_attrs(CPUARMState *env, uint8_t s2attrs)
     return (hiattr << 6) | (hihint << 4) | (loattr << 2) | lohint;
 }
 
-static ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
-                                          ARMMMUIdx mmu_idx, bool data)
+ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+                                   ARMMMUIdx mmu_idx, bool data)
 {
     uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
     uint32_t el = regime_el(env, mmu_idx);
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-33-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 2 --
 target/arm/tcg/neon_helper.c | 5 -----
 target/arm/tcg/translate-neon.c | 3 +--
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64)
 
 DEF_HELPER_2(neon_add_u8, i32, i32, i32)
 DEF_HELPER_2(neon_add_u16, i32, i32, i32)
-DEF_HELPER_2(neon_padd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_padd_u16, i32, i32, i32)
 DEF_HELPER_2(neon_sub_u8, i32, i32, i32)
 DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/neon_helper.c
+++ b/target/arm/tcg/neon_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_add_u16)(uint32_t a, uint32_t b)
     return (a + b) ^ mask;
 }
 
-#define NEON_FN(dest, src1, src2) dest = src1 + src2
-NEON_POP(padd_u8, neon_u8, 4)
-NEON_POP(padd_u16, neon_u16, 2)
-#undef NEON_FN
-
 #define NEON_FN(dest, src1, src2) dest = src1 - src2
 NEON_VOP(sub_u8, neon_u8, 4)
 NEON_VOP(sub_u16, neon_u16, 2)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
 DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
 DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
 DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
+DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp)
 
 #define DO_3SAME_CMP(INSN, COND) \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
 #define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
 #define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
 #define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
-#define gen_helper_neon_padd_u32 tcg_gen_add_i32
 
 DO_3SAME_PAIR(VPMAX_S, pmax_s)
 DO_3SAME_PAIR(VPMIN_S, pmin_s)
 DO_3SAME_PAIR(VPMAX_U, pmax_u)
 DO_3SAME_PAIR(VPMIN_U, pmin_u)
-DO_3SAME_PAIR(VPADD, padd_u)
 
 #define DO_3SAME_VQDMULH(INSN, FUNC) \
     WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
--
2.34.1
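(Why the A32 VPADD above can reuse gen_gvec_addp directly: integer pairwise
addition has the same shape at every element size and vector length, so one
out-of-line helper per element size serves both front ends. A standalone
model for the 2 x 32-bit D-register case, illustrative only:)

    #include <stdint.h>

    /* VPADD.I32 Dd, Dn, Dm: d = { n[0]+n[1], m[0]+m[1] }.
     * Both results are computed before any store so that d may alias
     * n or m -- the same hazard the removed DO_NEON_PAIRWISE comment
     * ("read all inputs before writing outputs") guarded against. */
    static void vpadd_u32(uint32_t d[2], const uint32_t n[2],
                          const uint32_t m[2])
    {
        uint32_t r0 = n[0] + n[1];
        uint32_t r1 = m[0] + m[1];

        d[0] = r0;
        d[1] = r1;
    }

The A64 ADDP_v expansion is the identical operation over a full Q register,
and the scalar ADDP converted earlier is just the one-pair special case
(Vd = Vn.d[0] + Vn.d[1]).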
From: Richard Henderson <richard.henderson@linaro.org>

The arm_regime_tbi{0,1} functions are replaceable with the new function
by giving the lowest and highest address.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190108223129.5570-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 35 -----------------------
 target/arm/helper.c | 70 ++++++++++++++++-----------------------
 2 files changed, 24 insertions(+), 81 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool arm_cpu_bswap_data(CPUARMState *env)
 }
 #endif
 
-#ifndef CONFIG_USER_ONLY
-/**
- * arm_regime_tbi0:
- * @env: CPUARMState
- * @mmu_idx: MMU index indicating required translation regime

[this patch is cut short here in the archived page]

From: Richard Henderson <richard.henderson@linaro.org>

These are the last instructions within handle_simd_3same_pair
so remove it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-34-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 16 +++++
 target/arm/tcg/translate.h | 8 +++
 target/arm/tcg/a64.decode | 4 ++
 target/arm/tcg/gengvec.c | 48 +++++++++++++
 target/arm/tcg/translate-a64.c | 119 +++++----------------------------
 target/arm/tcg/vec_helper.c | 16 +++++
 6 files changed, 109 insertions(+), 102 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_smaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_sminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_umaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "tcg/helper-a64.h"
 #include "tcg/helper-sve.h"
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 
 void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMINNMP_v       0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
 FMINNMP_v       0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
 
 ADDP_v          0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e
+SMAXP_v         0.00 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
+SMINP_v         0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
+UMAXP_v         0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
+UMINP_v         0.10 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
 
 ### Advanced SIMD scalar x indexed element
 
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     };
     tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
 }
+
+void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_smaxp_b,
+        gen_helper_gvec_smaxp_h,
+        gen_helper_gvec_smaxp_s,
+    };
+    tcg_debug_assert(vece <= MO_32);
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
+}
+
+void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_sminp_b,
+        gen_helper_gvec_sminp_h,
+        gen_helper_gvec_sminp_s,
+    };
+    tcg_debug_assert(vece <= MO_32);
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
+}
+
+void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_umaxp_b,
+        gen_helper_gvec_umaxp_h,
+        gen_helper_gvec_umaxp_s,
+    };
+    tcg_debug_assert(vece <= MO_32);
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
+}
+
+void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_uminp_b,
+        gen_helper_gvec_uminp_h,
+        gen_helper_gvec_uminp_s,
+    };
+    tcg_debug_assert(vece <= MO_32);
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
+}
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
     return true;
 }
 
+static bool do_gvec_fn3_no64(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
+{
+    if (a->esz == MO_64) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz);
+    }
+    return true;
+}
+
 static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn)
 {
     if (!a->q && a->esz == MO_64) {
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
 TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
 
 TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
167
/*
168
* Advanced SIMD scalar/vector x indexed element
169
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
170
}
171
}
172
173
-/* Pairwise op subgroup of C3.6.16.
28
- *
174
- *
29
- * Extracts the TBI0 value from the appropriate TCR for the current EL
175
- * This is called directly for float pairwise
30
- *
176
- * operations where the opcode and size are calculated differently.
31
- * Returns: the TBI0 value.
32
- */
177
- */
33
-uint32_t arm_regime_tbi0(CPUARMState *env, ARMMMUIdx mmu_idx);
178
-static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
34
-
179
- int size, int rn, int rm, int rd)
35
-/**
36
- * arm_regime_tbi1:
37
- * @env: CPUARMState
38
- * @mmu_idx: MMU index indicating required translation regime
39
- *
40
- * Extracts the TBI1 value from the appropriate TCR for the current EL
41
- *
42
- * Returns: the TBI1 value.
43
- */
44
-uint32_t arm_regime_tbi1(CPUARMState *env, ARMMMUIdx mmu_idx);
45
-#else
46
-/* We can't handle tagged addresses properly in user-only mode */
47
-static inline uint32_t arm_regime_tbi0(CPUARMState *env, ARMMMUIdx mmu_idx)
48
-{
180
-{
49
- return 0;
181
- int pass;
50
-}
182
-
51
-
183
- if (!fp_access_check(s)) {
52
-static inline uint32_t arm_regime_tbi1(CPUARMState *env, ARMMMUIdx mmu_idx)
184
- return;
53
-{
185
- }
54
- return 0;
186
-
55
-}
187
- /* These operations work on the concatenated rm:rn, with each pair of
56
-#endif
188
- * adjacent elements being operated on to produce an element in the result.
57
-
58
void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
59
target_ulong *cs_base, uint32_t *flags);
60
61
diff --git a/target/arm/helper.c b/target/arm/helper.c
62
index XXXXXXX..XXXXXXX 100644
63
--- a/target/arm/helper.c
64
+++ b/target/arm/helper.c
65
@@ -XXX,XX +XXX,XX @@ static inline ARMMMUIdx stage_1_mmu_idx(ARMMMUIdx mmu_idx)
66
return mmu_idx;
67
}
68
69
-/* Returns TBI0 value for current regime el */
70
-uint32_t arm_regime_tbi0(CPUARMState *env, ARMMMUIdx mmu_idx)
71
-{
72
- TCR *tcr;
73
- uint32_t el;
74
-
75
- /* For EL0 and EL1, TBI is controlled by stage 1's TCR, so convert
76
- * a stage 1+2 mmu index into the appropriate stage 1 mmu index.
77
- */
189
- */
78
- mmu_idx = stage_1_mmu_idx(mmu_idx);
190
- if (size == 3) {
79
-
191
- g_assert_not_reached();
80
- tcr = regime_tcr(env, mmu_idx);
81
- el = regime_el(env, mmu_idx);
82
-
83
- if (el > 1) {
84
- return extract64(tcr->raw_tcr, 20, 1);
85
- } else {
192
- } else {
86
- return extract64(tcr->raw_tcr, 37, 1);
193
- int maxpass = is_q ? 4 : 2;
194
- TCGv_i32 tcg_res[4];
195
-
196
- for (pass = 0; pass < maxpass; pass++) {
197
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
198
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
199
- NeonGenTwoOpFn *genfn = NULL;
200
- int passreg = pass < (maxpass / 2) ? rn : rm;
201
- int passelt = (is_q && (pass & 1)) ? 2 : 0;
202
-
203
- read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_32);
204
- read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_32);
205
- tcg_res[pass] = tcg_temp_new_i32();
206
-
207
- switch (opcode) {
208
- case 0x14: /* SMAXP, UMAXP */
209
- {
210
- static NeonGenTwoOpFn * const fns[3][2] = {
211
- { gen_helper_neon_pmax_s8, gen_helper_neon_pmax_u8 },
212
- { gen_helper_neon_pmax_s16, gen_helper_neon_pmax_u16 },
213
- { tcg_gen_smax_i32, tcg_gen_umax_i32 },
214
- };
215
- genfn = fns[size][u];
216
- break;
217
- }
218
- case 0x15: /* SMINP, UMINP */
219
- {
220
- static NeonGenTwoOpFn * const fns[3][2] = {
221
- { gen_helper_neon_pmin_s8, gen_helper_neon_pmin_u8 },
222
- { gen_helper_neon_pmin_s16, gen_helper_neon_pmin_u16 },
223
- { tcg_gen_smin_i32, tcg_gen_umin_i32 },
224
- };
225
- genfn = fns[size][u];
226
- break;
227
- }
228
- default:
229
- case 0x17: /* ADDP */
230
- case 0x58: /* FMAXNMP */
231
- case 0x5a: /* FADDP */
232
- case 0x5e: /* FMAXP */
233
- case 0x78: /* FMINNMP */
234
- case 0x7e: /* FMINP */
235
- g_assert_not_reached();
236
- }
237
-
238
- /* FP ops called directly, otherwise call now */
239
- if (genfn) {
240
- genfn(tcg_res[pass], tcg_op1, tcg_op2);
241
- }
242
- }
243
-
244
- for (pass = 0; pass < maxpass; pass++) {
245
- write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_32);
246
- }
247
- clear_vec_high(s, is_q, rd);
87
- }
248
- }
88
-}
249
-}
89
-
250
-
90
-/* Returns TBI1 value for current regime el */
251
/* Floating point op subgroup of C3.6.16. */
91
-uint32_t arm_regime_tbi1(CPUARMState *env, ARMMMUIdx mmu_idx)
252
static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
92
-{
253
{
93
- TCR *tcr;
254
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
94
- uint32_t el;
255
case 0x3: /* logic ops */
95
-
256
disas_simd_3same_logic(s, insn);
96
- /* For EL0 and EL1, TBI is controlled by stage 1's TCR, so convert
257
break;
97
- * a stage 1+2 mmu index into the appropriate stage 1 mmu index.
258
- case 0x14: /* SMAXP, UMAXP */
98
- */
259
- case 0x15: /* SMINP, UMINP */
99
- mmu_idx = stage_1_mmu_idx(mmu_idx);
260
- {
100
-
261
- /* Pairwise operations */
101
- tcr = regime_tcr(env, mmu_idx);
262
- int is_q = extract32(insn, 30, 1);
102
- el = regime_el(env, mmu_idx);
263
- int u = extract32(insn, 29, 1);
103
-
264
- int size = extract32(insn, 22, 2);
104
- if (el > 1) {
265
- int rm = extract32(insn, 16, 5);
105
- return 0;
266
- int rn = extract32(insn, 5, 5);
106
- } else {
267
- int rd = extract32(insn, 0, 5);
107
- return extract64(tcr->raw_tcr, 38, 1);
268
- if (opcode == 0x17) {
269
- if (u || (size == 3 && !is_q)) {
270
- unallocated_encoding(s);
271
- return;
272
- }
273
- } else {
274
- if (size == 3) {
275
- unallocated_encoding(s);
276
- return;
277
- }
278
- }
279
- handle_simd_3same_pair(s, is_q, u, opcode, size, rn, rm, rd);
280
- break;
108
- }
281
- }
109
-}
282
case 0x18 ... 0x31:
110
-
283
/* floating point ops, sz[1] and U are part of opcode */
111
/* Return the TTBR associated with this translation regime */
284
disas_simd_3same_float(s, insn);
112
static inline uint64_t regime_ttbr(CPUARMState *env, ARMMMUIdx mmu_idx,
285
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
113
int ttbrn)
286
default:
114
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
287
disas_simd_3same_int(s, insn);
115
288
break;
116
*pc = env->pc;
289
+ case 0x14: /* SMAXP, UMAXP */
117
flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
290
+ case 0x15: /* SMINP, UMINP */
118
- /* Get control bits for tagged addresses */
291
case 0x17: /* ADDP */
119
- flags = FIELD_DP32(flags, TBFLAG_A64, TBII,
292
unallocated_encoding(s);
120
- (arm_regime_tbi1(env, mmu_idx) << 1) |
293
break;
121
- arm_regime_tbi0(env, mmu_idx));
294
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
122
+
295
index XXXXXXX..XXXXXXX 100644
123
+#ifndef CONFIG_USER_ONLY
296
--- a/target/arm/tcg/vec_helper.c
124
+ /*
297
+++ b/target/arm/tcg/vec_helper.c
125
+ * Get control bits for tagged addresses. Note that the
298
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4)
126
+ * translator only uses this for instruction addresses.
299
DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, )
127
+ */
300
#undef ADD
128
+ {
301
129
+ ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
302
+DO_3OP_PAIR(gvec_smaxp_b, MAX, int8_t, H1)
130
+ ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
303
+DO_3OP_PAIR(gvec_smaxp_h, MAX, int16_t, H2)
131
+ int tbii, tbid;
304
+DO_3OP_PAIR(gvec_smaxp_s, MAX, int32_t, H4)
132
+
305
+
133
+ /* FIXME: ARMv8.1-VHE S2 translation regime. */
306
+DO_3OP_PAIR(gvec_umaxp_b, MAX, uint8_t, H1)
134
+ if (regime_el(env, stage1) < 2) {
307
+DO_3OP_PAIR(gvec_umaxp_h, MAX, uint16_t, H2)
135
+ ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
308
+DO_3OP_PAIR(gvec_umaxp_s, MAX, uint32_t, H4)
136
+ tbid = (p1.tbi << 1) | p0.tbi;
309
+
137
+ tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
310
+DO_3OP_PAIR(gvec_sminp_b, MIN, int8_t, H1)
138
+ } else {
311
+DO_3OP_PAIR(gvec_sminp_h, MIN, int16_t, H2)
139
+ tbid = p0.tbi;
312
+DO_3OP_PAIR(gvec_sminp_s, MIN, int32_t, H4)
140
+ tbii = tbid & !p0.tbid;
313
+
141
+ }
314
+DO_3OP_PAIR(gvec_uminp_b, MIN, uint8_t, H1)
142
+
315
+DO_3OP_PAIR(gvec_uminp_h, MIN, uint16_t, H2)
143
+ flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
316
+DO_3OP_PAIR(gvec_uminp_s, MIN, uint32_t, H4)
144
+ }
317
+
145
+#endif
318
#undef DO_3OP_PAIR
146
319
147
if (cpu_isar_feature(aa64_sve, cpu)) {
320
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
148
int sve_el = sve_exception_el(env, current_el);
149
--
321
--
150
2.20.1
322
2.34.1
151
152
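
A note on the semantics this conversion preserves: as the deleted
handle_simd_3same_pair comment said, the integer pairwise ops work on
the concatenation of the two source vectors, combining each pair of
adjacent elements into one result element. A minimal plain-C sketch of
SMAXP on 4-lane vectors -- a hand-written illustration only, not QEMU
code; the real work is done by the gvec_smaxp_* helpers added above:

    #include <stdint.h>
    #include <string.h>

    /* d = pairwise signed max over the concatenation n:m. */
    static void smaxp_sketch(int32_t *d, const int32_t *n, const int32_t *m)
    {
        int32_t r[4];
        r[0] = n[0] > n[1] ? n[0] : n[1];   /* pairs from the first input */
        r[1] = n[2] > n[3] ? n[2] : n[3];
        r[2] = m[0] > m[1] ? m[0] : m[1];   /* then pairs from the second */
        r[3] = m[2] > m[3] ? m[2] : m[3];
        memcpy(d, r, sizeof(r));   /* staging keeps d == n or d == m safe */
    }
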
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20190108223129.5570-10-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-35-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
7
---
8
target/arm/translate-a64.c | 8 ++++++++
8
target/arm/tcg/translate-neon.c | 78 ++-------------------------------
9
1 file changed, 8 insertions(+)
9
1 file changed, 4 insertions(+), 74 deletions(-)
10
10
11
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
11
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
12
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
13
--- a/target/arm/translate-a64.c
13
--- a/target/arm/tcg/translate-neon.c
14
+++ b/target/arm/translate-a64.c
14
+++ b/target/arm/tcg/translate-neon.c
15
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
15
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
16
case 11: /* RORV */
16
DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
17
handle_shift_reg(s, A64_SHIFT_TYPE_ROR, sf, rm, rn, rd);
17
DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
18
break;
18
DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp)
19
+ case 12: /* PACGA */
19
+DO_3SAME_NO_SZ_3(VPMAX_S, gen_gvec_smaxp)
20
+ if (sf == 0 || !dc_isar_feature(aa64_pauth, s)) {
20
+DO_3SAME_NO_SZ_3(VPMIN_S, gen_gvec_sminp)
21
+ goto do_unallocated;
21
+DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp)
22
+ }
22
+DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp)
23
+ gen_helper_pacga(cpu_reg(s, rd), cpu_env,
23
24
+ cpu_reg(s, rn), cpu_reg_sp(s, rm));
24
#define DO_3SAME_CMP(INSN, COND) \
25
+ break;
25
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
26
case 16:
26
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_U, qshl_u)
27
case 17:
27
DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
28
case 18:
28
DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
29
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
29
30
break;
30
-static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
31
}
31
-{
32
default:
32
- /* Operations handled pairwise 32 bits at a time */
33
+ do_unallocated:
33
- TCGv_i32 tmp, tmp2, tmp3;
34
unallocated_encoding(s);
34
-
35
break;
35
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
36
}
36
- return false;
37
- }
38
-
39
- /* UNDEF accesses to D16-D31 if they don't exist. */
40
- if (!dc_isar_feature(aa32_simd_r32, s) &&
41
- ((a->vd | a->vn | a->vm) & 0x10)) {
42
- return false;
43
- }
44
-
45
- if (a->size == 3) {
46
- return false;
47
- }
48
-
49
- if (!vfp_access_check(s)) {
50
- return true;
51
- }
52
-
53
- assert(a->q == 0); /* enforced by decode patterns */
54
-
55
- /*
56
- * Note that we have to be careful not to clobber the source operands
57
- * in the "vm == vd" case by storing the result of the first pass too
58
- * early. Since Q is 0 there are always just two passes, so instead
59
- * of a complicated loop over each pass we just unroll.
60
- */
61
- tmp = tcg_temp_new_i32();
62
- tmp2 = tcg_temp_new_i32();
63
- tmp3 = tcg_temp_new_i32();
64
-
65
- read_neon_element32(tmp, a->vn, 0, MO_32);
66
- read_neon_element32(tmp2, a->vn, 1, MO_32);
67
- fn(tmp, tmp, tmp2);
68
-
69
- read_neon_element32(tmp3, a->vm, 0, MO_32);
70
- read_neon_element32(tmp2, a->vm, 1, MO_32);
71
- fn(tmp3, tmp3, tmp2);
72
-
73
- write_neon_element32(tmp, a->vd, 0, MO_32);
74
- write_neon_element32(tmp3, a->vd, 1, MO_32);
75
-
76
- return true;
77
-}
78
-
79
-#define DO_3SAME_PAIR(INSN, func) \
80
- static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
81
- { \
82
- static NeonGenTwoOpFn * const fns[] = { \
83
- gen_helper_neon_##func##8, \
84
- gen_helper_neon_##func##16, \
85
- gen_helper_neon_##func##32, \
86
- }; \
87
- if (a->size > 2) { \
88
- return false; \
89
- } \
90
- return do_3same_pair(s, a, fns[a->size]); \
91
- }
92
-
93
-/* 32-bit pairwise ops end up the same as the elementwise versions. */
94
-#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
95
-#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
96
-#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
97
-#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
98
-
99
-DO_3SAME_PAIR(VPMAX_S, pmax_s)
100
-DO_3SAME_PAIR(VPMIN_S, pmin_s)
101
-DO_3SAME_PAIR(VPMAX_U, pmax_u)
102
-DO_3SAME_PAIR(VPMIN_U, pmin_u)
103
-
104
#define DO_3SAME_VQDMULH(INSN, FUNC) \
105
WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
106
WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
37
--
107
--
38
2.20.1
108
2.34.1
39
40
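
The deleted do_3same_pair above is careful to read all four inputs
before writing any output; the hazard it avoids is easy to reproduce
in miniature (hand-written sketch, not QEMU code):

    #include <stdint.h>

    /* Naive pairwise add that writes results as it goes: wrong when
     * d == m, because the store to d[0] clobbers m[0] before the
     * second pair is read. Hence the unrolled loads into temporaries
     * in the deleted Neon code, and why the replacement gvec helpers
     * must tolerate overlapping operands.
     */
    static void vpadd_buggy(int32_t *d, const int32_t *n, const int32_t *m)
    {
        d[0] = n[0] + n[1];
        d[1] = m[0] + m[1];   /* reads a clobbered m[0] if d == m */
    }
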
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This will enable PAuth decode in a subsequent patch.
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Message-id: 20240524232121.284515-36-richard.henderson@linaro.org
7
Message-id: 20190108223129.5570-13-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
7
---
10
target/arm/translate-a64.c | 47 +++++++++++++++++++++++++++++---------
8
target/arm/tcg/a64.decode | 10 +++
11
1 file changed, 36 insertions(+), 11 deletions(-)
9
target/arm/tcg/translate-a64.c | 144 ++++++++++-----------------------
10
2 files changed, 51 insertions(+), 103 deletions(-)
12
11
13
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-a64.c
14
--- a/target/arm/tcg/a64.decode
16
+++ b/target/arm/translate-a64.c
15
+++ b/target/arm/tcg/a64.decode
17
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
16
@@ -XXX,XX +XXX,XX @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
18
rn = extract32(insn, 5, 5);
17
FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
19
op4 = extract32(insn, 0, 5);
18
FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd
20
19
21
- if (op4 != 0x0 || op3 != 0x0 || op2 != 0x1f) {
20
+FMLAL_v 0.00 1110 001 ..... 11101 1 ..... ..... @qrrr_h
21
+FMLSL_v 0.00 1110 101 ..... 11101 1 ..... ..... @qrrr_h
22
+FMLAL2_v 0.10 1110 001 ..... 11001 1 ..... ..... @qrrr_h
23
+FMLSL2_v 0.10 1110 101 ..... 11001 1 ..... ..... @qrrr_h
24
+
25
FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h
26
FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
27
28
@@ -XXX,XX +XXX,XX @@ FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d
29
FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
30
FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
31
FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
32
+
33
+FMLAL_vi 0.00 1111 10 .. .... 0000 . 0 ..... ..... @qrrx_h
34
+FMLSL_vi 0.00 1111 10 .. .... 0100 . 0 ..... ..... @qrrx_h
35
+FMLAL2_vi 0.10 1111 10 .. .... 1000 . 0 ..... ..... @qrrx_h
36
+FMLSL2_vi 0.10 1111 10 .. .... 1100 . 0 ..... ..... @qrrx_h
37
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
38
index XXXXXXX..XXXXXXX 100644
39
--- a/target/arm/tcg/translate-a64.c
40
+++ b/target/arm/tcg/translate-a64.c
41
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
42
};
43
TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
44
45
+static bool do_fmlal(DisasContext *s, arg_qrrr_e *a, bool is_s, bool is_2)
46
+{
47
+ if (fp_access_check(s)) {
48
+ int data = (is_2 << 1) | is_s;
49
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
50
+ vec_full_reg_offset(s, a->rn),
51
+ vec_full_reg_offset(s, a->rm), tcg_env,
52
+ a->q ? 16 : 8, vec_full_reg_size(s),
53
+ data, gen_helper_gvec_fmlal_a64);
54
+ }
55
+ return true;
56
+}
57
+
58
+TRANS_FEAT(FMLAL_v, aa64_fhm, do_fmlal, a, false, false)
59
+TRANS_FEAT(FMLSL_v, aa64_fhm, do_fmlal, a, true, false)
60
+TRANS_FEAT(FMLAL2_v, aa64_fhm, do_fmlal, a, false, true)
61
+TRANS_FEAT(FMLSL2_v, aa64_fhm, do_fmlal, a, true, true)
62
+
63
TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
64
TRANS(SMAXP_v, do_gvec_fn3_no64, a, gen_gvec_smaxp)
65
TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp)
66
@@ -XXX,XX +XXX,XX @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
67
TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
68
TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
69
70
+static bool do_fmlal_idx(DisasContext *s, arg_qrrx_e *a, bool is_s, bool is_2)
71
+{
72
+ if (fp_access_check(s)) {
73
+ int data = (a->idx << 2) | (is_2 << 1) | is_s;
74
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
75
+ vec_full_reg_offset(s, a->rn),
76
+ vec_full_reg_offset(s, a->rm), tcg_env,
77
+ a->q ? 16 : 8, vec_full_reg_size(s),
78
+ data, gen_helper_gvec_fmlal_idx_a64);
79
+ }
80
+ return true;
81
+}
82
+
83
+TRANS_FEAT(FMLAL_vi, aa64_fhm, do_fmlal_idx, a, false, false)
84
+TRANS_FEAT(FMLSL_vi, aa64_fhm, do_fmlal_idx, a, true, false)
85
+TRANS_FEAT(FMLAL2_vi, aa64_fhm, do_fmlal_idx, a, false, true)
86
+TRANS_FEAT(FMLSL2_vi, aa64_fhm, do_fmlal_idx, a, true, true)
87
+
88
/*
89
* Advanced SIMD scalar pairwise
90
*/
91
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
92
}
93
}
94
95
-/* Floating point op subgroup of C3.6.16. */
96
-static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
97
-{
98
- /* For floating point ops, the U, size[1] and opcode bits
99
- * together indicate the operation. size[0] indicates single
100
- * or double.
101
- */
102
- int fpopcode = extract32(insn, 11, 5)
103
- | (extract32(insn, 23, 1) << 5)
104
- | (extract32(insn, 29, 1) << 6);
105
- int is_q = extract32(insn, 30, 1);
106
- int size = extract32(insn, 22, 1);
107
- int rm = extract32(insn, 16, 5);
108
- int rn = extract32(insn, 5, 5);
109
- int rd = extract32(insn, 0, 5);
110
-
111
- if (size == 1 && !is_q) {
22
- unallocated_encoding(s);
112
- unallocated_encoding(s);
23
- return;
113
- return;
24
+ if (op2 != 0x1f) {
114
- }
25
+ goto do_unallocated;
115
-
26
}
116
- switch (fpopcode) {
27
117
- case 0x1d: /* FMLAL */
28
switch (opc) {
118
- case 0x3d: /* FMLSL */
29
case 0: /* BR */
119
- case 0x59: /* FMLAL2 */
30
case 1: /* BLR */
120
- case 0x79: /* FMLSL2 */
31
case 2: /* RET */
121
- if (size & 1 || !dc_isar_feature(aa64_fhm, s)) {
32
- gen_a64_set_pc(s, cpu_reg(s, rn));
33
+ switch (op3) {
34
+ case 0:
35
+ if (op4 != 0) {
36
+ goto do_unallocated;
37
+ }
38
+ dst = cpu_reg(s, rn);
39
+ break;
40
+
41
+ default:
42
+ goto do_unallocated;
43
+ }
44
+
45
+ gen_a64_set_pc(s, dst);
46
/* BLR also needs to load return address */
47
if (opc == 1) {
48
tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
49
}
50
break;
51
+
52
case 4: /* ERET */
53
if (s->current_el == 0) {
54
- unallocated_encoding(s);
122
- unallocated_encoding(s);
55
- return;
123
- return;
56
+ goto do_unallocated;
124
- }
57
+ }
125
- if (fp_access_check(s)) {
58
+ switch (op3) {
126
- int is_s = extract32(insn, 23, 1);
59
+ case 0:
127
- int is_2 = extract32(insn, 29, 1);
60
+ if (op4 != 0) {
128
- int data = (is_2 << 1) | is_s;
61
+ goto do_unallocated;
129
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
62
+ }
130
- vec_full_reg_offset(s, rn),
63
+ dst = tcg_temp_new_i64();
131
- vec_full_reg_offset(s, rm), tcg_env,
64
+ tcg_gen_ld_i64(dst, cpu_env,
132
- is_q ? 16 : 8, vec_full_reg_size(s),
65
+ offsetof(CPUARMState, elr_el[s->current_el]));
133
- data, gen_helper_gvec_fmlal_a64);
66
+ break;
134
- }
67
+
135
- return;
68
+ default:
136
-
69
+ goto do_unallocated;
137
- default:
138
- case 0x18: /* FMAXNM */
139
- case 0x19: /* FMLA */
140
- case 0x1a: /* FADD */
141
- case 0x1b: /* FMULX */
142
- case 0x1c: /* FCMEQ */
143
- case 0x1e: /* FMAX */
144
- case 0x1f: /* FRECPS */
145
- case 0x38: /* FMINNM */
146
- case 0x39: /* FMLS */
147
- case 0x3a: /* FSUB */
148
- case 0x3e: /* FMIN */
149
- case 0x3f: /* FRSQRTS */
150
- case 0x58: /* FMAXNMP */
151
- case 0x5a: /* FADDP */
152
- case 0x5b: /* FMUL */
153
- case 0x5c: /* FCMGE */
154
- case 0x5d: /* FACGE */
155
- case 0x5e: /* FMAXP */
156
- case 0x5f: /* FDIV */
157
- case 0x78: /* FMINNMP */
158
- case 0x7a: /* FABD */
159
- case 0x7d: /* FACGT */
160
- case 0x7c: /* FCMGT */
161
- case 0x7e: /* FMINP */
162
- unallocated_encoding(s);
163
- return;
164
- }
165
-}
166
-
167
/* Integer op subgroup of C3.6.16. */
168
static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
169
{
170
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
171
case 0x3: /* logic ops */
172
disas_simd_3same_logic(s, insn);
173
break;
174
- case 0x18 ... 0x31:
175
- /* floating point ops, sz[1] and U are part of opcode */
176
- disas_simd_3same_float(s, insn);
177
- break;
178
default:
179
disas_simd_3same_int(s, insn);
180
break;
181
case 0x14: /* SMAXP, UMAXP */
182
case 0x15: /* SMINP, UMINP */
183
case 0x17: /* ADDP */
184
+ case 0x18 ... 0x31: /* floating point ops */
185
unallocated_encoding(s);
186
break;
187
}
188
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
70
}
189
}
71
if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
190
is_fp = 2;
72
gen_io_start();
191
break;
73
}
192
- case 0x00: /* FMLAL */
74
- dst = tcg_temp_new_i64();
193
- case 0x04: /* FMLSL */
75
- tcg_gen_ld_i64(dst, cpu_env,
194
- case 0x18: /* FMLAL2 */
76
- offsetof(CPUARMState, elr_el[s->current_el]));
195
- case 0x1c: /* FMLSL2 */
77
+
196
- if (is_scalar || size != MO_32 || !dc_isar_feature(aa64_fhm, s)) {
78
gen_helper_exception_return(cpu_env, dst);
79
tcg_temp_free_i64(dst);
80
if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
81
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
82
/* Must exit loop to check un-masked IRQs */
83
s->base.is_jmp = DISAS_EXIT;
84
return;
85
+
86
case 5: /* DRPS */
87
- if (rn != 0x1f) {
88
- unallocated_encoding(s);
197
- unallocated_encoding(s);
89
+ if (op3 != 0 || op4 != 0 || rn != 0x1f) {
198
- return;
90
+ goto do_unallocated;
199
- }
91
} else {
200
- size = MO_16;
92
unsupported_encoding(s, insn);
201
- /* is_fp, but we pass tcg_env not fp_status. */
93
}
202
- break;
94
return;
95
+
96
default:
203
default:
97
+ do_unallocated:
204
+ case 0x00: /* FMLAL */
205
case 0x01: /* FMLA */
206
+ case 0x04: /* FMLSL */
207
case 0x05: /* FMLS */
208
case 0x09: /* FMUL */
209
+ case 0x18: /* FMLAL2 */
210
case 0x19: /* FMULX */
211
+ case 0x1c: /* FMLSL2 */
98
unallocated_encoding(s);
212
unallocated_encoding(s);
99
return;
213
return;
100
}
214
}
215
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
216
}
217
return;
218
219
- case 0x00: /* FMLAL */
220
- case 0x04: /* FMLSL */
221
- case 0x18: /* FMLAL2 */
222
- case 0x1c: /* FMLSL2 */
223
- {
224
- int is_s = extract32(opcode, 2, 1);
225
- int is_2 = u;
226
- int data = (index << 2) | (is_2 << 1) | is_s;
227
- tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
228
- vec_full_reg_offset(s, rn),
229
- vec_full_reg_offset(s, rm), tcg_env,
230
- is_q ? 16 : 8, vec_full_reg_size(s),
231
- data, gen_helper_gvec_fmlal_idx_a64);
232
- }
233
- return;
234
-
235
case 0x08: /* MUL */
236
if (!is_long && !is_scalar) {
237
static gen_helper_gvec_3 * const fns[3] = {
101
--
238
--
102
2.20.1
239
2.34.1
103
104
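
For anyone reading a64.decode for the first time: each pattern line is
a 32-bit template written from bit 31 down to bit 0, where literal 0/1
bits must match, '.' bits are extracted as fields, and the trailing
@name pulls in a shared format (here @qrrr_h, which supplies q, rm, rn
and rd and fixes esz=1, i.e. MO_16). An annotated reading of the
FMLAL_v line added above (the annotation is mine, not part of the file):

    # 31 30  29..24  23..21  20..16  15..10  9..5  4..0
    #  0  q  001110     001      rm  111011    rn    rd
    FMLAL_v 0.00 1110 001 ..... 11101 1 ..... ..... @qrrr_h
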
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The pattern
3
This includes AND, ORR, EOR, BIC, ORN, BSL, BIT, BIF.
4
5
ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
6
7
is computing the full ARMMMUIdx, stripping off the ARM bits,
8
and then putting them back.
9
10
Avoid the extra two steps with the appropriate helper function.
11
4
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Message-id: 20190108223129.5570-17-richard.henderson@linaro.org
7
Message-id: 20240524232121.284515-37-richard.henderson@linaro.org
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
9
---
17
target/arm/cpu.h | 9 ++++++++-
10
target/arm/tcg/a64.decode | 10 +++++
18
target/arm/internals.h | 8 ++++++++
11
target/arm/tcg/translate-a64.c | 68 ++++++++++------------------------
19
target/arm/helper.c | 27 ++++++++++++++++-----------
12
2 files changed, 29 insertions(+), 49 deletions(-)
20
3 files changed, 32 insertions(+), 12 deletions(-)
21
13
22
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
14
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
23
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
24
--- a/target/arm/cpu.h
16
--- a/target/arm/tcg/a64.decode
25
+++ b/target/arm/cpu.h
17
+++ b/target/arm/tcg/a64.decode
26
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
18
@@ -XXX,XX +XXX,XX @@
27
/* Return the MMU index for a v7M CPU in the specified security state */
19
@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
28
ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate);
20
@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
29
21
30
-/* Determine the current mmu_idx to use for normal loads/stores */
22
+@qrrr_b . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=0
31
+/**
23
@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
32
+ * cpu_mmu_index:
24
@qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd
33
+ * @env: The cpu environment
25
@qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e
34
+ * @ifetch: True for code access, false for data access.
26
@@ -XXX,XX +XXX,XX @@ SMINP_v 0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
35
+ *
27
UMAXP_v 0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
36
+ * Return the core mmu index for the current translation regime.
28
UMINP_v 0.10 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
37
+ * This function is used by generic TCG code paths.
29
38
+ */
30
+AND_v 0.00 1110 001 ..... 00011 1 ..... ..... @qrrr_b
39
int cpu_mmu_index(CPUARMState *env, bool ifetch);
31
+BIC_v 0.00 1110 011 ..... 00011 1 ..... ..... @qrrr_b
40
32
+ORR_v 0.00 1110 101 ..... 00011 1 ..... ..... @qrrr_b
41
/* Indexes used when registering address spaces with cpu_address_space_init */
33
+ORN_v 0.00 1110 111 ..... 00011 1 ..... ..... @qrrr_b
42
diff --git a/target/arm/internals.h b/target/arm/internals.h
34
+EOR_v 0.10 1110 001 ..... 00011 1 ..... ..... @qrrr_b
35
+BSL_v 0.10 1110 011 ..... 00011 1 ..... ..... @qrrr_b
36
+BIT_v 0.10 1110 101 ..... 00011 1 ..... ..... @qrrr_b
37
+BIF_v 0.10 1110 111 ..... 00011 1 ..... ..... @qrrr_b
38
+
39
### Advanced SIMD scalar x indexed element
40
41
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
42
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
43
index XXXXXXX..XXXXXXX 100644
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/internals.h
44
--- a/target/arm/tcg/translate-a64.c
45
+++ b/target/arm/internals.h
45
+++ b/target/arm/tcg/translate-a64.c
46
@@ -XXX,XX +XXX,XX @@ void arm_cpu_update_virq(ARMCPU *cpu);
46
@@ -XXX,XX +XXX,XX @@ TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp)
47
*/
47
TRANS(UMAXP_v, do_gvec_fn3_no64, a, gen_gvec_umaxp)
48
void arm_cpu_update_vfiq(ARMCPU *cpu);
48
TRANS(UMINP_v, do_gvec_fn3_no64, a, gen_gvec_uminp)
49
49
50
+/**
50
+TRANS(AND_v, do_gvec_fn3, a, tcg_gen_gvec_and)
51
+ * arm_mmu_idx:
51
+TRANS(BIC_v, do_gvec_fn3, a, tcg_gen_gvec_andc)
52
+ * @env: The cpu environment
52
+TRANS(ORR_v, do_gvec_fn3, a, tcg_gen_gvec_or)
53
+ *
53
+TRANS(ORN_v, do_gvec_fn3, a, tcg_gen_gvec_orc)
54
+ * Return the full ARMMMUIdx for the current translation regime.
54
+TRANS(EOR_v, do_gvec_fn3, a, tcg_gen_gvec_xor)
55
+ */
56
+ARMMMUIdx arm_mmu_idx(CPUARMState *env);
57
+
55
+
58
#endif
56
+static bool do_bitsel(DisasContext *s, bool is_q, int d, int a, int b, int c)
59
diff --git a/target/arm/helper.c b/target/arm/helper.c
57
+{
60
index XXXXXXX..XXXXXXX 100644
58
+ if (fp_access_check(s)) {
61
--- a/target/arm/helper.c
59
+ gen_gvec_fn4(s, is_q, d, a, b, c, tcg_gen_gvec_bitsel, 0);
62
+++ b/target/arm/helper.c
60
+ }
63
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
61
+ return true;
64
limit = env->v7m.msplim[M_REG_S];
65
}
66
} else {
67
- mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
68
+ mmu_idx = arm_mmu_idx(env);
69
frame_sp_p = &env->regs[13];
70
limit = v7m_sp_limit(env);
71
}
72
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
73
CPUARMState *env = &cpu->env;
74
uint32_t xpsr = xpsr_read(env);
75
uint32_t frameptr = env->regs[13];
76
- ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
77
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
78
79
/* Align stack pointer if the guest wants that */
80
if ((frameptr & 4) &&
81
@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
82
int prot;
83
bool ret;
84
ARMMMUFaultInfo fi = {};
85
- ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
86
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
87
88
*attrs = (MemTxAttrs) {};
89
90
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
91
return arm_v7m_mmu_idx_for_secstate_and_priv(env, secstate, priv);
92
}
93
94
-int cpu_mmu_index(CPUARMState *env, bool ifetch)
95
+ARMMMUIdx arm_mmu_idx(CPUARMState *env)
96
{
97
- int el = arm_current_el(env);
98
+ int el;
99
100
if (arm_feature(env, ARM_FEATURE_M)) {
101
- ARMMMUIdx mmu_idx = arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
102
-
103
- return arm_to_core_mmu_idx(mmu_idx);
104
+ return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
105
}
106
107
+ el = arm_current_el(env);
108
if (el < 2 && arm_is_secure_below_el3(env)) {
109
- return arm_to_core_mmu_idx(ARMMMUIdx_S1SE0 + el);
110
+ return ARMMMUIdx_S1SE0 + el;
111
+ } else {
112
+ return ARMMMUIdx_S12NSE0 + el;
113
}
114
- return el;
115
+}
62
+}
116
+
63
+
117
+int cpu_mmu_index(CPUARMState *env, bool ifetch)
64
+TRANS(BSL_v, do_bitsel, a->q, a->rd, a->rd, a->rn, a->rm)
118
+{
65
+TRANS(BIT_v, do_bitsel, a->q, a->rd, a->rm, a->rn, a->rd)
119
+ return arm_to_core_mmu_idx(arm_mmu_idx(env));
66
+TRANS(BIF_v, do_bitsel, a->q, a->rd, a->rm, a->rd, a->rn)
67
+
68
/*
69
* Advanced SIMD scalar/vector x indexed element
70
*/
71
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
72
}
120
}
73
}
121
74
122
void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
75
-/* Logic op (opcode == 3) subgroup of C3.6.16. */
123
target_ulong *cs_base, uint32_t *pflags)
76
-static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
77
-{
78
- int rd = extract32(insn, 0, 5);
79
- int rn = extract32(insn, 5, 5);
80
- int rm = extract32(insn, 16, 5);
81
- int size = extract32(insn, 22, 2);
82
- bool is_u = extract32(insn, 29, 1);
83
- bool is_q = extract32(insn, 30, 1);
84
-
85
- if (!fp_access_check(s)) {
86
- return;
87
- }
88
-
89
- switch (size + 4 * is_u) {
90
- case 0: /* AND */
91
- gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_and, 0);
92
- return;
93
- case 1: /* BIC */
94
- gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_andc, 0);
95
- return;
96
- case 2: /* ORR */
97
- gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_or, 0);
98
- return;
99
- case 3: /* ORN */
100
- gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_orc, 0);
101
- return;
102
- case 4: /* EOR */
103
- gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_xor, 0);
104
- return;
105
-
106
- case 5: /* BSL bitwise select */
107
- gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
108
- return;
109
- case 6: /* BIT, bitwise insert if true */
110
- gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
111
- return;
112
- case 7: /* BIF, bitwise insert if false */
113
- gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
114
- return;
115
-
116
- default:
117
- g_assert_not_reached();
118
- }
119
-}
120
-
121
/* Integer op subgroup of C3.6.16. */
122
static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
124
{
123
{
125
- ARMMMUIdx mmu_idx = core_to_arm_mmu_idx(env, cpu_mmu_index(env, false));
124
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
126
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
125
int opcode = extract32(insn, 11, 5);
127
int current_el = arm_current_el(env);
126
128
int fp_el = fp_exception_el(env, current_el);
127
switch (opcode) {
129
uint32_t flags = 0;
128
- case 0x3: /* logic ops */
129
- disas_simd_3same_logic(s, insn);
130
- break;
131
default:
132
disas_simd_3same_int(s, insn);
133
break;
134
+ case 0x3: /* logic ops */
135
case 0x14: /* SMAXP, UMAXP */
136
case 0x15: /* SMINP, UMINP */
137
case 0x17: /* ADDP */
130
--
138
--
131
2.20.1
139
2.34.1
132
133
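
The neat part of this conversion is that BSL, BIT and BIF all collapse
onto one bit-select primitive with the operands permuted, which is what
the three TRANS lines feeding do_bitsel express. In scalar form
(sketch for illustration; tcg_gen_gvec_bitsel is the vectorised
equivalent):

    #include <stdint.h>

    /* bitsel(mask, t, f): bits from t where mask is 1, from f where 0. */
    static inline uint64_t bitsel(uint64_t mask, uint64_t t, uint64_t f)
    {
        return (t & mask) | (f & ~mask);
    }

    /* The operand permutations used above:
     *   BSL: rd = bitsel(rd, rn, rm)   -- rd supplies the mask
     *   BIT: rd = bitsel(rm, rn, rd)   -- insert rn bits where rm is 1
     *   BIF: rd = bitsel(rm, rd, rn)   -- insert rn bits where rm is 0
     */
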
Deleted patch
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
2
1
3
In some cases it may be helpful to modify state before saving it for
4
migration, and then modify the state back after it has been saved. The
5
existing pre_save function provides half of this functionality. This
6
patch adds a post_save function to provide the second half.
7
8
Signed-off-by: Aaron Lindsay <aclindsa@gmail.com>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
11
Message-id: 20181211151945.29137-2-aaron@os.amperecomputing.com
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
14
include/migration/vmstate.h | 1 +
15
migration/vmstate.c | 13 ++++++++++++-
16
docs/devel/migration.rst | 9 +++++++--
17
3 files changed, 20 insertions(+), 3 deletions(-)
18
19
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/include/migration/vmstate.h
22
+++ b/include/migration/vmstate.h
23
@@ -XXX,XX +XXX,XX @@ struct VMStateDescription {
24
int (*pre_load)(void *opaque);
25
int (*post_load)(void *opaque, int version_id);
26
int (*pre_save)(void *opaque);
27
+ int (*post_save)(void *opaque);
28
bool (*needed)(void *opaque);
29
const VMStateField *fields;
30
const VMStateDescription **subsections;
31
diff --git a/migration/vmstate.c b/migration/vmstate.c
32
index XXXXXXX..XXXXXXX 100644
33
--- a/migration/vmstate.c
34
+++ b/migration/vmstate.c
35
@@ -XXX,XX +XXX,XX @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
36
if (ret) {
37
error_report("Save of field %s/%s failed",
38
vmsd->name, field->name);
39
+ if (vmsd->post_save) {
40
+ vmsd->post_save(opaque);
41
+ }
42
return ret;
43
}
44
45
@@ -XXX,XX +XXX,XX @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
46
json_end_array(vmdesc);
47
}
48
49
- return vmstate_subsection_save(f, vmsd, opaque, vmdesc);
50
+ ret = vmstate_subsection_save(f, vmsd, opaque, vmdesc);
51
+
52
+ if (vmsd->post_save) {
53
+ int ps_ret = vmsd->post_save(opaque);
54
+ if (!ret) {
55
+ ret = ps_ret;
56
+ }
57
+ }
58
+ return ret;
59
}
60
61
static const VMStateDescription *
62
diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
63
index XXXXXXX..XXXXXXX 100644
64
--- a/docs/devel/migration.rst
65
+++ b/docs/devel/migration.rst
66
@@ -XXX,XX +XXX,XX @@ The functions to do that are inside a vmstate definition, and are called:
67
68
This function is called before we save the state of one device.
69
70
-Example: You can look at hpet.c, that uses the three function to
71
-massage the state that is transferred.
72
+- ``int (*post_save)(void *opaque);``
73
+
74
+ This function is called after we save the state of one device
75
+ (even upon failure, unless the call to pre_save returned an error).
76
+
77
+Example: You can look at hpet.c, that uses the first three functions
78
+to massage the state that is transferred.
79
80
The ``VMSTATE_WITH_TMP`` macro may be useful when the migration
81
data doesn't match the stored device data well; it allows an
82
--
83
2.20.1
84
85
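
A device using the new hook would look something like this (the device
and helper names are hypothetical, shown only to illustrate where
post_save sits in the structure):

    static int demo_pre_save(void *opaque)
    {
        DemoState *s = opaque;          /* hypothetical device state */
        demo_counters_stop(s);          /* massage state for the wire */
        return 0;
    }

    static int demo_post_save(void *opaque)
    {
        DemoState *s = opaque;
        demo_counters_resume(s);        /* undo the pre_save adjustment */
        return 0;
    }

    static const VMStateDescription vmstate_demo = {
        .name = "demo",
        .version_id = 1,
        .minimum_version_id = 1,
        .pre_save = demo_pre_save,
        .post_save = demo_post_save,    /* runs after saving, even on
                                         * failure, unless pre_save failed */
        .fields = (VMStateField[]) {
            VMSTATE_UINT64(counter, DemoState),
            VMSTATE_END_OF_LIST()
        }
    };
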
Deleted patch
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
2
1
3
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20181211151945.29137-6-aaron@os.amperecomputing.com
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
9
target/arm/helper.c | 27 ++++++++++++++++++++++++++-
10
1 file changed, 26 insertions(+), 1 deletion(-)
11
12
diff --git a/target/arm/helper.c b/target/arm/helper.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/helper.c
15
+++ b/target/arm/helper.c
16
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
17
PMXEVTYPER_M | PMXEVTYPER_MT | \
18
PMXEVTYPER_EVTCOUNT)
19
20
+#define PMCCFILTR 0xf8000000
21
+#define PMCCFILTR_M PMXEVTYPER_M
22
+#define PMCCFILTR_EL0 (PMCCFILTR | PMCCFILTR_M)
23
+
24
static inline uint32_t pmu_num_counters(CPUARMState *env)
25
{
26
return (env->cp15.c9_pmcr & PMCRN_MASK) >> PMCRN_SHIFT;
27
@@ -XXX,XX +XXX,XX @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
28
uint64_t value)
29
{
30
pmccntr_op_start(env);
31
- env->cp15.pmccfiltr_el0 = value & 0xfc000000;
32
+ env->cp15.pmccfiltr_el0 = value & PMCCFILTR_EL0;
33
pmccntr_op_finish(env);
34
}
35
36
+static void pmccfiltr_write_a32(CPUARMState *env, const ARMCPRegInfo *ri,
37
+ uint64_t value)
38
+{
39
+ pmccntr_op_start(env);
40
+ /* M is not accessible from AArch32 */
41
+ env->cp15.pmccfiltr_el0 = (env->cp15.pmccfiltr_el0 & PMCCFILTR_M) |
42
+ (value & PMCCFILTR);
43
+ pmccntr_op_finish(env);
44
+}
45
+
46
+static uint64_t pmccfiltr_read_a32(CPUARMState *env, const ARMCPRegInfo *ri)
47
+{
48
+ /* M is not visible in AArch32 */
49
+ return env->cp15.pmccfiltr_el0 & PMCCFILTR;
50
+}
51
+
52
static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
53
uint64_t value)
54
{
55
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
56
.readfn = pmccntr_read, .writefn = pmccntr_write,
57
.raw_readfn = raw_read, .raw_writefn = raw_write, },
58
#endif
59
+ { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
60
+ .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
61
+ .access = PL0_RW, .accessfn = pmreg_access,
62
+ .type = ARM_CP_ALIAS | ARM_CP_IO,
63
+ .resetvalue = 0, },
64
{ .name = "PMCCFILTR_EL0", .state = ARM_CP_STATE_AA64,
65
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 15, .opc2 = 7,
66
.writefn = pmccfiltr_write, .raw_writefn = raw_write,
67
--
68
2.20.1
69
70
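
The net effect of the new AArch32 alias is a masking rule: the M bit is
preserved across AArch32 writes and hidden from AArch32 reads, while
AArch64 accesses see all of PMCCFILTR_EL0. A worked example, assuming
PMXEVTYPER_M is bit 26 (0x04000000):

    PMCCFILTR     = 0xf8000000   /* bits 31..27, writable from AArch32 */
    PMCCFILTR_M   = 0x04000000   /* bit 26, AArch64-only */
    PMCCFILTR_EL0 = 0xfc000000   /* union of the two */

    AArch32 write of 0xffffffff while M is currently set:
        new = (old & 0x04000000) | (0xffffffff & 0xf8000000) = 0xfc000000
    AArch32 read of that result:
        value = 0xfc000000 & 0xf8000000 = 0xf8000000   /* M stays hidden */
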
Deleted patch
1
From: Aaron Lindsay <aaron@os.amperecomputing.com>
2
1
3
This is immediately necessary for the PMUv3 implementation to check
4
ID_DFR0.PerfMon to enable/disable specific features, but defines the
5
full complement of fields for possible future use elsewhere.
6
7
Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20181211151945.29137-8-aaron@os.amperecomputing.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
target/arm/cpu.h | 9 +++++++++
13
1 file changed, 9 insertions(+)
14
15
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.h
18
+++ b/target/arm/cpu.h
19
@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64MMFR1, PAN, 20, 4)
20
FIELD(ID_AA64MMFR1, SPECSEI, 24, 4)
21
FIELD(ID_AA64MMFR1, XNX, 28, 4)
22
23
+FIELD(ID_DFR0, COPDBG, 0, 4)
24
+FIELD(ID_DFR0, COPSDBG, 4, 4)
25
+FIELD(ID_DFR0, MMAPDBG, 8, 4)
26
+FIELD(ID_DFR0, COPTRC, 12, 4)
27
+FIELD(ID_DFR0, MMAPTRC, 16, 4)
28
+FIELD(ID_DFR0, MPROFDBG, 20, 4)
29
+FIELD(ID_DFR0, PERFMON, 24, 4)
30
+FIELD(ID_DFR0, TRACEFILT, 28, 4)
31
+
32
QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
33
34
/* If adding a feature bit which corresponds to a Linux ELF
35
--
36
2.20.1
37
38
diff view generated by jsdifflib