Hi; most of this is the first half of the A64 simd decodetree
conversion; the rest is a mix of fixes from the last couple of weeks.

v2 uses patches from the v2 decodetree series to avoid a few
regressions in some A32 insns.

(Richard: I'm still planning to review the second half of the
v2 decodetree series; I just wanted to get the respin of this
pullreq out today...)

thanks
-- PMM

The following changes since commit ad10b4badc1dd5b28305f9b9f1168cf0aa3ae946:

  Merge tag 'pull-error-2024-05-27' of https://repo.or.cz/qemu/armbru into staging (2024-05-27 06:40:42 -0700)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20240528

for you to fetch changes up to f240df3c31b40e4cf1af1f156a88efc1a1df406c:

  target/arm: Convert disas_simd_3same_logic to decodetree (2024-05-28 14:29:01 +0100)

----------------------------------------------------------------
target-arm queue:
 * xlnx_dpdma: fix descriptor endianness bug
 * hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers
 * hw/arm/npcm7xx: remove setting of mp-affinity
 * hw/char: Correct STM32L4x5 usart register CR2 field ADD_0 size
 * hw/intc/arm_gic: Fix handling of NS view of GICC_APR<n>
 * hw/input/tsc2005: Fix -Wchar-subscripts warning in tsc2005_txrx()
 * hw: arm: Remove use of tabs in some source files
 * docs/system: Remove ADC from raspi documentation
 * target/arm: Start of the conversion of A64 SIMD to decodetree

----------------------------------------------------------------
Alexandra Diupina (1):
      xlnx_dpdma: fix descriptor endianness bug

Andrey Shumilin (1):
      hw/intc/arm_gic: Fix handling of NS view of GICC_APR<n>

Dorjoy Chowdhury (1):
      hw/arm/npcm7xx: remove setting of mp-affinity

Inès Varhol (1):
      hw/char: Correct STM32L4x5 usart register CR2 field ADD_0 size

Philippe Mathieu-Daudé (1):
      hw/input/tsc2005: Fix -Wchar-subscripts warning in tsc2005_txrx()

Rayhan Faizel (1):
      docs/system: Remove ADC from raspi documentation

Richard Henderson (34):
      target/arm: Use PLD, PLDW, PLI not NOP for t32
      target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
      target/arm: Fix decode of FMOV (hp) vs MOVI
      target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
      target/arm: Split out gengvec.c
      target/arm: Split out gengvec64.c
      target/arm: Convert Cryptographic AES to decodetree
      target/arm: Convert Cryptographic 3-register SHA to decodetree
      target/arm: Convert Cryptographic 2-register SHA to decodetree
      target/arm: Convert Cryptographic 3-register SHA512 to decodetree
      target/arm: Convert Cryptographic 2-register SHA512 to decodetree
      target/arm: Convert Cryptographic 4-register to decodetree
      target/arm: Convert Cryptographic 3-register, imm2 to decodetree
      target/arm: Convert XAR to decodetree
      target/arm: Convert Advanced SIMD copy to decodetree
      target/arm: Convert FMULX to decodetree
      target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
      target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
      target/arm: Introduce vfp_load_reg16
      target/arm: Expand vfp neg and abs inline
      target/arm: Convert FNMUL to decodetree
      target/arm: Convert FMLA, FMLS to decodetree
      target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
      target/arm: Convert FABD to decodetree
      target/arm: Convert FRECPS, FRSQRTS to decodetree
      target/arm: Convert FADDP to decodetree
      target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
      target/arm: Use gvec for neon faddp, fmaxp, fminp
      target/arm: Convert ADDP to decodetree
      target/arm: Use gvec for neon padd
      target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
      target/arm: Use gvec for neon pmax, pmin
      target/arm: Convert FMLAL, FMLSL to decodetree
      target/arm: Convert disas_simd_3same_logic to decodetree

Tanmay Patil (1):
      hw: arm: Remove use of tabs in some source files

Zenghui Yu (1):
      hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers

 docs/system/arm/raspi.rst       |    1 -
 target/arm/helper.h             |   68 +-
 target/arm/tcg/helper-a64.h     |   12 +
 target/arm/tcg/translate-a64.h  |    4 +
 target/arm/tcg/translate.h      |   51 +
 target/arm/tcg/a64.decode       |  315 +++-
 target/arm/tcg/t32.decode       |   25 +-
 hw/arm/boot.c                   |    8 +-
 hw/arm/npcm7xx.c                |    3 -
 hw/char/omap_uart.c             |   49 +-
 hw/char/stm32l4x5_usart.c       |    2 +-
 hw/dma/xlnx_dpdma.c             |   68 +-
 hw/gpio/zaurus.c                |   59 +-
 hw/input/tsc2005.c              |  135 +-
 hw/intc/arm_gic.c               |    4 +-
 target/arm/hvf/hvf.c            |  130 +-
 target/arm/tcg/gengvec.c        | 1672 +++++++++++++++++++++
 target/arm/tcg/gengvec64.c      |  190 +++
 target/arm/tcg/neon_helper.c    |    5 -
 target/arm/tcg/translate-a64.c  | 3137 +++++++++++++--------------------------
 target/arm/tcg/translate-neon.c |  136 +-
 target/arm/tcg/translate-sve.c  |  145 +-
 target/arm/tcg/translate-vfp.c  |   93 +-
 target/arm/tcg/translate.c      | 1592 +-------------------
 target/arm/tcg/vec_helper.c     |  221 ++-
 target/arm/vfp_helper.c         |   30 -
 target/arm/tcg/meson.build      |    2 +
 27 files changed, 3860 insertions(+), 4297 deletions(-)
 create mode 100644 target/arm/tcg/gengvec.c
 create mode 100644 target/arm/tcg/gengvec64.c

From: Alexandra Diupina <adiupina@astralinux.ru>

Add xlnx_dpdma_read_descriptor() and xlnx_dpdma_write_descriptor()
functions. xlnx_dpdma_read_descriptor() combines reading a descriptor
from desc_addr by calling dma_memory_read() and swapping the desc
fields from guest memory order to host memory order.
xlnx_dpdma_write_descriptor() performs similar actions when writing a
descriptor.
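
(A minimal standalone sketch of what the guest-order-to-host-order swap
does, assuming only the standard compiler byte-order macros;
my_le32_to_cpu() here is a hypothetical stand-in for QEMU's
le32_to_cpu(), not the QEMU implementation:)

    #include <stdint.h>
    #include <stdio.h>

    /* Reinterpret a 32-bit value read verbatim from little-endian guest
     * memory in host byte order. On a little-endian host this is a
     * no-op; on a big-endian host the bytes must be swapped. */
    static uint32_t my_le32_to_cpu(uint32_t le_val)
    {
    #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        return __builtin_bswap32(le_val);
    #else
        return le_val;
    #endif
    }

    int main(void)
    {
        uint32_t control = my_le32_to_cpu(0x00000001u);
        printf("control = 0x%08x\n", control);
        return 0;
    }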

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: d3c6369a96 ("introduce xlnx-dpdma")
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
[PMM: tweaked indent, dropped behaviour change for write-failure case]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/dma/xlnx_dpdma.c | 68 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 64 insertions(+), 4 deletions(-)

diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/dma/xlnx_dpdma.c
+++ b/hw/dma/xlnx_dpdma.c
@@ -XXX,XX +XXX,XX @@ static void xlnx_dpdma_register_types(void)
     type_register_static(&xlnx_dpdma_info);
 }
 
+static MemTxResult xlnx_dpdma_read_descriptor(XlnxDPDMAState *s,
+                                              uint64_t desc_addr,
+                                              DPDMADescriptor *desc)
+{
+    MemTxResult res = dma_memory_read(&address_space_memory, desc_addr,
+                                      desc, sizeof(DPDMADescriptor),
+                                      MEMTXATTRS_UNSPECIFIED);
+    if (res) {
+        return res;
+    }
+
+    /* Convert from LE into host endianness. */
+    desc->control = le32_to_cpu(desc->control);
+    desc->descriptor_id = le32_to_cpu(desc->descriptor_id);
+    desc->xfer_size = le32_to_cpu(desc->xfer_size);
+    desc->line_size_stride = le32_to_cpu(desc->line_size_stride);
+    desc->timestamp_lsb = le32_to_cpu(desc->timestamp_lsb);
+    desc->timestamp_msb = le32_to_cpu(desc->timestamp_msb);
+    desc->address_extension = le32_to_cpu(desc->address_extension);
+    desc->next_descriptor = le32_to_cpu(desc->next_descriptor);
+    desc->source_address = le32_to_cpu(desc->source_address);
+    desc->address_extension_23 = le32_to_cpu(desc->address_extension_23);
+    desc->address_extension_45 = le32_to_cpu(desc->address_extension_45);
+    desc->source_address2 = le32_to_cpu(desc->source_address2);
+    desc->source_address3 = le32_to_cpu(desc->source_address3);
+    desc->source_address4 = le32_to_cpu(desc->source_address4);
+    desc->source_address5 = le32_to_cpu(desc->source_address5);
+    desc->crc = le32_to_cpu(desc->crc);
+
+    return res;
+}
+
+static MemTxResult xlnx_dpdma_write_descriptor(uint64_t desc_addr,
+                                               DPDMADescriptor *desc)
+{
+    DPDMADescriptor tmp_desc = *desc;
+
+    /* Convert from host endianness into LE. */
+    tmp_desc.control = cpu_to_le32(tmp_desc.control);
+    tmp_desc.descriptor_id = cpu_to_le32(tmp_desc.descriptor_id);
+    tmp_desc.xfer_size = cpu_to_le32(tmp_desc.xfer_size);
+    tmp_desc.line_size_stride = cpu_to_le32(tmp_desc.line_size_stride);
+    tmp_desc.timestamp_lsb = cpu_to_le32(tmp_desc.timestamp_lsb);
+    tmp_desc.timestamp_msb = cpu_to_le32(tmp_desc.timestamp_msb);
+    tmp_desc.address_extension = cpu_to_le32(tmp_desc.address_extension);
+    tmp_desc.next_descriptor = cpu_to_le32(tmp_desc.next_descriptor);
+    tmp_desc.source_address = cpu_to_le32(tmp_desc.source_address);
+    tmp_desc.address_extension_23 = cpu_to_le32(tmp_desc.address_extension_23);
+    tmp_desc.address_extension_45 = cpu_to_le32(tmp_desc.address_extension_45);
+    tmp_desc.source_address2 = cpu_to_le32(tmp_desc.source_address2);
+    tmp_desc.source_address3 = cpu_to_le32(tmp_desc.source_address3);
+    tmp_desc.source_address4 = cpu_to_le32(tmp_desc.source_address4);
+    tmp_desc.source_address5 = cpu_to_le32(tmp_desc.source_address5);
+    tmp_desc.crc = cpu_to_le32(tmp_desc.crc);
+
+    return dma_memory_write(&address_space_memory, desc_addr, &tmp_desc,
+                            sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED);
+}
+
 size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
                                   bool one_desc)
 {
@@ -XXX,XX +XXX,XX @@ size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
             desc_addr = xlnx_dpdma_descriptor_next_address(s, channel);
         }
 
-        if (dma_memory_read(&address_space_memory, desc_addr, &desc,
-                            sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED)) {
+        if (xlnx_dpdma_read_descriptor(s, desc_addr, &desc)) {
             s->registers[DPDMA_EISR] |= ((1 << 1) << channel);
             xlnx_dpdma_update_irq(s);
             s->operation_finished[channel] = true;
@@ -XXX,XX +XXX,XX @@ size_t xlnx_dpdma_start_operation(XlnxDPDMAState *s, uint8_t channel,
             /* The descriptor need to be updated when it's completed. */
             DPRINTF("update the descriptor with the done flag set.\n");
             xlnx_dpdma_desc_set_done(&desc);
-            dma_memory_write(&address_space_memory, desc_addr, &desc,
-                             sizeof(DPDMADescriptor), MEMTXATTRS_UNSPECIFIED);
+            if (xlnx_dpdma_write_descriptor(desc_addr, &desc)) {
+                DPRINTF("Can't write the descriptor.\n");
+                /* TODO: check hardware behaviour for memory write failure */
+            }
         }
 
         if (xlnx_dpdma_desc_completion_interrupt(&desc)) {
--
2.34.1

From: Zenghui Yu <zenghui.yu@linux.dev>

We wrongly encoded ID_AA64PFR1_EL1 using {3,0,0,4,2} in hvf_sreg_match[] so
we fail to get the expected ARMCPRegInfo from the cp_regs hash table with the
wrong key.

Fix it with the correct encoding {3,0,0,4,1}. With that fixed, the Linux
guest can properly detect FEAT_SSBS2 on my M1 HW.

All DBG{B,W}{V,C}R_EL1 registers are also wrongly encoded with op0 == 14.
It happens to work because HVF_SYSREG(CRn, CRm, 14, op1, op2) equals
HVF_SYSREG(CRn, CRm, 2, op1, op2) by definition, but we shouldn't rely on
that.
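
(A worked illustration of why op0 == 14 happened to alias to op0 == 2: a
sketch assuming the usual kvm.h key layout, with the coproc value 0x0013
at bit 16 and the 2-bit op0 field at bit 14; sysreg_key() is a
hypothetical re-derivation for illustration, not the QEMU macro:)

    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical: pack the coproc and op0 fields of a sysreg key. */
    static uint64_t sysreg_key(unsigned op0)
    {
        const uint64_t coproc = (uint64_t)0x0013 << 16;
        return coproc | ((uint64_t)op0 << 14);
    }

    int main(void)
    {
        /* 14 << 14 sets bits 15..17; bits 16 and 17 are already set in
         * the coproc field, so only bit 15 survives -- the same key as
         * op0 == 2. */
        assert(sysreg_key(14) == sysreg_key(2));
        return 0;
    }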

Cc: qemu-stable@nongnu.org
Fixes: a1477da3ddeb ("hvf: Add Apple Silicon support")
Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Reviewed-by: Alexander Graf <agraf@csgraf.de>
Message-id: 20240503153453.54389-1-zenghui.yu@linux.dev
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/hvf/hvf.c | 130 +++++++++++++++++++++----------------------
 1 file changed, 65 insertions(+), 65 deletions(-)

diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@ struct hvf_sreg_match {
 };
 
 static struct hvf_sreg_match hvf_sreg_match[] = {
-    { HV_SYS_REG_DBGBVR0_EL1, HVF_SYSREG(0, 0, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR0_EL1, HVF_SYSREG(0, 0, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR0_EL1, HVF_SYSREG(0, 0, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR0_EL1, HVF_SYSREG(0, 0, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR0_EL1, HVF_SYSREG(0, 0, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR0_EL1, HVF_SYSREG(0, 0, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR0_EL1, HVF_SYSREG(0, 0, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR0_EL1, HVF_SYSREG(0, 0, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR1_EL1, HVF_SYSREG(0, 1, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR1_EL1, HVF_SYSREG(0, 1, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR1_EL1, HVF_SYSREG(0, 1, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR1_EL1, HVF_SYSREG(0, 1, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR1_EL1, HVF_SYSREG(0, 1, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR1_EL1, HVF_SYSREG(0, 1, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR1_EL1, HVF_SYSREG(0, 1, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR1_EL1, HVF_SYSREG(0, 1, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR2_EL1, HVF_SYSREG(0, 2, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR2_EL1, HVF_SYSREG(0, 2, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR2_EL1, HVF_SYSREG(0, 2, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR2_EL1, HVF_SYSREG(0, 2, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR2_EL1, HVF_SYSREG(0, 2, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR2_EL1, HVF_SYSREG(0, 2, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR2_EL1, HVF_SYSREG(0, 2, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR2_EL1, HVF_SYSREG(0, 2, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR3_EL1, HVF_SYSREG(0, 3, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR3_EL1, HVF_SYSREG(0, 3, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR3_EL1, HVF_SYSREG(0, 3, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR3_EL1, HVF_SYSREG(0, 3, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR3_EL1, HVF_SYSREG(0, 3, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR3_EL1, HVF_SYSREG(0, 3, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR3_EL1, HVF_SYSREG(0, 3, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR3_EL1, HVF_SYSREG(0, 3, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR4_EL1, HVF_SYSREG(0, 4, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR4_EL1, HVF_SYSREG(0, 4, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR4_EL1, HVF_SYSREG(0, 4, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR4_EL1, HVF_SYSREG(0, 4, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR4_EL1, HVF_SYSREG(0, 4, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR4_EL1, HVF_SYSREG(0, 4, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR4_EL1, HVF_SYSREG(0, 4, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR4_EL1, HVF_SYSREG(0, 4, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR5_EL1, HVF_SYSREG(0, 5, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR5_EL1, HVF_SYSREG(0, 5, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR5_EL1, HVF_SYSREG(0, 5, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR5_EL1, HVF_SYSREG(0, 5, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR5_EL1, HVF_SYSREG(0, 5, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR5_EL1, HVF_SYSREG(0, 5, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR5_EL1, HVF_SYSREG(0, 5, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR5_EL1, HVF_SYSREG(0, 5, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR6_EL1, HVF_SYSREG(0, 6, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR6_EL1, HVF_SYSREG(0, 6, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR6_EL1, HVF_SYSREG(0, 6, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR6_EL1, HVF_SYSREG(0, 6, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR6_EL1, HVF_SYSREG(0, 6, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR6_EL1, HVF_SYSREG(0, 6, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR6_EL1, HVF_SYSREG(0, 6, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR6_EL1, HVF_SYSREG(0, 6, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR7_EL1, HVF_SYSREG(0, 7, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR7_EL1, HVF_SYSREG(0, 7, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR7_EL1, HVF_SYSREG(0, 7, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR7_EL1, HVF_SYSREG(0, 7, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR7_EL1, HVF_SYSREG(0, 7, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR7_EL1, HVF_SYSREG(0, 7, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR7_EL1, HVF_SYSREG(0, 7, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR7_EL1, HVF_SYSREG(0, 7, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR8_EL1, HVF_SYSREG(0, 8, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR8_EL1, HVF_SYSREG(0, 8, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR8_EL1, HVF_SYSREG(0, 8, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR8_EL1, HVF_SYSREG(0, 8, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR8_EL1, HVF_SYSREG(0, 8, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR8_EL1, HVF_SYSREG(0, 8, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR8_EL1, HVF_SYSREG(0, 8, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR8_EL1, HVF_SYSREG(0, 8, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR9_EL1, HVF_SYSREG(0, 9, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR9_EL1, HVF_SYSREG(0, 9, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR9_EL1, HVF_SYSREG(0, 9, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR9_EL1, HVF_SYSREG(0, 9, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR9_EL1, HVF_SYSREG(0, 9, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR9_EL1, HVF_SYSREG(0, 9, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR9_EL1, HVF_SYSREG(0, 9, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR9_EL1, HVF_SYSREG(0, 9, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR10_EL1, HVF_SYSREG(0, 10, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR10_EL1, HVF_SYSREG(0, 10, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR10_EL1, HVF_SYSREG(0, 10, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR10_EL1, HVF_SYSREG(0, 10, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR10_EL1, HVF_SYSREG(0, 10, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR10_EL1, HVF_SYSREG(0, 10, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR10_EL1, HVF_SYSREG(0, 10, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR10_EL1, HVF_SYSREG(0, 10, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR11_EL1, HVF_SYSREG(0, 11, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR11_EL1, HVF_SYSREG(0, 11, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR11_EL1, HVF_SYSREG(0, 11, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR11_EL1, HVF_SYSREG(0, 11, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR11_EL1, HVF_SYSREG(0, 11, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR11_EL1, HVF_SYSREG(0, 11, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR11_EL1, HVF_SYSREG(0, 11, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR11_EL1, HVF_SYSREG(0, 11, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR12_EL1, HVF_SYSREG(0, 12, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR12_EL1, HVF_SYSREG(0, 12, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR12_EL1, HVF_SYSREG(0, 12, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR12_EL1, HVF_SYSREG(0, 12, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR12_EL1, HVF_SYSREG(0, 12, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR12_EL1, HVF_SYSREG(0, 12, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR12_EL1, HVF_SYSREG(0, 12, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR12_EL1, HVF_SYSREG(0, 12, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR13_EL1, HVF_SYSREG(0, 13, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR13_EL1, HVF_SYSREG(0, 13, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR13_EL1, HVF_SYSREG(0, 13, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR13_EL1, HVF_SYSREG(0, 13, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR13_EL1, HVF_SYSREG(0, 13, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR13_EL1, HVF_SYSREG(0, 13, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR13_EL1, HVF_SYSREG(0, 13, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR13_EL1, HVF_SYSREG(0, 13, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR14_EL1, HVF_SYSREG(0, 14, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR14_EL1, HVF_SYSREG(0, 14, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR14_EL1, HVF_SYSREG(0, 14, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR14_EL1, HVF_SYSREG(0, 14, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR14_EL1, HVF_SYSREG(0, 14, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR14_EL1, HVF_SYSREG(0, 14, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR14_EL1, HVF_SYSREG(0, 14, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR14_EL1, HVF_SYSREG(0, 14, 2, 0, 7) },
 
-    { HV_SYS_REG_DBGBVR15_EL1, HVF_SYSREG(0, 15, 14, 0, 4) },
-    { HV_SYS_REG_DBGBCR15_EL1, HVF_SYSREG(0, 15, 14, 0, 5) },
-    { HV_SYS_REG_DBGWVR15_EL1, HVF_SYSREG(0, 15, 14, 0, 6) },
-    { HV_SYS_REG_DBGWCR15_EL1, HVF_SYSREG(0, 15, 14, 0, 7) },
+    { HV_SYS_REG_DBGBVR15_EL1, HVF_SYSREG(0, 15, 2, 0, 4) },
+    { HV_SYS_REG_DBGBCR15_EL1, HVF_SYSREG(0, 15, 2, 0, 5) },
+    { HV_SYS_REG_DBGWVR15_EL1, HVF_SYSREG(0, 15, 2, 0, 6) },
+    { HV_SYS_REG_DBGWCR15_EL1, HVF_SYSREG(0, 15, 2, 0, 7) },
 
 #ifdef SYNC_NO_RAW_REGS
     /*
@@ -XXX,XX +XXX,XX @@ static struct hvf_sreg_match hvf_sreg_match[] = {
     { HV_SYS_REG_MPIDR_EL1, HVF_SYSREG(0, 0, 3, 0, 5) },
     { HV_SYS_REG_ID_AA64PFR0_EL1, HVF_SYSREG(0, 4, 3, 0, 0) },
 #endif
-    { HV_SYS_REG_ID_AA64PFR1_EL1, HVF_SYSREG(0, 4, 3, 0, 2) },
+    { HV_SYS_REG_ID_AA64PFR1_EL1, HVF_SYSREG(0, 4, 3, 0, 1) },
     { HV_SYS_REG_ID_AA64DFR0_EL1, HVF_SYSREG(0, 5, 3, 0, 0) },
     { HV_SYS_REG_ID_AA64DFR1_EL1, HVF_SYSREG(0, 5, 3, 0, 1) },
     { HV_SYS_REG_ID_AA64ISAR0_EL1, HVF_SYSREG(0, 6, 3, 0, 0) },
--
2.34.1

From: Dorjoy Chowdhury <dorjoychy111@gmail.com>

The value of the mp-affinity property being set in npcm7xx_realize is
always the same as the default value it would have when arm_cpu_realizefn
is called if the property is not set here. So there is no need to set
the property value in the npcm7xx_realize function.
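
(For reference, a sketch of the affinity computation both code paths end
up performing; build_mp_affinity() is an illustrative stand-in for QEMU's
arm_build_mp_affinity(), assuming Aff0 sits in bits [7:0] and Aff1 in
bits [15:8]:)

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative stand-in: pack a linear CPU index into MPIDR-style
     * Aff1/Aff0 fields for a given cluster size. */
    static uint64_t build_mp_affinity(int idx, uint8_t clustersz)
    {
        uint32_t aff1 = idx / clustersz;
        uint32_t aff0 = idx % clustersz;
        return ((uint64_t)aff1 << 8) | aff0;
    }

    int main(void)
    {
        /* The same inputs on both paths give the same value, which is
         * why the explicit property set was redundant. */
        printf("0x%" PRIx64 "\n", build_mp_affinity(1, 4));
        return 0;
    }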

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240504141733.14813-1-dorjoychy111@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/npcm7xx.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -XXX,XX +XXX,XX @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 
     /* CPUs */
     for (i = 0; i < nc->num_cpus; i++) {
-        object_property_set_int(OBJECT(&s->cpu[i]), "mp-affinity",
-                                arm_build_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS),
-                                &error_abort);
         object_property_set_int(OBJECT(&s->cpu[i]), "reset-cbar",
                                 NPCM7XX_GIC_CPU_IF_ADDR, &error_abort);
         object_property_set_bool(OBJECT(&s->cpu[i]), "reset-hivecs", true,
--
2.34.1

From: Inès Varhol <ines.varhol@telecom-paris.fr>

Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Message-id: 20240505141613.387508-1-ines.varhol@telecom-paris.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/char/stm32l4x5_usart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/char/stm32l4x5_usart.c b/hw/char/stm32l4x5_usart.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/char/stm32l4x5_usart.c
+++ b/hw/char/stm32l4x5_usart.c
@@ -XXX,XX +XXX,XX @@ REG32(CR1, 0x00)
     FIELD(CR1, UE, 0, 1)    /* USART enable */
 REG32(CR2, 0x04)
     FIELD(CR2, ADD_1, 28, 4)    /* ADD[7:4] */
-    FIELD(CR2, ADD_0, 24, 1)    /* ADD[3:0] */
+    FIELD(CR2, ADD_0, 24, 4)    /* ADD[3:0] */
     FIELD(CR2, RTOEN, 23, 1)    /* Receiver timeout enable */
     FIELD(CR2, ABRMOD, 21, 2)   /* Auto baud rate mode */
     FIELD(CR2, ABREN, 20, 1)    /* Auto baud rate enable */
--
2.34.1

From: Andrey Shumilin <shum.sdl@nppct.ru>

In gic_cpu_read() and gic_cpu_write(), we delegate the handling of
reading and writing the Non-Secure view of the GICC_APR<n> registers
to functions gic_apr_ns_view() and gic_apr_write_ns_view().
Unfortunately we got the order of the arguments wrong, swapping the
CPU number and the register number (which the compiler doesn't catch
because they're both integers).
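
(A side note on the "both integers" trap: a sketch, not QEMU code, of how
distinct wrapper types would have turned the swap into a compile error:)

    #include <stdio.h>

    /* Hypothetical wrappers: give the two indices incompatible types. */
    typedef struct { int v; } CpuIndex;
    typedef struct { int v; } RegIndex;

    static int apr_ns_view(CpuIndex cpu, RegIndex regno)
    {
        return cpu.v * 100 + regno.v;  /* stand-in for the real lookup */
    }

    int main(void)
    {
        CpuIndex cpu = { 1 };
        RegIndex regno = { 0 };
        printf("%d\n", apr_ns_view(cpu, regno));
        /* apr_ns_view(regno, cpu) would now fail to compile. */
        return 0;
    }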
9
Signed-off-by: Rene Stange <rsta2@o2online.de>
10
Most guests probably didn't notice this bug because directly
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
accessing the APR registers is typically something only done by
12
firmware when it is doing state save for going into a sleep mode.
13
14
Correct the mismatched call arguments.
15
16
Found by Linux Verification Center (linuxtesting.org) with SVACE.
17
18
Cc: qemu-stable@nongnu.org
19
Fixes: 51fd06e0ee ("hw/intc/arm_gic: Fix handling of GICC_APR<n>, GICC_NSAPR<n> registers")
20
Signed-off-by: Andrey Shumilin <shum.sdl@nppct.ru>
21
[PMM: Rewrote commit message]
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
24
Reviewed-by: Alex Bennée<alex.bennee@linaro.org>
12
---
25
---
13
hw/dma/bcm2835_dma.c | 4 ++--
26
hw/intc/arm_gic.c | 4 ++--
14
1 file changed, 2 insertions(+), 2 deletions(-)
27
1 file changed, 2 insertions(+), 2 deletions(-)
15
28
16
diff --git a/hw/dma/bcm2835_dma.c b/hw/dma/bcm2835_dma.c
29
diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
17
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/dma/bcm2835_dma.c
31
--- a/hw/intc/arm_gic.c
19
+++ b/hw/dma/bcm2835_dma.c
32
+++ b/hw/intc/arm_gic.c
20
@@ -XXX,XX +XXX,XX @@ static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
33
@@ -XXX,XX +XXX,XX @@ static MemTxResult gic_cpu_read(GICState *s, int cpu, int offset,
21
ch->stride = ldl_le_phys(&s->dma_as, ch->conblk_ad + 16);
34
*data = s->h_apr[gic_get_vcpu_real_id(cpu)];
22
ch->nextconbk = ldl_le_phys(&s->dma_as, ch->conblk_ad + 20);
35
} else if (gic_cpu_ns_access(s, cpu, attrs)) {
23
36
/* NS view of GICC_APR<n> is the top half of GIC_NSAPR<n> */
24
+ ylen = 1;
37
- *data = gic_apr_ns_view(s, regno, cpu);
25
if (ch->ti & BCM2708_DMA_TDMODE) {
38
+ *data = gic_apr_ns_view(s, cpu, regno);
26
/* 2D transfer mode */
27
- ylen = (ch->txfr_len >> 16) & 0x3fff;
28
+ ylen += (ch->txfr_len >> 16) & 0x3fff;
29
xlen = ch->txfr_len & 0xffff;
30
dst_stride = ch->stride >> 16;
31
src_stride = ch->stride & 0xffff;
32
} else {
39
} else {
33
- ylen = 1;
40
*data = s->apr[regno][cpu];
34
xlen = ch->txfr_len;
41
}
35
dst_stride = 0;
42
@@ -XXX,XX +XXX,XX @@ static MemTxResult gic_cpu_write(GICState *s, int cpu, int offset,
36
src_stride = 0;
43
s->h_apr[gic_get_vcpu_real_id(cpu)] = value;
44
} else if (gic_cpu_ns_access(s, cpu, attrs)) {
45
/* NS view of GICC_APR<n> is the top half of GIC_NSAPR<n> */
46
- gic_apr_write_ns_view(s, regno, cpu, value);
47
+ gic_apr_write_ns_view(s, cpu, regno, value);
48
} else {
49
s->apr[regno][cpu] = value;
50
}
37
--
51
--
38
2.20.1
52
2.34.1
39
53
40
54
diff view generated by jsdifflib
From: Philippe Mathieu-Daudé <philmd@linaro.org>

Check the function index is in range and use an unsigned
variable to avoid the following warning with GCC 13.2.0:

  [666/5358] Compiling C object libcommon.fa.p/hw_input_tsc2005.c.o
  hw/input/tsc2005.c: In function 'tsc2005_timer_tick':
  hw/input/tsc2005.c:416:26: warning: array subscript has type 'char' [-Wchar-subscripts]
    416 |         s->dav |= mode_regs[s->function];
        |                            ~^~~~~~~~~~
11
12
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
13
Message-id: 20240508143513.44996-1-philmd@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
15
[PMM: fixed missing ')']
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-35-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
17
---
12
target/arm/pauth_helper.c | 5 ++++-
18
hw/input/tsc2005.c | 5 ++++-
13
1 file changed, 4 insertions(+), 1 deletion(-)
19
1 file changed, 4 insertions(+), 1 deletion(-)
14
20
15
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
21
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
16
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/pauth_helper.c
23
--- a/hw/input/tsc2005.c
18
+++ b/target/arm/pauth_helper.c
24
+++ b/hw/input/tsc2005.c
19
@@ -XXX,XX +XXX,XX @@ static void pauth_check_trap(CPUARMState *env, int el, uintptr_t ra)
25
@@ -XXX,XX +XXX,XX @@ uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len)
20
if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
26
static void tsc2005_timer_tick(void *opaque)
21
uint64_t hcr = arm_hcr_el2_eff(env);
27
{
22
bool trap = !(hcr & HCR_API);
28
TSC2005State *s = opaque;
23
- /* FIXME: ARMv8.1-VHE: trap only applies to EL1&0 regime. */
29
+ unsigned int function = s->function;
24
+ if (el == 0) {
30
+
25
+ /* Trap only applies to EL1&0 regime. */
31
+ assert(function < ARRAY_SIZE(mode_regs));
26
+ trap &= (hcr & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE);
32
27
+ }
33
/* Timer ticked -- a set of conversions has been finished. */
28
/* FIXME: ARMv8.3-NV: HCR_NV trap takes precedence for ERETA[AB]. */
34
29
if (trap) {
35
@@ -XXX,XX +XXX,XX @@ static void tsc2005_timer_tick(void *opaque)
30
pauth_trap(env, 2, ra);
36
return;
37
38
s->busy = false;
39
- s->dav |= mode_regs[s->function];
40
+ s->dav |= mode_regs[function];
41
s->function = -1;
42
tsc2005_pin_update(s);
43
}
31
--
44
--
32
2.20.1
45
2.34.1
33
46
34
47
diff view generated by jsdifflib
From: Tanmay Patil <tanmaynpatil105@gmail.com>

Some of the source files for older devices use hardcoded tabs
instead of our current coding standard's required spaces.
Fix these in the following files:
    - hw/arm/boot.c
    - hw/char/omap_uart.c
    - hw/gpio/zaurus.c
    - hw/input/tsc2005.c

This commit is mostly whitespace-only changes; it also
adds curly-braces to some 'if' statements.

This addresses part of https://gitlab.com/qemu-project/qemu/-/issues/373
but some other files remain to be handled.

Signed-off-by: Tanmay Patil <tanmaynpatil105@gmail.com>
Message-id: 20240508081502.88375-1-tanmaynpatil105@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/boot.c       |   8 +--
 hw/char/omap_uart.c |  49 +++++++--------
 hw/gpio/zaurus.c    |  59 ++++++++----------
 hw/input/tsc2005.c  | 130 ++++++++++++++++++++++++--------------------
 4 files changed, 130 insertions(+), 116 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args_old(const struct arm_boot_info *info,
     WRITE_WORD(p, info->ram_size / 4096);
     /* ramdisk_size */
     WRITE_WORD(p, 0);
-#define FLAG_READONLY    1
-#define FLAG_RDLOAD    4
-#define FLAG_RDPROMPT    8
+#define FLAG_READONLY 1
+#define FLAG_RDLOAD 4
+#define FLAG_RDPROMPT 8
     /* flags */
     WRITE_WORD(p, FLAG_READONLY | FLAG_RDLOAD | FLAG_RDPROMPT);
     /* rootdev */
-    WRITE_WORD(p, (31 << 8) | 0);    /* /dev/mtdblock0 */
+    WRITE_WORD(p, (31 << 8) | 0); /* /dev/mtdblock0 */
     /* video_num_cols */
     WRITE_WORD(p, 0);
     /* video_num_rows */
diff --git a/hw/char/omap_uart.c b/hw/char/omap_uart.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/char/omap_uart.c
+++ b/hw/char/omap_uart.c
@@ -XXX,XX +XXX,XX @@ struct omap_uart_s *omap_uart_init(hwaddr base,
     s->fclk = fclk;
     s->irq = irq;
     s->serial = serial_mm_init(get_system_memory(), base, 2, irq,
-                               omap_clk_getrate(fclk)/16,
+                               omap_clk_getrate(fclk) / 16,
                                chr ?: qemu_chr_new(label, "null", NULL),
                                DEVICE_NATIVE_ENDIAN);
     return s;
@@ -XXX,XX +XXX,XX @@ static uint64_t omap_uart_read(void *opaque, hwaddr addr, unsigned size)
     }
 
     switch (addr) {
-    case 0x20:    /* MDR1 */
+    case 0x20: /* MDR1 */
         return s->mdr[0];
-    case 0x24:    /* MDR2 */
+    case 0x24: /* MDR2 */
         return s->mdr[1];
-    case 0x40:    /* SCR */
+    case 0x40: /* SCR */
         return s->scr;
-    case 0x44:    /* SSR */
+    case 0x44: /* SSR */
         return 0x0;
-    case 0x48:    /* EBLR (OMAP2) */
+    case 0x48: /* EBLR (OMAP2) */
         return s->eblr;
-    case 0x4C:    /* OSC_12M_SEL (OMAP1) */
+    case 0x4C: /* OSC_12M_SEL (OMAP1) */
         return s->clksel;
-    case 0x50:    /* MVR */
+    case 0x50: /* MVR */
         return 0x30;
-    case 0x54:    /* SYSC (OMAP2) */
+    case 0x54: /* SYSC (OMAP2) */
         return s->syscontrol;
-    case 0x58:    /* SYSS (OMAP2) */
+    case 0x58: /* SYSS (OMAP2) */
         return 1;
-    case 0x5c:    /* WER (OMAP2) */
+    case 0x5c: /* WER (OMAP2) */
         return s->wkup;
-    case 0x60:    /* CFPS (OMAP2) */
+    case 0x60: /* CFPS (OMAP2) */
         return s->cfps;
     }
 
@@ -XXX,XX +XXX,XX @@ static void omap_uart_write(void *opaque, hwaddr addr,
     }
 
     switch (addr) {
-    case 0x20:    /* MDR1 */
+    case 0x20: /* MDR1 */
         s->mdr[0] = value & 0x7f;
         break;
-    case 0x24:    /* MDR2 */
+    case 0x24: /* MDR2 */
         s->mdr[1] = value & 0xff;
         break;
-    case 0x40:    /* SCR */
+    case 0x40: /* SCR */
         s->scr = value & 0xff;
         break;
-    case 0x48:    /* EBLR (OMAP2) */
+    case 0x48: /* EBLR (OMAP2) */
         s->eblr = value & 0xff;
         break;
-    case 0x4C:    /* OSC_12M_SEL (OMAP1) */
+    case 0x4C: /* OSC_12M_SEL (OMAP1) */
         s->clksel = value & 1;
         break;
-    case 0x44:    /* SSR */
-    case 0x50:    /* MVR */
-    case 0x58:    /* SYSS (OMAP2) */
+    case 0x44: /* SSR */
+    case 0x50: /* MVR */
+    case 0x58: /* SYSS (OMAP2) */
         OMAP_RO_REG(addr);
         break;
-    case 0x54:    /* SYSC (OMAP2) */
+    case 0x54: /* SYSC (OMAP2) */
         s->syscontrol = value & 0x1d;
-        if (value & 2)
+        if (value & 2) {
             omap_uart_reset(s);
+        }
         break;
-    case 0x5c:    /* WER (OMAP2) */
+    case 0x5c: /* WER (OMAP2) */
         s->wkup = value & 0x7f;
         break;
-    case 0x60:    /* CFPS (OMAP2) */
+    case 0x60: /* CFPS (OMAP2) */
         s->cfps = value & 0xff;
         break;
     default:
diff --git a/hw/gpio/zaurus.c b/hw/gpio/zaurus.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/gpio/zaurus.c
+++ b/hw/gpio/zaurus.c
@@ -XXX,XX +XXX,XX @@ struct ScoopInfo {
     uint16_t isr;
 };
 
-#define SCOOP_MCR    0x00
-#define SCOOP_CDR    0x04
-#define SCOOP_CSR    0x08
-#define SCOOP_CPR    0x0c
-#define SCOOP_CCR    0x10
-#define SCOOP_IRR_IRM    0x14
-#define SCOOP_IMR    0x18
-#define SCOOP_ISR    0x1c
-#define SCOOP_GPCR    0x20
-#define SCOOP_GPWR    0x24
-#define SCOOP_GPRR    0x28
+#define SCOOP_MCR 0x00
+#define SCOOP_CDR 0x04
+#define SCOOP_CSR 0x08
+#define SCOOP_CPR 0x0c
+#define SCOOP_CCR 0x10
+#define SCOOP_IRR_IRM 0x14
+#define SCOOP_IMR 0x18
+#define SCOOP_ISR 0x1c
+#define SCOOP_GPCR 0x20
+#define SCOOP_GPWR 0x24
+#define SCOOP_GPRR 0x28
 
-static inline void scoop_gpio_handler_update(ScoopInfo *s) {
+static inline void scoop_gpio_handler_update(ScoopInfo *s)
+{
     uint32_t level, diff;
     int bit;
     level = s->gpio_level & s->gpio_dir;
@@ -XXX,XX +XXX,XX @@ static void scoop_write(void *opaque, hwaddr addr,
         break;
     case SCOOP_CPR:
         s->power = value;
-        if (value & 0x80)
+        if (value & 0x80) {
             s->power |= 0x8040;
+        }
         break;
     case SCOOP_CCR:
         s->ccr = value;
@@ -XXX,XX +XXX,XX @@ static void scoop_write(void *opaque, hwaddr addr,
         scoop_gpio_handler_update(s);
         break;
     case SCOOP_GPWR:
-    case SCOOP_GPRR:    /* GPRR is probably R/O in real HW */
+    case SCOOP_GPRR: /* GPRR is probably R/O in real HW */
         s->gpio_level = value & s->gpio_dir;
         scoop_gpio_handler_update(s);
         break;
@@ -XXX,XX +XXX,XX @@ static void scoop_gpio_set(void *opaque, int line, int level)
 {
     ScoopInfo *s = (ScoopInfo *) opaque;
 
-    if (level)
+    if (level) {
         s->gpio_level |= (1 << line);
-    else
+    } else {
         s->gpio_level &= ~(1 << line);
+    }
 }
 
-
 static void scoop_init(Object *obj)
@@ -XXX,XX +XXX,XX @@ static int scoop_post_load(void *opaque, int version_id)
     return 0;
 }
 
-static bool is_version_0 (void *opaque, int version_id)
+static bool is_version_0(void *opaque, int version_id)
 {
     return version_id == 0;
 }
@@ -XXX,XX +XXX,XX @@ type_init(scoop_register_types)
 
 /* Write the bootloader parameters memory area. */
 
-#define MAGIC_CHG(a, b, c, d)    ((d << 24) | (c << 16) | (b << 8) | a)
+#define MAGIC_CHG(a, b, c, d) ((d << 24) | (c << 16) | (b << 8) | a)
 
 static struct QEMU_PACKED sl_param_info {
     uint32_t comadj_keyword;
@@ -XXX,XX +XXX,XX @@ static struct QEMU_PACKED sl_param_info {
     uint32_t phad_keyword;
     int32_t phadadj;
 } zaurus_bootparam = {
-    .comadj_keyword    = MAGIC_CHG('C', 'M', 'A', 'D'),
-    .comadj        = 125,
-    .uuid_keyword    = MAGIC_CHG('U', 'U', 'I', 'D'),
-    .uuid        = { -1 },
-    .touch_keyword    = MAGIC_CHG('T', 'U', 'C', 'H'),
-    .touch_xp        = -1,
-    .adadj_keyword    = MAGIC_CHG('B', 'V', 'A', 'D'),
-    .adadj        = -1,
-    .phad_keyword    = MAGIC_CHG('P', 'H', 'A', 'D'),
-    .phadadj        = 0x01,
+    .comadj_keyword = MAGIC_CHG('C', 'M', 'A', 'D'),
+    .comadj = 125,
+    .uuid_keyword = MAGIC_CHG('U', 'U', 'I', 'D'),
+    .uuid = { -1 },
+    .touch_keyword = MAGIC_CHG('T', 'U', 'C', 'H'),
+    .touch_xp = -1,
+    .adadj_keyword = MAGIC_CHG('B', 'V', 'A', 'D'),
+    .adadj = -1,
+    .phad_keyword = MAGIC_CHG('P', 'H', 'A', 'D'),
+    .phadadj = 0x01,
 };
 
 void sl_bootparam_write(hwaddr ptr)
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/input/tsc2005.c
+++ b/hw/input/tsc2005.c
@@ -XXX,XX +XXX,XX @@
 #include "migration/vmstate.h"
 #include "trace.h"
 
-#define TSC_CUT_RESOLUTION(value, p)    ((value) >> (16 - (p ? 12 : 10)))
+#define TSC_CUT_RESOLUTION(value, p) ((value) >> (16 - (p ? 12 : 10)))
 
 typedef struct {
-    qemu_irq pint;    /* Combination of the nPENIRQ and DAV signals */
+    qemu_irq pint; /* Combination of the nPENIRQ and DAV signals */
     QEMUTimer *timer;
     uint16_t model;
@@ -XXX,XX +XXX,XX @@ typedef struct {
 } TSC2005State;
 
 enum {
-    TSC_MODE_XYZ_SCAN    = 0x0,
+    TSC_MODE_XYZ_SCAN = 0x0,
     TSC_MODE_XY_SCAN,
     TSC_MODE_X,
     TSC_MODE_Y,
@@ -XXX,XX +XXX,XX @@ enum {
 };
 
 static const uint16_t mode_regs[16] = {
-    0xf000,    /* X, Y, Z scan */
-    0xc000,    /* X, Y scan */
-    0x8000,    /* X */
-    0x4000,    /* Y */
-    0x3000,    /* Z */
-    0x0800,    /* AUX */
-    0x0400,    /* TEMP1 */
-    0x0200,    /* TEMP2 */
-    0x0800,    /* AUX scan */
-    0x0040,    /* X test */
-    0x0020,    /* Y test */
-    0x0080,    /* Short-circuit test */
-    0x0000,    /* Reserved */
-    0x0000,    /* X+, X- drivers */
-    0x0000,    /* Y+, Y- drivers */
-    0x0000,    /* Y+, X- drivers */
+    0xf000, /* X, Y, Z scan */
+    0xc000, /* X, Y scan */
+    0x8000, /* X */
+    0x4000, /* Y */
+    0x3000, /* Z */
+    0x0800, /* AUX */
+    0x0400, /* TEMP1 */
+    0x0200, /* TEMP2 */
+    0x0800, /* AUX scan */
+    0x0040, /* X test */
+    0x0020, /* Y test */
+    0x0080, /* Short-circuit test */
+    0x0000, /* Reserved */
+    0x0000, /* X+, X- drivers */
+    0x0000, /* Y+, Y- drivers */
+    0x0000, /* Y+, X- drivers */
 };
 
-#define X_TRANSFORM(s)            \
+#define X_TRANSFORM(s) \
     ((s->y * s->tr[0] - s->x * s->tr[1]) / s->tr[2] + s->tr[3])
-#define Y_TRANSFORM(s)            \
+#define Y_TRANSFORM(s) \
     ((s->y * s->tr[4] - s->x * s->tr[5]) / s->tr[6] + s->tr[7])
-#define Z1_TRANSFORM(s)            \
+#define Z1_TRANSFORM(s) \
     ((400 - ((s)->x >> 7) + ((s)->pressure << 10)) << 4)
-#define Z2_TRANSFORM(s)            \
+#define Z2_TRANSFORM(s) \
     ((4000 + ((s)->y >> 7) - ((s)->pressure << 10)) << 4)
 
-#define AUX_VAL                (700 << 4)    /* +/- 3 at 12-bit */
-#define TEMP1_VAL            (1264 << 4)    /* +/- 5 at 12-bit */
-#define TEMP2_VAL            (1531 << 4)    /* +/- 5 at 12-bit */
+#define AUX_VAL (700 << 4) /* +/- 3 at 12-bit */
+#define TEMP1_VAL (1264 << 4) /* +/- 5 at 12-bit */
+#define TEMP2_VAL (1531 << 4) /* +/- 5 at 12-bit */
 
 static uint16_t tsc2005_read(TSC2005State *s, int reg)
 {
     uint16_t ret;
 
     switch (reg) {
-    case 0x0:    /* X */
+    case 0x0: /* X */
         s->dav &= ~mode_regs[TSC_MODE_X];
         return TSC_CUT_RESOLUTION(X_TRANSFORM(s), s->precision) +
                 (s->noise & 3);
-    case 0x1:    /* Y */
+    case 0x1: /* Y */
         s->dav &= ~mode_regs[TSC_MODE_Y];
-        s->noise ++;
+        s->noise++;
         return TSC_CUT_RESOLUTION(Y_TRANSFORM(s), s->precision) ^
                 (s->noise & 3);
-    case 0x2:    /* Z1 */
+    case 0x2: /* Z1 */
         s->dav &= 0xdfff;
         return TSC_CUT_RESOLUTION(Z1_TRANSFORM(s), s->precision) -
                 (s->noise & 3);
-    case 0x3:    /* Z2 */
+    case 0x3: /* Z2 */
         s->dav &= 0xefff;
         return TSC_CUT_RESOLUTION(Z2_TRANSFORM(s), s->precision) |
                 (s->noise & 3);
 
-    case 0x4:    /* AUX */
+    case 0x4: /* AUX */
         s->dav &= ~mode_regs[TSC_MODE_AUX];
         return TSC_CUT_RESOLUTION(AUX_VAL, s->precision);
 
-    case 0x5:    /* TEMP1 */
+    case 0x5: /* TEMP1 */
         s->dav &= ~mode_regs[TSC_MODE_TEMP1];
         return TSC_CUT_RESOLUTION(TEMP1_VAL, s->precision) -
                 (s->noise & 5);
-    case 0x6:    /* TEMP2 */
+    case 0x6: /* TEMP2 */
         s->dav &= 0xdfff;
         s->dav &= ~mode_regs[TSC_MODE_TEMP2];
         return TSC_CUT_RESOLUTION(TEMP2_VAL, s->precision) ^
                 (s->noise & 3);
 
-    case 0x7:    /* Status */
+    case 0x7: /* Status */
         ret = s->dav | (s->reset << 7) | (s->pdst << 2) | 0x0;
         s->dav &= ~(mode_regs[TSC_MODE_X_TEST] | mode_regs[TSC_MODE_Y_TEST] |
                     mode_regs[TSC_MODE_TS_TEST]);
         s->reset = true;
         return ret;
 
-    case 0x8: /* AUX high threshold */
+    case 0x8: /* AUX high threshold */
         return s->aux_thr[1];
-    case 0x9: /* AUX low threshold */
+    case 0x9: /* AUX low threshold */
         return s->aux_thr[0];
 
-    case 0xa: /* TEMP high threshold */
+    case 0xa: /* TEMP high threshold */
         return s->temp_thr[1];
-    case 0xb: /* TEMP low threshold */
+    case 0xb: /* TEMP low threshold */
         return s->temp_thr[0];
 
-    case 0xc:    /* CFR0 */
+    case 0xc: /* CFR0 */
         return (s->pressure << 15) | ((!s->busy) << 14) |
-                (s->nextprecision << 13) | s->timing[0];
-    case 0xd:    /* CFR1 */
+               (s->nextprecision << 13) | s->timing[0];
+    case 0xd: /* CFR1 */
         return s->timing[1];
-    case 0xe:    /* CFR2 */
+    case 0xe: /* CFR2 */
         return (s->pin_func << 14) | s->filter;
 
-    case 0xf:    /* Function select status */
+    case 0xf: /* Function select status */
         return s->function >= 0 ? 1 << s->function : 0;
     }
 
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
         s->temp_thr[0] = data;
         break;
 
-    case 0xc:    /* CFR0 */
+    case 0xc: /* CFR0 */
         s->host_mode = (data >> 15) != 0;
         if (s->enabled != !(data & 0x4000)) {
             s->enabled = !(data & 0x4000);
             trace_tsc2005_sense(s->enabled ? "enabled" : "disabled");
-            if (s->busy && !s->enabled)
+            if (s->busy && !s->enabled) {
                 timer_del(s->timer);
+            }
             s->busy = s->busy && s->enabled;
         }
         s->nextprecision = (data >> 13) & 1;
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
                       "tsc2005_write: illegal conversion clock setting\n");
         }
         break;
-    case 0xd:    /* CFR1 */
+    case 0xd: /* CFR1 */
         s->timing[1] = data & 0xf07;
         break;
-    case 0xe:    /* CFR2 */
+    case 0xe: /* CFR2 */
         s->pin_func = (data >> 14) & 3;
         s->filter = data & 0x3fff;
         break;
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
     switch (s->nextfunction) {
     case TSC_MODE_XYZ_SCAN:
     case TSC_MODE_XY_SCAN:
-        if (!s->host_mode && s->dav)
+        if (!s->host_mode && s->dav) {
             s->enabled = false;
-        if (!s->pressure)
+        }
+        if (!s->pressure) {
             return;
+        }
         /* Fall through */
     case TSC_MODE_AUX_SCAN:
         break;
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
     case TSC_MODE_X:
     case TSC_MODE_Y:
     case TSC_MODE_Z:
-        if (!s->pressure)
+        if (!s->pressure) {
             return;
+        }
         /* Fall through */
     case TSC_MODE_AUX:
     case TSC_MODE_TEMP1:
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
     case TSC_MODE_X_TEST:
     case TSC_MODE_Y_TEST:
     case TSC_MODE_TS_TEST:
-        if (s->dav)
+        if (s->dav) {
             s->enabled = false;
+        }
         break;
 
     case TSC_MODE_RESERVED:
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
         return;
     }
 
-    if (!s->enabled || s->busy)
+    if (!s->enabled || s->busy) {
         return;
+    }
 
     s->busy = true;
     s->precision = s->nextprecision;
     s->function = s->nextfunction;
-    s->pdst = !s->pnd0;    /* Synchronised on internal clock */
+    s->pdst = !s->pnd0; /* Synchronised on internal clock */
QEMUTimer *timer;
44
dc->vmsd = &vmstate_stellaris_gptm;
283
uint16_t model;
45
+ dc->realize = stellaris_gptm_realize;
284
285
@@ -XXX,XX +XXX,XX @@ typedef struct {
286
} TSC2005State;
287
288
enum {
289
- TSC_MODE_XYZ_SCAN    = 0x0,
290
+ TSC_MODE_XYZ_SCAN = 0x0,
291
TSC_MODE_XY_SCAN,
292
TSC_MODE_X,
293
TSC_MODE_Y,
294
@@ -XXX,XX +XXX,XX @@ enum {
295
};
296
297
static const uint16_t mode_regs[16] = {
298
- 0xf000,    /* X, Y, Z scan */
299
- 0xc000,    /* X, Y scan */
300
- 0x8000,    /* X */
301
- 0x4000,    /* Y */
302
- 0x3000,    /* Z */
303
- 0x0800,    /* AUX */
304
- 0x0400,    /* TEMP1 */
305
- 0x0200,    /* TEMP2 */
306
- 0x0800,    /* AUX scan */
307
- 0x0040,    /* X test */
308
- 0x0020,    /* Y test */
309
- 0x0080,    /* Short-circuit test */
310
- 0x0000,    /* Reserved */
311
- 0x0000,    /* X+, X- drivers */
312
- 0x0000,    /* Y+, Y- drivers */
313
- 0x0000,    /* Y+, X- drivers */
314
+ 0xf000, /* X, Y, Z scan */
315
+ 0xc000, /* X, Y scan */
316
+ 0x8000, /* X */
317
+ 0x4000, /* Y */
318
+ 0x3000, /* Z */
319
+ 0x0800, /* AUX */
320
+ 0x0400, /* TEMP1 */
321
+ 0x0200, /* TEMP2 */
322
+ 0x0800, /* AUX scan */
323
+ 0x0040, /* X test */
324
+ 0x0020, /* Y test */
325
+ 0x0080, /* Short-circuit test */
326
+ 0x0000, /* Reserved */
327
+ 0x0000, /* X+, X- drivers */
328
+ 0x0000, /* Y+, Y- drivers */
329
+ 0x0000, /* Y+, X- drivers */
330
};
331
332
-#define X_TRANSFORM(s)            \
333
+#define X_TRANSFORM(s) \
334
((s->y * s->tr[0] - s->x * s->tr[1]) / s->tr[2] + s->tr[3])
335
-#define Y_TRANSFORM(s)            \
336
+#define Y_TRANSFORM(s) \
337
((s->y * s->tr[4] - s->x * s->tr[5]) / s->tr[6] + s->tr[7])
338
-#define Z1_TRANSFORM(s)            \
339
+#define Z1_TRANSFORM(s) \
340
((400 - ((s)->x >> 7) + ((s)->pressure << 10)) << 4)
341
-#define Z2_TRANSFORM(s)            \
342
+#define Z2_TRANSFORM(s) \
343
((4000 + ((s)->y >> 7) - ((s)->pressure << 10)) << 4)
344
345
-#define AUX_VAL                (700 << 4)    /* +/- 3 at 12-bit */
346
-#define TEMP1_VAL            (1264 << 4)    /* +/- 5 at 12-bit */
347
-#define TEMP2_VAL            (1531 << 4)    /* +/- 5 at 12-bit */
348
+#define AUX_VAL (700 << 4) /* +/- 3 at 12-bit */
349
+#define TEMP1_VAL (1264 << 4) /* +/- 5 at 12-bit */
350
+#define TEMP2_VAL (1531 << 4) /* +/- 5 at 12-bit */
351
352
static uint16_t tsc2005_read(TSC2005State *s, int reg)
353
{
354
uint16_t ret;
355
356
switch (reg) {
357
- case 0x0:    /* X */
358
+ case 0x0: /* X */
359
s->dav &= ~mode_regs[TSC_MODE_X];
360
return TSC_CUT_RESOLUTION(X_TRANSFORM(s), s->precision) +
361
(s->noise & 3);
362
- case 0x1:    /* Y */
363
+ case 0x1: /* Y */
364
s->dav &= ~mode_regs[TSC_MODE_Y];
365
- s->noise ++;
366
+ s->noise++;
367
return TSC_CUT_RESOLUTION(Y_TRANSFORM(s), s->precision) ^
368
(s->noise & 3);
369
- case 0x2:    /* Z1 */
370
+ case 0x2: /* Z1 */
371
s->dav &= 0xdfff;
372
return TSC_CUT_RESOLUTION(Z1_TRANSFORM(s), s->precision) -
373
(s->noise & 3);
374
- case 0x3:    /* Z2 */
375
+ case 0x3: /* Z2 */
376
s->dav &= 0xefff;
377
return TSC_CUT_RESOLUTION(Z2_TRANSFORM(s), s->precision) |
378
(s->noise & 3);
379
380
- case 0x4:    /* AUX */
381
+ case 0x4: /* AUX */
382
s->dav &= ~mode_regs[TSC_MODE_AUX];
383
return TSC_CUT_RESOLUTION(AUX_VAL, s->precision);
384
385
- case 0x5:    /* TEMP1 */
386
+ case 0x5: /* TEMP1 */
387
s->dav &= ~mode_regs[TSC_MODE_TEMP1];
388
return TSC_CUT_RESOLUTION(TEMP1_VAL, s->precision) -
389
(s->noise & 5);
390
- case 0x6:    /* TEMP2 */
391
+ case 0x6: /* TEMP2 */
392
s->dav &= 0xdfff;
393
s->dav &= ~mode_regs[TSC_MODE_TEMP2];
394
return TSC_CUT_RESOLUTION(TEMP2_VAL, s->precision) ^
395
(s->noise & 3);
396
397
- case 0x7:    /* Status */
398
+ case 0x7: /* Status */
399
ret = s->dav | (s->reset << 7) | (s->pdst << 2) | 0x0;
400
s->dav &= ~(mode_regs[TSC_MODE_X_TEST] | mode_regs[TSC_MODE_Y_TEST] |
401
mode_regs[TSC_MODE_TS_TEST]);
402
s->reset = true;
403
return ret;
404
405
- case 0x8: /* AUX high threshold */
406
+ case 0x8: /* AUX high threshold */
407
return s->aux_thr[1];
408
- case 0x9: /* AUX low threshold */
409
+ case 0x9: /* AUX low threshold */
410
return s->aux_thr[0];
411
412
- case 0xa: /* TEMP high threshold */
413
+ case 0xa: /* TEMP high threshold */
414
return s->temp_thr[1];
415
- case 0xb: /* TEMP low threshold */
416
+ case 0xb: /* TEMP low threshold */
417
return s->temp_thr[0];
418
419
- case 0xc:    /* CFR0 */
420
+ case 0xc: /* CFR0 */
421
return (s->pressure << 15) | ((!s->busy) << 14) |
422
- (s->nextprecision << 13) | s->timing[0];
423
- case 0xd:    /* CFR1 */
424
+ (s->nextprecision << 13) | s->timing[0];
425
+ case 0xd: /* CFR1 */
426
return s->timing[1];
427
- case 0xe:    /* CFR2 */
428
+ case 0xe: /* CFR2 */
429
return (s->pin_func << 14) | s->filter;
430
431
- case 0xf:    /* Function select status */
432
+ case 0xf: /* Function select status */
433
return s->function >= 0 ? 1 << s->function : 0;
434
}
435
436
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
437
s->temp_thr[0] = data;
438
break;
439
440
- case 0xc:    /* CFR0 */
441
+ case 0xc: /* CFR0 */
442
s->host_mode = (data >> 15) != 0;
443
if (s->enabled != !(data & 0x4000)) {
444
s->enabled = !(data & 0x4000);
445
trace_tsc2005_sense(s->enabled ? "enabled" : "disabled");
446
- if (s->busy && !s->enabled)
447
+ if (s->busy && !s->enabled) {
448
timer_del(s->timer);
449
+ }
450
s->busy = s->busy && s->enabled;
451
}
452
s->nextprecision = (data >> 13) & 1;
453
@@ -XXX,XX +XXX,XX @@ static void tsc2005_write(TSC2005State *s, int reg, uint16_t data)
454
"tsc2005_write: illegal conversion clock setting\n");
455
}
456
break;
457
- case 0xd:    /* CFR1 */
458
+ case 0xd: /* CFR1 */
459
s->timing[1] = data & 0xf07;
460
break;
461
- case 0xe:    /* CFR2 */
462
+ case 0xe: /* CFR2 */
463
s->pin_func = (data >> 14) & 3;
464
s->filter = data & 0x3fff;
465
break;
466
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
467
switch (s->nextfunction) {
468
case TSC_MODE_XYZ_SCAN:
469
case TSC_MODE_XY_SCAN:
470
- if (!s->host_mode && s->dav)
471
+ if (!s->host_mode && s->dav) {
472
s->enabled = false;
473
- if (!s->pressure)
474
+ }
475
+ if (!s->pressure) {
476
return;
477
+ }
478
/* Fall through */
479
case TSC_MODE_AUX_SCAN:
480
break;
481
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
482
case TSC_MODE_X:
483
case TSC_MODE_Y:
484
case TSC_MODE_Z:
485
- if (!s->pressure)
486
+ if (!s->pressure) {
487
return;
488
+ }
489
/* Fall through */
490
case TSC_MODE_AUX:
491
case TSC_MODE_TEMP1:
492
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
493
case TSC_MODE_X_TEST:
494
case TSC_MODE_Y_TEST:
495
case TSC_MODE_TS_TEST:
496
- if (s->dav)
497
+ if (s->dav) {
498
s->enabled = false;
499
+ }
500
break;
501
502
case TSC_MODE_RESERVED:
503
@@ -XXX,XX +XXX,XX @@ static void tsc2005_pin_update(TSC2005State *s)
504
return;
505
}
506
507
- if (!s->enabled || s->busy)
508
+ if (!s->enabled || s->busy) {
509
return;
510
+ }
511
512
s->busy = true;
513
s->precision = s->nextprecision;
514
s->function = s->nextfunction;
515
- s->pdst = !s->pnd0;    /* Synchronised on internal clock */
516
+ s->pdst = !s->pnd0; /* Synchronised on internal clock */
517
expires = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
518
(NANOSECONDS_PER_SECOND >> 7);
519
timer_mod(s->timer, expires);
520
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
521
TSC2005State *s = opaque;
522
uint32_t ret = 0;
523
524
- switch (s->state ++) {
525
+ switch (s->state++) {
526
case 0:
527
if (value & 0x80) {
528
/* Command */
529
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
530
if (s->enabled != !(value & 1)) {
531
s->enabled = !(value & 1);
532
trace_tsc2005_sense(s->enabled ? "enabled" : "disabled");
533
- if (s->busy && !s->enabled)
534
+ if (s->busy && !s->enabled) {
535
timer_del(s->timer);
536
+ }
537
s->busy = s->busy && s->enabled;
538
}
539
tsc2005_pin_update(s);
540
@@ -XXX,XX +XXX,XX @@ static uint8_t tsc2005_txrx_word(void *opaque, uint8_t value)
541
break;
542
543
case 1:
544
- if (s->command)
545
+ if (s->command) {
546
ret = (s->data >> 8) & 0xff;
547
- else
548
+ } else {
549
s->data |= value << 8;
550
+ }
551
break;
552
553
case 2:
554
@@ -XXX,XX +XXX,XX @@ static void tsc2005_timer_tick(void *opaque)
555
556
/* Timer ticked -- a set of conversions has been finished. */
557
558
- if (!s->busy)
559
+ if (!s->busy) {
560
return;
561
+ }
562
563
s->busy = false;
564
s->dav |= mode_regs[function];
565
@@ -XXX,XX +XXX,XX @@ static void tsc2005_touchscreen_event(void *opaque,
566
* signaling TS events immediately, but for now we simulate
567
* the first conversion delay for sake of correctness.
568
*/
569
- if (p != s->pressure)
570
+ if (p != s->pressure) {
571
tsc2005_pin_update(s);
572
+ }
46
}
573
}
47
574
48
static const TypeInfo stellaris_gptm_info = {
575
static int tsc2005_post_load(void *opaque, int version_id)
49
--
576
--
50
2.20.1
577
2.34.1
51
52
1
From: Rene Stange <rsta2@o2online.de>
1
From: Rayhan Faizel <rayhan.faizel@gmail.com>
2
2
3
TD (two dimensions) DMA mode did not work, because the xlen variable
3
None of the RPi boards have ADC on-board. In real life, an external ADC chip
4
was not re-initialized before each additional ylen pass
4
is required to operate on analog signals.
5
in bcm2835_dma_update(). Fix it.
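As an illustrative sketch (simplified from the bcm2835_dma_update()
hunk below, with the per-word copy elided), the fixed loop shape is:

    uint32_t xlen_td = xlen;        /* remember the per-row length */
    while (ylen != 0) {
        /* ... the 1-D transfer for this row consumes xlen ... */
        if (--ylen != 0) {
            ch->source_ad += src_stride;
            ch->dest_ad += dst_stride;
            xlen = xlen_td;         /* re-arm xlen for the next row */
        }
    }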
6
5
7
Signed-off-by: Rene Stange <rsta2@o2online.de>
6
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20240512085716.222326-1-rayhan.faizel@gmail.com
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
10
---
11
hw/dma/bcm2835_dma.c | 4 +++-
11
docs/system/arm/raspi.rst | 1 -
12
1 file changed, 3 insertions(+), 1 deletion(-)
12
1 file changed, 1 deletion(-)
13
13
14
diff --git a/hw/dma/bcm2835_dma.c b/hw/dma/bcm2835_dma.c
14
diff --git a/docs/system/arm/raspi.rst b/docs/system/arm/raspi.rst
15
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
16
--- a/hw/dma/bcm2835_dma.c
16
--- a/docs/system/arm/raspi.rst
17
+++ b/hw/dma/bcm2835_dma.c
17
+++ b/docs/system/arm/raspi.rst
18
@@ -XXX,XX +XXX,XX @@
18
@@ -XXX,XX +XXX,XX @@ Implemented devices
19
static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
19
Missing devices
20
{
20
---------------
21
BCM2835DMAChan *ch = &s->chan[c];
21
22
- uint32_t data, xlen, ylen;
22
- * Analog to Digital Converter (ADC)
23
+ uint32_t data, xlen, xlen_td, ylen;
23
* Pulse Width Modulation (PWM)
24
int16_t dst_stride, src_stride;
24
* PCIE Root Port (raspi4b)
25
25
* GENET Ethernet Controller (raspi4b)
26
if (!(s->enable & (1 << c))) {
27
@@ -XXX,XX +XXX,XX @@ static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
28
dst_stride = 0;
29
src_stride = 0;
30
}
31
+ xlen_td = xlen;
32
33
while (ylen != 0) {
34
/* Normal transfer mode */
35
@@ -XXX,XX +XXX,XX @@ static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
36
if (--ylen != 0) {
37
ch->source_ad += src_stride;
38
ch->dest_ad += dst_stride;
39
+ xlen = xlen_td;
40
}
41
}
42
ch->cs |= BCM2708_DMA_END;
43
--
26
--
44
2.20.1
27
2.34.1
45
28
46
29
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The comment that we don't support EL2 is somewhat out of date.
3
This fixes a bug: neither PLI nor PLDW is present in ARMv6T2,
4
Update to include checks against HCR_EL2.TDZ.
4
but they are introduced with ARMv7 and ARMv7MP respectively.
5
For clarity, do not use NOP for PLD.
5
6
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
7
Note that there is no PLDW (literal). Architecturally in the
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
T1 encoding of "PLD (literal)" bit 5 is "(0)", which means
9
that it should be zero and if it is not then the behaviour
10
is CONSTRAINED UNPREDICTABLE (might UNDEF, NOP, or ignore the
11
value of the bit).
12
13
In our implementation we have patterns for both:
14
15
+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
16
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)
17
18
and so we effectively ignore the value of bit 5. (This is a
19
permitted option for this CONSTRAINED UNPREDICTABLE.) This isn't a
20
behaviour change in this commit, since we previously had NOP lines
21
for both those patterns.
22
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
23
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-24-richard.henderson@linaro.org
24
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
25
Message-id: 20240524232121.284515-3-richard.henderson@linaro.org
26
[PMM: adjusted commit message to note that PLD (lit) T1 bit 5
27
being 1 is an UNPREDICTABLE case.]
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
28
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
29
---
12
target/arm/helper.c | 26 +++++++++++++++++++++-----
30
target/arm/tcg/t32.decode | 25 ++++++++++++-------------
13
1 file changed, 21 insertions(+), 5 deletions(-)
31
target/arm/tcg/translate.c | 4 ++--
32
2 files changed, 14 insertions(+), 15 deletions(-)
14
33
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
34
diff --git a/target/arm/tcg/t32.decode b/target/arm/tcg/t32.decode
16
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.c
36
--- a/target/arm/tcg/t32.decode
18
+++ b/target/arm/helper.c
37
+++ b/target/arm/tcg/t32.decode
19
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
38
@@ -XXX,XX +XXX,XX @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
20
static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
39
# Note that Load, unsigned (literal) overlaps all other load encodings.
21
bool isread)
22
{
40
{
23
- /* We don't implement EL2, so the only control on DC ZVA is the
41
{
24
- * bit in the SCTLR which can prohibit access for EL0.
42
- NOP 1111 1000 -001 1111 1111 ------------ # PLD
25
- */
43
+ PLD 1111 1000 -001 1111 1111 ------------ # (literal)
26
- if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_DZE)) {
44
LDRB_ri 1111 1000 .001 1111 .... ............ @ldst_ri_lit
27
- return CP_ACCESS_TRAP;
45
}
28
+ int cur_el = arm_current_el(env);
46
{
29
+
47
- NOP 1111 1000 1001 ---- 1111 ------------ # PLD
30
+ if (cur_el < 2) {
48
+ PLD 1111 1000 1001 ---- 1111 ------------ # (immediate T1)
31
+ uint64_t hcr = arm_hcr_el2_eff(env);
49
LDRB_ri 1111 1000 1001 .... .... ............ @ldst_ri_pos
32
+
50
}
33
+ if (cur_el == 0) {
51
LDRB_ri 1111 1000 0001 .... .... 1..1 ........ @ldst_ri_idx
34
+ if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
52
{
35
+ if (!(env->cp15.sctlr_el[2] & SCTLR_DZE)) {
53
- NOP 1111 1000 0001 ---- 1111 1100 -------- # PLD
36
+ return CP_ACCESS_TRAP_EL2;
54
+ PLD 1111 1000 0001 ---- 1111 1100 -------- # (immediate T2)
37
+ }
55
LDRB_ri 1111 1000 0001 .... .... 1100 ........ @ldst_ri_neg
38
+ } else {
56
}
39
+ if (!(env->cp15.sctlr_el[1] & SCTLR_DZE)) {
57
LDRBT_ri 1111 1000 0001 .... .... 1110 ........ @ldst_ri_unp
40
+ return CP_ACCESS_TRAP;
58
{
41
+ }
59
- NOP 1111 1000 0001 ---- 1111 000000 -- ---- # PLD
42
+ if (hcr & HCR_TDZ) {
60
+ PLD 1111 1000 0001 ---- 1111 000000 -- ---- # (register)
43
+ return CP_ACCESS_TRAP_EL2;
61
LDRB_rr 1111 1000 0001 .... .... 000000 .. .... @ldst_rr
44
+ }
62
}
45
+ }
63
}
46
+ } else if (hcr & HCR_TDZ) {
64
{
47
+ return CP_ACCESS_TRAP_EL2;
65
{
48
+ }
66
- NOP 1111 1000 -011 1111 1111 ------------ # PLD
49
}
67
+ PLD 1111 1000 -011 1111 1111 ------------ # (literal)
50
return CP_ACCESS_OK;
68
LDRH_ri 1111 1000 .011 1111 .... ............ @ldst_ri_lit
69
}
70
{
71
- NOP 1111 1000 1011 ---- 1111 ------------ # PLDW
72
+ PLDW 1111 1000 1011 ---- 1111 ------------ # (immediate T1)
73
LDRH_ri 1111 1000 1011 .... .... ............ @ldst_ri_pos
74
}
75
LDRH_ri 1111 1000 0011 .... .... 1..1 ........ @ldst_ri_idx
76
{
77
- NOP 1111 1000 0011 ---- 1111 1100 -------- # PLDW
78
+ PLDW 1111 1000 0011 ---- 1111 1100 -------- # (immediate T2)
79
LDRH_ri 1111 1000 0011 .... .... 1100 ........ @ldst_ri_neg
80
}
81
LDRHT_ri 1111 1000 0011 .... .... 1110 ........ @ldst_ri_unp
82
{
83
- NOP 1111 1000 0011 ---- 1111 000000 -- ---- # PLDW
84
+ PLDW 1111 1000 0011 ---- 1111 000000 -- ---- # (register)
85
LDRH_rr 1111 1000 0011 .... .... 000000 .. .... @ldst_rr
86
}
87
}
88
@@ -XXX,XX +XXX,XX @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
89
LDRT_ri 1111 1000 0101 .... .... 1110 ........ @ldst_ri_unp
90
LDR_rr 1111 1000 0101 .... .... 000000 .. .... @ldst_rr
91
}
92
-# NOPs here are PLI.
93
{
94
{
95
- NOP 1111 1001 -001 1111 1111 ------------
96
+ PLI 1111 1001 -001 1111 1111 ------------ # (literal T3)
97
LDRSB_ri 1111 1001 .001 1111 .... ............ @ldst_ri_lit
98
}
99
{
100
- NOP 1111 1001 1001 ---- 1111 ------------
101
+ PLI 1111 1001 1001 ---- 1111 ------------ # (immediate T1)
102
LDRSB_ri 1111 1001 1001 .... .... ............ @ldst_ri_pos
103
}
104
LDRSB_ri 1111 1001 0001 .... .... 1..1 ........ @ldst_ri_idx
105
{
106
- NOP 1111 1001 0001 ---- 1111 1100 --------
107
+ PLI 1111 1001 0001 ---- 1111 1100 -------- # (immediate T2)
108
LDRSB_ri 1111 1001 0001 .... .... 1100 ........ @ldst_ri_neg
109
}
110
LDRSBT_ri 1111 1001 0001 .... .... 1110 ........ @ldst_ri_unp
111
{
112
- NOP 1111 1001 0001 ---- 1111 000000 -- ----
113
+ PLI 1111 1001 0001 ---- 1111 000000 -- ---- # (register)
114
LDRSB_rr 1111 1001 0001 .... .... 000000 .. .... @ldst_rr
115
}
116
}
117
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
118
index XXXXXXX..XXXXXXX 100644
119
--- a/target/arm/tcg/translate.c
120
+++ b/target/arm/tcg/translate.c
121
@@ -XXX,XX +XXX,XX @@ static bool trans_PLD(DisasContext *s, arg_PLD *a)
122
return ENABLE_ARCH_5TE;
123
}
124
125
-static bool trans_PLDW(DisasContext *s, arg_PLD *a)
126
+static bool trans_PLDW(DisasContext *s, arg_PLDW *a)
127
{
128
return arm_dc_feature(s, ARM_FEATURE_V7MP);
129
}
130
131
-static bool trans_PLI(DisasContext *s, arg_PLD *a)
132
+static bool trans_PLI(DisasContext *s, arg_PLI *a)
133
{
134
return ENABLE_ARCH_7;
51
}
135
}
52
--
136
--
53
2.20.1
137
2.34.1
54
55
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The value computed is fully boolean; using int8_t is odd.
3
Fixes RISU mismatch for "fcvtzs h31, h0, #14".
4
4
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200206105448.4726-41-richard.henderson@linaro.org
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20240524232121.284515-5-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
9
---
11
target/arm/cpu.c | 6 +++---
10
target/arm/tcg/translate-a64.c | 3 +++
12
1 file changed, 3 insertions(+), 3 deletions(-)
11
1 file changed, 3 insertions(+)
13
12
14
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/cpu.c
15
--- a/target/arm/tcg/translate-a64.c
17
+++ b/target/arm/cpu.c
16
+++ b/target/arm/tcg/translate-a64.c
18
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
17
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
19
{
18
read_vec_element_i32(s, tcg_op, rn, pass, size);
20
CPUARMState *env = cs->env_ptr;
19
fn(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
21
bool pstate_unmasked;
20
if (is_scalar) {
22
- int8_t unmasked = 0;
21
+ if (size == MO_16 && !is_u) {
23
+ bool unmasked = false;
22
+ tcg_gen_ext16u_i32(tcg_op, tcg_op);
24
23
+ }
25
/*
24
write_fp_sreg(s, rd, tcg_op);
26
* Don't take exceptions if they target a lower EL.
25
} else {
27
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
26
write_vec_element_i32(s, tcg_op, rd, pass, size);
28
* don't affect the masking logic, only the interrupt routing.
29
*/
30
if (target_el == 3 || !secure) {
31
- unmasked = 1;
32
+ unmasked = true;
33
}
34
} else {
35
/*
36
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
37
}
38
39
if ((scr || hcr) && !secure) {
40
- unmasked = 1;
41
+ unmasked = true;
42
}
43
}
44
}
45
--
27
--
46
2.20.1
28
2.34.1
47
48
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Not all of the breakpoint types are supported, but those that
3
The decode of FMOV (vector, immediate, half-precision) vs
4
only examine contextidr are extended to support the new register.
4
invalid cases of MOVI are incorrect.
5
5
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Fixes RISU mismatch for invalid insn 0x2f01fd31.
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
8
Fixes: 70b4e6a4457 ("arm/translate-a64: add FP16 FMOV to simd_mod_imm")
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-4-richard.henderson@linaro.org
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Message-id: 20240524232121.284515-6-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
13
---
12
target/arm/debug_helper.c | 50 +++++++++++++++++++++++++++++----------
14
target/arm/tcg/translate-a64.c | 24 ++++++++++++++----------
13
target/arm/helper.c | 12 ++++++++++
15
1 file changed, 14 insertions(+), 10 deletions(-)
14
2 files changed, 50 insertions(+), 12 deletions(-)
15
16
16
diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
17
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
17
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/debug_helper.c
19
--- a/target/arm/tcg/translate-a64.c
19
+++ b/target/arm/debug_helper.c
20
+++ b/target/arm/tcg/translate-a64.c
20
@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
21
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
21
int ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
22
bool is_q = extract32(insn, 30, 1);
22
int bt;
23
uint64_t imm = 0;
23
uint32_t contextidr;
24
24
+ uint64_t hcr_el2;
25
- if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
25
26
- /* Check for FMOV (vector, immediate) - half-precision */
26
/*
27
- if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) {
27
* Links to unimplemented or non-context aware breakpoints are
28
+ if (o2) {
28
@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
29
+ if (cmode != 0xf || is_neg) {
30
unallocated_encoding(s);
31
return;
32
}
33
- }
34
-
35
- if (!fp_access_check(s)) {
36
- return;
37
- }
38
-
39
- if (cmode == 15 && o2 && !is_neg) {
40
/* FMOV (vector, immediate) - half-precision */
41
+ if (!dc_isar_feature(aa64_fp16, s)) {
42
+ unallocated_encoding(s);
43
+ return;
44
+ }
45
imm = vfp_expand_imm(MO_16, abcdefgh);
46
/* now duplicate across the lanes */
47
imm = dup_const(MO_16, imm);
48
} else {
49
+ if (cmode == 0xf && is_neg && !is_q) {
50
+ unallocated_encoding(s);
51
+ return;
52
+ }
53
imm = asimd_imm_const(abcdefgh, cmode, is_neg);
29
}
54
}
30
55
31
bt = extract64(bcr, 20, 4);
56
+ if (!fp_access_check(s)) {
32
-
57
+ return;
33
- /*
34
- * We match the whole register even if this is AArch32 using the
35
- * short descriptor format (in which case it holds both PROCID and ASID),
36
- * since we don't implement the optional v7 context ID masking.
37
- */
38
- contextidr = extract64(env->cp15.contextidr_el[1], 0, 32);
39
+ hcr_el2 = arm_hcr_el2_eff(env);
40
41
switch (bt) {
42
case 3: /* linked context ID match */
43
- if (arm_current_el(env) > 1) {
44
- /* Context matches never fire in EL2 or (AArch64) EL3 */
45
+ switch (arm_current_el(env)) {
46
+ default:
47
+ /* Context matches never fire in AArch64 EL3 */
48
return false;
49
+ case 2:
50
+ if (!(hcr_el2 & HCR_E2H)) {
51
+ /* Context matches never fire in EL2 without E2H enabled. */
52
+ return false;
53
+ }
54
+ contextidr = env->cp15.contextidr_el[2];
55
+ break;
56
+ case 1:
57
+ contextidr = env->cp15.contextidr_el[1];
58
+ break;
59
+ case 0:
60
+ if ((hcr_el2 & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
61
+ contextidr = env->cp15.contextidr_el[2];
62
+ } else {
63
+ contextidr = env->cp15.contextidr_el[1];
64
+ }
65
+ break;
66
}
67
- return (contextidr == extract64(env->cp15.dbgbvr[lbn], 0, 32));
68
- case 5: /* linked address mismatch (reserved in AArch64) */
69
+ break;
70
+
71
+ case 7: /* linked contextidr_el1 match */
72
+ contextidr = env->cp15.contextidr_el[1];
73
+ break;
74
+ case 13: /* linked contextidr_el2 match */
75
+ contextidr = env->cp15.contextidr_el[2];
76
+ break;
77
+
78
case 9: /* linked VMID match (reserved if no EL2) */
79
case 11: /* linked context ID and VMID match (reserved if no EL2) */
80
+ case 15: /* linked full context ID match */
81
default:
82
/*
83
* Links to Unlinked context breakpoints must generate no
84
@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
85
return false;
86
}
87
88
- return false;
89
+ /*
90
+ * We match the whole register even if this is AArch32 using the
91
+ * short descriptor format (in which case it holds both PROCID and ASID),
92
+ * since we don't implement the optional v7 context ID masking.
93
+ */
94
+ return contextidr == (uint32_t)env->cp15.dbgbvr[lbn];
95
}
96
97
static bool bp_wp_matches(ARMCPU *cpu, int n, bool is_wp)
98
diff --git a/target/arm/helper.c b/target/arm/helper.c
99
index XXXXXXX..XXXXXXX 100644
100
--- a/target/arm/helper.c
101
+++ b/target/arm/helper.c
102
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo jazelle_regs[] = {
103
REGINFO_SENTINEL
104
};
105
106
+static const ARMCPRegInfo vhe_reginfo[] = {
107
+ { .name = "CONTEXTIDR_EL2", .state = ARM_CP_STATE_AA64,
108
+ .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
109
+ .access = PL2_RW,
110
+ .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
111
+ REGINFO_SENTINEL
112
+};
113
+
114
void register_cp_regs_for_features(ARMCPU *cpu)
115
{
116
/* Register all the coprocessor registers based on feature bits */
117
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
118
define_arm_cp_regs(cpu, lor_reginfo);
119
}
120
121
+ if (arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu)) {
122
+ define_arm_cp_regs(cpu, vhe_reginfo);
123
+ }
58
+ }
124
+
59
+
125
if (cpu_isar_feature(aa64_sve, cpu)) {
60
if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) {
126
define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
61
/* MOVI or MVNI, with MVNI negation handled above. */
127
if (arm_feature(env, ARM_FEATURE_EL2)) {
62
tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), is_q ? 16 : 8,
128
--
63
--
129
2.20.1
64
2.34.1
130
131
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Update to include checks against HCR_EL2.TID2.
3
All of these insns have "if sz == '1' then UNDEFINED" in their pseudocode.
4
Fixes a RISU miscompare for invalid insn 0x5ef0c87a.
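Concretely, the fix folds the sz check into the existing FP16 gate in
disas_simd_scalar_pairwise() (see the hunk below):

    if (!u) {
        /* half-precision variant: sz == 1 is UNDEFINED */
        if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) {
            unallocated_encoding(s);
            return;
        }
    }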
4
5
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Fixes: 5c36d89567c ("arm/translate-a64: add all FP16 ops in simd_scalar_pairwise")
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200206105448.4726-25-richard.henderson@linaro.org
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20240524232121.284515-7-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
---
11
target/arm/helper.c | 26 +++++++++++++++++++++-----
12
target/arm/tcg/translate-a64.c | 2 +-
12
1 file changed, 21 insertions(+), 5 deletions(-)
13
1 file changed, 1 insertion(+), 1 deletion(-)
13
14
14
diff --git a/target/arm/helper.c b/target/arm/helper.c
15
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
15
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.c
17
--- a/target/arm/tcg/translate-a64.c
17
+++ b/target/arm/helper.c
18
+++ b/target/arm/tcg/translate-a64.c
18
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
19
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
19
static CPAccessResult ctr_el0_access(CPUARMState *env, const ARMCPRegInfo *ri,
20
case 0x2f: /* FMINP */
20
bool isread)
21
/* FP op, size[0] is 32 or 64 bit*/
21
{
22
if (!u) {
22
- /* Only accessible in EL0 if SCTLR.UCT is set (and only in AArch64,
23
- if (!dc_isar_feature(aa64_fp16, s)) {
23
- * but the AArch32 CTR has its own reginfo struct)
24
+ if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) {
24
- */
25
unallocated_encoding(s);
25
- if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCT)) {
26
return;
26
- return CP_ACCESS_TRAP;
27
} else {
27
+ int cur_el = arm_current_el(env);
28
+
29
+ if (cur_el < 2) {
30
+ uint64_t hcr = arm_hcr_el2_eff(env);
31
+
32
+ if (cur_el == 0) {
33
+ if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
34
+ if (!(env->cp15.sctlr_el[2] & SCTLR_UCT)) {
35
+ return CP_ACCESS_TRAP_EL2;
36
+ }
37
+ } else {
38
+ if (!(env->cp15.sctlr_el[1] & SCTLR_UCT)) {
39
+ return CP_ACCESS_TRAP;
40
+ }
41
+ if (hcr & HCR_TID2) {
42
+ return CP_ACCESS_TRAP_EL2;
43
+ }
44
+ }
45
+ } else if (hcr & HCR_TID2) {
46
+ return CP_ACCESS_TRAP_EL2;
47
+ }
48
}
49
50
if (arm_current_el(env) < 2 && arm_hcr_el2_eff(env) & HCR_TID2) {
51
--
28
--
52
2.20.1
29
2.34.1
53
54
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Avoid redundant computation of cpu state by passing it in
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
from the caller, which has already computed it for itself.
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-40-richard.henderson@linaro.org
6
Message-id: 20240524232121.284515-8-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
8
---
12
target/arm/cpu.c | 22 ++++++++++++----------
9
target/arm/tcg/translate.h | 5 +
13
1 file changed, 12 insertions(+), 10 deletions(-)
10
target/arm/tcg/gengvec.c | 1612 ++++++++++++++++++++++++++++++++++++
11
target/arm/tcg/translate.c | 1588 -----------------------------------
12
target/arm/tcg/meson.build | 1 +
13
4 files changed, 1618 insertions(+), 1588 deletions(-)
14
create mode 100644 target/arm/tcg/gengvec.c
14
15
15
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
16
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/cpu.c
18
--- a/target/arm/tcg/translate.h
18
+++ b/target/arm/cpu.c
19
+++ b/target/arm/tcg/translate.h
19
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
20
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
21
void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
22
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
23
24
+void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh);
25
+void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh);
26
+void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh);
27
+void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh);
28
+
29
void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
30
int64_t shift, uint32_t opr_sz, uint32_t max_sz);
31
void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
32
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/target/arm/tcg/gengvec.c
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * ARM generic vector expansion
40
+ *
41
+ * Copyright (c) 2003 Fabrice Bellard
42
+ * Copyright (c) 2005-2007 CodeSourcery
43
+ * Copyright (c) 2007 OpenedHand, Ltd.
44
+ *
45
+ * This library is free software; you can redistribute it and/or
46
+ * modify it under the terms of the GNU Lesser General Public
47
+ * License as published by the Free Software Foundation; either
48
+ * version 2.1 of the License, or (at your option) any later version.
49
+ *
50
+ * This library is distributed in the hope that it will be useful,
51
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
52
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
53
+ * Lesser General Public License for more details.
54
+ *
55
+ * You should have received a copy of the GNU Lesser General Public
56
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
57
+ */
58
+
59
+#include "qemu/osdep.h"
60
+#include "translate.h"
61
+
62
+
63
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
64
+ uint32_t opr_sz, uint32_t max_sz,
65
+ gen_helper_gvec_3_ptr *fn)
66
+{
67
+ TCGv_ptr qc_ptr = tcg_temp_new_ptr();
68
+
69
+ tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc));
70
+ tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
71
+ opr_sz, max_sz, 0, fn);
72
+}
73
+
74
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
75
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
76
+{
77
+ static gen_helper_gvec_3_ptr * const fns[2] = {
78
+ gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
79
+ };
80
+ tcg_debug_assert(vece >= 1 && vece <= 2);
81
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
82
+}
83
+
84
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
85
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
86
+{
87
+ static gen_helper_gvec_3_ptr * const fns[2] = {
88
+ gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
89
+ };
90
+ tcg_debug_assert(vece >= 1 && vece <= 2);
91
+ gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
92
+}
93
+
94
+#define GEN_CMP0(NAME, COND) \
95
+ void NAME(unsigned vece, uint32_t d, uint32_t m, \
96
+ uint32_t opr_sz, uint32_t max_sz) \
97
+ { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); }
98
+
99
+GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ)
100
+GEN_CMP0(gen_gvec_cle0, TCG_COND_LE)
101
+GEN_CMP0(gen_gvec_cge0, TCG_COND_GE)
102
+GEN_CMP0(gen_gvec_clt0, TCG_COND_LT)
103
+GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT)
104
+
105
+#undef GEN_CMP0
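/* Illustration only, not part of the patch: each GEN_CMP0 use above
 * expands to a thin wrapper around tcg_gen_gvec_cmpi(), e.g.
 *
 *   void gen_gvec_ceq0(unsigned vece, uint32_t d, uint32_t m,
 *                      uint32_t opr_sz, uint32_t max_sz)
 *   { tcg_gen_gvec_cmpi(TCG_COND_EQ, vece, d, m, 0, opr_sz, max_sz); }
 */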
106
+
107
+static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
108
+{
109
+ tcg_gen_vec_sar8i_i64(a, a, shift);
110
+ tcg_gen_vec_add8_i64(d, d, a);
111
+}
112
+
113
+static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
114
+{
115
+ tcg_gen_vec_sar16i_i64(a, a, shift);
116
+ tcg_gen_vec_add16_i64(d, d, a);
117
+}
118
+
119
+static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
120
+{
121
+ tcg_gen_sari_i32(a, a, shift);
122
+ tcg_gen_add_i32(d, d, a);
123
+}
124
+
125
+static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
126
+{
127
+ tcg_gen_sari_i64(a, a, shift);
128
+ tcg_gen_add_i64(d, d, a);
129
+}
130
+
131
+static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
132
+{
133
+ tcg_gen_sari_vec(vece, a, a, sh);
134
+ tcg_gen_add_vec(vece, d, d, a);
135
+}
136
+
137
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
138
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
139
+{
140
+ static const TCGOpcode vecop_list[] = {
141
+ INDEX_op_sari_vec, INDEX_op_add_vec, 0
142
+ };
143
+ static const GVecGen2i ops[4] = {
144
+ { .fni8 = gen_ssra8_i64,
145
+ .fniv = gen_ssra_vec,
146
+ .fno = gen_helper_gvec_ssra_b,
147
+ .load_dest = true,
148
+ .opt_opc = vecop_list,
149
+ .vece = MO_8 },
150
+ { .fni8 = gen_ssra16_i64,
151
+ .fniv = gen_ssra_vec,
152
+ .fno = gen_helper_gvec_ssra_h,
153
+ .load_dest = true,
154
+ .opt_opc = vecop_list,
155
+ .vece = MO_16 },
156
+ { .fni4 = gen_ssra32_i32,
157
+ .fniv = gen_ssra_vec,
158
+ .fno = gen_helper_gvec_ssra_s,
159
+ .load_dest = true,
160
+ .opt_opc = vecop_list,
161
+ .vece = MO_32 },
162
+ { .fni8 = gen_ssra64_i64,
163
+ .fniv = gen_ssra_vec,
164
+ .fno = gen_helper_gvec_ssra_d,
165
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
166
+ .opt_opc = vecop_list,
167
+ .load_dest = true,
168
+ .vece = MO_64 },
169
+ };
170
+
171
+ /* tszimm encoding produces immediates in the range [1..esize]. */
172
+ tcg_debug_assert(shift > 0);
173
+ tcg_debug_assert(shift <= (8 << vece));
174
+
175
+ /*
176
+ * Shifts larger than the element size are architecturally valid.
177
+ * Signed results in all sign bits.
178
+ */
179
+ shift = MIN(shift, (8 << vece) - 1);
180
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
181
+}
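/* Illustration only, not part of the patch: per-element model of the
 * accumulating shift just defined, for 32-bit lanes.  SSRA clamps the
 * shift so that shifting by esize yields all sign bits; for USRA
 * (below) an esize shift contributes zero, leaving d unchanged. */
static int32_t ssra32_ref(int32_t d, int32_t a, int shift) /* 1..32 */
{
    return d + (a >> MIN(shift, 31));
}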
182
+
183
+static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
184
+{
185
+ tcg_gen_vec_shr8i_i64(a, a, shift);
186
+ tcg_gen_vec_add8_i64(d, d, a);
187
+}
188
+
189
+static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
190
+{
191
+ tcg_gen_vec_shr16i_i64(a, a, shift);
192
+ tcg_gen_vec_add16_i64(d, d, a);
193
+}
194
+
195
+static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
196
+{
197
+ tcg_gen_shri_i32(a, a, shift);
198
+ tcg_gen_add_i32(d, d, a);
199
+}
200
+
201
+static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
202
+{
203
+ tcg_gen_shri_i64(a, a, shift);
204
+ tcg_gen_add_i64(d, d, a);
205
+}
206
+
207
+static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
208
+{
209
+ tcg_gen_shri_vec(vece, a, a, sh);
210
+ tcg_gen_add_vec(vece, d, d, a);
211
+}
212
+
213
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
214
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
215
+{
216
+ static const TCGOpcode vecop_list[] = {
217
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
218
+ };
219
+ static const GVecGen2i ops[4] = {
220
+ { .fni8 = gen_usra8_i64,
221
+ .fniv = gen_usra_vec,
222
+ .fno = gen_helper_gvec_usra_b,
223
+ .load_dest = true,
224
+ .opt_opc = vecop_list,
225
+ .vece = MO_8, },
226
+ { .fni8 = gen_usra16_i64,
227
+ .fniv = gen_usra_vec,
228
+ .fno = gen_helper_gvec_usra_h,
229
+ .load_dest = true,
230
+ .opt_opc = vecop_list,
231
+ .vece = MO_16, },
232
+ { .fni4 = gen_usra32_i32,
233
+ .fniv = gen_usra_vec,
234
+ .fno = gen_helper_gvec_usra_s,
235
+ .load_dest = true,
236
+ .opt_opc = vecop_list,
237
+ .vece = MO_32, },
238
+ { .fni8 = gen_usra64_i64,
239
+ .fniv = gen_usra_vec,
240
+ .fno = gen_helper_gvec_usra_d,
241
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
242
+ .load_dest = true,
243
+ .opt_opc = vecop_list,
244
+ .vece = MO_64, },
245
+ };
246
+
247
+ /* tszimm encoding produces immediates in the range [1..esize]. */
248
+ tcg_debug_assert(shift > 0);
249
+ tcg_debug_assert(shift <= (8 << vece));
250
+
251
+ /*
252
+ * Shifts larger than the element size are architecturally valid.
253
+ * Unsigned results in all zeros as input to accumulate: nop.
254
+ */
255
+ if (shift < (8 << vece)) {
256
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
257
+ } else {
258
+ /* Nop, but we do need to clear the tail. */
259
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
260
+ }
261
+}
262
+
263
+/*
264
+ * Shift one less than the requested amount, and the low bit is
265
+ * the rounding bit. For the 8 and 16-bit operations, because we
266
+ * mask the low bit, we can perform a normal integer shift instead
267
+ * of a vector shift.
268
+ */
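/* Illustration only, not part of the patch: scalar reference for the
 * rounding right shift implemented below, for 1 <= sh <= 32:
 *
 *   srshr(a, sh) = (a >> sh) + ((a >> (sh - 1)) & 1)
 *
 * e.g. srshr(-1, 1) == -1 + 1 == 0, which matches the "always zero"
 * note in gen_gvec_srshr() for shift == esize. */
static int32_t srshr32_ref(int32_t a, int sh)
{
    if (sh == 32) {
        return 0;           /* all sign bits plus the rounding bit */
    }
    return (a >> sh) + ((a >> (sh - 1)) & 1);
}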
269
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
270
+{
271
+ TCGv_i64 t = tcg_temp_new_i64();
272
+
273
+ tcg_gen_shri_i64(t, a, sh - 1);
274
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
275
+ tcg_gen_vec_sar8i_i64(d, a, sh);
276
+ tcg_gen_vec_add8_i64(d, d, t);
277
+}
278
+
279
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
280
+{
281
+ TCGv_i64 t = tcg_temp_new_i64();
282
+
283
+ tcg_gen_shri_i64(t, a, sh - 1);
284
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
285
+ tcg_gen_vec_sar16i_i64(d, a, sh);
286
+ tcg_gen_vec_add16_i64(d, d, t);
287
+}
288
+
289
+void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
290
+{
291
+ TCGv_i32 t;
292
+
293
+ /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
294
+ if (sh == 32) {
295
+ tcg_gen_movi_i32(d, 0);
296
+ return;
297
+ }
298
+ t = tcg_temp_new_i32();
299
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
300
+ tcg_gen_sari_i32(d, a, sh);
301
+ tcg_gen_add_i32(d, d, t);
302
+}
303
+
304
+ void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
305
+{
306
+ TCGv_i64 t = tcg_temp_new_i64();
307
+
308
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
309
+ tcg_gen_sari_i64(d, a, sh);
310
+ tcg_gen_add_i64(d, d, t);
311
+}
312
+
313
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
314
+{
315
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
316
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
317
+
318
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
319
+ tcg_gen_dupi_vec(vece, ones, 1);
320
+ tcg_gen_and_vec(vece, t, t, ones);
321
+ tcg_gen_sari_vec(vece, d, a, sh);
322
+ tcg_gen_add_vec(vece, d, d, t);
323
+}
324
+
325
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
326
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
327
+{
328
+ static const TCGOpcode vecop_list[] = {
329
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
330
+ };
331
+ static const GVecGen2i ops[4] = {
332
+ { .fni8 = gen_srshr8_i64,
333
+ .fniv = gen_srshr_vec,
334
+ .fno = gen_helper_gvec_srshr_b,
335
+ .opt_opc = vecop_list,
336
+ .vece = MO_8 },
337
+ { .fni8 = gen_srshr16_i64,
338
+ .fniv = gen_srshr_vec,
339
+ .fno = gen_helper_gvec_srshr_h,
340
+ .opt_opc = vecop_list,
341
+ .vece = MO_16 },
342
+ { .fni4 = gen_srshr32_i32,
343
+ .fniv = gen_srshr_vec,
344
+ .fno = gen_helper_gvec_srshr_s,
345
+ .opt_opc = vecop_list,
346
+ .vece = MO_32 },
347
+ { .fni8 = gen_srshr64_i64,
348
+ .fniv = gen_srshr_vec,
349
+ .fno = gen_helper_gvec_srshr_d,
350
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
351
+ .opt_opc = vecop_list,
352
+ .vece = MO_64 },
353
+ };
354
+
355
+ /* tszimm encoding produces immediates in the range [1..esize] */
356
+ tcg_debug_assert(shift > 0);
357
+ tcg_debug_assert(shift <= (8 << vece));
358
+
359
+ if (shift == (8 << vece)) {
360
+ /*
361
+ * Shifts larger than the element size are architecturally valid.
362
+ * Signed results in all sign bits. With rounding, this produces
363
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
364
+ * I.e. always zero.
365
+ */
366
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
367
+ } else {
368
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
369
+ }
370
+}
371
+
372
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
373
+{
374
+ TCGv_i64 t = tcg_temp_new_i64();
375
+
376
+ gen_srshr8_i64(t, a, sh);
377
+ tcg_gen_vec_add8_i64(d, d, t);
378
+}
379
+
380
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
381
+{
382
+ TCGv_i64 t = tcg_temp_new_i64();
383
+
384
+ gen_srshr16_i64(t, a, sh);
385
+ tcg_gen_vec_add16_i64(d, d, t);
386
+}
387
+
388
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
389
+{
390
+ TCGv_i32 t = tcg_temp_new_i32();
391
+
392
+ gen_srshr32_i32(t, a, sh);
393
+ tcg_gen_add_i32(d, d, t);
394
+}
395
+
396
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
397
+{
398
+ TCGv_i64 t = tcg_temp_new_i64();
399
+
400
+ gen_srshr64_i64(t, a, sh);
401
+ tcg_gen_add_i64(d, d, t);
402
+}
403
+
404
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
405
+{
406
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
407
+
408
+ gen_srshr_vec(vece, t, a, sh);
409
+ tcg_gen_add_vec(vece, d, d, t);
410
+}
411
+
412
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
413
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
414
+{
415
+ static const TCGOpcode vecop_list[] = {
416
+ INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
417
+ };
418
+ static const GVecGen2i ops[4] = {
419
+ { .fni8 = gen_srsra8_i64,
420
+ .fniv = gen_srsra_vec,
421
+ .fno = gen_helper_gvec_srsra_b,
422
+ .opt_opc = vecop_list,
423
+ .load_dest = true,
424
+ .vece = MO_8 },
425
+ { .fni8 = gen_srsra16_i64,
426
+ .fniv = gen_srsra_vec,
427
+ .fno = gen_helper_gvec_srsra_h,
428
+ .opt_opc = vecop_list,
429
+ .load_dest = true,
430
+ .vece = MO_16 },
431
+ { .fni4 = gen_srsra32_i32,
432
+ .fniv = gen_srsra_vec,
433
+ .fno = gen_helper_gvec_srsra_s,
434
+ .opt_opc = vecop_list,
435
+ .load_dest = true,
436
+ .vece = MO_32 },
437
+ { .fni8 = gen_srsra64_i64,
438
+ .fniv = gen_srsra_vec,
439
+ .fno = gen_helper_gvec_srsra_d,
440
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
441
+ .opt_opc = vecop_list,
442
+ .load_dest = true,
443
+ .vece = MO_64 },
444
+ };
445
+
446
+ /* tszimm encoding produces immediates in the range [1..esize] */
447
+ tcg_debug_assert(shift > 0);
448
+ tcg_debug_assert(shift <= (8 << vece));
449
+
450
+ /*
451
+ * Shifts larger than the element size are architecturally valid.
452
+ * Signed results in all sign bits. With rounding, this produces
453
+ * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
454
+ * I.e. always zero. With accumulation, this leaves D unchanged.
455
+ */
456
+ if (shift == (8 << vece)) {
457
+ /* Nop, but we do need to clear the tail. */
458
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
459
+ } else {
460
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
461
+ }
462
+}
463
+
464
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
465
+{
466
+ TCGv_i64 t = tcg_temp_new_i64();
467
+
468
+ tcg_gen_shri_i64(t, a, sh - 1);
469
+ tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
470
+ tcg_gen_vec_shr8i_i64(d, a, sh);
471
+ tcg_gen_vec_add8_i64(d, d, t);
472
+}
473
+
474
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
475
+{
476
+ TCGv_i64 t = tcg_temp_new_i64();
477
+
478
+ tcg_gen_shri_i64(t, a, sh - 1);
479
+ tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
480
+ tcg_gen_vec_shr16i_i64(d, a, sh);
481
+ tcg_gen_vec_add16_i64(d, d, t);
482
+}
483
+
484
+void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
485
+{
486
+ TCGv_i32 t;
487
+
488
+ /* Handle shift by the input size for the benefit of trans_URSHR_ri */
489
+ if (sh == 32) {
490
+ tcg_gen_extract_i32(d, a, sh - 1, 1);
491
+ return;
492
+ }
493
+ t = tcg_temp_new_i32();
494
+ tcg_gen_extract_i32(t, a, sh - 1, 1);
495
+ tcg_gen_shri_i32(d, a, sh);
496
+ tcg_gen_add_i32(d, d, t);
497
+}
498
+
499
+void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
500
+{
501
+ TCGv_i64 t = tcg_temp_new_i64();
502
+
503
+ tcg_gen_extract_i64(t, a, sh - 1, 1);
504
+ tcg_gen_shri_i64(d, a, sh);
505
+ tcg_gen_add_i64(d, d, t);
506
+}
507
+
508
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
509
+{
510
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
511
+ TCGv_vec ones = tcg_temp_new_vec_matching(d);
512
+
513
+ tcg_gen_shri_vec(vece, t, a, shift - 1);
514
+ tcg_gen_dupi_vec(vece, ones, 1);
515
+ tcg_gen_and_vec(vece, t, t, ones);
516
+ tcg_gen_shri_vec(vece, d, a, shift);
517
+ tcg_gen_add_vec(vece, d, d, t);
518
+}
519
+
520
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
521
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
522
+{
523
+ static const TCGOpcode vecop_list[] = {
524
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
525
+ };
526
+ static const GVecGen2i ops[4] = {
527
+ { .fni8 = gen_urshr8_i64,
528
+ .fniv = gen_urshr_vec,
529
+ .fno = gen_helper_gvec_urshr_b,
530
+ .opt_opc = vecop_list,
531
+ .vece = MO_8 },
532
+ { .fni8 = gen_urshr16_i64,
533
+ .fniv = gen_urshr_vec,
534
+ .fno = gen_helper_gvec_urshr_h,
535
+ .opt_opc = vecop_list,
536
+ .vece = MO_16 },
537
+ { .fni4 = gen_urshr32_i32,
538
+ .fniv = gen_urshr_vec,
539
+ .fno = gen_helper_gvec_urshr_s,
540
+ .opt_opc = vecop_list,
541
+ .vece = MO_32 },
542
+ { .fni8 = gen_urshr64_i64,
543
+ .fniv = gen_urshr_vec,
544
+ .fno = gen_helper_gvec_urshr_d,
545
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
546
+ .opt_opc = vecop_list,
547
+ .vece = MO_64 },
548
+ };
549
+
550
+ /* tszimm encoding produces immediates in the range [1..esize] */
551
+ tcg_debug_assert(shift > 0);
552
+ tcg_debug_assert(shift <= (8 << vece));
553
+
554
+ if (shift == (8 << vece)) {
555
+ /*
556
+ * Shifts larger than the element size are architecturally valid.
557
+ * Unsigned results in zero. With rounding, this produces a
558
+ * copy of the most significant bit.
559
+ */
560
+ tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
561
+ } else {
562
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
563
+ }
564
+}
565
+
566
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
567
+{
568
+ TCGv_i64 t = tcg_temp_new_i64();
569
+
570
+ if (sh == 8) {
571
+ tcg_gen_vec_shr8i_i64(t, a, 7);
572
+ } else {
573
+ gen_urshr8_i64(t, a, sh);
574
+ }
575
+ tcg_gen_vec_add8_i64(d, d, t);
576
+}
577
+
578
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
579
+{
580
+ TCGv_i64 t = tcg_temp_new_i64();
581
+
582
+ if (sh == 16) {
583
+ tcg_gen_vec_shr16i_i64(t, a, 15);
584
+ } else {
585
+ gen_urshr16_i64(t, a, sh);
586
+ }
587
+ tcg_gen_vec_add16_i64(d, d, t);
588
+}
589
+
590
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
591
+{
592
+ TCGv_i32 t = tcg_temp_new_i32();
593
+
594
+ if (sh == 32) {
595
+ tcg_gen_shri_i32(t, a, 31);
596
+ } else {
597
+ gen_urshr32_i32(t, a, sh);
598
+ }
599
+ tcg_gen_add_i32(d, d, t);
600
+}
601
+
602
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
603
+{
604
+ TCGv_i64 t = tcg_temp_new_i64();
605
+
606
+ if (sh == 64) {
607
+ tcg_gen_shri_i64(t, a, 63);
608
+ } else {
609
+ gen_urshr64_i64(t, a, sh);
610
+ }
611
+ tcg_gen_add_i64(d, d, t);
612
+}
613
+
614
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
615
+{
616
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
617
+
618
+ if (sh == (8 << vece)) {
619
+ tcg_gen_shri_vec(vece, t, a, sh - 1);
620
+ } else {
621
+ gen_urshr_vec(vece, t, a, sh);
622
+ }
623
+ tcg_gen_add_vec(vece, d, d, t);
624
+}
625
+
626
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
627
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
628
+{
629
+ static const TCGOpcode vecop_list[] = {
630
+ INDEX_op_shri_vec, INDEX_op_add_vec, 0
631
+ };
632
+ static const GVecGen2i ops[4] = {
633
+ { .fni8 = gen_ursra8_i64,
634
+ .fniv = gen_ursra_vec,
635
+ .fno = gen_helper_gvec_ursra_b,
636
+ .opt_opc = vecop_list,
637
+ .load_dest = true,
638
+ .vece = MO_8 },
639
+ { .fni8 = gen_ursra16_i64,
640
+ .fniv = gen_ursra_vec,
641
+ .fno = gen_helper_gvec_ursra_h,
642
+ .opt_opc = vecop_list,
643
+ .load_dest = true,
644
+ .vece = MO_16 },
645
+ { .fni4 = gen_ursra32_i32,
646
+ .fniv = gen_ursra_vec,
647
+ .fno = gen_helper_gvec_ursra_s,
648
+ .opt_opc = vecop_list,
649
+ .load_dest = true,
650
+ .vece = MO_32 },
651
+ { .fni8 = gen_ursra64_i64,
652
+ .fniv = gen_ursra_vec,
653
+ .fno = gen_helper_gvec_ursra_d,
654
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
655
+ .opt_opc = vecop_list,
656
+ .load_dest = true,
657
+ .vece = MO_64 },
658
+ };
659
+
660
+ /* tszimm encoding produces immediates in the range [1..esize] */
661
+ tcg_debug_assert(shift > 0);
662
+ tcg_debug_assert(shift <= (8 << vece));
663
+
664
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
665
+}
666
+
667
+static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
668
+{
669
+ uint64_t mask = dup_const(MO_8, 0xff >> shift);
670
+ TCGv_i64 t = tcg_temp_new_i64();
671
+
672
+ tcg_gen_shri_i64(t, a, shift);
673
+ tcg_gen_andi_i64(t, t, mask);
674
+ tcg_gen_andi_i64(d, d, ~mask);
675
+ tcg_gen_or_i64(d, d, t);
676
+}
677
+
678
+static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
679
+{
680
+ uint64_t mask = dup_const(MO_16, 0xffff >> shift);
681
+ TCGv_i64 t = tcg_temp_new_i64();
682
+
683
+ tcg_gen_shri_i64(t, a, shift);
684
+ tcg_gen_andi_i64(t, t, mask);
685
+ tcg_gen_andi_i64(d, d, ~mask);
686
+ tcg_gen_or_i64(d, d, t);
687
+}
688
+
689
+static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
690
+{
691
+ tcg_gen_shri_i32(a, a, shift);
692
+ tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
693
+}
694
+
695
+static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
696
+{
697
+ tcg_gen_shri_i64(a, a, shift);
698
+ tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
699
+}
700
+
701
+static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
702
+{
703
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
704
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
705
+
706
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
707
+ tcg_gen_shri_vec(vece, t, a, sh);
708
+ tcg_gen_and_vec(vece, d, d, m);
709
+ tcg_gen_or_vec(vece, d, d, t);
710
+}
711
+
712
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
713
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
714
+{
715
+ static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
716
+ const GVecGen2i ops[4] = {
717
+ { .fni8 = gen_shr8_ins_i64,
718
+ .fniv = gen_shr_ins_vec,
719
+ .fno = gen_helper_gvec_sri_b,
720
+ .load_dest = true,
721
+ .opt_opc = vecop_list,
722
+ .vece = MO_8 },
723
+ { .fni8 = gen_shr16_ins_i64,
724
+ .fniv = gen_shr_ins_vec,
725
+ .fno = gen_helper_gvec_sri_h,
726
+ .load_dest = true,
727
+ .opt_opc = vecop_list,
728
+ .vece = MO_16 },
729
+ { .fni4 = gen_shr32_ins_i32,
730
+ .fniv = gen_shr_ins_vec,
731
+ .fno = gen_helper_gvec_sri_s,
732
+ .load_dest = true,
733
+ .opt_opc = vecop_list,
734
+ .vece = MO_32 },
735
+ { .fni8 = gen_shr64_ins_i64,
736
+ .fniv = gen_shr_ins_vec,
737
+ .fno = gen_helper_gvec_sri_d,
738
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
739
+ .load_dest = true,
740
+ .opt_opc = vecop_list,
741
+ .vece = MO_64 },
742
+ };
743
+
744
+ /* tszimm encoding produces immediates in the range [1..esize]. */
745
+ tcg_debug_assert(shift > 0);
746
+ tcg_debug_assert(shift <= (8 << vece));
747
+
748
+ /* Shift of esize leaves destination unchanged. */
749
+ if (shift < (8 << vece)) {
750
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
751
+ } else {
752
+ /* Nop, but we do need to clear the tail. */
753
+ tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
754
+ }
755
+}
756
+
757
+static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
758
+{
759
+ uint64_t mask = dup_const(MO_8, 0xff << shift);
760
+ TCGv_i64 t = tcg_temp_new_i64();
761
+
762
+ tcg_gen_shli_i64(t, a, shift);
763
+ tcg_gen_andi_i64(t, t, mask);
764
+ tcg_gen_andi_i64(d, d, ~mask);
765
+ tcg_gen_or_i64(d, d, t);
766
+}
767
+
768
+static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
769
+{
770
+ uint64_t mask = dup_const(MO_16, 0xffff << shift);
771
+ TCGv_i64 t = tcg_temp_new_i64();
772
+
773
+ tcg_gen_shli_i64(t, a, shift);
774
+ tcg_gen_andi_i64(t, t, mask);
775
+ tcg_gen_andi_i64(d, d, ~mask);
776
+ tcg_gen_or_i64(d, d, t);
777
+}
778
+
779
+static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
780
+{
781
+ tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
782
+}
783
+
784
+static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
785
+{
786
+ tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
787
+}
788
+
789
+static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
790
+{
791
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
792
+ TCGv_vec m = tcg_temp_new_vec_matching(d);
793
+
794
+ tcg_gen_shli_vec(vece, t, a, sh);
795
+ tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
796
+ tcg_gen_and_vec(vece, d, d, m);
797
+ tcg_gen_or_vec(vece, d, d, t);
798
+}
799
+
800
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
801
+ int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
+ const GVecGen2i ops[4] = {
+ { .fni8 = gen_shl8_ins_i64,
+ .fniv = gen_shl_ins_vec,
+ .fno = gen_helper_gvec_sli_b,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fni8 = gen_shl16_ins_i64,
+ .fniv = gen_shl_ins_vec,
+ .fno = gen_helper_gvec_sli_h,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_shl32_ins_i32,
+ .fniv = gen_shl_ins_vec,
+ .fno = gen_helper_gvec_sli_s,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_shl64_ins_i64,
+ .fniv = gen_shl_ins_vec,
+ .fno = gen_helper_gvec_sli_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+
+ /* tszimm encoding produces immediates in the range [0..esize-1]. */
+ tcg_debug_assert(shift >= 0);
+ tcg_debug_assert(shift < (8 << vece));
+
+ if (shift == 0) {
+ tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
+ } else {
+ tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+ }
+}
+
+static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ gen_helper_neon_mul_u8(a, a, b);
+ gen_helper_neon_add_u8(d, d, a);
+}
+
+static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ gen_helper_neon_mul_u8(a, a, b);
+ gen_helper_neon_sub_u8(d, d, a);
+}
+
+static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ gen_helper_neon_mul_u16(a, a, b);
+ gen_helper_neon_add_u16(d, d, a);
+}
+
+static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ gen_helper_neon_mul_u16(a, a, b);
+ gen_helper_neon_sub_u16(d, d, a);
+}
+
+static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ tcg_gen_mul_i32(a, a, b);
+ tcg_gen_add_i32(d, d, a);
+}
+
+static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ tcg_gen_mul_i32(a, a, b);
+ tcg_gen_sub_i32(d, d, a);
+}
+
+static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ tcg_gen_mul_i64(a, a, b);
+ tcg_gen_add_i64(d, d, a);
+}
+
+static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ tcg_gen_mul_i64(a, a, b);
+ tcg_gen_sub_i64(d, d, a);
+}
+
+static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ tcg_gen_mul_vec(vece, a, a, b);
+ tcg_gen_add_vec(vece, d, d, a);
+}
+
+static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ tcg_gen_mul_vec(vece, a, a, b);
+ tcg_gen_sub_vec(vece, d, d, a);
+}
+
+/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
+ * these tables are shared with AArch64 which does support them.
+ */
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_mul_vec, INDEX_op_add_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fni4 = gen_mla8_i32,
+ .fniv = gen_mla_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fni4 = gen_mla16_i32,
+ .fniv = gen_mla_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_mla32_i32,
+ .fniv = gen_mla_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_mla64_i64,
+ .fniv = gen_mla_vec,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_mul_vec, INDEX_op_sub_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fni4 = gen_mls8_i32,
+ .fniv = gen_mls_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fni4 = gen_mls16_i32,
+ .fniv = gen_mls_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_mls32_i32,
+ .fniv = gen_mls_vec,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_mls64_i64,
+ .fniv = gen_mls_vec,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .load_dest = true,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+/* CMTST : test is "if (X & Y != 0)". */
+static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ tcg_gen_and_i32(d, a, b);
+ tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
+}
+
+void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ tcg_gen_and_i64(d, a, b);
+ tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
+}
+
+static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ tcg_gen_and_vec(vece, d, a, b);
+ tcg_gen_dupi_vec(vece, a, 0);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
+}
+
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
+ static const GVecGen3 ops[4] = {
+ { .fni4 = gen_helper_neon_tst_u8,
+ .fniv = gen_cmtst_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fni4 = gen_helper_neon_tst_u16,
+ .fniv = gen_cmtst_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_cmtst_i32,
+ .fniv = gen_cmtst_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_cmtst_i64,
+ .fniv = gen_cmtst_vec,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
+{
+ TCGv_i32 lval = tcg_temp_new_i32();
+ TCGv_i32 rval = tcg_temp_new_i32();
+ TCGv_i32 lsh = tcg_temp_new_i32();
+ TCGv_i32 rsh = tcg_temp_new_i32();
+ TCGv_i32 zero = tcg_constant_i32(0);
+ TCGv_i32 max = tcg_constant_i32(32);
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_ext8s_i32(lsh, shift);
+ tcg_gen_neg_i32(rsh, lsh);
+ tcg_gen_shl_i32(lval, src, lsh);
+ tcg_gen_shr_i32(rval, src, rsh);
+ tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
+ tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
+}
+
+void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
+{
+ TCGv_i64 lval = tcg_temp_new_i64();
+ TCGv_i64 rval = tcg_temp_new_i64();
+ TCGv_i64 lsh = tcg_temp_new_i64();
+ TCGv_i64 rsh = tcg_temp_new_i64();
+ TCGv_i64 zero = tcg_constant_i64(0);
+ TCGv_i64 max = tcg_constant_i64(64);
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_ext8s_i64(lsh, shift);
+ tcg_gen_neg_i64(rsh, lsh);
+ tcg_gen_shl_i64(lval, src, lsh);
+ tcg_gen_shr_i64(rval, src, rsh);
+ tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
+ tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
+}
+
+static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
+ TCGv_vec src, TCGv_vec shift)
+{
+ TCGv_vec lval = tcg_temp_new_vec_matching(dst);
+ TCGv_vec rval = tcg_temp_new_vec_matching(dst);
+ TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
+ TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
+ TCGv_vec msk, max;
+
+ tcg_gen_neg_vec(vece, rsh, shift);
+ if (vece == MO_8) {
+ tcg_gen_mov_vec(lsh, shift);
+ } else {
+ msk = tcg_temp_new_vec_matching(dst);
+ tcg_gen_dupi_vec(vece, msk, 0xff);
+ tcg_gen_and_vec(vece, lsh, shift, msk);
+ tcg_gen_and_vec(vece, rsh, rsh, msk);
+ }
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_shlv_vec(vece, lval, src, lsh);
+ tcg_gen_shrv_vec(vece, rval, src, rsh);
+
+ max = tcg_temp_new_vec_matching(dst);
+ tcg_gen_dupi_vec(vece, max, 8 << vece);
+
+ /*
+ * The choice of LT (signed) and GEU (unsigned) are biased toward
+ * the instructions of the x86_64 host. For MO_8, the whole byte
+ * is significant so we must use an unsigned compare; otherwise we
+ * have already masked to a byte and so a signed compare works.
+ * Other tcg hosts have a full set of comparisons and do not care.
+ */
+ if (vece == MO_8) {
+ tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
+ tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
+ tcg_gen_andc_vec(vece, lval, lval, lsh);
+ tcg_gen_andc_vec(vece, rval, rval, rsh);
+ } else {
+ tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
+ tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
+ tcg_gen_and_vec(vece, lval, lval, lsh);
+ tcg_gen_and_vec(vece, rval, rval, rsh);
+ }
+ tcg_gen_or_vec(vece, dst, lval, rval);
+}
+
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_neg_vec, INDEX_op_shlv_vec,
+ INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_ushl_vec,
+ .fno = gen_helper_gvec_ushl_b,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fniv = gen_ushl_vec,
+ .fno = gen_helper_gvec_ushl_h,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_ushl_i32,
+ .fniv = gen_ushl_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_ushl_i64,
+ .fniv = gen_ushl_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
+{
+ TCGv_i32 lval = tcg_temp_new_i32();
+ TCGv_i32 rval = tcg_temp_new_i32();
+ TCGv_i32 lsh = tcg_temp_new_i32();
+ TCGv_i32 rsh = tcg_temp_new_i32();
+ TCGv_i32 zero = tcg_constant_i32(0);
+ TCGv_i32 max = tcg_constant_i32(31);
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_ext8s_i32(lsh, shift);
+ tcg_gen_neg_i32(rsh, lsh);
+ tcg_gen_shl_i32(lval, src, lsh);
+ tcg_gen_umin_i32(rsh, rsh, max);
+ tcg_gen_sar_i32(rval, src, rsh);
+ tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
+ tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
+}
+
+void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
+{
+ TCGv_i64 lval = tcg_temp_new_i64();
+ TCGv_i64 rval = tcg_temp_new_i64();
+ TCGv_i64 lsh = tcg_temp_new_i64();
+ TCGv_i64 rsh = tcg_temp_new_i64();
+ TCGv_i64 zero = tcg_constant_i64(0);
+ TCGv_i64 max = tcg_constant_i64(63);
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_ext8s_i64(lsh, shift);
+ tcg_gen_neg_i64(rsh, lsh);
+ tcg_gen_shl_i64(lval, src, lsh);
+ tcg_gen_umin_i64(rsh, rsh, max);
+ tcg_gen_sar_i64(rval, src, rsh);
+ tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
+ tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
+}
+
+static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
+ TCGv_vec src, TCGv_vec shift)
+{
+ TCGv_vec lval = tcg_temp_new_vec_matching(dst);
+ TCGv_vec rval = tcg_temp_new_vec_matching(dst);
+ TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
+ TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
+ TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
+
+ /*
+ * Rely on the TCG guarantee that out of range shifts produce
+ * unspecified results, not undefined behaviour (i.e. no trap).
+ * Discard out-of-range results after the fact.
+ */
+ tcg_gen_neg_vec(vece, rsh, shift);
+ if (vece == MO_8) {
+ tcg_gen_mov_vec(lsh, shift);
+ } else {
+ tcg_gen_dupi_vec(vece, tmp, 0xff);
+ tcg_gen_and_vec(vece, lsh, shift, tmp);
+ tcg_gen_and_vec(vece, rsh, rsh, tmp);
+ }
+
+ /* Bound rsh so out of bound right shift gets -1. */
+ tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
+ tcg_gen_umin_vec(vece, rsh, rsh, tmp);
+ tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
+
+ tcg_gen_shlv_vec(vece, lval, src, lsh);
+ tcg_gen_sarv_vec(vece, rval, src, rsh);
+
+ /* Select in-bound left shift. */
+ tcg_gen_andc_vec(vece, lval, lval, tmp);
+
+ /* Select between left and right shift. */
+ if (vece == MO_8) {
+ tcg_gen_dupi_vec(vece, tmp, 0);
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
+ } else {
+ tcg_gen_dupi_vec(vece, tmp, 0x80);
+ tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
+ }
+}
+
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+ INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_sshl_vec,
+ .fno = gen_helper_gvec_sshl_b,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fniv = gen_sshl_vec,
+ .fno = gen_helper_gvec_sshl_h,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_sshl_i32,
+ .fniv = gen_sshl_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_sshl_i64,
+ .fniv = gen_sshl_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
+ TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
+ tcg_gen_add_vec(vece, x, a, b);
+ tcg_gen_usadd_vec(vece, t, a, b);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
+ tcg_gen_or_vec(vece, sat, sat, x);
+}
+
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+ };
+ static const GVecGen4 ops[4] = {
+ { .fniv = gen_uqadd_vec,
+ .fno = gen_helper_gvec_uqadd_b,
+ .write_aofs = true,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fniv = gen_uqadd_vec,
+ .fno = gen_helper_gvec_uqadd_h,
+ .write_aofs = true,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fniv = gen_uqadd_vec,
+ .fno = gen_helper_gvec_uqadd_s,
+ .write_aofs = true,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fniv = gen_uqadd_vec,
+ .fno = gen_helper_gvec_uqadd_d,
+ .write_aofs = true,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
+ TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
+ tcg_gen_add_vec(vece, x, a, b);
+ tcg_gen_ssadd_vec(vece, t, a, b);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
+ tcg_gen_or_vec(vece, sat, sat, x);
+}
+
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+ };
+ static const GVecGen4 ops[4] = {
+ { .fniv = gen_sqadd_vec,
+ .fno = gen_helper_gvec_sqadd_b,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_8 },
+ { .fniv = gen_sqadd_vec,
+ .fno = gen_helper_gvec_sqadd_h,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_16 },
+ { .fniv = gen_sqadd_vec,
+ .fno = gen_helper_gvec_sqadd_s,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_32 },
+ { .fniv = gen_sqadd_vec,
+ .fno = gen_helper_gvec_sqadd_d,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
+ TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
+ tcg_gen_sub_vec(vece, x, a, b);
+ tcg_gen_ussub_vec(vece, t, a, b);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
+ tcg_gen_or_vec(vece, sat, sat, x);
+}
+
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+ };
+ static const GVecGen4 ops[4] = {
+ { .fniv = gen_uqsub_vec,
+ .fno = gen_helper_gvec_uqsub_b,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_8 },
+ { .fniv = gen_uqsub_vec,
+ .fno = gen_helper_gvec_uqsub_h,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_16 },
+ { .fniv = gen_uqsub_vec,
+ .fno = gen_helper_gvec_uqsub_s,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_32 },
+ { .fniv = gen_uqsub_vec,
+ .fno = gen_helper_gvec_uqsub_d,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
+ TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec x = tcg_temp_new_vec_matching(t);
+ tcg_gen_sub_vec(vece, x, a, b);
+ tcg_gen_sssub_vec(vece, t, a, b);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
+ tcg_gen_or_vec(vece, sat, sat, x);
+}
+
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+ };
+ static const GVecGen4 ops[4] = {
+ { .fniv = gen_sqsub_vec,
+ .fno = gen_helper_gvec_sqsub_b,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_8 },
+ { .fniv = gen_sqsub_vec,
+ .fno = gen_helper_gvec_sqsub_h,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_16 },
+ { .fniv = gen_sqsub_vec,
+ .fno = gen_helper_gvec_sqsub_s,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_32 },
+ { .fniv = gen_sqsub_vec,
+ .fno = gen_helper_gvec_sqsub_d,
+ .opt_opc = vecop_list,
+ .write_aofs = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+ rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+
+ tcg_gen_sub_i32(t, a, b);
+ tcg_gen_sub_i32(d, b, a);
+ tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
+}
+
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ tcg_gen_sub_i64(t, a, b);
+ tcg_gen_sub_i64(d, b, a);
+ tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
+}
+
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+ tcg_gen_smin_vec(vece, t, a, b);
+ tcg_gen_smax_vec(vece, d, a, b);
+ tcg_gen_sub_vec(vece, d, d, t);
+}
+
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_sabd_vec,
+ .fno = gen_helper_gvec_sabd_b,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fniv = gen_sabd_vec,
+ .fno = gen_helper_gvec_sabd_h,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_sabd_i32,
+ .fniv = gen_sabd_vec,
+ .fno = gen_helper_gvec_sabd_s,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_sabd_i64,
+ .fniv = gen_sabd_vec,
+ .fno = gen_helper_gvec_sabd_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+
+ tcg_gen_sub_i32(t, a, b);
+ tcg_gen_sub_i32(d, b, a);
+ tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
+}
+
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ tcg_gen_sub_i64(t, a, b);
+ tcg_gen_sub_i64(d, b, a);
+ tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
+}
+
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+ tcg_gen_umin_vec(vece, t, a, b);
+ tcg_gen_umax_vec(vece, d, a, b);
+ tcg_gen_sub_vec(vece, d, d, t);
+}
+
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_uabd_vec,
+ .fno = gen_helper_gvec_uabd_b,
+ .opt_opc = vecop_list,
+ .vece = MO_8 },
+ { .fniv = gen_uabd_vec,
+ .fno = gen_helper_gvec_uabd_h,
+ .opt_opc = vecop_list,
+ .vece = MO_16 },
+ { .fni4 = gen_uabd_i32,
+ .fniv = gen_uabd_vec,
+ .fno = gen_helper_gvec_uabd_s,
+ .opt_opc = vecop_list,
+ .vece = MO_32 },
+ { .fni8 = gen_uabd_i64,
+ .fniv = gen_uabd_vec,
+ .fno = gen_helper_gvec_uabd_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+ gen_sabd_i32(t, a, b);
+ tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+ gen_sabd_i64(t, a, b);
+ tcg_gen_add_i64(d, d, t);
+}
+
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+ gen_sabd_vec(vece, t, a, b);
+ tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sub_vec, INDEX_op_add_vec,
+ INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_b,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_8 },
+ { .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_h,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_16 },
+ { .fni4 = gen_saba_i32,
+ .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_s,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32 },
+ { .fni8 = gen_saba_i64,
+ .fniv = gen_saba_vec,
+ .fno = gen_helper_gvec_saba_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+ TCGv_i32 t = tcg_temp_new_i32();
+ gen_uabd_i32(t, a, b);
+ tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+ gen_uabd_i64(t, a, b);
+ tcg_gen_add_i64(d, d, t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(d);
+ gen_uabd_vec(vece, t, a, b);
+ tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_sub_vec, INDEX_op_add_vec,
+ INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+ };
+ static const GVecGen3 ops[4] = {
+ { .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_b,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_8 },
+ { .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_h,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_16 },
+ { .fni4 = gen_uaba_i32,
+ .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_s,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32 },
+ { .fni8 = gen_uaba_i64,
+ .fniv = gen_uaba_vec,
+ .fno = gen_helper_gvec_uaba_d,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64 },
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
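As an aside for readers following the conversion: the per-element semantics
that the gen_ushl_* expanders above implement can be summarised by a short
standalone C model. This is an illustrative sketch only, not part of the
patch, and ushl32_ref is a name invented here for the illustration:

#include <stdint.h>

/*
 * Reference model of one 32-bit element of USHL: the shift count is
 * the signed low byte of the shift operand; a negative count shifts
 * right, and any count whose magnitude is >= the element width
 * produces zero.
 */
static uint32_t ushl32_ref(uint32_t src, uint32_t shift)
{
    int8_t sh = (int8_t)shift;   /* only the low byte is significant */

    if (sh <= -32 || sh >= 32) {
        return 0;                /* out-of-range shift: result is zero */
    }
    return sh < 0 ? src >> -sh : src << sh;
}

The generated code cannot branch per element, so instead of the 'if' above
it computes both the left-shift and the right-shift result and selects the
in-range one with movcond (or a vector compare-and-select), relying on the
TCG guarantee that out-of-range shifts are unspecified rather than undefined.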
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
gen_rfe(s, pc, load_cpu_field(spsr));
}

-static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
- uint32_t opr_sz, uint32_t max_sz,
- gen_helper_gvec_3_ptr *fn)
-{
- TCGv_ptr qc_ptr = tcg_temp_new_ptr();
-
- tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc));
- tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
- opr_sz, max_sz, 0, fn);
-}
-
-void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static gen_helper_gvec_3_ptr * const fns[2] = {
- gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
- };
- tcg_debug_assert(vece >= 1 && vece <= 2);
- gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-}
-
-void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static gen_helper_gvec_3_ptr * const fns[2] = {
- gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
- };
- tcg_debug_assert(vece >= 1 && vece <= 2);
- gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-}
-
-#define GEN_CMP0(NAME, COND) \
- void NAME(unsigned vece, uint32_t d, uint32_t m, \
- uint32_t opr_sz, uint32_t max_sz) \
- { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); }
-
-GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ)
-GEN_CMP0(gen_gvec_cle0, TCG_COND_LE)
-GEN_CMP0(gen_gvec_cge0, TCG_COND_GE)
-GEN_CMP0(gen_gvec_clt0, TCG_COND_LT)
-GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT)
-
-#undef GEN_CMP0
-
-static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_sar8i_i64(a, a, shift);
- tcg_gen_vec_add8_i64(d, d, a);
-}
-
-static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_sar16i_i64(a, a, shift);
- tcg_gen_vec_add16_i64(d, d, a);
-}
-
-static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_sari_i32(a, a, shift);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_sari_i64(a, a, shift);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- tcg_gen_sari_vec(vece, a, a, sh);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_ssra8_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_ssra16_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_ssra32_i32,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_ssra64_i64,
- .fniv = gen_ssra_vec,
- .fno = gen_helper_gvec_ssra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits.
- */
- shift = MIN(shift, (8 << vece) - 1);
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
-}
-
-static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_shr8i_i64(a, a, shift);
- tcg_gen_vec_add8_i64(d, d, a);
-}
-
-static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_vec_shr16i_i64(a, a, shift);
- tcg_gen_vec_add16_i64(d, d, a);
-}
-
-static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_shri_i32(a, a, shift);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_shri_i64(a, a, shift);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- tcg_gen_shri_vec(vece, a, a, sh);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_usra8_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8, },
- { .fni8 = gen_usra16_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16, },
- { .fni4 = gen_usra32_i32,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32, },
- { .fni8 = gen_usra64_i64,
- .fniv = gen_usra_vec,
- .fno = gen_helper_gvec_usra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64, },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Unsigned results in all zeros as input to accumulate: nop.
- */
- if (shift < (8 << vece)) {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- } else {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- }
-}
-
-/*
- * Shift one less than the requested amount, and the low bit is
- * the rounding bit. For the 8 and 16-bit operations, because we
- * mask the low bit, we can perform a normal integer shift instead
- * of a vector shift.
- */
-static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
- tcg_gen_vec_sar8i_i64(d, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
- tcg_gen_vec_sar16i_i64(d, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t;
-
- /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
- if (sh == 32) {
- tcg_gen_movi_i32(d, 0);
- return;
- }
- t = tcg_temp_new_i32();
- tcg_gen_extract_i32(t, a, sh - 1, 1);
- tcg_gen_sari_i32(d, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_extract_i64(t, a, sh - 1, 1);
- tcg_gen_sari_i64(d, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shri_vec(vece, t, a, sh - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
- tcg_gen_and_vec(vece, t, t, ones);
- tcg_gen_sari_vec(vece, d, a, sh);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_srshr8_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_srshr16_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_srshr32_i32,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_srshr64_i64,
- .fniv = gen_srshr_vec,
- .fno = gen_helper_gvec_srshr_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- if (shift == (8 << vece)) {
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits. With rounding, this produces
- * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
- * I.e. always zero.
- */
- tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr8_i64(t, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr16_i64(t, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- gen_srshr32_i32(t, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- gen_srshr64_i64(t, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- gen_srshr_vec(vece, t, a, sh);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_srsra8_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fni8 = gen_srsra16_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_srsra32_i32,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_srsra64_i64,
- .fniv = gen_srsra_vec,
- .fno = gen_helper_gvec_srsra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /*
- * Shifts larger than the element size are architecturally valid.
- * Signed results in all sign bits. With rounding, this produces
- * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
- * I.e. always zero. With accumulation, this leaves D unchanged.
- */
- if (shift == (8 << vece)) {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
- tcg_gen_vec_shr8i_i64(d, a, sh);
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, sh - 1);
- tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
- tcg_gen_vec_shr16i_i64(d, a, sh);
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t;
-
- /* Handle shift by the input size for the benefit of trans_URSHR_ri */
- if (sh == 32) {
- tcg_gen_extract_i32(d, a, sh - 1, 1);
- return;
- }
- t = tcg_temp_new_i32();
- tcg_gen_extract_i32(t, a, sh - 1, 1);
- tcg_gen_shri_i32(d, a, sh);
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_extract_i64(t, a, sh - 1, 1);
- tcg_gen_shri_i64(d, a, sh);
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec ones = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shri_vec(vece, t, a, shift - 1);
- tcg_gen_dupi_vec(vece, ones, 1);
- tcg_gen_and_vec(vece, t, t, ones);
- tcg_gen_shri_vec(vece, d, a, shift);
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_urshr8_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_urshr16_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_urshr32_i32,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_s,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_urshr64_i64,
- .fniv = gen_urshr_vec,
- .fno = gen_helper_gvec_urshr_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- if (shift == (8 << vece)) {
- /*
- * Shifts larger than the element size are architecturally valid.
- * Unsigned results in zero. With rounding, this produces a
- * copy of the most significant bit.
- */
- tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 8) {
- tcg_gen_vec_shr8i_i64(t, a, 7);
- } else {
- gen_urshr8_i64(t, a, sh);
- }
- tcg_gen_vec_add8_i64(d, d, t);
-}
-
-static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 16) {
- tcg_gen_vec_shr16i_i64(t, a, 15);
- } else {
- gen_urshr16_i64(t, a, sh);
- }
- tcg_gen_vec_add16_i64(d, d, t);
-}
-
-static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
-{
- TCGv_i32 t = tcg_temp_new_i32();
-
- if (sh == 32) {
- tcg_gen_shri_i32(t, a, 31);
- } else {
- gen_urshr32_i32(t, a, sh);
- }
- tcg_gen_add_i32(d, d, t);
-}
-
-static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
-
- if (sh == 64) {
- tcg_gen_shri_i64(t, a, 63);
- } else {
- gen_urshr64_i64(t, a, sh);
- }
- tcg_gen_add_i64(d, d, t);
-}
-
-static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
-
- if (sh == (8 << vece)) {
- tcg_gen_shri_vec(vece, t, a, sh - 1);
- } else {
- gen_urshr_vec(vece, t, a, sh);
- }
- tcg_gen_add_vec(vece, d, d, t);
-}
-
-void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_shri_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen2i ops[4] = {
- { .fni8 = gen_ursra8_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_b,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_8 },
- { .fni8 = gen_ursra16_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_h,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_16 },
- { .fni4 = gen_ursra32_i32,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_s,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_32 },
- { .fni8 = gen_ursra64_i64,
- .fniv = gen_ursra_vec,
- .fno = gen_helper_gvec_ursra_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .load_dest = true,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize] */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
-}
-
-static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_8, 0xff >> shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_16, 0xffff >> shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_shri_i32(a, a, shift);
- tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
-}
-
-static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_shri_i64(a, a, shift);
- tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
-}
-
-static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
-
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
- tcg_gen_shri_vec(vece, t, a, sh);
- tcg_gen_and_vec(vece, d, d, m);
- tcg_gen_or_vec(vece, d, d, t);
-}
-
-void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
- const GVecGen2i ops[4] = {
- { .fni8 = gen_shr8_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_shr16_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_shr32_ins_i32,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_shr64_ins_i64,
- .fniv = gen_shr_ins_vec,
- .fno = gen_helper_gvec_sri_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [1..esize]. */
- tcg_debug_assert(shift > 0);
- tcg_debug_assert(shift <= (8 << vece));
-
- /* Shift of esize leaves destination unchanged. */
- if (shift < (8 << vece)) {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- } else {
- /* Nop, but we do need to clear the tail. */
- tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
- }
-}
-
-static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_8, 0xff << shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shli_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- uint64_t mask = dup_const(MO_16, 0xffff << shift);
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_shli_i64(t, a, shift);
- tcg_gen_andi_i64(t, t, mask);
- tcg_gen_andi_i64(d, d, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
- tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
-}
-
-static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
- tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
-}
-
-static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
- TCGv_vec t = tcg_temp_new_vec_matching(d);
- TCGv_vec m = tcg_temp_new_vec_matching(d);
-
- tcg_gen_shli_vec(vece, t, a, sh);
- tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
- tcg_gen_and_vec(vece, d, d, m);
- tcg_gen_or_vec(vece, d, d, t);
-}
-
-void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
- const GVecGen2i ops[4] = {
- { .fni8 = gen_shl8_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_b,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni8 = gen_shl16_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_h,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_shl32_ins_i32,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_s,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_shl64_ins_i64,
- .fniv = gen_shl_ins_vec,
- .fno = gen_helper_gvec_sli_d,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
-
- /* tszimm encoding produces immediates in the range [0..esize-1]. */
- tcg_debug_assert(shift >= 0);
- tcg_debug_assert(shift < (8 << vece));
-
- if (shift == 0) {
- tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
- }
-}
-
-static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u8(a, a, b);
- gen_helper_neon_add_u8(d, d, a);
-}
-
-static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u8(a, a, b);
- gen_helper_neon_sub_u8(d, d, a);
-}
-
-static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u16(a, a, b);
- gen_helper_neon_add_u16(d, d, a);
-}
-
-static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- gen_helper_neon_mul_u16(a, a, b);
- gen_helper_neon_sub_u16(d, d, a);
-}
-
-static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_mul_i32(a, a, b);
- tcg_gen_add_i32(d, d, a);
-}
-
-static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_mul_i32(a, a, b);
- tcg_gen_sub_i32(d, d, a);
-}
-
-static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_mul_i64(a, a, b);
- tcg_gen_add_i64(d, d, a);
-}
-
-static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_mul_i64(a, a, b);
- tcg_gen_sub_i64(d, d, a);
-}
-
-static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_mul_vec(vece, a, a, b);
- tcg_gen_add_vec(vece, d, d, a);
-}
-
-static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_mul_vec(vece, a, a, b);
- tcg_gen_sub_vec(vece, d, d, a);
-}
-
-/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
- * these tables are shared with AArch64 which does support them.
- */
-void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_mul_vec, INDEX_op_add_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_mla8_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_mla16_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_mla32_i32,
- .fniv = gen_mla_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_mla64_i64,
- .fniv = gen_mla_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_mul_vec, INDEX_op_sub_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_mls8_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_mls16_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_mls32_i32,
- .fniv = gen_mls_vec,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_mls64_i64,
- .fniv = gen_mls_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .load_dest = true,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-/* CMTST : test is "if (X & Y != 0)". */
-static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-{
- tcg_gen_and_i32(d, a, b);
- tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
-}
-
-void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
-{
- tcg_gen_and_i64(d, a, b);
- tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
-}
-
-static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-{
- tcg_gen_and_vec(vece, d, a, b);
- tcg_gen_dupi_vec(vece, a, 0);
- tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
-}
-
-void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
- static const GVecGen3 ops[4] = {
- { .fni4 = gen_helper_neon_tst_u8,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fni4 = gen_helper_neon_tst_u16,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_cmtst_i32,
- .fniv = gen_cmtst_vec,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_cmtst_i64,
- .fniv = gen_cmtst_vec,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-{
- TCGv_i32 lval = tcg_temp_new_i32();
- TCGv_i32 rval = tcg_temp_new_i32();
- TCGv_i32 lsh = tcg_temp_new_i32();
- TCGv_i32 rsh = tcg_temp_new_i32();
- TCGv_i32 zero = tcg_constant_i32(0);
- TCGv_i32 max = tcg_constant_i32(32);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i32(lsh, shift);
- tcg_gen_neg_i32(rsh, lsh);
- tcg_gen_shl_i32(lval, src, lsh);
- tcg_gen_shr_i32(rval, src, rsh);
- tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
- tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
-}
-
-void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
-{
- TCGv_i64 lval = tcg_temp_new_i64();
- TCGv_i64 rval = tcg_temp_new_i64();
- TCGv_i64 lsh = tcg_temp_new_i64();
- TCGv_i64 rsh = tcg_temp_new_i64();
- TCGv_i64 zero = tcg_constant_i64(0);
- TCGv_i64 max = tcg_constant_i64(64);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_ext8s_i64(lsh, shift);
- tcg_gen_neg_i64(rsh, lsh);
- tcg_gen_shl_i64(lval, src, lsh);
- tcg_gen_shr_i64(rval, src, rsh);
- tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
- tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
-}
-
-static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
- TCGv_vec src, TCGv_vec shift)
-{
- TCGv_vec lval = tcg_temp_new_vec_matching(dst);
- TCGv_vec rval = tcg_temp_new_vec_matching(dst);
- TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
- TCGv_vec msk, max;
-
- tcg_gen_neg_vec(vece, rsh, shift);
- if (vece == MO_8) {
- tcg_gen_mov_vec(lsh, shift);
- } else {
- msk = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, msk, 0xff);
- tcg_gen_and_vec(vece, lsh, shift, msk);
- tcg_gen_and_vec(vece, rsh, rsh, msk);
- }
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
- * unspecified results, not undefined behaviour (i.e. no trap).
- * Discard out-of-range results after the fact.
- */
- tcg_gen_shlv_vec(vece, lval, src, lsh);
- tcg_gen_shrv_vec(vece, rval, src, rsh);
-
- max = tcg_temp_new_vec_matching(dst);
- tcg_gen_dupi_vec(vece, max, 8 << vece);
-
- /*
- * The choice of LT (signed) and GEU (unsigned) are biased toward
- * the instructions of the x86_64 host. For MO_8, the whole byte
- * is significant so we must use an unsigned compare; otherwise we
- * have already masked to a byte and so a signed compare works.
- * Other tcg hosts have a full set of comparisons and do not care.
- */
- if (vece == MO_8) {
- tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
- tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
- tcg_gen_andc_vec(vece, lval, lval, lsh);
- tcg_gen_andc_vec(vece, rval, rval, rsh);
- } else {
- tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
- tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
- tcg_gen_and_vec(vece, lval, lval, lsh);
- tcg_gen_and_vec(vece, rval, rval, rsh);
- }
- tcg_gen_or_vec(vece, dst, lval, rval);
-}
-
-void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = {
- INDEX_op_neg_vec, INDEX_op_shlv_vec,
- INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
- };
- static const GVecGen3 ops[4] = {
- { .fniv = gen_ushl_vec,
- .fno = gen_helper_gvec_ushl_b,
- .opt_opc = vecop_list,
- .vece = MO_8 },
- { .fniv = gen_ushl_vec,
- .fno = gen_helper_gvec_ushl_h,
- .opt_opc = vecop_list,
- .vece = MO_16 },
- { .fni4 = gen_ushl_i32,
- .fniv = gen_ushl_vec,
- .opt_opc = vecop_list,
- .vece = MO_32 },
- { .fni8 = gen_ushl_i64,
- .fniv = gen_ushl_vec,
- .opt_opc = vecop_list,
- .vece = MO_64 },
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-}
-
-void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-{
- TCGv_i32 lval = tcg_temp_new_i32();
- TCGv_i32 rval = tcg_temp_new_i32();
- TCGv_i32 lsh = tcg_temp_new_i32();
- TCGv_i32 rsh = tcg_temp_new_i32();
- TCGv_i32 zero = tcg_constant_i32(0);
- TCGv_i32 max = tcg_constant_i32(31);
-
- /*
- * Rely on the TCG guarantee that out of range shifts produce
2745
- * unspecified results, not undefined behaviour (i.e. no trap).
2746
- * Discard out-of-range results after the fact.
2747
- */
2748
- tcg_gen_ext8s_i32(lsh, shift);
2749
- tcg_gen_neg_i32(rsh, lsh);
2750
- tcg_gen_shl_i32(lval, src, lsh);
2751
- tcg_gen_umin_i32(rsh, rsh, max);
2752
- tcg_gen_sar_i32(rval, src, rsh);
2753
- tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
2754
- tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
2755
-}
2756
-
2757
-void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
2758
-{
2759
- TCGv_i64 lval = tcg_temp_new_i64();
2760
- TCGv_i64 rval = tcg_temp_new_i64();
2761
- TCGv_i64 lsh = tcg_temp_new_i64();
2762
- TCGv_i64 rsh = tcg_temp_new_i64();
2763
- TCGv_i64 zero = tcg_constant_i64(0);
2764
- TCGv_i64 max = tcg_constant_i64(63);
2765
-
2766
- /*
2767
- * Rely on the TCG guarantee that out of range shifts produce
2768
- * unspecified results, not undefined behaviour (i.e. no trap).
2769
- * Discard out-of-range results after the fact.
2770
- */
2771
- tcg_gen_ext8s_i64(lsh, shift);
2772
- tcg_gen_neg_i64(rsh, lsh);
2773
- tcg_gen_shl_i64(lval, src, lsh);
2774
- tcg_gen_umin_i64(rsh, rsh, max);
2775
- tcg_gen_sar_i64(rval, src, rsh);
2776
- tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
2777
- tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
2778
-}
2779
-
2780
-static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
2781
- TCGv_vec src, TCGv_vec shift)
2782
-{
2783
- TCGv_vec lval = tcg_temp_new_vec_matching(dst);
2784
- TCGv_vec rval = tcg_temp_new_vec_matching(dst);
2785
- TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
2786
- TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
2787
- TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
2788
-
2789
- /*
2790
- * Rely on the TCG guarantee that out of range shifts produce
2791
- * unspecified results, not undefined behaviour (i.e. no trap).
2792
- * Discard out-of-range results after the fact.
2793
- */
2794
- tcg_gen_neg_vec(vece, rsh, shift);
2795
- if (vece == MO_8) {
2796
- tcg_gen_mov_vec(lsh, shift);
2797
- } else {
2798
- tcg_gen_dupi_vec(vece, tmp, 0xff);
2799
- tcg_gen_and_vec(vece, lsh, shift, tmp);
2800
- tcg_gen_and_vec(vece, rsh, rsh, tmp);
2801
- }
2802
-
2803
- /* Bound rsh so out of bound right shift gets -1. */
2804
- tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
2805
- tcg_gen_umin_vec(vece, rsh, rsh, tmp);
2806
- tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
2807
-
2808
- tcg_gen_shlv_vec(vece, lval, src, lsh);
2809
- tcg_gen_sarv_vec(vece, rval, src, rsh);
2810
-
2811
- /* Select in-bound left shift. */
2812
- tcg_gen_andc_vec(vece, lval, lval, tmp);
2813
-
2814
- /* Select between left and right shift. */
2815
- if (vece == MO_8) {
2816
- tcg_gen_dupi_vec(vece, tmp, 0);
2817
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
2818
- } else {
2819
- tcg_gen_dupi_vec(vece, tmp, 0x80);
2820
- tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
2821
- }
2822
-}
2823
-
2824
-void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
2825
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
2826
-{
2827
- static const TCGOpcode vecop_list[] = {
2828
- INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
2829
- INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
2830
- };
2831
- static const GVecGen3 ops[4] = {
2832
- { .fniv = gen_sshl_vec,
2833
- .fno = gen_helper_gvec_sshl_b,
2834
- .opt_opc = vecop_list,
2835
- .vece = MO_8 },
2836
- { .fniv = gen_sshl_vec,
2837
- .fno = gen_helper_gvec_sshl_h,
2838
- .opt_opc = vecop_list,
2839
- .vece = MO_16 },
2840
- { .fni4 = gen_sshl_i32,
2841
- .fniv = gen_sshl_vec,
2842
- .opt_opc = vecop_list,
2843
- .vece = MO_32 },
2844
- { .fni8 = gen_sshl_i64,
2845
- .fniv = gen_sshl_vec,
2846
- .opt_opc = vecop_list,
2847
- .vece = MO_64 },
2848
- };
2849
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
2850
-}
2851
-
2852
-static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
2853
- TCGv_vec a, TCGv_vec b)
2854
-{
2855
- TCGv_vec x = tcg_temp_new_vec_matching(t);
2856
- tcg_gen_add_vec(vece, x, a, b);
2857
- tcg_gen_usadd_vec(vece, t, a, b);
2858
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
2859
- tcg_gen_or_vec(vece, sat, sat, x);
2860
-}
2861
-
2862
-void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
2863
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
2864
-{
2865
- static const TCGOpcode vecop_list[] = {
2866
- INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
2867
- };
2868
- static const GVecGen4 ops[4] = {
2869
- { .fniv = gen_uqadd_vec,
2870
- .fno = gen_helper_gvec_uqadd_b,
2871
- .write_aofs = true,
2872
- .opt_opc = vecop_list,
2873
- .vece = MO_8 },
2874
- { .fniv = gen_uqadd_vec,
2875
- .fno = gen_helper_gvec_uqadd_h,
2876
- .write_aofs = true,
2877
- .opt_opc = vecop_list,
2878
- .vece = MO_16 },
2879
- { .fniv = gen_uqadd_vec,
2880
- .fno = gen_helper_gvec_uqadd_s,
2881
- .write_aofs = true,
2882
- .opt_opc = vecop_list,
2883
- .vece = MO_32 },
2884
- { .fniv = gen_uqadd_vec,
2885
- .fno = gen_helper_gvec_uqadd_d,
2886
- .write_aofs = true,
2887
- .opt_opc = vecop_list,
2888
- .vece = MO_64 },
2889
- };
2890
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
2891
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
2892
-}
2893
-
2894
-static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
2895
- TCGv_vec a, TCGv_vec b)
2896
-{
2897
- TCGv_vec x = tcg_temp_new_vec_matching(t);
2898
- tcg_gen_add_vec(vece, x, a, b);
2899
- tcg_gen_ssadd_vec(vece, t, a, b);
2900
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
2901
- tcg_gen_or_vec(vece, sat, sat, x);
2902
-}
2903
-
2904
-void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
2905
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
2906
-{
2907
- static const TCGOpcode vecop_list[] = {
2908
- INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
2909
- };
2910
- static const GVecGen4 ops[4] = {
2911
- { .fniv = gen_sqadd_vec,
2912
- .fno = gen_helper_gvec_sqadd_b,
2913
- .opt_opc = vecop_list,
2914
- .write_aofs = true,
2915
- .vece = MO_8 },
2916
- { .fniv = gen_sqadd_vec,
2917
- .fno = gen_helper_gvec_sqadd_h,
2918
- .opt_opc = vecop_list,
2919
- .write_aofs = true,
2920
- .vece = MO_16 },
2921
- { .fniv = gen_sqadd_vec,
2922
- .fno = gen_helper_gvec_sqadd_s,
2923
- .opt_opc = vecop_list,
2924
- .write_aofs = true,
2925
- .vece = MO_32 },
2926
- { .fniv = gen_sqadd_vec,
2927
- .fno = gen_helper_gvec_sqadd_d,
2928
- .opt_opc = vecop_list,
2929
- .write_aofs = true,
2930
- .vece = MO_64 },
2931
- };
2932
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
2933
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
2934
-}
2935
-
2936
-static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
2937
- TCGv_vec a, TCGv_vec b)
2938
-{
2939
- TCGv_vec x = tcg_temp_new_vec_matching(t);
2940
- tcg_gen_sub_vec(vece, x, a, b);
2941
- tcg_gen_ussub_vec(vece, t, a, b);
2942
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
2943
- tcg_gen_or_vec(vece, sat, sat, x);
2944
-}
2945
-
2946
-void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
2947
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
2948
-{
2949
- static const TCGOpcode vecop_list[] = {
2950
- INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
2951
- };
2952
- static const GVecGen4 ops[4] = {
2953
- { .fniv = gen_uqsub_vec,
2954
- .fno = gen_helper_gvec_uqsub_b,
2955
- .opt_opc = vecop_list,
2956
- .write_aofs = true,
2957
- .vece = MO_8 },
2958
- { .fniv = gen_uqsub_vec,
2959
- .fno = gen_helper_gvec_uqsub_h,
2960
- .opt_opc = vecop_list,
2961
- .write_aofs = true,
2962
- .vece = MO_16 },
2963
- { .fniv = gen_uqsub_vec,
2964
- .fno = gen_helper_gvec_uqsub_s,
2965
- .opt_opc = vecop_list,
2966
- .write_aofs = true,
2967
- .vece = MO_32 },
2968
- { .fniv = gen_uqsub_vec,
2969
- .fno = gen_helper_gvec_uqsub_d,
2970
- .opt_opc = vecop_list,
2971
- .write_aofs = true,
2972
- .vece = MO_64 },
2973
- };
2974
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
2975
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
2976
-}
2977
-
2978
-static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
2979
- TCGv_vec a, TCGv_vec b)
2980
-{
2981
- TCGv_vec x = tcg_temp_new_vec_matching(t);
2982
- tcg_gen_sub_vec(vece, x, a, b);
2983
- tcg_gen_sssub_vec(vece, t, a, b);
2984
- tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t);
2985
- tcg_gen_or_vec(vece, sat, sat, x);
2986
-}
2987
-
2988
-void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
2989
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
2990
-{
2991
- static const TCGOpcode vecop_list[] = {
2992
- INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
2993
- };
2994
- static const GVecGen4 ops[4] = {
2995
- { .fniv = gen_sqsub_vec,
2996
- .fno = gen_helper_gvec_sqsub_b,
2997
- .opt_opc = vecop_list,
2998
- .write_aofs = true,
2999
- .vece = MO_8 },
3000
- { .fniv = gen_sqsub_vec,
3001
- .fno = gen_helper_gvec_sqsub_h,
3002
- .opt_opc = vecop_list,
3003
- .write_aofs = true,
3004
- .vece = MO_16 },
3005
- { .fniv = gen_sqsub_vec,
3006
- .fno = gen_helper_gvec_sqsub_s,
3007
- .opt_opc = vecop_list,
3008
- .write_aofs = true,
3009
- .vece = MO_32 },
3010
- { .fniv = gen_sqsub_vec,
3011
- .fno = gen_helper_gvec_sqsub_d,
3012
- .opt_opc = vecop_list,
3013
- .write_aofs = true,
3014
- .vece = MO_64 },
3015
- };
3016
- tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
3017
- rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
3018
-}
3019
-
3020
-static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
3021
-{
3022
- TCGv_i32 t = tcg_temp_new_i32();
3023
-
3024
- tcg_gen_sub_i32(t, a, b);
3025
- tcg_gen_sub_i32(d, b, a);
3026
- tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
3027
-}
3028
-
3029
-static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
3030
-{
3031
- TCGv_i64 t = tcg_temp_new_i64();
3032
-
3033
- tcg_gen_sub_i64(t, a, b);
3034
- tcg_gen_sub_i64(d, b, a);
3035
- tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
3036
-}
3037
-
3038
-static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
3039
-{
3040
- TCGv_vec t = tcg_temp_new_vec_matching(d);
3041
-
3042
- tcg_gen_smin_vec(vece, t, a, b);
3043
- tcg_gen_smax_vec(vece, d, a, b);
3044
- tcg_gen_sub_vec(vece, d, d, t);
3045
-}
3046
-
3047
-void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
3048
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
3049
-{
3050
- static const TCGOpcode vecop_list[] = {
3051
- INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
3052
- };
3053
- static const GVecGen3 ops[4] = {
3054
- { .fniv = gen_sabd_vec,
3055
- .fno = gen_helper_gvec_sabd_b,
3056
- .opt_opc = vecop_list,
3057
- .vece = MO_8 },
3058
- { .fniv = gen_sabd_vec,
3059
- .fno = gen_helper_gvec_sabd_h,
3060
- .opt_opc = vecop_list,
3061
- .vece = MO_16 },
3062
- { .fni4 = gen_sabd_i32,
3063
- .fniv = gen_sabd_vec,
3064
- .fno = gen_helper_gvec_sabd_s,
3065
- .opt_opc = vecop_list,
3066
- .vece = MO_32 },
3067
- { .fni8 = gen_sabd_i64,
3068
- .fniv = gen_sabd_vec,
3069
- .fno = gen_helper_gvec_sabd_d,
3070
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
3071
- .opt_opc = vecop_list,
3072
- .vece = MO_64 },
3073
- };
3074
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
3075
-}
3076
-
3077
-static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
3078
-{
3079
- TCGv_i32 t = tcg_temp_new_i32();
3080
-
3081
- tcg_gen_sub_i32(t, a, b);
3082
- tcg_gen_sub_i32(d, b, a);
3083
- tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
3084
-}
3085
-
3086
-static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
3087
-{
3088
- TCGv_i64 t = tcg_temp_new_i64();
3089
-
3090
- tcg_gen_sub_i64(t, a, b);
3091
- tcg_gen_sub_i64(d, b, a);
3092
- tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
3093
-}
3094
-
3095
-static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
3096
-{
3097
- TCGv_vec t = tcg_temp_new_vec_matching(d);
3098
-
3099
- tcg_gen_umin_vec(vece, t, a, b);
3100
- tcg_gen_umax_vec(vece, d, a, b);
3101
- tcg_gen_sub_vec(vece, d, d, t);
3102
-}
3103
-
3104
-void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
3105
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
3106
-{
3107
- static const TCGOpcode vecop_list[] = {
3108
- INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
3109
- };
3110
- static const GVecGen3 ops[4] = {
3111
- { .fniv = gen_uabd_vec,
3112
- .fno = gen_helper_gvec_uabd_b,
3113
- .opt_opc = vecop_list,
3114
- .vece = MO_8 },
3115
- { .fniv = gen_uabd_vec,
3116
- .fno = gen_helper_gvec_uabd_h,
3117
- .opt_opc = vecop_list,
3118
- .vece = MO_16 },
3119
- { .fni4 = gen_uabd_i32,
3120
- .fniv = gen_uabd_vec,
3121
- .fno = gen_helper_gvec_uabd_s,
3122
- .opt_opc = vecop_list,
3123
- .vece = MO_32 },
3124
- { .fni8 = gen_uabd_i64,
3125
- .fniv = gen_uabd_vec,
3126
- .fno = gen_helper_gvec_uabd_d,
3127
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
3128
- .opt_opc = vecop_list,
3129
- .vece = MO_64 },
3130
- };
3131
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
3132
-}
3133
-
3134
-static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
3135
-{
3136
- TCGv_i32 t = tcg_temp_new_i32();
3137
- gen_sabd_i32(t, a, b);
3138
- tcg_gen_add_i32(d, d, t);
3139
-}
3140
-
3141
-static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
3142
-{
3143
- TCGv_i64 t = tcg_temp_new_i64();
3144
- gen_sabd_i64(t, a, b);
3145
- tcg_gen_add_i64(d, d, t);
3146
-}
3147
-
3148
-static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
3149
-{
3150
- TCGv_vec t = tcg_temp_new_vec_matching(d);
3151
- gen_sabd_vec(vece, t, a, b);
3152
- tcg_gen_add_vec(vece, d, d, t);
3153
-}
3154
-
3155
-void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
3156
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
3157
-{
3158
- static const TCGOpcode vecop_list[] = {
3159
- INDEX_op_sub_vec, INDEX_op_add_vec,
3160
- INDEX_op_smin_vec, INDEX_op_smax_vec, 0
3161
- };
3162
- static const GVecGen3 ops[4] = {
3163
- { .fniv = gen_saba_vec,
3164
- .fno = gen_helper_gvec_saba_b,
3165
- .opt_opc = vecop_list,
3166
- .load_dest = true,
3167
- .vece = MO_8 },
3168
- { .fniv = gen_saba_vec,
3169
- .fno = gen_helper_gvec_saba_h,
3170
- .opt_opc = vecop_list,
3171
- .load_dest = true,
3172
- .vece = MO_16 },
3173
- { .fni4 = gen_saba_i32,
3174
- .fniv = gen_saba_vec,
3175
- .fno = gen_helper_gvec_saba_s,
3176
- .opt_opc = vecop_list,
3177
- .load_dest = true,
3178
- .vece = MO_32 },
3179
- { .fni8 = gen_saba_i64,
3180
- .fniv = gen_saba_vec,
3181
- .fno = gen_helper_gvec_saba_d,
3182
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
3183
- .opt_opc = vecop_list,
3184
- .load_dest = true,
3185
- .vece = MO_64 },
3186
- };
3187
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
3188
-}
3189
-
3190
-static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
3191
-{
3192
- TCGv_i32 t = tcg_temp_new_i32();
3193
- gen_uabd_i32(t, a, b);
3194
- tcg_gen_add_i32(d, d, t);
3195
-}
3196
-
3197
-static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
3198
-{
3199
- TCGv_i64 t = tcg_temp_new_i64();
3200
- gen_uabd_i64(t, a, b);
3201
- tcg_gen_add_i64(d, d, t);
3202
-}
3203
-
3204
-static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
3205
-{
3206
- TCGv_vec t = tcg_temp_new_vec_matching(d);
3207
- gen_uabd_vec(vece, t, a, b);
3208
- tcg_gen_add_vec(vece, d, d, t);
3209
-}
3210
-
3211
-void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
3212
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
3213
-{
3214
- static const TCGOpcode vecop_list[] = {
3215
- INDEX_op_sub_vec, INDEX_op_add_vec,
3216
- INDEX_op_umin_vec, INDEX_op_umax_vec, 0
3217
- };
3218
- static const GVecGen3 ops[4] = {
3219
- { .fniv = gen_uaba_vec,
3220
- .fno = gen_helper_gvec_uaba_b,
3221
- .opt_opc = vecop_list,
3222
- .load_dest = true,
3223
- .vece = MO_8 },
3224
- { .fniv = gen_uaba_vec,
3225
- .fno = gen_helper_gvec_uaba_h,
3226
- .opt_opc = vecop_list,
3227
- .load_dest = true,
3228
- .vece = MO_16 },
3229
- { .fni4 = gen_uaba_i32,
3230
- .fniv = gen_uaba_vec,
3231
- .fno = gen_helper_gvec_uaba_s,
3232
- .opt_opc = vecop_list,
3233
- .load_dest = true,
3234
- .vece = MO_32 },
3235
- { .fni8 = gen_uaba_i64,
3236
- .fniv = gen_uaba_vec,
3237
- .fno = gen_helper_gvec_uaba_d,
3238
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
3239
- .opt_opc = vecop_list,
3240
- .load_dest = true,
3241
- .vece = MO_64 },
3242
- };
3243
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
3244
-}
3245
-
3246
static bool aa32_cpreg_encoding_in_impdef_space(uint8_t crn, uint8_t crm)
{
static const uint16_t mask[3] = {
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: gen_a64)

arm_ss.add(files(
'cpu32.c',
+ 'gengvec.c',
'translate.c',
'translate-m-nocp.c',
'translate-mve.c',
--
2.34.1

{
CPUARMState *env = cs->env_ptr;
- unsigned int cur_el = arm_current_el(env);
- bool secure = arm_is_secure(env);
bool pstate_unmasked;
int8_t unmasked = 0;
- uint64_t hcr_el2;

/*
* Don't take exceptions if they target a lower EL.
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
return false;
}

- hcr_el2 = arm_hcr_el2_eff(env);
-
switch (excp_idx) {
case EXCP_FIQ:
pstate_unmasked = !(env->daif & PSTATE_F);
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
CPUARMState *env = cs->env_ptr;
uint32_t cur_el = arm_current_el(env);
bool secure = arm_is_secure(env);
+ uint64_t hcr_el2 = arm_hcr_el2_eff(env);
uint32_t target_el;
uint32_t excp_idx;
bool ret = false;
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
if (interrupt_request & CPU_INTERRUPT_FIQ) {
excp_idx = EXCP_FIQ;
target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
- if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+ if (arm_excp_unmasked(cs, excp_idx, target_el,
+ cur_el, secure, hcr_el2)) {
cs->exception_index = excp_idx;
env->exception.target_el = target_el;
cc->do_interrupt(cs);
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
if (interrupt_request & CPU_INTERRUPT_HARD) {
excp_idx = EXCP_IRQ;
target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
- if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+ if (arm_excp_unmasked(cs, excp_idx, target_el,
+ cur_el, secure, hcr_el2)) {
cs->exception_index = excp_idx;
env->exception.target_el = target_el;
cc->do_interrupt(cs);
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
if (interrupt_request & CPU_INTERRUPT_VIRQ) {
excp_idx = EXCP_VIRQ;
target_el = 1;
- if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+ if (arm_excp_unmasked(cs, excp_idx, target_el,
+ cur_el, secure, hcr_el2)) {
cs->exception_index = excp_idx;
env->exception.target_el = target_el;
cc->do_interrupt(cs);
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
if (interrupt_request & CPU_INTERRUPT_VFIQ) {
excp_idx = EXCP_VFIQ;
target_el = 1;
- if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+ if (arm_excp_unmasked(cs, excp_idx, target_el,
+ cur_el, secure, hcr_el2)) {
cs->exception_index = excp_idx;
env->exception.target_el = target_el;
cc->do_interrupt(cs);
--
2.20.1
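As an aside on the gengvec.c code moved above: the CMTST comment ("test is
if (X & Y != 0)") and the out-of-range-shift trick in gen_ushl_* are easier
to see in scalar form. A rough host-side model follows; it is illustrative
only, not QEMU code, and the function names cmtst64/ushl32 are invented for
this sketch:

    #include <stdint.h>

    /* What gen_cmtst_i64 emits per element: all-ones when (a & b) != 0,
     * else zero, matching the negated setcond against zero. */
    static uint64_t cmtst64(uint64_t a, uint64_t b)
    {
        return -(uint64_t)((a & b) != 0);
    }

    /* What gen_ushl_i32 computes: shift left for positive counts, right
     * for negative ones.  C leaves out-of-range shifts undefined, so we
     * guard them here; TCG instead guarantees an unspecified, non-trapping
     * result, which the movcond pair then discards after the fact. */
    static uint32_t ushl32(uint32_t src, int8_t shift)
    {
        int32_t lsh = shift;                /* sign-extended low byte */
        uint32_t rsh = (uint32_t)-lsh;
        uint32_t lval = (uint32_t)lsh < 32 ? src << lsh : 0;
        uint32_t rval = rsh < 32 ? src >> rsh : 0;
        uint32_t dst = (uint32_t)lsh < 32 ? lval : 0;
        return rsh < 32 ? rval : dst;
    }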
From: Richard Henderson <richard.henderson@linaro.org>

No functional change, but unify code sequences.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 32 +++++++++++++-------------------
1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
* Page D4-1736 (DDI0487A.b)
*/

+static int vae1_tlbmask(CPUARMState *env)
+{
+ if (arm_is_secure_below_el3(env)) {
+ return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+ } else {
+ return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+ }
+}
+
static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
{
CPUState *cs = env_cpu(env);
- bool sec = arm_is_secure_below_el3(env);
+ int mask = vae1_tlbmask(env);

- if (sec) {
- tlb_flush_by_mmuidx_all_cpus_synced(cs,
- ARMMMUIdxBit_S1SE1 |
- ARMMMUIdxBit_S1SE0);
- } else {
- tlb_flush_by_mmuidx_all_cpus_synced(cs,
- ARMMMUIdxBit_S12NSE1 |
- ARMMMUIdxBit_S12NSE0);
- }
+ tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
}

static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
{
CPUState *cs = env_cpu(env);
+ int mask = vae1_tlbmask(env);

if (tlb_force_broadcast(env)) {
tlbi_aa64_vmalle1is_write(env, NULL, value);
return;
}

- if (arm_is_secure_below_el3(env)) {
- tlb_flush_by_mmuidx(cs,
- ARMMMUIdxBit_S1SE1 |
- ARMMMUIdxBit_S1SE0);
- } else {
- tlb_flush_by_mmuidx(cs,
- ARMMMUIdxBit_S12NSE1 |
- ARMMMUIdxBit_S12NSE0);
- }
+ tlb_flush_by_mmuidx(cs, mask);
}

static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
--
2.20.1


From: Richard Henderson <richard.henderson@linaro.org>

Split some routines out of translate-a64.c and translate-sve.c
that are used by both.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-a64.h | 4 +
target/arm/tcg/gengvec64.c | 190 +++++++++++++++++++++++++++++++++
target/arm/tcg/translate-a64.c | 26 -----
target/arm/tcg/translate-sve.c | 145 +------------------------
target/arm/tcg/meson.build | 1 +
5 files changed, 197 insertions(+), 169 deletions(-)
create mode 100644 target/arm/tcg/gengvec64.c

diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.h
+++ b/target/arm/tcg/translate-a64.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
uint32_t rm_ofs, int64_t shift,
uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
+ uint32_t a, uint32_t oprsz, uint32_t maxsz);
+void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
+ uint32_t a, uint32_t oprsz, uint32_t maxsz);

void gen_sve_ldr(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
void gen_sve_str(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm);
diff --git a/target/arm/tcg/gengvec64.c b/target/arm/tcg/gengvec64.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/tcg/gengvec64.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * AArch64 generic vector expansion
+ *
+ * Copyright (c) 2013 Alexander Graf <agraf@suse.de>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "translate.h"
+#include "translate-a64.h"
+
+
+static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
+{
+ tcg_gen_rotli_i64(d, m, 1);
+ tcg_gen_xor_i64(d, d, n);
+}
+
+static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m)
+{
+ tcg_gen_rotli_vec(vece, d, m, 1);
+ tcg_gen_xor_vec(vece, d, d, n);
+}
+
+void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 };
+ static const GVecGen3 op = {
+ .fni8 = gen_rax1_i64,
+ .fniv = gen_rax1_vec,
+ .opt_opc = vecop_list,
+ .fno = gen_helper_crypto_rax1,
+ .vece = MO_64,
+ };
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op);
+}
+
+static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+ uint64_t mask = dup_const(MO_8, 0xff >> sh);
+
+ tcg_gen_xor_i64(t, n, m);
+ tcg_gen_shri_i64(d, t, sh);
+ tcg_gen_shli_i64(t, t, 8 - sh);
+ tcg_gen_andi_i64(d, d, mask);
+ tcg_gen_andi_i64(t, t, ~mask);
+ tcg_gen_or_i64(d, d, t);
+}
+
+static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+ uint64_t mask = dup_const(MO_16, 0xffff >> sh);
+
+ tcg_gen_xor_i64(t, n, m);
+ tcg_gen_shri_i64(d, t, sh);
+ tcg_gen_shli_i64(t, t, 16 - sh);
+ tcg_gen_andi_i64(d, d, mask);
+ tcg_gen_andi_i64(t, t, ~mask);
+ tcg_gen_or_i64(d, d, t);
+}
+
+static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh)
+{
+ tcg_gen_xor_i32(d, n, m);
+ tcg_gen_rotri_i32(d, d, sh);
+}
+
+static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
+{
+ tcg_gen_xor_i64(d, n, m);
+ tcg_gen_rotri_i64(d, d, sh);
+}
+
+static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
+ TCGv_vec m, int64_t sh)
+{
+ tcg_gen_xor_vec(vece, d, n, m);
+ tcg_gen_rotri_vec(vece, d, d, sh);
+}
+
+void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ uint32_t rm_ofs, int64_t shift,
+ uint32_t opr_sz, uint32_t max_sz)
+{
+ static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 };
+ static const GVecGen3i ops[4] = {
+ { .fni8 = gen_xar8_i64,
+ .fniv = gen_xar_vec,
+ .fno = gen_helper_sve2_xar_b,
+ .opt_opc = vecop,
+ .vece = MO_8 },
+ { .fni8 = gen_xar16_i64,
+ .fniv = gen_xar_vec,
+ .fno = gen_helper_sve2_xar_h,
+ .opt_opc = vecop,
+ .vece = MO_16 },
+ { .fni4 = gen_xar_i32,
+ .fniv = gen_xar_vec,
+ .fno = gen_helper_sve2_xar_s,
+ .opt_opc = vecop,
+ .vece = MO_32 },
+ { .fni8 = gen_xar_i64,
+ .fniv = gen_xar_vec,
+ .fno = gen_helper_gvec_xar_d,
+ .opt_opc = vecop,
+ .vece = MO_64 }
+ };
+ int esize = 8 << vece;
+
+ /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */
+ tcg_debug_assert(shift >= 0);
+ tcg_debug_assert(shift <= esize);
+ shift &= esize - 1;
+
+ if (shift == 0) {
+ /* xar with no rotate devolves to xor. */
+ tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz);
+ } else {
+ tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz,
+ shift, &ops[vece]);
+ }
+}
+
+static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
+{
+ tcg_gen_xor_i64(d, n, m);
+ tcg_gen_xor_i64(d, d, k);
+}
+
+static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
+ TCGv_vec m, TCGv_vec k)
+{
+ tcg_gen_xor_vec(vece, d, n, m);
+ tcg_gen_xor_vec(vece, d, d, k);
+}
+
+void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
+{
+ static const GVecGen4 op = {
+ .fni8 = gen_eor3_i64,
+ .fniv = gen_eor3_vec,
+ .fno = gen_helper_sve2_eor3,
+ .vece = MO_64,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ };
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
+}
+
+static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
+{
+ tcg_gen_andc_i64(d, m, k);
+ tcg_gen_xor_i64(d, d, n);
+}
+
+static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
+ TCGv_vec m, TCGv_vec k)
+{
+ tcg_gen_andc_vec(vece, d, m, k);
+ tcg_gen_xor_vec(vece, d, d, n);
+}
+
+void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
+ uint32_t a, uint32_t oprsz, uint32_t maxsz)
+{
+ static const GVecGen4 op = {
+ .fni8 = gen_bcax_i64,
+ .fniv = gen_bcax_vec,
+ .fno = gen_helper_sve2_bcax,
+ .vece = MO_64,
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+ };
+ tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
+}
+
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
gen_gvec_op2_ool(s, true, rd, rn, 0, genfn);
}

-static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
-{
- tcg_gen_rotli_i64(d, m, 1);
- tcg_gen_xor_i64(d, d, n);
-}
-
-static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m)
-{
- tcg_gen_rotli_vec(vece, d, m, 1);
- tcg_gen_xor_vec(vece, d, d, n);
-}
-
-void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 };
- static const GVecGen3 op = {
- .fni8 = gen_rax1_i64,
- .fniv = gen_rax1_vec,
- .opt_opc = vecop_list,
- .fno = gen_helper_crypto_rax1,
- .vece = MO_64,
- };
- tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op);
-}
-
/* Crypto three-reg SHA512
* 31 21 20 16 15 14 13 12 11 10 9 5 4 0
* +-----------------------+------+---+---+-----+--------+------+------+
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(ORR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_or, a)
TRANS_FEAT(EOR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_xor, a)
TRANS_FEAT(BIC_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_andc, a)

-static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
- uint64_t mask = dup_const(MO_8, 0xff >> sh);
-
- tcg_gen_xor_i64(t, n, m);
- tcg_gen_shri_i64(d, t, sh);
- tcg_gen_shli_i64(t, t, 8 - sh);
- tcg_gen_andi_i64(d, d, mask);
- tcg_gen_andi_i64(t, t, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
-{
- TCGv_i64 t = tcg_temp_new_i64();
- uint64_t mask = dup_const(MO_16, 0xffff >> sh);
-
- tcg_gen_xor_i64(t, n, m);
- tcg_gen_shri_i64(d, t, sh);
- tcg_gen_shli_i64(t, t, 16 - sh);
- tcg_gen_andi_i64(d, d, mask);
- tcg_gen_andi_i64(t, t, ~mask);
- tcg_gen_or_i64(d, d, t);
-}
-
-static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh)
-{
- tcg_gen_xor_i32(d, n, m);
- tcg_gen_rotri_i32(d, d, sh);
-}
-
-static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh)
-{
- tcg_gen_xor_i64(d, n, m);
- tcg_gen_rotri_i64(d, d, sh);
-}
-
-static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
- TCGv_vec m, int64_t sh)
-{
- tcg_gen_xor_vec(vece, d, n, m);
- tcg_gen_rotri_vec(vece, d, d, sh);
-}
-
-void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- uint32_t rm_ofs, int64_t shift,
- uint32_t opr_sz, uint32_t max_sz)
-{
- static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 };
- static const GVecGen3i ops[4] = {
- { .fni8 = gen_xar8_i64,
- .fniv = gen_xar_vec,
- .fno = gen_helper_sve2_xar_b,
- .opt_opc = vecop,
- .vece = MO_8 },
- { .fni8 = gen_xar16_i64,
- .fniv = gen_xar_vec,
- .fno = gen_helper_sve2_xar_h,
- .opt_opc = vecop,
- .vece = MO_16 },
- { .fni4 = gen_xar_i32,
- .fniv = gen_xar_vec,
- .fno = gen_helper_sve2_xar_s,
- .opt_opc = vecop,
- .vece = MO_32 },
- { .fni8 = gen_xar_i64,
- .fniv = gen_xar_vec,
- .fno = gen_helper_gvec_xar_d,
- .opt_opc = vecop,
- .vece = MO_64 }
- };
- int esize = 8 << vece;
-
- /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */
- tcg_debug_assert(shift >= 0);
- tcg_debug_assert(shift <= esize);
- shift &= esize - 1;
-
- if (shift == 0) {
- /* xar with no rotate devolves to xor. */
- tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz);
- } else {
- tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz,
- shift, &ops[vece]);
- }
-}
-
static bool trans_XAR(DisasContext *s, arg_rrri_esz *a)
{
if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) {
@@ -XXX,XX +XXX,XX @@ static bool trans_XAR(DisasContext *s, arg_rrri_esz *a)
return true;
}

-static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
-{
- tcg_gen_xor_i64(d, n, m);
- tcg_gen_xor_i64(d, d, k);
-}
-
-static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
- TCGv_vec m, TCGv_vec k)
-{
- tcg_gen_xor_vec(vece, d, n, m);
- tcg_gen_xor_vec(vece, d, d, k);
-}
-
-static void gen_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
- uint32_t a, uint32_t oprsz, uint32_t maxsz)
-{
- static const GVecGen4 op = {
- .fni8 = gen_eor3_i64,
- .fniv = gen_eor3_vec,
- .fno = gen_helper_sve2_eor3,
- .vece = MO_64,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- };
- tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
-}
-
-TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_eor3, a)
-
-static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k)
-{
- tcg_gen_andc_i64(d, m, k);
- tcg_gen_xor_i64(d, d, n);
-}
-
-static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n,
- TCGv_vec m, TCGv_vec k)
-{
- tcg_gen_andc_vec(vece, d, m, k);
- tcg_gen_xor_vec(vece, d, d, n);
-}
-
-static void gen_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
- uint32_t a, uint32_t oprsz, uint32_t maxsz)
-{
- static const GVecGen4 op = {
- .fni8 = gen_bcax_i64,
- .fniv = gen_bcax_vec,
- .fno = gen_helper_sve2_bcax,
- .vece = MO_64,
- .prefer_i64 = TCG_TARGET_REG_BITS == 64,
- };
- tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op);
-}
-
-TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_bcax, a)
+TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_eor3, a)
+TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_bcax, a)

static void gen_bsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m,
uint32_t a, uint32_t oprsz, uint32_t maxsz)
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ arm_ss.add(files(

arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
'cpu64.c',
+ 'gengvec64.c',
'translate-a64.c',
'translate-sve.c',
'translate-sme.c',
--
2.34.1
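For reference on the gen_gvec_xar code moved above: per element, XAR is an
XOR followed by a rotate right, and the "shift &= esize - 1" line is what
reconciles the SVE2 (1..esize) and AdvSIMD (0..esize-1) immediate ranges.
A minimal scalar model, illustrative only (the name xar64 is invented for
this sketch, not a QEMU symbol):

    #include <stdint.h>

    /* MO_64 case of the XAR expansion: XOR, then rotate right. */
    static uint64_t xar64(uint64_t n, uint64_t m, unsigned shift)
    {
        uint64_t t = n ^ m;
        shift &= 63;            /* fold SVE2's shift == 64 down to 0 */
        if (shift == 0) {
            return t;           /* xar with no rotate devolves to xor */
        }
        return (t >> shift) | (t << (64 - shift));
    }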
From: Richard Henderson <richard.henderson@linaro.org>

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-26-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/cpu-qom.h | 1 +
target/arm/cpu.h | 11 +++++----
target/arm/cpu.c | 3 ++-
target/arm/helper.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -XXX,XX +XXX,XX @@ void arm_gt_ptimer_cb(void *opaque);
void arm_gt_vtimer_cb(void *opaque);
void arm_gt_htimer_cb(void *opaque);
void arm_gt_stimer_cb(void *opaque);
+void arm_gt_hvtimer_cb(void *opaque);

#define ARM_AFF0_SHIFT 0
#define ARM_AFF0_MASK (0xFFULL << ARM_AFF0_SHIFT)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct ARMGenericTimer {
uint64_t ctl; /* Timer Control register */
} ARMGenericTimer;

-#define GTIMER_PHYS 0
-#define GTIMER_VIRT 1
-#define GTIMER_HYP 2
-#define GTIMER_SEC 3
-#define NUM_GTIMERS 4
+#define GTIMER_PHYS 0
+#define GTIMER_VIRT 1
+#define GTIMER_HYP 2
+#define GTIMER_SEC 3
+#define GTIMER_HYPVIRT 4
+#define NUM_GTIMERS 5

typedef struct {
uint64_t raw_tcr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
}
}

-
{
uint64_t scale;

@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
arm_gt_htimer_cb, cpu);
cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
arm_gt_stimer_cb, cpu);
+ cpu->gt_timer[GTIMER_HYPVIRT] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+ arm_gt_hvtimer_cb, cpu);
}
#endif

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t gt_tval_read(CPUARMState *env, const ARMCPRegInfo *ri,

switch (timeridx) {
case GTIMER_VIRT:
+ case GTIMER_HYPVIRT:
offset = gt_virt_cnt_offset(env);
break;
}
@@ -XXX,XX +XXX,XX @@ static void gt_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,

switch (timeridx) {
case GTIMER_VIRT:
+ case GTIMER_HYPVIRT:
offset = gt_virt_cnt_offset(env);
break;
}
@@ -XXX,XX +XXX,XX @@ static void gt_sec_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
gt_ctl_write(env, ri, GTIMER_SEC, value);
}

+static void gt_hv_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+ gt_timer_reset(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+ gt_cval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static uint64_t gt_hv_tval_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+ return gt_tval_read(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+ gt_tval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static void gt_hv_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+ gt_ctl_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
void arm_gt_ptimer_cb(void *opaque)
{
ARMCPU *cpu = opaque;
@@ -XXX,XX +XXX,XX @@ void arm_gt_stimer_cb(void *opaque)
gt_recalc_timer(cpu, GTIMER_SEC);
}

+void arm_gt_hvtimer_cb(void *opaque)
+{
+ ARMCPU *cpu = opaque;
+
+ gt_recalc_timer(cpu, GTIMER_HYPVIRT);
+}
+
static void arm_gt_cntfrq_reset(CPUARMState *env, const ARMCPRegInfo *opaque)
{
ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
.opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
.access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
.fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
+#ifndef CONFIG_USER_ONLY
+ { .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 2,
+ .fieldoffset =
+ offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].cval),
+ .type = ARM_CP_IO, .access = PL2_RW,
+ .writefn = gt_hv_cval_write, .raw_writefn = raw_write },
+ { .name = "CNTHV_TVAL_EL2", .state = ARM_CP_STATE_BOTH,
+ .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 0,
+ .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL2_RW,
+ .resetfn = gt_hv_timer_reset,
+ .readfn = gt_hv_tval_read, .writefn = gt_hv_tval_write },
+ { .name = "CNTHV_CTL_EL2", .state = ARM_CP_STATE_BOTH,
+ .type = ARM_CP_IO,
+ .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 1,
+ .access = PL2_RW,
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].ctl),
+ .writefn = gt_hv_ctl_write, .raw_writefn = raw_write },
+#endif
REGINFO_SENTINEL
};

--
2.20.1


From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/a64.decode | 21 +++++++--
target/arm/tcg/translate-a64.c | 86 +++++++++++++++-------------------
2 files changed, 54 insertions(+), 53 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
# This file is processed by scripts/decodetree.py
#

-&r rn
-&ri rd imm
-&rri_sf rd rn imm sf
-&i imm
+%rd 0:5

+&r rn
+&ri rd imm
+&rri_sf rd rn imm sf
+&i imm
+&qrr_e q rd rn esz
+&qrrr_e q rd rn rm esz
+
+@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
+@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0

### Data Processing - Immediate

@@ -XXX,XX +XXX,XX @@ CPYFE 00 011 0 01100 ..... .... 01 ..... ..... @cpy
CPYP 00 011 1 01000 ..... .... 01 ..... ..... @cpy
CPYM 00 011 1 01010 ..... .... 01 ..... ..... @cpy
CPYE 00 011 1 01100 ..... .... 01 ..... ..... @cpy
+
+### Cryptographic AES
+
+AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0
+AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0
+AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0
+AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ bool sme_enabled_check_with_svcr(DisasContext *s, unsigned req)
return true;
}

+/*
+ * Expanders for AdvSIMD translation functions.
+ */
+
+static bool do_gvec_op2_ool(DisasContext *s, arg_qrr_e *a, int data,
+ gen_helper_gvec_2 *fn)
+{
+ if (!a->q && a->esz == MO_64) {
+ return false;
+ }
+ if (fp_access_check(s)) {
+ gen_gvec_op2_ool(s, a->q, a->rd, a->rn, data, fn);
+ }
+ return true;
+}
+
+static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data,
+ gen_helper_gvec_3 *fn)
+{
+ if (!a->q && a->esz == MO_64) {
+ return false;
+ }
+ if (fp_access_check(s)) {
+ gen_gvec_op3_ool(s, a->q, a->rd, a->rn, a->rm, data, fn);
+ }
+ return true;
+}
+
/*
* This utility function is for doing register extension with an
* optional shift. You will likely want to pass a temporary for the
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
return true;
}

+/*
+ * Cryptographic AES
+ */
+
+TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
+TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd)
+TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc)
+TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc)
+
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
* Note that it is the caller's responsibility to ensure that the
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
}
}

-/* Crypto AES
- * 31 24 23 22 21 17 16 12 11 10 9 5 4 0
- * +-----------------+------+-----------+--------+-----+------+------+
- * | 0 1 0 0 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 | Rn | Rd |
- * +-----------------+------+-----------+--------+-----+------+------+
- */
-static void disas_crypto_aes(DisasContext *s, uint32_t insn)
-{
- int size = extract32(insn, 22, 2);
- int opcode = extract32(insn, 12, 5);
- int rn = extract32(insn, 5, 5);
- int rd = extract32(insn, 0, 5);
- gen_helper_gvec_2 *genfn2 = NULL;
- gen_helper_gvec_3 *genfn3 = NULL;
-
- if (!dc_isar_feature(aa64_aes, s) || size != 0) {
- unallocated_encoding(s);
- return;
- }
-
- switch (opcode) {
- case 0x4: /* AESE */
- genfn3 = gen_helper_crypto_aese;
- break;
- case 0x6: /* AESMC */
- genfn2 = gen_helper_crypto_aesmc;
- break;
- case 0x5: /* AESD */
- genfn3 = gen_helper_crypto_aesd;
- break;
- case 0x7: /* AESIMC */
- genfn2 = gen_helper_crypto_aesimc;
- break;
- default:
- unallocated_encoding(s);
- return;
- }
-
- if (!fp_access_check(s)) {
- return;
- }
- if (genfn2) {
- gen_gvec_op2_ool(s, true, rd, rn, 0, genfn2);
- } else {
- gen_gvec_op3_ool(s, true, rd, rd, rn, 0, genfn3);
- }
-}
-
/* Crypto three-reg SHA
* 31 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
* +-----------------+------+---+------+---+--------+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
- { 0x4e280800, 0xff3e0c00, disas_crypto_aes },
{ 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
{ 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
{ 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
--
2.34.1
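A note on the @r2r_q1e0 format introduced above: AESE and AESD read the
destination register as an input, so the format re-reads the rd field via
rn=%rd and a plain three-operand trans function can be reused. A hedged
sketch of the equivalent field extraction (the real decoder is generated
by scripts/decodetree.py; the struct and function names here are invented
for illustration):

    #include <stdint.h>

    typedef struct { int q, rd, rn, rm, esz; } qrrr_args;

    static void extract_r2r_q1e0(uint32_t insn, qrrr_args *a)
    {
        a->rd = insn & 0x1f;         /* rd:5 at bits [4:0] */
        a->rn = a->rd;               /* rn=%rd: Rd is both source and dest */
        a->rm = (insn >> 5) & 0x1f;  /* rm:5 at bits [9:5] */
        a->q = 1;                    /* constants fixed by the format */
        a->esz = 0;
    }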
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/a64.decode | 11 +++++
target/arm/tcg/translate-a64.c | 78 +++++-----------------------------
2 files changed, 21 insertions(+), 68 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@

@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
+@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0

### Data Processing - Immediate

@@ -XXX,XX +XXX,XX @@ AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0
AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0
AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0
AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0
+
+### Cryptographic three-register SHA
+
+SHA1C 0101 1110 000 ..... 000000 ..... ..... @rrr_q1e0
+SHA1P 0101 1110 000 ..... 000100 ..... ..... @rrr_q1e0
+SHA1M 0101 1110 000 ..... 001000 ..... ..... @rrr_q1e0
+SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0
+SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0
+SHA256H2 0101 1110 000 ..... 010100 ..... ..... @rrr_q1e0
+SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
}

/*
- * Cryptographic AES
+ * Cryptographic AES, SHA
 */

TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd)
TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc)
TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc)

+TRANS_FEAT(SHA1C, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1c)
+TRANS_FEAT(SHA1P, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1p)
+TRANS_FEAT(SHA1M, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1m)
+TRANS_FEAT(SHA1SU0, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1su0)
+
+TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h)
+TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2)
+TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1)
+
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
* Note that it is the caller's responsibility to ensure that the
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
}
}

-/* Crypto three-reg SHA
- * 31 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0
- * +-----------------+------+---+------+---+--------+-----+------+------+
- * | 0 1 0 1 1 1 1 0 | size | 0 | Rm | 0 | opcode | 0 0 | Rn | Rd |
- * +-----------------+------+---+------+---+--------+-----+------+------+
- */
-static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
-{
- int size = extract32(insn, 22, 2);
- int opcode = extract32(insn, 12, 3);
- int rm = extract32(insn, 16, 5);
- int rn = extract32(insn, 5, 5);
- int rd = extract32(insn, 0, 5);
- gen_helper_gvec_3 *genfn;
- bool feature;
-
- if (size != 0) {
- unallocated_encoding(s);
- return;

From: Richard Henderson <richard.henderson@linaro.org>

Use the correct sctlr for EL2&0 regime. Due to header ordering,
and where arm_mmu_idx_el is declared, we need to move the function
out of line. Use the function in many more places in order to
select the correct control.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/cpu.h | 10 +---------
target/arm/helper-a64.c | 2 +-
target/arm/helper.c | 20 +++++++++++++++-----
target/arm/pauth_helper.c | 9 +--------
4 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool arm_sctlr_b(CPUARMState *env)
(env->cp15.sctlr_el[1] & SCTLR_B) != 0;
}

-static inline uint64_t arm_sctlr(CPUARMState *env, int el)
-{
- if (el == 0) {
- /* FIXME: ARMv8.1-VHE S2 translation regime. */
- return env->cp15.sctlr_el[1];
- } else {
- return env->cp15.sctlr_el[el];
- }
-}
+uint64_t arm_sctlr(CPUARMState *env, int el);

static inline bool arm_cpu_data_is_big_endian_a32(CPUARMState *env,
bool sctlr_b)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -XXX,XX +XXX,XX @@ static void daif_check(CPUARMState *env, uint32_t op,
uint32_t imm, uintptr_t ra)
{
/* DAIF update to PSTATE. This is OK from EL0 only if UMA is set. */
- if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+ if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
raise_exception_ra(env, EXCP_UDEF,
syn_aa64_sysregtrap(0, extract32(op, 0, 3),
extract32(op, 3, 3), 4,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void aa64_fpsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
static CPAccessResult aa64_daif_access(CPUARMState *env, const ARMCPRegInfo *ri,
bool isread)
{
- if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+ if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
return CP_ACCESS_TRAP;
}
return CP_ACCESS_OK;
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
/* Cache invalidate/clean: NOP, but EL0 must UNDEF unless
* SCTLR_EL1.UCI is set.
*/
- if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCI)) {
+ if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCI)) {
return CP_ACCESS_TRAP;
}
return CP_ACCESS_OK;
@@ -XXX,XX +XXX,XX @@ static uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
}
}

-#ifndef CONFIG_USER_ONLY
+uint64_t arm_sctlr(CPUARMState *env, int el)
+{
+ /* Only EL0 needs to be adjusted for EL1&0 or EL2&0. */
+ if (el == 0) {
+ ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, 0);
+ el = (mmu_idx == ARMMMUIdx_E20_0 ? 2 : 1);
+ }
+ return env->cp15.sctlr_el[el];
+}

/* Return the SCTLR value which controls this address translation regime */
-static inline uint32_t regime_sctlr(CPUARMState *env, ARMMMUIdx mmu_idx)
+static inline uint64_t regime_sctlr(CPUARMState *env, ARMMMUIdx mmu_idx)
{
return env->cp15.sctlr_el[regime_el(env, mmu_idx)];
}

+#ifndef CONFIG_USER_ONLY
+
/* Return true if the specified stage of address translation is disabled */
static inline bool regime_translation_disabled(CPUARMState *env,
ARMMMUIdx mmu_idx)
@@ -XXX,XX +XXX,XX @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
}

- sctlr = arm_sctlr(env, el);
+ sctlr = regime_sctlr(env, stage1);

if (arm_cpu_data_is_big_endian_a64(el, sctlr)) {
flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -XXX,XX +XXX,XX @@ static void pauth_check_trap(CPUARMState *env, int el, uintptr_t ra)

static bool pauth_key_enabled(CPUARMState *env, int el, uint32_t bit)
{
- uint32_t sctlr;
- if (el == 0) {
- /* FIXME: ARMv8.1-VHE S2 translation regime. */
- sctlr = env->cp15.sctlr_el[1];
- } else {
- sctlr = env->cp15.sctlr_el[el];
- }
-
- return (sctlr & bit) != 0;
+ return (arm_sctlr(env, el) & bit) != 0;
92
- switch (opcode) {
129
}
93
- case 0: /* SHA1C */
130
94
- genfn = gen_helper_crypto_sha1c;
131
uint64_t HELPER(pacia)(CPUARMState *env, uint64_t x, uint64_t y)
95
- feature = dc_isar_feature(aa64_sha1, s);
96
- break;
97
- case 1: /* SHA1P */
98
- genfn = gen_helper_crypto_sha1p;
99
- feature = dc_isar_feature(aa64_sha1, s);
100
- break;
101
- case 2: /* SHA1M */
102
- genfn = gen_helper_crypto_sha1m;
103
- feature = dc_isar_feature(aa64_sha1, s);
104
- break;
105
- case 3: /* SHA1SU0 */
106
- genfn = gen_helper_crypto_sha1su0;
107
- feature = dc_isar_feature(aa64_sha1, s);
108
- break;
109
- case 4: /* SHA256H */
110
- genfn = gen_helper_crypto_sha256h;
111
- feature = dc_isar_feature(aa64_sha256, s);
112
- break;
113
- case 5: /* SHA256H2 */
114
- genfn = gen_helper_crypto_sha256h2;
115
- feature = dc_isar_feature(aa64_sha256, s);
116
- break;
117
- case 6: /* SHA256SU1 */
118
- genfn = gen_helper_crypto_sha256su1;
119
- feature = dc_isar_feature(aa64_sha256, s);
120
- break;
121
- default:
122
- unallocated_encoding(s);
123
- return;
124
- }
125
-
126
- if (!feature) {
127
- unallocated_encoding(s);
128
- return;
129
- }
130
-
131
- if (!fp_access_check(s)) {
132
- return;
133
- }
134
- gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn);
135
-}
136
-
137
/* Crypto two-reg SHA
138
* 31 24 23 22 21 17 16 12 11 10 9 5 4 0
139
* +-----------------+------+-----------+--------+-----+------+------+
140
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
141
{ 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
142
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
143
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
144
- { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
145
{ 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
146
{ 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
147
{ 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
132
--
148
--
133
2.20.1
149
2.34.1
134
135
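For anyone reading along who is new to decodetree: each TRANS_FEAT() line above stands in for a small trans_* function. Roughly speaking (the real macro lives in target/arm/tcg/translate.h, so treat this expansion as a sketch rather than the exact text):

    static bool trans_SHA1C(DisasContext *s, arg_SHA1C *a)
    {
        /* Reject the insn unless the SHA1 feature is present... */
        return dc_isar_feature(aa64_sha1, s)
            /* ...then emit the out-of-line gvec helper call */
            && do_gvec_op3_ool(s, a, 0, gen_helper_crypto_sha1c);
    }

So the feature check that the old hand-written decoder did via its 'feature' flag and unallocated_encoding() now happens in one line per instruction.
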
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  6 ++++
 target/arm/tcg/translate-a64.c | 54 +++-------------------------------
 2 files changed, 10 insertions(+), 50 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0
 SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0
 SHA256H2 0101 1110 000 ..... 010100 ..... ..... @rrr_q1e0
 SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
+
+### Cryptographic two-register SHA
+
+SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0
+SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0
+SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256
 TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2)
 TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1)
 
+TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h)
+TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1)
+TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0)
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto two-reg SHA
- *  31 24 23 22 21 17 16 12 11 10 9 5 4 0
- * +-----------------+------+-----------+--------+-----+------+------+
- * | 0 1 0 1 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 |  Rn  |  Rd  |
- * +-----------------+------+-----------+--------+-----+------+------+
- */
-static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
-{
-    int size = extract32(insn, 22, 2);
-    int opcode = extract32(insn, 12, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-    gen_helper_gvec_2 *genfn;
-    bool feature;
-
-    if (size != 0) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (opcode) {
-    case 0: /* SHA1H */
-        feature = dc_isar_feature(aa64_sha1, s);
-        genfn = gen_helper_crypto_sha1h;
-        break;
-    case 1: /* SHA1SU1 */
-        feature = dc_isar_feature(aa64_sha1, s);
-        genfn = gen_helper_crypto_sha1su1;
-        break;
-    case 2: /* SHA256SU0 */
-        feature = dc_isar_feature(aa64_sha256, s);
-        genfn = gen_helper_crypto_sha256su0;
-        break;
-    default:
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!feature) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-    gen_gvec_op2_ool(s, true, rd, rn, 0, genfn);
-}
-
 /* Crypto three-reg SHA512
  *  31 21 20 16 15 14 13 12 11 10 9 5 4 0
  * +-----------------------+------+---+---+-----+--------+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
     { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
     { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
     { 0xce000000, 0xff808000, disas_crypto_four_reg },
-- 
2.34.1

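As a cross-check on the conversion, the fixed bits of the new SHA1H pattern reproduce the value the old hand-written table matched on:

    /* SHA1H  0101 1110 0010 1000 0000 10 ..... .....
     * The 22 fixed bits, occupying bits [31:10], give
     *     0b0101111000101000000010 << 10 == 0x5e280800
     * which is exactly the first field of the old entry
     *     { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
     * with Rn (bits [9:5]) and Rd (bits [4:0]) left variable.
     */
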
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 11 ++++
 target/arm/tcg/translate-a64.c | 97 ++++++++--------------------------
 2 files changed, 32 insertions(+), 76 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
 @rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
+@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
 
 ### Data Processing - Immediate
 
@@ -XXX,XX +XXX,XX @@ SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0
 SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0
 SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0
 SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0
+
+### Cryptographic three-register SHA512
+
+SHA512H 1100 1110 011 ..... 100000 ..... ..... @rrr_q1e0
+SHA512H2 1100 1110 011 ..... 100001 ..... ..... @rrr_q1e0
+SHA512SU1 1100 1110 011 ..... 100010 ..... ..... @rrr_q1e0
+RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3
+SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0
+SM3PARTW2 1100 1110 011 ..... 110001 ..... ..... @rrr_q1e0
+SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data,
     return true;
 }
 
+static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
+{
+    if (!a->q && a->esz == MO_64) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz);
+    }
+    return true;
+}
+
 /*
  * This utility function is for doing register extension with an
  * optional shift. You will likely want to pass a temporary for the
@@ -XXX,XX +XXX,XX @@ static bool trans_EXTR(DisasContext *s, arg_extract *a)
 }
 
 /*
- * Cryptographic AES, SHA
+ * Cryptographic AES, SHA, SHA512
  */
 
 TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese)
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h)
 TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1)
 TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0)
 
+TRANS_FEAT(SHA512H, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h)
+TRANS_FEAT(SHA512H2, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h2)
+TRANS_FEAT(SHA512SU1, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512su1)
+TRANS_FEAT(RAX1, aa64_sha3, do_gvec_fn3, a, gen_gvec_rax1)
+TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw1)
+TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2)
+TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
+
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto three-reg SHA512
- *  31 21 20 16 15 14 13 12 11 10 9 5 4 0
- * +-----------------------+------+---+---+-----+--------+------+------+
- * | 1 1 0 0 1 1 1 0 0 1 1 |  Rm  | 1 | O | 0 0 | opcode |  Rn  |  Rd  |
- * +-----------------------+------+---+---+-----+--------+------+------+
- */
-static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
-{
-    int opcode = extract32(insn, 10, 2);
-    int o = extract32(insn, 14, 1);
-    int rm = extract32(insn, 16, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-    bool feature;
-    gen_helper_gvec_3 *oolfn = NULL;
-    GVecGen3Fn *gvecfn = NULL;
-
-    if (o == 0) {
-        switch (opcode) {
-        case 0: /* SHA512H */
-            feature = dc_isar_feature(aa64_sha512, s);
-            oolfn = gen_helper_crypto_sha512h;
-            break;
-        case 1: /* SHA512H2 */
-            feature = dc_isar_feature(aa64_sha512, s);
-            oolfn = gen_helper_crypto_sha512h2;
-            break;
-        case 2: /* SHA512SU1 */
-            feature = dc_isar_feature(aa64_sha512, s);
-            oolfn = gen_helper_crypto_sha512su1;
-            break;
-        case 3: /* RAX1 */
-            feature = dc_isar_feature(aa64_sha3, s);
-            gvecfn = gen_gvec_rax1;
-            break;
-        default:
-            g_assert_not_reached();
-        }
-    } else {
-        switch (opcode) {
-        case 0: /* SM3PARTW1 */
-            feature = dc_isar_feature(aa64_sm3, s);
-            oolfn = gen_helper_crypto_sm3partw1;
-            break;
-        case 1: /* SM3PARTW2 */
-            feature = dc_isar_feature(aa64_sm3, s);
-            oolfn = gen_helper_crypto_sm3partw2;
-            break;
-        case 2: /* SM4EKEY */
-            feature = dc_isar_feature(aa64_sm4, s);
-            oolfn = gen_helper_crypto_sm4ekey;
-            break;
-        default:
-            unallocated_encoding(s);
-            return;
-        }
-    }
-
-    if (!feature) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    if (oolfn) {
-        gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn);
-    } else {
-        gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64);
-    }
-}
-
 /* Crypto two-reg SHA512
  *  31 12 11 10 9 5 4 0
  * +-----------------------------------------+--------+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
     { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
     { 0xce000000, 0xff808000, disas_crypto_four_reg },
     { 0xce800000, 0xffe00000, disas_crypto_xar },
-- 
2.34.1

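For reference, decodetree lowers each argument set such as &qrrr_e into a plain struct in the generated decoder. The shape is roughly the following (field order is decodetree's choice; this is a sketch, not the generated text):

    typedef struct {
        int q;      /* 1: 128-bit operation, 0: 64-bit */
        int rd;     /* destination vector register */
        int rn, rm; /* source vector registers */
        int esz;    /* element size as a MemOp; @rrr_q1e0 pins 0, @rrr_q1e3 pins 3 */
    } arg_qrrr_e;

Fixing q=1/esz=3 in @rrr_q1e3 is what lets RAX1 share do_gvec_fn3() without ever tripping its '!a->q && a->esz == MO_64' reject.
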
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  5 ++++
 target/arm/tcg/translate-a64.c | 50 ++--------------------------------
 2 files changed, 8 insertions(+), 47 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3
 SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0
 SM3PARTW2 1100 1110 011 ..... 110001 ..... ..... @rrr_q1e0
 SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
+
+### Cryptographic two-register SHA512
+
+SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0
+SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3part
 TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2)
 TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
 
+TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0)
+TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e)
+
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto two-reg SHA512
- *  31 12 11 10 9 5 4 0
- * +-----------------------------------------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 | opcode |  Rn  |  Rd  |
- * +-----------------------------------------+--------+------+------+
- */
-static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
-{
-    int opcode = extract32(insn, 10, 2);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-    bool feature;
-
-    switch (opcode) {
-    case 0: /* SHA512SU0 */
-        feature = dc_isar_feature(aa64_sha512, s);
-        break;
-    case 1: /* SM4E */
-        feature = dc_isar_feature(aa64_sm4, s);
-        break;
-    default:
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!feature) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    switch (opcode) {
-    case 0: /* SHA512SU0 */
-        gen_gvec_op2_ool(s, true, rd, rn, 0, gen_helper_crypto_sha512su0);
-        break;
-    case 1: /* SM4E */
-        gen_gvec_op3_ool(s, true, rd, rd, rn, 0, gen_helper_crypto_sm4e);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-}
-
 /* Crypto four-register
  *  31 23 22 21 20 16 15 14 10 9 5 4 0
  * +-------------------+-----+------+---+------+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
     { 0xce000000, 0xff808000, disas_crypto_four_reg },
     { 0xce800000, 0xffe00000, disas_crypto_xar },
     { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
-- 
2.34.1

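One subtlety worth calling out: SM4E reads and writes Vd, and the @r2r_q1e0 format expresses that by setting rn=%rd, i.e. decodetree refills a->rn from the Rd field. The TRANS_FEAT line therefore behaves roughly like this sketch of the macro expansion (assuming %rd is the usual 0:5 field defined earlier in a64.decode):

    static bool trans_SM4E(DisasContext *s, arg_SM4E *a)
    {
        /* a->rn == a->rd here, so this matches the old call
         * gen_gvec_op3_ool(s, true, rd, rd, rn, 0, gen_helper_crypto_sm4e) */
        return dc_isar_feature(aa64_sm4, s)
            && do_gvec_op3_ool(s, a, 0, gen_helper_crypto_sm4e);
    }
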
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |   8 ++
 target/arm/tcg/translate-a64.c | 132 +++++++++++----------------------
 2 files changed, 51 insertions(+), 89 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 &i imm
 &qrr_e q rd rn esz
 &qrrr_e q rd rn rm esz
+&qrrrr_e q rd rn rm ra esz
 
 @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
 @rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
 @rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
+@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
 
 ### Data Processing - Immediate
 
@@ -XXX,XX +XXX,XX @@ SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0
 
 SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0
 SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
+
+### Cryptographic four-register
+
+EOR3 1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3
+BCAX 1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3
+SM3SS1 1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
     return true;
 }
 
+static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn)
+{
+    if (!a->q && a->esz == MO_64) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_fn4(s, a->q, a->rd, a->rn, a->rm, a->ra, fn, a->esz);
+    }
+    return true;
+}
+
 /*
  * This utility function is for doing register extension with an
  * optional shift. You will likely want to pass a temporary for the
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey)
 TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0)
 TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e)
 
+TRANS_FEAT(EOR3, aa64_sha3, do_gvec_fn4, a, gen_gvec_eor3)
+TRANS_FEAT(BCAX, aa64_sha3, do_gvec_fn4, a, gen_gvec_bcax)
+
+static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a)
+{
+    if (!dc_isar_feature(aa64_sm3, s)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
+        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_op3 = tcg_temp_new_i32();
+        TCGv_i32 tcg_res = tcg_temp_new_i32();
+        unsigned vsz, dofs;
+
+        read_vec_element_i32(s, tcg_op1, a->rn, 3, MO_32);
+        read_vec_element_i32(s, tcg_op2, a->rm, 3, MO_32);
+        read_vec_element_i32(s, tcg_op3, a->ra, 3, MO_32);
+
+        tcg_gen_rotri_i32(tcg_res, tcg_op1, 20);
+        tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2);
+        tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3);
+        tcg_gen_rotri_i32(tcg_res, tcg_res, 25);
+
+        /* Clear the whole register first, then store bits [127:96]. */
+        vsz = vec_full_reg_size(s);
+        dofs = vec_full_reg_offset(s, a->rd);
+        tcg_gen_gvec_dup_imm(MO_64, dofs, vsz, vsz, 0);
+        write_vec_element_i32(s, tcg_res, a->rd, 3, MO_32);
+    }
+    return true;
+}
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto four-register
- *  31 23 22 21 20 16 15 14 10 9 5 4 0
- * +-------------------+-----+------+---+------+------+------+
- * | 1 1 0 0 1 1 1 0 0 | Op0 |  Rm  | 0 |  Ra  |  Rn  |  Rd  |
- * +-------------------+-----+------+---+------+------+------+
- */
-static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
-{
-    int op0 = extract32(insn, 21, 2);
-    int rm = extract32(insn, 16, 5);
-    int ra = extract32(insn, 10, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-    bool feature;
-
-    switch (op0) {
-    case 0: /* EOR3 */
-    case 1: /* BCAX */
-        feature = dc_isar_feature(aa64_sha3, s);
-        break;
-    case 2: /* SM3SS1 */
-        feature = dc_isar_feature(aa64_sm3, s);
-        break;
-    default:
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!feature) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    if (op0 < 2) {
-        TCGv_i64 tcg_op1, tcg_op2, tcg_op3, tcg_res[2];
-        int pass;
-
-        tcg_op1 = tcg_temp_new_i64();
-        tcg_op2 = tcg_temp_new_i64();
-        tcg_op3 = tcg_temp_new_i64();
-        tcg_res[0] = tcg_temp_new_i64();
-        tcg_res[1] = tcg_temp_new_i64();
-
-        for (pass = 0; pass < 2; pass++) {
-            read_vec_element(s, tcg_op1, rn, pass, MO_64);
-            read_vec_element(s, tcg_op2, rm, pass, MO_64);
-            read_vec_element(s, tcg_op3, ra, pass, MO_64);
-
-            if (op0 == 0) {
-                /* EOR3 */
-                tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op3);
-            } else {
-                /* BCAX */
-                tcg_gen_andc_i64(tcg_res[pass], tcg_op2, tcg_op3);
-            }
-            tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
-        }
-        write_vec_element(s, tcg_res[0], rd, 0, MO_64);
-        write_vec_element(s, tcg_res[1], rd, 1, MO_64);
-    } else {
-        TCGv_i32 tcg_op1, tcg_op2, tcg_op3, tcg_res, tcg_zero;
-
-        tcg_op1 = tcg_temp_new_i32();
-        tcg_op2 = tcg_temp_new_i32();
-        tcg_op3 = tcg_temp_new_i32();
-        tcg_res = tcg_temp_new_i32();
-        tcg_zero = tcg_constant_i32(0);
-
-        read_vec_element_i32(s, tcg_op1, rn, 3, MO_32);
-        read_vec_element_i32(s, tcg_op2, rm, 3, MO_32);
-        read_vec_element_i32(s, tcg_op3, ra, 3, MO_32);
-
-        tcg_gen_rotri_i32(tcg_res, tcg_op1, 20);
-        tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2);
-        tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3);
-        tcg_gen_rotri_i32(tcg_res, tcg_res, 25);
-
-        write_vec_element_i32(s, tcg_zero, rd, 0, MO_32);
-        write_vec_element_i32(s, tcg_zero, rd, 1, MO_32);
-        write_vec_element_i32(s, tcg_zero, rd, 2, MO_32);
-        write_vec_element_i32(s, tcg_res, rd, 3, MO_32);
-    }
-}
-
 /* Crypto XAR
  *  31 21 20 16 15 10 9 5 4 0
  * +-----------------------+------+--------+------+------+
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xce000000, 0xff808000, disas_crypto_four_reg },
     { 0xce800000, 0xffe00000, disas_crypto_xar },
     { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
-- 
2.34.1

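Written out in plain C, the TCG sequence in trans_SM3SS1() computes the following on the top 32-bit lane (a restatement for clarity, not code from the patch):

    static inline uint32_t ror32(uint32_t x, unsigned r)
    {
        r &= 31;
        return r ? (x >> r) | (x << (32 - r)) : x;
    }

    /* d[3] = ror32(ror32(n[3], 20) + m[3] + a[3], 25); d[2..0] = 0 */
    static uint32_t sm3ss1_lane3(uint32_t n3, uint32_t m3, uint32_t a3)
    {
        return ror32(ror32(n3, 20) + m3 + a3, 25);
    }

The single gvec dup of zero before write_vec_element_i32() is what replaces the old code's three explicit zero stores to lanes 0..2.
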
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 10 ++++
 target/arm/tcg/translate-a64.c | 43 ++++++++++------------------------
 2 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0
 EOR3 1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3
 BCAX 1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3
 SM3SS1 1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3
+
+### Cryptographic three-register, imm2
+
+&crypto3i rd rn rm imm
+@crypto3i ........ ... rm:5 .. imm:2 .. rn:5 rd:5 &crypto3i
+
+SM3TT1A 11001110 010 ..... 10 .. 00 ..... ..... @crypto3i
+SM3TT1B 11001110 010 ..... 10 .. 01 ..... ..... @crypto3i
+SM3TT2A 11001110 010 ..... 10 .. 10 ..... ..... @crypto3i
+SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a)
     return true;
 }
 
+static bool do_crypto3i(DisasContext *s, arg_crypto3i *a, gen_helper_gvec_3 *fn)
+{
+    if (fp_access_check(s)) {
+        gen_gvec_op3_ool(s, true, a->rd, a->rn, a->rm, a->imm, fn);
+    }
+    return true;
+}
+TRANS_FEAT(SM3TT1A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1a)
+TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b)
+TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a)
+TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b)
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
                  vec_full_reg_size(s));
 }
 
-/* Crypto three-reg imm2
- *  31 21 20 16 15 14 13 12 11 10 9 5 4 0
- * +-----------------------+------+-----+------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 0 1 0 |  Rm  | 1 0 | imm2 | opcode |  Rn  |  Rd  |
- * +-----------------------+------+-----+------+--------+------+------+
- */
-static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
-{
-    static gen_helper_gvec_3 * const fns[4] = {
-        gen_helper_crypto_sm3tt1a, gen_helper_crypto_sm3tt1b,
-        gen_helper_crypto_sm3tt2a, gen_helper_crypto_sm3tt2b,
-    };
-    int opcode = extract32(insn, 10, 2);
-    int imm2 = extract32(insn, 12, 2);
-    int rm = extract32(insn, 16, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (!dc_isar_feature(aa64_sm3, s)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    gen_gvec_op3_ool(s, true, rd, rn, rm, imm2, fns[opcode]);
-}
-
 /* C3.6 Data processing - SIMD, inc Crypto
  *
  * As the decode gets a little complex we are using a table based
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0xce800000, 0xffe00000, disas_crypto_xar },
-    { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
     { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
-- 
2.34.1

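The @crypto3i format puts imm2 in bits [13:12], exactly where the deleted decoder fished it out by hand; in extract32() terms the format is equivalent to:

    /* Hand-decoded equivalent of @crypto3i, for illustration only */
    static void crypto3i_fields(uint32_t insn,
                                int *rd, int *rn, int *imm, int *rm)
    {
        *rd  = extract32(insn, 0, 5);
        *rn  = extract32(insn, 5, 5);
        *imm = extract32(insn, 12, 2);
        *rm  = extract32(insn, 16, 5);
    }

and do_crypto3i() forwards a->imm through the gvec 'data' slot, while decodetree's per-pattern dispatch replaces the old fns[opcode] table.
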
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  4 ++++
 target/arm/tcg/translate-a64.c | 43 +++++++++++-----------------------
 2 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM3TT1A 11001110 010 ..... 10 .. 00 ..... ..... @crypto3i
 SM3TT1B 11001110 010 ..... 10 .. 01 ..... ..... @crypto3i
 SM3TT2A 11001110 010 ..... 10 .. 10 ..... ..... @crypto3i
 SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
+
+### Cryptographic XAR
+
+XAR 1100 1110 100 rm:5 imm:6 rn:5 rd:5
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b)
 TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a)
 TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b)
 
+static bool trans_XAR(DisasContext *s, arg_XAR *a)
+{
+    if (!dc_isar_feature(aa64_sha3, s)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_xar(MO_64, vec_full_reg_offset(s, a->rd),
+                     vec_full_reg_offset(s, a->rn),
+                     vec_full_reg_offset(s, a->rm), a->imm, 16,
+                     vec_full_reg_size(s));
+    }
+    return true;
+}
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto XAR
- *  31 21 20 16 15 10 9 5 4 0
- * +-----------------------+------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 1 0 0 |  Rm  |  imm6  |  Rn  |  Rd  |
- * +-----------------------+------+--------+------+------+
- */
-static void disas_crypto_xar(DisasContext *s, uint32_t insn)
-{
-    int rm = extract32(insn, 16, 5);
-    int imm6 = extract32(insn, 10, 6);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (!dc_isar_feature(aa64_sha3, s)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd),
-                 vec_full_reg_offset(s, rn),
-                 vec_full_reg_offset(s, rm), imm6, 16,
-                 vec_full_reg_size(s));
-}
-
 /* C3.6 Data processing - SIMD, inc Crypto
  *
  * As the decode gets a little complex we are using a table based
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xce800000, 0xffe00000, disas_crypto_xar },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
     { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
-- 
2.34.1

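For reference, one 64-bit lane of XAR behaves like the following (reference semantics only; gen_gvec_xar() emits vectorized TCG ops rather than calling anything like this):

    /* XAR (FEAT_SHA3): rotate-right of an XOR, per 64-bit lane */
    static inline uint64_t xar64(uint64_t n, uint64_t m, unsigned imm6)
    {
        uint64_t t = n ^ m;
        unsigned r = imm6 & 63;
        return r ? (t >> r) | (t << (64 - r)) : t;
    }
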
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  13 +
 target/arm/tcg/translate-a64.c | 426 +++++++++++----------------------
 2 files changed, 152 insertions(+), 287 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
 ### Cryptographic XAR
 
 XAR 1100 1110 100 rm:5 imm:6 rn:5 rd:5
+
+### Advanced SIMD scalar copy
+
+DUP_element_s 0101 1110 000 imm:5 0 0000 1 rn:5 rd:5
+
+### Advanced SIMD copy
+
+DUP_element_v 0 q:1 00 1110 000 imm:5 0 0000 1 rn:5 rd:5
+DUP_general 0 q:1 00 1110 000 imm:5 0 0001 1 rn:5 rd:5
+INS_general 0 1 00 1110 000 imm:5 0 0011 1 rn:5 rd:5
+SMOV 0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
+UMOV 0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
+INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_XAR(DisasContext *s, arg_XAR *a)
     return true;
 }
 
+/*
+ * Advanced SIMD copy
+ */
+
+static bool decode_esz_idx(int imm, MemOp *pesz, unsigned *pidx)
+{
+    unsigned esz = ctz32(imm);
+    if (esz <= MO_64) {
+        *pesz = esz;
+        *pidx = imm >> (esz + 1);
+        return true;
+    }
+    return false;
+}
+
+static bool trans_DUP_element_s(DisasContext *s, arg_DUP_element_s *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        /*
+         * This instruction just extracts the specified element and
+         * zero-extends it into the bottom of the destination register.
+         */
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        read_vec_element(s, tmp, a->rn, idx, esz);
+        write_fp_dreg(s, a->rd, tmp);
+    }
+    return true;
+}
+
+static bool trans_DUP_element_v(DisasContext *s, arg_DUP_element_v *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_mem(esz, vec_full_reg_offset(s, a->rd),
+                             vec_reg_offset(s, a->rn, idx, esz),
+                             a->q ? 16 : 8, vec_full_reg_size(s));
+    }
+    return true;
+}
+
+static bool trans_DUP_general(DisasContext *s, arg_DUP_general *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd),
+                             a->q ? 16 : 8, vec_full_reg_size(s),
+                             cpu_reg(s, a->rn));
110
+ }
111
+ return true;
112
+}
113
+
114
+static bool do_smov_umov(DisasContext *s, arg_SMOV *a, MemOp is_signed)
115
+{
116
+ MemOp esz;
117
+ unsigned idx;
118
+
119
+ if (!decode_esz_idx(a->imm, &esz, &idx)) {
120
+ return false;
121
+ }
122
+ if (is_signed) {
123
+ if (esz == MO_64 || (esz == MO_32 && !a->q)) {
124
+ return false;
125
+ }
126
+ } else {
127
+ if (esz == MO_64 ? !a->q : a->q) {
128
+ return false;
129
+ }
130
+ }
131
+ if (fp_access_check(s)) {
132
+ TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
133
+ read_vec_element(s, tcg_rd, a->rn, idx, esz | is_signed);
134
+ if (is_signed && !a->q) {
135
+ tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
136
+ }
137
+ }
138
+ return true;
139
+}
140
+
141
+TRANS(SMOV, do_smov_umov, a, MO_SIGN)
142
+TRANS(UMOV, do_smov_umov, a, 0)
143
+
144
+static bool trans_INS_general(DisasContext *s, arg_INS_general *a)
145
+{
146
+ MemOp esz;
147
+ unsigned idx;
148
+
149
+ if (!decode_esz_idx(a->imm, &esz, &idx)) {
150
+ return false;
151
+ }
152
+ if (fp_access_check(s)) {
153
+ write_vec_element(s, cpu_reg(s, a->rn), a->rd, idx, esz);
154
+ clear_vec_high(s, true, a->rd);
155
+ }
156
+ return true;
157
+}
158
+
159
+static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
160
+{
161
+ MemOp esz;
162
+ unsigned didx, sidx;
163
+
164
+ if (!decode_esz_idx(a->di, &esz, &didx)) {
165
+ return false;
166
+ }
167
+ sidx = a->si >> esz;
168
+ if (fp_access_check(s)) {
169
+ TCGv_i64 tmp = tcg_temp_new_i64();
170
+
171
+ read_vec_element(s, tmp, a->rn, sidx, esz);
172
+ write_vec_element(s, tmp, a->rd, didx, esz);
173
+
174
+ /* INS is considered a 128-bit write for SVE. */
175
+ clear_vec_high(s, true, a->rd);
176
+ }
177
+ return true;
178
+}
179
+
180
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
181
* Note that it is the caller's responsibility to ensure that the
182
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
183
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
184
write_fp_dreg(s, rd, tcg_res);
72
}
185
}
73
186
74
static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
187
-/* DUP (Element, Vector)
75
@@ -XXX,XX +XXX,XX @@ static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
188
- *
76
189
- * 31 30 29 21 20 16 15 10 9 5 4 0
77
pageaddr = sextract64(value << 12, 0, 40);
190
- * +---+---+-------------------+--------+-------------+------+------+
78
191
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 | imm5 | 0 0 0 0 0 1 | Rn | Rd |
79
- tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S2NS);
192
- * +---+---+-------------------+--------+-------------+------+------+
80
+ tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
193
- *
81
}
194
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
82
195
- */
83
static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
196
-static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
84
@@ -XXX,XX +XXX,XX @@ static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
197
- int imm5)
85
pageaddr = sextract64(value << 12, 0, 40);
198
-{
86
199
- int size = ctz32(imm5);
87
tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
200
- int index;
88
- ARMMMUIdxBit_S2NS);
201
-
89
+ ARMMMUIdxBit_Stage2);
202
- if (size > 3 || (size == 3 && !is_q)) {
90
}
203
- unallocated_encoding(s);
91
204
- return;
92
static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
205
- }
93
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
206
-
94
ARMCPU *cpu = env_archcpu(env);
207
- if (!fp_access_check(s)) {
95
CPUState *cs = CPU(cpu);
208
- return;
96
209
- }
97
- /* Accesses to VTTBR may change the VMID so we must flush the TLB. */
210
-
98
+ /*
211
- index = imm5 >> (size + 1);
99
+ * A change in VMID to the stage2 page table (Stage2) invalidates
212
- tcg_gen_gvec_dup_mem(size, vec_full_reg_offset(s, rd),
100
+ * the combined stage 1&2 tlbs (EL10_1 and EL10_0).
213
- vec_reg_offset(s, rn, index, size),
101
+ */
214
- is_q ? 16 : 8, vec_full_reg_size(s));
102
if (raw_read(env, ri) != value) {
215
-}
103
tlb_flush_by_mmuidx(cs,
216
-
104
ARMMMUIdxBit_E10_1 |
217
-/* DUP (element, scalar)
105
ARMMMUIdxBit_E10_0 |
218
- * 31 21 20 16 15 10 9 5 4 0
106
- ARMMMUIdxBit_S2NS);
219
- * +-----------------------+--------+-------------+------+------+
107
+ ARMMMUIdxBit_Stage2);
220
- * | 0 1 0 1 1 1 1 0 0 0 0 | imm5 | 0 0 0 0 0 1 | Rn | Rd |
108
raw_write(env, ri, value);
221
- * +-----------------------+--------+-------------+------+------+
222
- */
223
-static void handle_simd_dupes(DisasContext *s, int rd, int rn,
224
- int imm5)
225
-{
226
- int size = ctz32(imm5);
227
- int index;
228
- TCGv_i64 tmp;
229
-
230
- if (size > 3) {
231
- unallocated_encoding(s);
232
- return;
233
- }
234
-
235
- if (!fp_access_check(s)) {
236
- return;
237
- }
238
-
239
- index = imm5 >> (size + 1);
240
-
241
- /* This instruction just extracts the specified element and
242
- * zero-extends it into the bottom of the destination register.
243
- */
244
- tmp = tcg_temp_new_i64();
245
- read_vec_element(s, tmp, rn, index, size);
246
- write_fp_dreg(s, rd, tmp);
247
-}
248
-
249
-/* DUP (General)
250
- *
251
- * 31 30 29 21 20 16 15 10 9 5 4 0
252
- * +---+---+-------------------+--------+-------------+------+------+
253
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 | imm5 | 0 0 0 0 1 1 | Rn | Rd |
254
- * +---+---+-------------------+--------+-------------+------+------+
255
- *
256
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
257
- */
258
-static void handle_simd_dupg(DisasContext *s, int is_q, int rd, int rn,
259
- int imm5)
260
-{
261
- int size = ctz32(imm5);
262
- uint32_t dofs, oprsz, maxsz;
263
-
264
- if (size > 3 || ((size == 3) && !is_q)) {
265
- unallocated_encoding(s);
266
- return;
267
- }
268
-
269
- if (!fp_access_check(s)) {
270
- return;
271
- }
272
-
273
- dofs = vec_full_reg_offset(s, rd);
274
- oprsz = is_q ? 16 : 8;
275
- maxsz = vec_full_reg_size(s);
276
-
277
- tcg_gen_gvec_dup_i64(size, dofs, oprsz, maxsz, cpu_reg(s, rn));
278
-}
279
-
280
-/* INS (Element)
281
- *
282
- * 31 21 20 16 15 14 11 10 9 5 4 0
283
- * +-----------------------+--------+------------+---+------+------+
284
- * | 0 1 1 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 | Rn | Rd |
285
- * +-----------------------+--------+------------+---+------+------+
286
- *
287
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
288
- * index: encoded in imm5<4:size+1>
289
- */
290
-static void handle_simd_inse(DisasContext *s, int rd, int rn,
291
- int imm4, int imm5)
292
-{
293
- int size = ctz32(imm5);
294
- int src_index, dst_index;
295
- TCGv_i64 tmp;
296
-
297
- if (size > 3) {
298
- unallocated_encoding(s);
299
- return;
300
- }
301
-
302
- if (!fp_access_check(s)) {
303
- return;
304
- }
305
-
306
- dst_index = extract32(imm5, 1+size, 5);
307
- src_index = extract32(imm4, size, 4);
308
-
309
- tmp = tcg_temp_new_i64();
310
-
311
- read_vec_element(s, tmp, rn, src_index, size);
312
- write_vec_element(s, tmp, rd, dst_index, size);
313
-
314
- /* INS is considered a 128-bit write for SVE. */
315
- clear_vec_high(s, true, rd);
316
-}
317
-
318
-
319
-/* INS (General)
320
- *
321
- * 31 21 20 16 15 10 9 5 4 0
322
- * +-----------------------+--------+-------------+------+------+
323
- * | 0 1 0 0 1 1 1 0 0 0 0 | imm5 | 0 0 0 1 1 1 | Rn | Rd |
324
- * +-----------------------+--------+-------------+------+------+
325
- *
326
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
327
- * index: encoded in imm5<4:size+1>
328
- */
329
-static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
330
-{
331
- int size = ctz32(imm5);
332
- int idx;
333
-
334
- if (size > 3) {
335
- unallocated_encoding(s);
336
- return;
337
- }
338
-
339
- if (!fp_access_check(s)) {
340
- return;
341
- }
342
-
343
- idx = extract32(imm5, 1 + size, 4 - size);
344
- write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
345
-
346
- /* INS is considered a 128-bit write for SVE. */
347
- clear_vec_high(s, true, rd);
348
-}
349
-
350
-/*
351
- * UMOV (General)
352
- * SMOV (General)
353
- *
354
- * 31 30 29 21 20 16 15 12 10 9 5 4 0
355
- * +---+---+-------------------+--------+-------------+------+------+
356
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 | imm5 | 0 0 1 U 1 1 | Rn | Rd |
357
- * +---+---+-------------------+--------+-------------+------+------+
358
- *
359
- * U: unsigned when set
360
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
361
- */
362
-static void handle_simd_umov_smov(DisasContext *s, int is_q, int is_signed,
363
- int rn, int rd, int imm5)
364
-{
365
- int size = ctz32(imm5);
366
- int element;
367
- TCGv_i64 tcg_rd;
368
-
369
- /* Check for UnallocatedEncodings */
370
- if (is_signed) {
371
- if (size > 2 || (size == 2 && !is_q)) {
372
- unallocated_encoding(s);
373
- return;
374
- }
375
- } else {
376
- if (size > 3
377
- || (size < 3 && is_q)
378
- || (size == 3 && !is_q)) {
379
- unallocated_encoding(s);
380
- return;
381
- }
382
- }
383
-
384
- if (!fp_access_check(s)) {
385
- return;
386
- }
387
-
388
- element = extract32(imm5, 1+size, 4);
389
-
390
- tcg_rd = cpu_reg(s, rd);
391
- read_vec_element(s, tcg_rd, rn, element, size | (is_signed ? MO_SIGN : 0));
392
- if (is_signed && !is_q) {
393
- tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
394
- }
395
-}
396
-
397
-/* AdvSIMD copy
398
- * 31 30 29 28 21 20 16 15 14 11 10 9 5 4 0
399
- * +---+---+----+-----------------+------+---+------+---+------+------+
400
- * | 0 | Q | op | 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 | Rn | Rd |
401
- * +---+---+----+-----------------+------+---+------+---+------+------+
402
- */
403
-static void disas_simd_copy(DisasContext *s, uint32_t insn)
404
-{
405
- int rd = extract32(insn, 0, 5);
406
- int rn = extract32(insn, 5, 5);
407
- int imm4 = extract32(insn, 11, 4);
408
- int op = extract32(insn, 29, 1);
409
- int is_q = extract32(insn, 30, 1);
410
- int imm5 = extract32(insn, 16, 5);
411
-
412
- if (op) {
413
- if (is_q) {
414
- /* INS (element) */
415
- handle_simd_inse(s, rd, rn, imm4, imm5);
416
- } else {
417
- unallocated_encoding(s);
418
- }
419
- } else {
420
- switch (imm4) {
421
- case 0:
422
- /* DUP (element - vector) */
423
- handle_simd_dupe(s, is_q, rd, rn, imm5);
424
- break;
425
- case 1:
426
- /* DUP (general) */
427
- handle_simd_dupg(s, is_q, rd, rn, imm5);
428
- break;
429
- case 3:
430
- if (is_q) {
431
- /* INS (general) */
432
- handle_simd_insg(s, rd, rn, imm5);
433
- } else {
434
- unallocated_encoding(s);
435
- }
436
- break;
437
- case 5:
438
- case 7:
439
- /* UMOV/SMOV (is_q indicates 32/64; imm4 indicates signedness) */
440
- handle_simd_umov_smov(s, is_q, (imm4 == 5), rn, rd, imm5);
441
- break;
442
- default:
443
- unallocated_encoding(s);
444
- break;
445
- }
446
- }
447
-}
448
-
449
/* AdvSIMD modified immediate
450
* 31 30 29 28 19 18 16 15 12 11 10 9 5 4 0
451
* +---+---+----+---------------------+-----+-------+----+---+-------+------+
452
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
109
}
453
}
110
}
454
}
111
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
455
112
if (arm_is_secure_below_el3(env)) {
456
-/* AdvSIMD scalar copy
113
return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
457
- * 31 30 29 28 21 20 16 15 14 11 10 9 5 4 0
114
} else if (arm_feature(env, ARM_FEATURE_EL2)) {
458
- * +-----+----+-----------------+------+---+------+---+------+------+
115
- return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0 | ARMMMUIdxBit_S2NS;
459
- * | 0 1 | op | 1 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 | Rn | Rd |
116
+ return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0 | ARMMMUIdxBit_Stage2;
460
- * +-----+----+-----------------+------+---+------+---+------+------+
117
} else {
461
- */
118
return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0;
462
-static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
119
}
463
-{
120
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
464
- int rd = extract32(insn, 0, 5);
121
465
- int rn = extract32(insn, 5, 5);
122
pageaddr = sextract64(value << 12, 0, 48);
466
- int imm4 = extract32(insn, 11, 4);
123
467
- int imm5 = extract32(insn, 16, 5);
124
- tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S2NS);
468
- int op = extract32(insn, 29, 1);
125
+ tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
469
-
126
}
470
- if (op != 0 || imm4 != 0) {
127
471
- unallocated_encoding(s);
128
static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
472
- return;
129
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
473
- }
130
pageaddr = sextract64(value << 12, 0, 48);
474
-
131
475
- /* DUP (element, scalar) */
132
tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
476
- handle_simd_dupes(s, rd, rn, imm5);
133
- ARMMMUIdxBit_S2NS);
477
-}
134
+ ARMMMUIdxBit_Stage2);
478
-
135
}
479
/* AdvSIMD scalar pairwise
136
480
* 31 30 29 28 24 23 22 21 17 16 12 11 10 9 5 4 0
137
static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
481
* +-----+---+-----------+------+-----------+--------+-----+------+------+
138
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
482
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
139
static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
483
{ 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
140
{
484
{ 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
141
switch (mmu_idx) {
485
{ 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
142
- case ARMMMUIdx_S2NS:
486
- { 0x0e000400, 0x9fe08400, disas_simd_copy },
143
+ case ARMMMUIdx_Stage2:
487
{ 0x0f000000, 0x9f000400, disas_simd_indexed }, /* vector indexed */
144
case ARMMMUIdx_S1E2:
488
/* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
145
return 2;
489
{ 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
146
case ARMMMUIdx_S1E3:
490
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
147
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
491
{ 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
148
}
492
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
149
}
493
{ 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
150
494
- { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
151
- if (mmu_idx == ARMMMUIdx_S2NS) {
495
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
152
+ if (mmu_idx == ARMMMUIdx_Stage2) {
496
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
153
/* HCR.DC means HCR.VM behaves as 1 */
497
{ 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
154
return (env->cp15.hcr_el2 & (HCR_DC | HCR_VM)) == 0;
155
}
156
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_big_endian(CPUARMState *env,
157
static inline uint64_t regime_ttbr(CPUARMState *env, ARMMMUIdx mmu_idx,
158
int ttbrn)
159
{
160
- if (mmu_idx == ARMMMUIdx_S2NS) {
161
+ if (mmu_idx == ARMMMUIdx_Stage2) {
162
return env->cp15.vttbr_el2;
163
}
164
if (ttbrn == 0) {
165
@@ -XXX,XX +XXX,XX @@ static inline uint64_t regime_ttbr(CPUARMState *env, ARMMMUIdx mmu_idx,
166
/* Return the TCR controlling this translation regime */
167
static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
168
{
169
- if (mmu_idx == ARMMMUIdx_S2NS) {
170
+ if (mmu_idx == ARMMMUIdx_Stage2) {
171
return &env->cp15.vtcr_el2;
172
}
173
return &env->cp15.tcr_el[regime_el(env, mmu_idx)];
174
@@ -XXX,XX +XXX,XX @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
175
bool have_wxn;
176
int wxn = 0;
177
178
- assert(mmu_idx != ARMMMUIdx_S2NS);
179
+ assert(mmu_idx != ARMMMUIdx_Stage2);
180
181
user_rw = simple_ap_to_rw_prot_is_user(ap, true);
182
if (is_user) {
183
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
184
ARMMMUFaultInfo *fi)
185
{
186
if ((mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1) &&
187
- !regime_translation_disabled(env, ARMMMUIdx_S2NS)) {
188
+ !regime_translation_disabled(env, ARMMMUIdx_Stage2)) {
189
target_ulong s2size;
190
hwaddr s2pa;
191
int s2prot;
192
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
193
pcacheattrs = &cacheattrs;
194
}
195
196
- ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_S2NS, &s2pa,
197
+ ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
198
&txattrs, &s2prot, &s2size, fi, pcacheattrs);
199
if (ret) {
200
assert(fi->type != ARMFault_None);
201
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
202
tsz = extract32(tcr, 0, 6);
203
using64k = extract32(tcr, 14, 1);
204
using16k = extract32(tcr, 15, 1);
205
- if (mmu_idx == ARMMMUIdx_S2NS) {
206
+ if (mmu_idx == ARMMMUIdx_Stage2) {
207
/* VTCR_EL2 */
208
tbi = tbid = hpd = false;
209
} else {
210
@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
211
int select, tsz;
212
bool epd, hpd;
213
214
- if (mmu_idx == ARMMMUIdx_S2NS) {
215
+ if (mmu_idx == ARMMMUIdx_Stage2) {
216
/* VTCR */
217
bool sext = extract32(tcr, 4, 1);
218
bool sign = extract32(tcr, 3, 1);
219
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
220
level = 1;
221
/* There is no TTBR1 for EL2 */
222
ttbr1_valid = (el != 2);
223
- addrsize = (mmu_idx == ARMMMUIdx_S2NS ? 40 : 32);
224
+ addrsize = (mmu_idx == ARMMMUIdx_Stage2 ? 40 : 32);
225
inputsize = addrsize - param.tsz;
226
}
227
228
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
229
goto do_fault;
230
}
231
232
- if (mmu_idx != ARMMMUIdx_S2NS) {
233
+ if (mmu_idx != ARMMMUIdx_Stage2) {
234
/* The starting level depends on the virtual address size (which can
235
* be up to 48 bits) and the translation granule size. It indicates
236
* the number of strides (stride bits at a time) needed to
237
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
238
attrs = extract64(descriptor, 2, 10)
239
| (extract64(descriptor, 52, 12) << 10);
240
241
- if (mmu_idx == ARMMMUIdx_S2NS) {
242
+ if (mmu_idx == ARMMMUIdx_Stage2) {
243
/* Stage 2 table descriptors do not include any attribute fields */
244
break;
245
}
246
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
247
ap = extract32(attrs, 4, 2);
248
xn = extract32(attrs, 12, 1);
249
250
- if (mmu_idx == ARMMMUIdx_S2NS) {
251
+ if (mmu_idx == ARMMMUIdx_Stage2) {
252
ns = true;
253
*prot = get_S2prot(env, ap, xn);
254
} else {
255
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
256
}
257
258
if (cacheattrs != NULL) {
259
- if (mmu_idx == ARMMMUIdx_S2NS) {
260
+ if (mmu_idx == ARMMMUIdx_Stage2) {
261
cacheattrs->attrs = convert_stage2_attrs(env,
262
extract32(attrs, 0, 4));
263
} else {
264
@@ -XXX,XX +XXX,XX @@ do_fault:
265
fi->type = fault_type;
266
fi->level = level;
267
/* Tag the error as S2 for failed S1 PTW at S2 or ordinary S2. */
268
- fi->stage2 = fi->s1ptw || (mmu_idx == ARMMMUIdx_S2NS);
269
+ fi->stage2 = fi->s1ptw || (mmu_idx == ARMMMUIdx_Stage2);
270
return true;
271
}
272
273
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
274
prot, page_size, fi, cacheattrs);
275
276
/* If S1 fails or S2 is disabled, return early. */
277
- if (ret || regime_translation_disabled(env, ARMMMUIdx_S2NS)) {
278
+ if (ret || regime_translation_disabled(env, ARMMMUIdx_Stage2)) {
279
*phys_ptr = ipa;
280
return ret;
281
}
282
283
/* S1 is done. Now do S2 translation. */
284
- ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_S2NS,
285
+ ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
286
phys_ptr, attrs, &s2_prot,
287
page_size, fi,
288
cacheattrs != NULL ? &cacheattrs2 : NULL);
289
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
290
/* Fast Context Switch Extension. This doesn't exist at all in v8.
291
* In v7 and earlier it affects all stage 1 translations.
292
*/
293
- if (address < 0x02000000 && mmu_idx != ARMMMUIdx_S2NS
294
+ if (address < 0x02000000 && mmu_idx != ARMMMUIdx_Stage2
295
&& !arm_feature(env, ARM_FEATURE_V8)) {
296
if (regime_el(env, mmu_idx) == 3) {
297
address += env->cp15.fcseidr_s;
298
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
299
index XXXXXXX..XXXXXXX 100644
300
--- a/target/arm/translate-a64.c
301
+++ b/target/arm/translate-a64.c
302
@@ -XXX,XX +XXX,XX @@ static inline int get_a64_user_mem_index(DisasContext *s)
303
case ARMMMUIdx_S1SE1:
304
useridx = ARMMMUIdx_S1SE0;
305
break;
306
- case ARMMMUIdx_S2NS:
307
+ case ARMMMUIdx_Stage2:
308
g_assert_not_reached();
309
default:
310
useridx = s->mmu_idx;
311
diff --git a/target/arm/translate.c b/target/arm/translate.c
312
index XXXXXXX..XXXXXXX 100644
313
--- a/target/arm/translate.c
314
+++ b/target/arm/translate.c
315
@@ -XXX,XX +XXX,XX @@ static inline int get_a32_user_mem_index(DisasContext *s)
316
case ARMMMUIdx_MSUserNegPri:
317
case ARMMMUIdx_MSPrivNegPri:
318
return arm_to_core_mmu_idx(ARMMMUIdx_MSUserNegPri);
319
- case ARMMMUIdx_S2NS:
320
+ case ARMMMUIdx_Stage2:
321
default:
322
g_assert_not_reached();
323
}
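
To recap the structure the hunks above rely on: an EL1&0 translation
composes two walks -- stage 1 (VA to IPA) under the usual index, then
stage 2 (IPA to PA) under the renamed ARMMMUIdx_Stage2, as get_phys_addr()
does above.  A much-simplified sketch of that flow; stage1_walk(),
stage2_walk() and stage2_disabled() are hypothetical stand-ins, and the
real calls thread access type, attributes, page size and fault info
through every step:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical stand-ins for the get_phys_addr_lpae() calls above. */
    bool stage1_walk(uint64_t va, uint64_t *ipa);
    bool stage2_walk(uint64_t ipa, uint64_t *pa);
    bool stage2_disabled(void);

    static bool translate_el10(uint64_t va, uint64_t *pa)
    {
        uint64_t ipa;

        if (stage1_walk(va, &ipa)) {   /* stage 1: VA -> IPA */
            return true;               /* fault */
        }
        if (stage2_disabled()) {       /* e.g. HCR_EL2.VM clear */
            *pa = ipa;                 /* stage 2 off: IPA is the PA */
            return false;
        }
        return stage2_walk(ipa, pa);   /* stage 2: IPA -> PA */
    }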
--
2.20.1
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Apart from the wholesale redirection that HCR_EL2.E2H performs
for EL2, there's a separate redirection specific to the timers
that happens for EL0 when running in the EL2&0 regime.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-30-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 181 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 169 insertions(+), 12 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Convert all forms (scalar, vector, scalar indexed, vector indexed),
which allows us to remove switch table entries elsewhere.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-a64.h    |   8 ++
 target/arm/tcg/a64.decode      |  45 +++++
 target/arm/tcg/translate-a64.c | 221 +++++++++++++++++++++++++++------
 target/arm/tcg/vec_helper.c    |  39 +++---
 4 files changed, 259 insertions(+), 54 deletions(-)
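
A note on why FMULX gets dedicated helpers instead of reusing the plain
multiply: it behaves exactly like FMUL except that zero times infinity
returns 2.0 with the XORed sign, rather than the default NaN.  A
standalone illustration in plain C doubles (the real helpers, e.g. the
vfp_mulx* functions wired up below, use softfloat and also handle NaN
propagation, which this sketch elides):

    #include <math.h>
    #include <stdio.h>

    static double fmulx(double a, double b)
    {
        if ((a == 0.0 && isinf(b)) || (isinf(a) && b == 0.0)) {
            /* result is 2.0, sign = sign(a) XOR sign(b) */
            return copysign(2.0, a) * copysign(1.0, b);
        }
        return a * b;
    }

    int main(void)
    {
        printf("%g\n", fmulx(0.0, INFINITY));   /* 2  */
        printf("%g\n", fmulx(-0.0, INFINITY));  /* -2 */
        return 0;
    }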
diff --git a/target/arm/helper.c b/target/arm/helper.c
17
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
17
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.c
19
--- a/target/arm/tcg/helper-a64.h
19
+++ b/target/arm/helper.c
20
+++ b/target/arm/tcg/helper-a64.h
20
@@ -XXX,XX +XXX,XX @@ static void gt_phys_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(cpye, void, env, i32, i32, i32)
21
gt_ctl_write(env, ri, GTIMER_PHYS, value);
22
DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
23
DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
24
DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
25
+
26
+DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/tcg/a64.decode
36
+++ b/target/arm/tcg/a64.decode
37
@@ -XXX,XX +XXX,XX @@
38
#
39
40
%rd 0:5
41
+%esz_sd 22:1 !function=plus_2
42
+%hl 11:1 21:1
43
+%hlm 11:1 20:2
44
45
&r rn
46
&ri rd imm
47
&rri_sf rd rn imm sf
48
&i imm
49
+&rrr_e rd rn rm esz
50
+&rrx_e rd rn rm idx esz
51
&qrr_e q rd rn esz
52
&qrrr_e q rd rn rm esz
53
+&qrrx_e q rd rn rm idx esz
54
&qrrrr_e q rd rn rm ra esz
55
56
+@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
57
+@rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
58
+
59
+@rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm
60
+@rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl
61
+@rrx_d ........ .. . rm:5 .... idx:1 . rn:5 rd:5 &rrx_e esz=3
62
+
63
@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
64
@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
65
@rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
66
@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
67
@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
68
69
+@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
70
+@qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd
71
+
72
+@qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
73
+ &qrrx_e esz=1 idx=%hlm
74
+@qrrx_s . q:1 .. .... .. . rm:5 .... . . rn:5 rd:5 \
75
+ &qrrx_e esz=2 idx=%hl
76
+@qrrx_d . q:1 .. .... .. . rm:5 .... idx:1 . rn:5 rd:5 \
77
+ &qrrx_e esz=3
78
+
79
### Data Processing - Immediate
80
81
# PC-rel addressing
82
@@ -XXX,XX +XXX,XX @@ INS_general 0 1 00 1110 000 imm:5 0 0011 1 rn:5 rd:5
83
SMOV 0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
84
UMOV 0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
85
INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
86
+
87
+### Advanced SIMD scalar three same
88
+
89
+FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
90
+FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
91
+
92
+### Advanced SIMD three same
93
+
94
+FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
95
+FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
96
+
97
+### Advanced SIMD scalar x indexed element
98
+
99
+FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
100
+FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
101
+FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
102
+
103
+### Advanced SIMD vector x indexed element
104
+
105
+FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
106
+FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
107
+FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
108
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
109
index XXXXXXX..XXXXXXX 100644
110
--- a/target/arm/tcg/translate-a64.c
111
+++ b/target/arm/tcg/translate-a64.c
112
@@ -XXX,XX +XXX,XX @@ static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
113
return true;
22
}
114
}
23
115
24
+static int gt_phys_redir_timeridx(CPUARMState *env)
116
+/*
117
+ * Advanced SIMD three same
118
+ */
119
+
120
+typedef struct FPScalar {
121
+ void (*gen_h)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
122
+ void (*gen_s)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
123
+ void (*gen_d)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
124
+} FPScalar;
125
+
126
+static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
25
+{
127
+{
26
+ switch (arm_mmu_idx(env)) {
128
+ switch (a->esz) {
27
+ case ARMMMUIdx_E20_0:
129
+ case MO_64:
28
+ case ARMMMUIdx_E20_2:
130
+ if (fp_access_check(s)) {
29
+ return GTIMER_HYP;
131
+ TCGv_i64 t0 = read_fp_dreg(s, a->rn);
132
+ TCGv_i64 t1 = read_fp_dreg(s, a->rm);
133
+ f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
134
+ write_fp_dreg(s, a->rd, t0);
135
+ }
136
+ break;
137
+ case MO_32:
138
+ if (fp_access_check(s)) {
139
+ TCGv_i32 t0 = read_fp_sreg(s, a->rn);
140
+ TCGv_i32 t1 = read_fp_sreg(s, a->rm);
141
+ f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
142
+ write_fp_sreg(s, a->rd, t0);
143
+ }
144
+ break;
145
+ case MO_16:
146
+ if (!dc_isar_feature(aa64_fp16, s)) {
147
+ return false;
148
+ }
149
+ if (fp_access_check(s)) {
150
+ TCGv_i32 t0 = read_fp_hreg(s, a->rn);
151
+ TCGv_i32 t1 = read_fp_hreg(s, a->rm);
152
+ f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
153
+ write_fp_sreg(s, a->rd, t0);
154
+ }
155
+ break;
30
+ default:
156
+ default:
31
+ return GTIMER_PHYS;
157
+ return false;
32
+ }
158
+ }
159
+ return true;
33
+}
160
+}
34
+
161
+
35
+static int gt_virt_redir_timeridx(CPUARMState *env)
162
+static const FPScalar f_scalar_fmulx = {
163
+ gen_helper_advsimd_mulxh,
164
+ gen_helper_vfp_mulxs,
165
+ gen_helper_vfp_mulxd,
166
+};
167
+TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx)
168
+
169
+static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
170
+ gen_helper_gvec_3_ptr * const fns[3])
36
+{
171
+{
37
+ switch (arm_mmu_idx(env)) {
172
+ MemOp esz = a->esz;
38
+ case ARMMMUIdx_E20_0:
173
+
39
+ case ARMMMUIdx_E20_2:
174
+ switch (esz) {
40
+ return GTIMER_HYPVIRT;
175
+ case MO_64:
176
+ if (!a->q) {
177
+ return false;
178
+ }
179
+ break;
180
+ case MO_32:
181
+ break;
182
+ case MO_16:
183
+ if (!dc_isar_feature(aa64_fp16, s)) {
184
+ return false;
185
+ }
186
+ break;
41
+ default:
187
+ default:
42
+ return GTIMER_VIRT;
188
+ return false;
43
+ }
189
+ }
190
+ if (fp_access_check(s)) {
191
+ gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
192
+ esz == MO_16, 0, fns[esz - 1]);
193
+ }
194
+ return true;
44
+}
195
+}
45
+
196
+
46
+static uint64_t gt_phys_redir_cval_read(CPUARMState *env,
197
+static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
47
+ const ARMCPRegInfo *ri)
198
+ gen_helper_gvec_fmulx_h,
199
+ gen_helper_gvec_fmulx_s,
200
+ gen_helper_gvec_fmulx_d,
201
+};
202
+TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx)
203
+
204
+/*
205
+ * Advanced SIMD scalar/vector x indexed element
206
+ */
207
+
208
+static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
48
+{
209
+{
49
+ int timeridx = gt_phys_redir_timeridx(env);
210
+ switch (a->esz) {
50
+ return env->cp15.c14_timer[timeridx].cval;
211
+ case MO_64:
212
+ if (fp_access_check(s)) {
213
+ TCGv_i64 t0 = read_fp_dreg(s, a->rn);
214
+ TCGv_i64 t1 = tcg_temp_new_i64();
215
+
216
+ read_vec_element(s, t1, a->rm, a->idx, MO_64);
217
+ f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
218
+ write_fp_dreg(s, a->rd, t0);
219
+ }
220
+ break;
221
+ case MO_32:
222
+ if (fp_access_check(s)) {
223
+ TCGv_i32 t0 = read_fp_sreg(s, a->rn);
224
+ TCGv_i32 t1 = tcg_temp_new_i32();
225
+
226
+ read_vec_element_i32(s, t1, a->rm, a->idx, MO_32);
227
+ f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
228
+ write_fp_sreg(s, a->rd, t0);
229
+ }
230
+ break;
231
+ case MO_16:
232
+ if (!dc_isar_feature(aa64_fp16, s)) {
233
+ return false;
234
+ }
235
+ if (fp_access_check(s)) {
236
+ TCGv_i32 t0 = read_fp_hreg(s, a->rn);
237
+ TCGv_i32 t1 = tcg_temp_new_i32();
238
+
239
+ read_vec_element_i32(s, t1, a->rm, a->idx, MO_16);
240
+ f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
241
+ write_fp_sreg(s, a->rd, t0);
242
+ }
243
+ break;
244
+ default:
245
+ g_assert_not_reached();
246
+ }
247
+ return true;
51
+}
248
+}
52
+
249
+
53
+static void gt_phys_redir_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
250
+TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)
54
+ uint64_t value)
251
+
252
+static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
253
+ gen_helper_gvec_3_ptr * const fns[3])
55
+{
254
+{
56
+ int timeridx = gt_phys_redir_timeridx(env);
255
+ MemOp esz = a->esz;
57
+ gt_cval_write(env, ri, timeridx, value);
256
+
257
+ switch (esz) {
258
+ case MO_64:
259
+ if (!a->q) {
260
+ return false;
261
+ }
262
+ break;
263
+ case MO_32:
264
+ break;
265
+ case MO_16:
266
+ if (!dc_isar_feature(aa64_fp16, s)) {
267
+ return false;
268
+ }
269
+ break;
270
+ default:
271
+ g_assert_not_reached();
272
+ }
273
+ if (fp_access_check(s)) {
274
+ gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
275
+ esz == MO_16, a->idx, fns[esz - 1]);
276
+ }
277
+ return true;
58
+}
278
+}
59
+
279
+
60
+static uint64_t gt_phys_redir_tval_read(CPUARMState *env,
280
+static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
61
+ const ARMCPRegInfo *ri)
281
+ gen_helper_gvec_fmulx_idx_h,
62
+{
282
+ gen_helper_gvec_fmulx_idx_s,
63
+ int timeridx = gt_phys_redir_timeridx(env);
283
+ gen_helper_gvec_fmulx_idx_d,
64
+ return gt_tval_read(env, ri, timeridx);
284
+};
65
+}
285
+TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx)
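/*
 * Aside on the TRANS() lines above: the macro (from translate.h)
 * expands to roughly
 *
 *     static bool trans_FMULX_vi(DisasContext *s, arg_FMULX_vi *a)
 *     { return do_fp3_vector_idx(s, a, f_vector_idx_fmulx); }
 *
 * so one shared body serves many decodetree patterns.  Sketch from
 * memory of the macro; the exact definition lives in translate.h.
 */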
66
+
286
+
67
+static void gt_phys_redir_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
287
+
68
+ uint64_t value)
288
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
69
+{
289
* Note that it is the caller's responsibility to ensure that the
70
+ int timeridx = gt_phys_redir_timeridx(env);
290
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
71
+ gt_tval_write(env, ri, timeridx, value);
291
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
72
+}
292
case 0x1a: /* FADD */
73
+
293
gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
74
+static uint64_t gt_phys_redir_ctl_read(CPUARMState *env,
294
break;
75
+ const ARMCPRegInfo *ri)
295
- case 0x1b: /* FMULX */
76
+{
296
- gen_helper_vfp_mulxd(tcg_res, tcg_op1, tcg_op2, fpst);
77
+ int timeridx = gt_phys_redir_timeridx(env);
297
- break;
78
+ return env->cp15.c14_timer[timeridx].ctl;
298
case 0x1c: /* FCMEQ */
79
+}
299
gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
80
+
300
break;
81
+static void gt_phys_redir_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
301
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
82
+ uint64_t value)
302
gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
83
+{
303
break;
84
+ int timeridx = gt_phys_redir_timeridx(env);
304
default:
85
+ gt_ctl_write(env, ri, timeridx, value);
305
+ case 0x1b: /* FMULX */
86
+}
306
g_assert_not_reached();
87
+
307
}
88
static void gt_virt_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
308
89
{
309
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
90
gt_timer_reset(env, ri, GTIMER_VIRT);
310
case 0x1a: /* FADD */
91
@@ -XXX,XX +XXX,XX @@ static void gt_cntvoff_write(CPUARMState *env, const ARMCPRegInfo *ri,
311
gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
92
gt_recalc_timer(cpu, GTIMER_VIRT);
312
break;
313
- case 0x1b: /* FMULX */
314
- gen_helper_vfp_mulxs(tcg_res, tcg_op1, tcg_op2, fpst);
315
- break;
316
case 0x1c: /* FCMEQ */
317
gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
318
break;
319
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
320
gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
321
break;
322
default:
323
+ case 0x1b: /* FMULX */
324
g_assert_not_reached();
325
}
326
327
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
328
/* Floating point: U, size[1] and opcode indicate operation */
329
int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
330
switch (fpopcode) {
331
- case 0x1b: /* FMULX */
332
case 0x1f: /* FRECPS */
333
case 0x3f: /* FRSQRTS */
334
case 0x5d: /* FACGE */
335
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
336
case 0x7a: /* FABD */
337
break;
338
default:
339
+ case 0x1b: /* FMULX */
340
unallocated_encoding(s);
341
return;
342
}
343
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
344
TCGv_i32 tcg_res;
345
346
switch (fpopcode) {
347
- case 0x03: /* FMULX */
348
case 0x04: /* FCMEQ (reg) */
349
case 0x07: /* FRECPS */
350
case 0x0f: /* FRSQRTS */
351
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
352
case 0x1d: /* FACGT */
353
break;
354
default:
355
+ case 0x03: /* FMULX */
356
unallocated_encoding(s);
357
return;
358
}
359
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
360
tcg_res = tcg_temp_new_i32();
361
362
switch (fpopcode) {
363
- case 0x03: /* FMULX */
364
- gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
365
- break;
366
case 0x04: /* FCMEQ (reg) */
367
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
368
break;
369
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
370
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
371
break;
372
default:
373
+ case 0x03: /* FMULX */
374
g_assert_not_reached();
375
}
376
377
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
378
handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
379
rn, rm, rd);
380
return;
381
- case 0x1b: /* FMULX */
382
case 0x1f: /* FRECPS */
383
case 0x3f: /* FRSQRTS */
384
case 0x5d: /* FACGE */
385
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
386
return;
387
388
default:
389
+ case 0x1b: /* FMULX */
390
unallocated_encoding(s);
391
return;
392
}
393
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
394
case 0x0: /* FMAXNM */
395
case 0x1: /* FMLA */
396
case 0x2: /* FADD */
397
- case 0x3: /* FMULX */
398
case 0x4: /* FCMEQ */
399
case 0x6: /* FMAX */
400
case 0x7: /* FRECPS */
401
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
402
pairwise = true;
403
break;
404
default:
405
+ case 0x3: /* FMULX */
406
unallocated_encoding(s);
407
return;
408
}
409
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
410
case 0x2: /* FADD */
411
gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
412
break;
413
- case 0x3: /* FMULX */
414
- gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
415
- break;
416
case 0x4: /* FCMEQ */
417
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
418
break;
419
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
420
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
421
break;
422
default:
423
+ case 0x3: /* FMULX */
424
g_assert_not_reached();
425
}
426
427
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
428
case 0x01: /* FMLA */
429
case 0x05: /* FMLS */
430
case 0x09: /* FMUL */
431
- case 0x19: /* FMULX */
432
is_fp = 1;
433
break;
434
case 0x1d: /* SQRDMLAH */
435
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
436
/* is_fp, but we pass tcg_env not fp_status. */
437
break;
438
default:
439
+ case 0x19: /* FMULX */
440
unallocated_encoding(s);
441
return;
442
}
443
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
444
case 0x09: /* FMUL */
445
gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
446
break;
447
- case 0x19: /* FMULX */
448
- gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
449
- break;
450
default:
451
+ case 0x19: /* FMULX */
452
g_assert_not_reached();
453
}
454
455
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
456
g_assert_not_reached();
457
}
458
break;
459
- case 0x19: /* FMULX */
460
- switch (size) {
461
- case 1:
462
- if (is_scalar) {
463
- gen_helper_advsimd_mulxh(tcg_res, tcg_op,
464
- tcg_idx, fpst);
465
- } else {
466
- gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
467
- tcg_idx, fpst);
468
- }
469
- break;
470
- case 2:
471
- gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
472
- break;
473
- default:
474
- g_assert_not_reached();
475
- }
476
- break;
477
case 0x0c: /* SQDMULH */
478
if (size == 1) {
479
gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
480
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
481
}
482
break;
483
default:
484
+ case 0x19: /* FMULX */
485
g_assert_not_reached();
486
}
487
488
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
489
index XXXXXXX..XXXXXXX 100644
490
--- a/target/arm/tcg/vec_helper.c
491
+++ b/target/arm/tcg/vec_helper.c
492
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
493
DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)
494
495
#ifdef TARGET_AARCH64
496
+DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16)
497
+DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32)
498
+DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64)
499
500
DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
501
DO_3OP(gvec_recps_s, helper_recpsf_f32, float32)
502
@@ -XXX,XX +XXX,XX @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, H8)
503
504
#undef DO_MLA_IDX
505
506
-#define DO_FMUL_IDX(NAME, ADD, TYPE, H) \
507
+#define DO_FMUL_IDX(NAME, ADD, MUL, TYPE, H) \
508
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
509
{ \
510
intptr_t i, j, oprsz = simd_oprsz(desc); \
511
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
512
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
513
TYPE mm = m[H(i + idx)]; \
514
for (j = 0; j < segment; j++) { \
515
- d[i + j] = TYPE##_##ADD(d[i + j], \
516
- TYPE##_mul(n[i + j], mm, stat), stat); \
517
+ d[i + j] = ADD(d[i + j], MUL(n[i + j], mm, stat), stat); \
518
} \
519
} \
520
clear_tail(d, oprsz, simd_maxsz(desc)); \
93
}
521
}
94
522
95
+static uint64_t gt_virt_redir_cval_read(CPUARMState *env,
523
-#define float16_nop(N, M, S) (M)
96
+ const ARMCPRegInfo *ri)
524
-#define float32_nop(N, M, S) (M)
97
+{
525
-#define float64_nop(N, M, S) (M)
98
+ int timeridx = gt_virt_redir_timeridx(env);
526
+#define nop(N, M, S) (M)
99
+ return env->cp15.c14_timer[timeridx].cval;
527
100
+}
528
-DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16, H2)
101
+
529
-DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32, H4)
102
+static void gt_virt_redir_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
530
-DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64, H8)
103
+ uint64_t value)
531
+DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16_mul, float16, H2)
104
+{
532
+DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32_mul, float32, H4)
105
+ int timeridx = gt_virt_redir_timeridx(env);
533
+DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64_mul, float64, H8)
106
+ gt_cval_write(env, ri, timeridx, value);
534
+
107
+}
535
+#ifdef TARGET_AARCH64
108
+
536
+
109
+static uint64_t gt_virt_redir_tval_read(CPUARMState *env,
537
+DO_FMUL_IDX(gvec_fmulx_idx_h, nop, helper_advsimd_mulxh, float16, H2)
110
+ const ARMCPRegInfo *ri)
538
+DO_FMUL_IDX(gvec_fmulx_idx_s, nop, helper_vfp_mulxs, float32, H4)
111
+{
539
+DO_FMUL_IDX(gvec_fmulx_idx_d, nop, helper_vfp_mulxd, float64, H8)
112
+ int timeridx = gt_virt_redir_timeridx(env);
540
+
113
+ return gt_tval_read(env, ri, timeridx);
541
+#endif
114
+}
542
+
115
+
543
+#undef nop
116
+static void gt_virt_redir_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
544
117
+ uint64_t value)
545
/*
118
+{
546
* Non-fused multiply-accumulate operations, for Neon. NB that unlike
119
+ int timeridx = gt_virt_redir_timeridx(env);
547
* the fused ops below they assume accumulate both from and into Vd.
120
+ gt_tval_write(env, ri, timeridx, value);
548
*/
121
+}
549
-DO_FMUL_IDX(gvec_fmla_nf_idx_h, add, float16, H2)
122
+
550
-DO_FMUL_IDX(gvec_fmla_nf_idx_s, add, float32, H4)
123
+static uint64_t gt_virt_redir_ctl_read(CPUARMState *env,
551
-DO_FMUL_IDX(gvec_fmls_nf_idx_h, sub, float16, H2)
124
+ const ARMCPRegInfo *ri)
552
-DO_FMUL_IDX(gvec_fmls_nf_idx_s, sub, float32, H4)
125
+{
553
+DO_FMUL_IDX(gvec_fmla_nf_idx_h, float16_add, float16_mul, float16, H2)
126
+ int timeridx = gt_virt_redir_timeridx(env);
554
+DO_FMUL_IDX(gvec_fmla_nf_idx_s, float32_add, float32_mul, float32, H4)
127
+ return env->cp15.c14_timer[timeridx].ctl;
555
+DO_FMUL_IDX(gvec_fmls_nf_idx_h, float16_sub, float16_mul, float16, H2)
128
+}
556
+DO_FMUL_IDX(gvec_fmls_nf_idx_s, float32_sub, float32_mul, float32, H4)
129
+
557
130
+static void gt_virt_redir_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
558
-#undef float16_nop
131
+ uint64_t value)
559
-#undef float32_nop
132
+{
560
-#undef float64_nop
133
+ int timeridx = gt_virt_redir_timeridx(env);
561
#undef DO_FMUL_IDX
134
+ gt_ctl_write(env, ri, timeridx, value);
562
135
+}
563
#define DO_FMLA_IDX(NAME, TYPE, H) \
136
+
137
static void gt_hyp_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
138
{
139
gt_timer_reset(env, ri, GTIMER_HYP);
140
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
141
.accessfn = gt_ptimer_access,
142
.fieldoffset = offsetoflow32(CPUARMState,
143
cp15.c14_timer[GTIMER_PHYS].ctl),
144
- .writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
145
+ .readfn = gt_phys_redir_ctl_read, .raw_readfn = raw_read,
146
+ .writefn = gt_phys_redir_ctl_write, .raw_writefn = raw_write,
147
},
148
{ .name = "CNTP_CTL_S",
149
.cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
150
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
151
.accessfn = gt_ptimer_access,
152
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].ctl),
153
.resetvalue = 0,
154
- .writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
155
+ .readfn = gt_phys_redir_ctl_read, .raw_readfn = raw_read,
156
+ .writefn = gt_phys_redir_ctl_write, .raw_writefn = raw_write,
157
},
158
{ .name = "CNTV_CTL", .cp = 15, .crn = 14, .crm = 3, .opc1 = 0, .opc2 = 1,
159
.type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL0_RW,
160
.accessfn = gt_vtimer_access,
161
.fieldoffset = offsetoflow32(CPUARMState,
162
cp15.c14_timer[GTIMER_VIRT].ctl),
163
- .writefn = gt_virt_ctl_write, .raw_writefn = raw_write,
164
+ .readfn = gt_virt_redir_ctl_read, .raw_readfn = raw_read,
165
+ .writefn = gt_virt_redir_ctl_write, .raw_writefn = raw_write,
166
},
167
{ .name = "CNTV_CTL_EL0", .state = ARM_CP_STATE_AA64,
168
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 3, .opc2 = 1,
169
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
170
.accessfn = gt_vtimer_access,
171
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_VIRT].ctl),
172
.resetvalue = 0,
173
- .writefn = gt_virt_ctl_write, .raw_writefn = raw_write,
174
+ .readfn = gt_virt_redir_ctl_read, .raw_readfn = raw_read,
175
+ .writefn = gt_virt_redir_ctl_write, .raw_writefn = raw_write,
176
},
177
/* TimerValue views: a 32 bit downcounting view of the underlying state */
178
{ .name = "CNTP_TVAL", .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
179
.secure = ARM_CP_SECSTATE_NS,
180
.type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL0_RW,
181
.accessfn = gt_ptimer_access,
182
- .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write,
183
+ .readfn = gt_phys_redir_tval_read, .writefn = gt_phys_redir_tval_write,
184
},
185
{ .name = "CNTP_TVAL_S",
186
.cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
187
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
188
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 2, .opc2 = 0,
189
.type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL0_RW,
190
.accessfn = gt_ptimer_access, .resetfn = gt_phys_timer_reset,
191
- .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write,
192
+ .readfn = gt_phys_redir_tval_read, .writefn = gt_phys_redir_tval_write,
193
},
194
{ .name = "CNTV_TVAL", .cp = 15, .crn = 14, .crm = 3, .opc1 = 0, .opc2 = 0,
195
.type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL0_RW,
196
.accessfn = gt_vtimer_access,
197
- .readfn = gt_virt_tval_read, .writefn = gt_virt_tval_write,
198
+ .readfn = gt_virt_redir_tval_read, .writefn = gt_virt_redir_tval_write,
199
},
200
{ .name = "CNTV_TVAL_EL0", .state = ARM_CP_STATE_AA64,
201
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 3, .opc2 = 0,
202
.type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL0_RW,
203
.accessfn = gt_vtimer_access, .resetfn = gt_virt_timer_reset,
204
- .readfn = gt_virt_tval_read, .writefn = gt_virt_tval_write,
205
+ .readfn = gt_virt_redir_tval_read, .writefn = gt_virt_redir_tval_write,
206
},
207
/* The counter itself */
208
{ .name = "CNTPCT", .cp = 15, .crm = 14, .opc1 = 0,
209
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
210
.type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
211
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].cval),
212
.accessfn = gt_ptimer_access,
213
- .writefn = gt_phys_cval_write, .raw_writefn = raw_write,
214
+ .readfn = gt_phys_redir_cval_read, .raw_readfn = raw_read,
215
+ .writefn = gt_phys_redir_cval_write, .raw_writefn = raw_write,
216
},
217
{ .name = "CNTP_CVAL_S", .cp = 15, .crm = 14, .opc1 = 2,
218
.secure = ARM_CP_SECSTATE_S,
219
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
220
.type = ARM_CP_IO,
221
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].cval),
222
.resetvalue = 0, .accessfn = gt_ptimer_access,
223
- .writefn = gt_phys_cval_write, .raw_writefn = raw_write,
224
+ .readfn = gt_phys_redir_cval_read, .raw_readfn = raw_read,
225
+ .writefn = gt_phys_redir_cval_write, .raw_writefn = raw_write,
226
},
227
{ .name = "CNTV_CVAL", .cp = 15, .crm = 14, .opc1 = 3,
228
.access = PL0_RW,
229
.type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
230
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_VIRT].cval),
231
.accessfn = gt_vtimer_access,
232
- .writefn = gt_virt_cval_write, .raw_writefn = raw_write,
233
+ .readfn = gt_virt_redir_cval_read, .raw_readfn = raw_read,
234
+ .writefn = gt_virt_redir_cval_write, .raw_writefn = raw_write,
235
},
236
{ .name = "CNTV_CVAL_EL0", .state = ARM_CP_STATE_AA64,
237
.opc0 = 3, .opc1 = 3, .crn = 14, .crm = 3, .opc2 = 2,
238
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
239
.type = ARM_CP_IO,
240
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_VIRT].cval),
241
.resetvalue = 0, .accessfn = gt_vtimer_access,
242
- .writefn = gt_virt_cval_write, .raw_writefn = raw_write,
243
+ .readfn = gt_virt_redir_cval_read, .raw_readfn = raw_read,
244
+ .writefn = gt_virt_redir_cval_write, .raw_writefn = raw_write,
245
},
246
/* Secure timer -- this is actually restricted to only EL3
247
* and configurably Secure-EL1 via the accessfn.
248
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
249
REGINFO_SENTINEL
250
};
251
252
+static CPAccessResult e2h_access(CPUARMState *env, const ARMCPRegInfo *ri,
253
+ bool isread)
254
+{
255
+ if (!(arm_hcr_el2_eff(env) & HCR_E2H)) {
256
+ return CP_ACCESS_TRAP;
257
+ }
258
+ return CP_ACCESS_OK;
259
+}
260
+
261
#else
262
263
/* In user-mode most of the generic timer registers are inaccessible
264
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
265
.access = PL2_RW,
266
.fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].ctl),
267
.writefn = gt_hv_ctl_write, .raw_writefn = raw_write },
268
+ { .name = "CNTP_CTL_EL02", .state = ARM_CP_STATE_AA64,
269
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 2, .opc2 = 1,
270
+ .type = ARM_CP_IO | ARM_CP_ALIAS,
271
+ .access = PL2_RW, .accessfn = e2h_access,
272
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].ctl),
273
+ .writefn = gt_phys_ctl_write, .raw_writefn = raw_write },
274
+ { .name = "CNTV_CTL_EL02", .state = ARM_CP_STATE_AA64,
275
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 3, .opc2 = 1,
276
+ .type = ARM_CP_IO | ARM_CP_ALIAS,
277
+ .access = PL2_RW, .accessfn = e2h_access,
278
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_VIRT].ctl),
279
+ .writefn = gt_virt_ctl_write, .raw_writefn = raw_write },
280
+ { .name = "CNTP_TVAL_EL02", .state = ARM_CP_STATE_AA64,
281
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 2, .opc2 = 0,
282
+ .type = ARM_CP_NO_RAW | ARM_CP_IO | ARM_CP_ALIAS,
283
+ .access = PL2_RW, .accessfn = e2h_access,
284
+ .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write },
285
+ { .name = "CNTV_TVAL_EL02", .state = ARM_CP_STATE_AA64,
286
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 3, .opc2 = 0,
287
+ .type = ARM_CP_NO_RAW | ARM_CP_IO | ARM_CP_ALIAS,
288
+ .access = PL2_RW, .accessfn = e2h_access,
289
+ .readfn = gt_virt_tval_read, .writefn = gt_virt_tval_write },
290
+ { .name = "CNTP_CVAL_EL02", .state = ARM_CP_STATE_AA64,
291
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 2, .opc2 = 2,
292
+ .type = ARM_CP_IO | ARM_CP_ALIAS,
293
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].cval),
294
+ .access = PL2_RW, .accessfn = e2h_access,
295
+ .writefn = gt_phys_cval_write, .raw_writefn = raw_write },
296
+ { .name = "CNTV_CVAL_EL02", .state = ARM_CP_STATE_AA64,
297
+ .opc0 = 3, .opc1 = 5, .crn = 14, .crm = 3, .opc2 = 2,
298
+ .type = ARM_CP_IO | ARM_CP_ALIAS,
299
+ .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_VIRT].cval),
300
+ .access = PL2_RW, .accessfn = e2h_access,
301
+ .writefn = gt_virt_cval_write, .raw_writefn = raw_write },
302
#endif
303
REGINFO_SENTINEL
304
};
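
To make the DO_FMUL_IDX change above concrete: parameterising the macro
on MUL is what lets one template emit both the plain fmul helpers and
the new fmulx ones.  Roughly what one instantiation expands to -- the
declarations elided by the "@@" cut are reconstructed from context, so
treat those details as approximate.  Note that with ADD=nop the add
step vanishes, leaving a pure indexed multiply:

    /* DO_FMUL_IDX(gvec_fmulx_idx_s, nop, helper_vfp_mulxs, float32, H4) */
    void helper_gvec_fmulx_idx_s(void *vd, void *vn, void *vm,
                                 void *stat, uint32_t desc)
    {
        intptr_t i, j, oprsz = simd_oprsz(desc);
        intptr_t segment = 16 / sizeof(float32);  /* one 128-bit segment */
        intptr_t idx = simd_data(desc);
        float32 *d = vd, *n = vn, *m = vm;

        for (i = 0; i < oprsz / sizeof(float32); i += segment) {
            float32 mm = m[H4(i + idx)];
            for (j = 0; j < segment; j++) {
                /* nop(d[i+j], M, stat) is just (M) */
                d[i + j] = helper_vfp_mulxs(n[i + j], mm, stat);
            }
        }
        clear_tail(d, oprsz, simd_maxsz(desc));
    }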
--
2.20.1
--
2.34.1
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/helper-a64.h | 4 +
target/arm/tcg/translate.h | 5 +
target/arm/tcg/a64.decode | 27 +++
target/arm/tcg/translate-a64.c | 205 +++++++++++++++++----------------
target/arm/tcg/vec_helper.c | 4 +
5 files changed, 143 insertions(+), 102 deletions(-)

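As context for the TRANS() lines used throughout this conversion: each
pattern added to a64.decode makes the generated decoder call a
trans_NAME() function with the instruction fields unpacked into an
arg_NAME struct. The TRANS() glue macro in target/arm/tcg/translate.h
wires that entry point to a shared expander; a minimal sketch of the
idea (the in-tree definition may differ in detail):

    /* Define trans_NAME() forwarding to a common expander FUNC. */
    #define TRANS(NAME, FUNC, ...) \
        static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
        { return FUNC(__VA_ARGS__); }

So TRANS(FADD_s, do_fp3_scalar, a, &f_scalar_fadd) below defines
trans_FADD_s(), which dispatches on the decoded element size to the
right per-precision helper.
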
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)

+DEF_HELPER_FLAGS_5(gvec_fdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ static inline int shl_12(DisasContext *s, int x)
return x << 12;
}

+static inline int xor_2(DisasContext *s, int x)
+{
+ return x ^ 2;
+}
+
static inline int neon_3same_fp_size(DisasContext *s, int x)
{
/* Convert 0==fp32, 1==fp16 into a MO_* value */
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@

%rd 0:5
%esz_sd 22:1 !function=plus_2
+%esz_hsd 22:2 !function=xor_2
%hl 11:1 21:1
%hlm 11:1 20:2

@@ -XXX,XX +XXX,XX @@

@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
@rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
+@rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd

@rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm
@rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl
@@ -XXX,XX +XXX,XX @@ INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5

### Advanced SIMD scalar three same

+FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd
+FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
+FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
+FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd
+
FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd

### Advanced SIMD three same

+FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
+FADD_v 0.00 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
+
+FSUB_v 0.00 1110 110 ..... 00010 1 ..... ..... @qrrr_h
+FSUB_v 0.00 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
+
+FDIV_v 0.10 1110 010 ..... 00111 1 ..... ..... @qrrr_h
+FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
+
+FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h
+FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
+
FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd

### Advanced SIMD scalar x indexed element

+FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
+FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
+FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
+
FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d

### Advanced SIMD vector x indexed element

+FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
+FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
+FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
+
FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
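A note on the new %esz_hsd field above: the scalar FP "type" field in
insn bits [23:22] encodes 00=single, 01=double, 11=half, while the
expanders index their helper tables by MemOp size (MO_16=1, MO_32=2,
MO_64=3). xor_2 converts one encoding into the other, as this
standalone sketch shows (illustrative only; 10 is a reserved encoding
that never matches a pattern):

    #include <stdio.h>

    int main(void)
    {
        /* type 00 -> 2 (MO_32), 01 -> 3 (MO_64), 11 -> 1 (MO_16) */
        for (int type = 0; type < 4; type++) {
            printf("type %d -> esz %d\n", type, type ^ 2);
        }
        return 0;
    }
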
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
return true;
}

+static const FPScalar f_scalar_fadd = {
+ gen_helper_vfp_addh,
+ gen_helper_vfp_adds,
+ gen_helper_vfp_addd,
+};
+TRANS(FADD_s, do_fp3_scalar, a, &f_scalar_fadd)
+
+static const FPScalar f_scalar_fsub = {
+ gen_helper_vfp_subh,
+ gen_helper_vfp_subs,
+ gen_helper_vfp_subd,
+};
+TRANS(FSUB_s, do_fp3_scalar, a, &f_scalar_fsub)
+
+static const FPScalar f_scalar_fdiv = {
+ gen_helper_vfp_divh,
+ gen_helper_vfp_divs,
+ gen_helper_vfp_divd,
+};
+TRANS(FDIV_s, do_fp3_scalar, a, &f_scalar_fdiv)
+
+static const FPScalar f_scalar_fmul = {
+ gen_helper_vfp_mulh,
+ gen_helper_vfp_muls,
+ gen_helper_vfp_muld,
+};
+TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul)
+
static const FPScalar f_scalar_fmulx = {
gen_helper_advsimd_mulxh,
gen_helper_vfp_mulxs,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
return true;
}

+static gen_helper_gvec_3_ptr * const f_vector_fadd[3] = {
+ gen_helper_gvec_fadd_h,
+ gen_helper_gvec_fadd_s,
+ gen_helper_gvec_fadd_d,
+};
+TRANS(FADD_v, do_fp3_vector, a, f_vector_fadd)
+
+static gen_helper_gvec_3_ptr * const f_vector_fsub[3] = {
+ gen_helper_gvec_fsub_h,
+ gen_helper_gvec_fsub_s,
+ gen_helper_gvec_fsub_d,
+};
+TRANS(FSUB_v, do_fp3_vector, a, f_vector_fsub)
+
+static gen_helper_gvec_3_ptr * const f_vector_fdiv[3] = {
+ gen_helper_gvec_fdiv_h,
+ gen_helper_gvec_fdiv_s,
+ gen_helper_gvec_fdiv_d,
+};
+TRANS(FDIV_v, do_fp3_vector, a, f_vector_fdiv)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = {
+ gen_helper_gvec_fmul_h,
+ gen_helper_gvec_fmul_s,
+ gen_helper_gvec_fmul_d,
+};
+TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul)
+
static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
gen_helper_gvec_fmulx_h,
gen_helper_gvec_fmulx_s,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
return true;
}

+TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul)
TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)

static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
return true;
}

+static gen_helper_gvec_3_ptr * const f_vector_idx_fmul[3] = {
+ gen_helper_gvec_fmul_idx_h,
+ gen_helper_gvec_fmul_idx_s,
+ gen_helper_gvec_fmul_idx_d,
+};
+TRANS(FMUL_vi, do_fp3_vector_idx, a, f_vector_idx_fmul)
+
static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
gen_helper_gvec_fmulx_idx_h,
gen_helper_gvec_fmulx_idx_s,
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
tcg_op2 = read_fp_sreg(s, rm);

switch (opcode) {
- case 0x0: /* FMUL */
- gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x1: /* FDIV */
- gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x2: /* FADD */
- gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x3: /* FSUB */
- gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x4: /* FMAX */
gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_negs(tcg_res, tcg_res);
break;
+ default:
+ case 0x0: /* FMUL */
+ case 0x1: /* FDIV */
+ case 0x2: /* FADD */
+ case 0x3: /* FSUB */
+ g_assert_not_reached();
}

write_fp_sreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
tcg_op2 = read_fp_dreg(s, rm);

switch (opcode) {
- case 0x0: /* FMUL */
- gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x1: /* FDIV */
- gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x2: /* FADD */
- gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x3: /* FSUB */
- gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x4: /* FMAX */
gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_negd(tcg_res, tcg_res);
break;
+ default:
+ case 0x0: /* FMUL */
+ case 0x1: /* FDIV */
+ case 0x2: /* FADD */
+ case 0x3: /* FSUB */
+ g_assert_not_reached();
}

write_fp_dreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
tcg_op2 = read_fp_hreg(s, rm);

switch (opcode) {
- case 0x0: /* FMUL */
- gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x1: /* FDIV */
- gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x2: /* FADD */
- gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x3: /* FSUB */
- gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x4: /* FMAX */
gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
break;
default:
+ case 0x0: /* FMUL */
+ case 0x1: /* FDIV */
+ case 0x2: /* FADD */
+ case 0x3: /* FSUB */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
case 0x18: /* FMAXNM */
gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x1a: /* FADD */
- gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1c: /* FCMEQ */
gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
case 0x38: /* FMINNM */
gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x3a: /* FSUB */
- gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x3e: /* FMIN */
gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0x3f: /* FRSQRTS */
gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x5b: /* FMUL */
- gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x5c: /* FCMGE */
gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0x5d: /* FACGE */
gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x5f: /* FDIV */
- gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x7a: /* FABD */
gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_absd(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x3a: /* FSUB */
+ case 0x5b: /* FMUL */
+ case 0x5f: /* FDIV */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2,
tcg_res, fpst);
break;
- case 0x1a: /* FADD */
- gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1c: /* FCMEQ */
gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
case 0x38: /* FMINNM */
gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x3a: /* FSUB */
- gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x3e: /* FMIN */
gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0x3f: /* FRSQRTS */
gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x5b: /* FMUL */
- gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x5c: /* FCMGE */
gen_helper_neon_cge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0x5d: /* FACGE */
gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x5f: /* FDIV */
- gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x7a: /* FABD */
gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_abss(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x3a: /* FSUB */
+ case 0x5b: /* FMUL */
+ case 0x5f: /* FDIV */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
case 0x19: /* FMLA */
case 0x39: /* FMLS */
case 0x18: /* FMAXNM */
- case 0x1a: /* FADD */
case 0x1c: /* FCMEQ */
case 0x1e: /* FMAX */
case 0x38: /* FMINNM */
- case 0x3a: /* FSUB */
case 0x3e: /* FMIN */
- case 0x5b: /* FMUL */
case 0x5c: /* FCMGE */
- case 0x5f: /* FDIV */
case 0x7a: /* FABD */
case 0x7c: /* FCMGT */
if (!fp_access_check(s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
return;

default:
+ case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x3a: /* FSUB */
+ case 0x5b: /* FMUL */
+ case 0x5f: /* FDIV */
unallocated_encoding(s);
return;
}
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
switch (fpopcode) {
case 0x0: /* FMAXNM */
case 0x1: /* FMLA */
- case 0x2: /* FADD */
case 0x4: /* FCMEQ */
case 0x6: /* FMAX */
case 0x7: /* FRECPS */
case 0x8: /* FMINNM */
case 0x9: /* FMLS */
- case 0xa: /* FSUB */
case 0xe: /* FMIN */
case 0xf: /* FRSQRTS */
- case 0x13: /* FMUL */
case 0x14: /* FCMGE */
case 0x15: /* FACGE */
- case 0x17: /* FDIV */
case 0x1a: /* FABD */
case 0x1c: /* FCMGT */
case 0x1d: /* FACGT */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
pairwise = true;
break;
default:
+ case 0x2: /* FADD */
case 0x3: /* FMULX */
+ case 0xa: /* FSUB */
+ case 0x13: /* FMUL */
+ case 0x17: /* FDIV */
unallocated_encoding(s);
return;
}
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
fpst);
break;
- case 0x2: /* FADD */
- gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x4: /* FCMEQ */
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
fpst);
break;
- case 0xa: /* FSUB */
- gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0xe: /* FMIN */
gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0xf: /* FRSQRTS */
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x13: /* FMUL */
- gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x14: /* FCMGE */
gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
case 0x15: /* FACGE */
gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x17: /* FDIV */
- gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1a: /* FABD */
gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x2: /* FADD */
case 0x3: /* FMULX */
+ case 0xa: /* FSUB */
+ case 0x13: /* FMUL */
+ case 0x17: /* FDIV */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
break;
case 0x01: /* FMLA */
case 0x05: /* FMLS */
- case 0x09: /* FMUL */
is_fp = 1;
break;
case 0x1d: /* SQRDMLAH */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
/* is_fp, but we pass tcg_env not fp_status. */
break;
default:
+ case 0x09: /* FMUL */
case 0x19: /* FMULX */
unallocated_encoding(s);
return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
read_vec_element(s, tcg_res, rd, pass, MO_64);
gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
break;
- case 0x09: /* FMUL */
- gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
- break;
default:
+ case 0x09: /* FMUL */
case 0x19: /* FMULX */
g_assert_not_reached();
}
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
g_assert_not_reached();
}
break;
- case 0x09: /* FMUL */
- switch (size) {
- case 1:
- if (is_scalar) {
- gen_helper_advsimd_mulh(tcg_res, tcg_op,
- tcg_idx, fpst);
- } else {
- gen_helper_advsimd_mul2h(tcg_res, tcg_op,
- tcg_idx, fpst);
- }
- break;
- case 2:
- gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
- break;
- default:
- g_assert_not_reached();
- }
- break;
case 0x0c: /* SQDMULH */
if (size == 1) {
gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
}
break;
default:
+ case 0x09: /* FMUL */
case 0x19: /* FMULX */
g_assert_not_reached();
}
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)

#ifdef TARGET_AARCH64
+DO_3OP(gvec_fdiv_h, float16_div, float16)
+DO_3OP(gvec_fdiv_s, float32_div, float32)
+DO_3OP(gvec_fdiv_d, float64_div, float64)
+
DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16)
DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32)
DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64)
--
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.h | 4 +
target/arm/tcg/a64.decode | 17 ++++
target/arm/tcg/translate-a64.c | 168 +++++++++++++++++----------------
target/arm/tcg/vec_helper.c | 4 +
4 files changed, 113 insertions(+), 80 deletions(-)

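For context on the one-line vec_helper.c additions at the end of this
patch: DO_3OP stamps out a gvec helper that applies a three-operand
softfloat function elementwise across the vector. A sketch of its
shape, going by the existing helpers (details such as the tail
clearing may differ in-tree):

    #define DO_3OP(NAME, FUNC, TYPE)                                  \
    void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat,       \
                      uint32_t desc)                                  \
    {                                                                 \
        /* Apply FUNC to each element; desc encodes the length. */    \
        intptr_t i, oprsz = simd_oprsz(desc);                         \
        TYPE *d = vd, *n = vn, *m = vm;                               \
        for (i = 0; i < oprsz / sizeof(TYPE); i++) {                  \
            d[i] = FUNC(n[i], m[i], stat);                            \
        }                                                             \
        clear_tail(d, oprsz, simd_maxsz(desc));                       \
    }

so DO_3OP(gvec_fmax_d, float64_max, float64) is all that is needed to
provide the new double-precision vector helper.
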
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmaxnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fminnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)

DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd

+FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd
+FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd
+FMAXNM_s 0001 1110 ..1 ..... 0110 10 ..... ..... @rrr_hsd
+FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd
+
FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd

@@ -XXX,XX +XXX,XX @@ FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h
FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd

+FMAX_v 0.00 1110 010 ..... 00110 1 ..... ..... @qrrr_h
+FMAX_v 0.00 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd
+
+FMIN_v 0.00 1110 110 ..... 00110 1 ..... ..... @qrrr_h
+FMIN_v 0.00 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd
+
+FMAXNM_v 0.00 1110 010 ..... 00000 1 ..... ..... @qrrr_h
+FMAXNM_v 0.00 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
+
+FMINNM_v 0.00 1110 110 ..... 00000 1 ..... ..... @qrrr_h
+FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
+
FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fmul = {
};
TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul)

+static const FPScalar f_scalar_fmax = {
+ gen_helper_advsimd_maxh,
+ gen_helper_vfp_maxs,
+ gen_helper_vfp_maxd,
+};
+TRANS(FMAX_s, do_fp3_scalar, a, &f_scalar_fmax)
+
+static const FPScalar f_scalar_fmin = {
+ gen_helper_advsimd_minh,
+ gen_helper_vfp_mins,
+ gen_helper_vfp_mind,
+};
+TRANS(FMIN_s, do_fp3_scalar, a, &f_scalar_fmin)
+
+static const FPScalar f_scalar_fmaxnm = {
+ gen_helper_advsimd_maxnumh,
+ gen_helper_vfp_maxnums,
+ gen_helper_vfp_maxnumd,
+};
+TRANS(FMAXNM_s, do_fp3_scalar, a, &f_scalar_fmaxnm)
+
+static const FPScalar f_scalar_fminnm = {
+ gen_helper_advsimd_minnumh,
+ gen_helper_vfp_minnums,
+ gen_helper_vfp_minnumd,
+};
+TRANS(FMINNM_s, do_fp3_scalar, a, &f_scalar_fminnm)
+
static const FPScalar f_scalar_fmulx = {
gen_helper_advsimd_mulxh,
gen_helper_vfp_mulxs,
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = {
};
TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul)

+static gen_helper_gvec_3_ptr * const f_vector_fmax[3] = {
+ gen_helper_gvec_fmax_h,
+ gen_helper_gvec_fmax_s,
+ gen_helper_gvec_fmax_d,
+};
+TRANS(FMAX_v, do_fp3_vector, a, f_vector_fmax)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmin[3] = {
+ gen_helper_gvec_fmin_h,
+ gen_helper_gvec_fmin_s,
+ gen_helper_gvec_fmin_d,
+};
+TRANS(FMIN_v, do_fp3_vector, a, f_vector_fmin)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmaxnm[3] = {
+ gen_helper_gvec_fmaxnum_h,
+ gen_helper_gvec_fmaxnum_s,
+ gen_helper_gvec_fmaxnum_d,
+};
+TRANS(FMAXNM_v, do_fp3_vector, a, f_vector_fmaxnm)
+
+static gen_helper_gvec_3_ptr * const f_vector_fminnm[3] = {
+ gen_helper_gvec_fminnum_h,
+ gen_helper_gvec_fminnum_s,
+ gen_helper_gvec_fminnum_d,
+};
+TRANS(FMINNM_v, do_fp3_vector, a, f_vector_fminnm)
+
static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
gen_helper_gvec_fmulx_h,
gen_helper_gvec_fmulx_s,
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
tcg_op2 = read_fp_sreg(s, rm);

switch (opcode) {
- case 0x4: /* FMAX */
- gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x5: /* FMIN */
- gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x6: /* FMAXNM */
- gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x7: /* FMINNM */
- gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x8: /* FNMUL */
gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_negs(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
case 0x1: /* FDIV */
case 0x2: /* FADD */
case 0x3: /* FSUB */
+ case 0x4: /* FMAX */
+ case 0x5: /* FMIN */
+ case 0x6: /* FMAXNM */
+ case 0x7: /* FMINNM */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
tcg_op2 = read_fp_dreg(s, rm);

switch (opcode) {
- case 0x4: /* FMAX */
- gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x5: /* FMIN */
- gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x6: /* FMAXNM */
- gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x7: /* FMINNM */
- gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x8: /* FNMUL */
gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
gen_helper_vfp_negd(tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
case 0x1: /* FDIV */
case 0x2: /* FADD */
case 0x3: /* FSUB */
+ case 0x4: /* FMAX */
+ case 0x5: /* FMIN */
+ case 0x6: /* FMAXNM */
+ case 0x7: /* FMINNM */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
tcg_op2 = read_fp_hreg(s, rm);

switch (opcode) {
- case 0x4: /* FMAX */
- gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x5: /* FMIN */
- gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x6: /* FMAXNM */
- gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x7: /* FMINNM */
- gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x8: /* FNMUL */
gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
case 0x1: /* FDIV */
case 0x2: /* FADD */
case 0x3: /* FSUB */
+ case 0x4: /* FMAX */
+ case 0x5: /* FMIN */
+ case 0x6: /* FMAXNM */
+ case 0x7: /* FMINNM */
g_assert_not_reached();
}

@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2,
tcg_res, fpst);
break;
- case 0x18: /* FMAXNM */
- gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1c: /* FCMEQ */
gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x1e: /* FMAX */
- gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1f: /* FRECPS */
gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x38: /* FMINNM */
- gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x3e: /* FMIN */
- gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x3f: /* FRSQRTS */
gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x18: /* FMAXNM */
case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x1e: /* FMAX */
+ case 0x38: /* FMINNM */
case 0x3a: /* FSUB */
+ case 0x3e: /* FMIN */
case 0x5b: /* FMUL */
case 0x5f: /* FDIV */
g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
case 0x1c: /* FCMEQ */
gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x1e: /* FMAX */
- gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1f: /* FRECPS */
gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x18: /* FMAXNM */
- gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x38: /* FMINNM */
- gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
- case 0x3e: /* FMIN */
- gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x3f: /* FRSQRTS */
gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x18: /* FMAXNM */
case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x1e: /* FMAX */
+ case 0x38: /* FMINNM */
case 0x3a: /* FSUB */
+ case 0x3e: /* FMIN */
case 0x5b: /* FMUL */
case 0x5f: /* FDIV */
g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
case 0x7d: /* FACGT */
case 0x19: /* FMLA */
case 0x39: /* FMLS */
- case 0x18: /* FMAXNM */
case 0x1c: /* FCMEQ */
- case 0x1e: /* FMAX */
- case 0x38: /* FMINNM */
- case 0x3e: /* FMIN */
case 0x5c: /* FCMGE */
case 0x7a: /* FABD */
case 0x7c: /* FCMGT */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
return;

default:
+ case 0x18: /* FMAXNM */
case 0x1a: /* FADD */
case 0x1b: /* FMULX */
+ case 0x1e: /* FMAX */
+ case 0x38: /* FMINNM */
case 0x3a: /* FSUB */
+ case 0x3e: /* FMIN */
case 0x5b: /* FMUL */
case 0x5f: /* FDIV */
unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
int pass;

switch (fpopcode) {
- case 0x0: /* FMAXNM */
case 0x1: /* FMLA */
case 0x4: /* FCMEQ */
- case 0x6: /* FMAX */
case 0x7: /* FRECPS */
- case 0x8: /* FMINNM */
case 0x9: /* FMLS */
- case 0xe: /* FMIN */
case 0xf: /* FRSQRTS */
case 0x14: /* FCMGE */
case 0x15: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
pairwise = true;
break;
default:
+ case 0x0: /* FMAXNM */
case 0x2: /* FADD */
case 0x3: /* FMULX */
+ case 0x6: /* FMAX */
+ case 0x8: /* FMINNM */
case 0xa: /* FSUB */
+ case 0xe: /* FMIN */
case 0x13: /* FMUL */
case 0x17: /* FDIV */
unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);

switch (fpopcode) {
- case 0x0: /* FMAXNM */
- gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x1: /* FMLA */
read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
case 0x4: /* FCMEQ */
gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x6: /* FMAX */
- gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x7: /* FRECPS */
gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
- case 0x8: /* FMINNM */
- gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0x9: /* FMLS */
/* As usual for ARM, separate negation for fused multiply-add */
tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
fpst);
break;
- case 0xe: /* FMIN */
- gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
- break;
case 0xf: /* FRSQRTS */
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
break;
default:
+ case 0x0: /* FMAXNM */
case 0x2: /* FADD */
case 0x3: /* FMULX */
+ case 0x6: /* FMAX */
+ case 0x8: /* FMINNM */
case 0xa: /* FSUB */
+ case 0xe: /* FMIN */
case 0x13: /* FMUL */
case 0x17: /* FDIV */
g_assert_not_reached();
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_facgt_s, float32_acgt, float32)

DO_3OP(gvec_fmax_h, float16_max, float16)
DO_3OP(gvec_fmax_s, float32_max, float32)
+DO_3OP(gvec_fmax_d, float64_max, float64)

DO_3OP(gvec_fmin_h, float16_min, float16)
DO_3OP(gvec_fmin_s, float32_min, float32)
+DO_3OP(gvec_fmin_d, float64_min, float64)

DO_3OP(gvec_fmaxnum_h, float16_maxnum, float16)
DO_3OP(gvec_fmaxnum_s, float32_maxnum, float32)
+DO_3OP(gvec_fmaxnum_d, float64_maxnum, float64)

DO_3OP(gvec_fminnum_h, float16_minnum, float16)
DO_3OP(gvec_fminnum_s, float32_minnum, float32)
+DO_3OP(gvec_fminnum_d, float64_minnum, float64)

DO_3OP(gvec_recps_nf_h, float16_recps_nf, float16)
DO_3OP(gvec_recps_nf_s, float32_recps_nf, float32)
--
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Load and zero-extend float16 into a TCGv_i32 before
all scalar operations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240524232121.284515-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-vfp.c | 39 +++++++++++++++++++---------------
1 file changed, 22 insertions(+), 17 deletions(-)

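The key detail in the new helper below: a float16 value lives in the
low 16 bits of its 32-bit S register slot, so a zero-extending 16-bit
load needs offset +0 on a little-endian host but +2 on a big-endian
one, hence the HOST_BIG_ENDIAN * 2 term. Before/after at a typical
call site (names as in the patch):

    /* before: 32-bit load plus an explicit mask */
    vfp_load_reg32(tmp, a->vn);
    tcg_gen_andi_i32(tmp, tmp, 0xffff);

    /* after: one zero-extending 16-bit load */
    vfp_load_reg16(tmp, a->vn);

At the arithmetic call sites the old code skipped the mask entirely,
so this also makes the zero-extension uniform and explicit.
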
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-vfp.c
+++ b/target/arm/tcg/translate-vfp.c
@@ -XXX,XX +XXX,XX @@ static inline void vfp_store_reg32(TCGv_i32 var, int reg)
tcg_gen_st_i32(var, tcg_env, vfp_reg_offset(false, reg));
}

+static inline void vfp_load_reg16(TCGv_i32 var, int reg)
+{
+ tcg_gen_ld16u_i32(var, tcg_env,
+ vfp_reg_offset(false, reg) + HOST_BIG_ENDIAN * 2);
+}
+
/*
* The imm8 encodes the sign bit, enough bits to represent an exponent in
* the range 01....1xx to 10....0xx, and the most significant 4 bits of
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
if (a->l) {
/* VFP to general purpose register */
tmp = tcg_temp_new_i32();
- vfp_load_reg32(tmp, a->vn);
- tcg_gen_andi_i32(tmp, tmp, 0xffff);
+ vfp_load_reg16(tmp, a->vn);
store_reg(s, a->rt, tmp);
} else {
/* general purpose register to VFP */
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn,
fd = tcg_temp_new_i32();
fpst = fpstatus_ptr(FPST_FPCR_F16);

- vfp_load_reg32(f0, vn);
- vfp_load_reg32(f1, vm);
+ vfp_load_reg16(f0, vn);
+ vfp_load_reg16(f1, vm);

if (reads_vd) {
- vfp_load_reg32(fd, vd);
+ vfp_load_reg16(fd, vd);
}
fn(fd, f0, f1, fpst);
vfp_store_reg32(fd, vd);
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
}

f0 = tcg_temp_new_i32();
- vfp_load_reg32(f0, vm);
+ vfp_load_reg16(f0, vm);
fn(f0, f0);
vfp_store_reg32(f0, vd);

@@ -XXX,XX +XXX,XX @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
vm = tcg_temp_new_i32();
vd = tcg_temp_new_i32();

- vfp_load_reg32(vn, a->vn);
- vfp_load_reg32(vm, a->vm);
+ vfp_load_reg16(vn, a->vn);
+ vfp_load_reg16(vm, a->vm);
if (neg_n) {
/* VFNMS, VFMS */
gen_helper_vfp_negh(vn, vn);
}
- vfp_load_reg32(vd, a->vd);
+ vfp_load_reg16(vd, a->vd);
if (neg_d) {
/* VFNMA, VFNMS */
gen_helper_vfp_negh(vd, vd);
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a)
vd = tcg_temp_new_i32();
vm = tcg_temp_new_i32();

- vfp_load_reg32(vd, a->vd);
+ vfp_load_reg16(vd, a->vd);
if (a->z) {
tcg_gen_movi_i32(vm, 0);
} else {
- vfp_load_reg32(vm, a->vm);
+ vfp_load_reg16(vm, a->vm);
}

if (a->e) {
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a)
}

tmp = tcg_temp_new_i32();
- vfp_load_reg32(tmp, a->vm);
+ vfp_load_reg16(tmp, a->vm);
fpst = fpstatus_ptr(FPST_FPCR_F16);
gen_helper_rinth(tmp, tmp, fpst);
vfp_store_reg32(tmp, a->vd);
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a)
}

tmp = tcg_temp_new_i32();
- vfp_load_reg32(tmp, a->vm);
+ vfp_load_reg16(tmp, a->vm);
fpst = fpstatus_ptr(FPST_FPCR_F16);
tcg_rmode = gen_set_rmode(FPROUNDING_ZERO, fpst);
gen_helper_rinth(tmp, tmp, fpst);
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a)
}

tmp = tcg_temp_new_i32();
- vfp_load_reg32(tmp, a->vm);
+ vfp_load_reg16(tmp, a->vm);
fpst = fpstatus_ptr(FPST_FPCR_F16);
gen_helper_rinth_exact(tmp, tmp, fpst);
vfp_store_reg32(tmp, a->vd);
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a)

fpst = fpstatus_ptr(FPST_FPCR_F16);
vm = tcg_temp_new_i32();
- vfp_load_reg32(vm, a->vm);
+ vfp_load_reg16(vm, a->vm);

if (a->s) {
if (a->rz) {
@@ -XXX,XX +XXX,XX @@ static bool trans_VINS(DisasContext *s, arg_VINS *a)
/* Insert low half of Vm into high half of Vd */
rm = tcg_temp_new_i32();
rd = tcg_temp_new_i32();
- vfp_load_reg32(rm, a->vm);
- vfp_load_reg32(rd, a->vd);
+ vfp_load_reg16(rm, a->vm);
+ vfp_load_reg16(rd, a->vd);
tcg_gen_deposit_i32(rd, rd, rm, 16, 16);
vfp_store_reg32(rd, a->vd);
return true;
--
2.34.1

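A short standalone illustration of the "+ HOST_BIG_ENDIAN * 2" offset used
in vfp_load_reg16() above: the low-order 16 bits of a 32-bit register slot
live at byte offset 0 on a little-endian host but at byte offset 2 on a
big-endian one. This is plain C rather than TCG, and the runtime endian
probe is only a stand-in for QEMU's compile-time HOST_BIG_ENDIAN:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static int host_big_endian(void)
{
    const uint32_t probe = 1;
    uint8_t b;
    memcpy(&b, &probe, 1);
    return b == 0;               /* first byte is 0 on big-endian hosts */
}

int main(void)
{
    uint32_t slot = 0xdeadbeef;  /* a 32-bit VFP register slot */
    uint16_t low;

    /* Exactly the "+ HOST_BIG_ENDIAN * 2" adjustment in vfp_load_reg16(). */
    memcpy(&low, (uint8_t *)&slot + host_big_endian() * 2, sizeof(low));
    printf("low half = 0x%04x\n", low);  /* 0xbeef on either kind of host */
    return 0;
}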
From: Richard Henderson <richard.henderson@linaro.org>

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    | 7 -------
 target/arm/helper.c | 6 +++++-
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define HCR_ATA (1ULL << 56)
 #define HCR_DCT (1ULL << 57)
 
-/*
- * When we actually implement ARMv8.1-VHE we should add HCR_E2H to
- * HCR_MASK and then clear it again if the feature bit is not set in
- * hcr_write().
- */
-#define HCR_MASK ((1ULL << 34) - 1)
-
 #define SCR_NS (1U << 0)
 #define SCR_IRQ (1U << 1)
 #define SCR_FIQ (1U << 2)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
 static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = env_archcpu(env);
-    uint64_t valid_mask = HCR_MASK;
+    /* Begin with bits defined in base ARMv8.0. */
+    uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
 
     if (arm_feature(env, ARM_FEATURE_EL3)) {
         valid_mask &= ~HCR_HCD;
@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
          */
         valid_mask &= ~HCR_TSC;
     }
+    if (cpu_isar_feature(aa64_vh, cpu)) {
+        valid_mask |= HCR_E2H;
+    }
     if (cpu_isar_feature(aa64_lor, cpu)) {
         valid_mask |= HCR_TLOR;
     }
--
2.20.1
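For reference, the two mask spellings in the hunk above are equivalent; a
minimal standalone check, with MAKE_64BIT_MASK reproduced from
include/qemu/bitops.h (assuming the usual definition there):

#include <stdint.h>
#include <stdio.h>
#include <assert.h>

#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

int main(void)
{
    uint64_t old_mask = (1ULL << 34) - 1;        /* the removed HCR_MASK */
    uint64_t new_mask = MAKE_64BIT_MASK(0, 34);  /* ARMv8.0 base bits */

    assert(old_mask == new_mask);                /* same value, clearer intent */
    printf("0x%016llx\n", (unsigned long long)new_mask);
    return 0;
}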
From: Richard Henderson <richard.henderson@linaro.org>

When TGE+E2H are both set, CPACR_EL1 is ignored.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-34-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 53 ++++++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
 int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
-    if (el <= 1) {
+    uint64_t hcr_el2 = arm_hcr_el2_eff(env);
+
+    if (el <= 1 && (hcr_el2 & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
         bool disabled = false;
 
         /* The CPACR.ZEN controls traps to EL1:
@@ -XXX,XX +XXX,XX @@ int sve_exception_el(CPUARMState *env, int el)
         }
         if (disabled) {
             /* route_to_el2 */
-            return (arm_feature(env, ARM_FEATURE_EL2)
-                    && (arm_hcr_el2_eff(env) & HCR_TGE) ? 2 : 1);
+            return hcr_el2 & HCR_TGE ? 2 : 1;
         }
 
         /* Check CPACR.FPEN.  */
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
-    int fpen;
-
     /* CPACR and the CPTR registers don't exist before v6, so FP is
      * always accessible
      */
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
      * 0, 2 : trap EL0 and EL1/PL1 accesses
      * 1 : trap only EL0 accesses
      * 3 : trap no accesses
+     * This register is ignored if E2H+TGE are both set.
      */
-    fpen = extract32(env->cp15.cpacr_el1, 20, 2);
-    switch (fpen) {
-    case 0:
-    case 2:
-        if (cur_el == 0 || cur_el == 1) {
-            /* Trap to PL1, which might be EL1 or EL3 */
-            if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+    if ((arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
+        int fpen = extract32(env->cp15.cpacr_el1, 20, 2);
+
+        switch (fpen) {
+        case 0:
+        case 2:
+            if (cur_el == 0 || cur_el == 1) {
+                /* Trap to PL1, which might be EL1 or EL3 */
+                if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+                    return 3;
+                }
+                return 1;
+            }
+            if (cur_el == 3 && !is_a64(env)) {
+                /* Secure PL1 running at EL3 */
                 return 3;
             }
-            return 1;
+            break;
+        case 1:
+            if (cur_el == 0) {
+                return 1;
+            }
+            break;
+        case 3:
+            break;
         }
-        if (cur_el == 3 && !is_a64(env)) {
-            /* Secure PL1 running at EL3 */
-            return 3;
-        }
-        break;
-    case 1:
-        if (cur_el == 0) {
-            return 1;
-        }
-        break;
-    case 3:
-        break;
     }
 
     /*
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h            |  6 ----
 target/arm/tcg/translate.h     | 30 +++++++++++++++++++
 target/arm/tcg/translate-a64.c | 44 +++++++++++++--------------
 target/arm/tcg/translate-vfp.c | 54 +++++++++++++++++-----------------
 target/arm/vfp_helper.c        | 30 -------------------
 5 files changed, 79 insertions(+), 85 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr)
 DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr)
 DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr)
 DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr)
-DEF_HELPER_1(vfp_negh, f16, f16)
-DEF_HELPER_1(vfp_negs, f32, f32)
-DEF_HELPER_1(vfp_negd, f64, f64)
-DEF_HELPER_1(vfp_absh, f16, f16)
-DEF_HELPER_1(vfp_abss, f32, f32)
-DEF_HELPER_1(vfp_absd, f64, f64)
 DEF_HELPER_2(vfp_sqrth, f16, f16, env)
 DEF_HELPER_2(vfp_sqrts, f32, f32, env)
 DEF_HELPER_2(vfp_sqrtd, f64, f64, env)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
  */
 uint64_t vfp_expand_imm(int size, uint8_t imm8);
 
+static inline void gen_vfp_absh(TCGv_i32 d, TCGv_i32 s)
+{
+    tcg_gen_andi_i32(d, s, INT16_MAX);
+}
+
+static inline void gen_vfp_abss(TCGv_i32 d, TCGv_i32 s)
+{
+    tcg_gen_andi_i32(d, s, INT32_MAX);
+}
+
+static inline void gen_vfp_absd(TCGv_i64 d, TCGv_i64 s)
+{
+    tcg_gen_andi_i64(d, s, INT64_MAX);
+}
+
+static inline void gen_vfp_negh(TCGv_i32 d, TCGv_i32 s)
+{
+    tcg_gen_xori_i32(d, s, 1u << 15);
+}
+
+static inline void gen_vfp_negs(TCGv_i32 d, TCGv_i32 s)
+{
+    tcg_gen_xori_i32(d, s, 1u << 31);
+}
+
+static inline void gen_vfp_negd(TCGv_i64 d, TCGv_i64 s)
+{
+    tcg_gen_xori_i64(d, s, 1ull << 63);
+}
+
 /* Vector operations shared between ARM and AArch64.  */
 void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    uint32_t opr_sz, uint32_t max_sz);
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
         tcg_gen_mov_i32(tcg_res, tcg_op);
         break;
     case 0x1: /* FABS */
-        tcg_gen_andi_i32(tcg_res, tcg_op, 0x7fff);
+        gen_vfp_absh(tcg_res, tcg_op);
         break;
     case 0x2: /* FNEG */
-        tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
+        gen_vfp_negh(tcg_res, tcg_op);
         break;
     case 0x3: /* FSQRT */
         fpst = fpstatus_ptr(FPST_FPCR_F16);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_single(DisasContext *s, int opcode, int rd, int rn)
         tcg_gen_mov_i32(tcg_res, tcg_op);
         goto done;
     case 0x1: /* FABS */
-        gen_helper_vfp_abss(tcg_res, tcg_op);
+        gen_vfp_abss(tcg_res, tcg_op);
         goto done;
     case 0x2: /* FNEG */
-        gen_helper_vfp_negs(tcg_res, tcg_op);
+        gen_vfp_negs(tcg_res, tcg_op);
         goto done;
     case 0x3: /* FSQRT */
         gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_double(DisasContext *s, int opcode, int rd, int rn)
 
     switch (opcode) {
     case 0x1: /* FABS */
-        gen_helper_vfp_absd(tcg_res, tcg_op);
+        gen_vfp_absd(tcg_res, tcg_op);
         goto done;
     case 0x2: /* FNEG */
-        gen_helper_vfp_negd(tcg_res, tcg_op);
+        gen_vfp_negd(tcg_res, tcg_op);
         goto done;
     case 0x3: /* FSQRT */
         gen_helper_vfp_sqrtd(tcg_res, tcg_op, tcg_env);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_single(DisasContext *s, int opcode,
     switch (opcode) {
     case 0x8: /* FNMUL */
         gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_helper_vfp_negs(tcg_res, tcg_res);
+        gen_vfp_negs(tcg_res, tcg_res);
         break;
     default:
     case 0x0: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
     switch (opcode) {
     case 0x8: /* FNMUL */
         gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_helper_vfp_negd(tcg_res, tcg_res);
+        gen_vfp_negd(tcg_res, tcg_res);
         break;
     default:
     case 0x0: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_half(DisasContext *s, int opcode,
     switch (opcode) {
     case 0x8: /* FNMUL */
         gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
-        tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
+        gen_vfp_negh(tcg_res, tcg_res);
         break;
     default:
     case 0x0: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1,
      * flipped if it is a negated-input.
      */
     if (o1 == true) {
-        gen_helper_vfp_negs(tcg_op3, tcg_op3);
+        gen_vfp_negs(tcg_op3, tcg_op3);
     }
 
     if (o0 != o1) {
-        gen_helper_vfp_negs(tcg_op1, tcg_op1);
+        gen_vfp_negs(tcg_op1, tcg_op1);
     }
 
     gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
      * flipped if it is a negated-input.
      */
     if (o1 == true) {
-        gen_helper_vfp_negd(tcg_op3, tcg_op3);
+        gen_vfp_negd(tcg_op3, tcg_op3);
     }
 
     if (o0 != o1) {
-        gen_helper_vfp_negd(tcg_op1, tcg_op1);
+        gen_vfp_negd(tcg_op1, tcg_op1);
     }
 
     gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         switch (fpopcode) {
         case 0x39: /* FMLS */
             /* As usual for ARM, separate negation for fused multiply-add */
-            gen_helper_vfp_negd(tcg_op1, tcg_op1);
+            gen_vfp_negd(tcg_op1, tcg_op1);
             /* fall through */
         case 0x19: /* FMLA */
             read_vec_element(s, tcg_res, rd, pass, MO_64);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             break;
         case 0x7a: /* FABD */
             gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
-            gen_helper_vfp_absd(tcg_res, tcg_res);
+            gen_vfp_absd(tcg_res, tcg_res);
             break;
         case 0x7c: /* FCMGT */
             gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         switch (fpopcode) {
         case 0x39: /* FMLS */
             /* As usual for ARM, separate negation for fused multiply-add */
-            gen_helper_vfp_negs(tcg_op1, tcg_op1);
+            gen_vfp_negs(tcg_op1, tcg_op1);
             /* fall through */
         case 0x19: /* FMLA */
             read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             break;
         case 0x7a: /* FABD */
             gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
-            gen_helper_vfp_abss(tcg_res, tcg_res);
+            gen_vfp_abss(tcg_res, tcg_res);
             break;
         case 0x7c: /* FCMGT */
             gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u,
         }
         break;
     case 0x2f: /* FABS */
-        gen_helper_vfp_absd(tcg_rd, tcg_rn);
+        gen_vfp_absd(tcg_rd, tcg_rn);
         break;
     case 0x6f: /* FNEG */
-        gen_helper_vfp_negd(tcg_rd, tcg_rn);
+        gen_vfp_negd(tcg_rd, tcg_rn);
         break;
     case 0x7f: /* FSQRT */
         gen_helper_vfp_sqrtd(tcg_rd, tcg_rn, tcg_env);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
         }
         break;
     case 0x2f: /* FABS */
-        gen_helper_vfp_abss(tcg_res, tcg_op);
+        gen_vfp_abss(tcg_res, tcg_op);
         break;
     case 0x6f: /* FNEG */
-        gen_helper_vfp_negs(tcg_res, tcg_op);
+        gen_vfp_negs(tcg_res, tcg_op);
         break;
     case 0x7f: /* FSQRT */
         gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         switch (16 * u + opcode) {
         case 0x05: /* FMLS */
             /* As usual for ARM, separate negation for fused multiply-add */
-            gen_helper_vfp_negd(tcg_op, tcg_op);
+            gen_vfp_negd(tcg_op, tcg_op);
             /* fall through */
         case 0x01: /* FMLA */
             read_vec_element(s, tcg_res, rd, pass, MO_64);
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-vfp.c
+++ b/target/arm/tcg/translate-vfp.c
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_mulh(tmp, vn, vm, fpst);
-    gen_helper_vfp_negh(tmp, tmp);
+    gen_vfp_negh(tmp, tmp);
     gen_helper_vfp_addh(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_muls(tmp, vn, vm, fpst);
-    gen_helper_vfp_negs(tmp, tmp);
+    gen_vfp_negs(tmp, tmp);
     gen_helper_vfp_adds(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
     TCGv_i64 tmp = tcg_temp_new_i64();
 
     gen_helper_vfp_muld(tmp, vn, vm, fpst);
-    gen_helper_vfp_negd(tmp, tmp);
+    gen_vfp_negd(tmp, tmp);
     gen_helper_vfp_addd(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_mulh(tmp, vn, vm, fpst);
-    gen_helper_vfp_negh(vd, vd);
+    gen_vfp_negh(vd, vd);
     gen_helper_vfp_addh(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_muls(tmp, vn, vm, fpst);
-    gen_helper_vfp_negs(vd, vd);
+    gen_vfp_negs(vd, vd);
     gen_helper_vfp_adds(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
     TCGv_i64 tmp = tcg_temp_new_i64();
 
     gen_helper_vfp_muld(tmp, vn, vm, fpst);
-    gen_helper_vfp_negd(vd, vd);
+    gen_vfp_negd(vd, vd);
     gen_helper_vfp_addd(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_mulh(tmp, vn, vm, fpst);
-    gen_helper_vfp_negh(tmp, tmp);
-    gen_helper_vfp_negh(vd, vd);
+    gen_vfp_negh(tmp, tmp);
+    gen_vfp_negh(vd, vd);
     gen_helper_vfp_addh(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
     TCGv_i32 tmp = tcg_temp_new_i32();
 
     gen_helper_vfp_muls(tmp, vn, vm, fpst);
-    gen_helper_vfp_negs(tmp, tmp);
-    gen_helper_vfp_negs(vd, vd);
+    gen_vfp_negs(tmp, tmp);
+    gen_vfp_negs(vd, vd);
     gen_helper_vfp_adds(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMLA_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
     TCGv_i64 tmp = tcg_temp_new_i64();
 
     gen_helper_vfp_muld(tmp, vn, vm, fpst);
-    gen_helper_vfp_negd(tmp, tmp);
-    gen_helper_vfp_negd(vd, vd);
+    gen_vfp_negd(tmp, tmp);
+    gen_vfp_negd(vd, vd);
     gen_helper_vfp_addd(vd, vd, tmp, fpst);
 }
 
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
 {
     /* VNMUL: -(fn * fm) */
     gen_helper_vfp_mulh(vd, vn, vm, fpst);
-    gen_helper_vfp_negh(vd, vd);
+    gen_vfp_negh(vd, vd);
 }
 
 static bool trans_VNMUL_hp(DisasContext *s, arg_VNMUL_sp *a)
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst)
 {
     /* VNMUL: -(fn * fm) */
     gen_helper_vfp_muls(vd, vn, vm, fpst);
-    gen_helper_vfp_negs(vd, vd);
+    gen_vfp_negs(vd, vd);
 }
 
 static bool trans_VNMUL_sp(DisasContext *s, arg_VNMUL_sp *a)
@@ -XXX,XX +XXX,XX @@ static void gen_VNMUL_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst)
 {
     /* VNMUL: -(fn * fm) */
     gen_helper_vfp_muld(vd, vn, vm, fpst);
-    gen_helper_vfp_negd(vd, vd);
+    gen_vfp_negd(vd, vd);
 }
 
 static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_dp *a)
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
     vfp_load_reg16(vm, a->vm);
     if (neg_n) {
         /* VFNMS, VFMS */
-        gen_helper_vfp_negh(vn, vn);
+        gen_vfp_negh(vn, vn);
     }
     vfp_load_reg16(vd, a->vd);
     if (neg_d) {
         /* VFNMA, VFNMS */
-        gen_helper_vfp_negh(vd, vd);
+        gen_vfp_negh(vd, vd);
     }
     fpst = fpstatus_ptr(FPST_FPCR_F16);
     gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst);
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
     vfp_load_reg32(vm, a->vm);
     if (neg_n) {
         /* VFNMS, VFMS */
-        gen_helper_vfp_negs(vn, vn);
+        gen_vfp_negs(vn, vn);
     }
     vfp_load_reg32(vd, a->vd);
     if (neg_d) {
         /* VFNMA, VFNMS */
-        gen_helper_vfp_negs(vd, vd);
+        gen_vfp_negs(vd, vd);
     }
     fpst = fpstatus_ptr(FPST_FPCR);
     gen_helper_vfp_muladds(vd, vn, vm, vd, fpst);
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
     vfp_load_reg64(vm, a->vm);
     if (neg_n) {
         /* VFNMS, VFMS */
-        gen_helper_vfp_negd(vn, vn);
+        gen_vfp_negd(vn, vn);
     }
     vfp_load_reg64(vd, a->vd);
     if (neg_d) {
         /* VFNMA, VFNMS */
-        gen_helper_vfp_negd(vd, vd);
+        gen_vfp_negd(vd, vd);
     }
     fpst = fpstatus_ptr(FPST_FPCR);
     gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst);
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
 DO_VFP_VMOV(VMOV_reg, sp, tcg_gen_mov_i32)
 DO_VFP_VMOV(VMOV_reg, dp, tcg_gen_mov_i64)
 
-DO_VFP_2OP(VABS, hp, gen_helper_vfp_absh, aa32_fp16_arith)
-DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss, aa32_fpsp_v2)
-DO_VFP_2OP(VABS, dp, gen_helper_vfp_absd, aa32_fpdp_v2)
+DO_VFP_2OP(VABS, hp, gen_vfp_absh, aa32_fp16_arith)
+DO_VFP_2OP(VABS, sp, gen_vfp_abss, aa32_fpsp_v2)
+DO_VFP_2OP(VABS, dp, gen_vfp_absd, aa32_fpdp_v2)
 
-DO_VFP_2OP(VNEG, hp, gen_helper_vfp_negh, aa32_fp16_arith)
-DO_VFP_2OP(VNEG, sp, gen_helper_vfp_negs, aa32_fpsp_v2)
-DO_VFP_2OP(VNEG, dp, gen_helper_vfp_negd, aa32_fpdp_v2)
+DO_VFP_2OP(VNEG, hp, gen_vfp_negh, aa32_fp16_arith)
+DO_VFP_2OP(VNEG, sp, gen_vfp_negs, aa32_fpsp_v2)
+DO_VFP_2OP(VNEG, dp, gen_vfp_negd, aa32_fpdp_v2)
 
 static void gen_VSQRT_hp(TCGv_i32 vd, TCGv_i32 vm)
 {
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ VFP_BINOP(minnum)
 VFP_BINOP(maxnum)
 #undef VFP_BINOP
 
-dh_ctype_f16 VFP_HELPER(neg, h)(dh_ctype_f16 a)
-{
-    return float16_chs(a);
-}
-
-float32 VFP_HELPER(neg, s)(float32 a)
-{
-    return float32_chs(a);
-}
-
-float64 VFP_HELPER(neg, d)(float64 a)
-{
-    return float64_chs(a);
-}
-
-dh_ctype_f16 VFP_HELPER(abs, h)(dh_ctype_f16 a)
-{
-    return float16_abs(a);
-}
-
-float32 VFP_HELPER(abs, s)(float32 a)
-{
-    return float32_abs(a);
-}
-
-float64 VFP_HELPER(abs, d)(float64 a)
-{
-    return float64_abs(a);
-}
-
 dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, CPUARMState *env)
 {
     return float16_sqrt(a, &env->vfp.fp_status_f16);
--
2.34.1
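Why the helper calls above can become inline integer ops: IEEE 754
negation and absolute value are pure sign-bit manipulations -- no
rounding, no exception flags, and no special-casing of NaNs. A host-side
sketch of the same bit operations the new gen_vfp_negs()/gen_vfp_abss()
emit, here on single precision:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static float fneg_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    u ^= 1u << 31;             /* FNEG: flip the sign bit */
    memcpy(&f, &u, sizeof(f));
    return f;
}

static float fabs_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    u &= INT32_MAX;            /* FABS: clear the sign bit */
    memcpy(&f, &u, sizeof(f));
    return f;
}

int main(void)
{
    printf("%f %f\n", fneg_bits(-1.5f), fabs_bits(-2.25f));  /* 1.5 2.25 */
    return 0;
}

The float16 and float64 variants differ only in the bit position
(1u << 15 and 1ull << 63), matching the inline helpers added to
translate.h above.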
From: Richard Henderson <richard.henderson@linaro.org>

Several of the EL1/0 registers are redirected to the EL2 version when in
EL2 and HCR_EL2.E2H is set.  Many of these registers have side effects.
Link together the two ARMCPRegInfo structures after they have been
properly instantiated.  Install common dispatch routines to all of the
relevant registers.

The same set of registers that are redirected also have additional
EL12/EL02 aliases created to access the original register that was
redirected.

Omit the generic timer registers from redirection here, because we'll
need multiple kinds of redirection from both EL0 and EL2.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-29-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |  13 ++++
 target/arm/helper.c | 162 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 175 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
      * fieldoffset is 0 then no reset will be done.
      */
     CPResetFn *resetfn;
+
+    /*
+     * "Original" writefn and readfn.
+     * For ARMv8.1-VHE register aliases, we overwrite the read/write
+     * accessor functions of various EL1/EL0 to perform the runtime
+     * check for which sysreg should actually be modified, and then
+     * forwards the operation.  Before overwriting the accessors,
+     * the original function is copied here, so that accesses that
+     * really do go to the EL1/EL0 version proceed normally.
+     * (The corresponding EL2 register is linked via opaque.)
+     */
+    CPReadFn *orig_readfn;
+    CPWriteFn *orig_writefn;
 };
 
 /* Macros which are lvalues for the field in CPUARMState for the
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
+#ifndef CONFIG_USER_ONLY
+/* Test if system register redirection is to occur in the current state. */
+static bool redirect_for_e2h(CPUARMState *env)
+{
+    return arm_current_el(env) == 2 && (arm_hcr_el2_eff(env) & HCR_E2H);
+}
+
+static uint64_t el2_e2h_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    CPReadFn *readfn;
+
+    if (redirect_for_e2h(env)) {
+        /* Switch to the saved EL2 version of the register. */
+        ri = ri->opaque;
+        readfn = ri->readfn;
+    } else {
+        readfn = ri->orig_readfn;
+    }
+    if (readfn == NULL) {
+        readfn = raw_read;
+    }
+    return readfn(env, ri);
+}
+
+static void el2_e2h_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                          uint64_t value)
+{
+    CPWriteFn *writefn;
+
+    if (redirect_for_e2h(env)) {
+        /* Switch to the saved EL2 version of the register. */
+        ri = ri->opaque;
+        writefn = ri->writefn;
+    } else {
+        writefn = ri->orig_writefn;
+    }
+    if (writefn == NULL) {
+        writefn = raw_write;
+    }
+    writefn(env, ri, value);
+}
+
+static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
+{
+    struct E2HAlias {
+        uint32_t src_key, dst_key, new_key;
+        const char *src_name, *dst_name, *new_name;
+        bool (*feature)(const ARMISARegisters *id);
+    };
+
+#define K(op0, op1, crn, crm, op2) \
+    ENCODE_AA64_CP_REG(CP_REG_ARM64_SYSREG_CP, crn, crm, op0, op1, op2)
+
+    static const struct E2HAlias aliases[] = {
+        { K(3, 0, 1, 0, 0), K(3, 4, 1, 0, 0), K(3, 5, 1, 0, 0),
+          "SCTLR", "SCTLR_EL2", "SCTLR_EL12" },
+        { K(3, 0, 1, 0, 2), K(3, 4, 1, 1, 2), K(3, 5, 1, 0, 2),
+          "CPACR", "CPTR_EL2", "CPACR_EL12" },
+        { K(3, 0, 2, 0, 0), K(3, 4, 2, 0, 0), K(3, 5, 2, 0, 0),
+          "TTBR0_EL1", "TTBR0_EL2", "TTBR0_EL12" },
+        { K(3, 0, 2, 0, 1), K(3, 4, 2, 0, 1), K(3, 5, 2, 0, 1),
+          "TTBR1_EL1", "TTBR1_EL2", "TTBR1_EL12" },
+        { K(3, 0, 2, 0, 2), K(3, 4, 2, 0, 2), K(3, 5, 2, 0, 2),
+          "TCR_EL1", "TCR_EL2", "TCR_EL12" },
+        { K(3, 0, 4, 0, 0), K(3, 4, 4, 0, 0), K(3, 5, 4, 0, 0),
+          "SPSR_EL1", "SPSR_EL2", "SPSR_EL12" },
+        { K(3, 0, 4, 0, 1), K(3, 4, 4, 0, 1), K(3, 5, 4, 0, 1),
+          "ELR_EL1", "ELR_EL2", "ELR_EL12" },
+        { K(3, 0, 5, 1, 0), K(3, 4, 5, 1, 0), K(3, 5, 5, 1, 0),
+          "AFSR0_EL1", "AFSR0_EL2", "AFSR0_EL12" },
+        { K(3, 0, 5, 1, 1), K(3, 4, 5, 1, 1), K(3, 5, 5, 1, 1),
+          "AFSR1_EL1", "AFSR1_EL2", "AFSR1_EL12" },
+        { K(3, 0, 5, 2, 0), K(3, 4, 5, 2, 0), K(3, 5, 5, 2, 0),
+          "ESR_EL1", "ESR_EL2", "ESR_EL12" },
+        { K(3, 0, 6, 0, 0), K(3, 4, 6, 0, 0), K(3, 5, 6, 0, 0),
+          "FAR_EL1", "FAR_EL2", "FAR_EL12" },
+        { K(3, 0, 10, 2, 0), K(3, 4, 10, 2, 0), K(3, 5, 10, 2, 0),
+          "MAIR_EL1", "MAIR_EL2", "MAIR_EL12" },
+        { K(3, 0, 10, 3, 0), K(3, 4, 10, 3, 0), K(3, 5, 10, 3, 0),
+          "AMAIR0", "AMAIR_EL2", "AMAIR_EL12" },
+        { K(3, 0, 12, 0, 0), K(3, 4, 12, 0, 0), K(3, 5, 12, 0, 0),
+          "VBAR", "VBAR_EL2", "VBAR_EL12" },
+        { K(3, 0, 13, 0, 1), K(3, 4, 13, 0, 1), K(3, 5, 13, 0, 1),
+          "CONTEXTIDR_EL1", "CONTEXTIDR_EL2", "CONTEXTIDR_EL12" },
+        { K(3, 0, 14, 1, 0), K(3, 4, 14, 1, 0), K(3, 5, 14, 1, 0),
+          "CNTKCTL", "CNTHCTL_EL2", "CNTKCTL_EL12" },
+
+        /*
+         * Note that redirection of ZCR is mentioned in the description
+         * of ZCR_EL2, and aliasing in the description of ZCR_EL1, but
+         * not in the summary table.
+         */
+        { K(3, 0, 1, 2, 0), K(3, 4, 1, 2, 0), K(3, 5, 1, 2, 0),
+          "ZCR_EL1", "ZCR_EL2", "ZCR_EL12", isar_feature_aa64_sve },
+
+        /* TODO: ARMv8.2-SPE -- PMSCR_EL2 */
+        /* TODO: ARMv8.4-Trace -- TRFCR_EL2 */
+    };
+#undef K
+
+    size_t i;
+
+    for (i = 0; i < ARRAY_SIZE(aliases); i++) {
+        const struct E2HAlias *a = &aliases[i];
+        ARMCPRegInfo *src_reg, *dst_reg;
+
+        if (a->feature && !a->feature(&cpu->isar)) {
+            continue;
+        }
+
+        src_reg = g_hash_table_lookup(cpu->cp_regs, &a->src_key);
+        dst_reg = g_hash_table_lookup(cpu->cp_regs, &a->dst_key);
+        g_assert(src_reg != NULL);
+        g_assert(dst_reg != NULL);
+
+        /* Cross-compare names to detect typos in the keys. */
+        g_assert(strcmp(src_reg->name, a->src_name) == 0);
+        g_assert(strcmp(dst_reg->name, a->dst_name) == 0);
+
+        /* None of the core system registers use opaque; we will. */
+        g_assert(src_reg->opaque == NULL);
+
+        /* Create alias before redirection so we dup the right data. */
+        if (a->new_key) {
+            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
+            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
+            bool ok;
+
+            new_reg->name = a->new_name;
+            new_reg->type |= ARM_CP_ALIAS;
+            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place. */
+            new_reg->access &= PL2_RW | PL3_RW;
+
+            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
+            g_assert(ok);
+        }
+
+        src_reg->opaque = dst_reg;
+        src_reg->orig_readfn = src_reg->readfn ?: raw_read;
+        src_reg->orig_writefn = src_reg->writefn ?: raw_write;
+        if (!src_reg->raw_readfn) {
+            src_reg->raw_readfn = raw_read;
+        }
+        if (!src_reg->raw_writefn) {
+            src_reg->raw_writefn = raw_write;
+        }
+        src_reg->readfn = el2_e2h_read;
+        src_reg->writefn = el2_e2h_write;
+    }
+}
+#endif
+
 static CPAccessResult ctr_el0_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                      bool isread)
 {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         : cpu_isar_feature(aa32_predinv, cpu)) {
         define_arm_cp_regs(cpu, predinv_reginfo);
     }
+
+#ifndef CONFIG_USER_ONLY
+    /*
+     * Register redirections and aliases must be done last,
+     * after the registers from the other extensions have been defined.
+     */
+    if (arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu)) {
+        define_arm_vh_e2h_redirects_aliases(cpu);
+    }
+#endif
 }
 
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This is the last instruction within disas_fp_2src,
so remove that and its subroutines.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |   1 +
 target/arm/tcg/translate-a64.c | 177 +++++----------------------------
 2 files changed, 27 insertions(+), 151 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd
 FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd
 FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd
 FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd
+FNMUL_s 0001 1110 ..1 ..... 1000 10 ..... ..... @rrr_hsd
 
 FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd
 FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fmulx = {
 };
 TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx)
 
+static void gen_fnmul_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_mulh(d, n, m, s);
+    gen_vfp_negh(d, d);
+}
+
+static void gen_fnmul_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_muls(d, n, m, s);
+    gen_vfp_negs(d, d);
+}
+
+static void gen_fnmul_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s)
+{
+    gen_helper_vfp_muld(d, n, m, s);
+    gen_vfp_negd(d, d);
+}
+
+static const FPScalar f_scalar_fnmul = {
+    gen_fnmul_h,
+    gen_fnmul_s,
+    gen_fnmul_d,
+};
+TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Floating-point data-processing (2 source) - single precision */
-static void handle_fp_2src_single(DisasContext *s, int opcode,
-                                  int rd, int rn, int rm)
-{
-    TCGv_i32 tcg_op1;
-    TCGv_i32 tcg_op2;
-    TCGv_i32 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i32();
-    fpst = fpstatus_ptr(FPST_FPCR);
-    tcg_op1 = read_fp_sreg(s, rn);
-    tcg_op2 = read_fp_sreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negs(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_sreg(s, rd, tcg_res);
-}
-
-/* Floating-point data-processing (2 source) - double precision */
-static void handle_fp_2src_double(DisasContext *s, int opcode,
-                                  int rd, int rn, int rm)
-{
-    TCGv_i64 tcg_op1;
-    TCGv_i64 tcg_op2;
-    TCGv_i64 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i64();
-    fpst = fpstatus_ptr(FPST_FPCR);
-    tcg_op1 = read_fp_dreg(s, rn);
-    tcg_op2 = read_fp_dreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negd(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_dreg(s, rd, tcg_res);
-}
-
-/* Floating-point data-processing (2 source) - half precision */
-static void handle_fp_2src_half(DisasContext *s, int opcode,
-                                int rd, int rn, int rm)
-{
-    TCGv_i32 tcg_op1;
-    TCGv_i32 tcg_op2;
-    TCGv_i32 tcg_res;
-    TCGv_ptr fpst;
-
-    tcg_res = tcg_temp_new_i32();
-    fpst = fpstatus_ptr(FPST_FPCR_F16);
-    tcg_op1 = read_fp_hreg(s, rn);
-    tcg_op2 = read_fp_hreg(s, rm);
-
-    switch (opcode) {
-    case 0x8: /* FNMUL */
-        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
-        gen_vfp_negh(tcg_res, tcg_res);
-        break;
-    default:
-    case 0x0: /* FMUL */
-    case 0x1: /* FDIV */
-    case 0x2: /* FADD */
-    case 0x3: /* FSUB */
-    case 0x4: /* FMAX */
-    case 0x5: /* FMIN */
-    case 0x6: /* FMAXNM */
-    case 0x7: /* FMINNM */
-        g_assert_not_reached();
-    }
-
-    write_fp_sreg(s, rd, tcg_res);
-}
-
-/* Floating point data-processing (2 source)
- *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
- * +---+---+---+-----------+------+---+------+--------+-----+------+------+
- * | M | 0 | S | 1 1 1 1 0 | type | 1 |  Rm  | opcode | 1 0 |  Rn  |  Rd  |
- * +---+---+---+-----------+------+---+------+--------+-----+------+------+
- */
-static void disas_fp_2src(DisasContext *s, uint32_t insn)
-{
-    int mos = extract32(insn, 29, 3);
-    int type = extract32(insn, 22, 2);
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int rm = extract32(insn, 16, 5);
-    int opcode = extract32(insn, 12, 4);
-
-    if (opcode > 8 || mos) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (type) {
-    case 0:
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_single(s, opcode, rd, rn, rm);
-        break;
-    case 1:
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_double(s, opcode, rd, rn, rm);
-        break;
-    case 3:
-        if (!dc_isar_feature(aa64_fp16, s)) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (!fp_access_check(s)) {
-            return;
-        }
-        handle_fp_2src_half(s, opcode, rd, rn, rm);
-        break;
-    default:
-        unallocated_encoding(s);
-    }
-}
-
 /* Floating-point data-processing (3 source) - single precision */
 static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1,
                                   int rd, int rn, int rm, int ra)
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
         break;
     case 2:
         /* Floating point data-processing (2 source) */
-        disas_fp_2src(s, insn);
+        unallocated_encoding(s); /* in decodetree */
        break;
     case 3:
         /* Floating point conditional select */
--
2.34.1
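Stripped of the QEMU plumbing, the E2H redirection installed by
define_arm_vh_e2h_redirects_aliases() above is a saved-original-function-
pointer dispatch. A self-contained sketch under simplified assumptions --
the Reg type, e2h_read() and the "redirecting" flag below are illustrative
stand-ins, not QEMU API:

#include <stdint.h>
#include <stdio.h>

typedef struct Reg Reg;
typedef uint64_t ReadFn(const Reg *r);

struct Reg {
    const char *name;
    uint64_t value;
    ReadFn *readfn;        /* public accessor; may be the redirector */
    ReadFn *orig_readfn;   /* saved original, used when not redirecting */
    const Reg *opaque;     /* the EL2 register this one redirects to */
};

static int redirecting;    /* stands in for "at EL2 with HCR_EL2.E2H set" */

static uint64_t raw_read(const Reg *r) { return r->value; }

static uint64_t e2h_read(const Reg *r)
{
    if (redirecting) {
        r = r->opaque;                 /* switch to the EL2 register */
        return r->readfn(r);
    }
    return r->orig_readfn(r);          /* normal EL1 access */
}

int main(void)
{
    Reg sctlr_el2 = { "SCTLR_EL2", 0x2222, raw_read, NULL, NULL };
    Reg sctlr_el1 = { "SCTLR_EL1", 0x1111, raw_read, NULL, NULL };

    /* Link and redirect, as the alias loop in the patch does. */
    sctlr_el1.opaque = &sctlr_el2;
    sctlr_el1.orig_readfn = sctlr_el1.readfn;
    sctlr_el1.readfn = e2h_read;

    printf("%llx\n", (unsigned long long)sctlr_el1.readfn(&sctlr_el1)); /* 1111 */
    redirecting = 1;
    printf("%llx\n", (unsigned long long)sctlr_el1.readfn(&sctlr_el1)); /* 2222 */
    return 0;
}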
From: Richard Henderson <richard.henderson@linaro.org>

The EL2&0 translation regime is affected by Load Register (unpriv).

The code structure used here will facilitate later changes in this
area for implementing UAO and NV.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-36-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           |  9 ++++----
 target/arm/translate.h     |  2 ++
 target/arm/helper.c        | 22 +++++++++++++++++++
 target/arm/translate-a64.c | 44 ++++++++++++++++++++++++--------------
 4 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef ARMCPU ArchCPU;
  * |              |     |   TBFLAG_A32   |            |
  * |              |     +-----+----------+ TBFLAG_AM32 |
  * |  TBFLAG_ANY  |           |TBFLAG_M32|            |
- * |              |           +-------------------------|
- * |              |           |       TBFLAG_A64        |
- * +--------------+-----------+-------------------------+
- *  31            20          14                        0
+ * |              |         +-+----------+--------------|
+ * |              |         |        TBFLAG_A64         |
+ * +--------------+---------+---------------------------+
+ *  31            20        15                          0
  *
  * Unless otherwise noted, these bits are cached in env->hflags.
  */
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1)
 FIELD(TBFLAG_A64, BT, 9, 1)
 FIELD(TBFLAG_A64, BTYPE, 10, 2)         /* Not cached. */
 FIELD(TBFLAG_A64, TBID, 12, 2)
+FIELD(TBFLAG_A64, UNPRIV, 14, 1)
 
 static inline bool bswap_code(bool sctlr_b)
 {
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      * ie A64 LDX*, LDAX*, A32/T32 LDREX*, LDAEX*.
      */
     bool is_ldex;
+    /* True if AccType_UNPRIV should be used for LDTR et al */
+    bool unpriv;
     /* True if v8.3-PAuth is active. */
     bool pauth_active;
     /* True with v8.5-BTI and SCTLR_ELx.BT* set. */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
         }
     }
 
+    /* Compute the condition for using AccType_UNPRIV for LDTR et al. */
+    /* TODO: ARMv8.2-UAO */
+    switch (mmu_idx) {
+    case ARMMMUIdx_E10_1:
+    case ARMMMUIdx_SE10_1:
+        /* TODO: ARMv8.3-NV */
+        flags = FIELD_DP32(flags, TBFLAG_A64, UNPRIV, 1);
+        break;
+    case ARMMMUIdx_E20_2:
+        /* TODO: ARMv8.4-SecEL2 */
+        /*
+         * Note that E20_2 is gated by HCR_EL2.E2H == 1, but E20_0 is
+         * gated by HCR_EL2.<E2H,TGE> == '11', and so is LDTR.
+         */
+        if (env->cp15.hcr_el2 & HCR_TGE) {
+            flags = FIELD_DP32(flags, TBFLAG_A64, UNPRIV, 1);
+        }
+        break;
+    default:
+        break;
+    }
+
     return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
 }
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ void a64_translate_init(void)
             offsetof(CPUARMState, exclusive_high), "exclusive_high");
 }
 
-static inline int get_a64_user_mem_index(DisasContext *s)
+/*
+ * Return the core mmu_idx to use for A64 "unprivileged load/store" insns
+ */
+static int get_a64_user_mem_index(DisasContext *s)
 {
-    /* Return the core mmu_idx to use for A64 "unprivileged load/store" insns:
-     * if EL1, access as if EL0; otherwise access at current EL

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h            |   2 +
 target/arm/tcg/a64.decode      |  22 +++
 target/arm/tcg/translate-a64.c | 241 +++++++++++++++++----------------
 target/arm/tcg/vec_helper.c    |  14 ++
 4 files changed, 163 insertions(+), 116 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_vfma_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_vfms_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
 FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
 FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
 
+FMLA_v 0.00 1110 010 ..... 00001 1 ..... ..... @qrrr_h
+FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
+
+FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
+FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
 FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
 FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
 
+FMLA_si 0101 1111 00 .. .... 0001 . 0 ..... ..... @rrx_h
+FMLA_si 0101 1111 10 .. .... 0001 . 0 ..... ..... @rrx_s
+FMLA_si 0101 1111 11 0. .... 0001 . 0 ..... ..... @rrx_d
+
+FMLS_si 0101 1111 00 .. .... 0101 . 0 ..... ..... @rrx_h
+FMLS_si 0101 1111 10 .. .... 0101 . 0 ..... ..... @rrx_s
+FMLS_si 0101 1111 11 0. .... 0101 . 0 ..... ..... @rrx_d
+
 FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
 FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
 FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
@@ -XXX,XX +XXX,XX @@ FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
 FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
 FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
 
+FMLA_vi 0.00 1111 00 .. .... 0001 . 0 ..... ..... @qrrx_h
+FMLA_vi 0.00 1111 10 . ..... 0001 . 0 ..... ..... @qrrx_s
+FMLA_vi 0.00 1111 11 0 ..... 0001 . 0 ..... ..... @qrrx_d
+
+FMLS_vi 0.00 1111 00 .. .... 0101 . 0 ..... ..... @qrrx_h
+FMLS_vi 0.00 1111 10 . ..... 0101 . 0 ..... ..... @qrrx_s
+FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d
+
 FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
 FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
 FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
 };
 TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx)
 
+static gen_helper_gvec_3_ptr * const f_vector_fmla[3] = {
+    gen_helper_gvec_vfma_h,
+    gen_helper_gvec_vfma_s,
+    gen_helper_gvec_vfma_d,
+};
+TRANS(FMLA_v, do_fp3_vector, a, f_vector_fmla)
+
+static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = {
+    gen_helper_gvec_vfms_h,
+    gen_helper_gvec_vfms_s,
+    gen_helper_gvec_vfms_d,
+};
+TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
 TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul)
 TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)
 
+static bool do_fmla_scalar_idx(DisasContext *s, arg_rrx_e *a, bool neg)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rd);
+            TCGv_i64 t1 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t2 = tcg_temp_new_i64();
+
+            read_vec_element(s, t2, a->rm, a->idx, MO_64);
+            if (neg) {
+                gen_vfp_negd(t1, t1);
+            }
+            gen_helper_vfp_muladdd(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rd);
+            TCGv_i32 t1 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t2 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t2, a->rm, a->idx, MO_32);
+            if (neg) {
+                gen_vfp_negs(t1, t1);
+            }
+            gen_helper_vfp_muladds(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rd);
+            TCGv_i32 t1 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t2 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t2, a->rm, a->idx, MO_16);
+            if (neg) {
+                gen_vfp_negh(t1, t1);
+            }
+            gen_helper_advsimd_muladdh(t0, t1, t2, t0,
+                                       fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+TRANS(FMLA_si, do_fmla_scalar_idx, a, false)
+TRANS(FMLS_si, do_fmla_scalar_idx, a, true)
+
 static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
                               gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
 };
 TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx)
 
+static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
+{
+    static gen_helper_gvec_4_ptr * const fns[3] = {
+        gen_helper_gvec_fmla_idx_h,
+        gen_helper_gvec_fmla_idx_s,
+        gen_helper_gvec_fmla_idx_d,
+    };
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (fp_access_check(s)) {
+        gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd,
+                          esz == MO_16, (a->idx << 1) | neg,
+                          fns[esz - 1]);
+    }
+    return true;
+}
+
+TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
+TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
+
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         read_vec_element(s, tcg_op2, rm, pass, MO_64);
 
         switch (fpopcode) {
-        case 0x39: /* FMLS */
-            /* As usual for ARM, separate negation for fused multiply-add */
-            gen_vfp_negd(tcg_op1, tcg_op1);
-            /* fall through */
-        case 0x19: /* FMLA */
-            read_vec_element(s, tcg_res, rd, pass, MO_64);
-            gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2,
-                                   tcg_res, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             break;
         default:
         case 0x18: /* FMAXNM */
+        case 0x19: /* FMLA */
        case 0x1a: /* FADD */
         case 0x1b: /* FMULX */
         case 0x1e: /* FMAX */
         case 0x38: /* FMINNM */
+        case 0x39: /* FMLS */
         case 0x3a: /* FSUB */
         case 0x3e: /* FMIN */
         case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
         read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
 
         switch (fpopcode) {
-        case 0x39: /* FMLS */
-            /* As usual for ARM, separate negation for fused multiply-add */
-            gen_vfp_negs(tcg_op1, tcg_op1);
-            /* fall through */
-        case 0x19: /* FMLA */
-            read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
-            gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2,
-                                   tcg_res, fpst);
-            break;
         case 0x1c: /* FCMEQ */
             gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             break;
         default:
         case 0x18: /* FMAXNM */
+        case 0x19: /* FMLA */
         case 0x1a: /* FADD */
         case 0x1b: /* FMULX */
         case 0x1e: /* FMAX */
         case 0x38: /* FMINNM */
+        case 0x39: /* FMLS */
         case 0x3a: /* FSUB */
         case 0x3e: /* FMIN */
         case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x3f: /* FRSQRTS */
     case 0x5d: /* FACGE */
     case 0x7d: /* FACGT */
-    case 0x19: /* FMLA */
-    case 0x39: /* FMLS */
     case 0x1c: /* FCMEQ */
     case 0x5c: /* FCMGE */
     case 0x7a: /* FABD */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
 
     default:
     case 0x18: /* FMAXNM */
+    case 0x19: /* FMLA */
     case 0x1a: /* FADD */
     case 0x1b: /* FMULX */
     case 0x1e: /* FMAX */
     case 0x38: /* FMINNM */
+    case 0x39: /* FMLS */
     case 0x3a: /* FSUB */
     case 0x3e: /* FMIN */
     case 0x5b: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     int pass;
 
     switch (fpopcode) {
-    case 0x1: /* FMLA */
     case 0x4: /* FCMEQ */
     case 0x7: /* FRECPS */
-    case 0x9: /* FMLS */
     case 0xf: /* FRSQRTS */
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         break;
     default:
     case 0x0: /* FMAXNM */
+    case 0x1: /* FMLA */
     case 0x2: /* FADD */
     case 0x3: /* FMULX */
     case 0x6: /* FMAX */
     case 0x8: /* FMINNM */
+    case 0x9: /* FMLS */
     case 0xa: /* FSUB */
     case 0xe: /* FMIN */
     case 0x13: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);
 
         switch (fpopcode) {
-        case 0x1: /* FMLA */
-            read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
-            gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
-                                       fpst);
-            break;
         case 0x4: /* FCMEQ */
             gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
         case 0x7: /* FRECPS */
             gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
-        case 0x9: /* FMLS */
-            /* As usual for ARM, separate negation for fused multiply-add */
-            tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
-            read_vec_element_i32(s, tcg_res, rd, pass, MO_16);
-            gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res,
-                                       fpst);
-            break;
         case 0xf: /* FRSQRTS */
             gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         break;
     default:
     case 0x0: /* FMAXNM */
+    case 0x1: /* FMLA */
     case 0x2: /* FADD */
     case 0x3: /* FMULX */
     case 0x6: /* FMAX */
     case 0x8: /* FMINNM */
+    case 0x9: /* FMLS */
     case 0xa: /* FSUB */
     case 0xe: /* FMIN */
     case 0x13: /* FMUL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     case 0x0c: /* SQDMULH */
     case 0x0d: /* SQRDMULH */
         break;
-    case 0x01: /* FMLA */
-    case 0x05: /* FMLS */
-        is_fp = 1;
-        break;
     case 0x1d: /* SQRDMLAH */
     case 0x1f: /* SQRDMLSH */
         if (!dc_isar_feature(aa64_rdm, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         /* is_fp, but we pass tcg_env not fp_status. */
         break;
     default:
+    case 0x01: /* FMLA */
+    case 0x05: /* FMLS */
     case 0x09: /* FMUL */
     case 0x19: /* FMULX */
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
 
     switch (is_fp) {
     case 1: /* normal fp */
-        /* convert insn encoded size to MemOp size */
-        switch (size) {
-        case 0: /* half-precision */
-            size = MO_16;
-            is_fp16 = true;
-            break;
-        case MO_32: /* single precision */
-        case MO_64: /* double precision */
-            break;
-        default:
-            unallocated_encoding(s);
-            return;
-        }
-        break;
+        unallocated_encoding(s); /* in decodetree */
+        return;
 
     case 2: /* complex fp */
         /* Each indexable element is a complex pair. */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 
     if (size == 3) {
-        TCGv_i64 tcg_idx = tcg_temp_new_i64();
-        int pass;
-
-        assert(is_fp && is_q && !is_long);
-
-        read_vec_element(s, tcg_idx, rm, index, MO_64);
-
-        for (pass = 0; pass < (is_scalar ? 1 : 2); pass++) {
-            TCGv_i64 tcg_op = tcg_temp_new_i64();
-            TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-            read_vec_element(s, tcg_op, rn, pass, MO_64);
-
-            switch (16 * u + opcode) {
-            case 0x05: /* FMLS */
-                /* As usual for ARM, separate negation for fused multiply-add */
-                gen_vfp_negd(tcg_op, tcg_op);
-                /* fall through */
-            case 0x01: /* FMLA */
-                read_vec_element(s, tcg_res, rd, pass, MO_64);
-                gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
-                break;
-            default:
-            case 0x09: /* FMUL */
-            case 0x19: /* FMULX */
-                g_assert_not_reached();
-            }
-
-            write_vec_element(s, tcg_res, rd, pass, MO_64);
-        }
-
-        clear_vec_high(s, !is_scalar, rd);
+        g_assert_not_reached();
     } else if (!is_long) {
         /* 32 bit floating point, or 16 or 32 bit integer.
          * For the 16 bit scalar case we use the usual Neon helpers and
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
             genfn(tcg_res, tcg_op, tcg_res);
             break;
         }
-        case 0x05: /* FMLS */
-        case 0x01: /* FMLA */
-            read_vec_element_i32(s, tcg_res, rd, pass,
-                                 is_scalar ? size : MO_32);
-            switch (size) {
-            case 1:
-                if (opcode == 0x5) {
-                    /* As usual for ARM, separate negation for fused
-                     * multiply-add */
-                    tcg_gen_xori_i32(tcg_op, tcg_op, 0x80008000);
-                }
-                if (is_scalar) {
-                    gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
-                                               tcg_res, fpst);
-                } else {
-                    gen_helper_advsimd_muladd2h(tcg_res, tcg_op, tcg_idx,
-                                                tcg_res, fpst);
-                }
-                break;
-            case 2:
-                if (opcode == 0x5) {
-                    /* As usual for ARM, separate negation for
-                     * fused multiply-add */
-                    tcg_gen_xori_i32(tcg_op, tcg_op, 0x80000000);
-                }
-                gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx,
-                                       tcg_res, fpst);
-                break;
-            default:
-                g_assert_not_reached();
-            }
-            break;
         case 0x0c: /* SQDMULH */
             if (size == 1) {
                 gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
             }
             break;
         default:
+        case 0x01: /* FMLA */
+        case 0x05: /* FMLS */
         case 0x09: /* FMUL */
         case 0x19: /* FMULX */
             g_assert_not_reached();
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static float32 float32_muladd_f(float32 dest, float32 op1, float32 op2,
     return float32_muladd(op1, op2, dest, 0, stat);
 }
 
+static float64 float64_muladd_f(float64 dest, float64 op1, float64 op2,
+                                float_status *stat)
+{
+    return float64_muladd(op1, op2, dest, 0, stat);
+}
+
 static float16 float16_mulsub_f(float16 dest, float16 op1, float16 op2,
                                 float_status *stat)
 {
@@ -XXX,XX +XXX,XX @@ static float32 float32_mulsub_f(float32 dest, float32 op1, float32 op2,
     return float32_muladd(float32_chs(op1), op2, dest, 0, stat);
 }
 
+static float64 float64_mulsub_f(float64 dest, float64 op1, float64 op2,
+                                float_status *stat)
+{
+    return float64_muladd(float64_chs(op1), op2, dest, 0, stat);
+}
+
 #define DO_MULADD(NAME, FUNC, TYPE) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
109
+ /*
511
{ \
110
+ * If AccType_UNPRIV is not used, the insn uses AccType_NORMAL,
512
@@ -XXX,XX +XXX,XX @@ DO_MULADD(gvec_fmls_s, float32_mulsub_nf, float32)
111
+ * which is the usual mmu_idx for this cpu state.
513
112
*/
514
DO_MULADD(gvec_vfma_h, float16_muladd_f, float16)
113
- ARMMMUIdx useridx;
515
DO_MULADD(gvec_vfma_s, float32_muladd_f, float32)
114
+ ARMMMUIdx useridx = s->mmu_idx;
516
+DO_MULADD(gvec_vfma_d, float64_muladd_f, float64)
115
517
116
- switch (s->mmu_idx) {
518
DO_MULADD(gvec_vfms_h, float16_mulsub_f, float16)
117
- case ARMMMUIdx_E10_1:
519
DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
118
- useridx = ARMMMUIdx_E10_0;
520
+DO_MULADD(gvec_vfms_d, float64_mulsub_f, float64)
119
- break;
521
120
- case ARMMMUIdx_SE10_1:
522
/* For the indexed ops, SVE applies the index per 128-bit vector segment.
121
- useridx = ARMMMUIdx_SE10_0;
523
* For AdvSIMD, there is of course only one such vector segment.
122
- break;
123
- case ARMMMUIdx_Stage2:
124
- g_assert_not_reached();
125
- default:
126
- useridx = s->mmu_idx;
127
- break;
128
+ if (s->unpriv) {
129
+ /*
130
+ * We have pre-computed the condition for AccType_UNPRIV.
131
+ * Therefore we should never get here with a mmu_idx for
132
+ * which we do not know the corresponding user mmu_idx.
133
+ */
134
+ switch (useridx) {
135
+ case ARMMMUIdx_E10_1:
136
+ useridx = ARMMMUIdx_E10_0;
137
+ break;
138
+ case ARMMMUIdx_E20_2:
139
+ useridx = ARMMMUIdx_E20_0;
140
+ break;
141
+ case ARMMMUIdx_SE10_1:
142
+ useridx = ARMMMUIdx_SE10_0;
143
+ break;
144
+ default:
145
+ g_assert_not_reached();
146
+ }
147
}
148
return arm_to_core_mmu_idx(useridx);
149
}
150
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
151
dc->pauth_active = FIELD_EX32(tb_flags, TBFLAG_A64, PAUTH_ACTIVE);
152
dc->bt = FIELD_EX32(tb_flags, TBFLAG_A64, BT);
153
dc->btype = FIELD_EX32(tb_flags, TBFLAG_A64, BTYPE);
154
+ dc->unpriv = FIELD_EX32(tb_flags, TBFLAG_A64, UNPRIV);
155
dc->vec_len = 0;
156
dc->vec_stride = 0;
157
dc->cp_regs = arm_cpu->cp_regs;
158
--
524
--
159
2.20.1
525
2.34.1
160
161
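A standalone sketch of the "separate negation for fused multiply-add"
trick the FMLA/FMLS conversion above relies on (hypothetical helper,
not QEMU code): Arm defines FMLS as acc + (-n) * m with a single
rounding, so the translator flips the operand's sign bit (0x8000 per
fp16 element, 0x80000000 for fp32) and then reuses the ordinary fused
multiply-add helper.

#include <math.h>
#include <stdio.h>

static double fmls64(double acc, double n, double m)
{
    return fma(-n, m, acc);   /* one rounding step, as the ISA requires */
}

int main(void)
{
    printf("%g\n", fmls64(1.0, 3.0, 2.0));   /* 1 - 3*2 = -5 */
    return 0;
}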
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Rather than calling a separate function and re-computing any
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
parameters for the flush, simply use the correct flush
5
function directly.
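The shape of that change, as a minimal standalone sketch (toy
stand-ins for QEMU's CPUState and flush APIs, illustration only):

#include <stdbool.h>
#include <stdio.h>

static void tlb_flush_all_cpus_synced(void) { puts("broadcast flush, synced"); }
static void tlb_flush_local(void)           { puts("local flush"); }

/* Pick the broadcast or local flush directly instead of re-entering
 * the inner-shareable write handler to recompute its arguments. */
static void tlbiall_write(bool force_broadcast)
{
    if (force_broadcast) {
        tlb_flush_all_cpus_synced();
    } else {
        tlb_flush_local();
    }
}

int main(void)
{
    tlbiall_write(false);   /* normal case: flush the local vCPU */
    tlbiall_write(true);    /* e.g. HCR_EL2.FB forcing broadcast */
    return 0;
}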
6
7
Tested-by: Alex Bennée <alex.bennee@linaro.org>
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200206105448.4726-9-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-26-richard.henderson@linaro.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
7
---
13
target/arm/helper.c | 52 +++++++++++++++++++++------------------------
8
target/arm/helper.h | 5 +
14
1 file changed, 24 insertions(+), 28 deletions(-)
9
target/arm/tcg/a64.decode | 30 ++++++
10
target/arm/tcg/translate-a64.c | 188 +++++++++++++++++++--------------
11
target/arm/tcg/vec_helper.c | 30 ++++++
12
4 files changed, 174 insertions(+), 79 deletions(-)
15
13
16
diff --git a/target/arm/helper.c b/target/arm/helper.c
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper.c
16
--- a/target/arm/helper.h
19
+++ b/target/arm/helper.c
17
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
uint64_t value)
19
20
DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+DEF_HELPER_FLAGS_5(gvec_fceq_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
24
DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(gvec_fcge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
28
DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_5(gvec_fcgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
32
DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(gvec_facge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
36
DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_5(gvec_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
40
DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
41
DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
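For readers unfamiliar with these declarations: each DEF_HELPER_FLAGS_5
line above expands, roughly, to a prototype like the following
(simplified; QEMU's helper machinery also records TCG call flags),
matching the DO_3OP definitions in vec_helper.c below:

#include <stdint.h>

/* destination vector, two source vectors, float_status pointer,
 * and an operation descriptor (vector length etc.) */
void helper_gvec_fceq_d(void *vd, void *vn, void *vm,
                        void *stat, uint32_t desc);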
42
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/tcg/a64.decode
45
+++ b/target/arm/tcg/a64.decode
46
@@ -XXX,XX +XXX,XX @@ FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd
47
FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
48
FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
49
50
+FCMEQ_s 0101 1110 010 ..... 00100 1 ..... ..... @rrr_h
51
+FCMEQ_s 0101 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd
52
+
53
+FCMGE_s 0111 1110 010 ..... 00100 1 ..... ..... @rrr_h
54
+FCMGE_s 0111 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd
55
+
56
+FCMGT_s 0111 1110 110 ..... 00100 1 ..... ..... @rrr_h
57
+FCMGT_s 0111 1110 1.1 ..... 11100 1 ..... ..... @rrr_sd
58
+
59
+FACGE_s 0111 1110 010 ..... 00101 1 ..... ..... @rrr_h
60
+FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd
61
+
62
+FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h
63
+FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
64
+
65
### Advanced SIMD three same
66
67
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
68
@@ -XXX,XX +XXX,XX @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
69
FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
70
FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd
71
72
+FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h
73
+FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
74
+
75
+FCMGE_v 0.10 1110 010 ..... 00100 1 ..... ..... @qrrr_h
76
+FCMGE_v 0.10 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
77
+
78
+FCMGT_v 0.10 1110 110 ..... 00100 1 ..... ..... @qrrr_h
79
+FCMGT_v 0.10 1110 1.1 ..... 11100 1 ..... ..... @qrrr_sd
80
+
81
+FACGE_v 0.10 1110 010 ..... 00101 1 ..... ..... @qrrr_h
82
+FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd
83
+
84
+FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h
85
+FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
86
+
87
### Advanced SIMD scalar x indexed element
88
89
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
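For readers new to decodetree: each pattern line above compiles into a
mask/match test plus field extraction. A hand-written equivalent of the
half-precision FCMEQ_s pattern, as an illustration only (this is not
the generated code):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { int rd, rn, rm; } arg_rrr;

static bool trans_FCMEQ_s_h(const arg_rrr *a)
{
    printf("FCMEQ (scalar, fp16): rd=%d rn=%d rm=%d\n", a->rd, a->rn, a->rm);
    return true;
}

/* FCMEQ_s 0101 1110 010 ..... 00100 1 ..... ..... @rrr_h
 * fixed bits -> mask 0xffe0fc00, match 0x5e402400;
 * dotted fields -> rm = insn[20:16], rn = insn[9:5], rd = insn[4:0]. */
static bool decode(uint32_t insn)
{
    if ((insn & 0xffe0fc00) == 0x5e402400) {
        arg_rrr a = {
            .rd = insn & 0x1f,
            .rn = (insn >> 5) & 0x1f,
            .rm = (insn >> 16) & 0x1f,
        };
        return trans_FCMEQ_s_h(&a);
    }
    return false;   /* fall through to the next pattern */
}

int main(void)
{
    decode(0x5e422420);   /* FCMEQ h0, h1, h2 */
    return 0;
}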
90
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/tcg/translate-a64.c
93
+++ b/target/arm/tcg/translate-a64.c
94
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fnmul = {
95
};
96
TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul)
97
98
+static const FPScalar f_scalar_fcmeq = {
99
+ gen_helper_advsimd_ceq_f16,
100
+ gen_helper_neon_ceq_f32,
101
+ gen_helper_neon_ceq_f64,
102
+};
103
+TRANS(FCMEQ_s, do_fp3_scalar, a, &f_scalar_fcmeq)
104
+
105
+static const FPScalar f_scalar_fcmge = {
106
+ gen_helper_advsimd_cge_f16,
107
+ gen_helper_neon_cge_f32,
108
+ gen_helper_neon_cge_f64,
109
+};
110
+TRANS(FCMGE_s, do_fp3_scalar, a, &f_scalar_fcmge)
111
+
112
+static const FPScalar f_scalar_fcmgt = {
113
+ gen_helper_advsimd_cgt_f16,
114
+ gen_helper_neon_cgt_f32,
115
+ gen_helper_neon_cgt_f64,
116
+};
117
+TRANS(FCMGT_s, do_fp3_scalar, a, &f_scalar_fcmgt)
118
+
119
+static const FPScalar f_scalar_facge = {
120
+ gen_helper_advsimd_acge_f16,
121
+ gen_helper_neon_acge_f32,
122
+ gen_helper_neon_acge_f64,
123
+};
124
+TRANS(FACGE_s, do_fp3_scalar, a, &f_scalar_facge)
125
+
126
+static const FPScalar f_scalar_facgt = {
127
+ gen_helper_advsimd_acgt_f16,
128
+ gen_helper_neon_acgt_f32,
129
+ gen_helper_neon_acgt_f64,
130
+};
131
+TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt)
132
+
133
static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
134
gen_helper_gvec_3_ptr * const fns[3])
22
{
135
{
23
/* Invalidate all (TLBIALL) */
136
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = {
24
- ARMCPU *cpu = env_archcpu(env);
137
};
25
+ CPUState *cs = env_cpu(env);
138
TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls)
26
139
27
if (tlb_force_broadcast(env)) {
140
+static gen_helper_gvec_3_ptr * const f_vector_fcmeq[3] = {
28
- tlbiall_is_write(env, NULL, value);
141
+ gen_helper_gvec_fceq_h,
29
- return;
142
+ gen_helper_gvec_fceq_s,
30
+ tlb_flush_all_cpus_synced(cs);
143
+ gen_helper_gvec_fceq_d,
31
+ } else {
144
+};
32
+ tlb_flush(cs);
145
+TRANS(FCMEQ_v, do_fp3_vector, a, f_vector_fcmeq)
146
+
147
+static gen_helper_gvec_3_ptr * const f_vector_fcmge[3] = {
148
+ gen_helper_gvec_fcge_h,
149
+ gen_helper_gvec_fcge_s,
150
+ gen_helper_gvec_fcge_d,
151
+};
152
+TRANS(FCMGE_v, do_fp3_vector, a, f_vector_fcmge)
153
+
154
+static gen_helper_gvec_3_ptr * const f_vector_fcmgt[3] = {
155
+ gen_helper_gvec_fcgt_h,
156
+ gen_helper_gvec_fcgt_s,
157
+ gen_helper_gvec_fcgt_d,
158
+};
159
+TRANS(FCMGT_v, do_fp3_vector, a, f_vector_fcmgt)
160
+
161
+static gen_helper_gvec_3_ptr * const f_vector_facge[3] = {
162
+ gen_helper_gvec_facge_h,
163
+ gen_helper_gvec_facge_s,
164
+ gen_helper_gvec_facge_d,
165
+};
166
+TRANS(FACGE_v, do_fp3_vector, a, f_vector_facge)
167
+
168
+static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = {
169
+ gen_helper_gvec_facgt_h,
170
+ gen_helper_gvec_facgt_s,
171
+ gen_helper_gvec_facgt_d,
172
+};
173
+TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt)
174
+
175
/*
176
* Advanced SIMD scalar/vector x indexed element
177
*/
178
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
179
read_vec_element(s, tcg_op2, rm, pass, MO_64);
180
181
switch (fpopcode) {
182
- case 0x1c: /* FCMEQ */
183
- gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
184
- break;
185
case 0x1f: /* FRECPS */
186
gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
187
break;
188
case 0x3f: /* FRSQRTS */
189
gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
190
break;
191
- case 0x5c: /* FCMGE */
192
- gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
193
- break;
194
- case 0x5d: /* FACGE */
195
- gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst);
196
- break;
197
case 0x7a: /* FABD */
198
gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
199
gen_vfp_absd(tcg_res, tcg_res);
200
break;
201
- case 0x7c: /* FCMGT */
202
- gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
203
- break;
204
- case 0x7d: /* FACGT */
205
- gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
206
- break;
207
default:
208
case 0x18: /* FMAXNM */
209
case 0x19: /* FMLA */
210
case 0x1a: /* FADD */
211
case 0x1b: /* FMULX */
212
+ case 0x1c: /* FCMEQ */
213
case 0x1e: /* FMAX */
214
case 0x38: /* FMINNM */
215
case 0x39: /* FMLS */
216
case 0x3a: /* FSUB */
217
case 0x3e: /* FMIN */
218
case 0x5b: /* FMUL */
219
+ case 0x5c: /* FCMGE */
220
+ case 0x5d: /* FACGE */
221
case 0x5f: /* FDIV */
222
+ case 0x7c: /* FCMGT */
223
+ case 0x7d: /* FACGT */
224
g_assert_not_reached();
225
}
226
227
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
228
read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
229
230
switch (fpopcode) {
231
- case 0x1c: /* FCMEQ */
232
- gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
233
- break;
234
case 0x1f: /* FRECPS */
235
gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
236
break;
237
case 0x3f: /* FRSQRTS */
238
gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
239
break;
240
- case 0x5c: /* FCMGE */
241
- gen_helper_neon_cge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
242
- break;
243
- case 0x5d: /* FACGE */
244
- gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst);
245
- break;
246
case 0x7a: /* FABD */
247
gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
248
gen_vfp_abss(tcg_res, tcg_res);
249
break;
250
- case 0x7c: /* FCMGT */
251
- gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
252
- break;
253
- case 0x7d: /* FACGT */
254
- gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
255
- break;
256
default:
257
case 0x18: /* FMAXNM */
258
case 0x19: /* FMLA */
259
case 0x1a: /* FADD */
260
case 0x1b: /* FMULX */
261
+ case 0x1c: /* FCMEQ */
262
case 0x1e: /* FMAX */
263
case 0x38: /* FMINNM */
264
case 0x39: /* FMLS */
265
case 0x3a: /* FSUB */
266
case 0x3e: /* FMIN */
267
case 0x5b: /* FMUL */
268
+ case 0x5c: /* FCMGE */
269
+ case 0x5d: /* FACGE */
270
case 0x5f: /* FDIV */
271
+ case 0x7c: /* FCMGT */
272
+ case 0x7d: /* FACGT */
273
g_assert_not_reached();
274
}
275
276
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
277
switch (fpopcode) {
278
case 0x1f: /* FRECPS */
279
case 0x3f: /* FRSQRTS */
280
+ case 0x7a: /* FABD */
281
+ break;
282
+ default:
283
+ case 0x1b: /* FMULX */
284
case 0x5d: /* FACGE */
285
case 0x7d: /* FACGT */
286
case 0x1c: /* FCMEQ */
287
case 0x5c: /* FCMGE */
288
case 0x7c: /* FCMGT */
289
- case 0x7a: /* FABD */
290
- break;
291
- default:
292
- case 0x1b: /* FMULX */
293
unallocated_encoding(s);
294
return;
295
}
296
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
297
TCGv_i32 tcg_res;
298
299
switch (fpopcode) {
300
- case 0x04: /* FCMEQ (reg) */
301
case 0x07: /* FRECPS */
302
case 0x0f: /* FRSQRTS */
303
- case 0x14: /* FCMGE (reg) */
304
- case 0x15: /* FACGE */
305
case 0x1a: /* FABD */
306
- case 0x1c: /* FCMGT (reg) */
307
- case 0x1d: /* FACGT */
308
break;
309
default:
310
case 0x03: /* FMULX */
311
+ case 0x04: /* FCMEQ (reg) */
312
+ case 0x14: /* FCMGE (reg) */
313
+ case 0x15: /* FACGE */
314
+ case 0x1c: /* FCMGT (reg) */
315
+ case 0x1d: /* FACGT */
316
unallocated_encoding(s);
317
return;
33
}
318
}
34
-
319
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
35
- tlb_flush(CPU(cpu));
320
tcg_res = tcg_temp_new_i32();
321
322
switch (fpopcode) {
323
- case 0x04: /* FCMEQ (reg) */
324
- gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
325
- break;
326
case 0x07: /* FRECPS */
327
gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
328
break;
329
case 0x0f: /* FRSQRTS */
330
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
331
break;
332
- case 0x14: /* FCMGE (reg) */
333
- gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
334
- break;
335
- case 0x15: /* FACGE */
336
- gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
337
- break;
338
case 0x1a: /* FABD */
339
gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
340
tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
341
break;
342
- case 0x1c: /* FCMGT (reg) */
343
- gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
344
- break;
345
- case 0x1d: /* FACGT */
346
- gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
347
- break;
348
default:
349
case 0x03: /* FMULX */
350
+ case 0x04: /* FCMEQ (reg) */
351
+ case 0x14: /* FCMGE (reg) */
352
+ case 0x15: /* FACGE */
353
+ case 0x1c: /* FCMGT (reg) */
354
+ case 0x1d: /* FACGT */
355
g_assert_not_reached();
356
}
357
358
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
359
return;
360
case 0x1f: /* FRECPS */
361
case 0x3f: /* FRSQRTS */
362
- case 0x5d: /* FACGE */
363
- case 0x7d: /* FACGT */
364
- case 0x1c: /* FCMEQ */
365
- case 0x5c: /* FCMGE */
366
case 0x7a: /* FABD */
367
- case 0x7c: /* FCMGT */
368
if (!fp_access_check(s)) {
369
return;
370
}
371
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
372
case 0x19: /* FMLA */
373
case 0x1a: /* FADD */
374
case 0x1b: /* FMULX */
375
+ case 0x1c: /* FCMEQ */
376
case 0x1e: /* FMAX */
377
case 0x38: /* FMINNM */
378
case 0x39: /* FMLS */
379
case 0x3a: /* FSUB */
380
case 0x3e: /* FMIN */
381
case 0x5b: /* FMUL */
382
+ case 0x5c: /* FCMGE */
383
+ case 0x5d: /* FACGE */
384
case 0x5f: /* FDIV */
385
+ case 0x7d: /* FACGT */
386
+ case 0x7c: /* FCMGT */
387
unallocated_encoding(s);
388
return;
389
}
390
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
391
int pass;
392
393
switch (fpopcode) {
394
- case 0x4: /* FCMEQ */
395
case 0x7: /* FRECPS */
396
case 0xf: /* FRSQRTS */
397
- case 0x14: /* FCMGE */
398
- case 0x15: /* FACGE */
399
case 0x1a: /* FABD */
400
- case 0x1c: /* FCMGT */
401
- case 0x1d: /* FACGT */
402
pairwise = false;
403
break;
404
case 0x10: /* FMAXNMP */
405
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
406
case 0x1: /* FMLA */
407
case 0x2: /* FADD */
408
case 0x3: /* FMULX */
409
+ case 0x4: /* FCMEQ */
410
case 0x6: /* FMAX */
411
case 0x8: /* FMINNM */
412
case 0x9: /* FMLS */
413
case 0xa: /* FSUB */
414
case 0xe: /* FMIN */
415
case 0x13: /* FMUL */
416
+ case 0x14: /* FCMGE */
417
+ case 0x15: /* FACGE */
418
case 0x17: /* FDIV */
419
+ case 0x1c: /* FCMGT */
420
+ case 0x1d: /* FACGT */
421
unallocated_encoding(s);
422
return;
423
}
424
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
425
read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);
426
427
switch (fpopcode) {
428
- case 0x4: /* FCMEQ */
429
- gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
430
- break;
431
case 0x7: /* FRECPS */
432
gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
433
break;
434
case 0xf: /* FRSQRTS */
435
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
436
break;
437
- case 0x14: /* FCMGE */
438
- gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
439
- break;
440
- case 0x15: /* FACGE */
441
- gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst);
442
- break;
443
case 0x1a: /* FABD */
444
gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
445
tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
446
break;
447
- case 0x1c: /* FCMGT */
448
- gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
449
- break;
450
- case 0x1d: /* FACGT */
451
- gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
452
- break;
453
default:
454
case 0x0: /* FMAXNM */
455
case 0x1: /* FMLA */
456
case 0x2: /* FADD */
457
case 0x3: /* FMULX */
458
+ case 0x4: /* FCMEQ */
459
case 0x6: /* FMAX */
460
case 0x8: /* FMINNM */
461
case 0x9: /* FMLS */
462
case 0xa: /* FSUB */
463
case 0xe: /* FMIN */
464
case 0x13: /* FMUL */
465
+ case 0x14: /* FCMGE */
466
+ case 0x15: /* FACGE */
467
case 0x17: /* FDIV */
468
+ case 0x1c: /* FCMGT */
469
+ case 0x1d: /* FACGT */
470
g_assert_not_reached();
471
}
472
473
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
474
index XXXXXXX..XXXXXXX 100644
475
--- a/target/arm/tcg/vec_helper.c
476
+++ b/target/arm/tcg/vec_helper.c
477
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_ceq(float32 op1, float32 op2, float_status *stat)
478
return -float32_eq_quiet(op1, op2, stat);
36
}
479
}
37
480
38
static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
481
+static uint64_t float64_ceq(float64 op1, float64 op2, float_status *stat)
39
uint64_t value)
482
+{
483
+ return -float64_eq_quiet(op1, op2, stat);
484
+}
485
+
486
static uint16_t float16_cge(float16 op1, float16 op2, float_status *stat)
40
{
487
{
41
/* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
488
return -float16_le(op2, op1, stat);
42
- ARMCPU *cpu = env_archcpu(env);
489
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_cge(float32 op1, float32 op2, float_status *stat)
43
+ CPUState *cs = env_cpu(env);
490
return -float32_le(op2, op1, stat);
44
45
+ value &= TARGET_PAGE_MASK;
46
if (tlb_force_broadcast(env)) {
47
- tlbimva_is_write(env, NULL, value);
48
- return;
49
+ tlb_flush_page_all_cpus_synced(cs, value);
50
+ } else {
51
+ tlb_flush_page(cs, value);
52
}
53
-
54
- tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
55
}
491
}
56
492
57
static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
493
+static uint64_t float64_cge(float64 op1, float64 op2, float_status *stat)
58
uint64_t value)
494
+{
495
+ return -float64_le(op2, op1, stat);
496
+}
497
+
498
static uint16_t float16_cgt(float16 op1, float16 op2, float_status *stat)
59
{
499
{
60
/* Invalidate by ASID (TLBIASID) */
500
return -float16_lt(op2, op1, stat);
61
- ARMCPU *cpu = env_archcpu(env);
501
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_cgt(float32 op1, float32 op2, float_status *stat)
62
+ CPUState *cs = env_cpu(env);
502
return -float32_lt(op2, op1, stat);
63
64
if (tlb_force_broadcast(env)) {
65
- tlbiasid_is_write(env, NULL, value);
66
- return;
67
+ tlb_flush_all_cpus_synced(cs);
68
+ } else {
69
+ tlb_flush(cs);
70
}
71
-
72
- tlb_flush(CPU(cpu));
73
}
503
}
74
504
75
static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
505
+static uint64_t float64_cgt(float64 op1, float64 op2, float_status *stat)
76
uint64_t value)
506
+{
507
+ return -float64_lt(op2, op1, stat);
508
+}
509
+
510
static uint16_t float16_acge(float16 op1, float16 op2, float_status *stat)
77
{
511
{
78
/* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
512
return -float16_le(float16_abs(op2), float16_abs(op1), stat);
79
- ARMCPU *cpu = env_archcpu(env);
513
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_acge(float32 op1, float32 op2, float_status *stat)
80
+ CPUState *cs = env_cpu(env);
514
return -float32_le(float32_abs(op2), float32_abs(op1), stat);
81
82
+ value &= TARGET_PAGE_MASK;
83
if (tlb_force_broadcast(env)) {
84
- tlbimvaa_is_write(env, NULL, value);
85
- return;
86
+ tlb_flush_page_all_cpus_synced(cs, value);
87
+ } else {
88
+ tlb_flush_page(cs, value);
89
}
90
-
91
- tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
92
}
515
}
93
516
94
static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
517
+static uint64_t float64_acge(float64 op1, float64 op2, float_status *stat)
95
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
518
+{
96
int mask = vae1_tlbmask(env);
519
+ return -float64_le(float64_abs(op2), float64_abs(op1), stat);
97
520
+}
98
if (tlb_force_broadcast(env)) {
521
+
99
- tlbi_aa64_vmalle1is_write(env, NULL, value);
522
static uint16_t float16_acgt(float16 op1, float16 op2, float_status *stat)
100
- return;
523
{
101
+ tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
524
return -float16_lt(float16_abs(op2), float16_abs(op1), stat);
102
+ } else {
525
@@ -XXX,XX +XXX,XX @@ static uint32_t float32_acgt(float32 op1, float32 op2, float_status *stat)
103
+ tlb_flush_by_mmuidx(cs, mask);
526
return -float32_lt(float32_abs(op2), float32_abs(op1), stat);
104
}
105
-
106
- tlb_flush_by_mmuidx(cs, mask);
107
}
527
}
108
528
109
static int alle1_tlbmask(CPUARMState *env)
529
+static uint64_t float64_acgt(float64 op1, float64 op2, float_status *stat)
110
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
530
+{
111
uint64_t pageaddr = sextract64(value << 12, 0, 56);
531
+ return -float64_lt(float64_abs(op2), float64_abs(op1), stat);
112
532
+}
113
if (tlb_force_broadcast(env)) {
533
+
114
- tlbi_aa64_vae1is_write(env, NULL, value);
534
static int16_t vfp_tosszh(float16 x, void *fpstp)
115
- return;
535
{
116
+ tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
536
float_status *fpst = fpstp;
117
+ } else {
537
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_fabd_s, float32_abd, float32)
118
+ tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
538
119
}
539
DO_3OP(gvec_fceq_h, float16_ceq, float16)
120
-
540
DO_3OP(gvec_fceq_s, float32_ceq, float32)
121
- tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
541
+DO_3OP(gvec_fceq_d, float64_ceq, float64)
122
}
542
123
543
DO_3OP(gvec_fcge_h, float16_cge, float16)
124
static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
544
DO_3OP(gvec_fcge_s, float32_cge, float32)
545
+DO_3OP(gvec_fcge_d, float64_cge, float64)
546
547
DO_3OP(gvec_fcgt_h, float16_cgt, float16)
548
DO_3OP(gvec_fcgt_s, float32_cgt, float32)
549
+DO_3OP(gvec_fcgt_d, float64_cgt, float64)
550
551
DO_3OP(gvec_facge_h, float16_acge, float16)
552
DO_3OP(gvec_facge_s, float32_acge, float32)
553
+DO_3OP(gvec_facge_d, float64_acge, float64)
554
555
DO_3OP(gvec_facgt_h, float16_acgt, float16)
556
DO_3OP(gvec_facgt_s, float32_acgt, float32)
557
+DO_3OP(gvec_facgt_d, float64_acgt, float64)
558
559
DO_3OP(gvec_fmax_h, float16_max, float16)
560
DO_3OP(gvec_fmax_s, float32_max, float32)
125
--
561
--
126
2.20.1
562
2.34.1
127
128
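A standalone sketch of the mask convention the new float64_ceq/cge/cgt
and facge/facgt helpers above follow (plain C doubles standing in for
QEMU's float64 type; NaN handling elided):

#include <stdint.h>
#include <stdio.h>

/* SIMD compares return all-ones for true, all-zeroes for false;
 * negating a 0/1 result in unsigned arithmetic yields exactly that. */
static uint64_t ceq64(double a, double b)
{
    return -(uint64_t)(a == b);
}

/* FACGE compares magnitudes: |b| <= |a|. */
static uint64_t acge64(double a, double b)
{
    double aa = a < 0 ? -a : a, ab = b < 0 ? -b : b;
    return -(uint64_t)(ab <= aa);
}

int main(void)
{
    printf("%016llx\n", (unsigned long long)ceq64(1.0, 1.0));    /* ffff... */
    printf("%016llx\n", (unsigned long long)ceq64(1.0, 2.0));    /* 0 */
    printf("%016llx\n", (unsigned long long)acge64(-3.0, 2.0));  /* ffff... */
    return 0;
}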
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This is part of a reorganization of the set of mmu_idx.
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
This emphasizes that they apply to the Secure EL1&0 regime.
5
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-13-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-27-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
7
---
12
target/arm/cpu.h | 8 ++++----
8
target/arm/helper.h | 1 +
13
target/arm/internals.h | 4 ++--
9
target/arm/tcg/a64.decode | 6 ++++
14
target/arm/translate.h | 2 +-
10
target/arm/tcg/translate-a64.c | 60 ++++++++++++++++++++++------------
15
target/arm/helper.c | 26 +++++++++++++-------------
11
target/arm/tcg/vec_helper.c | 6 ++++
16
target/arm/translate-a64.c | 4 ++--
12
4 files changed, 53 insertions(+), 20 deletions(-)
17
target/arm/translate.c | 6 +++---
18
6 files changed, 25 insertions(+), 25 deletions(-)
19
13
20
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
21
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/cpu.h
16
--- a/target/arm/helper.h
23
+++ b/target/arm/cpu.h
17
+++ b/target/arm/helper.h
24
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
25
ARMMMUIdx_E10_1 = 1 | ARM_MMU_IDX_A,
19
26
ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
20
DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
21
DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
- ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
22
+DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
- ARMMMUIdx_S1SE1 = 5 | ARM_MMU_IDX_A,
23
30
+ ARMMMUIdx_SE10_0 = 4 | ARM_MMU_IDX_A,
24
DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
+ ARMMMUIdx_SE10_1 = 5 | ARM_MMU_IDX_A,
25
DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
26
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
33
ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
27
index XXXXXXX..XXXXXXX 100644
34
ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
28
--- a/target/arm/tcg/a64.decode
35
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
29
+++ b/target/arm/tcg/a64.decode
36
ARMMMUIdxBit_E10_1 = 1 << 1,
30
@@ -XXX,XX +XXX,XX @@ FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd
37
ARMMMUIdxBit_S1E2 = 1 << 2,
31
FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h
38
ARMMMUIdxBit_S1E3 = 1 << 3,
32
FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
39
- ARMMMUIdxBit_S1SE0 = 1 << 4,
33
40
- ARMMMUIdxBit_S1SE1 = 1 << 5,
34
+FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h
41
+ ARMMMUIdxBit_SE10_0 = 1 << 4,
35
+FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd
42
+ ARMMMUIdxBit_SE10_1 = 1 << 5,
36
+
43
ARMMMUIdxBit_Stage2 = 1 << 6,
37
### Advanced SIMD three same
44
ARMMMUIdxBit_MUser = 1 << 0,
38
45
ARMMMUIdxBit_MPriv = 1 << 1,
39
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
46
diff --git a/target/arm/internals.h b/target/arm/internals.h
40
@@ -XXX,XX +XXX,XX @@ FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd
47
index XXXXXXX..XXXXXXX 100644
41
FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h
48
--- a/target/arm/internals.h
42
FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
49
+++ b/target/arm/internals.h
43
50
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
44
+FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h
51
case ARMMMUIdx_MUser:
45
+FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
52
return false;
46
+
53
case ARMMMUIdx_S1E3:
47
### Advanced SIMD scalar x indexed element
54
- case ARMMMUIdx_S1SE0:
48
55
- case ARMMMUIdx_S1SE1:
49
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
56
+ case ARMMMUIdx_SE10_0:
50
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
57
+ case ARMMMUIdx_SE10_1:
51
index XXXXXXX..XXXXXXX 100644
58
case ARMMMUIdx_MSPrivNegPri:
52
--- a/target/arm/tcg/translate-a64.c
59
case ARMMMUIdx_MSUserNegPri:
53
+++ b/target/arm/tcg/translate-a64.c
60
case ARMMMUIdx_MSPriv:
54
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_facgt = {
61
diff --git a/target/arm/translate.h b/target/arm/translate.h
55
};
62
index XXXXXXX..XXXXXXX 100644
56
TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt)
63
--- a/target/arm/translate.h
57
64
+++ b/target/arm/translate.h
58
+static void gen_fabd_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
65
@@ -XXX,XX +XXX,XX @@ static inline int default_exception_el(DisasContext *s)
59
+{
66
* exceptions can only be routed to ELs above 1, so we target the higher of
60
+ gen_helper_vfp_subh(d, n, m, s);
67
* 1 or the current EL.
61
+ gen_vfp_absh(d, d);
68
*/
62
+}
69
- return (s->mmu_idx == ARMMMUIdx_S1SE0 && s->secure_routed_to_el3)
63
+
70
+ return (s->mmu_idx == ARMMMUIdx_SE10_0 && s->secure_routed_to_el3)
64
+static void gen_fabd_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
71
? 3 : MAX(1, s->current_el);
65
+{
72
}
66
+ gen_helper_vfp_subs(d, n, m, s);
73
67
+ gen_vfp_abss(d, d);
74
diff --git a/target/arm/helper.c b/target/arm/helper.c
68
+}
75
index XXXXXXX..XXXXXXX 100644
69
+
76
--- a/target/arm/helper.c
70
+static void gen_fabd_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s)
77
+++ b/target/arm/helper.c
71
+{
78
@@ -XXX,XX +XXX,XX @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
72
+ gen_helper_vfp_subd(d, n, m, s);
79
mmu_idx = ARMMMUIdx_Stage1_E1;
73
+ gen_vfp_absd(d, d);
80
break;
74
+}
81
case 1:
75
+
82
- mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
76
+static const FPScalar f_scalar_fabd = {
83
+ mmu_idx = secure ? ARMMMUIdx_SE10_1 : ARMMMUIdx_Stage1_E1;
77
+ gen_fabd_h,
78
+ gen_fabd_s,
79
+ gen_fabd_d,
80
+};
81
+TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd)
82
+
83
static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
84
gen_helper_gvec_3_ptr * const fns[3])
85
{
86
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = {
87
};
88
TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt)
89
90
+static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = {
91
+ gen_helper_gvec_fabd_h,
92
+ gen_helper_gvec_fabd_s,
93
+ gen_helper_gvec_fabd_d,
94
+};
95
+TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd)
96
+
97
/*
98
* Advanced SIMD scalar/vector x indexed element
99
*/
100
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
101
case 0x3f: /* FRSQRTS */
102
gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
103
break;
104
- case 0x7a: /* FABD */
105
- gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
106
- gen_vfp_absd(tcg_res, tcg_res);
107
- break;
108
default:
109
case 0x18: /* FMAXNM */
110
case 0x19: /* FMLA */
111
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
112
case 0x5c: /* FCMGE */
113
case 0x5d: /* FACGE */
114
case 0x5f: /* FDIV */
115
+ case 0x7a: /* FABD */
116
case 0x7c: /* FCMGT */
117
case 0x7d: /* FACGT */
118
g_assert_not_reached();
119
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
120
case 0x3f: /* FRSQRTS */
121
gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
122
break;
123
- case 0x7a: /* FABD */
124
- gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
125
- gen_vfp_abss(tcg_res, tcg_res);
126
- break;
127
default:
128
case 0x18: /* FMAXNM */
129
case 0x19: /* FMLA */
130
@@ -XXX,XX +XXX,XX @@ static void handle_3same_float(DisasContext *s, int size, int elements,
131
case 0x5c: /* FCMGE */
132
case 0x5d: /* FACGE */
133
case 0x5f: /* FDIV */
134
+ case 0x7a: /* FABD */
135
case 0x7c: /* FCMGT */
136
case 0x7d: /* FACGT */
137
g_assert_not_reached();
138
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
139
switch (fpopcode) {
140
case 0x1f: /* FRECPS */
141
case 0x3f: /* FRSQRTS */
142
- case 0x7a: /* FABD */
84
break;
143
break;
85
default:
144
default:
86
g_assert_not_reached();
145
case 0x1b: /* FMULX */
87
@@ -XXX,XX +XXX,XX @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
146
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
88
/* stage 1 current state PL0: ATS1CUR, ATS1CUW */
147
case 0x7d: /* FACGT */
89
switch (el) {
148
case 0x1c: /* FCMEQ */
90
case 3:
149
case 0x5c: /* FCMGE */
91
- mmu_idx = ARMMMUIdx_S1SE0;
150
+ case 0x7a: /* FABD */
92
+ mmu_idx = ARMMMUIdx_SE10_0;
151
case 0x7c: /* FCMGT */
93
break;
152
unallocated_encoding(s);
94
case 2:
153
return;
95
mmu_idx = ARMMMUIdx_Stage1_E0;
154
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
96
break;
155
switch (fpopcode) {
97
case 1:
156
case 0x07: /* FRECPS */
98
- mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
157
case 0x0f: /* FRSQRTS */
99
+ mmu_idx = secure ? ARMMMUIdx_SE10_0 : ARMMMUIdx_Stage1_E0;
158
- case 0x1a: /* FABD */
100
break;
101
default:
102
g_assert_not_reached();
103
@@ -XXX,XX +XXX,XX @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
104
case 0:
105
switch (ri->opc1) {
106
case 0: /* AT S1E1R, AT S1E1W */
107
- mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
108
+ mmu_idx = secure ? ARMMMUIdx_SE10_1 : ARMMMUIdx_Stage1_E1;
109
break;
110
case 4: /* AT S1E2R, AT S1E2W */
111
mmu_idx = ARMMMUIdx_S1E2;
112
@@ -XXX,XX +XXX,XX @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
113
}
114
break;
115
case 2: /* AT S1E0R, AT S1E0W */
116
- mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
117
+ mmu_idx = secure ? ARMMMUIdx_SE10_0 : ARMMMUIdx_Stage1_E0;
118
break;
119
case 4: /* AT S12E1R, AT S12E1W */
120
- mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_E10_1;
121
+ mmu_idx = secure ? ARMMMUIdx_SE10_1 : ARMMMUIdx_E10_1;
122
break;
123
case 6: /* AT S12E0R, AT S12E0W */
124
- mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_E10_0;
125
+ mmu_idx = secure ? ARMMMUIdx_SE10_0 : ARMMMUIdx_E10_0;
126
break;
159
break;
127
default:
160
default:
161
case 0x03: /* FMULX */
162
case 0x04: /* FCMEQ (reg) */
163
case 0x14: /* FCMGE (reg) */
164
case 0x15: /* FACGE */
165
+ case 0x1a: /* FABD */
166
case 0x1c: /* FCMGT (reg) */
167
case 0x1d: /* FACGT */
168
unallocated_encoding(s);
169
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
170
case 0x0f: /* FRSQRTS */
171
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
172
break;
173
- case 0x1a: /* FABD */
174
- gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
175
- tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
176
- break;
177
default:
178
case 0x03: /* FMULX */
179
case 0x04: /* FCMEQ (reg) */
180
case 0x14: /* FCMGE (reg) */
181
case 0x15: /* FACGE */
182
+ case 0x1a: /* FABD */
183
case 0x1c: /* FCMGT (reg) */
184
case 0x1d: /* FACGT */
128
g_assert_not_reached();
185
g_assert_not_reached();
129
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
186
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
130
static int vae1_tlbmask(CPUARMState *env)
187
return;
131
{
188
case 0x1f: /* FRECPS */
132
if (arm_is_secure_below_el3(env)) {
189
case 0x3f: /* FRSQRTS */
133
- return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
190
- case 0x7a: /* FABD */
134
+ return ARMMMUIdxBit_SE10_1 | ARMMMUIdxBit_SE10_0;
191
if (!fp_access_check(s)) {
135
} else {
192
return;
136
return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0;
193
}
137
}
194
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
138
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
195
case 0x5c: /* FCMGE */
139
* stage 1 translations.
196
case 0x5d: /* FACGE */
140
*/
197
case 0x5f: /* FDIV */
141
if (arm_is_secure_below_el3(env)) {
198
+ case 0x7a: /* FABD */
142
- return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
199
case 0x7d: /* FACGT */
143
+ return ARMMMUIdxBit_SE10_1 | ARMMMUIdxBit_SE10_0;
200
case 0x7c: /* FCMGT */
144
} else if (arm_feature(env, ARM_FEATURE_EL2)) {
201
unallocated_encoding(s);
145
return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0 | ARMMMUIdxBit_Stage2;
202
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
146
} else {
203
switch (fpopcode) {
147
@@ -XXX,XX +XXX,XX @@ static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
204
case 0x7: /* FRECPS */
148
return 2;
205
case 0xf: /* FRSQRTS */
149
case ARMMMUIdx_S1E3:
206
- case 0x1a: /* FABD */
150
return 3;
207
pairwise = false;
151
- case ARMMMUIdx_S1SE0:
152
+ case ARMMMUIdx_SE10_0:
153
return arm_el_is_aa64(env, 3) ? 1 : 3;
154
- case ARMMMUIdx_S1SE1:
155
+ case ARMMMUIdx_SE10_1:
156
case ARMMMUIdx_Stage1_E0:
157
case ARMMMUIdx_Stage1_E1:
158
case ARMMMUIdx_MPrivNegPri:
159
@@ -XXX,XX +XXX,XX @@ bool arm_s1_regime_using_lpae_format(CPUARMState *env, ARMMMUIdx mmu_idx)
160
static inline bool regime_is_user(CPUARMState *env, ARMMMUIdx mmu_idx)
161
{
162
switch (mmu_idx) {
163
- case ARMMMUIdx_S1SE0:
164
+ case ARMMMUIdx_SE10_0:
165
case ARMMMUIdx_Stage1_E0:
166
case ARMMMUIdx_MUser:
167
case ARMMMUIdx_MSUser:
168
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
169
}
170
171
if (el < 2 && arm_is_secure_below_el3(env)) {
172
- return ARMMMUIdx_S1SE0 + el;
173
+ return ARMMMUIdx_SE10_0 + el;
174
} else {
175
return ARMMMUIdx_E10_0 + el;
176
}
177
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
178
index XXXXXXX..XXXXXXX 100644
179
--- a/target/arm/translate-a64.c
180
+++ b/target/arm/translate-a64.c
181
@@ -XXX,XX +XXX,XX @@ static inline int get_a64_user_mem_index(DisasContext *s)
182
case ARMMMUIdx_E10_1:
183
useridx = ARMMMUIdx_E10_0;
184
break;
208
break;
185
- case ARMMMUIdx_S1SE1:
209
case 0x10: /* FMAXNMP */
186
- useridx = ARMMMUIdx_S1SE0;
210
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
187
+ case ARMMMUIdx_SE10_1:
211
case 0x14: /* FCMGE */
188
+ useridx = ARMMMUIdx_SE10_0;
212
case 0x15: /* FACGE */
189
break;
213
case 0x17: /* FDIV */
190
case ARMMMUIdx_Stage2:
214
+ case 0x1a: /* FABD */
191
g_assert_not_reached();
215
case 0x1c: /* FCMGT */
192
diff --git a/target/arm/translate.c b/target/arm/translate.c
216
case 0x1d: /* FACGT */
193
index XXXXXXX..XXXXXXX 100644
217
unallocated_encoding(s);
194
--- a/target/arm/translate.c
218
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
195
+++ b/target/arm/translate.c
219
case 0xf: /* FRSQRTS */
196
@@ -XXX,XX +XXX,XX @@ static inline int get_a32_user_mem_index(DisasContext *s)
220
gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
197
case ARMMMUIdx_E10_1:
221
break;
198
return arm_to_core_mmu_idx(ARMMMUIdx_E10_0);
222
- case 0x1a: /* FABD */
199
case ARMMMUIdx_S1E3:
223
- gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
200
- case ARMMMUIdx_S1SE0:
224
- tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
201
- case ARMMMUIdx_S1SE1:
225
- break;
202
- return arm_to_core_mmu_idx(ARMMMUIdx_S1SE0);
226
default:
203
+ case ARMMMUIdx_SE10_0:
227
case 0x0: /* FMAXNM */
204
+ case ARMMMUIdx_SE10_1:
228
case 0x1: /* FMLA */
205
+ return arm_to_core_mmu_idx(ARMMMUIdx_SE10_0);
229
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
206
case ARMMMUIdx_MUser:
230
case 0x14: /* FCMGE */
207
case ARMMMUIdx_MPriv:
231
case 0x15: /* FACGE */
208
return arm_to_core_mmu_idx(ARMMMUIdx_MUser);
232
case 0x17: /* FDIV */
233
+ case 0x1a: /* FABD */
234
case 0x1c: /* FCMGT */
235
case 0x1d: /* FACGT */
236
g_assert_not_reached();
237
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
238
index XXXXXXX..XXXXXXX 100644
239
--- a/target/arm/tcg/vec_helper.c
240
+++ b/target/arm/tcg/vec_helper.c
241
@@ -XXX,XX +XXX,XX @@ static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
242
return float32_abs(float32_sub(op1, op2, stat));
243
}
244
245
+static float64 float64_abd(float64 op1, float64 op2, float_status *stat)
246
+{
247
+ return float64_abs(float64_sub(op1, op2, stat));
248
+}
249
+
250
/*
251
* Reciprocal step. These are the AArch32 version which uses a
252
* non-fused multiply-and-subtract.
253
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
254
255
DO_3OP(gvec_fabd_h, float16_abd, float16)
256
DO_3OP(gvec_fabd_s, float32_abd, float32)
257
+DO_3OP(gvec_fabd_d, float64_abd, float64)
258
259
DO_3OP(gvec_fceq_h, float16_ceq, float16)
260
DO_3OP(gvec_fceq_s, float32_ceq, float32)
209
--
261
--
210
2.20.1
262
2.34.1
211
212
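A standalone sketch of the FABD lowering this patch keeps using, a
rounded subtract followed by clearing the sign bit, mirroring the
0x7fff mask in the fp16 path above (illustration only, not QEMU code):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static double fabd64(double a, double b)
{
    double d = a - b;
    uint64_t bits;

    memcpy(&bits, &d, sizeof(bits));
    bits &= ~(1ULL << 63);            /* clear the sign bit */
    memcpy(&d, &bits, sizeof(d));
    return d;
}

int main(void)
{
    printf("%g\n", fabd64(1.5, 4.0));   /* 2.5 */
    printf("%g\n", fabd64(4.0, 1.5));   /* 2.5 */
    return 0;
}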
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This inline function has one user in cpu.c and need not be exposed
3
These are the last instructions within handle_3same_float
4
otherwise. Code movement only, with fixups for checkpatch.
4
and disas_simd_scalar_three_reg_same_fp16, so remove them.
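For context, a standalone sketch (plain C, illustration only) of the
step terms FRECPS and FRSQRTS compute, 2 - a*b and (3 - a*b)/2 with a
single fused rounding, which software uses to refine reciprocal and
reciprocal-square-root estimates:

#include <math.h>
#include <stdio.h>

static double frecps(double a, double b)
{
    return fma(-a, b, 2.0);           /* 2 - a*b */
}

static double frsqrts(double a, double b)
{
    return fma(-a, b, 3.0) / 2.0;     /* (3 - a*b) / 2 */
}

int main(void)
{
    /* One Newton-Raphson step for 1/x at x = 2.0: e' = e * (2 - x*e). */
    double x = 2.0, e = 0.4;
    e *= frecps(x, e);
    printf("1/2 ~ %g\n", e);          /* 0.48, converging to 0.5 */

    /* One step for 1/sqrt(x): r' = r * (3 - x*r*r) / 2. */
    double r = 0.7;
    r *= frsqrts(x * r, r);
    printf("1/sqrt(2) ~ %g\n", r);    /* ~0.7035, toward 0.70711 */
    return 0;
}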
5
5
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-39-richard.henderson@linaro.org
8
Message-id: 20240524232121.284515-28-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
10
---
12
target/arm/cpu.h | 111 -------------------------------------------
11
target/arm/tcg/a64.decode | 12 ++
13
target/arm/cpu.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++
12
target/arm/tcg/translate-a64.c | 293 ++++-----------------------------
14
2 files changed, 119 insertions(+), 111 deletions(-)
13
2 files changed, 46 insertions(+), 259 deletions(-)
15
14
16
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
17
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/cpu.h
17
--- a/target/arm/tcg/a64.decode
19
+++ b/target/arm/cpu.h
18
+++ b/target/arm/tcg/a64.decode
20
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
19
@@ -XXX,XX +XXX,XX @@ FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
21
#define ARM_CPUID_TI915T 0x54029152
20
FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h
22
#define ARM_CPUID_TI925T 0x54029252
21
FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd
23
22
24
-static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
23
+FRECPS_s 0101 1110 010 ..... 00111 1 ..... ..... @rrr_h
25
- unsigned int target_el)
24
+FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd
25
+
26
+FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h
27
+FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
28
+
29
### Advanced SIMD three same
30
31
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
32
@@ -XXX,XX +XXX,XX @@ FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
33
FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h
34
FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
35
36
+FRECPS_v 0.00 1110 010 ..... 00111 1 ..... ..... @qrrr_h
37
+FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
38
+
39
+FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h
40
+FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
41
+
42
### Advanced SIMD scalar x indexed element
43
44
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
45
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/tcg/translate-a64.c
48
+++ b/target/arm/tcg/translate-a64.c
49
@@ -XXX,XX +XXX,XX @@ static const FPScalar f_scalar_fabd = {
50
};
51
TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd)
52
53
+static const FPScalar f_scalar_frecps = {
54
+ gen_helper_recpsf_f16,
55
+ gen_helper_recpsf_f32,
56
+ gen_helper_recpsf_f64,
57
+};
58
+TRANS(FRECPS_s, do_fp3_scalar, a, &f_scalar_frecps)
59
+
60
+static const FPScalar f_scalar_frsqrts = {
61
+ gen_helper_rsqrtsf_f16,
62
+ gen_helper_rsqrtsf_f32,
63
+ gen_helper_rsqrtsf_f64,
64
+};
65
+TRANS(FRSQRTS_s, do_fp3_scalar, a, &f_scalar_frsqrts)
66
+
67
static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
68
gen_helper_gvec_3_ptr * const fns[3])
69
{
70
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = {
71
};
72
TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd)
73
74
+static gen_helper_gvec_3_ptr * const f_vector_frecps[3] = {
75
+ gen_helper_gvec_recps_h,
76
+ gen_helper_gvec_recps_s,
77
+ gen_helper_gvec_recps_d,
78
+};
79
+TRANS(FRECPS_v, do_fp3_vector, a, f_vector_frecps)
80
+
81
+static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = {
82
+ gen_helper_gvec_rsqrts_h,
83
+ gen_helper_gvec_rsqrts_s,
84
+ gen_helper_gvec_rsqrts_d,
85
+};
86
+TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts)
87
+
88
/*
89
* Advanced SIMD scalar/vector x indexed element
90
*/
91
@@ -XXX,XX +XXX,XX @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
92
}
93
}
94
95
-/* Handle the 3-same-operands float operations; shared by the scalar
96
- * and vector encodings. The caller must filter out any encodings
97
- * not allocated for the encoding it is dealing with.
98
- */
99
-static void handle_3same_float(DisasContext *s, int size, int elements,
100
- int fpopcode, int rd, int rn, int rm)
26
-{
101
-{
27
- CPUARMState *env = cs->env_ptr;
102
- int pass;
28
- unsigned int cur_el = arm_current_el(env);
103
- TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
29
- bool secure = arm_is_secure(env);
104
-
30
- bool pstate_unmasked;
105
- for (pass = 0; pass < elements; pass++) {
31
- int8_t unmasked = 0;
106
- if (size) {
32
- uint64_t hcr_el2;
107
- /* Double */
33
-
108
- TCGv_i64 tcg_op1 = tcg_temp_new_i64();
34
- /* Don't take exceptions if they target a lower EL.
109
- TCGv_i64 tcg_op2 = tcg_temp_new_i64();
35
- * This check should catch any exceptions that would not be taken but left
110
- TCGv_i64 tcg_res = tcg_temp_new_i64();
36
- * pending.
111
-
37
- */
112
- read_vec_element(s, tcg_op1, rn, pass, MO_64);
38
- if (cur_el > target_el) {
113
- read_vec_element(s, tcg_op2, rm, pass, MO_64);
39
- return false;
114
-
40
- }
115
- switch (fpopcode) {
41
-
116
- case 0x1f: /* FRECPS */
42
- hcr_el2 = arm_hcr_el2_eff(env);
117
- gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
43
-
118
- break;
44
- switch (excp_idx) {
119
- case 0x3f: /* FRSQRTS */
45
- case EXCP_FIQ:
120
- gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
46
- pstate_unmasked = !(env->daif & PSTATE_F);
47
- break;
48
-
49
- case EXCP_IRQ:
50
- pstate_unmasked = !(env->daif & PSTATE_I);
51
- break;
52
-
53
- case EXCP_VFIQ:
54
- if (secure || !(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
55
- /* VFIQs are only taken when hypervized and non-secure. */
56
- return false;
57
- }
58
- return !(env->daif & PSTATE_F);
59
- case EXCP_VIRQ:
60
- if (secure || !(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
61
- /* VIRQs are only taken when hypervized and non-secure. */
62
- return false;
63
- }
64
- return !(env->daif & PSTATE_I);
65
- default:
66
- g_assert_not_reached();
67
- }
68
-
69
- /* Use the target EL, current execution state and SCR/HCR settings to
70
- * determine whether the corresponding CPSR bit is used to mask the
71
- * interrupt.
72
- */
73
- if ((target_el > cur_el) && (target_el != 1)) {
74
- /* Exceptions targeting a higher EL may not be maskable */
75
- if (arm_feature(env, ARM_FEATURE_AARCH64)) {
76
- /* 64-bit masking rules are simple: exceptions to EL3
77
- * can't be masked, and exceptions to EL2 can only be
78
- * masked from Secure state. The HCR and SCR settings
79
- * don't affect the masking logic, only the interrupt routing.
80
- */
81
- if (target_el == 3 || !secure) {
82
- unmasked = 1;
83
- }
84
- } else {
85
- /* The old 32-bit-only environment has a more complicated
86
- * masking setup. HCR and SCR bits not only affect interrupt
87
- * routing but also change the behaviour of masking.
88
- */
89
- bool hcr, scr;
90
-
91
- switch (excp_idx) {
92
- case EXCP_FIQ:
93
- /* If FIQs are routed to EL3 or EL2 then there are cases where
94
- * we override the CPSR.F in determining if the exception is
95
- * masked or not. If neither of these are set then we fall back
96
- * to the CPSR.F setting otherwise we further assess the state
97
- * below.
98
- */
99
- hcr = hcr_el2 & HCR_FMO;
100
- scr = (env->cp15.scr_el3 & SCR_FIQ);
101
-
102
- /* When EL3 is 32-bit, the SCR.FW bit controls whether the
103
- * CPSR.F bit masks FIQ interrupts when taken in non-secure
104
- * state. If SCR.FW is set then FIQs can be masked by CPSR.F
105
- * when non-secure but only when FIQs are only routed to EL3.
106
- */
107
- scr = scr && !((env->cp15.scr_el3 & SCR_FW) && !hcr);
108
- break;
109
- case EXCP_IRQ:
110
- /* When EL3 execution state is 32-bit, if HCR.IMO is set then
111
- * we may override the CPSR.I masking when in non-secure state.
112
- * The SCR.IRQ setting has already been taken into consideration
113
- * when setting the target EL, so it does not have a further
114
- * affect here.
115
- */
116
- hcr = hcr_el2 & HCR_IMO;
117
- scr = false;
118
- break;
121
- break;
119
- default:
122
- default:
123
- case 0x18: /* FMAXNM */
124
- case 0x19: /* FMLA */
125
- case 0x1a: /* FADD */
126
- case 0x1b: /* FMULX */
127
- case 0x1c: /* FCMEQ */
128
- case 0x1e: /* FMAX */
129
- case 0x38: /* FMINNM */
130
- case 0x39: /* FMLS */
131
- case 0x3a: /* FSUB */
132
- case 0x3e: /* FMIN */
133
- case 0x5b: /* FMUL */
134
- case 0x5c: /* FCMGE */
135
- case 0x5d: /* FACGE */
136
- case 0x5f: /* FDIV */
137
- case 0x7a: /* FABD */
138
- case 0x7c: /* FCMGT */
139
- case 0x7d: /* FACGT */
120
- g_assert_not_reached();
140
- g_assert_not_reached();
121
- }
141
- }
122
-
142
-
123
- if ((scr || hcr) && !secure) {
143
- write_vec_element(s, tcg_res, rd, pass, MO_64);
124
- unmasked = 1;
144
- } else {
145
- /* Single */
146
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
147
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
148
- TCGv_i32 tcg_res = tcg_temp_new_i32();
149
-
150
- read_vec_element_i32(s, tcg_op1, rn, pass, MO_32);
151
- read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
152
-
153
- switch (fpopcode) {
154
- case 0x1f: /* FRECPS */
155
- gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
156
- break;
157
- case 0x3f: /* FRSQRTS */
158
- gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
159
- break;
160
- default:
161
- case 0x18: /* FMAXNM */
162
- case 0x19: /* FMLA */
163
- case 0x1a: /* FADD */
164
- case 0x1b: /* FMULX */
165
- case 0x1c: /* FCMEQ */
166
- case 0x1e: /* FMAX */
167
- case 0x38: /* FMINNM */
168
- case 0x39: /* FMLS */
169
- case 0x3a: /* FSUB */
170
- case 0x3e: /* FMIN */
171
- case 0x5b: /* FMUL */
172
- case 0x5c: /* FCMGE */
173
- case 0x5d: /* FACGE */
174
- case 0x5f: /* FDIV */
175
- case 0x7a: /* FABD */
176
- case 0x7c: /* FCMGT */
177
- case 0x7d: /* FACGT */
178
- g_assert_not_reached();
179
- }
180
-
181
- if (elements == 1) {
182
- /* scalar single so clear high part */
183
- TCGv_i64 tcg_tmp = tcg_temp_new_i64();
184
-
185
- tcg_gen_extu_i32_i64(tcg_tmp, tcg_res);
186
- write_vec_element(s, tcg_tmp, rd, pass, MO_64);
187
- } else {
188
- write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
125
- }
189
- }
126
- }
190
- }
127
- }
191
- }
128
-
192
-
129
- /* The PSTATE bits only mask the interrupt if we have not overridden the
193
- clear_vec_high(s, elements * (size ? 8 : 4) > 8, rd);
130
- * ability above.
131
- */
132
- return unmasked || pstate_unmasked;
133
-}
194
-}
134
-
195
-
135
#define ARM_CPU_TYPE_SUFFIX "-" TYPE_ARM_CPU
196
/* AdvSIMD scalar three same
136
#define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
197
* 31 30 29 28 24 23 22 21 20 16 15 11 10 9 5 4 0
137
#define CPU_RESOLVING_TYPE TYPE_ARM_CPU
198
* +-----+---+-----------+------+---+------+--------+---+------+------+
138
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
199
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
139
index XXXXXXX..XXXXXXX 100644
200
bool u = extract32(insn, 29, 1);
140
--- a/target/arm/cpu.c
201
TCGv_i64 tcg_rd;
141
+++ b/target/arm/cpu.c
202
142
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
203
- if (opcode >= 0x18) {
143
arm_rebuild_hflags(env);
204
- /* Floating point: U, size[1] and opcode indicate operation */
205
- int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
206
- switch (fpopcode) {
207
- case 0x1f: /* FRECPS */
208
- case 0x3f: /* FRSQRTS */
209
- break;
210
- default:
211
- case 0x1b: /* FMULX */
212
- case 0x5d: /* FACGE */
213
- case 0x7d: /* FACGT */
214
- case 0x1c: /* FCMEQ */
215
- case 0x5c: /* FCMGE */
216
- case 0x7a: /* FABD */
217
- case 0x7c: /* FCMGT */
218
- unallocated_encoding(s);
219
- return;
220
- }
221
-
222
- if (!fp_access_check(s)) {
223
- return;
224
- }
225
-
226
- handle_3same_float(s, extract32(size, 0, 1), 1, fpopcode, rd, rn, rm);
227
- return;
228
- }
229
-
230
switch (opcode) {
231
case 0x1: /* SQADD, UQADD */
232
case 0x5: /* SQSUB, UQSUB */
233
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
234
write_fp_dreg(s, rd, tcg_rd);
144
}
235
}
145
236
146
+static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
237
-/* AdvSIMD scalar three same FP16
147
+ unsigned int target_el)
238
- * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0
148
+{
239
- * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+
149
+ CPUARMState *env = cs->env_ptr;
240
- * | 0 1 | U | 1 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd |
150
+ unsigned int cur_el = arm_current_el(env);
241
- * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+
151
+ bool secure = arm_is_secure(env);
242
- * v: 0101 1110 0100 0000 0000 0100 0000 0000 => 5e400400
152
+ bool pstate_unmasked;
243
- * m: 1101 1111 0110 0000 1100 0100 0000 0000 => df60c400
153
+ int8_t unmasked = 0;
244
- */
154
+ uint64_t hcr_el2;
245
-static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
155
+
246
- uint32_t insn)
156
+ /*
247
-{
157
+ * Don't take exceptions if they target a lower EL.
248
- int rd = extract32(insn, 0, 5);
158
+ * This check should catch any exceptions that would not be taken
249
- int rn = extract32(insn, 5, 5);
159
+ * but left pending.
250
- int opcode = extract32(insn, 11, 3);
160
+ */
251
- int rm = extract32(insn, 16, 5);
161
+ if (cur_el > target_el) {
252
- bool u = extract32(insn, 29, 1);
162
+ return false;
253
- bool a = extract32(insn, 23, 1);
163
+ }
254
- int fpopcode = opcode | (a << 3) | (u << 4);
164
+
255
- TCGv_ptr fpst;
165
+ hcr_el2 = arm_hcr_el2_eff(env);
256
- TCGv_i32 tcg_op1;
166
+
257
- TCGv_i32 tcg_op2;
167
+ switch (excp_idx) {
258
- TCGv_i32 tcg_res;
168
+ case EXCP_FIQ:
259
-
169
+ pstate_unmasked = !(env->daif & PSTATE_F);
260
- switch (fpopcode) {
170
+ break;
261
- case 0x07: /* FRECPS */
171
+
262
- case 0x0f: /* FRSQRTS */
172
+ case EXCP_IRQ:
263
- break;
173
+ pstate_unmasked = !(env->daif & PSTATE_I);
264
- default:
174
+ break;
265
- case 0x03: /* FMULX */
175
+
266
- case 0x04: /* FCMEQ (reg) */
176
+ case EXCP_VFIQ:
267
- case 0x14: /* FCMGE (reg) */
177
+ if (secure || !(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
268
- case 0x15: /* FACGE */
178
+ /* VFIQs are only taken when hypervized and non-secure. */
269
- case 0x1a: /* FABD */
179
+ return false;
270
- case 0x1c: /* FCMGT (reg) */
180
+ }
271
- case 0x1d: /* FACGT */
181
+ return !(env->daif & PSTATE_F);
272
- unallocated_encoding(s);
182
+ case EXCP_VIRQ:
273
- return;
183
+ if (secure || !(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
274
- }
184
+ /* VIRQs are only taken when hypervized and non-secure. */
275
-
185
+ return false;
276
- if (!dc_isar_feature(aa64_fp16, s)) {
186
+ }
277
- unallocated_encoding(s);
187
+ return !(env->daif & PSTATE_I);
278
- }
188
+ default:
279
-
280
- if (!fp_access_check(s)) {
281
- return;
282
- }
283
-
284
- fpst = fpstatus_ptr(FPST_FPCR_F16);
285
-
286
- tcg_op1 = read_fp_hreg(s, rn);
287
- tcg_op2 = read_fp_hreg(s, rm);
288
- tcg_res = tcg_temp_new_i32();
289
-
290
- switch (fpopcode) {
291
- case 0x07: /* FRECPS */
292
- gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
293
- break;
294
- case 0x0f: /* FRSQRTS */
295
- gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
296
- break;
297
- default:
298
- case 0x03: /* FMULX */
299
- case 0x04: /* FCMEQ (reg) */
300
- case 0x14: /* FCMGE (reg) */
301
- case 0x15: /* FACGE */
302
- case 0x1a: /* FABD */
303
- case 0x1c: /* FCMGT (reg) */
304
- case 0x1d: /* FACGT */
305
- g_assert_not_reached();
306
- }
307
-
308
- write_fp_sreg(s, rd, tcg_res);
309
-}
310
-
311
/* AdvSIMD scalar three same extra
312
* 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0
313
* +-----+---+-----------+------+---+------+---+--------+---+----+----+
314
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
315
316
/* Pairwise op subgroup of C3.6.16.
317
*
318
- * This is called directly or via the handle_3same_float for float pairwise
319
+ * This is called directly for float pairwise
320
* operations where the opcode and size are calculated differently.
321
*/
322
static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
323
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
324
int rn = extract32(insn, 5, 5);
325
int rd = extract32(insn, 0, 5);
326
327
- int datasize = is_q ? 128 : 64;
328
- int esize = 32 << size;
329
- int elements = datasize / esize;
330
-
331
if (size == 1 && !is_q) {
332
unallocated_encoding(s);
333
return;
334
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
335
handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
336
rn, rm, rd);
337
return;
338
- case 0x1f: /* FRECPS */
339
- case 0x3f: /* FRSQRTS */
340
- if (!fp_access_check(s)) {
341
- return;
342
- }
343
- handle_3same_float(s, size, elements, fpopcode, rd, rn, rm);
344
- return;
345
346
case 0x1d: /* FMLAL */
347
case 0x3d: /* FMLSL */
348
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
349
case 0x1b: /* FMULX */
350
case 0x1c: /* FCMEQ */
351
case 0x1e: /* FMAX */
352
+ case 0x1f: /* FRECPS */
353
case 0x38: /* FMINNM */
354
case 0x39: /* FMLS */
355
case 0x3a: /* FSUB */
356
case 0x3e: /* FMIN */
357
+ case 0x3f: /* FRSQRTS */
358
case 0x5b: /* FMUL */
359
case 0x5c: /* FCMGE */
360
case 0x5d: /* FACGE */
361
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
362
* together indicate the operation.
363
*/
364
int fpopcode = opcode | (a << 3) | (u << 4);
365
- int datasize = is_q ? 128 : 64;
366
- int elements = datasize / 16;
367
bool pairwise;
368
TCGv_ptr fpst;
369
int pass;
370
371
switch (fpopcode) {
372
- case 0x7: /* FRECPS */
373
- case 0xf: /* FRSQRTS */
374
- pairwise = false;
375
- break;
376
case 0x10: /* FMAXNMP */
377
case 0x12: /* FADDP */
378
case 0x16: /* FMAXP */
379
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
380
case 0x3: /* FMULX */
381
case 0x4: /* FCMEQ */
382
case 0x6: /* FMAX */
383
+ case 0x7: /* FRECPS */
384
case 0x8: /* FMINNM */
385
case 0x9: /* FMLS */
386
case 0xa: /* FSUB */
387
case 0xe: /* FMIN */
388
+ case 0xf: /* FRSQRTS */
389
case 0x13: /* FMUL */
390
case 0x14: /* FCMGE */
391
case 0x15: /* FACGE */
392
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
393
write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16);
394
}
395
} else {
396
- for (pass = 0; pass < elements; pass++) {
397
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
398
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
399
- TCGv_i32 tcg_res = tcg_temp_new_i32();
400
-
401
- read_vec_element_i32(s, tcg_op1, rn, pass, MO_16);
402
- read_vec_element_i32(s, tcg_op2, rm, pass, MO_16);
403
-
404
- switch (fpopcode) {
405
- case 0x7: /* FRECPS */
406
- gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
407
- break;
408
- case 0xf: /* FRSQRTS */
409
- gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
410
- break;
411
- default:
412
- case 0x0: /* FMAXNM */
413
- case 0x1: /* FMLA */
414
- case 0x2: /* FADD */
415
- case 0x3: /* FMULX */
416
- case 0x4: /* FCMEQ */
417
- case 0x6: /* FMAX */
418
- case 0x8: /* FMINNM */
419
- case 0x9: /* FMLS */
420
- case 0xa: /* FSUB */
421
- case 0xe: /* FMIN */
422
- case 0x13: /* FMUL */
423
- case 0x14: /* FCMGE */
424
- case 0x15: /* FACGE */
425
- case 0x17: /* FDIV */
426
- case 0x1a: /* FABD */
427
- case 0x1c: /* FCMGT */
428
- case 0x1d: /* FACGT */
429
- g_assert_not_reached();
430
- }
431
-
432
- write_vec_element_i32(s, tcg_res, rd, pass, MO_16);
433
- }
189
+ g_assert_not_reached();
434
+ g_assert_not_reached();
190
+ }
435
}
191
+
436
192
+ /*
437
clear_vec_high(s, is_q, rd);
193
+ * Use the target EL, current execution state and SCR/HCR settings to
438
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
194
+ * determine whether the corresponding CPSR bit is used to mask the
439
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
195
+ * interrupt.
440
{ 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
196
+ */
441
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
197
+ if ((target_el > cur_el) && (target_el != 1)) {
442
- { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
198
+ /* Exceptions targeting a higher EL may not be maskable */
443
{ 0x00000000, 0x00000000, NULL }
199
+ if (arm_feature(env, ARM_FEATURE_AARCH64)) {
444
};
200
+ /*
445
201
+ * 64-bit masking rules are simple: exceptions to EL3
202
+ * can't be masked, and exceptions to EL2 can only be
203
+ * masked from Secure state. The HCR and SCR settings
204
+ * don't affect the masking logic, only the interrupt routing.
205
+ */
206
+ if (target_el == 3 || !secure) {
207
+ unmasked = 1;
208
+ }
209
+ } else {
210
+ /*
211
+ * The old 32-bit-only environment has a more complicated
212
+ * masking setup. HCR and SCR bits not only affect interrupt
213
+ * routing but also change the behaviour of masking.
214
+ */
215
+ bool hcr, scr;
216
+
217
+ switch (excp_idx) {
218
+ case EXCP_FIQ:
219
+ /*
220
+ * If FIQs are routed to EL3 or EL2 then there are cases where
221
+ * we override the CPSR.F in determining if the exception is
222
+ * masked or not. If neither of these are set then we fall back
223
+ * to the CPSR.F setting; otherwise we further assess the state
224
+ * below.
225
+ */
226
+ hcr = hcr_el2 & HCR_FMO;
227
+ scr = (env->cp15.scr_el3 & SCR_FIQ);
228
+
229
+ /*
230
+ * When EL3 is 32-bit, the SCR.FW bit controls whether the
231
+ * CPSR.F bit masks FIQ interrupts when taken in non-secure
232
+ * state. If SCR.FW is set then FIQs can be masked by CPSR.F
233
+ * when non-secure but only when FIQs are only routed to EL3.
234
+ */
235
+ scr = scr && !((env->cp15.scr_el3 & SCR_FW) && !hcr);
236
+ break;
237
+ case EXCP_IRQ:
238
+ /*
239
+ * When EL3 execution state is 32-bit, if HCR.IMO is set then
240
+ * we may override the CPSR.I masking when in non-secure state.
241
+ * The SCR.IRQ setting has already been taken into consideration
242
+ * when setting the target EL, so it does not have a further
243
+ * effect here.
244
+ */
245
+ hcr = hcr_el2 & HCR_IMO;
246
+ scr = false;
247
+ break;
248
+ default:
249
+ g_assert_not_reached();
250
+ }
251
+
252
+ if ((scr || hcr) && !secure) {
253
+ unmasked = 1;
254
+ }
255
+ }
256
+ }
257
+
258
+ /*
259
+ * The PSTATE bits only mask the interrupt if we have not overridden the
260
+ * ability above.
261
+ */
262
+ return unmasked || pstate_unmasked;
263
+}
264
+
265
bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
266
{
267
CPUClass *cc = CPU_GET_CLASS(cs);
268
--
446
--
269
2.20.1
447
2.34.1
270
271
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Tested-by: Alex Bennée <alex.bennee@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200206105448.4726-27-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-29-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
7
---
9
target/arm/helper.c | 102 +++++++++++++++++++++++++++++++++++---------
8
target/arm/helper.h | 4 ++
10
1 file changed, 81 insertions(+), 21 deletions(-)
9
target/arm/tcg/a64.decode | 12 +++++
10
target/arm/tcg/translate-a64.c | 87 ++++++++++++++++++++++++++--------
11
target/arm/tcg/vec_helper.c | 23 +++++++++
12
4 files changed, 105 insertions(+), 21 deletions(-)
11
13
12
diff --git a/target/arm/helper.c b/target/arm/helper.c
14
diff --git a/target/arm/helper.h b/target/arm/helper.h
13
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/helper.c
16
--- a/target/arm/helper.h
15
+++ b/target/arm/helper.c
17
+++ b/target/arm/helper.h
16
@@ -XXX,XX +XXX,XX @@ static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo *ri,
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_uclamp_s, TCG_CALL_NO_RWG,
17
* Writable only at the highest implemented exception level.
19
DEF_HELPER_FLAGS_5(gvec_uclamp_d, TCG_CALL_NO_RWG,
18
*/
20
void, ptr, ptr, ptr, ptr, i32)
19
int el = arm_current_el(env);
21
20
+ uint64_t hcr;
22
+DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
21
+ uint32_t cntkctl;
23
+DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
24
+DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
switch (el) {
25
+
24
case 0:
26
#ifdef TARGET_AARCH64
25
- if (!extract32(env->cp15.c14_cntkctl, 0, 2)) {
27
#include "tcg/helper-a64.h"
26
+ hcr = arm_hcr_el2_eff(env);
28
#include "tcg/helper-sve.h"
27
+ if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
29
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
28
+ cntkctl = env->cp15.cnthctl_el2;
30
index XXXXXXX..XXXXXXX 100644
29
+ } else {
31
--- a/target/arm/tcg/a64.decode
30
+ cntkctl = env->cp15.c14_cntkctl;
32
+++ b/target/arm/tcg/a64.decode
31
+ }
33
@@ -XXX,XX +XXX,XX @@
32
+ if (!extract32(cntkctl, 0, 2)) {
34
&ri rd imm
33
return CP_ACCESS_TRAP;
35
&rri_sf rd rn imm sf
34
}
36
&i imm
35
break;
37
+&rr_e rd rn esz
36
@@ -XXX,XX +XXX,XX @@ static CPAccessResult gt_counter_access(CPUARMState *env, int timeridx,
38
&rrr_e rd rn rm esz
37
{
39
&rrx_e rd rn rm idx esz
38
unsigned int cur_el = arm_current_el(env);
40
&qrr_e q rd rn esz
39
bool secure = arm_is_secure(env);
41
@@ -XXX,XX +XXX,XX @@
40
+ uint64_t hcr = arm_hcr_el2_eff(env);
42
&qrrx_e q rd rn rm idx esz
41
43
&qrrrr_e q rd rn rm ra esz
42
- /* CNT[PV]CT: not visible from PL0 if ELO[PV]CTEN is zero */
44
43
- if (cur_el == 0 &&
45
+@rr_h ........ ... ..... ...... rn:5 rd:5 &rr_e esz=1
44
- !extract32(env->cp15.c14_cntkctl, timeridx, 1)) {
46
+@rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd
45
- return CP_ACCESS_TRAP;
47
+
46
- }
48
@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
47
+ switch (cur_el) {
49
@rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
48
+ case 0:
50
@rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd
49
+ /* If HCR_EL2.<E2H,TGE> == '11': check CNTHCTL_EL2.EL0[PV]CTEN. */
51
@@ -XXX,XX +XXX,XX @@ FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd
50
+ if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
52
FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h
51
+ return (extract32(env->cp15.cnthctl_el2, timeridx, 1)
53
FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
52
+ ? CP_ACCESS_OK : CP_ACCESS_TRAP_EL2);
54
53
+ }
55
+### Advanced SIMD scalar pairwise
54
56
+
55
- if (arm_feature(env, ARM_FEATURE_EL2) &&
57
+FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h
56
- timeridx == GTIMER_PHYS && !secure && cur_el < 2 &&
58
+FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd
57
- !extract32(env->cp15.cnthctl_el2, 0, 1)) {
59
+
58
- return CP_ACCESS_TRAP_EL2;
60
### Advanced SIMD three same
59
+ /* CNT[PV]CT: not visible from PL0 if EL0[PV]CTEN is zero */
61
60
+ if (!extract32(env->cp15.c14_cntkctl, timeridx, 1)) {
62
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
61
+ return CP_ACCESS_TRAP;
63
@@ -XXX,XX +XXX,XX @@ FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd
62
+ }
64
FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h
63
+
65
FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
64
+ /* If HCR_EL2.<E2H,TGE> == '10': check CNTHCTL_EL2.EL1PCTEN. */
66
65
+ if (hcr & HCR_E2H) {
67
+FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h
66
+ if (timeridx == GTIMER_PHYS &&
68
+FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
67
+ !extract32(env->cp15.cnthctl_el2, 10, 1)) {
69
+
68
+ return CP_ACCESS_TRAP_EL2;
70
### Advanced SIMD scalar x indexed element
69
+ }
71
70
+ } else {
72
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
71
+ /* If HCR_EL2.<E2H> == 0: check CNTHCTL_EL2.EL1PCEN. */
73
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
72
+ if (arm_feature(env, ARM_FEATURE_EL2) &&
74
index XXXXXXX..XXXXXXX 100644
73
+ timeridx == GTIMER_PHYS && !secure &&
75
--- a/target/arm/tcg/translate-a64.c
74
+ !extract32(env->cp15.cnthctl_el2, 1, 1)) {
76
+++ b/target/arm/tcg/translate-a64.c
75
+ return CP_ACCESS_TRAP_EL2;
77
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = {
76
+ }
78
};
79
TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts)
80
81
+static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = {
82
+ gen_helper_gvec_faddp_h,
83
+ gen_helper_gvec_faddp_s,
84
+ gen_helper_gvec_faddp_d,
85
+};
86
+TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp)
87
+
88
/*
89
* Advanced SIMD scalar/vector x indexed element
90
*/
91
@@ -XXX,XX +XXX,XX @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
92
TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
93
TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
94
95
+/*
96
+ * Advanced SIMD scalar pairwise
97
+ */
98
+
99
+static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f)
100
+{
101
+ switch (a->esz) {
102
+ case MO_64:
103
+ if (fp_access_check(s)) {
104
+ TCGv_i64 t0 = tcg_temp_new_i64();
105
+ TCGv_i64 t1 = tcg_temp_new_i64();
106
+
107
+ read_vec_element(s, t0, a->rn, 0, MO_64);
108
+ read_vec_element(s, t1, a->rn, 1, MO_64);
109
+ f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
110
+ write_fp_dreg(s, a->rd, t0);
77
+ }
111
+ }
78
+ break;
112
+ break;
79
+
113
+ case MO_32:
80
+ case 1:
114
+ if (fp_access_check(s)) {
81
+ /* Check CNTHCTL_EL2.EL1PCTEN, which changes location based on E2H. */
115
+ TCGv_i32 t0 = tcg_temp_new_i32();
82
+ if (arm_feature(env, ARM_FEATURE_EL2) &&
116
+ TCGv_i32 t1 = tcg_temp_new_i32();
83
+ timeridx == GTIMER_PHYS && !secure &&
117
+
84
+ (hcr & HCR_E2H
118
+ read_vec_element_i32(s, t0, a->rn, 0, MO_32);
85
+ ? !extract32(env->cp15.cnthctl_el2, 10, 1)
119
+ read_vec_element_i32(s, t1, a->rn, 1, MO_32);
86
+ : !extract32(env->cp15.cnthctl_el2, 0, 1))) {
120
+ f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
87
+ return CP_ACCESS_TRAP_EL2;
121
+ write_fp_sreg(s, a->rd, t0);
88
+ }
122
+ }
89
+ break;
123
+ break;
90
}
124
+ case MO_16:
91
return CP_ACCESS_OK;
125
+ if (!dc_isar_feature(aa64_fp16, s)) {
92
}
126
+ return false;
93
@@ -XXX,XX +XXX,XX @@ static CPAccessResult gt_timer_access(CPUARMState *env, int timeridx,
94
{
95
unsigned int cur_el = arm_current_el(env);
96
bool secure = arm_is_secure(env);
97
+ uint64_t hcr = arm_hcr_el2_eff(env);
98
99
- /* CNT[PV]_CVAL, CNT[PV]_CTL, CNT[PV]_TVAL: not visible from PL0 if
100
- * EL0[PV]TEN is zero.
101
- */
102
- if (cur_el == 0 &&
103
- !extract32(env->cp15.c14_cntkctl, 9 - timeridx, 1)) {
104
- return CP_ACCESS_TRAP;
105
- }
106
+ switch (cur_el) {
107
+ case 0:
108
+ if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
109
+ /* If HCR_EL2.<E2H,TGE> == '11': check CNTHCTL_EL2.EL0[PV]TEN. */
110
+ return (extract32(env->cp15.cnthctl_el2, 9 - timeridx, 1)
111
+ ? CP_ACCESS_OK : CP_ACCESS_TRAP_EL2);
112
+ }
127
+ }
113
128
+ if (fp_access_check(s)) {
114
- if (arm_feature(env, ARM_FEATURE_EL2) &&
129
+ TCGv_i32 t0 = tcg_temp_new_i32();
115
- timeridx == GTIMER_PHYS && !secure && cur_el < 2 &&
130
+ TCGv_i32 t1 = tcg_temp_new_i32();
116
- !extract32(env->cp15.cnthctl_el2, 1, 1)) {
131
+
117
- return CP_ACCESS_TRAP_EL2;
132
+ read_vec_element_i32(s, t0, a->rn, 0, MO_16);
118
+ /*
133
+ read_vec_element_i32(s, t1, a->rn, 1, MO_16);
119
+ * CNT[PV]_CVAL, CNT[PV]_CTL, CNT[PV]_TVAL: not visible from
134
+ f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
120
+ * EL0 if EL0[PV]TEN is zero.
135
+ write_fp_sreg(s, a->rd, t0);
121
+ */
122
+ if (!extract32(env->cp15.c14_cntkctl, 9 - timeridx, 1)) {
123
+ return CP_ACCESS_TRAP;
124
+ }
125
+ /* fall through */
126
+
127
+ case 1:
128
+ if (arm_feature(env, ARM_FEATURE_EL2) &&
129
+ timeridx == GTIMER_PHYS && !secure) {
130
+ if (hcr & HCR_E2H) {
131
+ /* If HCR_EL2.<E2H,TGE> == '10': check CNTHCTL_EL2.EL1PTEN. */
132
+ if (!extract32(env->cp15.cnthctl_el2, 11, 1)) {
133
+ return CP_ACCESS_TRAP_EL2;
134
+ }
135
+ } else {
136
+ /* If HCR_EL2.<E2H> == 0: check CNTHCTL_EL2.EL1PCEN. */
137
+ if (!extract32(env->cp15.cnthctl_el2, 1, 1)) {
138
+ return CP_ACCESS_TRAP_EL2;
139
+ }
140
+ }
141
+ }
136
+ }
142
+ break;
137
+ break;
138
+ default:
139
+ g_assert_not_reached();
140
+ }
141
+ return true;
142
+}
143
+
144
+TRANS(FADDP_s, do_fp3_scalar_pair, a, &f_scalar_fadd)
145
146
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
147
* Note that it is the caller's responsibility to ensure that the
148
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
149
fpst = NULL;
150
break;
151
case 0xc: /* FMAXNMP */
152
- case 0xd: /* FADDP */
153
case 0xf: /* FMAXP */
154
case 0x2c: /* FMINNMP */
155
case 0x2f: /* FMINP */
156
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
157
fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
158
break;
159
default:
160
+ case 0xd: /* FADDP */
161
unallocated_encoding(s);
162
return;
143
}
163
}
144
return CP_ACCESS_OK;
164
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
145
}
165
case 0xc: /* FMAXNMP */
166
gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
167
break;
168
- case 0xd: /* FADDP */
169
- gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
170
- break;
171
case 0xf: /* FMAXP */
172
gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
173
break;
174
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
175
gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
176
break;
177
default:
178
+ case 0xd: /* FADDP */
179
g_assert_not_reached();
180
}
181
182
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
183
case 0xc: /* FMAXNMP */
184
gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
185
break;
186
- case 0xd: /* FADDP */
187
- gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
188
- break;
189
case 0xf: /* FMAXP */
190
gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
191
break;
192
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
193
gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
194
break;
195
default:
196
+ case 0xd: /* FADDP */
197
g_assert_not_reached();
198
}
199
} else {
200
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
201
case 0xc: /* FMAXNMP */
202
gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
203
break;
204
- case 0xd: /* FADDP */
205
- gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
206
- break;
207
case 0xf: /* FMAXP */
208
gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
209
break;
210
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
211
gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
212
break;
213
default:
214
+ case 0xd: /* FADDP */
215
g_assert_not_reached();
216
}
217
}
218
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
219
case 0x58: /* FMAXNMP */
220
gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
221
break;
222
- case 0x5a: /* FADDP */
223
- gen_helper_vfp_addd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
224
- break;
225
case 0x5e: /* FMAXP */
226
gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
227
break;
228
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
229
gen_helper_vfp_mind(tcg_res[pass], tcg_op1, tcg_op2, fpst);
230
break;
231
default:
232
+ case 0x5a: /* FADDP */
233
g_assert_not_reached();
234
}
235
}
236
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
237
case 0x58: /* FMAXNMP */
238
gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
239
break;
240
- case 0x5a: /* FADDP */
241
- gen_helper_vfp_adds(tcg_res[pass], tcg_op1, tcg_op2, fpst);
242
- break;
243
case 0x5e: /* FMAXP */
244
gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst);
245
break;
246
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
247
gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst);
248
break;
249
default:
250
+ case 0x5a: /* FADDP */
251
g_assert_not_reached();
252
}
253
254
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
255
256
switch (fpopcode) {
257
case 0x58: /* FMAXNMP */
258
- case 0x5a: /* FADDP */
259
case 0x5e: /* FMAXP */
260
case 0x78: /* FMINNMP */
261
case 0x7e: /* FMINP */
262
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
263
case 0x3a: /* FSUB */
264
case 0x3e: /* FMIN */
265
case 0x3f: /* FRSQRTS */
266
+ case 0x5a: /* FADDP */
267
case 0x5b: /* FMUL */
268
case 0x5c: /* FCMGE */
269
case 0x5d: /* FACGE */
270
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
271
272
switch (fpopcode) {
273
case 0x10: /* FMAXNMP */
274
- case 0x12: /* FADDP */
275
case 0x16: /* FMAXP */
276
case 0x18: /* FMINNMP */
277
case 0x1e: /* FMINP */
278
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
279
case 0xa: /* FSUB */
280
case 0xe: /* FMIN */
281
case 0xf: /* FRSQRTS */
282
+ case 0x12: /* FADDP */
283
case 0x13: /* FMUL */
284
case 0x14: /* FCMGE */
285
case 0x15: /* FACGE */
286
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
287
gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2,
288
fpst);
289
break;
290
- case 0x12: /* FADDP */
291
- gen_helper_advsimd_addh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
292
- break;
293
case 0x16: /* FMAXP */
294
gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
295
break;
296
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
297
gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
298
break;
299
default:
300
+ case 0x12: /* FADDP */
301
g_assert_not_reached();
302
}
303
}
304
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
305
index XXXXXXX..XXXXXXX 100644
306
--- a/target/arm/tcg/vec_helper.c
307
+++ b/target/arm/tcg/vec_helper.c
308
@@ -XXX,XX +XXX,XX @@ DO_NEON_PAIRWISE(neon_pmin, min)
309
310
#undef DO_NEON_PAIRWISE
311
312
+#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
313
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
314
+{ \
315
+ ARMVectorReg scratch; \
316
+ intptr_t oprsz = simd_oprsz(desc); \
317
+ intptr_t half = oprsz / sizeof(TYPE) / 2; \
318
+ TYPE *d = vd, *n = vn, *m = vm; \
319
+ if (unlikely(d == m)) { \
320
+ m = memcpy(&scratch, m, oprsz); \
321
+ } \
322
+ for (intptr_t i = 0; i < half; ++i) { \
323
+ d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)], stat); \
324
+ } \
325
+ for (intptr_t i = 0; i < half; ++i) { \
326
+ d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)], stat); \
327
+ } \
328
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
329
+}
330
+
331
+DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2)
332
+DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4)
333
+DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, )
334
+
335
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
336
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
337
{ \
146
--
338
--
147
2.20.1
339
2.34.1
148
149
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
This is part of a reorganization of the set of mmu_idx.
3
These are the last instructions within disas_simd_three_reg_same_fp16,
4
This emphasizes that they apply to the EL1&0 regime.
4
so remove it.
5
5
6
The ultimate goal is
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
8
-- Non-secure regimes:
9
ARMMMUIdx_E10_0,
10
ARMMMUIdx_E20_0,
11
ARMMMUIdx_E10_1,
12
ARMMMUIdx_E2,
13
ARMMMUIdx_E20_2,
14
15
-- Secure regimes:
16
ARMMMUIdx_SE10_0,
17
ARMMMUIdx_SE10_1,
18
ARMMMUIdx_SE3,
19
20
-- Helper mmu_idx for non-secure EL1&0 stage1 and stage2
21
ARMMMUIdx_Stage2,
22
ARMMMUIdx_Stage1_E0,
23
ARMMMUIdx_Stage1_E1,
24
25
The 'S' prefix is reserved for "Secure". Unless otherwise specified,
26
each mmu_idx represents all stages of translation.
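
(Sketched as a C enum for orientation: the names come from the list
above, but the grouping comments and the ordering shown here are
illustrative only, not the literal enum body introduced by the series.)

typedef enum ARMMMUIdx {
    /* Non-secure regimes */
    ARMMMUIdx_E10_0,     /* EL1&0 regime, EL0 */
    ARMMMUIdx_E20_0,     /* EL2&0 regime, EL0 */
    ARMMMUIdx_E10_1,     /* EL1&0 regime, EL1 */
    ARMMMUIdx_E2,        /* EL2 */
    ARMMMUIdx_E20_2,     /* EL2&0 regime, EL2 */
    /* Secure regimes */
    ARMMMUIdx_SE10_0,
    ARMMMUIdx_SE10_1,
    ARMMMUIdx_SE3,
    /* Helper mmu_idx for non-secure EL1&0 stage1 and stage2 */
    ARMMMUIdx_Stage2,
    ARMMMUIdx_Stage1_E0,
    ARMMMUIdx_Stage1_E1,
} ARMMMUIdx;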
27
28
Tested-by: Alex Bennée <alex.bennee@linaro.org>
29
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
30
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
31
Message-id: 20200206105448.4726-10-richard.henderson@linaro.org
8
Message-id: 20240524232121.284515-30-richard.henderson@linaro.org
32
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
33
---
10
---
34
target/arm/cpu.h | 8 ++++----
11
target/arm/helper.h | 16 ++
35
target/arm/internals.h | 4 ++--
12
target/arm/tcg/a64.decode | 24 +++
36
target/arm/helper.c | 40 +++++++++++++++++++-------------------
13
target/arm/tcg/translate-a64.c | 296 ++++++---------------------------
37
target/arm/translate-a64.c | 4 ++--
14
target/arm/tcg/vec_helper.c | 16 ++
38
target/arm/translate.c | 6 +++---
15
4 files changed, 107 insertions(+), 245 deletions(-)
39
5 files changed, 31 insertions(+), 31 deletions(-)
40
16
41
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
42
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/cpu.h
19
--- a/target/arm/helper.h
44
+++ b/target/arm/cpu.h
20
+++ b/target/arm/helper.h
45
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
46
#define ARM_MMU_IDX_COREIDX_MASK 0x7
22
DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
47
23
DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
typedef enum ARMMMUIdx {
24
49
- ARMMMUIdx_S12NSE0 = 0 | ARM_MMU_IDX_A,
25
+DEF_HELPER_FLAGS_5(gvec_fmaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
50
- ARMMMUIdx_S12NSE1 = 1 | ARM_MMU_IDX_A,
26
+DEF_HELPER_FLAGS_5(gvec_fmaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
51
+ ARMMMUIdx_E10_0 = 0 | ARM_MMU_IDX_A,
27
+DEF_HELPER_FLAGS_5(gvec_fmaxp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
52
+ ARMMMUIdx_E10_1 = 1 | ARM_MMU_IDX_A,
28
+
53
ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
29
+DEF_HELPER_FLAGS_5(gvec_fminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
54
ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
30
+DEF_HELPER_FLAGS_5(gvec_fminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
55
ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
31
+DEF_HELPER_FLAGS_5(gvec_fminp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
56
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
32
+
57
* for use when calling tlb_flush_by_mmuidx() and friends.
33
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_5(gvec_fmaxnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
40
+
41
#ifdef TARGET_AARCH64
42
#include "tcg/helper-a64.h"
43
#include "tcg/helper-sve.h"
44
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
45
index XXXXXXX..XXXXXXX 100644
46
--- a/target/arm/tcg/a64.decode
47
+++ b/target/arm/tcg/a64.decode
48
@@ -XXX,XX +XXX,XX @@ FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd
49
FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h
50
FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd
51
52
+FMAXP_s 0101 1110 0011 0000 1111 10 ..... ..... @rr_h
53
+FMAXP_s 0111 1110 0.11 0000 1111 10 ..... ..... @rr_sd
54
+
55
+FMINP_s 0101 1110 1011 0000 1111 10 ..... ..... @rr_h
56
+FMINP_s 0111 1110 1.11 0000 1111 10 ..... ..... @rr_sd
57
+
58
+FMAXNMP_s 0101 1110 0011 0000 1100 10 ..... ..... @rr_h
59
+FMAXNMP_s 0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd
60
+
61
+FMINNMP_s 0101 1110 1011 0000 1100 10 ..... ..... @rr_h
62
+FMINNMP_s 0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd
63
+
64
### Advanced SIMD three same
65
66
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
67
@@ -XXX,XX +XXX,XX @@ FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd
68
FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h
69
FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd
70
71
+FMAXP_v 0.10 1110 010 ..... 00110 1 ..... ..... @qrrr_h
72
+FMAXP_v 0.10 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd
73
+
74
+FMINP_v 0.10 1110 110 ..... 00110 1 ..... ..... @qrrr_h
75
+FMINP_v 0.10 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd
76
+
77
+FMAXNMP_v 0.10 1110 010 ..... 00000 1 ..... ..... @qrrr_h
78
+FMAXNMP_v 0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
79
+
80
+FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
81
+FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
82
+
83
### Advanced SIMD scalar x indexed element
84
85
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
86
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/tcg/translate-a64.c
89
+++ b/target/arm/tcg/translate-a64.c
90
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = {
91
};
92
TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp)
93
94
+static gen_helper_gvec_3_ptr * const f_vector_fmaxp[3] = {
95
+ gen_helper_gvec_fmaxp_h,
96
+ gen_helper_gvec_fmaxp_s,
97
+ gen_helper_gvec_fmaxp_d,
98
+};
99
+TRANS(FMAXP_v, do_fp3_vector, a, f_vector_fmaxp)
100
+
101
+static gen_helper_gvec_3_ptr * const f_vector_fminp[3] = {
102
+ gen_helper_gvec_fminp_h,
103
+ gen_helper_gvec_fminp_s,
104
+ gen_helper_gvec_fminp_d,
105
+};
106
+TRANS(FMINP_v, do_fp3_vector, a, f_vector_fminp)
107
+
108
+static gen_helper_gvec_3_ptr * const f_vector_fmaxnmp[3] = {
109
+ gen_helper_gvec_fmaxnump_h,
110
+ gen_helper_gvec_fmaxnump_s,
111
+ gen_helper_gvec_fmaxnump_d,
112
+};
113
+TRANS(FMAXNMP_v, do_fp3_vector, a, f_vector_fmaxnmp)
114
+
115
+static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
116
+ gen_helper_gvec_fminnump_h,
117
+ gen_helper_gvec_fminnump_s,
118
+ gen_helper_gvec_fminnump_d,
119
+};
120
+TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
121
+
122
/*
123
* Advanced SIMD scalar/vector x indexed element
58
*/
124
*/
59
typedef enum ARMMMUIdxBit {
125
@@ -XXX,XX +XXX,XX @@ static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f)
60
- ARMMMUIdxBit_S12NSE0 = 1 << 0,
61
- ARMMMUIdxBit_S12NSE1 = 1 << 1,
62
+ ARMMMUIdxBit_E10_0 = 1 << 0,
63
+ ARMMMUIdxBit_E10_1 = 1 << 1,
64
ARMMMUIdxBit_S1E2 = 1 << 2,
65
ARMMMUIdxBit_S1E3 = 1 << 3,
66
ARMMMUIdxBit_S1SE0 = 1 << 4,
67
diff --git a/target/arm/internals.h b/target/arm/internals.h
68
index XXXXXXX..XXXXXXX 100644
69
--- a/target/arm/internals.h
70
+++ b/target/arm/internals.h
71
@@ -XXX,XX +XXX,XX @@ static inline void arm_call_el_change_hook(ARMCPU *cpu)
72
static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
73
{
74
switch (mmu_idx) {
75
- case ARMMMUIdx_S12NSE0:
76
- case ARMMMUIdx_S12NSE1:
77
+ case ARMMMUIdx_E10_0:
78
+ case ARMMMUIdx_E10_1:
79
case ARMMMUIdx_S1NSE0:
80
case ARMMMUIdx_S1NSE1:
81
case ARMMMUIdx_S1E2:
82
diff --git a/target/arm/helper.c b/target/arm/helper.c
83
index XXXXXXX..XXXXXXX 100644
84
--- a/target/arm/helper.c
85
+++ b/target/arm/helper.c
86
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
87
CPUState *cs = env_cpu(env);
88
89
tlb_flush_by_mmuidx(cs,
90
- ARMMMUIdxBit_S12NSE1 |
91
- ARMMMUIdxBit_S12NSE0 |
92
+ ARMMMUIdxBit_E10_1 |
93
+ ARMMMUIdxBit_E10_0 |
94
ARMMMUIdxBit_S2NS);
95
}
126
}
96
127
97
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
128
TRANS(FADDP_s, do_fp3_scalar_pair, a, &f_scalar_fadd)
98
CPUState *cs = env_cpu(env);
129
+TRANS(FMAXP_s, do_fp3_scalar_pair, a, &f_scalar_fmax)
99
130
+TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin)
100
tlb_flush_by_mmuidx_all_cpus_synced(cs,
131
+TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm)
101
- ARMMMUIdxBit_S12NSE1 |
132
+TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm)
102
- ARMMMUIdxBit_S12NSE0 |
133
103
+ ARMMMUIdxBit_E10_1 |
134
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
104
+ ARMMMUIdxBit_E10_0 |
135
* Note that it is the caller's responsibility to ensure that the
105
ARMMMUIdxBit_S2NS);
136
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
106
}
137
int opcode = extract32(insn, 12, 5);
107
138
int rn = extract32(insn, 5, 5);
108
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
139
int rd = extract32(insn, 0, 5);
109
format64 = arm_s1_regime_using_lpae_format(env, mmu_idx);
140
- TCGv_ptr fpst;
110
141
111
if (arm_feature(env, ARM_FEATURE_EL2)) {
142
/* For some ops (the FP ones), size[1] is part of the encoding.
112
- if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
143
* For ADDP strictly it is not but size[1] is always 1 for valid
113
+ if (mmu_idx == ARMMMUIdx_E10_0 || mmu_idx == ARMMMUIdx_E10_1) {
144
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
114
format64 |= env->cp15.hcr_el2 & (HCR_VM | HCR_DC);
145
if (!fp_access_check(s)) {
115
} else {
146
return;
116
format64 |= arm_current_el(env) == 2;
147
}
117
@@ -XXX,XX +XXX,XX @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
148
-
149
- fpst = NULL;
118
break;
150
break;
119
case 4:
151
+ default:
120
/* stage 1+2 NonSecure PL1: ATS12NSOPR, ATS12NSOPW */
152
case 0xc: /* FMAXNMP */
121
- mmu_idx = ARMMMUIdx_S12NSE1;
153
+ case 0xd: /* FADDP */
122
+ mmu_idx = ARMMMUIdx_E10_1;
154
case 0xf: /* FMAXP */
123
break;
155
case 0x2c: /* FMINNMP */
124
case 6:
156
case 0x2f: /* FMINP */
125
/* stage 1+2 NonSecure PL0: ATS12NSOUR, ATS12NSOUW */
157
- /* FP op, size[0] is 32 or 64 bit*/
126
- mmu_idx = ARMMMUIdx_S12NSE0;
158
- if (!u) {
127
+ mmu_idx = ARMMMUIdx_E10_0;
159
- if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) {
128
break;
160
- unallocated_encoding(s);
129
default:
161
- return;
130
g_assert_not_reached();
162
- } else {
131
@@ -XXX,XX +XXX,XX @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
163
- size = MO_16;
132
mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
164
- }
133
break;
165
- } else {
134
case 4: /* AT S12E1R, AT S12E1W */
166
- size = extract32(size, 0, 1) ? MO_64 : MO_32;
135
- mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S12NSE1;
167
- }
136
+ mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_E10_1;
168
-
137
break;
169
- if (!fp_access_check(s)) {
138
case 6: /* AT S12E0R, AT S12E0W */
170
- return;
139
- mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S12NSE0;
171
- }
140
+ mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_E10_0;
172
-
141
break;
173
- fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
142
default:
174
- break;
143
g_assert_not_reached();
175
- default:
144
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
176
- case 0xd: /* FADDP */
145
/* Accesses to VTTBR may change the VMID so we must flush the TLB. */
177
unallocated_encoding(s);
146
if (raw_read(env, ri) != value) {
178
return;
147
tlb_flush_by_mmuidx(cs,
148
- ARMMMUIdxBit_S12NSE1 |
149
- ARMMMUIdxBit_S12NSE0 |
150
+ ARMMMUIdxBit_E10_1 |
151
+ ARMMMUIdxBit_E10_0 |
152
ARMMMUIdxBit_S2NS);
153
raw_write(env, ri, value);
154
}
179
}
155
@@ -XXX,XX +XXX,XX @@ static int vae1_tlbmask(CPUARMState *env)
180
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
156
if (arm_is_secure_below_el3(env)) {
181
case 0x3b: /* ADDP */
157
return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
182
tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2);
183
break;
184
- case 0xc: /* FMAXNMP */
185
- gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
186
- break;
187
- case 0xf: /* FMAXP */
188
- gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
189
- break;
190
- case 0x2c: /* FMINNMP */
191
- gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
192
- break;
193
- case 0x2f: /* FMINP */
194
- gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
195
- break;
196
default:
197
+ case 0xc: /* FMAXNMP */
198
case 0xd: /* FADDP */
199
+ case 0xf: /* FMAXP */
200
+ case 0x2c: /* FMINNMP */
201
+ case 0x2f: /* FMINP */
202
g_assert_not_reached();
203
}
204
205
write_fp_dreg(s, rd, tcg_res);
158
} else {
206
} else {
159
- return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
207
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
160
+ return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0;
208
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
209
- TCGv_i32 tcg_res = tcg_temp_new_i32();
210
-
211
- read_vec_element_i32(s, tcg_op1, rn, 0, size);
212
- read_vec_element_i32(s, tcg_op2, rn, 1, size);
213
-
214
- if (size == MO_16) {
215
- switch (opcode) {
216
- case 0xc: /* FMAXNMP */
217
- gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
218
- break;
219
- case 0xf: /* FMAXP */
220
- gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
221
- break;
222
- case 0x2c: /* FMINNMP */
223
- gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
224
- break;
225
- case 0x2f: /* FMINP */
226
- gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
227
- break;
228
- default:
229
- case 0xd: /* FADDP */
230
- g_assert_not_reached();
231
- }
232
- } else {
233
- switch (opcode) {
234
- case 0xc: /* FMAXNMP */
235
- gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
236
- break;
237
- case 0xf: /* FMAXP */
238
- gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
239
- break;
240
- case 0x2c: /* FMINNMP */
241
- gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
242
- break;
243
- case 0x2f: /* FMINP */
244
- gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
245
- break;
246
- default:
247
- case 0xd: /* FADDP */
248
- g_assert_not_reached();
249
- }
250
- }
251
-
252
- write_fp_sreg(s, rd, tcg_res);
253
+ g_assert_not_reached();
161
}
254
}
162
}
255
}
163
256
164
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
257
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
165
if (arm_is_secure_below_el3(env)) {
258
static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
166
return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
259
int size, int rn, int rm, int rd)
167
} else if (arm_feature(env, ARM_FEATURE_EL2)) {
260
{
168
- return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 | ARMMMUIdxBit_S2NS;
261
- TCGv_ptr fpst;
169
+ return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0 | ARMMMUIdxBit_S2NS;
262
int pass;
170
} else {
263
171
- return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
264
- /* Floating point operations need fpst */
172
+ return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0;
265
- if (opcode >= 0x58) {
266
- fpst = fpstatus_ptr(FPST_FPCR);
267
- } else {
268
- fpst = NULL;
269
- }
270
-
271
if (!fp_access_check(s)) {
272
return;
273
}
274
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
275
case 0x17: /* ADDP */
276
tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2);
277
break;
278
- case 0x58: /* FMAXNMP */
279
- gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
280
- break;
281
- case 0x5e: /* FMAXP */
282
- gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
283
- break;
284
- case 0x78: /* FMINNMP */
285
- gen_helper_vfp_minnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst);
286
- break;
287
- case 0x7e: /* FMINP */
288
- gen_helper_vfp_mind(tcg_res[pass], tcg_op1, tcg_op2, fpst);
289
- break;
290
default:
291
+ case 0x58: /* FMAXNMP */
292
case 0x5a: /* FADDP */
293
+ case 0x5e: /* FMAXP */
294
+ case 0x78: /* FMINNMP */
295
+ case 0x7e: /* FMINP */
296
g_assert_not_reached();
297
}
298
}
299
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
300
genfn = fns[size][u];
301
break;
302
}
303
- /* The FP operations are all on single floats (32 bit) */
304
- case 0x58: /* FMAXNMP */
305
- gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
306
- break;
307
- case 0x5e: /* FMAXP */
308
- gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst);
309
- break;
310
- case 0x78: /* FMINNMP */
311
- gen_helper_vfp_minnums(tcg_res[pass], tcg_op1, tcg_op2, fpst);
312
- break;
313
- case 0x7e: /* FMINP */
314
- gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst);
315
- break;
316
default:
317
+ case 0x58: /* FMAXNMP */
318
case 0x5a: /* FADDP */
319
+ case 0x5e: /* FMAXP */
320
+ case 0x78: /* FMINNMP */
321
+ case 0x7e: /* FMINP */
322
g_assert_not_reached();
323
}
324
325
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
326
}
327
328
switch (fpopcode) {
329
- case 0x58: /* FMAXNMP */
330
- case 0x5e: /* FMAXP */
331
- case 0x78: /* FMINNMP */
332
- case 0x7e: /* FMINP */
333
- if (size && !is_q) {
334
- unallocated_encoding(s);
335
- return;
336
- }
337
- handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
338
- rn, rm, rd);
339
- return;
340
-
341
case 0x1d: /* FMLAL */
342
case 0x3d: /* FMLSL */
343
case 0x59: /* FMLAL2 */
344
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
345
case 0x3a: /* FSUB */
346
case 0x3e: /* FMIN */
347
case 0x3f: /* FRSQRTS */
348
+ case 0x58: /* FMAXNMP */
349
case 0x5a: /* FADDP */
350
case 0x5b: /* FMUL */
351
case 0x5c: /* FCMGE */
352
case 0x5d: /* FACGE */
353
+ case 0x5e: /* FMAXP */
354
case 0x5f: /* FDIV */
355
+ case 0x78: /* FMINNMP */
356
case 0x7a: /* FABD */
357
case 0x7d: /* FACGT */
358
case 0x7c: /* FCMGT */
359
+ case 0x7e: /* FMINP */
360
unallocated_encoding(s);
361
return;
362
}
363
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
173
}
364
}
174
}
365
}
175
366
176
@@ -XXX,XX +XXX,XX @@ static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
367
-/*
177
*/
368
- * Advanced SIMD three same (ARMv8.2 FP16 variants)
178
static inline ARMMMUIdx stage_1_mmu_idx(ARMMMUIdx mmu_idx)
369
- *
179
{
370
- * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0
180
- if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
371
- * +---+---+---+-----------+---------+------+-----+--------+---+------+------+
181
- mmu_idx += (ARMMMUIdx_S1NSE0 - ARMMMUIdx_S12NSE0);
372
- * | 0 | Q | U | 0 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd |
182
+ if (mmu_idx == ARMMMUIdx_E10_0 || mmu_idx == ARMMMUIdx_E10_1) {
373
- * +---+---+---+-----------+---------+------+-----+--------+---+------+------+
183
+ mmu_idx += (ARMMMUIdx_S1NSE0 - ARMMMUIdx_E10_0);
374
- *
184
}
375
- * This includes FMULX, FCMEQ (register), FRECPS, FRSQRTS, FCMGE
185
return mmu_idx;
376
- * (register), FACGE, FABD, FCMGT (register) and FACGT.
186
}
377
- *
187
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_user(CPUARMState *env, ARMMMUIdx mmu_idx)
378
- */
188
return true;
379
-static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
189
default:
380
-{
190
return false;
381
- int opcode = extract32(insn, 11, 3);
191
- case ARMMMUIdx_S12NSE0:
382
- int u = extract32(insn, 29, 1);
192
- case ARMMMUIdx_S12NSE1:
383
- int a = extract32(insn, 23, 1);
193
+ case ARMMMUIdx_E10_0:
384
- int is_q = extract32(insn, 30, 1);
194
+ case ARMMMUIdx_E10_1:
385
- int rm = extract32(insn, 16, 5);
195
g_assert_not_reached();
386
- int rn = extract32(insn, 5, 5);
196
}
387
- int rd = extract32(insn, 0, 5);
197
}
388
- /*
198
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
389
- * For these floating point ops, the U, a and opcode bits
199
target_ulong *page_size,
390
- * together indicate the operation.
200
ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
391
- */
201
{
392
- int fpopcode = opcode | (a << 3) | (u << 4);
202
- if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
393
- bool pairwise;
203
+ if (mmu_idx == ARMMMUIdx_E10_0 || mmu_idx == ARMMMUIdx_E10_1) {
394
- TCGv_ptr fpst;
204
/* Call ourselves recursively to do the stage 1 and then stage 2
395
- int pass;
205
* translations.
396
-
206
*/
397
- switch (fpopcode) {
207
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
398
- case 0x10: /* FMAXNMP */
208
if (el < 2 && arm_is_secure_below_el3(env)) {
399
- case 0x16: /* FMAXP */
209
return ARMMMUIdx_S1SE0 + el;
400
- case 0x18: /* FMINNMP */
210
} else {
401
- case 0x1e: /* FMINP */
211
- return ARMMMUIdx_S12NSE0 + el;
402
- pairwise = true;
212
+ return ARMMMUIdx_E10_0 + el;
403
- break;
213
}
404
- default:
214
}
405
- case 0x0: /* FMAXNM */
215
406
- case 0x1: /* FMLA */
216
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
407
- case 0x2: /* FADD */
408
- case 0x3: /* FMULX */
409
- case 0x4: /* FCMEQ */
410
- case 0x6: /* FMAX */
411
- case 0x7: /* FRECPS */
412
- case 0x8: /* FMINNM */
413
- case 0x9: /* FMLS */
414
- case 0xa: /* FSUB */
415
- case 0xe: /* FMIN */
416
- case 0xf: /* FRSQRTS */
417
- case 0x12: /* FADDP */
418
- case 0x13: /* FMUL */
419
- case 0x14: /* FCMGE */
420
- case 0x15: /* FACGE */
421
- case 0x17: /* FDIV */
422
- case 0x1a: /* FABD */
423
- case 0x1c: /* FCMGT */
424
- case 0x1d: /* FACGT */
425
- unallocated_encoding(s);
426
- return;
427
- }
428
-
429
- if (!dc_isar_feature(aa64_fp16, s)) {
430
- unallocated_encoding(s);
431
- return;
432
- }
433
-
434
- if (!fp_access_check(s)) {
435
- return;
436
- }
437
-
438
- fpst = fpstatus_ptr(FPST_FPCR_F16);
439
-
440
- if (pairwise) {
441
- int maxpass = is_q ? 8 : 4;
442
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
443
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
444
- TCGv_i32 tcg_res[8];
445
-
446
- for (pass = 0; pass < maxpass; pass++) {
447
- int passreg = pass < (maxpass / 2) ? rn : rm;
448
- int passelt = (pass << 1) & (maxpass - 1);
449
-
450
- read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_16);
451
- read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_16);
452
- tcg_res[pass] = tcg_temp_new_i32();
453
-
454
- switch (fpopcode) {
455
- case 0x10: /* FMAXNMP */
456
- gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2,
457
- fpst);
458
- break;
459
- case 0x16: /* FMAXP */
460
- gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
461
- break;
462
- case 0x18: /* FMINNMP */
463
- gen_helper_advsimd_minnumh(tcg_res[pass], tcg_op1, tcg_op2,
464
- fpst);
465
- break;
466
- case 0x1e: /* FMINP */
467
- gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst);
468
- break;
469
- default:
470
- case 0x12: /* FADDP */
471
- g_assert_not_reached();
472
- }
473
- }
474
-
475
- for (pass = 0; pass < maxpass; pass++) {
476
- write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16);
477
- }
478
- } else {
479
- g_assert_not_reached();
480
- }
481
-
482
- clear_vec_high(s, is_q, rd);
483
-}
484
-
485
/* AdvSIMD three same extra
486
* 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0
487
* +---+---+---+-----------+------+---+------+---+--------+---+----+----+
488
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
489
{ 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
490
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
491
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
492
- { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
493
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
494
{ 0x00000000, 0x00000000, NULL }
495
};
496
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
217
index XXXXXXX..XXXXXXX 100644
497
index XXXXXXX..XXXXXXX 100644
218
--- a/target/arm/translate-a64.c
498
--- a/target/arm/tcg/vec_helper.c
219
+++ b/target/arm/translate-a64.c
499
+++ b/target/arm/tcg/vec_helper.c
220
@@ -XXX,XX +XXX,XX @@ static inline int get_a64_user_mem_index(DisasContext *s)
500
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2)
221
ARMMMUIdx useridx;
501
DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4)
222
502
DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, )
223
switch (s->mmu_idx) {
503
224
- case ARMMMUIdx_S12NSE1:
504
+DO_3OP_PAIR(gvec_fmaxp_h, float16_max, float16, H2)
225
- useridx = ARMMMUIdx_S12NSE0;
505
+DO_3OP_PAIR(gvec_fmaxp_s, float32_max, float32, H4)
226
+ case ARMMMUIdx_E10_1:
506
+DO_3OP_PAIR(gvec_fmaxp_d, float64_max, float64, )
227
+ useridx = ARMMMUIdx_E10_0;
507
+
228
break;
508
+DO_3OP_PAIR(gvec_fminp_h, float16_min, float16, H2)
229
case ARMMMUIdx_S1SE1:
509
+DO_3OP_PAIR(gvec_fminp_s, float32_min, float32, H4)
230
useridx = ARMMMUIdx_S1SE0;
510
+DO_3OP_PAIR(gvec_fminp_d, float64_min, float64, )
231
diff --git a/target/arm/translate.c b/target/arm/translate.c
511
+
232
index XXXXXXX..XXXXXXX 100644
512
+DO_3OP_PAIR(gvec_fmaxnump_h, float16_maxnum, float16, H2)
233
--- a/target/arm/translate.c
513
+DO_3OP_PAIR(gvec_fmaxnump_s, float32_maxnum, float32, H4)
234
+++ b/target/arm/translate.c
514
+DO_3OP_PAIR(gvec_fmaxnump_d, float64_maxnum, float64, )
235
@@ -XXX,XX +XXX,XX @@ static inline int get_a32_user_mem_index(DisasContext *s)
515
+
236
*/
516
+DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2)
237
switch (s->mmu_idx) {
517
+DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4)
238
case ARMMMUIdx_S1E2: /* this one is UNPREDICTABLE */
518
+DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, )
239
- case ARMMMUIdx_S12NSE0:
519
+
240
- case ARMMMUIdx_S12NSE1:
520
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
241
- return arm_to_core_mmu_idx(ARMMMUIdx_S12NSE0);
521
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
242
+ case ARMMMUIdx_E10_0:
522
{ \
243
+ case ARMMMUIdx_E10_1:
244
+ return arm_to_core_mmu_idx(ARMMMUIdx_E10_0);
245
case ARMMMUIdx_S1E3:
246
case ARMMMUIdx_S1SE0:
247
case ARMMMUIdx_S1SE1:
248
--
523
--
249
2.20.1
524
2.34.1
250
251
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The fall-through organization of this function meant that we
4
would raise an interrupt, then might overwrite that with another.
5
Since interrupt prioritization is IMPLEMENTATION DEFINED, we
6
can recognize these in any order we choose.
7
8
Unify the code to raise the interrupt in a block at the end.
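In outline, the reworked function checks each source in turn and funnels
every hit to a single delivery block. A self-contained sketch of that
shape (hypothetical names, not the QEMU code itself):

    #include <stdbool.h>
    #include <stdio.h>

    enum { SRC_FIQ = 1, SRC_IRQ = 2, SRC_VIRQ = 4, SRC_VFIQ = 8 };

    static bool deliver_one(unsigned pending)
    {
        int which;

        /* This sketch tests FIQ first, but since prioritization is
         * IMPLEMENTATION DEFINED any order would be legitimate. */
        if (pending & SRC_FIQ)  { which = 0; goto found; }
        if (pending & SRC_IRQ)  { which = 1; goto found; }
        if (pending & SRC_VIRQ) { which = 2; goto found; }
        if (pending & SRC_VFIQ) { which = 3; goto found; }
        return false;                     /* nothing deliverable */

    found:
        printf("delivering source %d\n", which);  /* single commit point */
        return true;
    }

    int main(void)
    {
        return deliver_one(SRC_IRQ | SRC_VFIQ) ? 0 : 1;  /* delivers IRQ */
    }

The point of the single "found" block is that exactly one exception gets
committed per call, so a later check can no longer overwrite an earlier one.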
9
10
Tested-by: Alex Bennée <alex.bennee@linaro.org>
11
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
Message-id: 20200206105448.4726-42-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-31-richard.henderson@linaro.org
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
15
---
7
---
16
target/arm/cpu.c | 30 ++++++++++++------------------
8
target/arm/helper.h | 7 -----
17
1 file changed, 12 insertions(+), 18 deletions(-)
9
target/arm/tcg/translate-neon.c | 55 ++-------------------------------
10
target/arm/tcg/vec_helper.c | 45 ---------------------------
11
3 files changed, 3 insertions(+), 104 deletions(-)
18
12
19
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
20
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
21
--- a/target/arm/cpu.c
15
--- a/target/arm/helper.h
22
+++ b/target/arm/cpu.c
16
+++ b/target/arm/helper.h
23
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
24
uint64_t hcr_el2 = arm_hcr_el2_eff(env);
18
DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG,
25
uint32_t target_el;
19
void, ptr, ptr, ptr, ptr, ptr, i32)
26
uint32_t excp_idx;
20
27
- bool ret = false;
21
-DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+
22
-DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+ /* The prioritization of interrupts is IMPLEMENTATION DEFINED. */
23
-DEF_HELPER_FLAGS_5(neon_pminh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
30
24
-DEF_HELPER_FLAGS_5(neon_padds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
31
if (interrupt_request & CPU_INTERRUPT_FIQ) {
25
-DEF_HELPER_FLAGS_5(neon_pmaxs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
32
excp_idx = EXCP_FIQ;
26
-DEF_HELPER_FLAGS_5(neon_pmins, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
27
-
34
if (arm_excp_unmasked(cs, excp_idx, target_el,
28
DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
cur_el, secure, hcr_el2)) {
29
DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
- cs->exception_index = excp_idx;
30
DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
- env->exception.target_el = target_el;
31
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
38
- cc->do_interrupt(cs);
32
index XXXXXXX..XXXXXXX 100644
39
- ret = true;
33
--- a/target/arm/tcg/translate-neon.c
40
+ goto found;
34
+++ b/target/arm/tcg/translate-neon.c
41
}
35
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h)
42
}
36
DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h)
43
if (interrupt_request & CPU_INTERRUPT_HARD) {
37
DO_3S_FP_GVEC(VRECPS, gen_helper_gvec_recps_nf_s, gen_helper_gvec_recps_nf_h)
44
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
38
DO_3S_FP_GVEC(VRSQRTS, gen_helper_gvec_rsqrts_nf_s, gen_helper_gvec_rsqrts_nf_h)
45
target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
39
+DO_3S_FP_GVEC(VPADD, gen_helper_gvec_faddp_s, gen_helper_gvec_faddp_h)
46
if (arm_excp_unmasked(cs, excp_idx, target_el,
40
+DO_3S_FP_GVEC(VPMAX, gen_helper_gvec_fmaxp_s, gen_helper_gvec_fmaxp_h)
47
cur_el, secure, hcr_el2)) {
41
+DO_3S_FP_GVEC(VPMIN, gen_helper_gvec_fminp_s, gen_helper_gvec_fminp_h)
48
- cs->exception_index = excp_idx;
42
49
- env->exception.target_el = target_el;
43
WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s)
50
- cc->do_interrupt(cs);
44
WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h)
51
- ret = true;
45
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
52
+ goto found;
46
return do_3same(s, a, gen_VMINNM_fp32_3s);
53
}
54
}
55
if (interrupt_request & CPU_INTERRUPT_VIRQ) {
56
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
57
target_el = 1;
58
if (arm_excp_unmasked(cs, excp_idx, target_el,
59
cur_el, secure, hcr_el2)) {
60
- cs->exception_index = excp_idx;
61
- env->exception.target_el = target_el;
62
- cc->do_interrupt(cs);
63
- ret = true;
64
+ goto found;
65
}
66
}
67
if (interrupt_request & CPU_INTERRUPT_VFIQ) {
68
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
69
target_el = 1;
70
if (arm_excp_unmasked(cs, excp_idx, target_el,
71
cur_el, secure, hcr_el2)) {
72
- cs->exception_index = excp_idx;
73
- env->exception.target_el = target_el;
74
- cc->do_interrupt(cs);
75
- ret = true;
76
+ goto found;
77
}
78
}
79
+ return false;
80
81
- return ret;
82
+ found:
83
+ cs->exception_index = excp_idx;
84
+ env->exception.target_el = target_el;
85
+ cc->do_interrupt(cs);
86
+ return true;
87
}
47
}
88
48
89
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
49
-static bool do_3same_fp_pair(DisasContext *s, arg_3same *a,
50
- gen_helper_gvec_3_ptr *fn)
51
-{
52
- /* FP pairwise operations */
53
- TCGv_ptr fpstatus;
54
-
55
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
56
- return false;
57
- }
58
-
59
- /* UNDEF accesses to D16-D31 if they don't exist. */
60
- if (!dc_isar_feature(aa32_simd_r32, s) &&
61
- ((a->vd | a->vn | a->vm) & 0x10)) {
62
- return false;
63
- }
64
-
65
- if (!vfp_access_check(s)) {
66
- return true;
67
- }
68
-
69
- assert(a->q == 0); /* enforced by decode patterns */
70
-
71
-
72
- fpstatus = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD);
73
- tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
74
- vfp_reg_offset(1, a->vn),
75
- vfp_reg_offset(1, a->vm),
76
- fpstatus, 8, 8, 0, fn);
77
-
78
- return true;
79
-}
80
-
81
-/*
82
- * For all the functions using this macro, size == 1 means fp16,
83
- * which is an architecture extension we don't implement yet.
84
- */
85
-#define DO_3S_FP_PAIR(INSN,FUNC) \
86
- static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
87
- { \
88
- if (a->size == MO_16) { \
89
- if (!dc_isar_feature(aa32_fp16_arith, s)) { \
90
- return false; \
91
- } \
92
- return do_3same_fp_pair(s, a, FUNC##h); \
93
- } \
94
- return do_3same_fp_pair(s, a, FUNC##s); \
95
- }
96
-
97
-DO_3S_FP_PAIR(VPADD, gen_helper_neon_padd)
98
-DO_3S_FP_PAIR(VPMAX, gen_helper_neon_pmax)
99
-DO_3S_FP_PAIR(VPMIN, gen_helper_neon_pmin)
100
-
101
static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
102
{
103
/* Handle a 2-reg-shift insn which can be vectorized. */
104
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
105
index XXXXXXX..XXXXXXX 100644
106
--- a/target/arm/tcg/vec_helper.c
107
+++ b/target/arm/tcg/vec_helper.c
108
@@ -XXX,XX +XXX,XX @@ DO_ABA(gvec_uaba_d, uint64_t)
109
110
#undef DO_ABA
111
112
-#define DO_NEON_PAIRWISE(NAME, OP) \
113
- void HELPER(NAME##s)(void *vd, void *vn, void *vm, \
114
- void *stat, uint32_t oprsz) \
115
- { \
116
- float_status *fpst = stat; \
117
- float32 *d = vd; \
118
- float32 *n = vn; \
119
- float32 *m = vm; \
120
- float32 r0, r1; \
121
- \
122
- /* Read all inputs before writing outputs in case vm == vd */ \
123
- r0 = float32_##OP(n[H4(0)], n[H4(1)], fpst); \
124
- r1 = float32_##OP(m[H4(0)], m[H4(1)], fpst); \
125
- \
126
- d[H4(0)] = r0; \
127
- d[H4(1)] = r1; \
128
- } \
129
- \
130
- void HELPER(NAME##h)(void *vd, void *vn, void *vm, \
131
- void *stat, uint32_t oprsz) \
132
- { \
133
- float_status *fpst = stat; \
134
- float16 *d = vd; \
135
- float16 *n = vn; \
136
- float16 *m = vm; \
137
- float16 r0, r1, r2, r3; \
138
- \
139
- /* Read all inputs before writing outputs in case vm == vd */ \
140
- r0 = float16_##OP(n[H2(0)], n[H2(1)], fpst); \
141
- r1 = float16_##OP(n[H2(2)], n[H2(3)], fpst); \
142
- r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst); \
143
- r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst); \
144
- \
145
- d[H2(0)] = r0; \
146
- d[H2(1)] = r1; \
147
- d[H2(2)] = r2; \
148
- d[H2(3)] = r3; \
149
- }
150
-
151
-DO_NEON_PAIRWISE(neon_padd, add)
152
-DO_NEON_PAIRWISE(neon_pmax, max)
153
-DO_NEON_PAIRWISE(neon_pmin, min)
154
-
155
-#undef DO_NEON_PAIRWISE
156
-
157
#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
158
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
159
{ \
90
--
160
--
91
2.20.1
161
2.34.1
92
93
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Tested-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200206105448.4726-32-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-32-richard.henderson@linaro.org
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
7
---
9
target/arm/helper.c | 25 ++++++++++++++++++-------
8
target/arm/helper.h | 5 ++
10
1 file changed, 18 insertions(+), 7 deletions(-)
9
target/arm/tcg/translate.h | 3 +
10
target/arm/tcg/a64.decode | 6 ++
11
target/arm/tcg/gengvec.c | 12 ++++
12
target/arm/tcg/translate-a64.c | 128 ++++++---------------------------
13
target/arm/tcg/vec_helper.c | 30 ++++++++
14
6 files changed, 77 insertions(+), 107 deletions(-)
11
15
12
diff --git a/target/arm/helper.c b/target/arm/helper.c
16
diff --git a/target/arm/helper.h b/target/arm/helper.h
13
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/helper.c
18
--- a/target/arm/helper.h
15
+++ b/target/arm/helper.c
19
+++ b/target/arm/helper.h
16
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i
17
21
DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
static int vae1_tlbmask(CPUARMState *env)
22
DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
19
{
23
20
+ /* Since we exclude secure first, we may read HCR_EL2 directly. */
24
+DEF_HELPER_FLAGS_4(gvec_addp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
if (arm_is_secure_below_el3(env)) {
25
+DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
return ARMMMUIdxBit_SE10_1 | ARMMMUIdxBit_SE10_0;
26
+DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+ } else if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE))
27
+DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+ == (HCR_E2H | HCR_TGE)) {
28
+
25
+ return ARMMMUIdxBit_E20_2 | ARMMMUIdxBit_E20_0;
29
#ifdef TARGET_AARCH64
26
} else {
30
#include "tcg/helper-a64.h"
27
return ARMMMUIdxBit_E10_1 | ARMMMUIdxBit_E10_0;
31
#include "tcg/helper-sve.h"
28
}
32
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
29
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
33
index XXXXXXX..XXXXXXX 100644
34
--- a/target/arm/tcg/translate.h
35
+++ b/target/arm/tcg/translate.h
36
@@ -XXX,XX +XXX,XX @@ void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
37
void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
38
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
39
40
+void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
41
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
42
+
43
/*
44
* Forward to the isar_feature_* tests given a DisasContext pointer.
45
*/
46
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/tcg/a64.decode
49
+++ b/target/arm/tcg/a64.decode
50
@@ -XXX,XX +XXX,XX @@
51
&qrrrr_e q rd rn rm ra esz
52
53
@rr_h ........ ... ..... ...... rn:5 rd:5 &rr_e esz=1
54
+@rr_d ........ ... ..... ...... rn:5 rd:5 &rr_e esz=3
55
@rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd
56
57
@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
58
@@ -XXX,XX +XXX,XX @@
59
60
@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
61
@qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd
62
+@qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e
63
64
@qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
65
&qrrx_e esz=1 idx=%hlm
66
@@ -XXX,XX +XXX,XX @@ FMAXNMP_s 0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd
67
FMINNMP_s 0101 1110 1011 0000 1100 10 ..... ..... @rr_h
68
FMINNMP_s 0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd
69
70
+ADDP_s 0101 1110 1111 0001 1011 10 ..... ..... @rr_d
71
+
72
### Advanced SIMD three same
73
74
FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
75
@@ -XXX,XX +XXX,XX @@ FMAXNMP_v 0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd
76
FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
77
FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
78
79
+ADDP_v 0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e
80
+
81
### Advanced SIMD scalar x indexed element
82
83
FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
84
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/target/arm/tcg/gengvec.c
87
+++ b/target/arm/tcg/gengvec.c
88
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
89
};
90
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
91
}
92
+
93
+void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
94
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
95
+{
96
+ static gen_helper_gvec_3 * const fns[4] = {
97
+ gen_helper_gvec_addp_b,
98
+ gen_helper_gvec_addp_h,
99
+ gen_helper_gvec_addp_s,
100
+ gen_helper_gvec_addp_d,
101
+ };
102
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
103
+}
104
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
105
index XXXXXXX..XXXXXXX 100644
106
--- a/target/arm/tcg/translate-a64.c
107
+++ b/target/arm/tcg/translate-a64.c
108
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
109
};
110
TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
111
112
+TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
113
+
114
/*
115
* Advanced SIMD scalar/vector x indexed element
116
*/
117
@@ -XXX,XX +XXX,XX @@ TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin)
118
TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm)
119
TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm)
120
121
+static bool trans_ADDP_s(DisasContext *s, arg_rr_e *a)
122
+{
123
+ if (fp_access_check(s)) {
124
+ TCGv_i64 t0 = tcg_temp_new_i64();
125
+ TCGv_i64 t1 = tcg_temp_new_i64();
126
+
127
+ read_vec_element(s, t0, a->rn, 0, MO_64);
128
+ read_vec_element(s, t1, a->rn, 1, MO_64);
129
+ tcg_gen_add_i64(t0, t0, t1);
130
+ write_fp_dreg(s, a->rd, t0);
131
+ }
132
+ return true;
133
+}
134
+
135
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
136
* Note that it is the caller's responsibility to ensure that the
137
* shift amount is in range (ie 0..31 or 0..63) and provide the ARM
138
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
30
}
139
}
31
}
140
}
32
141
33
+static int e2_tlbmask(CPUARMState *env)
142
-/* AdvSIMD scalar pairwise
34
+{
143
- * 31 30 29 28 24 23 22 21 17 16 12 11 10 9 5 4 0
35
+ /* TODO: ARMv8.4-SecEL2 */
144
- * +-----+---+-----------+------+-----------+--------+-----+------+------+
36
+ return ARMMMUIdxBit_E20_0 | ARMMMUIdxBit_E20_2 | ARMMMUIdxBit_E2;
145
- * | 0 1 | U | 1 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 | Rn | Rd |
146
- * +-----+---+-----------+------+-----------+--------+-----+------+------+
147
- */
148
-static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
149
-{
150
- int u = extract32(insn, 29, 1);
151
- int size = extract32(insn, 22, 2);
152
- int opcode = extract32(insn, 12, 5);
153
- int rn = extract32(insn, 5, 5);
154
- int rd = extract32(insn, 0, 5);
155
-
156
- /* For some ops (the FP ones), size[1] is part of the encoding.
157
- * For ADDP strictly it is not but size[1] is always 1 for valid
158
- * encodings.
159
- */
160
- opcode |= (extract32(size, 1, 1) << 5);
161
-
162
- switch (opcode) {
163
- case 0x3b: /* ADDP */
164
- if (u || size != 3) {
165
- unallocated_encoding(s);
166
- return;
167
- }
168
- if (!fp_access_check(s)) {
169
- return;
170
- }
171
- break;
172
- default:
173
- case 0xc: /* FMAXNMP */
174
- case 0xd: /* FADDP */
175
- case 0xf: /* FMAXP */
176
- case 0x2c: /* FMINNMP */
177
- case 0x2f: /* FMINP */
178
- unallocated_encoding(s);
179
- return;
180
- }
181
-
182
- if (size == MO_64) {
183
- TCGv_i64 tcg_op1 = tcg_temp_new_i64();
184
- TCGv_i64 tcg_op2 = tcg_temp_new_i64();
185
- TCGv_i64 tcg_res = tcg_temp_new_i64();
186
-
187
- read_vec_element(s, tcg_op1, rn, 0, MO_64);
188
- read_vec_element(s, tcg_op2, rn, 1, MO_64);
189
-
190
- switch (opcode) {
191
- case 0x3b: /* ADDP */
192
- tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2);
193
- break;
194
- default:
195
- case 0xc: /* FMAXNMP */
196
- case 0xd: /* FADDP */
197
- case 0xf: /* FMAXP */
198
- case 0x2c: /* FMINNMP */
199
- case 0x2f: /* FMINP */
200
- g_assert_not_reached();
201
- }
202
-
203
- write_fp_dreg(s, rd, tcg_res);
204
- } else {
205
- g_assert_not_reached();
206
- }
207
-}
208
-
209
/*
210
* Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
211
*
212
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
213
* adjacent elements being operated on to produce an element in the result.
214
*/
215
if (size == 3) {
216
- TCGv_i64 tcg_res[2];
217
-
218
- for (pass = 0; pass < 2; pass++) {
219
- TCGv_i64 tcg_op1 = tcg_temp_new_i64();
220
- TCGv_i64 tcg_op2 = tcg_temp_new_i64();
221
- int passreg = (pass == 0) ? rn : rm;
222
-
223
- read_vec_element(s, tcg_op1, passreg, 0, MO_64);
224
- read_vec_element(s, tcg_op2, passreg, 1, MO_64);
225
- tcg_res[pass] = tcg_temp_new_i64();
226
-
227
- switch (opcode) {
228
- case 0x17: /* ADDP */
229
- tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2);
230
- break;
231
- default:
232
- case 0x58: /* FMAXNMP */
233
- case 0x5a: /* FADDP */
234
- case 0x5e: /* FMAXP */
235
- case 0x78: /* FMINNMP */
236
- case 0x7e: /* FMINP */
237
- g_assert_not_reached();
238
- }
239
- }
240
-
241
- for (pass = 0; pass < 2; pass++) {
242
- write_vec_element(s, tcg_res[pass], rd, pass, MO_64);
243
- }
244
+ g_assert_not_reached();
245
} else {
246
int maxpass = is_q ? 4 : 2;
247
TCGv_i32 tcg_res[4];
248
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
249
tcg_res[pass] = tcg_temp_new_i32();
250
251
switch (opcode) {
252
- case 0x17: /* ADDP */
253
- {
254
- static NeonGenTwoOpFn * const fns[3] = {
255
- gen_helper_neon_padd_u8,
256
- gen_helper_neon_padd_u16,
257
- tcg_gen_add_i32,
258
- };
259
- genfn = fns[size];
260
- break;
261
- }
262
case 0x14: /* SMAXP, UMAXP */
263
{
264
static NeonGenTwoOpFn * const fns[3][2] = {
265
@@ -XXX,XX +XXX,XX @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
266
break;
267
}
268
default:
269
+ case 0x17: /* ADDP */
270
case 0x58: /* FMAXNMP */
271
case 0x5a: /* FADDP */
272
case 0x5e: /* FMAXP */
273
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
274
case 0x3: /* logic ops */
275
disas_simd_3same_logic(s, insn);
276
break;
277
- case 0x17: /* ADDP */
278
case 0x14: /* SMAXP, UMAXP */
279
case 0x15: /* SMINP, UMINP */
280
{
281
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
282
default:
283
disas_simd_3same_int(s, insn);
284
break;
285
+ case 0x17: /* ADDP */
286
+ unallocated_encoding(s);
287
+ break;
288
}
289
}
290
291
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
292
{ 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra },
293
{ 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
294
{ 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
295
- { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
296
{ 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
297
{ 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
298
{ 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
299
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
300
index XXXXXXX..XXXXXXX 100644
301
--- a/target/arm/tcg/vec_helper.c
302
+++ b/target/arm/tcg/vec_helper.c
303
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2)
304
DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4)
305
DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, )
306
307
+#undef DO_3OP_PAIR
308
+
309
+#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \
310
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
311
+{ \
312
+ ARMVectorReg scratch; \
313
+ intptr_t oprsz = simd_oprsz(desc); \
314
+ intptr_t half = oprsz / sizeof(TYPE) / 2; \
315
+ TYPE *d = vd, *n = vn, *m = vm; \
316
+ if (unlikely(d == m)) { \
317
+ m = memcpy(&scratch, m, oprsz); \
318
+ } \
319
+ for (intptr_t i = 0; i < half; ++i) { \
320
+ d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)]); \
321
+ } \
322
+ for (intptr_t i = 0; i < half; ++i) { \
323
+ d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)]); \
324
+ } \
325
+ clear_tail(d, oprsz, simd_maxsz(desc)); \
37
+}
326
+}
38
+
327
+
39
static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
328
+#define ADD(A, B) (A + B)
40
uint64_t value)
329
+DO_3OP_PAIR(gvec_addp_b, ADD, uint8_t, H1)
41
{
330
+DO_3OP_PAIR(gvec_addp_h, ADD, uint16_t, H2)
42
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
331
+DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4)
43
static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
332
+DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, )
44
uint64_t value)
333
+#undef ADD
45
{
334
+
46
- ARMCPU *cpu = env_archcpu(env);
335
+#undef DO_3OP_PAIR
47
- CPUState *cs = CPU(cpu);
336
+
48
+ CPUState *cs = env_cpu(env);
337
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
49
+ int mask = e2_tlbmask(env);
338
void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
50
339
{ \
51
- tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_E2);
52
+ tlb_flush_by_mmuidx(cs, mask);
53
}
54
55
static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
56
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
57
uint64_t value)
58
{
59
CPUState *cs = env_cpu(env);
60
+ int mask = e2_tlbmask(env);
61
62
- tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_E2);
63
+ tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
64
}
65
66
static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
67
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
68
* Currently handles both VAE2 and VALE2, since we don't support
69
* flush-last-level-only.
70
*/
71
- ARMCPU *cpu = env_archcpu(env);
72
- CPUState *cs = CPU(cpu);
73
+ CPUState *cs = env_cpu(env);
74
+ int mask = e2_tlbmask(env);
75
uint64_t pageaddr = sextract64(value << 12, 0, 56);
76
77
- tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_E2);
78
+ tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
79
}
80
81
static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
82
--
340
--
83
2.20.1
341
2.34.1
84
85
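The DO_3OP_PAIR helpers above reduce adjacent element pairs of Vn and of
Vm and concatenate the two halves into Vd. A standalone model of those
semantics for the 32-bit ADDP case (demo code with hypothetical names,
not the QEMU helper itself):

    #include <stdint.h>
    #include <stdio.h>

    /* d[0..half-1] from pairs of n, d[half..2*half-1] from pairs of m. */
    static void addp_s(uint32_t *d, const uint32_t *n,
                       const uint32_t *m, int half)
    {
        uint32_t tmp[8];   /* scratch so d == m still works; half <= 4 here */

        for (int i = 0; i < half; i++) {
            tmp[i]        = n[2 * i] + n[2 * i + 1];
            tmp[i + half] = m[2 * i] + m[2 * i + 1];
        }
        for (int i = 0; i < 2 * half; i++) {
            d[i] = tmp[i];
        }
    }

    int main(void)
    {
        uint32_t n[4] = {1, 2, 3, 4}, m[4] = {10, 20, 30, 40}, d[4];

        addp_s(d, n, m, 2);
        printf("%u %u %u %u\n", d[0], d[1], d[2], d[3]); /* 3 7 30 70 */
        return 0;
    }

The scratch copy mirrors the unlikely(d == m) memcpy in the real macro:
the second loop reads m, which may alias d, so all reductions are staged
before any output element is written.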
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
The TGE bit routes all asynchronous exceptions to EL2.
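Concretely, the patch folds TGE into the same table-index bit that the
AMO/IMO/FMO routing bits feed. A sketch of that fold (the mask values
match the architectural bit positions, but treat the code as
illustrative rather than QEMU's):

    #include <stdint.h>

    #define HCR_IMO (1ULL << 4)    /* physical IRQ routing to EL2 */
    #define HCR_TGE (1ULL << 27)   /* trap general exceptions to EL2 */

    static unsigned irq_route_bit(uint64_t hcr_el2)
    {
        unsigned hcr = (hcr_el2 & HCR_IMO) != 0;  /* per-type bit (IRQ shown) */
        hcr |= (hcr_el2 & HCR_TGE) != 0;          /* TGE also forces EL2 */
        return hcr;
    }

With either bit set, the target-EL table lookup sees hcr == 1 and routes
the asynchronous exception to EL2.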
4
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200206105448.4726-33-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-33-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
7
---
11
target/arm/helper.c | 6 ++++++
8
target/arm/helper.h | 2 --
12
1 file changed, 6 insertions(+)
9
target/arm/tcg/neon_helper.c | 5 -----
10
target/arm/tcg/translate-neon.c | 3 +--
11
3 files changed, 1 insertion(+), 9 deletions(-)
13
12
14
diff --git a/target/arm/helper.c b/target/arm/helper.c
13
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper.c
15
--- a/target/arm/helper.h
17
+++ b/target/arm/helper.c
16
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
17
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64)
19
break;
18
20
};
19
DEF_HELPER_2(neon_add_u8, i32, i32, i32)
21
20
DEF_HELPER_2(neon_add_u16, i32, i32, i32)
22
+ /*
21
-DEF_HELPER_2(neon_padd_u8, i32, i32, i32)
23
+ * For these purposes, TGE and AMO/IMO/FMO both force the
22
-DEF_HELPER_2(neon_padd_u16, i32, i32, i32)
24
+ * interrupt to EL2. Fold TGE into the bit extracted above.
23
DEF_HELPER_2(neon_sub_u8, i32, i32, i32)
25
+ */
24
DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
26
+ hcr |= (hcr_el2 & HCR_TGE) != 0;
25
DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
27
+
26
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
28
/* Perform a table-lookup for the target EL given the current state */
27
index XXXXXXX..XXXXXXX 100644
29
target_el = target_el_table[is64][scr][rw][hcr][secure][cur_el];
28
--- a/target/arm/tcg/neon_helper.c
30
29
+++ b/target/arm/tcg/neon_helper.c
30
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_add_u16)(uint32_t a, uint32_t b)
31
return (a + b) ^ mask;
32
}
33
34
-#define NEON_FN(dest, src1, src2) dest = src1 + src2
35
-NEON_POP(padd_u8, neon_u8, 4)
36
-NEON_POP(padd_u16, neon_u16, 2)
37
-#undef NEON_FN
38
-
39
#define NEON_FN(dest, src1, src2) dest = src1 - src2
40
NEON_VOP(sub_u8, neon_u8, 4)
41
NEON_VOP(sub_u16, neon_u16, 2)
42
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
43
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/tcg/translate-neon.c
45
+++ b/target/arm/tcg/translate-neon.c
46
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
47
DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
48
DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
49
DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
50
+DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp)
51
52
#define DO_3SAME_CMP(INSN, COND) \
53
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
54
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
55
#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
56
#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
57
#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
58
-#define gen_helper_neon_padd_u32 tcg_gen_add_i32
59
60
DO_3SAME_PAIR(VPMAX_S, pmax_s)
61
DO_3SAME_PAIR(VPMIN_S, pmin_s)
62
DO_3SAME_PAIR(VPMAX_U, pmax_u)
63
DO_3SAME_PAIR(VPMIN_U, pmin_u)
64
-DO_3SAME_PAIR(VPADD, padd_u)
65
66
#define DO_3SAME_VQDMULH(INSN, FUNC) \
67
WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
31
--
68
--
32
2.20.1
69
2.34.1
33
34
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
We are about to expand the number of mmuidx values to 10, and so need 4 bits.
3
These are the last instructions within handle_simd_3same_pair,
4
For the benefit of reading the number out of -d exec, align the field to the
4
so remove that function as well.
5
penultimate nibble.
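A quick illustration of the -d exec point, assuming the new layout with
MMUIDX at bits [27:24]: the index becomes the second hex digit of the
printed 32-bit flags word.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t hflags = 0x85000000;            /* made-up example value */
        unsigned mmuidx = (hflags >> 24) & 0xf;  /* FIELD(..., MMUIDX, 24, 4) */

        printf("hflags=%08x mmuidx=%x\n", (unsigned)hflags, mmuidx); /* 5 */
        return 0;
    }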
6
5
7
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200206105448.4726-17-richard.henderson@linaro.org
8
Message-id: 20240524232121.284515-34-richard.henderson@linaro.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
10
---
13
target/arm/cpu.h | 16 ++++++++--------
11
target/arm/helper.h | 16 +++++
14
1 file changed, 8 insertions(+), 8 deletions(-)
12
target/arm/tcg/translate.h | 8 +++
13
target/arm/tcg/a64.decode | 4 ++
14
target/arm/tcg/gengvec.c | 48 +++++++++++++
15
target/arm/tcg/translate-a64.c | 119 +++++----------------------------
16
target/arm/tcg/vec_helper.c | 16 +++++
17
6 files changed, 109 insertions(+), 102 deletions(-)
15
18
16
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
19
diff --git a/target/arm/helper.h b/target/arm/helper.h
17
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/cpu.h
21
--- a/target/arm/helper.h
19
+++ b/target/arm/cpu.h
22
+++ b/target/arm/helper.h
20
@@ -XXX,XX +XXX,XX @@ typedef ARMCPU ArchCPU;
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
* We put flags which are shared between 32 and 64 bit mode at the top
24
DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
* of the word, and flags which apply to only one mode at the bottom.
25
DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
*
26
24
- * 31 21 18 14 9 0
27
+DEF_HELPER_FLAGS_4(gvec_smaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+ * 31 20 18 14 9 0
28
+DEF_HELPER_FLAGS_4(gvec_smaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
* +--------------+-----+-----+----------+--------------+
29
+DEF_HELPER_FLAGS_4(gvec_smaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
* | | | TBFLAG_A32 | |
30
+
28
* | | +-----+----------+ TBFLAG_AM32 |
31
+DEF_HELPER_FLAGS_4(gvec_sminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
@@ -XXX,XX +XXX,XX @@ typedef ARMCPU ArchCPU;
32
+DEF_HELPER_FLAGS_4(gvec_sminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
* | | +-------------------------|
33
+DEF_HELPER_FLAGS_4(gvec_sminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
* | | | TBFLAG_A64 |
34
+
32
* +--------------+-----------+-------------------------+
35
+DEF_HELPER_FLAGS_4(gvec_umaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
- * 31 21 14 0
36
+DEF_HELPER_FLAGS_4(gvec_umaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+ * 31 20 14 0
37
+DEF_HELPER_FLAGS_4(gvec_umaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
*
38
+
36
* Unless otherwise noted, these bits are cached in env->hflags.
39
+DEF_HELPER_FLAGS_4(gvec_uminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
*/
40
+DEF_HELPER_FLAGS_4(gvec_uminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
FIELD(TBFLAG_ANY, AARCH64_STATE, 31, 1)
41
+DEF_HELPER_FLAGS_4(gvec_uminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
39
-FIELD(TBFLAG_ANY, MMUIDX, 28, 3)
42
+
40
-FIELD(TBFLAG_ANY, SS_ACTIVE, 27, 1)
43
#ifdef TARGET_AARCH64
41
-FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1) /* Not cached. */
44
#include "tcg/helper-a64.h"
42
+FIELD(TBFLAG_ANY, SS_ACTIVE, 30, 1)
45
#include "tcg/helper-sve.h"
43
+FIELD(TBFLAG_ANY, PSTATE_SS, 29, 1) /* Not cached. */
46
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
44
+FIELD(TBFLAG_ANY, BE_DATA, 28, 1)
47
index XXXXXXX..XXXXXXX 100644
45
+FIELD(TBFLAG_ANY, MMUIDX, 24, 4)
48
--- a/target/arm/tcg/translate.h
46
/* Target EL if we take a floating-point-disabled exception */
49
+++ b/target/arm/tcg/translate.h
47
-FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
50
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
48
-FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
51
49
+FIELD(TBFLAG_ANY, FPEXC_EL, 22, 2)
52
void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
50
/* For A-profile only, target EL for debug exceptions. */
53
uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
51
-FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
54
+void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
52
+FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 20, 2)
55
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
56
+void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
57
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
58
+void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
59
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
60
+void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
61
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
53
62
54
/*
63
/*
55
* Bit usage when in AArch32 state, both A- and M-profile.
64
* Forward to the isar_feature_* tests given a DisasContext pointer.
65
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
66
index XXXXXXX..XXXXXXX 100644
67
--- a/target/arm/tcg/a64.decode
68
+++ b/target/arm/tcg/a64.decode
69
@@ -XXX,XX +XXX,XX @@ FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h
70
FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd
71
72
ADDP_v 0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e
73
+SMAXP_v 0.00 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
74
+SMINP_v 0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
75
+UMAXP_v 0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
76
+UMINP_v 0.10 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
77
78
### Advanced SIMD scalar x indexed element
79
80
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/target/arm/tcg/gengvec.c
83
+++ b/target/arm/tcg/gengvec.c
84
@@ -XXX,XX +XXX,XX @@ void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
85
};
86
tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
87
}
88
+
89
+void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
90
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
91
+{
92
+ static gen_helper_gvec_3 * const fns[4] = {
93
+ gen_helper_gvec_smaxp_b,
94
+ gen_helper_gvec_smaxp_h,
95
+ gen_helper_gvec_smaxp_s,
96
+ };
97
+ tcg_debug_assert(vece <= MO_32);
98
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
99
+}
100
+
101
+void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
102
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
103
+{
104
+ static gen_helper_gvec_3 * const fns[4] = {
105
+ gen_helper_gvec_sminp_b,
106
+ gen_helper_gvec_sminp_h,
107
+ gen_helper_gvec_sminp_s,
108
+ };
109
+ tcg_debug_assert(vece <= MO_32);
110
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
111
+}
112
+
113
+void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
114
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
115
+{
116
+ static gen_helper_gvec_3 * const fns[4] = {
117
+ gen_helper_gvec_umaxp_b,
118
+ gen_helper_gvec_umaxp_h,
119
+ gen_helper_gvec_umaxp_s,
120
+ };
121
+ tcg_debug_assert(vece <= MO_32);
122
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
123
+}
124
+
125
+void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
126
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
127
+{
128
+ static gen_helper_gvec_3 * const fns[4] = {
129
+ gen_helper_gvec_uminp_b,
130
+ gen_helper_gvec_uminp_h,
131
+ gen_helper_gvec_uminp_s,
132
+ };
133
+ tcg_debug_assert(vece <= MO_32);
134
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]);
135
+}
136
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
137
index XXXXXXX..XXXXXXX 100644
138
--- a/target/arm/tcg/translate-a64.c
139
+++ b/target/arm/tcg/translate-a64.c
140
@@ -XXX,XX +XXX,XX @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
141
return true;
142
}
143
144
+static bool do_gvec_fn3_no64(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn)
145
+{
146
+ if (a->esz == MO_64) {
147
+ return false;
148
+ }
149
+ if (fp_access_check(s)) {
150
+ gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz);
151
+ }
152
+ return true;
153
+}
154
+
155
static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn)
156
{
157
if (!a->q && a->esz == MO_64) {
158
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
159
TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
160
161
TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
162
+TRANS(SMAXP_v, do_gvec_fn3_no64, a, gen_gvec_smaxp)
163
+TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp)
164
+TRANS(UMAXP_v, do_gvec_fn3_no64, a, gen_gvec_umaxp)
165
+TRANS(UMINP_v, do_gvec_fn3_no64, a, gen_gvec_uminp)
166
167
/*
168
* Advanced SIMD scalar/vector x indexed element
169
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
170
}
171
}
172
173
-/* Pairwise op subgroup of C3.6.16.
174
- *
175
- * This is called directly for float pairwise
176
- * operations where the opcode and size are calculated differently.
177
- */
178
-static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode,
179
- int size, int rn, int rm, int rd)
180
-{
181
- int pass;
182
-
183
- if (!fp_access_check(s)) {
184
- return;
185
- }
186
-
187
- /* These operations work on the concatenated rm:rn, with each pair of
188
- * adjacent elements being operated on to produce an element in the result.
189
- */
190
- if (size == 3) {
191
- g_assert_not_reached();
192
- } else {
193
- int maxpass = is_q ? 4 : 2;
194
- TCGv_i32 tcg_res[4];
195
-
196
- for (pass = 0; pass < maxpass; pass++) {
197
- TCGv_i32 tcg_op1 = tcg_temp_new_i32();
198
- TCGv_i32 tcg_op2 = tcg_temp_new_i32();
199
- NeonGenTwoOpFn *genfn = NULL;
200
- int passreg = pass < (maxpass / 2) ? rn : rm;
201
- int passelt = (is_q && (pass & 1)) ? 2 : 0;
202
-
203
- read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_32);
204
- read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_32);
205
- tcg_res[pass] = tcg_temp_new_i32();
206
-
207
- switch (opcode) {
208
- case 0x14: /* SMAXP, UMAXP */
209
- {
210
- static NeonGenTwoOpFn * const fns[3][2] = {
211
- { gen_helper_neon_pmax_s8, gen_helper_neon_pmax_u8 },
212
- { gen_helper_neon_pmax_s16, gen_helper_neon_pmax_u16 },
213
- { tcg_gen_smax_i32, tcg_gen_umax_i32 },
214
- };
215
- genfn = fns[size][u];
216
- break;
217
- }
218
- case 0x15: /* SMINP, UMINP */
219
- {
220
- static NeonGenTwoOpFn * const fns[3][2] = {
221
- { gen_helper_neon_pmin_s8, gen_helper_neon_pmin_u8 },
222
- { gen_helper_neon_pmin_s16, gen_helper_neon_pmin_u16 },
223
- { tcg_gen_smin_i32, tcg_gen_umin_i32 },
224
- };
225
- genfn = fns[size][u];
226
- break;
227
- }
228
- default:
229
- case 0x17: /* ADDP */
230
- case 0x58: /* FMAXNMP */
231
- case 0x5a: /* FADDP */
232
- case 0x5e: /* FMAXP */
233
- case 0x78: /* FMINNMP */
234
- case 0x7e: /* FMINP */
235
- g_assert_not_reached();
236
- }
237
-
238
- /* FP ops called directly, otherwise call now */
239
- if (genfn) {
240
- genfn(tcg_res[pass], tcg_op1, tcg_op2);
241
- }
242
- }
243
-
244
- for (pass = 0; pass < maxpass; pass++) {
245
- write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_32);
246
- }
247
- clear_vec_high(s, is_q, rd);
248
- }
249
-}
250
-
251
/* Floating point op subgroup of C3.6.16. */
252
static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
253
{
254
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
255
case 0x3: /* logic ops */
256
disas_simd_3same_logic(s, insn);
257
break;
258
- case 0x14: /* SMAXP, UMAXP */
259
- case 0x15: /* SMINP, UMINP */
260
- {
261
- /* Pairwise operations */
262
- int is_q = extract32(insn, 30, 1);
263
- int u = extract32(insn, 29, 1);
264
- int size = extract32(insn, 22, 2);
265
- int rm = extract32(insn, 16, 5);
266
- int rn = extract32(insn, 5, 5);
267
- int rd = extract32(insn, 0, 5);
268
- if (opcode == 0x17) {
269
- if (u || (size == 3 && !is_q)) {
270
- unallocated_encoding(s);
271
- return;
272
- }
273
- } else {
274
- if (size == 3) {
275
- unallocated_encoding(s);
276
- return;
277
- }
278
- }
279
- handle_simd_3same_pair(s, is_q, u, opcode, size, rn, rm, rd);
280
- break;
281
- }
282
case 0x18 ... 0x31:
283
/* floating point ops, sz[1] and U are part of opcode */
284
disas_simd_3same_float(s, insn);
285
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
286
default:
287
disas_simd_3same_int(s, insn);
288
break;
289
+ case 0x14: /* SMAXP, UMAXP */
290
+ case 0x15: /* SMINP, UMINP */
291
case 0x17: /* ADDP */
292
unallocated_encoding(s);
293
break;
294
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
295
index XXXXXXX..XXXXXXX 100644
296
--- a/target/arm/tcg/vec_helper.c
297
+++ b/target/arm/tcg/vec_helper.c
298
@@ -XXX,XX +XXX,XX @@ DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4)
299
DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, )
300
#undef ADD
301
302
+DO_3OP_PAIR(gvec_smaxp_b, MAX, int8_t, H1)
303
+DO_3OP_PAIR(gvec_smaxp_h, MAX, int16_t, H2)
304
+DO_3OP_PAIR(gvec_smaxp_s, MAX, int32_t, H4)
305
+
306
+DO_3OP_PAIR(gvec_umaxp_b, MAX, uint8_t, H1)
307
+DO_3OP_PAIR(gvec_umaxp_h, MAX, uint16_t, H2)
308
+DO_3OP_PAIR(gvec_umaxp_s, MAX, uint32_t, H4)
309
+
310
+DO_3OP_PAIR(gvec_sminp_b, MIN, int8_t, H1)
311
+DO_3OP_PAIR(gvec_sminp_h, MIN, int16_t, H2)
312
+DO_3OP_PAIR(gvec_sminp_s, MIN, int32_t, H4)
313
+
314
+DO_3OP_PAIR(gvec_uminp_b, MIN, uint8_t, H1)
315
+DO_3OP_PAIR(gvec_uminp_h, MIN, uint16_t, H2)
316
+DO_3OP_PAIR(gvec_uminp_s, MIN, uint32_t, H4)
317
+
318
#undef DO_3OP_PAIR
319
320
#define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
56
--
321
--
57
2.20.1
322
2.34.1
58
59
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
For ARMv8.1, op1 == 5 is reserved for EL2 aliases of
4
EL1 and EL0 registers.
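One such alias is SCTLR_EL12 (op0=3, op1=5, CRn=1, CRm=0, op2=0), which
gives EL2 a window onto the EL1 SCTLR when HCR_EL2.E2H is set. A sketch
of pulling op1 out of an A64 MRS encoding (illustrative, not QEMU's
decoder):

    #include <stdint.h>

    static unsigned sysreg_op1(uint32_t insn)
    {
        return (insn >> 16) & 7;     /* op1 field, bits [18:16] */
    }

    int main(void)
    {
        uint32_t insn = 0xd53d1000u; /* MRS x0, SCTLR_EL12 */
        return sysreg_op1(insn) == 5 ? 0 : 1;
    }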
5
6
Tested-by: Alex Bennée <alex.bennee@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-28-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-35-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
7
---
12
target/arm/helper.c | 5 +----
8
target/arm/tcg/translate-neon.c | 78 ++-------------------------------
13
1 file changed, 1 insertion(+), 4 deletions(-)
9
1 file changed, 4 insertions(+), 74 deletions(-)
14
10
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
11
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
16
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.c
13
--- a/target/arm/tcg/translate-neon.c
18
+++ b/target/arm/helper.c
14
+++ b/target/arm/tcg/translate-neon.c
19
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
15
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
20
mask = PL0_RW;
16
DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
21
break;
17
DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
22
case 4:
18
DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp)
23
+ case 5:
19
+DO_3SAME_NO_SZ_3(VPMAX_S, gen_gvec_smaxp)
24
/* min_EL EL2 */
20
+DO_3SAME_NO_SZ_3(VPMIN_S, gen_gvec_sminp)
25
mask = PL2_RW;
21
+DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp)
26
break;
22
+DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp)
27
- case 5:
23
28
- /* unallocated encoding, so not possible */
24
#define DO_3SAME_CMP(INSN, COND) \
29
- assert(false);
25
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
30
- break;
26
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_U, qshl_u)
31
case 6:
27
DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
32
/* min_EL EL3 */
28
DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
33
mask = PL3_RW;
29
30
-static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
31
-{
32
- /* Operations handled pairwise 32 bits at a time */
33
- TCGv_i32 tmp, tmp2, tmp3;
34
-
35
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
36
- return false;
37
- }
38
-
39
- /* UNDEF accesses to D16-D31 if they don't exist. */
40
- if (!dc_isar_feature(aa32_simd_r32, s) &&
41
- ((a->vd | a->vn | a->vm) & 0x10)) {
42
- return false;
43
- }
44
-
45
- if (a->size == 3) {
46
- return false;
47
- }
48
-
49
- if (!vfp_access_check(s)) {
50
- return true;
51
- }
52
-
53
- assert(a->q == 0); /* enforced by decode patterns */
54
-
55
- /*
56
- * Note that we have to be careful not to clobber the source operands
57
- * in the "vm == vd" case by storing the result of the first pass too
58
- * early. Since Q is 0 there are always just two passes, so instead
59
- * of a complicated loop over each pass we just unroll.
60
- */
61
- tmp = tcg_temp_new_i32();
62
- tmp2 = tcg_temp_new_i32();
63
- tmp3 = tcg_temp_new_i32();
64
-
65
- read_neon_element32(tmp, a->vn, 0, MO_32);
66
- read_neon_element32(tmp2, a->vn, 1, MO_32);
67
- fn(tmp, tmp, tmp2);
68
-
69
- read_neon_element32(tmp3, a->vm, 0, MO_32);
70
- read_neon_element32(tmp2, a->vm, 1, MO_32);
71
- fn(tmp3, tmp3, tmp2);
72
-
73
- write_neon_element32(tmp, a->vd, 0, MO_32);
74
- write_neon_element32(tmp3, a->vd, 1, MO_32);
75
-
76
- return true;
77
-}
78
-
79
-#define DO_3SAME_PAIR(INSN, func) \
80
- static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
81
- { \
82
- static NeonGenTwoOpFn * const fns[] = { \
83
- gen_helper_neon_##func##8, \
84
- gen_helper_neon_##func##16, \
85
- gen_helper_neon_##func##32, \
86
- }; \
87
- if (a->size > 2) { \
88
- return false; \
89
- } \
90
- return do_3same_pair(s, a, fns[a->size]); \
91
- }
92
-
93
-/* 32-bit pairwise ops end up the same as the elementwise versions. */
94
-#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
95
-#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
96
-#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
97
-#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
98
-
99
-DO_3SAME_PAIR(VPMAX_S, pmax_s)
100
-DO_3SAME_PAIR(VPMIN_S, pmin_s)
101
-DO_3SAME_PAIR(VPMAX_U, pmax_u)
102
-DO_3SAME_PAIR(VPMIN_U, pmin_u)
103
-
104
#define DO_3SAME_VQDMULH(INSN, FUNC) \
105
WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
106
WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
34
--
107
--
35
2.20.1
108
2.34.1
36
37
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
No functional change, but unify code sequences.
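The unified shape has each TLBI writefn ask a single *_tlbmask() helper
which mmu indexes are affected, then flush with that mask. Distilled
(placeholder bit values, not QEMU's ARMMMUIdxBit_* constants):

    enum {
        BIT_S12NSE0 = 1 << 0, BIT_S12NSE1 = 1 << 1,
        BIT_S1SE0   = 1 << 2, BIT_S1SE1   = 1 << 3,
        BIT_S2NS    = 1 << 4,
    };

    static int alle1_tlbmask_sketch(int secure, int has_el2)
    {
        if (secure) {
            return BIT_S1SE1 | BIT_S1SE0;
        } else if (has_el2) {
            return BIT_S12NSE1 | BIT_S12NSE0 | BIT_S2NS;
        } else {
            return BIT_S12NSE1 | BIT_S12NSE0;
        }
    }

Every flush variant (local, all-CPUs-synced, by-page) then takes the same
mask, which is what lets the duplicated if/else chains collapse.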
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
5
Tested-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200206105448.4726-8-richard.henderson@linaro.org
5
Message-id: 20240524232121.284515-36-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
7
---
12
target/arm/helper.c | 86 +++++++++++++--------------------------------
8
target/arm/tcg/a64.decode | 10 +++
13
1 file changed, 24 insertions(+), 62 deletions(-)
9
target/arm/tcg/translate-a64.c | 144 ++++++++++-----------------------
10
2 files changed, 51 insertions(+), 103 deletions(-)
14
11
15
diff --git a/target/arm/helper.c b/target/arm/helper.c
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper.c
14
--- a/target/arm/tcg/a64.decode
18
+++ b/target/arm/helper.c
15
+++ b/target/arm/tcg/a64.decode
19
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
16
@@ -XXX,XX +XXX,XX @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd
20
tlb_flush_by_mmuidx(cs, mask);
17
FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h
21
}
18
FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd
22
19
23
-static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
20
+FMLAL_v 0.00 1110 001 ..... 11101 1 ..... ..... @qrrr_h
24
- uint64_t value)
21
+FMLSL_v 0.00 1110 101 ..... 11101 1 ..... ..... @qrrr_h
25
+static int alle1_tlbmask(CPUARMState *env)
22
+FMLAL2_v 0.10 1110 001 ..... 11001 1 ..... ..... @qrrr_h
26
{
23
+FMLSL2_v 0.10 1110 101 ..... 11001 1 ..... ..... @qrrr_h
27
- /* Note that the 'ALL' scope must invalidate both stage 1 and
24
+
28
+ /*
25
FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h
29
+ * Note that the 'ALL' scope must invalidate both stage 1 and
26
FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd
30
* stage 2 translations, whereas most other scopes only invalidate
27
31
* stage 1 translations.
28
@@ -XXX,XX +XXX,XX @@ FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d
32
*/
29
FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
33
- ARMCPU *cpu = env_archcpu(env);
30
FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
34
- CPUState *cs = CPU(cpu);
31
FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
35
-
32
+
36
if (arm_is_secure_below_el3(env)) {
33
+FMLAL_vi 0.00 1111 10 .. .... 0000 . 0 ..... ..... @qrrx_h
37
- tlb_flush_by_mmuidx(cs,
34
+FMLSL_vi 0.00 1111 10 .. .... 0100 . 0 ..... ..... @qrrx_h
38
- ARMMMUIdxBit_S1SE1 |
35
+FMLAL2_vi 0.10 1111 10 .. .... 1000 . 0 ..... ..... @qrrx_h
39
- ARMMMUIdxBit_S1SE0);
36
+FMLSL2_vi 0.10 1111 10 .. .... 1100 . 0 ..... ..... @qrrx_h
40
+ return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
37
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
41
+ } else if (arm_feature(env, ARM_FEATURE_EL2)) {
38
index XXXXXXX..XXXXXXX 100644
42
+ return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 | ARMMMUIdxBit_S2NS;
39
--- a/target/arm/tcg/translate-a64.c
43
} else {
40
+++ b/target/arm/tcg/translate-a64.c
44
- if (arm_feature(env, ARM_FEATURE_EL2)) {
41
@@ -XXX,XX +XXX,XX @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = {
45
- tlb_flush_by_mmuidx(cs,
42
};
46
- ARMMMUIdxBit_S12NSE1 |
43
TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp)
47
- ARMMMUIdxBit_S12NSE0 |
44
48
- ARMMMUIdxBit_S2NS);
45
+static bool do_fmlal(DisasContext *s, arg_qrrr_e *a, bool is_s, bool is_2)
49
- } else {
46
+{
50
- tlb_flush_by_mmuidx(cs,
47
+ if (fp_access_check(s)) {
51
- ARMMMUIdxBit_S12NSE1 |
48
+ int data = (is_2 << 1) | is_s;
52
- ARMMMUIdxBit_S12NSE0);
49
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
53
- }
50
+ vec_full_reg_offset(s, a->rn),
54
+ return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
51
+ vec_full_reg_offset(s, a->rm), tcg_env,
52
+ a->q ? 16 : 8, vec_full_reg_size(s),
53
+ data, gen_helper_gvec_fmlal_a64);
54
+ }
55
+ return true;
56
+}
57
+
58
+TRANS_FEAT(FMLAL_v, aa64_fhm, do_fmlal, a, false, false)
59
+TRANS_FEAT(FMLSL_v, aa64_fhm, do_fmlal, a, true, false)
60
+TRANS_FEAT(FMLAL2_v, aa64_fhm, do_fmlal, a, false, true)
61
+TRANS_FEAT(FMLSL2_v, aa64_fhm, do_fmlal, a, true, true)
62
+
63
TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp)
64
TRANS(SMAXP_v, do_gvec_fn3_no64, a, gen_gvec_smaxp)
65
TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp)
66
@@ -XXX,XX +XXX,XX @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg)
67
TRANS(FMLA_vi, do_fmla_vector_idx, a, false)
68
TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
69
70
+static bool do_fmlal_idx(DisasContext *s, arg_qrrx_e *a, bool is_s, bool is_2)
71
+{
72
+ if (fp_access_check(s)) {
73
+ int data = (a->idx << 2) | (is_2 << 1) | is_s;
74
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
75
+ vec_full_reg_offset(s, a->rn),
76
+ vec_full_reg_offset(s, a->rm), tcg_env,
77
+ a->q ? 16 : 8, vec_full_reg_size(s),
78
+ data, gen_helper_gvec_fmlal_idx_a64);
79
+ }
80
+ return true;
81
+}
82
+
83
+TRANS_FEAT(FMLAL_vi, aa64_fhm, do_fmlal_idx, a, false, false)
84
+TRANS_FEAT(FMLSL_vi, aa64_fhm, do_fmlal_idx, a, true, false)
85
+TRANS_FEAT(FMLAL2_vi, aa64_fhm, do_fmlal_idx, a, false, true)
86
+TRANS_FEAT(FMLSL2_vi, aa64_fhm, do_fmlal_idx, a, true, true)
87
+
88
/*
89
* Advanced SIMD scalar pairwise
90
*/
91
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
55
}
92
}
56
}
93
}
57
94
58
+static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
95
-/* Floating point op subgroup of C3.6.16. */
59
+                                 uint64_t value)
+{
+    CPUState *cs = env_cpu(env);
+    int mask = alle1_tlbmask(env);
+
+    tlb_flush_by_mmuidx(cs, mask);
+}
+
 static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                   uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                     uint64_t value)
 {
-    /* Note that the 'ALL' scope must invalidate both stage 1 and
-     * stage 2 translations, whereas most other scopes only invalidate
-     * stage 1 translations.
-     */
     CPUState *cs = env_cpu(env);
-    bool sec = arm_is_secure_below_el3(env);
-    bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
+    int mask = alle1_tlbmask(env);
 
-    if (sec) {
-        tlb_flush_by_mmuidx_all_cpus_synced(cs,
-                                            ARMMMUIdxBit_S1SE1 |
-                                            ARMMMUIdxBit_S1SE0);
-    } else if (has_el2) {
-        tlb_flush_by_mmuidx_all_cpus_synced(cs,
-                                            ARMMMUIdxBit_S12NSE1 |
-                                            ARMMMUIdxBit_S12NSE0 |
-                                            ARMMMUIdxBit_S2NS);
-    } else {
-        tlb_flush_by_mmuidx_all_cpus_synced(cs,
-                                            ARMMMUIdxBit_S12NSE1 |
-                                            ARMMMUIdxBit_S12NSE0);
-    }
+    tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
 }
 
 static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
-    ARMCPU *cpu = env_archcpu(env);
-    CPUState *cs = CPU(cpu);
-    bool sec = arm_is_secure_below_el3(env);
+    CPUState *cs = env_cpu(env);
+    int mask = vae1_tlbmask(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    if (sec) {
-        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                                 ARMMMUIdxBit_S1SE1 |
-                                                 ARMMMUIdxBit_S1SE0);
-    } else {
-        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                                 ARMMMUIdxBit_S12NSE1 |
-                                                 ARMMMUIdxBit_S12NSE0);
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
      * since we don't support flush-for-specific-ASID-only or
      * flush-last-level-only.
      */
-    ARMCPU *cpu = env_archcpu(env);
-    CPUState *cs = CPU(cpu);
+    CPUState *cs = env_cpu(env);
+    int mask = vae1_tlbmask(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     if (tlb_force_broadcast(env)) {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
         return;
     }
 
-    if (arm_is_secure_below_el3(env)) {
-        tlb_flush_page_by_mmuidx(cs, pageaddr,
-                                 ARMMMUIdxBit_S1SE1 |
-                                 ARMMMUIdxBit_S1SE0);
-    } else {
-        tlb_flush_page_by_mmuidx(cs, pageaddr,
-                                 ARMMMUIdxBit_S12NSE1 |
-                                 ARMMMUIdxBit_S12NSE0);
-    }
+    tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
--
2.20.1

-static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
-{
-    /* For floating point ops, the U, size[1] and opcode bits
-     * together indicate the operation. size[0] indicates single
-     * or double.
-     */
-    int fpopcode = extract32(insn, 11, 5)
-        | (extract32(insn, 23, 1) << 5)
-        | (extract32(insn, 29, 1) << 6);
-    int is_q = extract32(insn, 30, 1);
-    int size = extract32(insn, 22, 1);
-    int rm = extract32(insn, 16, 5);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (size == 1 && !is_q) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (fpopcode) {
-    case 0x1d: /* FMLAL */
-    case 0x3d: /* FMLSL */
-    case 0x59: /* FMLAL2 */
-    case 0x79: /* FMLSL2 */
-        if (size & 1 || !dc_isar_feature(aa64_fhm, s)) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (fp_access_check(s)) {
-            int is_s = extract32(insn, 23, 1);
-            int is_2 = extract32(insn, 29, 1);
-            int data = (is_2 << 1) | is_s;
-            tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
-                               vec_full_reg_offset(s, rn),
-                               vec_full_reg_offset(s, rm), tcg_env,
-                               is_q ? 16 : 8, vec_full_reg_size(s),
-                               data, gen_helper_gvec_fmlal_a64);
-        }
-        return;
-
-    default:
-    case 0x18: /* FMAXNM */
-    case 0x19: /* FMLA */
-    case 0x1a: /* FADD */
-    case 0x1b: /* FMULX */
-    case 0x1c: /* FCMEQ */
-    case 0x1e: /* FMAX */
-    case 0x1f: /* FRECPS */
-    case 0x38: /* FMINNM */
-    case 0x39: /* FMLS */
-    case 0x3a: /* FSUB */
-    case 0x3e: /* FMIN */
-    case 0x3f: /* FRSQRTS */
-    case 0x58: /* FMAXNMP */
-    case 0x5a: /* FADDP */
-    case 0x5b: /* FMUL */
-    case 0x5c: /* FCMGE */
-    case 0x5d: /* FACGE */
-    case 0x5e: /* FMAXP */
-    case 0x5f: /* FDIV */
-    case 0x78: /* FMINNMP */
-    case 0x7a: /* FABD */
-    case 0x7d: /* FACGT */
-    case 0x7c: /* FCMGT */
-    case 0x7e: /* FMINP */
-        unallocated_encoding(s);
-        return;
-    }
-}
-
 /* Integer op subgroup of C3.6.16. */
 static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
     case 0x3: /* logic ops */
         disas_simd_3same_logic(s, insn);
         break;
-    case 0x18 ... 0x31:
-        /* floating point ops, sz[1] and U are part of opcode */
-        disas_simd_3same_float(s, insn);
-        break;
     default:
         disas_simd_3same_int(s, insn);
         break;
     case 0x14: /* SMAXP, UMAXP */
     case 0x15: /* SMINP, UMINP */
     case 0x17: /* ADDP */
+    case 0x18 ... 0x31: /* floating point ops */
         unallocated_encoding(s);
         break;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         }
         is_fp = 2;
         break;
-    case 0x00: /* FMLAL */
-    case 0x04: /* FMLSL */
-    case 0x18: /* FMLAL2 */
-    case 0x1c: /* FMLSL2 */
-        if (is_scalar || size != MO_32 || !dc_isar_feature(aa64_fhm, s)) {
-            unallocated_encoding(s);
-            return;
-        }
-        size = MO_16;
-        /* is_fp, but we pass tcg_env not fp_status. */
-        break;
     default:
+    case 0x00: /* FMLAL */
     case 0x01: /* FMLA */
+    case 0x04: /* FMLSL */
     case 0x05: /* FMLS */
     case 0x09: /* FMUL */
+    case 0x18: /* FMLAL2 */
     case 0x19: /* FMULX */
+    case 0x1c: /* FMLSL2 */
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
     return;
 
-    case 0x00: /* FMLAL */
-    case 0x04: /* FMLSL */
-    case 0x18: /* FMLAL2 */
-    case 0x1c: /* FMLSL2 */
-    {
-        int is_s = extract32(opcode, 2, 1);
-        int is_2 = u;
-        int data = (index << 2) | (is_2 << 1) | is_s;
-        tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
-                           vec_full_reg_offset(s, rn),
-                           vec_full_reg_offset(s, rm), tcg_env,
-                           is_q ? 16 : 8, vec_full_reg_size(s),
-                           data, gen_helper_gvec_fmlal_idx_a64);
-    }
-    return;
-
     case 0x08: /* MUL */
         if (!is_long && !is_scalar) {
             static gen_helper_gvec_3 * const fns[3] = {
--
2.34.1

diff view generated by jsdifflib
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

This is part of a reorganization to the set of mmu_idx.
The EL1&0 regime is the only one that uses 2-stage translation.
Spelling out Stage avoids confusion with Secure.
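
For illustration, the rename is purely mechanical:

    ARMMMUIdx_S1NSE0  ->  ARMMMUIdx_Stage1_E0
    ARMMMUIdx_S1NSE1  ->  ARMMMUIdx_Stage1_E1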

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       |  4 ++--
 target/arm/internals.h |  6 +++---
 target/arm/helper.c    | 27 ++++++++++++++-------------
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     /* Indexes below here don't have TLBs and are used only for AT system
      * instructions or for the first stage of an S12 page table walk.
      */
-    ARMMMUIdx_S1NSE0 = 0 | ARM_MMU_IDX_NOTLB,
-    ARMMMUIdx_S1NSE1 = 1 | ARM_MMU_IDX_NOTLB,
+    ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
+    ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 } ARMMMUIdx;
 
 /* Bit macros for the core-mmu-index values for each index,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
     switch (mmu_idx) {
     case ARMMMUIdx_E10_0:
     case ARMMMUIdx_E10_1:
-    case ARMMMUIdx_S1NSE0:
-    case ARMMMUIdx_S1NSE1:
+    case ARMMMUIdx_Stage1_E0:
+    case ARMMMUIdx_Stage1_E1:
     case ARMMMUIdx_S1E2:
     case ARMMMUIdx_Stage2:
     case ARMMMUIdx_MPrivNegPri:
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env);
 #ifdef CONFIG_USER_ONLY
 static inline ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
 {
-    return ARMMMUIdx_S1NSE0;
+    return ARMMMUIdx_Stage1_E0;
 }
 #else
 ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
         bool take_exc = false;
 
         if (fi.s1ptw && current_el == 1 && !arm_is_secure(env)
-            && (mmu_idx == ARMMMUIdx_S1NSE1 || mmu_idx == ARMMMUIdx_S1NSE0)) {
+            && (mmu_idx == ARMMMUIdx_Stage1_E1 ||
+                mmu_idx == ARMMMUIdx_Stage1_E0)) {
             /*
              * Synchronous stage 2 fault on an access made as part of the
              * translation table walk for AT S1E0* or AT S1E1* insn
@@ -XXX,XX +XXX,XX @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
             mmu_idx = ARMMMUIdx_S1E3;
             break;
         case 2:
-            mmu_idx = ARMMMUIdx_S1NSE1;
+            mmu_idx = ARMMMUIdx_Stage1_E1;
             break;
         case 1:
-            mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+            mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
             break;
         default:
             g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
             mmu_idx = ARMMMUIdx_S1SE0;
             break;
         case 2:
-            mmu_idx = ARMMMUIdx_S1NSE0;
+            mmu_idx = ARMMMUIdx_Stage1_E0;
             break;
         case 1:
-            mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+            mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
             break;
         default:
             g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
     case 0:
         switch (ri->opc1) {
         case 0: /* AT S1E1R, AT S1E1W */
-            mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+            mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
             break;
         case 4: /* AT S1E2R, AT S1E2W */
             mmu_idx = ARMMMUIdx_S1E2;
@@ -XXX,XX +XXX,XX @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
         }
         break;
     case 2: /* AT S1E0R, AT S1E0W */
-        mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+        mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
         break;
     case 4: /* AT S12E1R, AT S12E1W */
         mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_E10_1;
@@ -XXX,XX +XXX,XX @@ static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
     case ARMMMUIdx_S1SE0:
         return arm_el_is_aa64(env, 3) ? 1 : 3;
     case ARMMMUIdx_S1SE1:
-    case ARMMMUIdx_S1NSE0:
-    case ARMMMUIdx_S1NSE1:
+    case ARMMMUIdx_Stage1_E0:
+    case ARMMMUIdx_Stage1_E1:
     case ARMMMUIdx_MPrivNegPri:
     case ARMMMUIdx_MUserNegPri:
     case ARMMMUIdx_MPriv:
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
     }
 
     if ((env->cp15.hcr_el2 & HCR_DC) &&
-        (mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1)) {
+        (mmu_idx == ARMMMUIdx_Stage1_E0 || mmu_idx == ARMMMUIdx_Stage1_E1)) {
         /* HCR.DC means SCTLR_EL1.M behaves as 0 */
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
 static inline ARMMMUIdx stage_1_mmu_idx(ARMMMUIdx mmu_idx)
 {
     if (mmu_idx == ARMMMUIdx_E10_0 || mmu_idx == ARMMMUIdx_E10_1) {
-        mmu_idx += (ARMMMUIdx_S1NSE0 - ARMMMUIdx_E10_0);
+        mmu_idx += (ARMMMUIdx_Stage1_E0 - ARMMMUIdx_E10_0);
     }
     return mmu_idx;
 }
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_user(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
     switch (mmu_idx) {
     case ARMMMUIdx_S1SE0:
-    case ARMMMUIdx_S1NSE0:
+    case ARMMMUIdx_Stage1_E0:
     case ARMMMUIdx_MUser:
     case ARMMMUIdx_MSUser:
     case ARMMMUIdx_MUserNegPri:
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
                                hwaddr addr, MemTxAttrs txattrs,
                                ARMMMUFaultInfo *fi)
 {
-    if ((mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1) &&
+    if ((mmu_idx == ARMMMUIdx_Stage1_E0 || mmu_idx == ARMMMUIdx_Stage1_E1) &&
         !regime_translation_disabled(env, ARMMMUIdx_Stage2)) {
         target_ulong s2size;
         hwaddr s2pa;
--
2.20.1

diff view generated by jsdifflib
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Define via macro expansion, so that renumbering of the base ARMMMUIdx
symbols is automatically reflected in the bit definitions.
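
For illustration, each use of the macro expands to the same kind of
initializer that was previously written out by hand, e.g.:

    TO_CORE_BIT(E10_0)
    /* expands to: */
    ARMMMUIdxBit_E10_0 = 1 << (ARMMMUIdx_E10_0 & ARM_MMU_IDX_COREIDX_MASK)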

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 } ARMMMUIdx;
 
-/* Bit macros for the core-mmu-index values for each index,
+/*
+ * Bit macros for the core-mmu-index values for each index,
  * for use when calling tlb_flush_by_mmuidx() and friends.
  */
+#define TO_CORE_BIT(NAME) \
+    ARMMMUIdxBit_##NAME = 1 << (ARMMMUIdx_##NAME & ARM_MMU_IDX_COREIDX_MASK)
+
 typedef enum ARMMMUIdxBit {
-    ARMMMUIdxBit_E10_0 = 1 << 0,
-    ARMMMUIdxBit_E10_1 = 1 << 1,
-    ARMMMUIdxBit_E2 = 1 << 2,
-    ARMMMUIdxBit_SE3 = 1 << 3,
-    ARMMMUIdxBit_SE10_0 = 1 << 4,
-    ARMMMUIdxBit_SE10_1 = 1 << 5,
-    ARMMMUIdxBit_Stage2 = 1 << 6,
-    ARMMMUIdxBit_MUser = 1 << 0,
-    ARMMMUIdxBit_MPriv = 1 << 1,
-    ARMMMUIdxBit_MUserNegPri = 1 << 2,
-    ARMMMUIdxBit_MPrivNegPri = 1 << 3,
-    ARMMMUIdxBit_MSUser = 1 << 4,
-    ARMMMUIdxBit_MSPriv = 1 << 5,
-    ARMMMUIdxBit_MSUserNegPri = 1 << 6,
-    ARMMMUIdxBit_MSPrivNegPri = 1 << 7,
+    TO_CORE_BIT(E10_0),
+    TO_CORE_BIT(E10_1),
+    TO_CORE_BIT(E2),
+    TO_CORE_BIT(SE10_0),
+    TO_CORE_BIT(SE10_1),
+    TO_CORE_BIT(SE3),
+    TO_CORE_BIT(Stage2),
+
+    TO_CORE_BIT(MUser),
+    TO_CORE_BIT(MPriv),
+    TO_CORE_BIT(MUserNegPri),
+    TO_CORE_BIT(MPrivNegPri),
+    TO_CORE_BIT(MSUser),
+    TO_CORE_BIT(MSPriv),
+    TO_CORE_BIT(MSUserNegPri),
+    TO_CORE_BIT(MSPrivNegPri),
 } ARMMMUIdxBit;
 
+#undef TO_CORE_BIT
+
 #define MMU_USER_IDX 0
 
 static inline int arm_to_core_mmu_idx(ARMMMUIdx mmu_idx)
--
2.20.1

diff view generated by jsdifflib
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Replace the magic numbers with the relevant ARM_MMU_IDX_M_* constants.
Keep the definitions short by referencing previous symbols.
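
For illustration, the composed forms still evaluate to the old magic
numbers, e.g.:

    ARMMMUIdx_MSPrivNegPri
        = ARMMMUIdx_MPrivNegPri | ARM_MMU_IDX_M_S
        = ARM_MMU_IDX_M | ARM_MMU_IDX_M_PRIV
                        | ARM_MMU_IDX_M_NEGPRI | ARM_MMU_IDX_M_S
        = 7 | ARM_MMU_IDX_M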

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_SE10_0 = 4 | ARM_MMU_IDX_A,
     ARMMMUIdx_SE10_1 = 5 | ARM_MMU_IDX_A,
     ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
-    ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MUserNegPri = 2 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MPrivNegPri = 3 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MSUser = 4 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MSPriv = 5 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MSUserNegPri = 6 | ARM_MMU_IDX_M,
-    ARMMMUIdx_MSPrivNegPri = 7 | ARM_MMU_IDX_M,
+    ARMMMUIdx_MUser = ARM_MMU_IDX_M,
+    ARMMMUIdx_MPriv = ARM_MMU_IDX_M | ARM_MMU_IDX_M_PRIV,
+    ARMMMUIdx_MUserNegPri = ARMMMUIdx_MUser | ARM_MMU_IDX_M_NEGPRI,
+    ARMMMUIdx_MPrivNegPri = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_NEGPRI,
+    ARMMMUIdx_MSUser = ARMMMUIdx_MUser | ARM_MMU_IDX_M_S,
+    ARMMMUIdx_MSPriv = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_S,
+    ARMMMUIdx_MSUserNegPri = ARMMMUIdx_MUserNegPri | ARM_MMU_IDX_M_S,
+    ARMMMUIdx_MSPrivNegPri = ARMMMUIdx_MPrivNegPri | ARM_MMU_IDX_M_S,
     /* Indexes below here don't have TLBs and are used only for AT system
      * instructions or for the first stage of an S12 page table walk.
      */
--
2.20.1

diff view generated by jsdifflib
From: Richard Henderson <richard.henderson@linaro.org>

Prepare for, but do not yet implement, the EL2&0 regime.
This involves adding the new MMUIdx enumerators and adjusting
some of the MMUIdx related predicates to match.
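
For illustration, with the core-index field widened to four bits the
new enumerators decompose as, e.g.:

    ARMMMUIdx_E20_2 = 4 | ARM_MMU_IDX_A      /* core index 4, A-profile */
    arm_to_core_mmu_idx(ARMMMUIdx_E20_2)     /* == 0x14 & 0xf == 4 */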

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h       | 134 ++++++++++++++++++-----------------
 target/arm/internals.h |  35 +++++++++++
 target/arm/helper.c    |  66 +++++++++++++++++---
 target/arm/translate.c |   1 -
 5 files changed, 152 insertions(+), 86 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
 # define TARGET_PAGE_BITS_MIN 10
 #endif
 
-#define NB_MMU_MODES 8
+#define NB_MMU_MODES 9
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  *  + NonSecure EL1 & 0 stage 1
  *  + NonSecure EL1 & 0 stage 2
  *  + NonSecure EL2
- *  + Secure EL1 & EL0
+ *  + NonSecure EL2 & 0 (ARMv8.1-VHE)
+ *  + Secure EL1 & 0
  *  + Secure EL3
  * If EL3 is 32-bit:
  *  + NonSecure PL1 & 0 stage 1
  *  + NonSecure PL1 & 0 stage 2
  *  + NonSecure PL2
- *  + Secure PL0 & PL1
+ *  + Secure PL0
+ *  + Secure PL1
  * (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
  *
  * For QEMU, an mmu_idx is not quite the same as a translation regime because:
- * 1. we need to split the "EL1 & 0" regimes into two mmu_idxes, because they
- *    may differ in access permissions even if the VA->PA map is the same
+ * 1. we need to split the "EL1 & 0" and "EL2 & 0" regimes into two mmu_idxes,
+ *    because they may differ in access permissions even if the VA->PA map is
+ *    the same
  * 2. we want to cache in our TLB the full VA->IPA->PA lookup for a stage 1+2
  *    translation, which means that we have one mmu_idx that deals with two
  *    concatenated translation regimes [this sort of combined s1+2 TLB is
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  * 4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  *    translation regimes, because they map reasonably well to each other
  *    and they can't both be active at the same time.
- * This gives us the following list of mmu_idx values:
+ * 5. we want to be able to use the TLB for accesses done as part of a
+ *    stage1 page table walk, rather than having to walk the stage2 page
+ *    table over and over.
  *
- * NS EL0 (aka NS PL0) stage 1+2
- * NS EL1 (aka NS PL1) stage 1+2
+ * This gives us the following list of cases:
+ *
+ * NS EL0 EL1&0 stage 1+2 (aka NS PL0)
+ * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
+ * NS EL0 EL2&0
+ * NS EL2 EL2&0
  * NS EL2 (aka NS PL2)
+ * S EL0 EL1&0 (aka S PL0)
+ * S EL1 EL1&0 (not used if EL3 is 32 bit)
  * S EL3 (aka S PL1)
- * S EL0 (aka S PL0)
- * S EL1 (not used if EL3 is 32 bit)
- * NS EL0+1 stage 2
+ * NS EL1&0 stage 2
  *
- * (The last of these is an mmu_idx because we want to be able to use the TLB
- * for the accesses done as part of a stage 1 page table walk, rather than
- * having to walk the stage 2 page table over and over.)
+ * for a total of 9 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -XXX,XX +XXX,XX @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  * For M profile we arrange them to have a bit for priv, a bit for negpri
  * and a bit for secure.
  */
-#define ARM_MMU_IDX_A 0x10 /* A profile */
-#define ARM_MMU_IDX_NOTLB 0x20 /* does not have a TLB */
-#define ARM_MMU_IDX_M 0x40 /* M profile */
+#define ARM_MMU_IDX_A     0x10  /* A profile */
+#define ARM_MMU_IDX_NOTLB 0x20  /* does not have a TLB */
+#define ARM_MMU_IDX_M     0x40  /* M profile */
 
-/* meanings of the bits for M profile mmu idx values */
-#define ARM_MMU_IDX_M_PRIV 0x1
+/* Meanings of the bits for M profile mmu idx values */
+#define ARM_MMU_IDX_M_PRIV   0x1
 #define ARM_MMU_IDX_M_NEGPRI 0x2
-#define ARM_MMU_IDX_M_S 0x4
+#define ARM_MMU_IDX_M_S      0x4  /* Secure */
 
-#define ARM_MMU_IDX_TYPE_MASK (~0x7)
-#define ARM_MMU_IDX_COREIDX_MASK 0x7
+#define ARM_MMU_IDX_TYPE_MASK \
+    (ARM_MMU_IDX_A | ARM_MMU_IDX_M | ARM_MMU_IDX_NOTLB)
+#define ARM_MMU_IDX_COREIDX_MASK 0xf
 
 typedef enum ARMMMUIdx {
-    ARMMMUIdx_E10_0 = 0 | ARM_MMU_IDX_A,
-    ARMMMUIdx_E10_1 = 1 | ARM_MMU_IDX_A,
-    ARMMMUIdx_E2 = 2 | ARM_MMU_IDX_A,
-    ARMMMUIdx_SE3 = 3 | ARM_MMU_IDX_A,
-    ARMMMUIdx_SE10_0 = 4 | ARM_MMU_IDX_A,
-    ARMMMUIdx_SE10_1 = 5 | ARM_MMU_IDX_A,
-    ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
+    /*
+     * A-profile.
+     */
+    ARMMMUIdx_E10_0 = 0 | ARM_MMU_IDX_A,
+    ARMMMUIdx_E20_0 = 1 | ARM_MMU_IDX_A,
+
+    ARMMMUIdx_E10_1 = 2 | ARM_MMU_IDX_A,
+
+    ARMMMUIdx_E2 = 3 | ARM_MMU_IDX_A,
+    ARMMMUIdx_E20_2 = 4 | ARM_MMU_IDX_A,
+
+    ARMMMUIdx_SE10_0 = 5 | ARM_MMU_IDX_A,
+    ARMMMUIdx_SE10_1 = 6 | ARM_MMU_IDX_A,
+    ARMMMUIdx_SE3 = 7 | ARM_MMU_IDX_A,
+
+    ARMMMUIdx_Stage2 = 8 | ARM_MMU_IDX_A,
+
+    /*
+     * These are not allocated TLBs and are used only for AT system
+     * instructions or for the first stage of an S12 page table walk.
+     */
+    ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
+    ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
+
+    /*
+     * M-profile.
+     */
     ARMMMUIdx_MUser = ARM_MMU_IDX_M,
     ARMMMUIdx_MPriv = ARM_MMU_IDX_M | ARM_MMU_IDX_M_PRIV,
     ARMMMUIdx_MUserNegPri = ARMMMUIdx_MUser | ARM_MMU_IDX_M_NEGPRI,
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_MSPriv = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_S,
     ARMMMUIdx_MSUserNegPri = ARMMMUIdx_MUserNegPri | ARM_MMU_IDX_M_S,
     ARMMMUIdx_MSPrivNegPri = ARMMMUIdx_MPrivNegPri | ARM_MMU_IDX_M_S,
-    /* Indexes below here don't have TLBs and are used only for AT system
-     * instructions or for the first stage of an S12 page table walk.
-     */
-    ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
-    ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 } ARMMMUIdx;
 
 /*
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
 
 typedef enum ARMMMUIdxBit {
     TO_CORE_BIT(E10_0),
+    TO_CORE_BIT(E20_0),
     TO_CORE_BIT(E10_1),
     TO_CORE_BIT(E2),
+    TO_CORE_BIT(E20_2),
     TO_CORE_BIT(SE10_0),
     TO_CORE_BIT(SE10_1),
     TO_CORE_BIT(SE3),
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
 
 #define MMU_USER_IDX 0
 
-static inline int arm_to_core_mmu_idx(ARMMMUIdx mmu_idx)
-{
-    return mmu_idx & ARM_MMU_IDX_COREIDX_MASK;
-}
-
-static inline ARMMMUIdx core_to_arm_mmu_idx(CPUARMState *env, int mmu_idx)
-{
-    if (arm_feature(env, ARM_FEATURE_M)) {
-        return mmu_idx | ARM_MMU_IDX_M;
-    } else {
-        return mmu_idx | ARM_MMU_IDX_A;
-    }
-}
-
-/* Return the exception level we're running at if this is our mmu_idx */
-static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
-{
-    switch (mmu_idx & ARM_MMU_IDX_TYPE_MASK) {
-    case ARM_MMU_IDX_A:
-        return mmu_idx & 3;
-    case ARM_MMU_IDX_M:
-        return mmu_idx & ARM_MMU_IDX_M_PRIV;
-    default:
-        g_assert_not_reached();
-    }
-}
-
-/*
- * Return the MMU index for a v7M CPU with all relevant information
- * manually specified.
- */
-ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
-                              bool secstate, bool priv, bool negpri);
-
-/* Return the MMU index for a v7M CPU in the specified security and
- * privilege state.
- */
-ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
-                                                bool secstate, bool priv);
-
-/* Return the MMU index for a v7M CPU in the specified security state */
-ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate);
-
 /**
  * cpu_mmu_index:
  * @env: The cpu environment
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                       MMUAccessType access_type, int mmu_idx,
                       bool probe, uintptr_t retaddr);
 
+static inline int arm_to_core_mmu_idx(ARMMMUIdx mmu_idx)
+{
+    return mmu_idx & ARM_MMU_IDX_COREIDX_MASK;
+}
+
+static inline ARMMMUIdx core_to_arm_mmu_idx(CPUARMState *env, int mmu_idx)
+{
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        return mmu_idx | ARM_MMU_IDX_M;
+    } else {
+        return mmu_idx | ARM_MMU_IDX_A;
+    }
+}
+
+int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx);
+
+/*
+ * Return the MMU index for a v7M CPU with all relevant information
+ * manually specified.
+ */
+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+                              bool secstate, bool priv, bool negpri);
+
+/*
+ * Return the MMU index for a v7M CPU in the specified security and
+ * privilege state.
+ */
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                                                bool secstate, bool priv);
+
+/* Return the MMU index for a v7M CPU in the specified security state */
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate);
+
 /* Return true if the stage 1 translation regime is using LPAE format page
  * tables */
 bool arm_s1_regime_using_lpae_format(CPUARMState *env, ARMMMUIdx mmu_idx);
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
     switch (mmu_idx) {
     case ARMMMUIdx_E10_0:
     case ARMMMUIdx_E10_1:
+    case ARMMMUIdx_E20_0:
+    case ARMMMUIdx_E20_2:
     case ARMMMUIdx_Stage1_E0:
     case ARMMMUIdx_Stage1_E1:
     case ARMMMUIdx_E2:
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
 #endif /* !CONFIG_USER_ONLY */
 
 /* Return the exception level which controls this address translation regime */
-static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
+static uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
     switch (mmu_idx) {
+    case ARMMMUIdx_E20_0:
+    case ARMMMUIdx_E20_2:
     case ARMMMUIdx_Stage2:
     case ARMMMUIdx_E2:
         return 2;
@@ -XXX,XX +XXX,XX @@ static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
     case ARMMMUIdx_SE10_1:
     case ARMMMUIdx_Stage1_E0:
     case ARMMMUIdx_Stage1_E1:
+    case ARMMMUIdx_E10_0:
+    case ARMMMUIdx_E10_1:
     case ARMMMUIdx_MPrivNegPri:
     case ARMMMUIdx_MUserNegPri:
     case ARMMMUIdx_MPriv:
@@ -XXX,XX +XXX,XX @@ static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
  */
 static inline ARMMMUIdx stage_1_mmu_idx(ARMMMUIdx mmu_idx)
 {
-    if (mmu_idx == ARMMMUIdx_E10_0 || mmu_idx == ARMMMUIdx_E10_1) {
-        mmu_idx += (ARMMMUIdx_Stage1_E0 - ARMMMUIdx_E10_0);
+    switch (mmu_idx) {
+    case ARMMMUIdx_E10_0:
+        return ARMMMUIdx_Stage1_E0;
+    case ARMMMUIdx_E10_1:
+        return ARMMMUIdx_Stage1_E1;
+    default:
+        return mmu_idx;
     }
-    return mmu_idx;
 }
 
 /* Return true if the translation regime is using LPAE format page tables */
@@ -XXX,XX +XXX,XX @@ static inline bool regime_is_user(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
     switch (mmu_idx) {
     case ARMMMUIdx_SE10_0:
+    case ARMMMUIdx_E20_0:
     case ARMMMUIdx_Stage1_E0:
     case ARMMMUIdx_MUser:
     case ARMMMUIdx_MSUser:
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
     return 0;
 }
 
+/* Return the exception level we're running at if this is our mmu_idx */
+int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
+{
+    if (mmu_idx & ARM_MMU_IDX_M) {
+        return mmu_idx & ARM_MMU_IDX_M_PRIV;
+    }
+
+    switch (mmu_idx) {
+    case ARMMMUIdx_E10_0:
+    case ARMMMUIdx_E20_0:
+    case ARMMMUIdx_SE10_0:
+        return 0;
+    case ARMMMUIdx_E10_1:
+    case ARMMMUIdx_SE10_1:
+        return 1;
+    case ARMMMUIdx_E2:
+    case ARMMMUIdx_E20_2:
+        return 2;
+    case ARMMMUIdx_SE3:
+        return 3;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 #ifndef CONFIG_TCG
 ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
 {
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
         return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
     }
 
-    if (el < 2 && arm_is_secure_below_el3(env)) {
-        return ARMMMUIdx_SE10_0 + el;
-    } else {
-        return ARMMMUIdx_E10_0 + el;
+    switch (el) {
+    case 0:
+        /* TODO: ARMv8.1-VHE */
+        if (arm_is_secure_below_el3(env)) {
+            return ARMMMUIdx_SE10_0;
+        }
+        return ARMMMUIdx_E10_0;
+    case 1:
+        if (arm_is_secure_below_el3(env)) {
+            return ARMMMUIdx_SE10_1;
+        }
+        return ARMMMUIdx_E10_1;
+    case 2:
+        /* TODO: ARMv8.1-VHE */
+        /* TODO: ARMv8.4-SecEL2 */
+        return ARMMMUIdx_E2;
+    case 3:
+        return ARMMMUIdx_SE3;
+    default:
+        g_assert_not_reached();
     }
 }
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline int get_a32_user_mem_index(DisasContext *s)
     case ARMMMUIdx_MSUserNegPri:
     case ARMMMUIdx_MSPrivNegPri:
         return arm_to_core_mmu_idx(ARMMMUIdx_MSUserNegPri);
-    case ARMMMUIdx_Stage2:
     default:
         g_assert_not_reached();
     }
--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This includes AND, ORR, EOR, BIC, ORN, BSL, BIT, BIF.
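
The three bit-select forms differ only in which register supplies the
select mask and which way round the true/false inputs go; in terms of
the do_bitsel() helper added below:

    BSL:  rd = (rn & rd) | (rm & ~rd)     /* mask is rd */
    BIT:  rd = (rn & rm) | (rd & ~rm)     /* mask is rm */
    BIF:  rd = (rd & rm) | (rn & ~rm)     /* mask is rm, inputs swapped */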

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240524232121.284515-37-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 10 +++++
 target/arm/tcg/translate-a64.c | 68 ++++++++++------------------------
 2 files changed, 29 insertions(+), 49 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 @rrr_q1e3   ........ ... rm:5 ...... rn:5 rd:5          &qrrr_e q=1 esz=3
 @rrrr_q1e3  ........ ... rm:5 . ra:5 rn:5 rd:5          &qrrrr_e q=1 esz=3
 
+@qrrr_b     . q:1 ...... ... rm:5 ...... rn:5 rd:5      &qrrr_e esz=0
 @qrrr_h     . q:1 ...... ... rm:5 ...... rn:5 rd:5      &qrrr_e esz=1
 @qrrr_sd    . q:1 ...... ... rm:5 ...... rn:5 rd:5      &qrrr_e esz=%esz_sd
 @qrrr_e     . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5  &qrrr_e
@@ -XXX,XX +XXX,XX @@ SMINP_v         0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
 UMAXP_v         0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e
 UMINP_v         0.10 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e
 
+AND_v           0.00 1110 001 ..... 00011 1 ..... ..... @qrrr_b
+BIC_v           0.00 1110 011 ..... 00011 1 ..... ..... @qrrr_b
+ORR_v           0.00 1110 101 ..... 00011 1 ..... ..... @qrrr_b
+ORN_v           0.00 1110 111 ..... 00011 1 ..... ..... @qrrr_b
+EOR_v           0.10 1110 001 ..... 00011 1 ..... ..... @qrrr_b
+BSL_v           0.10 1110 011 ..... 00011 1 ..... ..... @qrrr_b
+BIT_v           0.10 1110 101 ..... 00011 1 ..... ..... @qrrr_b
+BIF_v           0.10 1110 111 ..... 00011 1 ..... ..... @qrrr_b
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si         0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp)
 TRANS(UMAXP_v, do_gvec_fn3_no64, a, gen_gvec_umaxp)
 TRANS(UMINP_v, do_gvec_fn3_no64, a, gen_gvec_uminp)
 
+TRANS(AND_v, do_gvec_fn3, a, tcg_gen_gvec_and)
+TRANS(BIC_v, do_gvec_fn3, a, tcg_gen_gvec_andc)
+TRANS(ORR_v, do_gvec_fn3, a, tcg_gen_gvec_or)
+TRANS(ORN_v, do_gvec_fn3, a, tcg_gen_gvec_orc)
+TRANS(EOR_v, do_gvec_fn3, a, tcg_gen_gvec_xor)
+
+static bool do_bitsel(DisasContext *s, bool is_q, int d, int a, int b, int c)
+{
+    if (fp_access_check(s)) {
+        gen_gvec_fn4(s, is_q, d, a, b, c, tcg_gen_gvec_bitsel, 0);
+    }
+    return true;
+}
+
+TRANS(BSL_v, do_bitsel, a->q, a->rd, a->rd, a->rn, a->rm)
+TRANS(BIT_v, do_bitsel, a->q, a->rd, a->rm, a->rn, a->rd)
+TRANS(BIF_v, do_bitsel, a->q, a->rd, a->rm, a->rd, a->rn)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Logic op (opcode == 3) subgroup of C3.6.16. */
-static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int rm = extract32(insn, 16, 5);
-    int size = extract32(insn, 22, 2);
-    bool is_u = extract32(insn, 29, 1);
-    bool is_q = extract32(insn, 30, 1);
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    switch (size + 4 * is_u) {
-    case 0: /* AND */
-        gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_and, 0);
-        return;
-    case 1: /* BIC */
-        gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_andc, 0);
-        return;
-    case 2: /* ORR */
-        gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_or, 0);
-        return;
-    case 3: /* ORN */
-        gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_orc, 0);
-        return;
-    case 4: /* EOR */
-        gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_xor, 0);
-        return;
-
-    case 5: /* BSL bitwise select */
-        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
-        return;
-    case 6: /* BIT, bitwise insert if true */
-        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
-        return;
-    case 7: /* BIF, bitwise insert if false */
-        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
-        return;
-
-    default:
-        g_assert_not_reached();
-    }
-}
-
 /* Integer op subgroup of C3.6.16. */
 static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
     int opcode = extract32(insn, 11, 5);
 
     switch (opcode) {
-    case 0x3: /* logic ops */
-        disas_simd_3same_logic(s, insn);
-        break;
     default:
         disas_simd_3same_int(s, insn);
         break;
+    case 0x3: /* logic ops */
     case 0x14: /* SMAXP, UMAXP */
     case 0x15: /* SMINP, UMINP */
     case 0x17: /* ADDP */
--
2.34.1

diff view generated by jsdifflib
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Return the indexes for the EL2&0 regime when the appropriate bits
are set within HCR_EL2.
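
For illustration, the selection follows the ARM pseudo-function
ELIsInHost:

    at EL0: HCR_EL2.E2H == 1 && HCR_EL2.TGE == 1 && EL2 is AArch64
                -> ARMMMUIdx_E20_0
    at EL2: HCR_EL2.E2H == 1 && EL2 is AArch64
                -> ARMMMUIdx_E20_2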

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-22-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
         return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
     }
 
+    /* See ARM pseudo-function ELIsInHost. */
     switch (el) {
     case 0:
-        /* TODO: ARMv8.1-VHE */
         if (arm_is_secure_below_el3(env)) {
             return ARMMMUIdx_SE10_0;
         }
+        if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)
+            && arm_el_is_aa64(env, 2)) {
+            return ARMMMUIdx_E20_0;
+        }
         return ARMMMUIdx_E10_0;
     case 1:
         if (arm_is_secure_below_el3(env)) {
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
         }
         return ARMMMUIdx_E10_1;
     case 2:
-        /* TODO: ARMv8.1-VHE */
         /* TODO: ARMv8.4-SecEL2 */
+        /* Note that TGE does not apply at EL2. */
+        if ((env->cp15.hcr_el2 & HCR_E2H) && arm_el_is_aa64(env, 2)) {
+            return ARMMMUIdx_E20_2;
+        }
         return ARMMMUIdx_E2;
     case 3:
         return ARMMMUIdx_SE3;
--
2.20.1

diff view generated by jsdifflib