1
The following changes since commit 4c41341af76cfc85b5a6c0f87de4838672ab9f89:
1
The following changes since commit 5a67d7735d4162630769ef495cf813244fc850df:
2
2
3
Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20201020' into staging (2020-10-20 11:20:36 +0100)
3
Merge remote-tracking branch 'remotes/berrange-gitlab/tags/tls-deps-pull-request' into staging (2021-07-02 08:22:39 +0100)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20201020
7
https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20210702
8
8
9
for you to fetch changes up to 6358890cb939192f6169fdf7664d903bf9b1d338:
9
for you to fetch changes up to 04ea4d3cfd0a21b248ece8eb7a9436a3d9898dd8:
10
10
11
tests/tcg/aarch64: Add bti smoke tests (2020-10-20 16:12:02 +0100)
11
target/arm: Implement MVE shifts by register (2021-07-02 11:48:38 +0100)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
target-arm queue:
14
target-arm queue:
15
* Fix AArch32 SMLAD incorrect setting of Q bit
15
* more MVE instructions
16
* AArch32 VCVT fixed-point to float is always round-to-nearest
16
* hw/gpio/gpio_pwr: use shutdown function for reboot
17
* strongarm: Fix 'time to transmit a char' unit comment
17
* target/arm: Check NaN mode before silencing NaN
18
* Restrict APEI tables generation to the 'virt' machine
18
* tests: Boot and halt a Linux guest on the Raspberry Pi 2 machine
19
* bcm2835: minor code cleanups
19
* hw/arm: Add basic power management to raspi.
20
* correctly flush TLBs when TBI is enabled
20
* docs/system/arm: Add quanta-gbs-bmc, quanta-q7l1-bmc
21
* tests/qtest: Add npcm7xx timer test
22
* loads-stores.rst: add footnote that clarifies GETPC usage
23
* Fix reported EL for mte_check_fail
24
* Ignore HCR_EL2.ATA when {E2H,TGE} != 11
25
* microbit_i2c: Fix coredump when dump-vmstate
26
* nseries: Fix loading kernel image on n8x0 machines
27
* Implement v8.1M low-overhead-loops
28
* linux-user: Support AArch64 BTI
29
21
30
----------------------------------------------------------------
22
----------------------------------------------------------------
31
Emanuele Giuseppe Esposito (1):
23
Joe Komlodi (1):
32
loads-stores.rst: add footnote that clarifies GETPC usage
24
target/arm: Check NaN mode before silencing NaN
33
25
34
Havard Skinnemoen (1):
26
Maxim Uvarov (1):
35
tests/qtest: Add npcm7xx timer test
27
hw/gpio/gpio_pwr: use shutdown function for reboot
36
28
37
Peng Liang (1):
29
Nolan Leake (1):
38
microbit_i2c: Fix coredump when dump-vmstate
30
hw/arm: Add basic power management to raspi.
39
31
40
Peter Maydell (12):
32
Patrick Venture (2):
41
target/arm: Fix SMLAD incorrect setting of Q bit
33
docs/system/arm: Add quanta-q7l1-bmc reference
42
target/arm: AArch32 VCVT fixed-point to float is always round-to-nearest
34
docs/system/arm: Add quanta-gbs-bmc reference
43
decodetree: Fix codegen for non-overlapping group inside overlapping group
44
target/arm: Implement v8.1M NOCP handling
45
target/arm: Implement v8.1M conditional-select insns
46
target/arm: Make the t32 insn[25:23]=111 group non-overlapping
47
target/arm: Don't allow BLX imm for M-profile
48
target/arm: Implement v8.1M branch-future insns (as NOPs)
49
target/arm: Implement v8.1M low-overhead-loop instructions
50
target/arm: Fix has_vfp/has_neon ID reg squashing for M-profile
51
target/arm: Allow M-profile CPUs with FP16 to set FPSCR.FP16
52
target/arm: Implement FPSCR.LTPSIZE for M-profile LOB extension
53
35
54
Philippe Mathieu-Daudé (10):
36
Peter Maydell (18):
55
hw/arm/strongarm: Fix 'time to transmit a char' unit comment
37
target/arm: Fix MVE widening/narrowing VLDR/VSTR offset calculation
56
hw/arm: Restrict APEI tables generation to the 'virt' machine
38
target/arm: Fix bugs in MVE VRMLALDAVH, VRMLSLDAVH
57
hw/timer/bcm2835: Introduce BCM2835_SYSTIMER_COUNT definition
39
target/arm: Make asimd_imm_const() public
58
hw/timer/bcm2835: Rename variable holding CTRL_STATUS register
40
target/arm: Use asimd_imm_const for A64 decode
59
hw/timer/bcm2835: Support the timer COMPARE registers
41
target/arm: Use dup_const() instead of bitfield_replicate()
60
hw/arm/bcm2835_peripherals: Correctly wire the SYS_timer IRQs
42
target/arm: Implement MVE logical immediate insns
61
hw/intc/bcm2835_ic: Trace GPU/CPU IRQ handlers
43
target/arm: Implement MVE vector shift left by immediate insns
62
hw/intc/bcm2836_control: Use IRQ definitions instead of magic numbers
44
target/arm: Implement MVE vector shift right by immediate insns
63
hw/arm/nseries: Fix loading kernel image on n8x0 machines
45
target/arm: Implement MVE VSHLL
64
linux-user/elfload: Avoid leaking interp_name using GLib memory API
46
target/arm: Implement MVE VSRI, VSLI
47
target/arm: Implement MVE VSHRN, VRSHRN
48
target/arm: Implement MVE saturating narrowing shifts
49
target/arm: Implement MVE VSHLC
50
target/arm: Implement MVE VADDLV
51
target/arm: Implement MVE long shifts by immediate
52
target/arm: Implement MVE long shifts by register
53
target/arm: Implement MVE shifts by immediate
54
target/arm: Implement MVE shifts by register
65
55
66
Richard Henderson (16):
56
Philippe Mathieu-Daudé (1):
67
accel/tcg: Add tlb_flush_page_bits_by_mmuidx*
57
tests: Boot and halt a Linux guest on the Raspberry Pi 2 machine
68
target/arm: Use tlb_flush_page_bits_by_mmuidx*
69
target/arm: Remove redundant mmu_idx lookup
70
target/arm: Fix reported EL for mte_check_fail
71
target/arm: Ignore HCR_EL2.ATA when {E2H,TGE} != 11
72
linux-user/aarch64: Reset btype for signals
73
linux-user: Set PAGE_TARGET_1 for TARGET_PROT_BTI
74
include/elf: Add defines related to GNU property notes for AArch64
75
linux-user/elfload: Fix coding style in load_elf_image
76
linux-user/elfload: Adjust iteration over phdr
77
linux-user/elfload: Move PT_INTERP detection to first loop
78
linux-user/elfload: Use Error for load_elf_image
79
linux-user/elfload: Use Error for load_elf_interp
80
linux-user/elfload: Parse NT_GNU_PROPERTY_TYPE_0 notes
81
linux-user/elfload: Parse GNU_PROPERTY_AARCH64_FEATURE_1_AND
82
tests/tcg/aarch64: Add bti smoke tests
83
58
84
docs/devel/loads-stores.rst | 8 +-
59
docs/system/arm/aspeed.rst | 1 +
85
default-configs/devices/arm-softmmu.mak | 1 -
60
docs/system/arm/nuvoton.rst | 5 +-
86
include/elf.h | 22 ++
61
include/hw/arm/bcm2835_peripherals.h | 3 +-
87
include/exec/cpu-all.h | 2 +
62
include/hw/misc/bcm2835_powermgt.h | 29 ++
88
include/exec/exec-all.h | 36 ++
63
target/arm/helper-mve.h | 108 +++++++
89
include/hw/timer/bcm2835_systmr.h | 17 +-
64
target/arm/translate.h | 41 +++
90
linux-user/qemu.h | 4 +
65
target/arm/mve.decode | 177 ++++++++++-
91
linux-user/syscall_defs.h | 4 +
66
target/arm/t32.decode | 71 ++++-
92
target/arm/cpu.h | 13 +
67
hw/arm/bcm2835_peripherals.c | 13 +-
93
target/arm/helper.h | 13 +
68
hw/gpio/gpio_pwr.c | 2 +-
94
target/arm/internals.h | 9 +-
69
hw/misc/bcm2835_powermgt.c | 160 ++++++++++
95
target/arm/m-nocp.decode | 10 +-
70
target/arm/helper-a64.c | 12 +-
96
target/arm/t32.decode | 50 ++-
71
target/arm/mve_helper.c | 524 +++++++++++++++++++++++++++++++--
97
accel/tcg/cputlb.c | 275 +++++++++++++++-
72
target/arm/translate-a64.c | 86 +-----
98
hw/arm/bcm2835_peripherals.c | 13 +-
73
target/arm/translate-mve.c | 261 +++++++++++++++-
99
hw/arm/nseries.c | 1 +
74
target/arm/translate-neon.c | 81 -----
100
hw/arm/strongarm.c | 2 +-
75
target/arm/translate.c | 327 +++++++++++++++++++-
101
hw/i2c/microbit_i2c.c | 1 +
76
target/arm/vfp_helper.c | 24 +-
102
hw/intc/bcm2835_ic.c | 4 +-
77
hw/misc/meson.build | 1 +
103
hw/intc/bcm2836_control.c | 8 +-
78
tests/acceptance/boot_linux_console.py | 43 +++
104
hw/timer/bcm2835_systmr.c | 57 ++--
79
20 files changed, 1760 insertions(+), 209 deletions(-)
105
linux-user/aarch64/signal.c | 10 +-
80
create mode 100644 include/hw/misc/bcm2835_powermgt.h
106
linux-user/elfload.c | 326 ++++++++++++++----
81
create mode 100644 hw/misc/bcm2835_powermgt.c
107
linux-user/mmap.c | 16 +
108
target/arm/cpu.c | 38 ++-
109
target/arm/helper.c | 55 +++-
110
target/arm/mte_helper.c | 13 +-
111
target/arm/translate-a64.c | 6 +-
112
target/arm/translate.c | 239 +++++++++++++-
113
target/arm/vfp_helper.c | 76 +++--
114
tests/qtest/npcm7xx_timer-test.c | 562 ++++++++++++++++++++++++++++++++
115
tests/tcg/aarch64/bti-1.c | 62 ++++
116
tests/tcg/aarch64/bti-2.c | 108 ++++++
117
tests/tcg/aarch64/bti-crt.inc.c | 51 +++
118
hw/arm/Kconfig | 1 +
119
hw/intc/trace-events | 4 +
120
hw/timer/trace-events | 6 +-
121
scripts/decodetree.py | 2 +-
122
target/arm/translate-vfp.c.inc | 41 ++-
123
tests/qtest/meson.build | 1 +
124
tests/tcg/aarch64/Makefile.target | 10 +
125
tests/tcg/configure.sh | 4 +
126
42 files changed, 1973 insertions(+), 208 deletions(-)
127
create mode 100644 tests/qtest/npcm7xx_timer-test.c
128
create mode 100644 tests/tcg/aarch64/bti-1.c
129
create mode 100644 tests/tcg/aarch64/bti-2.c
130
create mode 100644 tests/tcg/aarch64/bti-crt.inc.c
131
82
1
From: Peng Liang <liangpeng10@huawei.com>
1
From: Patrick Venture <venture@google.com>
2
2
3
VMStateDescription.fields should end with VMSTATE_END_OF_LIST().
3
Adds a line-item reference to the supported quanta-q71l-bmc aspeed
4
However, microbit_i2c_vmstate doesn't follow it. Let's change it.
4
entry.
5
5
6
Fixes: 9d68bf564e ("arm: Stub out NRF51 TWI magnetometer/accelerometer detection")
6
Signed-off-by: Patrick Venture <venture@google.com>
7
Reported-by: Euler Robot <euler.robot@huawei.com>
7
Reviewed-by: Cédric Le Goater <clg@kaod.org>
8
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
8
Message-id: 20210615192848.1065297-2-venture@google.com
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
Message-id: 20201019093401.2993833-1-liangpeng10@huawei.com
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
10
---
13
hw/i2c/microbit_i2c.c | 1 +
11
docs/system/arm/aspeed.rst | 1 +
14
1 file changed, 1 insertion(+)
12
1 file changed, 1 insertion(+)
15
13
16
diff --git a/hw/i2c/microbit_i2c.c b/hw/i2c/microbit_i2c.c
14
diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
17
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/i2c/microbit_i2c.c
16
--- a/docs/system/arm/aspeed.rst
19
+++ b/hw/i2c/microbit_i2c.c
17
+++ b/docs/system/arm/aspeed.rst
20
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription microbit_i2c_vmstate = {
18
@@ -XXX,XX +XXX,XX @@ etc.
21
.fields = (VMStateField[]) {
19
AST2400 SoC based machines :
22
VMSTATE_UINT32_ARRAY(regs, MicrobitI2CState, MICROBIT_I2C_NREGS),
20
23
VMSTATE_UINT32(read_idx, MicrobitI2CState),
21
- ``palmetto-bmc`` OpenPOWER Palmetto POWER8 BMC
24
+ VMSTATE_END_OF_LIST()
22
+- ``quanta-q71l-bmc`` OpenBMC Quanta BMC
25
},
23
26
};
24
AST2500 SoC based machines :
27
25
28
--
26
--
29
2.20.1
27
2.20.1
30
28
31
29
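The microbit_i2c fix above hinges on the field array being sentinel-terminated: code that walks the list has no length to rely on and stops only at the end marker. A minimal, self-contained sketch of that pattern follows. `Field`, `FIELD_END`, `count_fields`, `total_size` and `example_fields` are hypothetical names chosen for illustration; they are not QEMU's VMState API, and the sizes are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for a field descriptor; not QEMU's VMStateField. */
typedef struct Field {
    const char *name;   /* NULL marks the end of the list */
    size_t size;        /* size of the field in bytes */
} Field;

/* Hypothetical end marker, playing the role VMSTATE_END_OF_LIST() plays. */
#define FIELD_END { NULL, 0 }

/* Count entries by walking until the sentinel; no length is passed in. */
static size_t count_fields(const Field *fields)
{
    size_t n = 0;
    while (fields[n].name != NULL) {
        n++;
    }
    return n;
}

/* Sum the sizes of all fields up to, but excluding, the sentinel. */
static size_t total_size(const Field *fields)
{
    size_t total = 0;
    for (const Field *f = fields; f->name != NULL; f++) {
        total += f->size;
    }
    return total;
}

/* Illustrative list: an 8-entry array of 32-bit registers plus one index. */
static const Field example_fields[] = {
    { "regs", 4 * 8 },
    { "read_idx", 4 },
    FIELD_END           /* without this, the walks above read past the array */
};
```

Calling `count_fields(example_fields)` visits the two real entries and stops at the marker. If the marker were missing, both loops would keep reading whatever memory follows the array, which is the kind of failure the one-line patch guards against.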
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Patrick Venture <venture@google.com>
2
2
3
Use the new generic support for NT_GNU_PROPERTY_TYPE_0.
3
Add line item reference to quanta-gbs-bmc machine.
4
4
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Patrick Venture <venture@google.com>
6
Message-id: 20201016184207.786698-12-richard.henderson@linaro.org
6
Reviewed-by: Cédric Le Goater <clg@kaod.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Message-id: 20210615192848.1065297-3-venture@google.com
8
[PMM: fixed underline Sphinx warning]
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
---
10
linux-user/elfload.c | 48 ++++++++++++++++++++++++++++++++++++++++++--
11
docs/system/arm/nuvoton.rst | 5 +++--
11
1 file changed, 46 insertions(+), 2 deletions(-)
12
1 file changed, 3 insertions(+), 2 deletions(-)
12
13
13
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
14
diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst
14
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
15
--- a/linux-user/elfload.c
16
--- a/docs/system/arm/nuvoton.rst
16
+++ b/linux-user/elfload.c
17
+++ b/docs/system/arm/nuvoton.rst
17
@@ -XXX,XX +XXX,XX @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
18
@@ -XXX,XX +XXX,XX @@
18
19
-Nuvoton iBMC boards (``npcm750-evb``, ``quanta-gsj``)
19
#include "elf.h"
20
-=====================================================
20
21
+Nuvoton iBMC boards (``*-bmc``, ``npcm750-evb``, ``quanta-gsj``)
21
+/* We must delay the following stanzas until after "elf.h". */
22
+================================================================
22
+#if defined(TARGET_AARCH64)
23
23
+
24
The `Nuvoton iBMC`_ chips (NPCM7xx) are a family of ARM-based SoCs that are
24
+static bool arch_parse_elf_property(uint32_t pr_type, uint32_t pr_datasz,
25
designed to be used as Baseboard Management Controllers (BMCs) in various
25
+ const uint32_t *data,
26
@@ -XXX,XX +XXX,XX @@ segment. The following machines are based on this chip :
26
+ struct image_info *info,
27
The NPCM730 SoC has two Cortex-A9 cores and is targeted for Data Center and
27
+ Error **errp)
28
Hyperscale applications. The following machines are based on this chip :
28
+{
29
29
+ if (pr_type == GNU_PROPERTY_AARCH64_FEATURE_1_AND) {
30
+- ``quanta-gbs-bmc`` Quanta GBS server BMC
30
+ if (pr_datasz != sizeof(uint32_t)) {
31
- ``quanta-gsj`` Quanta GSJ server BMC
31
+ error_setg(errp, "Ill-formed GNU_PROPERTY_AARCH64_FEATURE_1_AND");
32
32
+ return false;
33
There are also two more SoCs, NPCM710 and NPCM705, which are single-core
33
+ }
34
+ /* We will extract GNU_PROPERTY_AARCH64_FEATURE_1_BTI later. */
35
+ info->note_flags = *data;
36
+ }
37
+ return true;
38
+}
39
+#define ARCH_USE_GNU_PROPERTY 1
40
+
41
+#else
42
+
43
static bool arch_parse_elf_property(uint32_t pr_type, uint32_t pr_datasz,
44
const uint32_t *data,
45
struct image_info *info,
46
@@ -XXX,XX +XXX,XX @@ static bool arch_parse_elf_property(uint32_t pr_type, uint32_t pr_datasz,
47
}
48
#define ARCH_USE_GNU_PROPERTY 0
49
50
+#endif
51
+
52
struct exec
53
{
54
unsigned int a_info; /* Use macros N_MAGIC, etc for access */
55
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
56
struct elfhdr *ehdr = (struct elfhdr *)bprm_buf;
57
struct elf_phdr *phdr;
58
abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
59
- int i, retval;
60
+ int i, retval, prot_exec;
61
Error *err = NULL;
62
63
/* First of all, some simple consistency checks */
64
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
65
info->brk = 0;
66
info->elf_flags = ehdr->e_flags;
67
68
+ prot_exec = PROT_EXEC;
69
+#ifdef TARGET_AARCH64
70
+ /*
71
+ * If the BTI feature is present, this indicates that the executable
72
+ * pages of the startup binary should be mapped with PROT_BTI, so that
73
+ * branch targets are enforced.
74
+ *
75
+ * The startup binary is either the interpreter or the static executable.
76
+ * The interpreter is responsible for all pages of a dynamic executable.
77
+ *
78
+ * Elf notes are backward compatible to older cpus.
79
+ * Do not enable BTI unless it is supported.
80
+ */
81
+ if ((info->note_flags & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
82
+ && (pinterp_name == NULL || *pinterp_name == 0)
83
+ && cpu_isar_feature(aa64_bti, ARM_CPU(thread_cpu))) {
84
+ prot_exec |= TARGET_PROT_BTI;
85
+ }
86
+#endif
87
+
88
for (i = 0; i < ehdr->e_phnum; i++) {
89
struct elf_phdr *eppnt = phdr + i;
90
if (eppnt->p_type == PT_LOAD) {
91
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
92
elf_prot |= PROT_WRITE;
93
}
94
if (eppnt->p_flags & PF_X) {
95
- elf_prot |= PROT_EXEC;
96
+ elf_prot |= prot_exec;
97
}
98
99
vaddr = load_bias + eppnt->p_vaddr;
100
--
34
--
101
2.20.1
35
2.20.1
102
36
103
37
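The elfload.c change above does two things: it records the feature word from a GNU_PROPERTY_AARCH64_FEATURE_1_AND property, and it later adds the BTI protection bit to executable mappings only when three conditions hold. The sketch below restates that logic outside QEMU so it can be compiled on its own. All `sk_`/`SK_` names are hypothetical, the interpreter check is simplified to a plain string (the patch tests a `char **`), and the CPU feature test is reduced to a boolean parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Property type and feature bit as defined by the AArch64 ELF ABI. */
#define SK_GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc0000000u
#define SK_GNU_PROPERTY_AARCH64_FEATURE_1_BTI (1u << 0)

/* Linux mmap protection bits; the BTI bit is AArch64-specific. */
#define SK_PROT_EXEC 0x4
#define SK_PROT_BTI  0x10

/*
 * Record the feature word of one property. Returns false for an
 * ill-formed FEATURE_1_AND entry (data size other than 4 bytes).
 * Other property types are ignored and leave *note_flags untouched.
 */
static bool sk_parse_property(uint32_t pr_type, uint32_t pr_datasz,
                              const uint32_t *data, uint32_t *note_flags)
{
    if (pr_type == SK_GNU_PROPERTY_AARCH64_FEATURE_1_AND) {
        if (pr_datasz != sizeof(uint32_t)) {
            return false;
        }
        *note_flags = *data;
    }
    return true;
}

/*
 * Choose the protection for executable segments. BTI is added only when
 * the note requests it, the image is the startup binary (no interpreter
 * named), and the CPU implements BTI.
 */
static int sk_exec_prot(uint32_t note_flags, const char *interp_name,
                        bool cpu_has_bti)
{
    int prot = SK_PROT_EXEC;
    bool is_startup = (interp_name == NULL || *interp_name == '\0');

    if ((note_flags & SK_GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
        && is_startup && cpu_has_bti) {
        prot |= SK_PROT_BTI;
    }
    return prot;
}
```

For a static executable carrying the BTI note on a BTI-capable CPU, `sk_exec_prot(1, NULL, true)` yields the exec and BTI bits together. Naming an interpreter, or passing `false` for the CPU feature, falls back to plain exec, mirroring the comment in the patch that the interpreter is responsible for the pages of a dynamic executable.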
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Nolan Leake <nolan@sigbus.net>
2
2
3
The note test requires gcc 10 for -mbranch-protection=standard.
3
This is just enough to make reboot and poweroff work. Works for
4
The mmap test uses PROT_BTI and does not require special compiler support.
4
Linux, U-Boot, and Arm Trusted Firmware. Not tested, but should
5
5
work for Plan 9 and bare-metal/hobby OSes, since they generally seem to
6
Acked-by: Alex Bennée <alex.bennee@linaro.org>
6
do what Linux does for reset.
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
The watchdog timer functionality is not yet implemented.
9
Message-id: 20201016184207.786698-13-richard.henderson@linaro.org
9
10
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/64
11
Signed-off-by: Nolan Leake <nolan@sigbus.net>
12
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
13
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
14
Message-id: 20210625210209.1870217-1-nolan@sigbus.net
15
[PMM: tweaked commit title; fixed region size to 0x200;
16
moved header file to include/]
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
18
---
12
tests/tcg/aarch64/bti-1.c | 62 +++++++++++++++++
19
include/hw/arm/bcm2835_peripherals.h | 3 +-
13
tests/tcg/aarch64/bti-2.c | 108 ++++++++++++++++++++++++++++++
20
include/hw/misc/bcm2835_powermgt.h | 29 +++++
14
tests/tcg/aarch64/bti-crt.inc.c | 51 ++++++++++++++
21
hw/arm/bcm2835_peripherals.c | 13 ++-
15
tests/tcg/aarch64/Makefile.target | 10 +++
22
hw/misc/bcm2835_powermgt.c | 160 +++++++++++++++++++++++++++
16
tests/tcg/configure.sh | 4 ++
23
hw/misc/meson.build | 1 +
17
5 files changed, 235 insertions(+)
24
5 files changed, 204 insertions(+), 2 deletions(-)
18
create mode 100644 tests/tcg/aarch64/bti-1.c
25
create mode 100644 include/hw/misc/bcm2835_powermgt.h
19
create mode 100644 tests/tcg/aarch64/bti-2.c
26
create mode 100644 hw/misc/bcm2835_powermgt.c
20
create mode 100644 tests/tcg/aarch64/bti-crt.inc.c
27
21
28
diff --git a/include/hw/arm/bcm2835_peripherals.h b/include/hw/arm/bcm2835_peripherals.h
22
diff --git a/tests/tcg/aarch64/bti-1.c b/tests/tcg/aarch64/bti-1.c
29
index XXXXXXX..XXXXXXX 100644
30
--- a/include/hw/arm/bcm2835_peripherals.h
31
+++ b/include/hw/arm/bcm2835_peripherals.h
32
@@ -XXX,XX +XXX,XX @@
33
#include "hw/misc/bcm2835_mphi.h"
34
#include "hw/misc/bcm2835_thermal.h"
35
#include "hw/misc/bcm2835_cprman.h"
36
+#include "hw/misc/bcm2835_powermgt.h"
37
#include "hw/sd/sdhci.h"
38
#include "hw/sd/bcm2835_sdhost.h"
39
#include "hw/gpio/bcm2835_gpio.h"
40
@@ -XXX,XX +XXX,XX @@ struct BCM2835PeripheralState {
41
BCM2835MphiState mphi;
42
UnimplementedDeviceState txp;
43
UnimplementedDeviceState armtmr;
44
- UnimplementedDeviceState powermgt;
45
+ BCM2835PowerMgtState powermgt;
46
BCM2835CprmanState cprman;
47
PL011State uart0;
48
BCM2835AuxState aux;
49
diff --git a/include/hw/misc/bcm2835_powermgt.h b/include/hw/misc/bcm2835_powermgt.h
23
new file mode 100644
50
new file mode 100644
24
index XXXXXXX..XXXXXXX
51
index XXXXXXX..XXXXXXX
25
--- /dev/null
52
--- /dev/null
26
+++ b/tests/tcg/aarch64/bti-1.c
53
+++ b/include/hw/misc/bcm2835_powermgt.h
27
@@ -XXX,XX +XXX,XX @@
54
@@ -XXX,XX +XXX,XX @@
28
+/*
55
+/*
29
+ * Branch target identification, basic notskip cases.
56
+ * BCM2835 Power Management emulation
57
+ *
58
+ * Copyright (C) 2017 Marcin Chojnacki <marcinch7@gmail.com>
59
+ * Copyright (C) 2021 Nolan Leake <nolan@sigbus.net>
60
+ *
61
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
62
+ * See the COPYING file in the top-level directory.
30
+ */
63
+ */
31
+
64
+
32
+#include "bti-crt.inc.c"
65
+#ifndef BCM2835_POWERMGT_H
33
+
66
+#define BCM2835_POWERMGT_H
34
+static void skip2_sigill(int sig, siginfo_t *info, ucontext_t *uc)
67
+
35
+{
68
+#include "hw/sysbus.h"
36
+ uc->uc_mcontext.pc += 8;
69
+#include "qom/object.h"
37
+ uc->uc_mcontext.pstate = 1;
70
+
38
+}
71
+#define TYPE_BCM2835_POWERMGT "bcm2835-powermgt"
39
+
72
+OBJECT_DECLARE_SIMPLE_TYPE(BCM2835PowerMgtState, BCM2835_POWERMGT)
40
+#define NOP "nop"
73
+
41
+#define BTI_N "hint #32"
74
+struct BCM2835PowerMgtState {
42
+#define BTI_C "hint #34"
75
+ SysBusDevice busdev;
43
+#define BTI_J "hint #36"
76
+ MemoryRegion iomem;
44
+#define BTI_JC "hint #38"
77
+
45
+
78
+ uint32_t rstc;
46
+#define BTYPE_1(DEST) \
79
+ uint32_t rsts;
47
+ asm("mov %0,#1; adr x16, 1f; br x16; 1: " DEST "; mov %0,#0" \
80
+ uint32_t wdog;
48
+ : "=r"(skipped) : : "x16")
81
+};
49
+
82
+
50
+#define BTYPE_2(DEST) \
83
+#endif
51
+ asm("mov %0,#1; adr x16, 1f; blr x16; 1: " DEST "; mov %0,#0" \
84
diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
52
+ : "=r"(skipped) : : "x16", "x30")
85
index XXXXXXX..XXXXXXX 100644
53
+
86
--- a/hw/arm/bcm2835_peripherals.c
54
+#define BTYPE_3(DEST) \
87
+++ b/hw/arm/bcm2835_peripherals.c
55
+ asm("mov %0,#1; adr x15, 1f; br x15; 1: " DEST "; mov %0,#0" \
88
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_init(Object *obj)
56
+ : "=r"(skipped) : : "x15")
89
57
+
90
object_property_add_const_link(OBJECT(&s->dwc2), "dma-mr",
58
+#define TEST(WHICH, DEST, EXPECT) \
91
OBJECT(&s->gpu_bus_mr));
59
+ do { WHICH(DEST); fail += skipped ^ EXPECT; } while (0)
92
+
60
+
93
+ /* Power Management */
61
+
94
+ object_initialize_child(obj, "powermgt", &s->powermgt,
62
+int main()
95
+ TYPE_BCM2835_POWERMGT);
63
+{
96
}
64
+ int fail = 0;
97
65
+ int skipped;
98
static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
66
+
99
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
67
+ /* Signal-like with SA_SIGINFO. */
100
qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
68
+ signal_info(SIGILL, skip2_sigill);
101
INTERRUPT_USB));
69
+
102
70
+ TEST(BTYPE_1, NOP, 1);
103
+ /* Power Management */
71
+ TEST(BTYPE_1, BTI_N, 1);
104
+ if (!sysbus_realize(SYS_BUS_DEVICE(&s->powermgt), errp)) {
72
+ TEST(BTYPE_1, BTI_C, 0);
105
+ return;
73
+ TEST(BTYPE_1, BTI_J, 0);
106
+ }
74
+ TEST(BTYPE_1, BTI_JC, 0);
107
+
75
+
108
+ memory_region_add_subregion(&s->peri_mr, PM_OFFSET,
76
+ TEST(BTYPE_2, NOP, 1);
109
+ sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->powermgt), 0));
77
+ TEST(BTYPE_2, BTI_N, 1);
110
+
78
+ TEST(BTYPE_2, BTI_C, 0);
111
create_unimp(s, &s->txp, "bcm2835-txp", TXP_OFFSET, 0x1000);
79
+ TEST(BTYPE_2, BTI_J, 1);
112
create_unimp(s, &s->armtmr, "bcm2835-sp804", ARMCTRL_TIMER0_1_OFFSET, 0x40);
80
+ TEST(BTYPE_2, BTI_JC, 0);
113
- create_unimp(s, &s->powermgt, "bcm2835-powermgt", PM_OFFSET, 0x114);
81
+
114
create_unimp(s, &s->i2s, "bcm2835-i2s", I2S_OFFSET, 0x100);
82
+ TEST(BTYPE_3, NOP, 1);
115
create_unimp(s, &s->smi, "bcm2835-smi", SMI_OFFSET, 0x100);
83
+ TEST(BTYPE_3, BTI_N, 1);
116
create_unimp(s, &s->spi[0], "bcm2835-spi0", SPI0_OFFSET, 0x20);
84
+ TEST(BTYPE_3, BTI_C, 1);
117
diff --git a/hw/misc/bcm2835_powermgt.c b/hw/misc/bcm2835_powermgt.c
85
+ TEST(BTYPE_3, BTI_J, 0);
86
+ TEST(BTYPE_3, BTI_JC, 0);
87
+
88
+ return fail;
89
+}
90
diff --git a/tests/tcg/aarch64/bti-2.c b/tests/tcg/aarch64/bti-2.c
91
new file mode 100644
118
new file mode 100644
92
index XXXXXXX..XXXXXXX
119
index XXXXXXX..XXXXXXX
93
--- /dev/null
120
--- /dev/null
94
+++ b/tests/tcg/aarch64/bti-2.c
121
+++ b/hw/misc/bcm2835_powermgt.c
95
@@ -XXX,XX +XXX,XX @@
122
@@ -XXX,XX +XXX,XX @@
96
+/*
123
+/*
97
+ * Branch target identification, basic notskip cases.
124
+ * BCM2835 Power Management emulation
125
+ *
126
+ * Copyright (C) 2017 Marcin Chojnacki <marcinch7@gmail.com>
127
+ * Copyright (C) 2021 Nolan Leake <nolan@sigbus.net>
128
+ *
129
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
130
+ * See the COPYING file in the top-level directory.
98
+ */
131
+ */
99
+
132
+
100
+#include <stdio.h>
133
+#include "qemu/osdep.h"
101
+#include <signal.h>
134
+#include "qemu/log.h"
102
+#include <string.h>
135
+#include "qemu/module.h"
103
+#include <unistd.h>
136
+#include "hw/misc/bcm2835_powermgt.h"
104
+#include <sys/mman.h>
137
+#include "migration/vmstate.h"
105
+
138
+#include "sysemu/runstate.h"
106
+#ifndef PROT_BTI
139
+
107
+#define PROT_BTI 0x10
140
+#define PASSWORD 0x5a000000
108
+#endif
141
+#define PASSWORD_MASK 0xff000000
109
+
142
+
110
+static void skip2_sigill(int sig, siginfo_t *info, void *vuc)
143
+#define R_RSTC 0x1c
111
+{
144
+#define V_RSTC_RESET 0x20
112
+ ucontext_t *uc = vuc;
145
+#define R_RSTS 0x20
113
+ uc->uc_mcontext.pc += 8;
146
+#define V_RSTS_POWEROFF 0x555 /* Linux uses partition 63 to indicate halt. */
114
+ uc->uc_mcontext.pstate = 1;
147
+#define R_WDOG 0x24
115
+}
148
+
116
+
149
+static uint64_t bcm2835_powermgt_read(void *opaque, hwaddr offset,
117
+#define NOP "nop"
150
+ unsigned size)
118
+#define BTI_N "hint #32"
151
+{
119
+#define BTI_C "hint #34"
152
+ BCM2835PowerMgtState *s = (BCM2835PowerMgtState *)opaque;
120
+#define BTI_J "hint #36"
153
+ uint32_t res = 0;
121
+#define BTI_JC "hint #38"
154
+
122
+
155
+ switch (offset) {
123
+#define BTYPE_1(DEST) \
156
+ case R_RSTC:
124
+ "mov x1, #1\n\t" \
157
+ res = s->rstc;
125
+ "adr x16, 1f\n\t" \
158
+ break;
126
+ "br x16\n" \
159
+ case R_RSTS:
127
+"1: " DEST "\n\t" \
160
+ res = s->rsts;
128
+ "mov x1, #0"
161
+ break;
129
+
162
+ case R_WDOG:
130
+#define BTYPE_2(DEST) \
163
+ res = s->wdog;
131
+ "mov x1, #1\n\t" \
164
+ break;
132
+ "adr x16, 1f\n\t" \
165
+
133
+ "blr x16\n" \
166
+ default:
134
+"1: " DEST "\n\t" \
167
+ qemu_log_mask(LOG_UNIMP,
135
+ "mov x1, #0"
168
+ "bcm2835_powermgt_read: Unknown offset 0x%08"HWADDR_PRIx
136
+
169
+ "\n", offset);
137
+#define BTYPE_3(DEST) \
170
+ res = 0;
138
+ "mov x1, #1\n\t" \
171
+ break;
139
+ "adr x15, 1f\n\t" \
172
+ }
140
+ "br x15\n" \
173
+
141
+"1: " DEST "\n\t" \
174
+ return res;
142
+ "mov x1, #0"
175
+}
143
+
176
+
144
+#define TEST(WHICH, DEST, EXPECT) \
177
+static void bcm2835_powermgt_write(void *opaque, hwaddr offset,
145
+ WHICH(DEST) "\n" \
178
+ uint64_t value, unsigned size)
146
+ ".if " #EXPECT "\n\t" \
179
+{
147
+ "eor x1, x1," #EXPECT "\n" \
180
+ BCM2835PowerMgtState *s = (BCM2835PowerMgtState *)opaque;
148
+ ".endif\n\t" \
181
+
149
+ "add x0, x0, x1\n\t"
182
+ if ((value & PASSWORD_MASK) != PASSWORD) {
150
+
183
+ qemu_log_mask(LOG_GUEST_ERROR,
151
+extern char test_begin[], test_end[];
184
+ "bcm2835_powermgt_write: Bad password 0x%"PRIx64
152
+
185
+ " at offset 0x%08"HWADDR_PRIx"\n",
153
+asm("\n"
186
+ value, offset);
154
+"test_begin:\n\t"
187
+ return;
155
+ BTI_C "\n\t"
188
+ }
156
+ "mov x2, x30\n\t"
189
+
157
+ "mov x0, #0\n\t"
190
+ value = value & ~PASSWORD_MASK;
158
+
191
+
159
+ TEST(BTYPE_1, NOP, 1)
192
+ switch (offset) {
160
+ TEST(BTYPE_1, BTI_N, 1)
193
+ case R_RSTC:
161
+ TEST(BTYPE_1, BTI_C, 0)
194
+ s->rstc = value;
162
+ TEST(BTYPE_1, BTI_J, 0)
195
+ if (value & V_RSTC_RESET) {
163
+ TEST(BTYPE_1, BTI_JC, 0)
196
+ if ((s->rsts & 0xfff) == V_RSTS_POWEROFF) {
164
+
197
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
165
+ TEST(BTYPE_2, NOP, 1)
198
+ } else {
166
+ TEST(BTYPE_2, BTI_N, 1)
199
+ qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
167
+ TEST(BTYPE_2, BTI_C, 0)
200
+ }
168
+ TEST(BTYPE_2, BTI_J, 1)
201
+ }
169
+ TEST(BTYPE_2, BTI_JC, 0)
202
+ break;
170
+
203
+ case R_RSTS:
171
+ TEST(BTYPE_3, NOP, 1)
204
+ qemu_log_mask(LOG_UNIMP,
172
+ TEST(BTYPE_3, BTI_N, 1)
205
+ "bcm2835_powermgt_write: RSTS\n");
173
+ TEST(BTYPE_3, BTI_C, 1)
206
+ s->rsts = value;
174
+ TEST(BTYPE_3, BTI_J, 0)
207
+ break;
175
+ TEST(BTYPE_3, BTI_JC, 0)
208
+ case R_WDOG:
176
+
209
+ qemu_log_mask(LOG_UNIMP,
177
+ "ret x2\n"
210
+ "bcm2835_powermgt_write: WDOG\n");
178
+"test_end:"
211
+ s->wdog = value;
179
+);
212
+ break;
180
+
213
+
181
+int main()
214
+ default:
182
+{
215
+ qemu_log_mask(LOG_UNIMP,
183
+ struct sigaction sa;
216
+ "bcm2835_powermgt_write: Unknown offset 0x%08"HWADDR_PRIx
184
+
217
+ "\n", offset);
185
+ void *p = mmap(0, getpagesize(),
218
+ break;
186
+ PROT_EXEC | PROT_READ | PROT_WRITE | PROT_BTI,
219
+ }
187
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
220
+}
188
+ if (p == MAP_FAILED) {
221
+
189
+ perror("mmap");
222
+static const MemoryRegionOps bcm2835_powermgt_ops = {
190
+ return 1;
223
+ .read = bcm2835_powermgt_read,
191
+ }
224
+ .write = bcm2835_powermgt_write,
192
+
225
+ .endianness = DEVICE_NATIVE_ENDIAN,
193
+ memset(&sa, 0, sizeof(sa));
226
+ .impl.min_access_size = 4,
194
+ sa.sa_sigaction = skip2_sigill;
227
+ .impl.max_access_size = 4,
195
+ sa.sa_flags = SA_SIGINFO;
228
+};
196
+ if (sigaction(SIGILL, &sa, NULL) < 0) {
229
+
197
+ perror("sigaction");
230
+static const VMStateDescription vmstate_bcm2835_powermgt = {
198
+ return 1;
231
+ .name = TYPE_BCM2835_POWERMGT,
199
+ }
232
+ .version_id = 1,
200
+
233
+ .minimum_version_id = 1,
201
+ memcpy(p, test_begin, test_end - test_begin);
234
+ .fields = (VMStateField[]) {
202
+ return ((int (*)(void))p)();
235
+ VMSTATE_UINT32(rstc, BCM2835PowerMgtState),
203
+}
236
+ VMSTATE_UINT32(rsts, BCM2835PowerMgtState),
204
diff --git a/tests/tcg/aarch64/bti-crt.inc.c b/tests/tcg/aarch64/bti-crt.inc.c
237
+ VMSTATE_UINT32(wdog, BCM2835PowerMgtState),
205
new file mode 100644
238
+ VMSTATE_END_OF_LIST()
206
index XXXXXXX..XXXXXXX
239
+ }
207
--- /dev/null
240
+};
208
+++ b/tests/tcg/aarch64/bti-crt.inc.c
241
+
209
@@ -XXX,XX +XXX,XX @@
242
+static void bcm2835_powermgt_init(Object *obj)
210
+/*
243
+{
211
+ * Minimal user-environment for testing BTI.
244
+ BCM2835PowerMgtState *s = BCM2835_POWERMGT(obj);
212
+ *
245
+
213
+ * Normal libc is not (yet) built with BTI support enabled,
246
+ memory_region_init_io(&s->iomem, obj, &bcm2835_powermgt_ops, s,
214
+ * and so could generate a BTI TRAP before ever reaching main.
247
+ TYPE_BCM2835_POWERMGT, 0x200);
215
+ */
248
+ sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
216
+
249
+}
217
+#include <stdlib.h>
250
+
218
+#include <signal.h>
251
+static void bcm2835_powermgt_reset(DeviceState *dev)
219
+#include <ucontext.h>
252
+{
220
+#include <asm/unistd.h>
253
+ BCM2835PowerMgtState *s = BCM2835_POWERMGT(dev);
221
+
254
+
222
+int main(void);
255
+ /* https://elinux.org/BCM2835_registers#PM */
223
+
256
+ s->rstc = 0x00000102;
224
+void _start(void)
257
+ s->rsts = 0x00001000;
225
+{
258
+ s->wdog = 0x00000000;
226
+ exit(main());
259
+}
227
+}
260
+
228
+
261
+static void bcm2835_powermgt_class_init(ObjectClass *klass, void *data)
229
+void exit(int ret)
262
+{
230
+{
263
+ DeviceClass *dc = DEVICE_CLASS(klass);
231
+ register int x0 __asm__("x0") = ret;
264
+
232
+ register int x8 __asm__("x8") = __NR_exit;
265
+ dc->reset = bcm2835_powermgt_reset;
233
+
266
+ dc->vmsd = &vmstate_bcm2835_powermgt;
234
+ asm volatile("svc #0" : : "r"(x0), "r"(x8));
267
+}
235
+ __builtin_unreachable();
268
+
236
+}
269
+static TypeInfo bcm2835_powermgt_info = {
237
+
270
+ .name = TYPE_BCM2835_POWERMGT,
238
+/*
271
+ .parent = TYPE_SYS_BUS_DEVICE,
239
+ * Irritatingly, the user API struct sigaction does not match the
272
+ .instance_size = sizeof(BCM2835PowerMgtState),
240
+ * kernel API struct sigaction. So for simplicity, isolate the
273
+ .class_init = bcm2835_powermgt_class_init,
241
+ * kernel ABI here, and make this act like signal.
274
+ .instance_init = bcm2835_powermgt_init,
242
+ */
275
+};
243
+void signal_info(int sig, void (*fn)(int, siginfo_t *, ucontext_t *))
276
+
244
+{
277
+static void bcm2835_powermgt_register_types(void)
245
+ struct kernel_sigaction {
278
+{
246
+ void (*handler)(int, siginfo_t *, ucontext_t *);
279
+ type_register_static(&bcm2835_powermgt_info);
247
+ unsigned long flags;
280
+}
248
+ unsigned long restorer;
281
+
249
+ unsigned long mask;
282
+type_init(bcm2835_powermgt_register_types)
250
+ } sa = { fn, SA_SIGINFO, 0, 0 };
283
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
251
+
252
+ register int x0 __asm__("x0") = sig;
253
+ register void *x1 __asm__("x1") = &sa;
254
+ register void *x2 __asm__("x2") = 0;
255
+ register int x3 __asm__("x3") = sizeof(unsigned long);
256
+ register int x8 __asm__("x8") = __NR_rt_sigaction;
257
+
258
+ asm volatile("svc #0"
259
+ : : "r"(x0), "r"(x1), "r"(x2), "r"(x3), "r"(x8) : "memory");
260
+}
261
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
262
index XXXXXXX..XXXXXXX 100644
284
index XXXXXXX..XXXXXXX 100644
263
--- a/tests/tcg/aarch64/Makefile.target
285
--- a/hw/misc/meson.build
264
+++ b/tests/tcg/aarch64/Makefile.target
286
+++ b/hw/misc/meson.build
265
@@ -XXX,XX +XXX,XX @@ run-pauth-%: QEMU_OPTS += -cpu max
287
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files(
266
run-plugin-pauth-%: QEMU_OPTS += -cpu max
288
'bcm2835_rng.c',
267
endif
289
'bcm2835_thermal.c',
268
290
'bcm2835_cprman.c',
269
+# BTI Tests
291
+ 'bcm2835_powermgt.c',
270
+# bti-1 tests the elf notes, so we require special compiler support.
292
))
271
+ifneq ($(DOCKER_IMAGE)$(CROSS_CC_HAS_ARMV8_BTI),)
293
softmmu_ss.add(when: 'CONFIG_SLAVIO', if_true: files('slavio_misc.c'))
272
+AARCH64_TESTS += bti-1
294
softmmu_ss.add(when: 'CONFIG_ZYNQ', if_true: files('zynq_slcr.c', 'zynq-xadc.c'))
273
+bti-1: CFLAGS += -mbranch-protection=standard
274
+bti-1: LDFLAGS += -nostdlib
275
+endif
276
+# bti-2 tests PROT_BTI, so no special compiler support required.
277
+AARCH64_TESTS += bti-2
278
+
279
# Semihosting smoke test for linux-user
280
AARCH64_TESTS += semihosting
281
run-semihosting: semihosting
282
diff --git a/tests/tcg/configure.sh b/tests/tcg/configure.sh
283
index XXXXXXX..XXXXXXX 100755
284
--- a/tests/tcg/configure.sh
285
+++ b/tests/tcg/configure.sh
286
@@ -XXX,XX +XXX,XX @@ for target in $target_list; do
287
-march=armv8.3-a -o $TMPE $TMPC; then
288
echo "CROSS_CC_HAS_ARMV8_3=y" >> $config_target_mak
289
fi
290
+ if do_compiler "$target_compiler" $target_compiler_cflags \
291
+ -mbranch-protection=standard -o $TMPE $TMPC; then
292
+ echo "CROSS_CC_HAS_ARMV8_BTI=y" >> $config_target_mak
293
+ fi
294
;;
295
esac
296
297
--
295
--
298
2.20.1
296
2.20.1
299
297
300
298
diff view generated by jsdifflib
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
2
3
The time to transmit a char is expressed in nanoseconds, not in ticks.
3
Add a test that boots and then quickly shuts down a raspi2 machine,
4
to test the power management model:
5
6
(1/1) tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_raspi2_initrd:
7
console: [ 0.000000] Booting Linux on physical CPU 0xf00
8
console: [ 0.000000] Linux version 4.14.98-v7+ (dom@dom-XPS-13-9370) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611)) #1200 SMP Tue Feb 12 20:27:48 GMT 2019
9
console: [ 0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
10
console: [ 0.000000] CPU: div instructions available: patching division code
11
console: [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
12
console: [ 0.000000] OF: fdt: Machine model: Raspberry Pi 2 Model B
13
...
14
console: Boot successful.
15
console: cat /proc/cpuinfo
16
console: / # cat /proc/cpuinfo
17
...
18
console: processor : 3
19
console: model name : ARMv7 Processor rev 5 (v7l)
20
console: BogoMIPS : 125.00
21
console: Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
22
console: CPU implementer : 0x41
23
console: CPU architecture: 7
24
console: CPU variant : 0x0
25
console: CPU part : 0xc07
26
console: CPU revision : 5
27
console: Hardware : BCM2835
28
console: Revision : 0000
29
console: Serial : 0000000000000000
30
console: cat /proc/iomem
31
console: / # cat /proc/iomem
32
console: 00000000-3bffffff : System RAM
33
console: 00008000-00afffff : Kernel code
34
console: 00c00000-00d468ef : Kernel data
35
console: 3f006000-3f006fff : dwc_otg
36
console: 3f007000-3f007eff : /soc/dma@7e007000
37
console: 3f00b880-3f00b8bf : /soc/mailbox@7e00b880
38
console: 3f100000-3f100027 : /soc/watchdog@7e100000
39
console: 3f101000-3f102fff : /soc/cprman@7e101000
40
console: 3f200000-3f2000b3 : /soc/gpio@7e200000
41
PASS (24.59 s)
42
RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
43
JOB TIME : 25.02 s
4
44
5
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
45
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Message-id: 20201014213601.205222-1-f4bug@amsat.org
46
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
47
Message-id: 20210531113837.1689775-1-f4bug@amsat.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
48
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
49
---
10
hw/arm/strongarm.c | 2 +-
50
tests/acceptance/boot_linux_console.py | 43 ++++++++++++++++++++++++++
11
1 file changed, 1 insertion(+), 1 deletion(-)
51
1 file changed, 43 insertions(+)
12
52
13
diff --git a/hw/arm/strongarm.c b/hw/arm/strongarm.c
53
diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
14
index XXXXXXX..XXXXXXX 100644
54
index XXXXXXX..XXXXXXX 100644
15
--- a/hw/arm/strongarm.c
55
--- a/tests/acceptance/boot_linux_console.py
16
+++ b/hw/arm/strongarm.c
56
+++ b/tests/acceptance/boot_linux_console.py
17
@@ -XXX,XX +XXX,XX @@ struct StrongARMUARTState {
57
@@ -XXX,XX +XXX,XX @@
18
uint8_t rx_start;
58
from avocado import skip
19
uint8_t rx_len;
59
from avocado import skipUnless
20
60
from avocado_qemu import Test
21
- uint64_t char_transmit_time; /* time to transmit a char in ticks*/
61
+from avocado_qemu import exec_command
22
+ uint64_t char_transmit_time; /* time to transmit a char in nanoseconds */
62
from avocado_qemu import exec_command_and_wait_for_pattern
23
bool wait_break_end;
63
from avocado_qemu import interrupt_interactive_console_until_pattern
24
QEMUTimer *rx_timeout_timer;
64
from avocado_qemu import wait_for_console_pattern
25
QEMUTimer *tx_timer;
65
@@ -XXX,XX +XXX,XX @@ def test_arm_raspi2_uart0(self):
66
"""
67
self.do_test_arm_raspi2(0)
68
69
+ def test_arm_raspi2_initrd(self):
70
+ """
71
+ :avocado: tags=arch:arm
72
+ :avocado: tags=machine:raspi2
73
+ """
74
+ deb_url = ('http://archive.raspberrypi.org/debian/'
75
+ 'pool/main/r/raspberrypi-firmware/'
76
+ 'raspberrypi-kernel_1.20190215-1_armhf.deb')
77
+ deb_hash = 'cd284220b32128c5084037553db3c482426f3972'
78
+ deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash)
79
+ kernel_path = self.extract_from_deb(deb_path, '/boot/kernel7.img')
80
+ dtb_path = self.extract_from_deb(deb_path, '/boot/bcm2709-rpi-2-b.dtb')
81
+
82
+ initrd_url = ('https://github.com/groeck/linux-build-test/raw/'
83
+ '2eb0a73b5d5a28df3170c546ddaaa9757e1e0848/rootfs/'
84
+ 'arm/rootfs-armv7a.cpio.gz')
85
+ initrd_hash = '604b2e45cdf35045846b8bbfbf2129b1891bdc9c'
86
+ initrd_path_gz = self.fetch_asset(initrd_url, asset_hash=initrd_hash)
87
+ initrd_path = os.path.join(self.workdir, 'rootfs.cpio')
88
+ archive.gzip_uncompress(initrd_path_gz, initrd_path)
89
+
90
+ self.vm.set_console()
91
+ kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
92
+ 'earlycon=pl011,0x3f201000 console=ttyAMA0 '
93
+ 'panic=-1 noreboot ' +
94
+ 'dwc_otg.fiq_fsm_enable=0')
95
+ self.vm.add_args('-kernel', kernel_path,
96
+ '-dtb', dtb_path,
97
+ '-initrd', initrd_path,
98
+ '-append', kernel_command_line,
99
+ '-no-reboot')
100
+ self.vm.launch()
101
+ self.wait_for_console_pattern('Boot successful.')
102
+
103
+ exec_command_and_wait_for_pattern(self, 'cat /proc/cpuinfo',
104
+ 'BCM2835')
105
+ exec_command_and_wait_for_pattern(self, 'cat /proc/iomem',
106
+ '/soc/cprman@7e101000')
107
+ exec_command(self, 'halt')
108
+ # Wait for VM to shut down gracefully
109
+ self.vm.wait()
110
+
111
def test_arm_exynos4210_initrd(self):
112
"""
113
:avocado: tags=arch:arm
26
--
114
--
27
2.20.1
115
2.20.1
28
116
29
117
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Joe Komlodi <joe.komlodi@xilinx.com>
2
2
3
For BTI, we need to know if the executable is static or dynamic,
3
If the CPU is running in default NaN mode (FPCR.DN == 1) and we execute
4
which means looking for PT_INTERP earlier.
4
FRSQRTE, FRECPE, or FRECPX with a signaling NaN, parts_silence_nan_frac() will
5
assert due to fpst->default_nan_mode being set.
5
6
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
To avoid this, we check to see what NaN mode we're running in before we call
7
Message-id: 20201016184207.786698-8-richard.henderson@linaro.org
8
floatxx_silence_nan().
9
10
Signed-off-by: Joe Komlodi <joe.komlodi@xilinx.com>
11
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 1624662174-175828-2-git-send-email-joe.komlodi@xilinx.com
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
15
---
11
linux-user/elfload.c | 60 +++++++++++++++++++++++---------------------
16
target/arm/helper-a64.c | 12 +++++++++---
12
1 file changed, 31 insertions(+), 29 deletions(-)
17
target/arm/vfp_helper.c | 24 ++++++++++++++++++------
18
2 files changed, 27 insertions(+), 9 deletions(-)
13
19
14
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
20
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
15
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
16
--- a/linux-user/elfload.c
22
--- a/target/arm/helper-a64.c
17
+++ b/linux-user/elfload.c
23
+++ b/target/arm/helper-a64.c
18
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
24
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(frecpx_f16)(uint32_t a, void *fpstp)
19
25
float16 nan = a;
20
mmap_lock();
26
if (float16_is_signaling_nan(a, fpst)) {
21
27
float_raise(float_flag_invalid, fpst);
22
- /* Find the maximum size of the image and allocate an appropriate
28
- nan = float16_silence_nan(a, fpst);
23
- amount of memory to handle that. */
29
+ if (!fpst->default_nan_mode) {
24
+ /*
30
+ nan = float16_silence_nan(a, fpst);
25
+ * Find the maximum size of the image and allocate an appropriate
26
+ * amount of memory to handle that. Locate the interpreter, if any.
27
+ */
28
loaddr = -1, hiaddr = 0;
29
info->alignment = 0;
30
for (i = 0; i < ehdr->e_phnum; ++i) {
31
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
32
}
33
++info->nsegs;
34
info->alignment |= eppnt->p_align;
35
+ } else if (eppnt->p_type == PT_INTERP && pinterp_name) {
36
+ g_autofree char *interp_name = NULL;
37
+
38
+ if (*pinterp_name) {
39
+ errmsg = "Multiple PT_INTERP entries";
40
+ goto exit_errmsg;
41
+ }
31
+ }
42
+ interp_name = g_malloc(eppnt->p_filesz);
32
}
43
+ if (!interp_name) {
33
if (fpst->default_nan_mode) {
44
+ goto exit_perror;
34
nan = float16_default_nan(fpst);
35
@@ -XXX,XX +XXX,XX @@ float32 HELPER(frecpx_f32)(float32 a, void *fpstp)
36
float32 nan = a;
37
if (float32_is_signaling_nan(a, fpst)) {
38
float_raise(float_flag_invalid, fpst);
39
- nan = float32_silence_nan(a, fpst);
40
+ if (!fpst->default_nan_mode) {
41
+ nan = float32_silence_nan(a, fpst);
45
+ }
42
+ }
46
+
43
}
47
+ if (eppnt->p_offset + eppnt->p_filesz <= BPRM_BUF_SIZE) {
44
if (fpst->default_nan_mode) {
48
+ memcpy(interp_name, bprm_buf + eppnt->p_offset,
45
nan = float32_default_nan(fpst);
49
+ eppnt->p_filesz);
46
@@ -XXX,XX +XXX,XX @@ float64 HELPER(frecpx_f64)(float64 a, void *fpstp)
50
+ } else {
47
float64 nan = a;
51
+ retval = pread(image_fd, interp_name, eppnt->p_filesz,
48
if (float64_is_signaling_nan(a, fpst)) {
52
+ eppnt->p_offset);
49
float_raise(float_flag_invalid, fpst);
53
+ if (retval != eppnt->p_filesz) {
50
- nan = float64_silence_nan(a, fpst);
54
+ goto exit_perror;
51
+ if (!fpst->default_nan_mode) {
55
+ }
52
+ nan = float64_silence_nan(a, fpst);
56
+ }
53
+ }
57
+ if (interp_name[eppnt->p_filesz - 1] != 0) {
54
}
58
+ errmsg = "Invalid PT_INTERP entry";
55
if (fpst->default_nan_mode) {
59
+ goto exit_errmsg;
56
nan = float64_default_nan(fpst);
57
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/target/arm/vfp_helper.c
60
+++ b/target/arm/vfp_helper.c
61
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_f16)(uint32_t input, void *fpstp)
62
float16 nan = f16;
63
if (float16_is_signaling_nan(f16, fpst)) {
64
float_raise(float_flag_invalid, fpst);
65
- nan = float16_silence_nan(f16, fpst);
66
+ if (!fpst->default_nan_mode) {
67
+ nan = float16_silence_nan(f16, fpst);
60
+ }
68
+ }
61
+ *pinterp_name = g_steal_pointer(&interp_name);
62
}
69
}
63
}
70
if (fpst->default_nan_mode) {
64
71
nan = float16_default_nan(fpst);
65
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
72
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recpe_f32)(float32 input, void *fpstp)
66
if (vaddr_em > info->brk) {
73
float32 nan = f32;
67
info->brk = vaddr_em;
74
if (float32_is_signaling_nan(f32, fpst)) {
68
}
75
float_raise(float_flag_invalid, fpst);
69
- } else if (eppnt->p_type == PT_INTERP && pinterp_name) {
76
- nan = float32_silence_nan(f32, fpst);
70
- g_autofree char *interp_name = NULL;
77
+ if (!fpst->default_nan_mode) {
71
-
78
+ nan = float32_silence_nan(f32, fpst);
72
- if (*pinterp_name) {
79
+ }
73
- errmsg = "Multiple PT_INTERP entries";
80
}
74
- goto exit_errmsg;
81
if (fpst->default_nan_mode) {
75
- }
82
nan = float32_default_nan(fpst);
76
- interp_name = g_malloc(eppnt->p_filesz);
83
@@ -XXX,XX +XXX,XX @@ float64 HELPER(recpe_f64)(float64 input, void *fpstp)
77
- if (!interp_name) {
84
float64 nan = f64;
78
- goto exit_perror;
85
if (float64_is_signaling_nan(f64, fpst)) {
79
- }
86
float_raise(float_flag_invalid, fpst);
80
-
87
- nan = float64_silence_nan(f64, fpst);
81
- if (eppnt->p_offset + eppnt->p_filesz <= BPRM_BUF_SIZE) {
88
+ if (!fpst->default_nan_mode) {
82
- memcpy(interp_name, bprm_buf + eppnt->p_offset,
89
+ nan = float64_silence_nan(f64, fpst);
83
- eppnt->p_filesz);
90
+ }
84
- } else {
91
}
85
- retval = pread(image_fd, interp_name, eppnt->p_filesz,
92
if (fpst->default_nan_mode) {
86
- eppnt->p_offset);
93
nan = float64_default_nan(fpst);
87
- if (retval != eppnt->p_filesz) {
94
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(rsqrte_f16)(uint32_t input, void *fpstp)
88
- goto exit_perror;
95
float16 nan = f16;
89
- }
96
if (float16_is_signaling_nan(f16, s)) {
90
- }
97
float_raise(float_flag_invalid, s);
91
- if (interp_name[eppnt->p_filesz - 1] != 0) {
98
- nan = float16_silence_nan(f16, s);
92
- errmsg = "Invalid PT_INTERP entry";
99
+ if (!s->default_nan_mode) {
93
- goto exit_errmsg;
100
+ nan = float16_silence_nan(f16, fpstp);
94
- }
101
+ }
95
- *pinterp_name = g_steal_pointer(&interp_name);
102
}
96
#ifdef TARGET_MIPS
103
if (s->default_nan_mode) {
97
} else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
104
nan = float16_default_nan(s);
98
Mips_elf_abiflags_v0 abiflags;
105
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rsqrte_f32)(float32 input, void *fpstp)
106
float32 nan = f32;
107
if (float32_is_signaling_nan(f32, s)) {
108
float_raise(float_flag_invalid, s);
109
- nan = float32_silence_nan(f32, s);
110
+ if (!s->default_nan_mode) {
111
+ nan = float32_silence_nan(f32, fpstp);
112
+ }
113
}
114
if (s->default_nan_mode) {
115
nan = float32_default_nan(s);
116
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
117
float64 nan = f64;
118
if (float64_is_signaling_nan(f64, s)) {
119
float_raise(float_flag_invalid, s);
120
- nan = float64_silence_nan(f64, s);
121
+ if (!s->default_nan_mode) {
122
+ nan = float64_silence_nan(f64, fpstp);
123
+ }
124
}
125
if (s->default_nan_mode) {
126
nan = float64_default_nan(s);
99
--
127
--
100
2.20.1
128
2.20.1
101
129
102
130
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Maxim Uvarov <maxim.uvarov@linaro.org>
2
2
3
This is a bit clearer than open-coding some of this
3
QEMU has two types of functions: shutdown and reboot. The shutdown
4
with a bare c string.
4
function has to be used for machine shutdown. Otherwise we cause
5
a reset with a bogus "cause" value, when we intended a shutdown.
5
6
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20201016184207.786698-9-richard.henderson@linaro.org
9
Message-id: 20210625111842.3790-3-maxim.uvarov@linaro.org
10
[PMM: tweaked commit message]
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
12
---
11
linux-user/elfload.c | 37 ++++++++++++++++++++-----------------
13
hw/gpio/gpio_pwr.c | 2 +-
12
1 file changed, 20 insertions(+), 17 deletions(-)
14
1 file changed, 1 insertion(+), 1 deletion(-)
13
15
14
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
16
diff --git a/hw/gpio/gpio_pwr.c b/hw/gpio/gpio_pwr.c
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/linux-user/elfload.c
18
--- a/hw/gpio/gpio_pwr.c
17
+++ b/linux-user/elfload.c
19
+++ b/hw/gpio/gpio_pwr.c
18
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ static void gpio_pwr_reset(void *opaque, int n, int level)
19
#include "qemu/guest-random.h"
21
static void gpio_pwr_shutdown(void *opaque, int n, int level)
20
#include "qemu/units.h"
22
{
21
#include "qemu/selfmap.h"
23
if (level) {
22
+#include "qapi/error.h"
24
- qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
23
25
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
24
#ifdef _ARCH_PPC64
25
#undef ARCH_DLINFO
26
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
27
struct elf_phdr *phdr;
28
abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
29
int i, retval;
30
- const char *errmsg;
31
+ Error *err = NULL;
32
33
/* First of all, some simple consistency checks */
34
- errmsg = "Invalid ELF image for this architecture";
35
if (!elf_check_ident(ehdr)) {
36
+ error_setg(&err, "Invalid ELF image for this architecture");
37
goto exit_errmsg;
38
}
26
}
39
bswap_ehdr(ehdr);
40
if (!elf_check_ehdr(ehdr)) {
41
+ error_setg(&err, "Invalid ELF image for this architecture");
42
goto exit_errmsg;
43
}
44
45
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
46
g_autofree char *interp_name = NULL;
47
48
if (*pinterp_name) {
49
- errmsg = "Multiple PT_INTERP entries";
50
+ error_setg(&err, "Multiple PT_INTERP entries");
51
goto exit_errmsg;
52
}
53
+
54
interp_name = g_malloc(eppnt->p_filesz);
55
- if (!interp_name) {
56
- goto exit_perror;
57
- }
58
59
if (eppnt->p_offset + eppnt->p_filesz <= BPRM_BUF_SIZE) {
60
memcpy(interp_name, bprm_buf + eppnt->p_offset,
61
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
62
retval = pread(image_fd, interp_name, eppnt->p_filesz,
63
eppnt->p_offset);
64
if (retval != eppnt->p_filesz) {
65
- goto exit_perror;
66
+ goto exit_read;
67
}
68
}
69
if (interp_name[eppnt->p_filesz - 1] != 0) {
70
- errmsg = "Invalid PT_INTERP entry";
71
+ error_setg(&err, "Invalid PT_INTERP entry");
72
goto exit_errmsg;
73
}
74
*pinterp_name = g_steal_pointer(&interp_name);
75
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
76
(ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
77
-1, 0);
78
if (load_addr == -1) {
79
- goto exit_perror;
80
+ goto exit_mmap;
81
}
82
load_bias = load_addr - loaddr;
83
84
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
85
image_fd, eppnt->p_offset - vaddr_po);
86
87
if (error == -1) {
88
- goto exit_perror;
89
+ goto exit_mmap;
90
}
91
}
92
93
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
94
} else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
95
Mips_elf_abiflags_v0 abiflags;
96
if (eppnt->p_filesz < sizeof(Mips_elf_abiflags_v0)) {
97
- errmsg = "Invalid PT_MIPS_ABIFLAGS entry";
98
+ error_setg(&err, "Invalid PT_MIPS_ABIFLAGS entry");
99
goto exit_errmsg;
100
}
101
if (eppnt->p_offset + eppnt->p_filesz <= BPRM_BUF_SIZE) {
102
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
103
retval = pread(image_fd, &abiflags, sizeof(Mips_elf_abiflags_v0),
104
eppnt->p_offset);
105
if (retval != sizeof(Mips_elf_abiflags_v0)) {
106
- goto exit_perror;
107
+ goto exit_read;
108
}
109
}
110
bswap_mips_abiflags(&abiflags);
111
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
112
113
exit_read:
114
if (retval >= 0) {
115
- errmsg = "Incomplete read of file header";
116
- goto exit_errmsg;
117
+ error_setg(&err, "Incomplete read of file header");
118
+ } else {
119
+ error_setg_errno(&err, errno, "Error reading file header");
120
}
121
- exit_perror:
122
- errmsg = strerror(errno);
123
+ goto exit_errmsg;
124
+ exit_mmap:
125
+ error_setg_errno(&err, errno, "Error mapping file");
126
+ goto exit_errmsg;
127
exit_errmsg:
128
- fprintf(stderr, "%s: %s\n", image_name, errmsg);
129
+ error_reportf_err(err, "%s: ", image_name);
130
exit(-1);
131
}
27
}
132
28
133
--
29
--
134
2.20.1
30
2.20.1
135
31
136
32
1
From: Richard Henderson <richard.henderson@linaro.org>
1
In do_ldst(), the calculation of the offset needs to be based on the
2
size of the memory access, not the size of the elements in the
3
vector. This meant we were getting it wrong for the widening and
4
narrowing variants of the various VLDR and VSTR insns.
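As a rough illustration of the offset calculation described above: the sketch below scales the immediate by the log2 of the memory access size rather than the element size. The names `SZ_8`, `SZ_16`, `SZ_32` and `ldst_offset` are hypothetical and exist only for this example; the size codes are assumed to be log2 of the access size in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size codes: log2 of the access size in bytes. */
enum { SZ_8 = 0, SZ_16 = 1, SZ_32 = 2 };

/*
 * Byte offset for an immediate-offset vector load/store.
 * imm is scaled by the memory access size (msize); 'add' selects
 * whether the offset is added to or subtracted from the base.
 */
static int32_t ldst_offset(uint32_t imm, unsigned msize, bool add)
{
    int32_t offset = (int32_t)(imm << msize);

    return add ? offset : -offset;
}

/* Widening byte load into halfword elements, imm = 4. */
static void ldst_offset_self_check(void)
{
    /* Correct: the access is a byte, so the offset is 4 bytes. */
    assert(ldst_offset(4, SZ_8, true) == 4);
    /* Scaling by the 16-bit element size instead would give 8. */
    assert(ldst_offset(4, SZ_16, true) == 8);
}
```

For the non-widening forms the access size and element size coincide, which is why only the widening and narrowing variants were affected.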
2
5
3
This is slightly clearer than just using strerror, though
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
the different forms produced by error_setg_file_open and
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
error_setg_errno aren't entirely convenient.
8
Message-id: 20210628135835.6690-2-peter.maydell@linaro.org
9
---
10
target/arm/translate-mve.c | 17 +++++++++--------
11
1 file changed, 9 insertions(+), 8 deletions(-)
6
12
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
8
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
9
Message-id: 20201016184207.786698-10-richard.henderson@linaro.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
linux-user/elfload.c | 15 ++++++++-------
13
1 file changed, 8 insertions(+), 7 deletions(-)
14
15
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
16
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
17
--- a/linux-user/elfload.c
15
--- a/target/arm/translate-mve.c
18
+++ b/linux-user/elfload.c
16
+++ b/target/arm/translate-mve.c
19
@@ -XXX,XX +XXX,XX @@ static void load_elf_interp(const char *filename, struct image_info *info,
17
@@ -XXX,XX +XXX,XX @@ static bool mve_skip_first_beat(DisasContext *s)
20
char bprm_buf[BPRM_BUF_SIZE])
18
}
19
}
20
21
-static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
22
+static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn,
23
+ unsigned msize)
21
{
24
{
22
int fd, retval;
25
TCGv_i32 addr;
23
+ Error *err = NULL;
26
uint32_t offset;
24
27
@@ -XXX,XX +XXX,XX @@ static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
25
fd = open(path(filename), O_RDONLY);
28
return true;
26
if (fd < 0) {
27
- goto exit_perror;
28
+ error_setg_file_open(&err, errno, filename);
29
+ error_report_err(err);
30
+ exit(-1);
31
}
29
}
32
30
33
retval = read(fd, bprm_buf, BPRM_BUF_SIZE);
31
- offset = a->imm << a->size;
34
if (retval < 0) {
32
+ offset = a->imm << msize;
35
- goto exit_perror;
33
if (!a->a) {
36
+ error_setg_errno(&err, errno, "Error reading file header");
34
offset = -offset;
37
+ error_reportf_err(err, "%s: ", filename);
38
+ exit(-1);
39
}
35
}
40
+
36
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
41
if (retval < BPRM_BUF_SIZE) {
37
{ gen_helper_mve_vstrw, gen_helper_mve_vldrw },
42
memset(bprm_buf + retval, 0, BPRM_BUF_SIZE - retval);
38
{ NULL, NULL }
39
};
40
- return do_ldst(s, a, ldstfns[a->size][a->l]);
41
+ return do_ldst(s, a, ldstfns[a->size][a->l], a->size);
42
}
43
44
-#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST) \
45
+#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST, MSIZE) \
46
static bool trans_##OP(DisasContext *s, arg_VLDR_VSTR *a) \
47
{ \
48
static MVEGenLdStFn * const ldstfns[2][2] = { \
49
{ gen_helper_mve_##ST, gen_helper_mve_##SLD }, \
50
{ NULL, gen_helper_mve_##ULD }, \
51
}; \
52
- return do_ldst(s, a, ldstfns[a->u][a->l]); \
53
+ return do_ldst(s, a, ldstfns[a->u][a->l], MSIZE); \
43
}
54
}
44
55
45
load_elf_image(filename, fd, info, NULL, bprm_buf);
56
-DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
46
- return;
57
-DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
47
-
58
-DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
48
- exit_perror:
59
+DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8)
49
- fprintf(stderr, "%s: %s\n", filename, strerror(errno));
60
+DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8)
50
- exit(-1);
61
+DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16)
51
}
62
52
63
static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
53
static int symfind(const void *s0, const void *s1)
64
{
54
--
65
--
55
2.20.1
66
2.20.1
56
67
57
68
1
From: Richard Henderson <richard.henderson@linaro.org>
1
The initial implementation of the MVE VRMLALDAVH and VRMLSLDAVH
2
insns had some bugs:
3
* the 32x32 multiply of elements was being done as 32x32->32,
4
not 32x32->64
5
* we were incorrectly maintaining the accumulator in its full
6
72-bit form across all 4 beats of the insn; in the pseudocode
7
it is squashed back into the 64 bits of the RdaHi:RdaLo
8
registers after each beat
2
9
3
The second loop uses a loop induction variable, and the first
10
In particular, fixing the second of these allows us to recast
4
does not. Transform the first to match the second, to simplify
11
the implementation to avoid 128-bit arithmetic entirely.
5
a following patch moving code between them.
6
12
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
13
Since the element size here is always 4, we can also drop the
8
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
14
parameterization of ESIZE to make the code a little more readable.
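The per-beat arithmetic described above can be sketched in isolation. This is an illustrative sketch only: `round_shift8` and `ldavh_beat` are hypothetical names, the signed 32-bit case is assumed, and right shifts of negative values are assumed to be arithmetic (true for GCC and Clang, implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Rounding right shift by 8: equivalent to floor((mul + 128) / 256),
 * computed without forming mul + 128 in a wider type.
 */
static int64_t round_shift8(int64_t mul)
{
    return (mul >> 8) + ((mul >> 7) & 1);
}

/*
 * One beat: 32x32->64 multiply, optional negate, round, then fold
 * the result straight back into the 64-bit accumulator.
 */
static uint64_t ldavh_beat(uint64_t acc, int32_t n, int32_t m, bool sub)
{
    int64_t mul = (int64_t)n * m;   /* widen before multiplying */

    if (sub) {
        mul = -mul;                 /* |mul| <= 2^62, so this cannot overflow */
    }
    return acc + (uint64_t)round_shift8(mul);
}

static void ldavh_self_check(void)
{
    assert(round_shift8(127) == 0);
    assert(round_shift8(128) == 1);
    assert(round_shift8(-128) == 0);
    assert(round_shift8(-129) == -1);
    /* 0x10000 * 0x10000 = 2^32; a 32x32->32 multiply would give 0. */
    assert(ldavh_beat(0, 0x10000, 0x10000, false) == 16777216u);
}
```

Because each beat's contribution is rounded and added into a plain 64-bit value, no state wider than 64 bits has to survive between beats.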
9
Message-id: 20201016184207.786698-7-richard.henderson@linaro.org
15
16
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Message-id: 20210628135835.6690-3-peter.maydell@linaro.org
11
---
20
---
12
linux-user/elfload.c | 9 +++++----
21
target/arm/mve_helper.c | 38 +++++++++++++++++++++-----------------
13
1 file changed, 5 insertions(+), 4 deletions(-)
22
1 file changed, 21 insertions(+), 17 deletions(-)
14
23
15
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
24
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
16
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
17
--- a/linux-user/elfload.c
26
--- a/target/arm/mve_helper.c
18
+++ b/linux-user/elfload.c
27
+++ b/target/arm/mve_helper.c
19
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
28
@@ -XXX,XX +XXX,XX @@
20
loaddr = -1, hiaddr = 0;
29
*/
21
info->alignment = 0;
30
22
for (i = 0; i < ehdr->e_phnum; ++i) {
31
#include "qemu/osdep.h"
23
- if (phdr[i].p_type == PT_LOAD) {
32
-#include "qemu/int128.h"
24
- abi_ulong a = phdr[i].p_vaddr - phdr[i].p_offset;
33
#include "cpu.h"
25
+ struct elf_phdr *eppnt = phdr + i;
34
#include "internals.h"
26
+ if (eppnt->p_type == PT_LOAD) {
35
#include "vec_internal.h"
27
+ abi_ulong a = eppnt->p_vaddr - eppnt->p_offset;
36
@@ -XXX,XX +XXX,XX @@ DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=)
28
if (a < loaddr) {
37
DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=)
29
loaddr = a;
38
30
}
39
/*
31
- a = phdr[i].p_vaddr + phdr[i].p_memsz;
40
- * Rounding multiply add long dual accumulate high: we must keep
32
+ a = eppnt->p_vaddr + eppnt->p_memsz;
41
- * a 72-bit internal accumulator value and return the top 64 bits.
33
if (a > hiaddr) {
42
+ * Rounding multiply add long dual accumulate high. In the pseudocode
34
hiaddr = a;
43
+ * this is implemented with a 72-bit internal accumulator value of which
35
}
44
+ * the top 64 bits are returned. We optimize this to avoid having to
36
++info->nsegs;
45
+ * use 128-bit arithmetic -- we can do this because the 72-bit accumulator
37
- info->alignment |= phdr[i].p_align;
46
+ * is squashed back into 64 bits after each beat.
38
+ info->alignment |= eppnt->p_align;
47
*/
39
}
48
-#define DO_LDAVH(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC, TO128) \
49
+#define DO_LDAVH(OP, TYPE, LTYPE, XCHG, SUB) \
50
uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
51
void *vm, uint64_t a) \
52
{ \
53
uint16_t mask = mve_element_mask(env); \
54
unsigned e; \
55
TYPE *n = vn, *m = vm; \
56
- Int128 acc = int128_lshift(TO128(a), 8); \
57
- for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
58
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
59
if (mask & 1) { \
60
+ LTYPE mul; \
61
if (e & 1) { \
62
- acc = ODDACC(acc, TO128(n[H##ESIZE(e - 1 * XCHG)] * \
63
- m[H##ESIZE(e)])); \
64
+ mul = (LTYPE)n[H4(e - 1 * XCHG)] * m[H4(e)]; \
65
+ if (SUB) { \
66
+ mul = -mul; \
67
+ } \
68
} else { \
69
- acc = EVENACC(acc, TO128(n[H##ESIZE(e + 1 * XCHG)] * \
70
- m[H##ESIZE(e)])); \
71
+ mul = (LTYPE)n[H4(e + 1 * XCHG)] * m[H4(e)]; \
72
} \
73
- acc = int128_add(acc, int128_make64(1 << 7)); \
74
+ mul = (mul >> 8) + ((mul >> 7) & 1); \
75
+ a += mul; \
76
} \
77
} \
78
mve_advance_vpt(env); \
79
- return int128_getlo(int128_rshift(acc, 8)); \
80
+ return a; \
40
}
81
}
41
82
83
-DO_LDAVH(vrmlaldavhsw, 4, int32_t, false, int128_add, int128_add, int128_makes64)
84
-DO_LDAVH(vrmlaldavhxsw, 4, int32_t, true, int128_add, int128_add, int128_makes64)
85
+DO_LDAVH(vrmlaldavhsw, int32_t, int64_t, false, false)
86
+DO_LDAVH(vrmlaldavhxsw, int32_t, int64_t, true, false)
87
88
-DO_LDAVH(vrmlaldavhuw, 4, uint32_t, false, int128_add, int128_add, int128_make64)
89
+DO_LDAVH(vrmlaldavhuw, uint32_t, uint64_t, false, false)
90
91
-DO_LDAVH(vrmlsldavhsw, 4, int32_t, false, int128_add, int128_sub, int128_makes64)
92
-DO_LDAVH(vrmlsldavhxsw, 4, int32_t, true, int128_add, int128_sub, int128_makes64)
93
+DO_LDAVH(vrmlsldavhsw, int32_t, int64_t, false, true)
94
+DO_LDAVH(vrmlsldavhxsw, int32_t, int64_t, true, true)
95
96
/* Vector add across vector */
97
#define DO_VADDV(OP, ESIZE, TYPE) \
42
--
98
--
43
2.20.1
99
2.20.1
44
100
45
101
1
The SMLAD instruction is supposed to:
1
The function asimd_imm_const() in translate-neon.c is an
2
* signed multiply Rn[15:0] * Rm[15:0]
2
implementation of the pseudocode AdvSIMDExpandImm(), which we will
3
* signed multiply Rn[31:16] * Rm[31:16]
3
also want for MVE. Move the implementation to translate.c, with a
4
* perform a signed addition of the products and Ra
4
prototype in translate.h.
5
* set Rd to the low 32 bits of the theoretical
6
infinite-precision result
7
* set the Q flag if the sign-extension of Rd
8
would differ from the infinite-precision result
9
(ie on overflow)
10
11
Our current implementation doesn't quite do this, though: it performs
12
an addition of the products setting Q on overflow, and then it adds
13
Ra, again possibly setting Q. This sometimes incorrectly sets Q when
14
the architecturally mandated only-check-for-overflow-once algorithm
15
does not. For instance:
16
r1 = 0x80008000; r2 = 0x80008000; r3 = 0xffffffff
17
smlad r0, r1, r2, r3
18
This is (-32768 * -32768) + (-32768 * -32768) - 1
19
20
The products are both 0x4000_0000, so when added together as 32-bit
21
signed numbers they overflow (and QEMU sets Q), but because the
22
addition of Ra == -1 brings the total back down to 0x7fff_ffff
23
there is no overflow for the complete operation and setting Q is
24
incorrect.
25
26
Fix this edge case by resorting to 64-bit arithmetic for the
27
case where we need to add three values together.
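The difference between the two overflow strategies described above can be reproduced outside QEMU. The following is a minimal host-side sketch, not the TCG code from this patch: the names smlad_two_step() and smlad_single_check() are hypothetical, and it only models the non-subtracting form with Ra != 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value written to Rd plus whether the Q flag would be set. */
typedef struct {
    uint32_t rd;
    bool q;
} smlad_result;

/* Sign-extend the low 16 bits of x without relying on narrowing casts. */
static int64_t sext16(uint32_t x)
{
    return (int64_t)((x & 0xffffu) ^ 0x8000u) - 0x8000;
}

/* Sign-extend a 32-bit pattern to a 64-bit signed value. */
static int64_t sext32(uint32_t x)
{
    return (int64_t)(x ^ 0x80000000u) - 0x80000000LL;
}

static bool fits_s32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Old behaviour: add the products and check, then add Ra and check again. */
smlad_result smlad_two_step(uint32_t rn, uint32_t rm, uint32_t ra)
{
    int64_t p_lo = sext16(rn) * sext16(rm);
    int64_t p_hi = sext16(rn >> 16) * sext16(rm >> 16);
    int64_t sum;
    bool q;

    /* Each 16x16 product on its own always fits in 32 bits. */
    assert(fits_s32(p_lo) && fits_s32(p_hi));

    sum = p_lo + p_hi;
    q = !fits_s32(sum);

    /* The intermediate sum is truncated to 32 bits before Ra is added. */
    sum = sext32((uint32_t)sum) + sext32(ra);
    q = q || !fits_s32(sum);
    return (smlad_result){ (uint32_t)sum, q };
}

/* Fixed behaviour: one 64-bit sum of all three terms, one overflow check. */
smlad_result smlad_single_check(uint32_t rn, uint32_t rm, uint32_t ra)
{
    int64_t total = sext16(rn) * sext16(rm)
                  + sext16(rn >> 16) * sext16(rm >> 16)
                  + sext32(ra);

    return (smlad_result){ (uint32_t)total, !fits_s32(total) };
}
```

With r1 = r2 = 0x80008000 and r3 = 0xffffffff, both functions produce Rd = 0x7fffffff, but only the two-step version reports Q. With r3 = 0 the total really is out of range, and both report Q.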
28
5
29
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
30
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
31
Message-id: 20201009144712.11187-1-peter.maydell@linaro.org
8
Message-id: 20210628135835.6690-4-peter.maydell@linaro.org
32
---
9
---
33
target/arm/translate.c | 58 ++++++++++++++++++++++++++++++++++--------
10
target/arm/translate.h | 16 ++++++++++
34
1 file changed, 48 insertions(+), 10 deletions(-)
11
target/arm/translate-neon.c | 63 -------------------------------------
12
target/arm/translate.c | 57 +++++++++++++++++++++++++++++++++
13
3 files changed, 73 insertions(+), 63 deletions(-)
35
14
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
18
+++ b/target/arm/translate.h
19
@@ -XXX,XX +XXX,XX @@ static inline MemOp finalize_memop(DisasContext *s, MemOp opc)
20
return opc | s->be_data;
21
}
22
23
+/**
24
+ * asimd_imm_const: Expand an encoded SIMD constant value
25
+ *
26
+ * Expand a SIMD constant value. This is essentially the pseudocode
27
+ * AdvSIMDExpandImm, except that we also perform the boolean NOT needed for
28
+ * VMVN and VBIC (when cmode < 14 && op == 1).
29
+ *
30
+ * The combination cmode == 15 op == 1 is a reserved encoding for AArch32;
31
+ * callers must catch this.
32
+ *
33
+ * cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 was UNPREDICTABLE in v7A but
34
+ * is either not unpredictable or merely CONSTRAINED UNPREDICTABLE in v8A;
35
+ * we produce an immediate constant value of 0 in these cases.
36
+ */
37
+uint64_t asimd_imm_const(uint32_t imm, int cmode, int op);
38
+
39
#endif /* TARGET_ARM_TRANSLATE_H */
40
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/translate-neon.c
43
+++ b/target/arm/translate-neon.c
44
@@ -XXX,XX +XXX,XX @@ DO_FP_2SH(VCVT_UH, gen_helper_gvec_vcvt_uh)
45
DO_FP_2SH(VCVT_HS, gen_helper_gvec_vcvt_hs)
46
DO_FP_2SH(VCVT_HU, gen_helper_gvec_vcvt_hu)
47
48
-static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
49
-{
50
- /*
51
- * Expand the encoded constant.
52
- * Note that cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
53
- * We choose to not special-case this and will behave as if a
54
- * valid constant encoding of 0 had been given.
55
- * cmode = 15 op = 1 must UNDEF; we assume decode has handled that.
56
- */
57
- switch (cmode) {
58
- case 0: case 1:
59
- /* no-op */
60
- break;
61
- case 2: case 3:
62
- imm <<= 8;
63
- break;
64
- case 4: case 5:
65
- imm <<= 16;
66
- break;
67
- case 6: case 7:
68
- imm <<= 24;
69
- break;
70
- case 8: case 9:
71
- imm |= imm << 16;
72
- break;
73
- case 10: case 11:
74
- imm = (imm << 8) | (imm << 24);
75
- break;
76
- case 12:
77
- imm = (imm << 8) | 0xff;
78
- break;
79
- case 13:
80
- imm = (imm << 16) | 0xffff;
81
- break;
82
- case 14:
83
- if (op) {
84
- /*
85
- * This is the only case where the top and bottom 32 bits
86
- * of the encoded constant differ.
87
- */
88
- uint64_t imm64 = 0;
89
- int n;
90
-
91
- for (n = 0; n < 8; n++) {
92
- if (imm & (1 << n)) {
93
- imm64 |= (0xffULL << (n * 8));
94
- }
95
- }
96
- return imm64;
97
- }
98
- imm |= (imm << 8) | (imm << 16) | (imm << 24);
99
- break;
100
- case 15:
101
- imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
102
- | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
103
- break;
104
- }
105
- if (op) {
106
- imm = ~imm;
107
- }
108
- return dup_const(MO_32, imm);
109
-}
110
-
111
static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
112
GVecGen2iFn *fn)
113
{
36
diff --git a/target/arm/translate.c b/target/arm/translate.c
114
diff --git a/target/arm/translate.c b/target/arm/translate.c
37
index XXXXXXX..XXXXXXX 100644
115
index XXXXXXX..XXXXXXX 100644
38
--- a/target/arm/translate.c
116
--- a/target/arm/translate.c
39
+++ b/target/arm/translate.c
117
+++ b/target/arm/translate.c
40
@@ -XXX,XX +XXX,XX @@ static bool op_smlad(DisasContext *s, arg_rrrr *a, bool m_swap, bool sub)
118
@@ -XXX,XX +XXX,XX @@ void arm_translate_init(void)
41
gen_smul_dual(t1, t2);
119
a64_translate_init();
42
120
}
43
if (sub) {
121
44
- /* This subtraction cannot overflow. */
122
+uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
45
+ /*
123
+{
46
+ * This subtraction cannot overflow, so we can do a simple
124
+ /* Expand the encoded constant as per AdvSIMDExpandImm pseudocode */
47
+ * 32-bit subtraction and then a possible 32-bit saturating
125
+ switch (cmode) {
48
+ * addition of Ra.
126
+ case 0: case 1:
49
+ */
127
+ /* no-op */
50
tcg_gen_sub_i32(t1, t1, t2);
128
+ break;
51
+ tcg_temp_free_i32(t2);
129
+ case 2: case 3:
130
+ imm <<= 8;
131
+ break;
132
+ case 4: case 5:
133
+ imm <<= 16;
134
+ break;
135
+ case 6: case 7:
136
+ imm <<= 24;
137
+ break;
138
+ case 8: case 9:
139
+ imm |= imm << 16;
140
+ break;
141
+ case 10: case 11:
142
+ imm = (imm << 8) | (imm << 24);
143
+ break;
144
+ case 12:
145
+ imm = (imm << 8) | 0xff;
146
+ break;
147
+ case 13:
148
+ imm = (imm << 16) | 0xffff;
149
+ break;
150
+ case 14:
151
+ if (op) {
152
+ /*
153
+ * This is the only case where the top and bottom 32 bits
154
+ * of the encoded constant differ.
155
+ */
156
+ uint64_t imm64 = 0;
157
+ int n;
52
+
158
+
53
+ if (a->ra != 15) {
159
+ for (n = 0; n < 8; n++) {
54
+ t2 = load_reg(s, a->ra);
160
+ if (imm & (1 << n)) {
55
+ gen_helper_add_setq(t1, cpu_env, t1, t2);
161
+ imm64 |= (0xffULL << (n * 8));
56
+ tcg_temp_free_i32(t2);
162
+ }
163
+ }
164
+ return imm64;
57
+ }
165
+ }
58
+ } else if (a->ra == 15) {
166
+ imm |= (imm << 8) | (imm << 16) | (imm << 24);
59
+ /* Single saturation-checking addition */
167
+ break;
60
+ gen_helper_add_setq(t1, cpu_env, t1, t2);
168
+ case 15:
61
+ tcg_temp_free_i32(t2);
169
+ imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
62
} else {
170
+ | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
63
/*
171
+ break;
64
- * This addition cannot overflow 32 bits; however it may
172
+ }
65
- * overflow considered as a signed operation, in which case
173
+ if (op) {
66
- * we must set the Q flag.
174
+ imm = ~imm;
67
+ * We need to add the products and Ra together and then
175
+ }
68
+ * determine whether the final result overflowed. Doing
176
+ return dup_const(MO_32, imm);
69
+ * this as two separate add-and-check-overflow steps incorrectly
177
+}
70
+ * sets Q for cases like (-32768 * -32768) + (-32768 * -32768) + -1.
71
+ * Do all the arithmetic at 64-bits and then check for overflow.
72
*/
73
- gen_helper_add_setq(t1, cpu_env, t1, t2);
74
- }
75
- tcg_temp_free_i32(t2);
76
+ TCGv_i64 p64, q64;
77
+ TCGv_i32 t3, qf, one;
78
79
- if (a->ra != 15) {
80
- t2 = load_reg(s, a->ra);
81
- gen_helper_add_setq(t1, cpu_env, t1, t2);
82
+ p64 = tcg_temp_new_i64();
83
+ q64 = tcg_temp_new_i64();
84
+ tcg_gen_ext_i32_i64(p64, t1);
85
+ tcg_gen_ext_i32_i64(q64, t2);
86
+ tcg_gen_add_i64(p64, p64, q64);
87
+ load_reg_var(s, t2, a->ra);
88
+ tcg_gen_ext_i32_i64(q64, t2);
89
+ tcg_gen_add_i64(p64, p64, q64);
90
+ tcg_temp_free_i64(q64);
91
+
178
+
92
+ tcg_gen_extr_i64_i32(t1, t2, p64);
179
/* Generate a label used for skipping this instruction */
93
+ tcg_temp_free_i64(p64);
180
void arm_gen_condlabel(DisasContext *s)
94
+ /*
181
{
95
+ * t1 is the low half of the result which goes into Rd.
96
+ * We have overflow and must set Q if the high half (t2)
97
+ * is different from the sign-extension of t1.
98
+ */
99
+ t3 = tcg_temp_new_i32();
100
+ tcg_gen_sari_i32(t3, t1, 31);
101
+ qf = load_cpu_field(QF);
102
+ one = tcg_const_i32(1);
103
+ tcg_gen_movcond_i32(TCG_COND_NE, qf, t2, t3, one, qf);
104
+ store_cpu_field(qf, QF);
105
+ tcg_temp_free_i32(one);
106
+ tcg_temp_free_i32(t3);
107
tcg_temp_free_i32(t2);
108
}
109
store_reg(s, a->rd, t1);
110
--
182
--
111
2.20.1
183
2.20.1
112
184
113
185
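For readers who want to try the constant expansion moved by the patch above without building QEMU, here is a standalone sketch of the same switch. It is an illustration only: expand_imm_sketch() and dup32() are hypothetical names, dup32() stands in for dup_const(MO_32, ...), and the reserved combination cmode == 15, op == 1 is rejected by an assertion because in this version of the code callers are expected to catch it.

```c
#include <assert.h>
#include <stdint.h>

/* Copy a 32-bit value into both halves of a 64-bit result. */
static uint64_t dup32(uint32_t v)
{
    return ((uint64_t)v << 32) | v;
}

/* Expand an 8-bit encoded constant according to cmode and op. */
uint64_t expand_imm_sketch(uint32_t imm, int cmode, int op)
{
    assert(imm <= 0xff && cmode >= 0 && cmode <= 15);
    assert(!(cmode == 15 && op));   /* reserved here; callers must reject it */

    switch (cmode) {
    case 0: case 1:
        break;                      /* constant sits in the low byte */
    case 2: case 3:
        imm <<= 8;
        break;
    case 4: case 5:
        imm <<= 16;
        break;
    case 6: case 7:
        imm <<= 24;
        break;
    case 8: case 9:
        imm |= imm << 16;           /* replicate into both halfwords */
        break;
    case 10: case 11:
        imm = (imm << 8) | (imm << 24);
        break;
    case 12:
        imm = (imm << 8) | 0xff;    /* shifted-ones form */
        break;
    case 13:
        imm = (imm << 16) | 0xffff;
        break;
    case 14:
        if (op) {
            /* Each set bit of imm selects one 0xff byte of the result. */
            uint64_t imm64 = 0;

            for (int n = 0; n < 8; n++) {
                if (imm & (1u << n)) {
                    imm64 |= 0xffULL << (n * 8);
                }
            }
            return imm64;
        }
        imm |= (imm << 8) | (imm << 16) | (imm << 24);
        break;
    case 15:
        /* Single-precision floating-point immediate */
        imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
            | ((imm & 0x40) ? (0x1fu << 25) : (1u << 30));
        break;
    }
    if (op) {
        imm = ~imm;                 /* the NOT needed for VMVN and VBIC */
    }
    return dup32(imm);
}
```

For example, cmode 15 with imm 0x70 yields the bit pattern of the float 1.0 in both 32-bit halves, and cmode 14 with op 1 and imm 0x81 sets only the top and bottom bytes of the 64-bit result.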
1
M-profile CPUs with half-precision floating point support should
1
The A64 AdvSIMD modified-immediate grouping uses almost the same
2
be able to write to FPSCR.FZ16, but an M-profile specific masking
2
constant encoding that A32 Neon does; reuse asimd_imm_const() (to
3
of the value at the top of vfp_set_fpscr() currently prevents that.
3
which we add the AArch64-specific case for cmode 15 op 1) instead of
4
This is not yet an active bug because we have no M-profile
4
reimplementing it all.
5
FP16 CPUs, but needs to be fixed before we can add any.
6
7
The bits that the masking is effectively preventing from being
8
set are the A-profile only short-vector Len and Stride fields,
9
plus the Neon QC bit. Rearrange the order of the function so
10
that those fields are handled earlier and only under a suitable
11
guard; this allows us to drop the M-profile specific masking,
12
making FZ16 writeable.
13
14
This change also makes the QC bit correctly RAZ/WI for older
15
no-Neon A-profile cores.
16
17
This refactoring also paves the way for the low-overhead-branch
18
LTPSIZE field, which uses some of the bits that are used for
19
A-profile Stride and Len.
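The reordering described above can be modelled in a few lines of plain C. This is a simplified sketch, not the helper itself: the struct and function names are hypothetical, the bit positions used for FZ16 (bit 19) and QC (bit 27) are stated here as assumptions, the has_fp16 guard is an assumed stand-in for the feature check around the FZ16 masking, and the host float-status update is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FPCR_FZ16_SKETCH (1u << 19)   /* assumed position of FZ16 */
#define FPCR_QC_SKETCH   (1u << 27)   /* assumed position of QC */

typedef struct {
    uint32_t fpscr;       /* bits kept in the stored FPSCR value */
    unsigned vec_len;     /* A-profile short-vector Len */
    unsigned vec_stride;  /* A-profile short-vector Stride */
    uint32_t qc;          /* non-zero means QC is set */
} fpscr_state_sketch;

/* Return 'length' bits of 'value' starting at bit 'start'. */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0u >> (32 - length));
}

fpscr_state_sketch set_fpscr_sketch(bool m_profile, bool has_neon,
                                    bool has_fp16, uint32_t val)
{
    fpscr_state_sketch st = { 0 };

    if (!has_fp16) {
        val &= ~FPCR_FZ16_SKETCH;
    }
    /* Len and Stride exist only on A-profile, so guard them explicitly. */
    if (!m_profile) {
        st.vec_len = extract_bits(val, 16, 3);
        st.vec_stride = extract_bits(val, 20, 2);
    }
    /* QC is only writable when Neon is present. */
    if (has_neon) {
        st.qc = val & FPCR_QC_SKETCH;
    }
    /* Keep NZCV, AHP, DN, FZ, RMode and FZ16; drop everything else. */
    st.fpscr = val & 0xf7c80000u;
    return st;
}
```

Because the fields are now handled under their own guards, no M-profile specific mask is needed, and writing all-ones on an M-profile FP16 core leaves FZ16 set in the stored value.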
20
5
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
22
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
23
Message-id: 20201019151301.2046-10-peter.maydell@linaro.org
8
Message-id: 20210628135835.6690-5-peter.maydell@linaro.org
24
---
9
---
25
target/arm/vfp_helper.c | 47 ++++++++++++++++++++++++-----------------
10
target/arm/translate.h | 3 +-
26
1 file changed, 28 insertions(+), 19 deletions(-)
11
target/arm/translate-a64.c | 86 ++++----------------------------------
12
target/arm/translate.c | 17 +++++++-
13
3 files changed, 24 insertions(+), 82 deletions(-)
27
14
28
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
29
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/vfp_helper.c
17
--- a/target/arm/translate.h
31
+++ b/target/arm/vfp_helper.c
18
+++ b/target/arm/translate.h
32
@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
19
@@ -XXX,XX +XXX,XX @@ static inline MemOp finalize_memop(DisasContext *s, MemOp opc)
33
val &= ~FPCR_FZ16;
20
* VMVN and VBIC (when cmode < 14 && op == 1).
21
*
22
* The combination cmode == 15 op == 1 is a reserved encoding for AArch32;
23
- * callers must catch this.
24
+ * callers must catch this; we return the 64-bit constant value defined
25
+ * for AArch64.
26
*
27
* cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 was UNPREDICTABLE in v7A but
28
* is either not unpredictable or merely CONSTRAINED UNPREDICTABLE in v8A;
29
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate-a64.c
32
+++ b/target/arm/translate-a64.c
33
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
34
{
35
int rd = extract32(insn, 0, 5);
36
int cmode = extract32(insn, 12, 4);
37
- int cmode_3_1 = extract32(cmode, 1, 3);
38
- int cmode_0 = extract32(cmode, 0, 1);
39
int o2 = extract32(insn, 11, 1);
40
uint64_t abcdefgh = extract32(insn, 5, 5) | (extract32(insn, 16, 3) << 5);
41
bool is_neg = extract32(insn, 29, 1);
42
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
43
return;
34
}
44
}
35
45
36
- if (arm_feature(env, ARM_FEATURE_M)) {
46
- /* See AdvSIMDExpandImm() in ARM ARM */
37
+ vfp_set_fpscr_to_host(env, val);
47
- switch (cmode_3_1) {
38
+
48
- case 0: /* Replicate(Zeros(24):imm8, 2) */
39
+ if (!arm_feature(env, ARM_FEATURE_M)) {
49
- case 1: /* Replicate(Zeros(16):imm8:Zeros(8), 2) */
40
/*
50
- case 2: /* Replicate(Zeros(8):imm8:Zeros(16), 2) */
41
- * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
51
- case 3: /* Replicate(imm8:Zeros(24), 2) */
42
- * and also for the trapped-exception-handling bits IxE.
52
- {
43
+ * Short-vector length and stride; on M-profile these bits
53
- int shift = cmode_3_1 * 8;
44
+ * are used for different purposes.
54
- imm = bitfield_replicate(abcdefgh << shift, 32);
45
+ * We can't make this conditional be "if MVFR0.FPShVec != 0",
55
- break;
46
+ * because in v7A no-short-vector-support cores still had to
56
- }
47
+ * allow Stride/Len to be written with the only effect that
57
- case 4: /* Replicate(Zeros(8):imm8, 4) */
48
+ * some insns are required to UNDEF if the guest sets them.
58
- case 5: /* Replicate(imm8:Zeros(8), 4) */
49
+ *
59
- {
50
+ * TODO: if M-profile MVE implemented, set LTPSIZE.
60
- int shift = (cmode_3_1 & 0x1) * 8;
51
*/
61
- imm = bitfield_replicate(abcdefgh << shift, 16);
52
- val &= 0xf7c0009f;
62
- break;
53
+ env->vfp.vec_len = extract32(val, 16, 3);
63
- }
54
+ env->vfp.vec_stride = extract32(val, 20, 2);
64
- case 6:
65
- if (cmode_0) {
66
- /* Replicate(Zeros(8):imm8:Ones(16), 2) */
67
- imm = (abcdefgh << 16) | 0xffff;
68
- } else {
69
- /* Replicate(Zeros(16):imm8:Ones(8), 2) */
70
- imm = (abcdefgh << 8) | 0xff;
71
- }
72
- imm = bitfield_replicate(imm, 32);
73
- break;
74
- case 7:
75
- if (!cmode_0 && !is_neg) {
76
- imm = bitfield_replicate(abcdefgh, 8);
77
- } else if (!cmode_0 && is_neg) {
78
- int i;
79
- imm = 0;
80
- for (i = 0; i < 8; i++) {
81
- if ((abcdefgh) & (1 << i)) {
82
- imm |= 0xffULL << (i * 8);
83
- }
84
- }
85
- } else if (cmode_0) {
86
- if (is_neg) {
87
- imm = (abcdefgh & 0x3f) << 48;
88
- if (abcdefgh & 0x80) {
89
- imm |= 0x8000000000000000ULL;
90
- }
91
- if (abcdefgh & 0x40) {
92
- imm |= 0x3fc0000000000000ULL;
93
- } else {
94
- imm |= 0x4000000000000000ULL;
95
- }
96
- } else {
97
- if (o2) {
98
- /* FMOV (vector, immediate) - half-precision */
99
- imm = vfp_expand_imm(MO_16, abcdefgh);
100
- /* now duplicate across the lanes */
101
- imm = bitfield_replicate(imm, 16);
102
- } else {
103
- imm = (abcdefgh & 0x3f) << 19;
104
- if (abcdefgh & 0x80) {
105
- imm |= 0x80000000;
106
- }
107
- if (abcdefgh & 0x40) {
108
- imm |= 0x3e000000;
109
- } else {
110
- imm |= 0x40000000;
111
- }
112
- imm |= (imm << 32);
113
- }
114
- }
115
- }
116
- break;
117
- default:
118
- g_assert_not_reached();
119
- }
120
-
121
- if (cmode_3_1 != 7 && is_neg) {
122
- imm = ~imm;
123
+ if (cmode == 15 && o2 && !is_neg) {
124
+ /* FMOV (vector, immediate) - half-precision */
125
+ imm = vfp_expand_imm(MO_16, abcdefgh);
126
+ /* now duplicate across the lanes */
127
+ imm = bitfield_replicate(imm, 16);
128
+ } else {
129
+ imm = asimd_imm_const(abcdefgh, cmode, is_neg);
55
}
130
}
56
131
57
- vfp_set_fpscr_to_host(env, val);
132
if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) {
58
+ if (arm_feature(env, ARM_FEATURE_NEON)) {
133
diff --git a/target/arm/translate.c b/target/arm/translate.c
59
+ /*
134
index XXXXXXX..XXXXXXX 100644
60
+ * The bit we set within fpscr_q is arbitrary; the register as a
135
--- a/target/arm/translate.c
61
+ * whole being zero/non-zero is what counts.
136
+++ b/target/arm/translate.c
62
+ * TODO: M-profile MVE also has a QC bit.
137
@@ -XXX,XX +XXX,XX @@ uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
63
+ */
138
case 14:
64
+ env->vfp.qc[0] = val & FPCR_QC;
139
if (op) {
65
+ env->vfp.qc[1] = 0;
140
/*
66
+ env->vfp.qc[2] = 0;
141
- * This is the only case where the top and bottom 32 bits
67
+ env->vfp.qc[3] = 0;
142
- * of the encoded constant differ.
68
+ }
143
+ * This and cmode == 15 op == 1 are the only cases where
69
144
+ * the top and bottom 32 bits of the encoded constant differ.
70
/*
145
*/
71
* We don't implement trapped exception handling, so the
146
uint64_t imm64 = 0;
72
* trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
147
int n;
73
*
148
@@ -XXX,XX +XXX,XX @@ uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
74
- * If we exclude the exception flags, IOC|DZC|OFC|UFC|IXC|IDC
149
imm |= (imm << 8) | (imm << 16) | (imm << 24);
75
- * (which are stored in fp_status), and the other RES0 bits
150
break;
76
- * in between, then we clear all of the low 16 bits.
151
case 15:
77
+ * The exception flags IOC|DZC|OFC|UFC|IXC|IDC are stored in
152
+ if (op) {
78
+ * fp_status; QC, Len and Stride are stored separately earlier.
153
+ /* Reserved encoding for AArch32; valid for AArch64 */
79
+ * Clear out all of those and the RES0 bits: only NZCV, AHP, DN,
154
+ uint64_t imm64 = (uint64_t)(imm & 0x3f) << 48;
80
+ * FZ, RMode and FZ16 are kept in vfp.xregs[FPSCR].
155
+ if (imm & 0x80) {
81
*/
156
+ imm64 |= 0x8000000000000000ULL;
82
env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000;
157
+ }
83
- env->vfp.vec_len = (val >> 16) & 7;
158
+ if (imm & 0x40) {
84
- env->vfp.vec_stride = (val >> 20) & 3;
159
+ imm64 |= 0x3fc0000000000000ULL;
85
-
160
+ } else {
86
- /*
161
+ imm64 |= 0x4000000000000000ULL;
87
- * The bit we set within fpscr_q is arbitrary; the register as a
162
+ }
88
- * whole being zero/non-zero is what counts.
163
+ return imm64;
89
- */
164
+ }
90
- env->vfp.qc[0] = val & FPCR_QC;
165
imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
91
- env->vfp.qc[1] = 0;
166
| ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
92
- env->vfp.qc[2] = 0;
167
break;
93
- env->vfp.qc[3] = 0;
94
}
95
96
void vfp_set_fpscr(CPUARMState *env, uint32_t val)
97
--
168
--
98
2.20.1
169
2.20.1
99
170
100
171
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Use dup_const() instead of bitfield_replicate() in
2
disas_simd_mod_imm().
2
3
3
Transform the prot bit to a qemu internal page bit, and save
4
(We can't replace the other use of bitfield_replicate() in this file,
4
it in the page tables.
5
in logic_imm_decode_wmask(), because that location needs to handle 2
6
and 4 bit elements, which dup_const() cannot.)
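To make the distinction concrete, here is a small sketch of the two replication styles. Both functions are stand-ins written for this note with hypothetical names: dup_const_sketch() only handles 8, 16, 32 and 64 bit elements (selected by log2 of the element size in bytes), while bitfield_replicate_sketch() doubles an arbitrary power-of-two width and so also covers 2 and 4 bit elements.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replicate an element across 64 bits.
 * esz_log2 is 0, 1, 2 or 3 for 8, 16, 32 or 64 bit elements.
 */
uint64_t dup_const_sketch(unsigned esz_log2, uint64_t c)
{
    switch (esz_log2) {
    case 0:
        return 0x0101010101010101ULL * (uint8_t)c;
    case 1:
        return 0x0001000100010001ULL * (uint16_t)c;
    case 2:
        return 0x0000000100000001ULL * (uint32_t)c;
    default:
        assert(esz_log2 == 3);
        return c;
    }
}

/*
 * Replicate the low e bits of mask across 64 bits by repeated doubling.
 * e must be a power of two below 64, and mask must have no bits set
 * above bit e - 1.
 */
uint64_t bitfield_replicate_sketch(uint64_t mask, unsigned e)
{
    assert(e != 0 && e < 64 && (e & (e - 1)) == 0);
    assert((mask >> e) == 0);
    while (e < 64) {
        mask |= mask << e;
        e *= 2;
    }
    return mask;
}
```

For a 16-bit half-precision constant such as 0x3c00 the two agree, which is why the substitution in disas_simd_mod_imm() is safe; a 2-bit pattern can only be expressed with the second function.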
5
7
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20201016184207.786698-3-richard.henderson@linaro.org
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20210628135835.6690-6-peter.maydell@linaro.org
10
---
11
---
11
include/exec/cpu-all.h | 2 ++
12
target/arm/translate-a64.c | 2 +-
12
linux-user/syscall_defs.h | 4 ++++
13
1 file changed, 1 insertion(+), 1 deletion(-)
13
target/arm/cpu.h | 5 +++++
14
linux-user/mmap.c | 16 ++++++++++++++++
15
target/arm/translate-a64.c | 6 +++---
16
5 files changed, 30 insertions(+), 3 deletions(-)
17
14
18
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/include/exec/cpu-all.h
21
+++ b/include/exec/cpu-all.h
22
@@ -XXX,XX +XXX,XX @@ extern intptr_t qemu_host_page_mask;
23
/* FIXME: Code that sets/uses this is broken and needs to go away. */
24
#define PAGE_RESERVED 0x0020
25
#endif
26
+/* Target-specific bits that will be used via page_get_flags(). */
27
+#define PAGE_TARGET_1 0x0080
28
29
#if defined(CONFIG_USER_ONLY)
30
void page_dump(FILE *f);
31
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
32
index XXXXXXX..XXXXXXX 100644
33
--- a/linux-user/syscall_defs.h
34
+++ b/linux-user/syscall_defs.h
35
@@ -XXX,XX +XXX,XX @@ struct target_winsize {
36
#define TARGET_PROT_SEM 0x08
37
#endif
38
39
+#ifdef TARGET_AARCH64
40
+#define TARGET_PROT_BTI 0x10
41
+#endif
42
+
43
/* Common */
44
#define TARGET_MAP_SHARED    0x01        /* Share changes */
45
#define TARGET_MAP_PRIVATE    0x02        /* Changes are private */
46
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/cpu.h
49
+++ b/target/arm/cpu.h
50
@@ -XXX,XX +XXX,XX @@ static inline MemTxAttrs *typecheck_memtxattrs(MemTxAttrs *x)
51
#define arm_tlb_bti_gp(x) (typecheck_memtxattrs(x)->target_tlb_bit0)
52
#define arm_tlb_mte_tagged(x) (typecheck_memtxattrs(x)->target_tlb_bit1)
53
54
+/*
55
+ * AArch64 usage of the PAGE_TARGET_* bits for linux-user.
56
+ */
57
+#define PAGE_BTI PAGE_TARGET_1
58
+
59
/*
60
* Naming convention for isar_feature functions:
61
* Functions which test 32-bit ID registers should have _aa32_ in
62
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
63
index XXXXXXX..XXXXXXX 100644
64
--- a/linux-user/mmap.c
65
+++ b/linux-user/mmap.c
66
@@ -XXX,XX +XXX,XX @@ static int validate_prot_to_pageflags(int *host_prot, int prot)
67
*host_prot = (prot & (PROT_READ | PROT_WRITE))
68
| (prot & PROT_EXEC ? PROT_READ : 0);
69
70
+#ifdef TARGET_AARCH64
71
+ /*
72
+ * The PROT_BTI bit is only accepted if the cpu supports the feature.
73
+ * Since this is the unusual case, don't bother checking unless
74
+ * the bit has been requested. If set and valid, record the bit
75
+ * within QEMU's page_flags.
76
+ */
77
+ if (prot & TARGET_PROT_BTI) {
78
+ ARMCPU *cpu = ARM_CPU(thread_cpu);
79
+ if (cpu_isar_feature(aa64_bti, cpu)) {
80
+ valid |= TARGET_PROT_BTI;
81
+ page_flags |= PAGE_BTI;
82
+ }
83
+ }
84
+#endif
85
+
86
return prot & ~valid ? 0 : page_flags;
87
}
88
89
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
15
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
90
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
91
--- a/target/arm/translate-a64.c
17
--- a/target/arm/translate-a64.c
92
+++ b/target/arm/translate-a64.c
18
+++ b/target/arm/translate-a64.c
93
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_simd_fp(DisasContext *s, uint32_t insn)
19
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
94
*/
20
/* FMOV (vector, immediate) - half-precision */
95
static bool is_guarded_page(CPUARMState *env, DisasContext *s)
21
imm = vfp_expand_imm(MO_16, abcdefgh);
96
{
22
/* now duplicate across the lanes */
97
-#ifdef CONFIG_USER_ONLY
23
- imm = bitfield_replicate(imm, 16);
98
- return false; /* FIXME */
24
+ imm = dup_const(MO_16, imm);
99
-#else
25
} else {
100
uint64_t addr = s->base.pc_first;
26
imm = asimd_imm_const(abcdefgh, cmode, is_neg);
101
+#ifdef CONFIG_USER_ONLY
27
}
102
+ return page_get_flags(addr) & PAGE_BTI;
103
+#else
104
int mmu_idx = arm_to_core_mmu_idx(s->mmu_idx);
105
unsigned int index = tlb_index(env, mmu_idx, addr);
106
CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
107
--
28
--
108
2.20.1
29
2.20.1
109
30
110
31
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Implement the MVE logical-immediate insns (VMOV, VMVN,
2
VORR and VBIC). These have essentially the same encoding
3
as their Neon equivalents, and we implement the decode
4
in the same way.
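The cmode/op selection that the trans function performs for these insns can be sketched as a small classifier. This is an illustration under assumptions, not the decodetree output: the enum and function names are hypothetical, and the VMOV/VMVN split shown here (VMVN when op is 1 and cmode < 14) only serves to label the result, since the patch handles both through the same helper and lets the constant expansion do the inversion.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    IMM_VORR,
    IMM_VBIC,
    IMM_VMOV,
    IMM_VMVN,
    IMM_UNALLOCATED
} imm_insn;

/* Decide which logical-immediate insn a cmode/op pair encodes. */
imm_insn classify_1imm(int cmode, int op)
{
    assert(cmode >= 0 && cmode <= 15 && (op == 0 || op == 1));

    if ((cmode & 1) && cmode < 12) {
        return op ? IMM_VBIC : IMM_VORR;
    }
    if (cmode == 15 && op == 1) {
        return IMM_UNALLOCATED;
    }
    return (op && cmode < 14) ? IMM_VMVN : IMM_VMOV;
}

/*
 * Apply the insn to one 64-bit chunk of the destination.
 * 'expanded' is the constant after expansion, which already includes
 * the inversion for op == 1, so VBIC reduces to a plain AND.
 */
uint64_t apply_1imm(imm_insn insn, uint64_t d, uint64_t expanded)
{
    switch (insn) {
    case IMM_VORR:
        return d | expanded;
    case IMM_VBIC:
        return d & expanded;
    case IMM_VMOV:
    case IMM_VMVN:
        return expanded;
    default:
        assert(!"unallocated encoding");
        return d;
    }
}
```

Note that predication is ignored in this sketch; the real helpers merge the result into the destination byte by byte under the element mask.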
2
5
3
This is generic support, with the code disabled for all targets.
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-7-peter.maydell@linaro.org
9
---
10
target/arm/helper-mve.h | 4 +++
11
target/arm/mve.decode | 17 +++++++++++++
12
target/arm/mve_helper.c | 24 ++++++++++++++++++
13
target/arm/translate-mve.c | 50 ++++++++++++++++++++++++++++++++++++++
14
4 files changed, 95 insertions(+)
4
15
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
6
Message-id: 20201016184207.786698-11-richard.henderson@linaro.org
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
linux-user/qemu.h | 4 ++
11
linux-user/elfload.c | 157 +++++++++++++++++++++++++++++++++++++++++++
12
2 files changed, 161 insertions(+)
13
14
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/linux-user/qemu.h
18
--- a/target/arm/helper-mve.h
17
+++ b/linux-user/qemu.h
19
+++ b/target/arm/helper-mve.h
18
@@ -XXX,XX +XXX,XX @@ struct image_info {
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
19
abi_ulong interpreter_loadmap_addr;
21
DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
20
abi_ulong interpreter_pt_dynamic_addr;
22
DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
21
struct image_info *other_info;
23
DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
22
+
24
+
23
+ /* For target-specific processing of NT_GNU_PROPERTY_TYPE_0. */
25
+DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
24
+ uint32_t note_flags;
26
+DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
27
+DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
28
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
29
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/mve.decode
31
+++ b/target/arm/mve.decode
32
@@ -XXX,XX +XXX,XX @@
33
# VQDMULL has size in bit 28: 0 for 16 bit, 1 for 32 bit
34
%size_28 28:1 !function=plus_1
35
36
+# 1imm format immediate
37
+%imm_28_16_0 28:1 16:3 0:4
25
+
38
+
26
#ifdef TARGET_MIPS
39
&vldr_vstr rn qd imm p a w size l u
27
int fp_abi;
40
&1op qd qm size
28
int interp_fp_abi;
41
&2op qd qm qn size
29
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
42
&2scalar qd qn rm size
43
+&1imm qd imm cmode op
44
45
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
46
# Note that both Rn and Qd are 3 bits only (no D bit)
47
@@ -XXX,XX +XXX,XX @@
48
@2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
49
@2op_sz28 .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn \
50
size=%size_28
51
+@1imm .... .... .... .... .... cmode:4 .. op:1 . .... &1imm qd=%qd imm=%imm_28_16_0
52
53
# The _rev suffix indicates that Vn and Vm are reversed. This is
54
# the case for shifts. In the Arm ARM these insns are documented
55
@@ -XXX,XX +XXX,XX @@ VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rd
56
# Predicate operations
57
%mask_22_13 22:1 13:3
58
VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
59
+
60
+# Logical immediate operations (1 reg and modified-immediate)
61
+
62
+# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
63
+# not in a way we can conveniently represent in decodetree without
64
+# a lot of repetition:
65
+# VORR: op=0, (cmode & 1) && cmode < 12
66
+# VBIC: op=1, (cmode & 1) && cmode < 12
67
+# VMOV: everything else
68
+# So we have a single decode line and check the cmode/op in the
69
+# trans function.
70
+Vimm_1r 111 . 1111 1 . 00 0 ... ... 0 .... 0 1 . 1 .... @1imm
71
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
30
index XXXXXXX..XXXXXXX 100644
72
index XXXXXXX..XXXXXXX 100644
31
--- a/linux-user/elfload.c
73
--- a/target/arm/mve_helper.c
32
+++ b/linux-user/elfload.c
74
+++ b/target/arm/mve_helper.c
33
@@ -XXX,XX +XXX,XX @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
75
@@ -XXX,XX +XXX,XX @@ DO_1OP(vnegw, 4, int32_t, DO_NEG)
34
76
DO_1OP(vfnegh, 8, uint64_t, DO_FNEGH)
35
#include "elf.h"
77
DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
36
78
37
+static bool arch_parse_elf_property(uint32_t pr_type, uint32_t pr_datasz,
38
+ const uint32_t *data,
39
+ struct image_info *info,
40
+ Error **errp)
41
+{
42
+ g_assert_not_reached();
43
+}
44
+#define ARCH_USE_GNU_PROPERTY 0
45
+
46
struct exec
47
{
48
unsigned int a_info; /* Use macros N_MAGIC, etc for access */
49
@@ -XXX,XX +XXX,XX @@ void probe_guest_base(const char *image_name, abi_ulong guest_loaddr,
50
"@ 0x%" PRIx64 "\n", (uint64_t)guest_base);
51
}
52
53
+enum {
54
+ /* The string "GNU\0" as a magic number. */
55
+ GNU0_MAGIC = const_le32('G' | 'N' << 8 | 'U' << 16),
56
+ NOTE_DATA_SZ = 1 * KiB,
57
+ NOTE_NAME_SZ = 4,
58
+ ELF_GNU_PROPERTY_ALIGN = ELF_CLASS == ELFCLASS32 ? 4 : 8,
59
+};
60
+
61
+/*
79
+/*
62
+ * Process a single gnu_property entry.
80
+ * 1 operand immediates: Vda is destination and possibly also one source.
63
+ * Return false for error.
81
+ * All these insns work at 64-bit widths.
64
+ */
82
+ */
65
+static bool parse_elf_property(const uint32_t *data, int *off, int datasz,
83
+#define DO_1OP_IMM(OP, FN) \
66
+ struct image_info *info, bool have_prev_type,
84
+ void HELPER(mve_##OP)(CPUARMState *env, void *vda, uint64_t imm) \
67
+ uint32_t *prev_type, Error **errp)
85
+ { \
68
+{
86
+ uint64_t *da = vda; \
69
+ uint32_t pr_type, pr_datasz, step;
87
+ uint16_t mask = mve_element_mask(env); \
70
+
88
+ unsigned e; \
71
+ if (*off > datasz || !QEMU_IS_ALIGNED(*off, ELF_GNU_PROPERTY_ALIGN)) {
89
+ for (e = 0; e < 16 / 8; e++, mask >>= 8) { \
72
+ goto error_data;
90
+ mergemask(&da[H8(e)], FN(da[H8(e)], imm), mask); \
73
+ }
91
+ } \
74
+ datasz -= *off;
92
+ mve_advance_vpt(env); \
75
+ data += *off / sizeof(uint32_t);
76
+
77
+ if (datasz < 2 * sizeof(uint32_t)) {
78
+ goto error_data;
79
+ }
80
+ pr_type = data[0];
81
+ pr_datasz = data[1];
82
+ data += 2;
83
+ datasz -= 2 * sizeof(uint32_t);
84
+ step = ROUND_UP(pr_datasz, ELF_GNU_PROPERTY_ALIGN);
85
+ if (step > datasz) {
86
+ goto error_data;
87
+ }
93
+ }
88
+
94
+
89
+ /* Properties are supposed to be unique and sorted on pr_type. */
95
+#define DO_MOVI(N, I) (I)
90
+ if (have_prev_type && pr_type <= *prev_type) {
96
+#define DO_ANDI(N, I) ((N) & (I))
91
+ if (pr_type == *prev_type) {
97
+#define DO_ORRI(N, I) ((N) | (I))
92
+ error_setg(errp, "Duplicate property in PT_GNU_PROPERTY");
98
+
93
+ } else {
99
+DO_1OP_IMM(vmovi, DO_MOVI)
94
+ error_setg(errp, "Unsorted property in PT_GNU_PROPERTY");
100
+DO_1OP_IMM(vandi, DO_ANDI)
95
+ }
101
+DO_1OP_IMM(vorri, DO_ORRI)
102
+
103
#define DO_2OP(OP, ESIZE, TYPE, FN) \
104
void HELPER(glue(mve_, OP))(CPUARMState *env, \
105
void *vd, void *vn, void *vm) \
106
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
107
index XXXXXXX..XXXXXXX 100644
108
--- a/target/arm/translate-mve.c
109
+++ b/target/arm/translate-mve.c
110
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
111
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
112
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
113
typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
114
+typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
115
116
/* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
117
static inline long mve_qreg_offset(unsigned reg)
118
@@ -XXX,XX +XXX,XX @@ static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
119
mve_update_eci(s);
120
return true;
121
}
122
+
123
+static bool do_1imm(DisasContext *s, arg_1imm *a, MVEGenOneOpImmFn *fn)
124
+{
125
+ TCGv_ptr qd;
126
+ uint64_t imm;
127
+
128
+ if (!dc_isar_feature(aa32_mve, s) ||
129
+ !mve_check_qreg_bank(s, a->qd) ||
130
+ !fn) {
96
+ return false;
131
+ return false;
97
+ }
132
+ }
98
+ *prev_type = pr_type;
133
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
99
+
100
+ if (!arch_parse_elf_property(pr_type, pr_datasz, data, info, errp)) {
101
+ return false;
102
+ }
103
+
104
+ *off += 2 * sizeof(uint32_t) + step;
105
+ return true;
106
+
107
+ error_data:
108
+ error_setg(errp, "Ill-formed property in PT_GNU_PROPERTY");
109
+ return false;
110
+}
111
+
112
+/* Process NT_GNU_PROPERTY_TYPE_0. */
113
+static bool parse_elf_properties(int image_fd,
114
+ struct image_info *info,
115
+ const struct elf_phdr *phdr,
116
+ char bprm_buf[BPRM_BUF_SIZE],
117
+ Error **errp)
118
+{
119
+ union {
120
+ struct elf_note nhdr;
121
+ uint32_t data[NOTE_DATA_SZ / sizeof(uint32_t)];
122
+ } note;
123
+
124
+ int n, off, datasz;
125
+ bool have_prev_type;
126
+ uint32_t prev_type;
127
+
128
+ /* Unless the arch requires properties, ignore them. */
129
+ if (!ARCH_USE_GNU_PROPERTY) {
130
+ return true;
134
+ return true;
131
+ }
135
+ }
132
+
136
+
133
+ /* If the properties are crazy large, that's too bad. */
137
+ imm = asimd_imm_const(a->imm, a->cmode, a->op);
134
+ n = phdr->p_filesz;
135
+ if (n > sizeof(note)) {
136
+ error_setg(errp, "PT_GNU_PROPERTY too large");
137
+ return false;
138
+ }
139
+ if (n < sizeof(note.nhdr)) {
140
+ error_setg(errp, "PT_GNU_PROPERTY too small");
141
+ return false;
142
+ }
143
+
138
+
144
+ if (phdr->p_offset + n <= BPRM_BUF_SIZE) {
139
+ qd = mve_qreg_ptr(a->qd);
145
+ memcpy(&note, bprm_buf + phdr->p_offset, n);
140
+ fn(cpu_env, qd, tcg_constant_i64(imm));
141
+ tcg_temp_free_ptr(qd);
142
+ mve_update_eci(s);
143
+ return true;
144
+}
145
+
146
+static bool trans_Vimm_1r(DisasContext *s, arg_1imm *a)
147
+{
148
+ /* Handle decode of cmode/op here between VORR/VBIC/VMOV */
149
+ MVEGenOneOpImmFn *fn;
150
+
151
+ if ((a->cmode & 1) && a->cmode < 12) {
152
+ if (a->op) {
153
+ /*
154
+ * For op=1, the immediate will be inverted by asimd_imm_const(),
155
+ * so the VBIC becomes a logical AND operation.
156
+ */
157
+ fn = gen_helper_mve_vandi;
158
+ } else {
159
+ fn = gen_helper_mve_vorri;
160
+ }
146
+ } else {
161
+ } else {
147
+ ssize_t len = pread(image_fd, &note, n, phdr->p_offset);
162
+ /* There is one unallocated cmode/op combination in this space */
148
+ if (len != n) {
163
+ if (a->cmode == 15 && a->op == 1) {
149
+ error_setg_errno(errp, errno, "Error reading file header");
150
+ return false;
164
+ return false;
151
+ }
165
+ }
166
+ /* asimd_imm_const() sorts out VMVNI vs VMOVI for us */
167
+ fn = gen_helper_mve_vmovi;
152
+ }
168
+ }
153
+
169
+ return do_1imm(s, a, fn);
154
+ /*
155
+ * The contents of a valid PT_GNU_PROPERTY is a sequence
156
+ * of uint32_t -- swap them all now.
157
+ */
158
+#ifdef BSWAP_NEEDED
159
+ for (int i = 0; i < n / 4; i++) {
160
+ bswap32s(note.data + i);
161
+ }
162
+#endif
163
+
164
+ /*
165
+ * Note that nhdr is 3 words, and that the "name" described by namesz
166
+ * immediately follows nhdr and is thus at the 4th word. Further, all
167
+ * of the inputs to the kernel's round_up are multiples of 4.
168
+ */
169
+ if (note.nhdr.n_type != NT_GNU_PROPERTY_TYPE_0 ||
170
+ note.nhdr.n_namesz != NOTE_NAME_SZ ||
171
+ note.data[3] != GNU0_MAGIC) {
172
+ error_setg(errp, "Invalid note in PT_GNU_PROPERTY");
173
+ return false;
174
+ }
175
+ off = sizeof(note.nhdr) + NOTE_NAME_SZ;
176
+
177
+ datasz = note.nhdr.n_descsz + off;
178
+ if (datasz > n) {
179
+ error_setg(errp, "Invalid note size in PT_GNU_PROPERTY");
180
+ return false;
181
+ }
182
+
183
+ have_prev_type = false;
184
+ prev_type = 0;
185
+ while (1) {
186
+ if (off == datasz) {
187
+ return true; /* end, exit ok */
188
+ }
189
+ if (!parse_elf_property(note.data, &off, datasz, info,
190
+ have_prev_type, &prev_type, errp)) {
191
+ return false;
192
+ }
193
+ have_prev_type = true;
194
+ }
195
+}
170
+}
196
+
197
/* Load an ELF image into the address space.
198
199
IMAGE_NAME is the filename of the image, to use in error messages.
200
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
201
goto exit_errmsg;
202
}
203
*pinterp_name = g_steal_pointer(&interp_name);
204
+ } else if (eppnt->p_type == PT_GNU_PROPERTY) {
205
+ if (!parse_elf_properties(image_fd, info, eppnt, bprm_buf, &err)) {
206
+ goto exit_errmsg;
207
+ }
208
}
209
}
210
211
--
171
--
212
2.20.1
172
2.20.1
213
173
214
174
1
If the M-profile low-overhead-branch extension is implemented, FPSCR
1
Implement the MVE shift-vector-left-by-immediate insns VSHL, VQSHL
2
bits [18:16] are a new field LTPSIZE. If MVE is not implemented
2
and VQSHLU.
3
(currently always true for us) then this field always reads as 4 and
3
4
ignores writes.
4
The size-and-immediate encoding here is the same as Neon, and we
5
5
handle it the same way neon-dp.decode does.
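As a rough illustration of the size-and-immediate encoding described above (a sketch only, not the code decodetree generates): the leading bits of the 6-bit field at insn bits [21:16] select the element size, and the remaining bits hold the shift count, mirroring the @2_shl_b/@2_shl_h/@2_shl_w formats in mve.decode. The ShlImm type and decode_shl_imm6() name are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; QEMU's generated arg_2shift is different. */
typedef struct {
    int size;   /* 0 = byte, 1 = halfword, 2 = word */
    int shift;  /* left-shift count, 0 .. esize - 1 */
} ShlImm;

/*
 * Split the 6-bit immediate field into element size and shift count:
 *   001xxx -> bytes,     3-bit shift
 *   01xxxx -> halfwords, 4-bit shift
 *   1xxxxx -> words,     5-bit shift
 * Returns false for 000xxx, which is not a shift-by-immediate pattern.
 */
static bool decode_shl_imm6(uint32_t imm6, ShlImm *out)
{
    assert(imm6 < 64);
    if (imm6 & 0x20) {
        out->size = 2;
        out->shift = imm6 & 0x1f;
    } else if (imm6 & 0x10) {
        out->size = 1;
        out->shift = imm6 & 0x0f;
    } else if (imm6 & 0x08) {
        out->size = 0;
        out->shift = imm6 & 0x07;
    } else {
        return false;
    }
    return true;
}
```

In the patch itself this split is expressed declaratively by the three format lines, so no hand-written decoder like this is needed.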
6
These bits used to be the vector-length field for the old
7
short-vector extension, so we need to take care that they are not
8
misinterpreted as setting vec_len. We do this with a rearrangement
9
of the vfp_set_fpscr() code that deals with vec_len, vec_stride
10
and also the QC bit; this obviates the need for the M-profile
11
only masking step that we used to have at the start of the function.
12
13
We provide a new field in CPUState for LTPSIZE, even though this
14
will always be 4, in preparation for MVE, so we don't have to
15
come back later and split it out of the vfp.xregs[FPSCR] value.
16
(This state struct field will be saved and restored as part of
17
the FPSCR value via the vmstate_fpscr in machine.c.)
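The field placement described above can be sketched as follows. This is a minimal illustration under the assumption that LTPSIZE is held separately and merged into bits [18:16] when FPSCR is read; the macro and function names are hypothetical and are not QEMU's.

```c
#include <stdint.h>

/* Hypothetical names for this sketch only. */
#define FPSCR_LTPSIZE_SHIFT 16
#define FPSCR_LTPSIZE_MASK  (UINT32_C(7) << FPSCR_LTPSIZE_SHIFT)

/* Merge a separately stored LTPSIZE into FPSCR bits [18:16]. */
static uint32_t fpscr_with_ltpsize(uint32_t other_bits, uint32_t ltpsize)
{
    return (other_bits & ~FPSCR_LTPSIZE_MASK) |
           ((ltpsize & 7) << FPSCR_LTPSIZE_SHIFT);
}

/* Extract LTPSIZE from an FPSCR value. */
static uint32_t fpscr_ltpsize(uint32_t fpscr)
{
    return (fpscr & FPSCR_LTPSIZE_MASK) >> FPSCR_LTPSIZE_SHIFT;
}
```

Without MVE the stored value is always 4, so any guest write to these bits is dropped and the read path reports 4 again.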
18
6
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
Message-id: 20201019151301.2046-11-peter.maydell@linaro.org
9
Message-id: 20210628135835.6690-8-peter.maydell@linaro.org
22
---
10
---
23
target/arm/cpu.h | 1 +
11
target/arm/helper-mve.h | 16 +++++++++++
24
target/arm/cpu.c | 9 +++++++++
12
target/arm/mve.decode | 23 +++++++++++++++
25
target/arm/vfp_helper.c | 6 ++++++
13
target/arm/mve_helper.c | 57 ++++++++++++++++++++++++++++++++++++++
26
3 files changed, 16 insertions(+)
14
target/arm/translate-mve.c | 51 ++++++++++++++++++++++++++++++++++
27
15
4 files changed, 147 insertions(+)
28
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
16
29
index XXXXXXX..XXXXXXX 100644
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
30
--- a/target/arm/cpu.h
18
index XXXXXXX..XXXXXXX 100644
31
+++ b/target/arm/cpu.h
19
--- a/target/arm/helper-mve.h
32
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
20
+++ b/target/arm/helper-mve.h
33
uint32_t fpdscr[M_REG_NUM_BANKS];
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
34
uint32_t cpacr[M_REG_NUM_BANKS];
22
DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
35
uint32_t nsacr;
23
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
36
+ int ltpsize;
24
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
37
} v7m;
25
+
38
26
+DEF_HELPER_FLAGS_4(mve_vshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
/* Information associated with an exception about to be taken:
27
+DEF_HELPER_FLAGS_4(mve_vshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
28
+DEF_HELPER_FLAGS_4(mve_vshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
index XXXXXXX..XXXXXXX 100644
29
+
42
--- a/target/arm/cpu.c
30
+DEF_HELPER_FLAGS_4(mve_vqshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
+++ b/target/arm/cpu.c
31
+DEF_HELPER_FLAGS_4(mve_vqshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
44
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
32
+DEF_HELPER_FLAGS_4(mve_vqshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
45
uint8_t *rom;
33
+
46
uint32_t vecbase;
34
+DEF_HELPER_FLAGS_4(mve_vqshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
47
35
+DEF_HELPER_FLAGS_4(mve_vqshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
+ if (cpu_isar_feature(aa32_lob, cpu)) {
36
+DEF_HELPER_FLAGS_4(mve_vqshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
49
+ /*
37
+
50
+ * LTPSIZE is constant 4 if MVE not implemented, and resets
38
+DEF_HELPER_FLAGS_4(mve_vqshlui_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
+ * to an UNKNOWN value if MVE is implemented. We choose to
39
+DEF_HELPER_FLAGS_4(mve_vqshlui_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
52
+ * always reset to 4.
40
+DEF_HELPER_FLAGS_4(mve_vqshlui_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
53
+ */
41
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
54
+ env->v7m.ltpsize = 4;
42
index XXXXXXX..XXXXXXX 100644
55
+ }
43
--- a/target/arm/mve.decode
56
+
44
+++ b/target/arm/mve.decode
57
if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
45
@@ -XXX,XX +XXX,XX @@
58
env->v7m.secure = true;
46
&2op qd qm qn size
59
} else {
47
&2scalar qd qn rm size
60
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
48
&1imm qd imm cmode op
61
index XXXXXXX..XXXXXXX 100644
49
+&2shift qd qm shift size
62
--- a/target/arm/vfp_helper.c
50
63
+++ b/target/arm/vfp_helper.c
51
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
64
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_get_fpscr)(CPUARMState *env)
52
# Note that both Rn and Qd are 3 bits only (no D bit)
65
| (env->vfp.vec_len << 16)
53
@@ -XXX,XX +XXX,XX @@
66
| (env->vfp.vec_stride << 20);
54
@2scalar .... .... .. size:2 .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
67
55
@2scalar_nosz .... .... .... .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
56
57
+@2_shl_b .... .... .. 001 shift:3 .... .... .... .... &2shift qd=%qd qm=%qm size=0
58
+@2_shl_h .... .... .. 01 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
59
+@2_shl_w .... .... .. 1 shift:5 .... .... .... .... &2shift qd=%qd qm=%qm size=2
60
+
61
# Vector loads and stores
62
63
# Widening loads and narrowing stores:
64
@@ -XXX,XX +XXX,XX @@ VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
65
# So we have a single decode line and check the cmode/op in the
66
# trans function.
67
Vimm_1r 111 . 1111 1 . 00 0 ... ... 0 .... 0 1 . 1 .... @1imm
68
+
69
+# Shifts by immediate
70
+
71
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
72
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
73
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
74
+
75
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_b
76
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_h
77
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_w
78
+
79
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_b
80
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_h
81
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_w
82
+
83
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_b
84
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_h
85
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_w
86
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/mve_helper.c
89
+++ b/target/arm/mve_helper.c
90
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT(vqsubsw, 4, int32_t, DO_SQSUB_W)
91
WRAP_QRSHL_HELPER(do_sqrshl_bhs, N, M, true, satp)
92
#define DO_UQRSHL_OP(N, M, satp) \
93
WRAP_QRSHL_HELPER(do_uqrshl_bhs, N, M, true, satp)
94
+#define DO_SUQSHL_OP(N, M, satp) \
95
+ WRAP_QRSHL_HELPER(do_suqrshl_bhs, N, M, false, satp)
96
97
DO_2OP_SAT_S(vqshls, DO_SQSHL_OP)
98
DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
99
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvsw, 4, uint32_t)
100
DO_VADDV(vaddvub, 1, uint8_t)
101
DO_VADDV(vaddvuh, 2, uint16_t)
102
DO_VADDV(vaddvuw, 4, uint32_t)
103
+
104
+/* Shifts by immediate */
105
+#define DO_2SHIFT(OP, ESIZE, TYPE, FN) \
106
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
107
+ void *vm, uint32_t shift) \
108
+ { \
109
+ TYPE *d = vd, *m = vm; \
110
+ uint16_t mask = mve_element_mask(env); \
111
+ unsigned e; \
112
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
113
+ mergemask(&d[H##ESIZE(e)], \
114
+ FN(m[H##ESIZE(e)], shift), mask); \
115
+ } \
116
+ mve_advance_vpt(env); \
117
+ }
118
+
119
+#define DO_2SHIFT_SAT(OP, ESIZE, TYPE, FN) \
120
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
121
+ void *vm, uint32_t shift) \
122
+ { \
123
+ TYPE *d = vd, *m = vm; \
124
+ uint16_t mask = mve_element_mask(env); \
125
+ unsigned e; \
126
+ bool qc = false; \
127
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
128
+ bool sat = false; \
129
+ mergemask(&d[H##ESIZE(e)], \
130
+ FN(m[H##ESIZE(e)], shift, &sat), mask); \
131
+ qc |= sat & mask & 1; \
132
+ } \
133
+ if (qc) { \
134
+ env->vfp.qc[0] = qc; \
135
+ } \
136
+ mve_advance_vpt(env); \
137
+ }
138
+
139
+/* provide unsigned 2-op shift helpers for all sizes */
140
+#define DO_2SHIFT_U(OP, FN) \
141
+ DO_2SHIFT(OP##b, 1, uint8_t, FN) \
142
+ DO_2SHIFT(OP##h, 2, uint16_t, FN) \
143
+ DO_2SHIFT(OP##w, 4, uint32_t, FN)
144
+
145
+#define DO_2SHIFT_SAT_U(OP, FN) \
146
+ DO_2SHIFT_SAT(OP##b, 1, uint8_t, FN) \
147
+ DO_2SHIFT_SAT(OP##h, 2, uint16_t, FN) \
148
+ DO_2SHIFT_SAT(OP##w, 4, uint32_t, FN)
149
+#define DO_2SHIFT_SAT_S(OP, FN) \
150
+ DO_2SHIFT_SAT(OP##b, 1, int8_t, FN) \
151
+ DO_2SHIFT_SAT(OP##h, 2, int16_t, FN) \
152
+ DO_2SHIFT_SAT(OP##w, 4, int32_t, FN)
153
+
154
+DO_2SHIFT_U(vshli_u, DO_VSHLU)
155
+DO_2SHIFT_SAT_U(vqshli_u, DO_UQSHL_OP)
156
+DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
157
+DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
158
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
159
index XXXXXXX..XXXXXXX 100644
160
--- a/target/arm/translate-mve.c
161
+++ b/target/arm/translate-mve.c
162
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
163
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
164
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
165
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
166
+typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
167
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
168
typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
169
typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
170
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm_1r(DisasContext *s, arg_1imm *a)
171
}
172
return do_1imm(s, a, fn);
173
}
174
+
175
+static bool do_2shift(DisasContext *s, arg_2shift *a, MVEGenTwoOpShiftFn fn,
176
+ bool negateshift)
177
+{
178
+ TCGv_ptr qd, qm;
179
+ int shift = a->shift;
180
+
181
+ if (!dc_isar_feature(aa32_mve, s) ||
182
+ !mve_check_qreg_bank(s, a->qd | a->qm) ||
183
+ !fn) {
184
+ return false;
185
+ }
186
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
187
+ return true;
188
+ }
189
+
68
+ /*
190
+ /*
69
+ * M-profile LTPSIZE overlaps A-profile Stride; whichever of the
191
+ * When we handle a right shift insn using a left-shift helper
70
+ * two is not applicable to this CPU will always be zero.
192
+ * which permits a negative shift count to indicate a right-shift,
193
+ * we must negate the shift count.
71
+ */
194
+ */
72
+ fpscr |= env->v7m.ltpsize << 16;
195
+ if (negateshift) {
73
+
196
+ shift = -shift;
74
fpscr |= vfp_get_fpscr_from_host(env);
197
+ }
75
198
+
76
i = env->vfp.qc[0] | env->vfp.qc[1] | env->vfp.qc[2] | env->vfp.qc[3];
199
+ qd = mve_qreg_ptr(a->qd);
200
+ qm = mve_qreg_ptr(a->qm);
201
+ fn(cpu_env, qd, qm, tcg_constant_i32(shift));
202
+ tcg_temp_free_ptr(qd);
203
+ tcg_temp_free_ptr(qm);
204
+ mve_update_eci(s);
205
+ return true;
206
+}
207
+
208
+#define DO_2SHIFT(INSN, FN, NEGATESHIFT) \
209
+ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
210
+ { \
211
+ static MVEGenTwoOpShiftFn * const fns[] = { \
212
+ gen_helper_mve_##FN##b, \
213
+ gen_helper_mve_##FN##h, \
214
+ gen_helper_mve_##FN##w, \
215
+ NULL, \
216
+ }; \
217
+ return do_2shift(s, a, fns[a->size], NEGATESHIFT); \
218
+ }
219
+
220
+DO_2SHIFT(VSHLI, vshli_u, false)
221
+DO_2SHIFT(VQSHLI_S, vqshli_s, false)
222
+DO_2SHIFT(VQSHLI_U, vqshli_u, false)
223
+DO_2SHIFT(VQSHLUI, vqshlui_s, false)
77
--
224
--
78
2.20.1
225
2.20.1
79
226
80
227
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Implement the MVE vector shift right by immediate insns VSHRI and
2
VRSHRI. As with Neon, we implement these by using helper functions
3
which perform left shifts but allow negative shift counts to indicate
4
right shifts.
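A minimal sketch of the idea, for unsigned bytes only and ignoring predication. shl_u8() is an illustrative stand-in for a left-shift helper that accepts negative counts, not QEMU's DO_VSHLU; vshr_u8() combines the "N - shift" decode step (rsub_8) with the negation done in do_2shift().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Left shift that treats a negative count as a right shift.
 * Counts of magnitude >= 8 shift every bit out of an unsigned byte.
 */
static uint8_t shl_u8(uint8_t val, int shift)
{
    if (shift >= 8 || shift <= -8) {
        return 0;
    }
    if (shift < 0) {
        return (uint8_t)(val >> -shift);
    }
    return (uint8_t)(val << shift);
}

/*
 * Right shift by immediate: the 3-bit field encodes 8 - shift, so
 * recover the real count first, then negate it for the helper.
 */
static uint8_t vshr_u8(uint8_t val, int imm3)
{
    assert(imm3 >= 0 && imm3 < 8);
    int right = 8 - imm3;   /* real shift count, 1 .. 8 */
    return shl_u8(val, -right);
}
```

Reusing one helper for both directions means the right-shift insns need only a decode-time transform and a flag, rather than a second family of helper functions.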
2
5
3
When TBI is enabled in a given regime, 56 bits of the address
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
are significant and we need to clear out any other matching
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
virtual addresses with differing tags.
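To illustrate why the flush needs a bit count: with TBI enabled, two virtual addresses that differ only in the top byte refer to the same translation. The helper below is a hypothetical sketch of that comparison, not the TLB code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if two virtual addresses agree in their low `bits` bits.
 * bits is 56 when TBI is enabled for the address, 64 otherwise.
 */
static bool va_match_bits(uint64_t a, uint64_t b, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    uint64_t mask = bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
    return ((a ^ b) & mask) == 0;
}
```

With bits == 56 the tagged addresses 0x1200000000001000 and 0x3400000000001000 compare equal, so a flush of one must also evict an entry cached under the other.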
8
Message-id: 20210628135835.6690-9-peter.maydell@linaro.org
9
---
10
target/arm/helper-mve.h | 12 ++++++++++++
11
target/arm/translate.h | 20 ++++++++++++++++++++
12
target/arm/mve.decode | 28 ++++++++++++++++++++++++++++
13
target/arm/mve_helper.c | 7 +++++++
14
target/arm/translate-mve.c | 5 +++++
15
target/arm/translate-neon.c | 18 ------------------
16
6 files changed, 72 insertions(+), 18 deletions(-)
6
17
7
The other uses of tlb_flush_page (without mmuidx) in this file
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
8
are only used by aarch32 mode.
9
10
Fixes: 38d931687fa1
11
Reported-by: Jordan Frank <jordanfrank@fb.com>
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
15
Message-id: 20201016210754.818257-3-richard.henderson@linaro.org
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
18
target/arm/helper.c | 46 ++++++++++++++++++++++++++++++++++++++-------
19
1 file changed, 39 insertions(+), 7 deletions(-)
20
21
diff --git a/target/arm/helper.c b/target/arm/helper.c
22
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/helper.c
20
--- a/target/arm/helper-mve.h
24
+++ b/target/arm/helper.c
21
+++ b/target/arm/helper-mve.h
25
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
26
#endif
23
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
27
24
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
28
static void switch_mode(CPUARMState *env, int mode);
25
29
+static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx);
26
+DEF_HELPER_FLAGS_4(mve_vshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
27
+DEF_HELPER_FLAGS_4(mve_vshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
static int vfp_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg)
28
+DEF_HELPER_FLAGS_4(mve_vshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
{
29
+
33
@@ -XXX,XX +XXX,XX @@ static int vae1_tlbmask(CPUARMState *env)
30
DEF_HELPER_FLAGS_4(mve_vshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
}
31
DEF_HELPER_FLAGS_4(mve_vshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(mve_vshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
DEF_HELPER_FLAGS_4(mve_vqshlui_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
DEF_HELPER_FLAGS_4(mve_vqshlui_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
DEF_HELPER_FLAGS_4(mve_vqshlui_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
37
+
38
+DEF_HELPER_FLAGS_4(mve_vrshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(mve_vrshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(mve_vrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
44
+DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
45
diff --git a/target/arm/translate.h b/target/arm/translate.h
46
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/translate.h
48
+++ b/target/arm/translate.h
49
@@ -XXX,XX +XXX,XX @@ static inline int times_2_plus_1(DisasContext *s, int x)
50
return x * 2 + 1;
35
}
51
}
36
52
37
+/* Return 56 if TBI is enabled, 64 otherwise. */
53
+static inline int rsub_64(DisasContext *s, int x)
38
+static int tlbbits_for_regime(CPUARMState *env, ARMMMUIdx mmu_idx,
39
+ uint64_t addr)
40
+{
54
+{
41
+ uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
55
+ return 64 - x;
42
+ int tbi = aa64_va_parameter_tbi(tcr, mmu_idx);
43
+ int select = extract64(addr, 55, 1);
44
+
45
+ return (tbi >> select) & 1 ? 56 : 64;
46
+}
56
+}
47
+
57
+
48
+static int vae1_tlbbits(CPUARMState *env, uint64_t addr)
58
+static inline int rsub_32(DisasContext *s, int x)
49
+{
59
+{
50
+ ARMMMUIdx mmu_idx;
60
+ return 32 - x;
51
+
52
+ /* Only the regime of the mmu_idx below is significant. */
53
+ if (arm_is_secure_below_el3(env)) {
54
+ mmu_idx = ARMMMUIdx_SE10_0;
55
+ } else if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE))
56
+ == (HCR_E2H | HCR_TGE)) {
57
+ mmu_idx = ARMMMUIdx_E20_0;
58
+ } else {
59
+ mmu_idx = ARMMMUIdx_E10_0;
60
+ }
61
+ return tlbbits_for_regime(env, mmu_idx, addr);
62
+}
61
+}
63
+
62
+
64
static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
63
+static inline int rsub_16(DisasContext *s, int x)
65
uint64_t value)
64
+{
65
+ return 16 - x;
66
+}
67
+
68
+static inline int rsub_8(DisasContext *s, int x)
69
+{
70
+ return 8 - x;
71
+}
72
+
73
static inline int arm_dc_feature(DisasContext *dc, int feature)
66
{
74
{
67
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
75
return (dc->features & (1ULL << feature)) != 0;
68
CPUState *cs = env_cpu(env);
76
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
69
int mask = vae1_tlbmask(env);
77
index XXXXXXX..XXXXXXX 100644
70
uint64_t pageaddr = sextract64(value << 12, 0, 56);
78
--- a/target/arm/mve.decode
71
+ int bits = vae1_tlbbits(env, pageaddr);
79
+++ b/target/arm/mve.decode
72
80
@@ -XXX,XX +XXX,XX @@
73
- tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
81
@2_shl_h .... .... .. 01 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
74
+ tlb_flush_page_bits_by_mmuidx_all_cpus_synced(cs, pageaddr, mask, bits);
82
@2_shl_w .... .... .. 1 shift:5 .... .... .... .... &2shift qd=%qd qm=%qm size=2
83
84
+# Right shifts are encoded as N - shift, where N is the element size in bits.
85
+%rshift_i5 16:5 !function=rsub_32
86
+%rshift_i4 16:4 !function=rsub_16
87
+%rshift_i3 16:3 !function=rsub_8
88
+
89
+@2_shr_b .... .... .. 001 ... .... .... .... .... &2shift qd=%qd qm=%qm \
90
+ size=0 shift=%rshift_i3
91
+@2_shr_h .... .... .. 01 .... .... .... .... .... &2shift qd=%qd qm=%qm \
92
+ size=1 shift=%rshift_i4
93
+@2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \
94
+ size=2 shift=%rshift_i5
95
+
96
# Vector loads and stores
97
98
# Widening loads and narrowing stores:
99
@@ -XXX,XX +XXX,XX @@ VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_w
100
VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_b
101
VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_h
102
VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_w
103
+
104
+VSHRI_S 111 0 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_b
105
+VSHRI_S 111 0 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_h
106
+VSHRI_S 111 0 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_w
107
+
108
+VSHRI_U 111 1 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_b
109
+VSHRI_U 111 1 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_h
110
+VSHRI_U 111 1 1111 1 . ... ... ... 0 0000 0 1 . 1 ... 0 @2_shr_w
111
+
112
+VRSHRI_S 111 0 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_b
113
+VRSHRI_S 111 0 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h
114
+VRSHRI_S 111 0 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
115
+
116
+VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_b
117
+VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h
118
+VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
119
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
120
index XXXXXXX..XXXXXXX 100644
121
--- a/target/arm/mve_helper.c
122
+++ b/target/arm/mve_helper.c
123
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvuw, 4, uint32_t)
124
DO_2SHIFT(OP##b, 1, uint8_t, FN) \
125
DO_2SHIFT(OP##h, 2, uint16_t, FN) \
126
DO_2SHIFT(OP##w, 4, uint32_t, FN)
127
+#define DO_2SHIFT_S(OP, FN) \
128
+ DO_2SHIFT(OP##b, 1, int8_t, FN) \
129
+ DO_2SHIFT(OP##h, 2, int16_t, FN) \
130
+ DO_2SHIFT(OP##w, 4, int32_t, FN)
131
132
#define DO_2SHIFT_SAT_U(OP, FN) \
133
DO_2SHIFT_SAT(OP##b, 1, uint8_t, FN) \
134
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvuw, 4, uint32_t)
135
DO_2SHIFT_SAT(OP##w, 4, int32_t, FN)
136
137
DO_2SHIFT_U(vshli_u, DO_VSHLU)
138
+DO_2SHIFT_S(vshli_s, DO_VSHLS)
139
DO_2SHIFT_SAT_U(vqshli_u, DO_UQSHL_OP)
140
DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
141
DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
142
+DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
143
+DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
144
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
145
index XXXXXXX..XXXXXXX 100644
146
--- a/target/arm/translate-mve.c
147
+++ b/target/arm/translate-mve.c
148
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VSHLI, vshli_u, false)
149
DO_2SHIFT(VQSHLI_S, vqshli_s, false)
150
DO_2SHIFT(VQSHLI_U, vqshli_u, false)
151
DO_2SHIFT(VQSHLUI, vqshlui_s, false)
152
+/* These right shifts use a left-shift helper with negated shift count */
153
+DO_2SHIFT(VSHRI_S, vshli_s, true)
154
+DO_2SHIFT(VSHRI_U, vshli_u, true)
155
+DO_2SHIFT(VRSHRI_S, vrshli_s, true)
156
+DO_2SHIFT(VRSHRI_U, vrshli_u, true)
157
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
158
index XXXXXXX..XXXXXXX 100644
159
--- a/target/arm/translate-neon.c
160
+++ b/target/arm/translate-neon.c
161
@@ -XXX,XX +XXX,XX @@ static inline int plus1(DisasContext *s, int x)
162
return x + 1;
75
}
163
}
76
164
77
static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
165
-static inline int rsub_64(DisasContext *s, int x)
78
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
166
-{
79
CPUState *cs = env_cpu(env);
167
- return 64 - x;
80
int mask = vae1_tlbmask(env);
168
-}
81
uint64_t pageaddr = sextract64(value << 12, 0, 56);
169
-
82
+ int bits = vae1_tlbbits(env, pageaddr);
170
-static inline int rsub_32(DisasContext *s, int x)
83
171
-{
84
if (tlb_force_broadcast(env)) {
172
- return 32 - x;
85
- tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
173
-}
86
+ tlb_flush_page_bits_by_mmuidx_all_cpus_synced(cs, pageaddr, mask, bits);
174
-static inline int rsub_16(DisasContext *s, int x)
87
} else {
175
-{
88
- tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
176
- return 16 - x;
89
+ tlb_flush_page_bits_by_mmuidx(cs, pageaddr, mask, bits);
177
-}
90
}
178
-static inline int rsub_8(DisasContext *s, int x)
91
}
179
-{
92
180
- return 8 - x;
93
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
181
-}
182
-
183
static inline int neon_3same_fp_size(DisasContext *s, int x)
94
{
184
{
95
CPUState *cs = env_cpu(env);
185
/* Convert 0==fp32, 1==fp16 into a MO_* value */
96
uint64_t pageaddr = sextract64(value << 12, 0, 56);
97
+ int bits = tlbbits_for_regime(env, ARMMMUIdx_E2, pageaddr);
98
99
- tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
100
- ARMMMUIdxBit_E2);
101
+ tlb_flush_page_bits_by_mmuidx_all_cpus_synced(cs, pageaddr,
102
+ ARMMMUIdxBit_E2, bits);
103
}
104
105
static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
106
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
107
{
108
CPUState *cs = env_cpu(env);
109
uint64_t pageaddr = sextract64(value << 12, 0, 56);
110
+ int bits = tlbbits_for_regime(env, ARMMMUIdx_SE3, pageaddr);
111
112
- tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
113
- ARMMMUIdxBit_SE3);
114
+ tlb_flush_page_bits_by_mmuidx_all_cpus_synced(cs, pageaddr,
115
+ ARMMMUIdxBit_SE3, bits);
116
}
117
118
static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
119
--
186
--
120
2.20.1
187
2.20.1
121
188
122
189
1
From: Havard Skinnemoen <hskinnemoen@google.com>
1
Implement the MVE VSHLL (vector shift left long) insn. This has two
2
encodings: the T1 encoding is the usual shift-by-immediate format,
3
and the T2 encoding is a special case where the shift count is always
4
equal to the element size.
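A rough sketch of the widening shift for unsigned byte inputs, assuming the usual bottom/top convention (the B form reads even-numbered source elements, the T form odd-numbered ones). It ignores predication and host-endian element ordering, and vshll_u8() is a hypothetical name, not one of the helpers added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen eight of the sixteen source bytes to halfwords and shift left.
 * top == 0 selects elements 0, 2, 4, ...; top == 1 selects 1, 3, 5, ...
 * shift may equal the element size (8), which is the T2 encoding case.
 */
static void vshll_u8(uint16_t d[8], const uint8_t m[16],
                     unsigned shift, int top)
{
    assert(shift <= 8 && (top == 0 || top == 1));
    for (int le = 0; le < 8; le++) {
        d[le] = (uint16_t)((uint16_t)m[le * 2 + top] << shift);
    }
}
```

Because the result element is twice as wide as the source, a shift by the full source element size loses no bits, which is why that count needs its own encoding.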
2
5
3
This test exercises the various modes of the npcm7xx timer. In
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
particular, it triggers the bug found by the fuzzer, as reported here:
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-10-peter.maydell@linaro.org
9
---
10
target/arm/helper-mve.h | 9 +++++++
11
target/arm/mve.decode | 53 +++++++++++++++++++++++++++++++++++---
12
target/arm/mve_helper.c | 32 +++++++++++++++++++++++
13
target/arm/translate-mve.c | 15 +++++++++++
14
4 files changed, 105 insertions(+), 4 deletions(-)
5
15
6
https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg02992.html
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
7
17
index XXXXXXX..XXXXXXX 100644
8
It also found several other bugs, especially related to interrupt
18
--- a/target/arm/helper-mve.h
9
handling.
19
+++ b/target/arm/helper-mve.h
10
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
11
The test exercises all the timers in all the timer modules, which
21
DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
12
expands to 180 test cases in total.
22
DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
13
23
DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
14
Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
24
+
15
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
25
+DEF_HELPER_FLAGS_4(mve_vshllbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
16
Message-id: 20201008232154.94221-2-hskinnemoen@google.com
26
+DEF_HELPER_FLAGS_4(mve_vshllbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
27
+DEF_HELPER_FLAGS_4(mve_vshllbub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
---
28
+DEF_HELPER_FLAGS_4(mve_vshllbuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
tests/qtest/npcm7xx_timer-test.c | 562 +++++++++++++++++++++++++++++++
29
+DEF_HELPER_FLAGS_4(mve_vshlltsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
tests/qtest/meson.build | 1 +
30
+DEF_HELPER_FLAGS_4(mve_vshlltsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
2 files changed, 563 insertions(+)
31
+DEF_HELPER_FLAGS_4(mve_vshlltub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
create mode 100644 tests/qtest/npcm7xx_timer-test.c
32
+DEF_HELPER_FLAGS_4(mve_vshlltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
24
diff --git a/tests/qtest/npcm7xx_timer-test.c b/tests/qtest/npcm7xx_timer-test.c
34
index XXXXXXX..XXXXXXX 100644
25
new file mode 100644
35
--- a/target/arm/mve.decode
26
index XXXXXXX..XXXXXXX
36
+++ b/target/arm/mve.decode
27
--- /dev/null
28
+++ b/tests/qtest/npcm7xx_timer-test.c
29
@@ -XXX,XX +XXX,XX @@
37
@@ -XXX,XX +XXX,XX @@
30
+/*
38
@2_shl_h .... .... .. 01 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
31
+ * QTest testcase for the Nuvoton NPCM7xx Timer
39
@2_shl_w .... .... .. 1 shift:5 .... .... .... .... &2shift qd=%qd qm=%qm size=2
32
+ *
40
33
+ * Copyright 2020 Google LLC
41
+@2_shll_b .... .... ... 01 shift:3 .... .... .... .... &2shift qd=%qd qm=%qm size=0
34
+ *
42
+@2_shll_h .... .... ... 1 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
35
+ * This program is free software; you can redistribute it and/or modify it
43
+# VSHLL encoding T2 where shift == esize
36
+ * under the terms of the GNU General Public License as published by the
44
+@2_shll_esize_b .... .... .... 00 .. .... .... .... .... &2shift \
37
+ * Free Software Foundation; either version 2 of the License, or
45
+ qd=%qd qm=%qm size=0 shift=8
38
+ * (at your option) any later version.
46
+@2_shll_esize_h .... .... .... 01 .. .... .... .... .... &2shift \
39
+ *
47
+ qd=%qd qm=%qm size=1 shift=16
40
+ * This program is distributed in the hope that it will be useful, but WITHOUT
41
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
42
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
43
+ * for more details.
44
+ */
45
+
48
+
46
+#include "qemu/osdep.h"
49
# Right shifts are encoded as N - shift, where N is the element size in bits.
47
+#include "qemu/timer.h"
50
%rshift_i5 16:5 !function=rsub_32
48
+#include "libqtest-single.h"
51
%rshift_i4 16:4 !function=rsub_16
49
+
52
@@ -XXX,XX +XXX,XX @@ VADD 1110 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
50
+#define TIM_REF_HZ (25000000)
53
VSUB 1111 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
51
+
54
VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
52
+/* Bits in TCSRx */
55
53
+#define CEN BIT(30)
56
-VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
54
+#define IE BIT(29)
57
-VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
55
+#define MODE_ONESHOT (0 << 27)
58
+# The VSHLL T2 encoding is not a @2op pattern, but is here because it
56
+#define MODE_PERIODIC (1 << 27)
59
+# overlaps what would be size=0b11 VMULH/VRMULH
57
+#define CRST BIT(26)
58
+#define CACT BIT(25)
59
+#define PRESCALE(x) (x)
60
+
61
+/* Registers shared between all timers in a module. */
62
+#define TISR 0x18
63
+#define WTCR 0x1c
64
+# define WTCLK(x) ((x) << 10)
65
+
66
+/* Power-on default; used to re-initialize timers before each test. */
67
+#define TCSR_DEFAULT PRESCALE(5)
68
+
69
+/* Register offsets for a timer within a timer block. */
70
+typedef struct Timer {
71
+ unsigned int tcsr_offset;
72
+ unsigned int ticr_offset;
73
+ unsigned int tdr_offset;
74
+} Timer;
75
+
76
+/* A timer block containing 5 timers. */
77
+typedef struct TimerBlock {
78
+ int irq_base;
79
+ uint64_t base_addr;
80
+} TimerBlock;
81
+
82
+/* Testdata for testing a particular timer within a timer block. */
83
+typedef struct TestData {
84
+ const TimerBlock *tim;
85
+ const Timer *timer;
86
+} TestData;
87
+
88
+const TimerBlock timer_block[] = {
89
+ {
90
+ .irq_base = 32,
91
+ .base_addr = 0xf0008000,
92
+ },
93
+ {
94
+ .irq_base = 37,
95
+ .base_addr = 0xf0009000,
96
+ },
97
+ {
98
+ .irq_base = 42,
99
+ .base_addr = 0xf000a000,
100
+ },
101
+};
102
+
103
+const Timer timer[] = {
104
+ {
105
+ .tcsr_offset = 0x00,
106
+ .ticr_offset = 0x08,
107
+ .tdr_offset = 0x10,
108
+ }, {
109
+ .tcsr_offset = 0x04,
110
+ .ticr_offset = 0x0c,
111
+ .tdr_offset = 0x14,
112
+ }, {
113
+ .tcsr_offset = 0x20,
114
+ .ticr_offset = 0x28,
115
+ .tdr_offset = 0x30,
116
+ }, {
117
+ .tcsr_offset = 0x24,
118
+ .ticr_offset = 0x2c,
119
+ .tdr_offset = 0x34,
120
+ }, {
121
+ .tcsr_offset = 0x40,
122
+ .ticr_offset = 0x48,
123
+ .tdr_offset = 0x50,
124
+ },
125
+};
126
+
127
+/* Returns the index of the timer block. */
128
+static int tim_index(const TimerBlock *tim)
129
+{
60
+{
130
+ ptrdiff_t diff = tim - timer_block;
61
+ VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
131
+
62
+ VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
132
+ g_assert(diff >= 0 && diff < ARRAY_SIZE(timer_block));
63
133
+
64
-VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
134
+ return diff;
65
-VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
66
+ VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
135
+}
67
+}
136
+
68
+
137
+/* Returns the index of a timer within a timer block. */
138
+static int timer_index(const Timer *t)
139
+{
69
+{
140
+ ptrdiff_t diff = t - timer;
70
+ VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
71
+ VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
141
+
72
+
142
+ g_assert(diff >= 0 && diff < ARRAY_SIZE(timer));
73
+ VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
143
+
144
+ return diff;
145
+}
74
+}
146
+
75
+
147
+/* Returns the irq line for a given timer. */
148
+static int tim_timer_irq(const TestData *td)
149
+{
76
+{
150
+ return td->tim->irq_base + timer_index(td->timer);
77
+ VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
78
+ VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
79
+
80
+ VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
151
+}
81
+}
152
+
82
+
153
+/* Register read/write accessors. */
83
+{
84
+ VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
85
+ VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
154
+
86
+
155
+static void tim_write(const TestData *td,
87
+ VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
156
+ unsigned int offset, uint32_t value)
157
+{
158
+ writel(td->tim->base_addr + offset, value);
159
+}
88
+}
89
90
VMAX_S 111 0 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
91
VMAX_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
92
@@ -XXX,XX +XXX,XX @@ VRSHRI_S 111 0 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
93
VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_b
94
VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h
95
VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
160
+
96
+
161
+static uint32_t tim_read(const TestData *td, unsigned int offset)
97
+# VSHLL T1 encoding; the T2 VSHLL encoding is elsewhere in this file
162
+{
98
+VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b
163
+ return readl(td->tim->base_addr + offset);
99
+VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h
164
+}
165
+
100
+
166
+static void tim_write_tcsr(const TestData *td, uint32_t value)
101
+VSHLL_BU 111 1 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b
167
+{
102
+VSHLL_BU 111 1 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h
168
+ tim_write(td, td->timer->tcsr_offset, value);
169
+}
170
+
103
+
171
+static uint32_t tim_read_tcsr(const TestData *td)
104
+VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
172
+{
105
+VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
173
+ return tim_read(td, td->timer->tcsr_offset);
174
+}
175
+
106
+
176
+static void tim_write_ticr(const TestData *td, uint32_t value)
107
+VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
177
+{
108
+VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
178
+ tim_write(td, td->timer->ticr_offset, value);
109
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
179
+}
110
index XXXXXXX..XXXXXXX 100644
180
+
111
--- a/target/arm/mve_helper.c
181
+static uint32_t tim_read_ticr(const TestData *td)
112
+++ b/target/arm/mve_helper.c
182
+{
113
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
183
+ return tim_read(td, td->timer->ticr_offset);
114
DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
184
+}
115
DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
185
+
116
DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
186
+static uint32_t tim_read_tdr(const TestData *td)
187
+{
188
+ return tim_read(td, td->timer->tdr_offset);
189
+}
190
+
191
+/* Returns the number of nanoseconds to count the given number of cycles. */
192
+static int64_t tim_calculate_step(uint32_t count, uint32_t prescale)
193
+{
194
+ return (1000000000LL / TIM_REF_HZ) * count * (prescale + 1);
195
+}
196
+
197
+/* Returns a bitmask corresponding to the timer under test. */
198
+static uint32_t tim_timer_bit(const TestData *td)
199
+{
200
+ return BIT(timer_index(td->timer));
201
+}
202
+
203
+/* Resets all timers to power-on defaults. */
204
+static void tim_reset(const TestData *td)
205
+{
206
+ int i, j;
207
+
208
+ /* Reset all the timers, in case a previous test left a timer running. */
209
+ for (i = 0; i < ARRAY_SIZE(timer_block); i++) {
210
+ for (j = 0; j < ARRAY_SIZE(timer); j++) {
211
+ writel(timer_block[i].base_addr + timer[j].tcsr_offset,
212
+ CRST | TCSR_DEFAULT);
213
+ }
214
+ writel(timer_block[i].base_addr + TISR, -1);
215
+ }
216
+}
217
+
218
+/* Verifies the reset state of a timer. */
219
+static void test_reset(gconstpointer test_data)
220
+{
221
+ const TestData *td = test_data;
222
+
223
+ tim_reset(td);
224
+
225
+ g_assert_cmphex(tim_read_tcsr(td), ==, TCSR_DEFAULT);
226
+ g_assert_cmphex(tim_read_ticr(td), ==, 0);
227
+ g_assert_cmphex(tim_read_tdr(td), ==, 0);
228
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
229
+ g_assert_cmphex(tim_read(td, WTCR), ==, WTCLK(1));
230
+}
231
+
232
+/* Verifies that CRST wins if both CEN and CRST are set. */
233
+static void test_reset_overrides_enable(gconstpointer test_data)
234
+{
235
+ const TestData *td = test_data;
236
+
237
+ tim_reset(td);
238
+
239
+ /* CRST should force CEN to 0 */
240
+ tim_write_tcsr(td, CEN | CRST | TCSR_DEFAULT);
241
+
242
+ g_assert_cmphex(tim_read_tcsr(td), ==, TCSR_DEFAULT);
243
+ g_assert_cmphex(tim_read_tdr(td), ==, 0);
244
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
245
+}
246
+
247
+/* Verifies the behavior when CEN is set and then cleared. */
248
+static void test_oneshot_enable_then_disable(gconstpointer test_data)
249
+{
250
+ const TestData *td = test_data;
251
+
252
+ tim_reset(td);
253
+
254
+ /* Enable the timer with zero initial count, then disable it again. */
255
+ tim_write_tcsr(td, CEN | TCSR_DEFAULT);
256
+ tim_write_tcsr(td, TCSR_DEFAULT);
257
+
258
+ g_assert_cmphex(tim_read_tcsr(td), ==, TCSR_DEFAULT);
259
+ g_assert_cmphex(tim_read_tdr(td), ==, 0);
260
+ /* Timer interrupt flag should be set, but interrupts are not enabled. */
261
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
262
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
263
+}
264
+
265
+/* Verifies that a one-shot timer fires when expected with prescaler 5. */
266
+static void test_oneshot_ps5(gconstpointer test_data)
267
+{
268
+ const TestData *td = test_data;
269
+ unsigned int count = 256;
270
+ unsigned int ps = 5;
271
+
272
+ tim_reset(td);
273
+
274
+ tim_write_ticr(td, count);
275
+ tim_write_tcsr(td, CEN | PRESCALE(ps));
276
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
277
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
278
+
279
+ clock_step(tim_calculate_step(count, ps) - 1);
280
+
281
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
282
+ g_assert_cmpuint(tim_read_tdr(td), <, count);
283
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
284
+
285
+ clock_step(1);
286
+
287
+ g_assert_cmphex(tim_read_tcsr(td), ==, PRESCALE(ps));
288
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
289
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
290
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
291
+
292
+ /* Clear the interrupt flag. */
293
+ tim_write(td, TISR, tim_timer_bit(td));
294
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
295
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
296
+
297
+ /* Verify that this isn't a periodic timer. */
298
+ clock_step(2 * tim_calculate_step(count, ps));
299
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
300
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
301
+}
302
+
303
+/* Verifies that a one-shot timer fires when expected with prescaler 0. */
304
+static void test_oneshot_ps0(gconstpointer test_data)
305
+{
306
+ const TestData *td = test_data;
307
+ unsigned int count = 1;
308
+ unsigned int ps = 0;
309
+
310
+ tim_reset(td);
311
+
312
+ tim_write_ticr(td, count);
313
+ tim_write_tcsr(td, CEN | PRESCALE(ps));
314
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
315
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
316
+
317
+ clock_step(tim_calculate_step(count, ps) - 1);
318
+
319
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
320
+ g_assert_cmpuint(tim_read_tdr(td), <, count);
321
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
322
+
323
+ clock_step(1);
324
+
325
+ g_assert_cmphex(tim_read_tcsr(td), ==, PRESCALE(ps));
326
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
327
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
328
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
329
+}
330
+
331
+/* Verifies that a one-shot timer fires when expected with highest prescaler. */
332
+static void test_oneshot_ps255(gconstpointer test_data)
333
+{
334
+ const TestData *td = test_data;
335
+ unsigned int count = (1U << 24) - 1;
336
+ unsigned int ps = 255;
337
+
338
+ tim_reset(td);
339
+
340
+ tim_write_ticr(td, count);
341
+ tim_write_tcsr(td, CEN | PRESCALE(ps));
342
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
343
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
344
+
345
+ clock_step(tim_calculate_step(count, ps) - 1);
346
+
347
+ g_assert_cmphex(tim_read_tcsr(td), ==, CEN | CACT | PRESCALE(ps));
348
+ g_assert_cmpuint(tim_read_tdr(td), <, count);
349
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
350
+
351
+ clock_step(1);
352
+
353
+ g_assert_cmphex(tim_read_tcsr(td), ==, PRESCALE(ps));
354
+ g_assert_cmpuint(tim_read_tdr(td), ==, count);
355
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
356
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
357
+}
358
+
359
+/* Verifies that a oneshot timer fires an interrupt when expected. */
360
+static void test_oneshot_interrupt(gconstpointer test_data)
361
+{
362
+ const TestData *td = test_data;
363
+ unsigned int count = 256;
364
+ unsigned int ps = 7;
365
+
366
+ tim_reset(td);
367
+
368
+ tim_write_ticr(td, count);
369
+ tim_write_tcsr(td, IE | CEN | MODE_ONESHOT | PRESCALE(ps));
370
+
371
+ clock_step_next();
372
+
373
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
374
+ g_assert_true(qtest_get_irq(global_qtest, tim_timer_irq(td)));
375
+}
376
+
117
+
377
+/*
118
+/*
378
+ * Verifies that the timer can be paused and later resumed, and it still fires
119
+ * Long shifts taking half-sized inputs from top or bottom of the input
379
+ * at the right moment.
120
+ * vector and producing a double-width result. ESIZE, TYPE are for
121
+ * the input, and LESIZE, LTYPE for the output.
122
+ * Unlike the normal shift helpers, we do not handle negative shift counts,
123
+ * because the long shift is strictly left-only.
380
+ */
124
+ */
381
+static void test_pause_resume(gconstpointer test_data)
125
+#define DO_VSHLL(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE) \
382
+{
126
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
383
+ const TestData *td = test_data;
127
+ void *vm, uint32_t shift) \
384
+ unsigned int count = 256;
128
+ { \
385
+ unsigned int ps = 1;
129
+ LTYPE *d = vd; \
386
+
130
+ TYPE *m = vm; \
387
+ tim_reset(td);
131
+ uint16_t mask = mve_element_mask(env); \
388
+
132
+ unsigned le; \
389
+ tim_write_ticr(td, count);
133
+ assert(shift <= 16); \
390
+ tim_write_tcsr(td, IE | CEN | MODE_ONESHOT | PRESCALE(ps));
134
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
391
+
135
+ LTYPE r = (LTYPE)m[H##ESIZE(le * 2 + TOP)] << shift; \
392
+ /* Pause the timer halfway to expiration. */
136
+ mergemask(&d[H##LESIZE(le)], r, mask); \
393
+ clock_step(tim_calculate_step(count / 2, ps));
137
+ } \
394
+ tim_write_tcsr(td, IE | MODE_ONESHOT | PRESCALE(ps));
138
+ mve_advance_vpt(env); \
395
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 2);
396
+
397
+ /* Counter should not advance during the following step. */
398
+ clock_step(2 * tim_calculate_step(count, ps));
399
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 2);
400
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
401
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
402
+
403
+ /* Resume the timer and run _almost_ to expiration. */
404
+ tim_write_tcsr(td, IE | CEN | MODE_ONESHOT | PRESCALE(ps));
405
+ clock_step(tim_calculate_step(count / 2, ps) - 1);
406
+ g_assert_cmpuint(tim_read_tdr(td), <, count);
407
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
408
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
409
+
410
+ /* Now, run the rest of the way and verify that the interrupt fires. */
411
+ clock_step(1);
412
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
413
+ g_assert_true(qtest_get_irq(global_qtest, tim_timer_irq(td)));
414
+}
415
+
416
+/* Verifies that the prescaler can be changed while the timer is running. */
417
+static void test_prescaler_change(gconstpointer test_data)
418
+{
419
+ const TestData *td = test_data;
420
+ unsigned int count = 256;
421
+ unsigned int ps = 5;
422
+
423
+ tim_reset(td);
424
+
425
+ tim_write_ticr(td, count);
426
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
427
+
428
+ /* Run a quarter of the way, and change the prescaler. */
429
+ clock_step(tim_calculate_step(count / 4, ps));
430
+ g_assert_cmpuint(tim_read_tdr(td), ==, 3 * count / 4);
431
+ ps = 2;
432
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
433
+ /* The counter must not change. */
434
+ g_assert_cmpuint(tim_read_tdr(td), ==, 3 * count / 4);
435
+
436
+ /* Run another quarter of the way, and change the prescaler again. */
437
+ clock_step(tim_calculate_step(count / 4, ps));
438
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 2);
439
+ ps = 8;
440
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
441
+ /* The counter must not change. */
442
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 2);
443
+
444
+ /* Run another quarter of the way, and change the prescaler again. */
445
+ clock_step(tim_calculate_step(count / 4, ps));
446
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 4);
447
+ ps = 0;
448
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
449
+ /* The counter must not change. */
450
+ g_assert_cmpuint(tim_read_tdr(td), ==, count / 4);
451
+
452
+ /* Run almost to expiration, and verify the timer didn't fire yet. */
453
+ clock_step(tim_calculate_step(count / 4, ps) - 1);
454
+ g_assert_cmpuint(tim_read_tdr(td), <, count);
455
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
456
+
457
+ /* Now, run the rest of the way and verify that the timer fires. */
458
+ clock_step(1);
459
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
460
+}
461
+
462
+/* Verifies that a periodic timer automatically restarts after expiration. */
463
+static void test_periodic_no_interrupt(gconstpointer test_data)
464
+{
465
+ const TestData *td = test_data;
466
+ unsigned int count = 2;
467
+ unsigned int ps = 3;
468
+ int i;
469
+
470
+ tim_reset(td);
471
+
472
+ tim_write_ticr(td, count);
473
+ tim_write_tcsr(td, CEN | MODE_PERIODIC | PRESCALE(ps));
474
+
475
+ for (i = 0; i < 4; i++) {
476
+ clock_step_next();
477
+
478
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
479
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
480
+
481
+ tim_write(td, TISR, tim_timer_bit(td));
482
+
483
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
484
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
485
+ }
486
+}
487
+
488
+/* Verifies that a periodic timer fires an interrupt every time it expires. */
489
+static void test_periodic_interrupt(gconstpointer test_data)
490
+{
491
+ const TestData *td = test_data;
492
+ unsigned int count = 65535;
493
+ unsigned int ps = 2;
494
+ int i;
495
+
496
+ tim_reset(td);
497
+
498
+ tim_write_ticr(td, count);
499
+ tim_write_tcsr(td, CEN | IE | MODE_PERIODIC | PRESCALE(ps));
500
+
501
+ for (i = 0; i < 4; i++) {
502
+ clock_step_next();
503
+
504
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
505
+ g_assert_true(qtest_get_irq(global_qtest, tim_timer_irq(td)));
506
+
507
+ tim_write(td, TISR, tim_timer_bit(td));
508
+
509
+ g_assert_cmphex(tim_read(td, TISR), ==, 0);
510
+ g_assert_false(qtest_get_irq(global_qtest, tim_timer_irq(td)));
511
+ }
512
+}
513
+
514
+/*
515
+ * Verifies that the timer behaves correctly when disabled right before and
516
+ * exactly when it's supposed to expire.
517
+ */
518
+static void test_disable_on_expiration(gconstpointer test_data)
519
+{
520
+ const TestData *td = test_data;
521
+ unsigned int count = 8;
522
+ unsigned int ps = 255;
523
+
524
+ tim_reset(td);
525
+
526
+ tim_write_ticr(td, count);
527
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
528
+
529
+ clock_step(tim_calculate_step(count, ps) - 1);
530
+
531
+ tim_write_tcsr(td, MODE_ONESHOT | PRESCALE(ps));
532
+ tim_write_tcsr(td, CEN | MODE_ONESHOT | PRESCALE(ps));
533
+ clock_step(1);
534
+ tim_write_tcsr(td, MODE_ONESHOT | PRESCALE(ps));
535
+ g_assert_cmphex(tim_read(td, TISR), ==, tim_timer_bit(td));
536
+}
537
+
538
+/*
539
+ * Constructs a name that includes the timer block, timer and testcase name,
540
+ * and adds the test to the test suite.
541
+ */
542
+static void tim_add_test(const char *name, const TestData *td, GTestDataFunc fn)
543
+{
544
+ g_autofree char *full_name;
545
+
546
+ full_name = g_strdup_printf("npcm7xx_timer/tim[%d]/timer[%d]/%s",
547
+ tim_index(td->tim), timer_index(td->timer),
548
+ name);
549
+ qtest_add_data_func(full_name, td, fn);
550
+}
551
+
552
+/* Convenience macro for adding a test with a predictable function name. */
553
+#define add_test(name, td) tim_add_test(#name, td, test_##name)
554
+
555
+int main(int argc, char **argv)
556
+{
557
+ TestData testdata[ARRAY_SIZE(timer_block) * ARRAY_SIZE(timer)];
558
+ int ret;
559
+ int i, j;
560
+
561
+ g_test_init(&argc, &argv, NULL);
562
+ g_test_set_nonfatal_assertions();
563
+
564
+ for (i = 0; i < ARRAY_SIZE(timer_block); i++) {
565
+ for (j = 0; j < ARRAY_SIZE(timer); j++) {
566
+ TestData *td = &testdata[i * ARRAY_SIZE(timer) + j];
567
+ td->tim = &timer_block[i];
568
+ td->timer = &timer[j];
569
+
570
+ add_test(reset, td);
571
+ add_test(reset_overrides_enable, td);
572
+ add_test(oneshot_enable_then_disable, td);
573
+ add_test(oneshot_ps5, td);
574
+ add_test(oneshot_ps0, td);
575
+ add_test(oneshot_ps255, td);
576
+ add_test(oneshot_interrupt, td);
577
+ add_test(pause_resume, td);
578
+ add_test(prescaler_change, td);
579
+ add_test(periodic_no_interrupt, td);
580
+ add_test(periodic_interrupt, td);
581
+ add_test(disable_on_expiration, td);
582
+ }
583
+ }
139
+ }
584
+
140
+
585
+ qtest_start("-machine npcm750-evb");
141
+#define DO_VSHLL_ALL(OP, TOP) \
586
+ qtest_irq_intercept_in(global_qtest, "/machine/soc/a9mpcore/gic");
142
+ DO_VSHLL(OP##sb, TOP, 1, int8_t, 2, int16_t) \
587
+ ret = g_test_run();
143
+ DO_VSHLL(OP##ub, TOP, 1, uint8_t, 2, uint16_t) \
588
+ qtest_end();
144
+ DO_VSHLL(OP##sh, TOP, 2, int16_t, 4, int32_t) \
145
+ DO_VSHLL(OP##uh, TOP, 2, uint16_t, 4, uint32_t) \
589
+
146
+
590
+ return ret;
147
+DO_VSHLL_ALL(vshllb, false)
591
+}
148
+DO_VSHLL_ALL(vshllt, true)
592
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
149
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
593
index XXXXXXX..XXXXXXX 100644
150
index XXXXXXX..XXXXXXX 100644
594
--- a/tests/qtest/meson.build
151
--- a/target/arm/translate-mve.c
595
+++ b/tests/qtest/meson.build
152
+++ b/target/arm/translate-mve.c
596
@@ -XXX,XX +XXX,XX @@ qtests_arm = \
153
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VSHRI_S, vshli_s, true)
597
['arm-cpu-features',
154
DO_2SHIFT(VSHRI_U, vshli_u, true)
598
'microbit-test',
155
DO_2SHIFT(VRSHRI_S, vrshli_s, true)
599
'm25p80-test',
156
DO_2SHIFT(VRSHRI_U, vrshli_u, true)
600
+ 'npcm7xx_timer-test',
157
+
601
'test-arm-mptimer',
158
+#define DO_VSHLL(INSN, FN) \
602
'boot-serial-test',
159
+ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
603
'hexloader-test']
160
+ { \
161
+ static MVEGenTwoOpShiftFn * const fns[] = { \
162
+ gen_helper_mve_##FN##b, \
163
+ gen_helper_mve_##FN##h, \
164
+ }; \
165
+ return do_2shift(s, a, fns[a->size], false); \
166
+ }
167
+
168
+DO_VSHLL(VSHLL_BS, vshllbs)
169
+DO_VSHLL(VSHLL_BU, vshllbu)
170
+DO_VSHLL(VSHLL_TS, vshllts)
171
+DO_VSHLL(VSHLL_TU, vshlltu)
604
--
172
--
605
2.20.1
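Note on the timing arithmetic in the NPCM7xx timer test: tim_calculate_step() converts a cycle count and prescaler into nanoseconds of virtual time, which the tests then pass to clock_step(). The standalone sketch below reproduces only that arithmetic so the numbers can be checked by hand. The 25 MHz reference clock comes from the test itself; the assertion on the prescaler range is an assumption added here for illustration (the tests never use a value above 255).

```c
#include <assert.h>
#include <stdint.h>

/* Reference clock used by the test: 25 MHz, i.e. 40 ns per cycle. */
#define TIM_REF_HZ 25000000

/*
 * Nanoseconds needed to count `count` cycles with the given prescaler.
 * The timer counts once every (prescale + 1) reference clock cycles.
 * The LL constant forces the multiplications into 64-bit arithmetic.
 */
static int64_t tim_calculate_step(uint32_t count, uint32_t prescale)
{
    /* Assumption for this sketch: the prescaler field is 8 bits wide. */
    assert(prescale <= 255);
    return (1000000000LL / TIM_REF_HZ) * count * (prescale + 1);
}
```

With count = 256 and prescaler 5, the step is 40 ns * 256 * 6 = 61440 ns. This is why test_oneshot_ps5 steps the clock by that amount minus one, checks that nothing has fired, and then steps by one more nanosecond.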
173
2.20.1
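Note on the VSHLL helper: the DO_VSHLL macro takes every second half-width element of the input (even-numbered for the "bottom" forms, odd-numbered for the "top" forms), widens it, and shifts it left. The sketch below shows the unsigned byte-to-halfword case on plain arrays. It is a simplified illustration, not the QEMU helper: the function name vshll_u8 is hypothetical, predication (mergemask) and the host-endianness H macros are deliberately left out, and the input and output are assumed not to alias.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen the bottom (top = 0) or top (top = 1) bytes of a 16-byte vector
 * to 16 bits and shift each one left by `shift`.
 * Hypothetical helper for illustration; predication is ignored.
 */
static void vshll_u8(uint16_t d[8], const uint8_t m[16],
                     int top, unsigned shift)
{
    /* For byte inputs the shift is at most the element size, 8. */
    assert(shift <= 8);
    for (unsigned le = 0; le < 8; le++) {
        /* Widen first, then shift, so no bits are lost. */
        d[le] = (uint16_t)((uint16_t)m[le * 2 + top] << shift);
    }
}
```

Because the value is widened before the shift, a shift equal to the input element size (the T2 encoding in the decode file) simply moves the byte into the upper half of the 16-bit result.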
606
174
607
175
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Implement the MVE VSRI and VSLI insns, which perform a
2
shift-and-insert operation.
2
3
3
These are all of the defines required to parse
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
GNU_PROPERTY_AARCH64_FEATURE_1_AND, copied from binutils.
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Other missing defines related to other GNU program headers
6
Message-id: 20210628135835.6690-11-peter.maydell@linaro.org
6
and notes are elided for now.
7
---
8
target/arm/helper-mve.h | 8 ++++++++
9
target/arm/mve.decode | 9 ++++++++
10
target/arm/mve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++
11
target/arm/translate-mve.c | 3 +++
12
4 files changed, 62 insertions(+)
7
13
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20201016184207.786698-4-richard.henderson@linaro.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
include/elf.h | 22 ++++++++++++++++++++++
14
1 file changed, 22 insertions(+)
15
16
diff --git a/include/elf.h b/include/elf.h
17
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
18
--- a/include/elf.h
16
--- a/target/arm/helper-mve.h
19
+++ b/include/elf.h
17
+++ b/target/arm/helper-mve.h
20
@@ -XXX,XX +XXX,XX @@ typedef int64_t Elf64_Sxword;
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vshlltsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
#define PT_NOTE 4
19
DEF_HELPER_FLAGS_4(mve_vshlltsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
#define PT_SHLIB 5
20
DEF_HELPER_FLAGS_4(mve_vshlltub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
#define PT_PHDR 6
21
DEF_HELPER_FLAGS_4(mve_vshlltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
+#define PT_LOOS 0x60000000
25
+#define PT_HIOS 0x6fffffff
26
#define PT_LOPROC 0x70000000
27
#define PT_HIPROC 0x7fffffff
28
29
+#define PT_GNU_PROPERTY (PT_LOOS + 0x474e553)
30
+
22
+
31
#define PT_MIPS_REGINFO 0x70000000
23
+DEF_HELPER_FLAGS_4(mve_vsrib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
#define PT_MIPS_RTPROC 0x70000001
24
+DEF_HELPER_FLAGS_4(mve_vsrih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
#define PT_MIPS_OPTIONS 0x70000002
25
+DEF_HELPER_FLAGS_4(mve_vsriw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
@@ -XXX,XX +XXX,XX @@ typedef struct elf64_shdr {
35
#define NT_ARM_SYSTEM_CALL 0x404 /* ARM system call number */
36
#define NT_ARM_SVE 0x405 /* ARM Scalable Vector Extension regs */
37
38
+/* Defined note types for GNU systems. */
39
+
26
+
40
+#define NT_GNU_PROPERTY_TYPE_0 5 /* Program property */
27
+DEF_HELPER_FLAGS_4(mve_vslib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(mve_vslih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(mve_vsliw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
31
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/mve.decode
33
+++ b/target/arm/mve.decode
34
@@ -XXX,XX +XXX,XX @@ VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
35
36
VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
37
VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
41
+
38
+
42
+/* Values used in GNU .note.gnu.property notes (NT_GNU_PROPERTY_TYPE_0). */
39
+# Shift-and-insert
40
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_b
41
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_h
42
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_w
43
+
43
+
44
+#define GNU_PROPERTY_STACK_SIZE 1
44
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
45
+#define GNU_PROPERTY_NO_COPY_ON_PROTECTED 2
45
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
46
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
47
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/mve_helper.c
50
+++ b/target/arm/mve_helper.c
51
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
52
DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
53
DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
54
55
+/* Shift-and-insert; we always work with 64 bits at a time */
56
+#define DO_2SHIFT_INSERT(OP, ESIZE, SHIFTFN, MASKFN) \
57
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
58
+ void *vm, uint32_t shift) \
59
+ { \
60
+ uint64_t *d = vd, *m = vm; \
61
+ uint16_t mask; \
62
+ uint64_t shiftmask; \
63
+ unsigned e; \
64
+ if (shift == 0 || shift == ESIZE * 8) { \
65
+ /* \
66
+ * Only VSLI can shift by 0; only VSRI can shift by <dt>. \
67
+ * The generic logic would give the right answer for 0 but \
68
+ * fails for <dt>. \
69
+ */ \
70
+ goto done; \
71
+ } \
72
+ assert(shift < ESIZE * 8); \
73
+ mask = mve_element_mask(env); \
74
+ /* ESIZE / 2 gives the MO_* value if ESIZE is in [1,2,4] */ \
75
+ shiftmask = dup_const(ESIZE / 2, MASKFN(ESIZE * 8, shift)); \
76
+ for (e = 0; e < 16 / 8; e++, mask >>= 8) { \
77
+ uint64_t r = (SHIFTFN(m[H8(e)], shift) & shiftmask) | \
78
+ (d[H8(e)] & ~shiftmask); \
79
+ mergemask(&d[H8(e)], r, mask); \
80
+ } \
81
+done: \
82
+ mve_advance_vpt(env); \
83
+ }
46
+
84
+
47
+#define GNU_PROPERTY_LOPROC 0xc0000000
85
+#define DO_SHL(N, SHIFT) ((N) << (SHIFT))
48
+#define GNU_PROPERTY_HIPROC 0xdfffffff
86
+#define DO_SHR(N, SHIFT) ((N) >> (SHIFT))
49
+#define GNU_PROPERTY_LOUSER 0xe0000000
87
+#define SHL_MASK(EBITS, SHIFT) MAKE_64BIT_MASK((SHIFT), (EBITS) - (SHIFT))
50
+#define GNU_PROPERTY_HIUSER 0xffffffff
88
+#define SHR_MASK(EBITS, SHIFT) MAKE_64BIT_MASK(0, (EBITS) - (SHIFT))
51
+
89
+
52
+#define GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc0000000
90
+DO_2SHIFT_INSERT(vsrib, 1, DO_SHR, SHR_MASK)
53
+#define GNU_PROPERTY_AARCH64_FEATURE_1_BTI (1u << 0)
91
+DO_2SHIFT_INSERT(vsrih, 2, DO_SHR, SHR_MASK)
54
+#define GNU_PROPERTY_AARCH64_FEATURE_1_PAC (1u << 1)
92
+DO_2SHIFT_INSERT(vsriw, 4, DO_SHR, SHR_MASK)
93
+DO_2SHIFT_INSERT(vslib, 1, DO_SHL, SHL_MASK)
94
+DO_2SHIFT_INSERT(vslih, 2, DO_SHL, SHL_MASK)
95
+DO_2SHIFT_INSERT(vsliw, 4, DO_SHL, SHL_MASK)
55
+
96
+
56
/*
97
/*
57
* Physical entry point into the kernel.
98
* Long shifts taking half-sized inputs from top or bottom of the input
58
*
99
* vector and producing a double-width result. ESIZE, TYPE are for
100
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/translate-mve.c
103
+++ b/target/arm/translate-mve.c
104
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VSHRI_U, vshli_u, true)
105
DO_2SHIFT(VRSHRI_S, vrshli_s, true)
106
DO_2SHIFT(VRSHRI_U, vrshli_u, true)
107
108
+DO_2SHIFT(VSRI, vsri, false)
109
+DO_2SHIFT(VSLI, vsli, false)
110
+
111
#define DO_VSHLL(INSN, FN) \
112
static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
113
{ \
59
--
114
--
60
2.20.1
115
2.20.1
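Note on shift-and-insert: VSLI and VSRI shift each source element and write only the bits that the shift actually moved into place, leaving the remaining destination bits untouched. The sketch below shows this for a single byte. It is a simplified illustration under stated assumptions: the names sli8 and sri8 are hypothetical, predication is ignored, and the code works on one element rather than 64 bits at a time as the helper does. A left shift by 0 goes through the generic path here, which the helper's own comment says gives the right answer; a right shift by the full element size leaves the destination unchanged.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift-left-and-insert on one byte (hypothetical name).
 * Bits [shift, 8) come from m << shift; the low `shift` bits of d survive.
 */
static uint8_t sli8(uint8_t d, uint8_t m, unsigned shift)
{
    assert(shift < 8);
    uint8_t mask = (uint8_t)(0xffu << shift);
    return (uint8_t)(((m << shift) & mask) | (d & ~mask));
}

/*
 * Shift-right-and-insert on one byte (hypothetical name).
 * Bits [0, 8 - shift) come from m >> shift; the high `shift` bits of d survive.
 */
static uint8_t sri8(uint8_t d, uint8_t m, unsigned shift)
{
    assert(shift >= 1 && shift <= 8);
    if (shift == 8) {
        /* Nothing is inserted; the destination is left as it was. */
        return d;
    }
    uint8_t mask = (uint8_t)(0xffu >> shift);
    return (uint8_t)(((m >> shift) & mask) | (d & ~mask));
}
```

For example, sli8(0xff, 0x01, 4) keeps the low nibble of the destination and inserts 0x1 above it, while sri8(0xff, 0x80, 4) keeps the high nibble and inserts 0x8 below it. The masks correspond to SHL_MASK and SHR_MASK in the helper for an 8-bit element.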
61
116
62
117
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Implement the MVE shift-right-and-narrow insn VSHRN and VRSHRN.
2
2
3
The kernel sets btype for the signal handler as if for a call.
3
do_urshr() is borrowed from sve_helper.c.
4
4
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20201016184207.786698-2-richard.henderson@linaro.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210628135835.6690-12-peter.maydell@linaro.org
9
---
8
---
10
linux-user/aarch64/signal.c | 10 ++++++++--
9
target/arm/helper-mve.h | 10 ++++++++++
11
1 file changed, 8 insertions(+), 2 deletions(-)
10
target/arm/mve.decode | 11 +++++++++++
11
target/arm/mve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++
12
target/arm/translate-mve.c | 15 ++++++++++++++
13
4 files changed, 76 insertions(+)
12
14
13
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
15
--- a/linux-user/aarch64/signal.c
17
--- a/target/arm/helper-mve.h
16
+++ b/linux-user/aarch64/signal.c
18
+++ b/target/arm/helper-mve.h
17
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vsriw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
+ offsetof(struct target_rt_frame_record, tramp);
20
DEF_HELPER_FLAGS_4(mve_vslib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
}
21
DEF_HELPER_FLAGS_4(mve_vslih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
env->xregs[0] = usig;
22
DEF_HELPER_FLAGS_4(mve_vsliw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
- env->xregs[31] = frame_addr;
22
env->xregs[29] = frame_addr + fr_ofs;
23
- env->pc = ka->_sa_handler;
24
env->xregs[30] = return_addr;
25
+ env->xregs[31] = frame_addr;
26
+ env->pc = ka->_sa_handler;
27
+
23
+
28
+ /* Invoke the signal handler as if by indirect call. */
24
+DEF_HELPER_FLAGS_4(mve_vshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+ if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
25
+DEF_HELPER_FLAGS_4(mve_vshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+ env->btype = 2;
26
+DEF_HELPER_FLAGS_4(mve_vshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(mve_vshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+
29
+DEF_HELPER_FLAGS_4(mve_vrshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(mve_vrshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_4(mve_vrshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(mve_vrshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
36
+++ b/target/arm/mve.decode
37
@@ -XXX,XX +XXX,XX @@ VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_w
38
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
39
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
40
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
41
+
42
+# Narrowing shifts (which only support b and h sizes)
43
+VSHRNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
44
+VSHRNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
45
+VSHRNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
46
+VSHRNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
47
+
48
+VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
49
+VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
50
+VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
51
+VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/target/arm/mve_helper.c
55
+++ b/target/arm/mve_helper.c
56
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_INSERT(vsliw, 4, DO_SHL, SHL_MASK)
57
58
DO_VSHLL_ALL(vshllb, false)
59
DO_VSHLL_ALL(vshllt, true)
60
+
61
+/*
62
+ * Narrowing right shifts, taking a double sized input, shifting it
63
+ * and putting the result in either the top or bottom half of the output.
64
+ * ESIZE, TYPE are the output, and LESIZE, LTYPE the input.
65
+ */
66
+#define DO_VSHRN(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \
67
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
68
+ void *vm, uint32_t shift) \
69
+ { \
70
+ LTYPE *m = vm; \
71
+ TYPE *d = vd; \
72
+ uint16_t mask = mve_element_mask(env); \
73
+ unsigned le; \
74
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
75
+ TYPE r = FN(m[H##LESIZE(le)], shift); \
76
+ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
77
+ } \
78
+ mve_advance_vpt(env); \
31
+ }
79
+ }
32
+
80
+
33
if (info) {
81
+#define DO_VSHRN_ALL(OP, FN) \
34
tswap_siginfo(&frame->info, info);
82
+ DO_VSHRN(OP##bb, false, 1, uint8_t, 2, uint16_t, FN) \
35
env->xregs[1] = frame_addr + offsetof(struct target_rt_sigframe, info);
83
+ DO_VSHRN(OP##bh, false, 2, uint16_t, 4, uint32_t, FN) \
84
+ DO_VSHRN(OP##tb, true, 1, uint8_t, 2, uint16_t, FN) \
85
+ DO_VSHRN(OP##th, true, 2, uint16_t, 4, uint32_t, FN)
86
+
87
+static inline uint64_t do_urshr(uint64_t x, unsigned sh)
88
+{
89
+ if (likely(sh < 64)) {
90
+ return (x >> sh) + ((x >> (sh - 1)) & 1);
91
+ } else if (sh == 64) {
92
+ return x >> 63;
93
+ } else {
94
+ return 0;
95
+ }
96
+}
97
+
98
+DO_VSHRN_ALL(vshrn, DO_SHR)
99
+DO_VSHRN_ALL(vrshrn, do_urshr)
100
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/translate-mve.c
103
+++ b/target/arm/translate-mve.c
104
@@ -XXX,XX +XXX,XX @@ DO_VSHLL(VSHLL_BS, vshllbs)
105
DO_VSHLL(VSHLL_BU, vshllbu)
106
DO_VSHLL(VSHLL_TS, vshllts)
107
DO_VSHLL(VSHLL_TU, vshlltu)
108
+
109
+#define DO_2SHIFT_N(INSN, FN) \
110
+ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
111
+ { \
112
+ static MVEGenTwoOpShiftFn * const fns[] = { \
113
+ gen_helper_mve_##FN##b, \
114
+ gen_helper_mve_##FN##h, \
115
+ }; \
116
+ return do_2shift(s, a, fns[a->size], false); \
117
+ }
118
+
119
+DO_2SHIFT_N(VSHRNB, vshrnb)
120
+DO_2SHIFT_N(VSHRNT, vshrnt)
121
+DO_2SHIFT_N(VRSHRNB, vrshrnb)
122
+DO_2SHIFT_N(VRSHRNT, vrshrnt)
36
--
123
--
37
2.20.1
124
2.20.1
38
125
39
126
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Implement the MVE saturating shift-right-and-narrow insns
2
2
VQSHRN, VQSHRUN, VQRSHRN and VQRSHRUN.
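
As a rough illustration of the saturating forms, the sketch below shifts one 16-bit element, clamps it to the 8-bit output range, and records whether clamping happened. The function names are hypothetical, predication and the rounding variants are left out, and the arithmetic right shift of a negative value relies on the usual compiler behaviour rather than on anything the C standard guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Clamp val to [min, max]. *satp is set when clamping occurs and is
 * never cleared, so a caller can accumulate a sticky saturation flag.
 */
static int32_t sat_clamp(int64_t val, int64_t min, int64_t max, bool *satp)
{
    assert(satp);
    if (val > max) {
        *satp = true;
        return (int32_t)max;
    } else if (val < min) {
        *satp = true;
        return (int32_t)min;
    }
    return (int32_t)val;
}

/* Signed input, signed output: clamp to INT8_MIN..INT8_MAX. */
static int8_t qshrn_s16(int16_t n, unsigned sh, bool *satp)
{
    return (int8_t)sat_clamp((int64_t)n >> sh, INT8_MIN, INT8_MAX, satp);
}

/* Signed input, unsigned output (the "UN" forms): negatives clamp to 0. */
static uint8_t qshrun_s16(int16_t n, unsigned sh, bool *satp)
{
    return (uint8_t)sat_clamp((int64_t)n >> sh, 0, UINT8_MAX, satp);
}
```

For example, 1000 shifted right by 4 fits in 8 bits (62), whereas 1000 shifted right by 1 is 500 and saturates to 127 with the flag set.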
3
On ARM, the Top Byte Ignore feature means that only 56 bits of
3
4
the address are significant in the virtual address. We are
4
do_srshr() is borrowed from sve_helper.c.
5
required to give the entire 64-bit address to FAR_ELx on fault,
5
6
which means that we do not "clean" the top byte early in TCG.
7
8
This new interface allows us to flush all 256 possible aliases
9
for a given page, currently missed by tlb_flush_page*.
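
The idea of matching a page under a "significant bits" mask can be sketched in isolation as follows. This is only an illustration under stated assumptions: PAGE_BITS is fixed at 12, the names low_bits_mask and flush_matches are hypothetical, and none of the real TLB entry layout or locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12                              /* assumed 4 KiB pages */
#define PAGE_MASK (~((UINT64_C(1) << PAGE_BITS) - 1))

/* Mask covering the low `bits` address bits; bits must be 1..64. */
static uint64_t low_bits_mask(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * True if a cached entry for entry_addr should be dropped when the
 * guest invalidates addr and only the low `bits` bits are significant.
 * Bits above `bits` and bits inside the page offset are ignored.
 */
static bool flush_matches(uint64_t entry_addr, uint64_t addr, unsigned bits)
{
    uint64_t mask = low_bits_mask(bits) & PAGE_MASK;
    return (entry_addr & mask) == (addr & mask);
}
```

With bits set to 56, an entry cached under a tagged address such as 0xab00000012345000 matches a flush of 0x0000000012345000; with all 64 bits significant it does not.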
10
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 20201016210754.818257-2-richard.henderson@linaro.org
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-13-peter.maydell@linaro.org
16
---
9
---
17
include/exec/exec-all.h | 36 ++++++
10
target/arm/helper-mve.h | 30 +++++++++++
18
accel/tcg/cputlb.c | 275 ++++++++++++++++++++++++++++++++++++++--
11
target/arm/mve.decode | 28 ++++++++++
19
2 files changed, 302 insertions(+), 9 deletions(-)
12
target/arm/mve_helper.c | 104 +++++++++++++++++++++++++++++++++++++
20
13
target/arm/translate-mve.c | 12 +++++
21
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
14
4 files changed, 174 insertions(+)
22
index XXXXXXX..XXXXXXX 100644
15
23
--- a/include/exec/exec-all.h
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
24
+++ b/include/exec/exec-all.h
17
index XXXXXXX..XXXXXXX 100644
25
@@ -XXX,XX +XXX,XX @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
18
--- a/target/arm/helper-mve.h
26
* depend on when the guest's translation ends the TB.
19
+++ b/target/arm/helper-mve.h
27
*/
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
21
DEF_HELPER_FLAGS_4(mve_vrshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+
22
DEF_HELPER_FLAGS_4(mve_vrshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
+/**
23
DEF_HELPER_FLAGS_4(mve_vrshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
+ * tlb_flush_page_bits_by_mmuidx
24
+
32
+ * @cpu: CPU whose TLB should be flushed
25
+DEF_HELPER_FLAGS_4(mve_vqshrnb_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
+ * @addr: virtual address of page to be flushed
26
+DEF_HELPER_FLAGS_4(mve_vqshrnb_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
+ * @idxmap: bitmap of mmu indexes to flush
27
+DEF_HELPER_FLAGS_4(mve_vqshrnt_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
+ * @bits: number of significant bits in address
28
+DEF_HELPER_FLAGS_4(mve_vqshrnt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
+ *
29
+
37
+ * Similar to tlb_flush_page_mask, but with a bitmap of indexes.
30
+DEF_HELPER_FLAGS_4(mve_vqshrnb_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
+ */
31
+DEF_HELPER_FLAGS_4(mve_vqshrnb_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+void tlb_flush_page_bits_by_mmuidx(CPUState *cpu, target_ulong addr,
32
+DEF_HELPER_FLAGS_4(mve_vqshrnt_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+ uint16_t idxmap, unsigned bits);
33
+DEF_HELPER_FLAGS_4(mve_vqshrnt_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
+
34
+
42
+/* Similarly, with broadcast and syncing. */
35
+DEF_HELPER_FLAGS_4(mve_vqshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
+void tlb_flush_page_bits_by_mmuidx_all_cpus(CPUState *cpu, target_ulong addr,
36
+DEF_HELPER_FLAGS_4(mve_vqshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
44
+ uint16_t idxmap, unsigned bits);
37
+DEF_HELPER_FLAGS_4(mve_vqshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
45
+void tlb_flush_page_bits_by_mmuidx_all_cpus_synced
38
+DEF_HELPER_FLAGS_4(mve_vqshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
46
+ (CPUState *cpu, target_ulong addr, uint16_t idxmap, unsigned bits);
39
+
47
+
40
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
/**
41
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
49
* tlb_set_page_with_attrs:
42
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
50
* @cpu: CPU to add this TLB entry for
43
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
@@ -XXX,XX +XXX,XX @@ static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
44
+
52
uint16_t idxmap)
45
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
53
{
46
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
54
}
47
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
55
+static inline void tlb_flush_page_bits_by_mmuidx(CPUState *cpu,
48
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
56
+ target_ulong addr,
49
+
57
+ uint16_t idxmap,
50
+DEF_HELPER_FLAGS_4(mve_vqrshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
58
+ unsigned bits)
51
+DEF_HELPER_FLAGS_4(mve_vqrshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
59
+{
52
+DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
60
+}
53
+DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
61
+static inline void tlb_flush_page_bits_by_mmuidx_all_cpus(CPUState *cpu,
54
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
62
+ target_ulong addr,
55
index XXXXXXX..XXXXXXX 100644
63
+ uint16_t idxmap,
56
--- a/target/arm/mve.decode
64
+ unsigned bits)
57
+++ b/target/arm/mve.decode
65
+{
58
@@ -XXX,XX +XXX,XX @@ VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
66
+}
59
VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
67
+static inline void
60
VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
68
+tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *cpu, target_ulong addr,
61
VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
69
+ uint16_t idxmap, unsigned bits)
62
+
70
+{
63
+VQSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_b
71
+}
64
+VQSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_h
72
#endif
65
+VQSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_b
73
/**
66
+VQSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_h
74
* probe_access:
67
+VQSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_b
75
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
68
+VQSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_h
76
index XXXXXXX..XXXXXXX 100644
69
+VQSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_b
77
--- a/accel/tcg/cputlb.c
70
+VQSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_h
78
+++ b/accel/tcg/cputlb.c
71
+
79
@@ -XXX,XX +XXX,XX @@ void tlb_flush_all_cpus_synced(CPUState *src_cpu)
72
+VQSHRUNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
80
tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, ALL_MMUIDX_BITS);
73
+VQSHRUNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
81
}
74
+VQSHRUNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
82
75
+VQSHRUNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
83
+static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
76
+
84
+ target_ulong page, target_ulong mask)
77
+VQRSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_b
85
+{
78
+VQRSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_h
86
+ page &= mask;
79
+VQRSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_b
87
+ mask &= TARGET_PAGE_MASK | TLB_INVALID_MASK;
80
+VQRSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_h
88
+
81
+VQRSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_b
89
+ return (page == (tlb_entry->addr_read & mask) ||
82
+VQRSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_h
90
+ page == (tlb_addr_write(tlb_entry) & mask) ||
83
+VQRSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_b
91
+ page == (tlb_entry->addr_code & mask));
84
+VQRSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_h
92
+}
85
+
93
+
86
+VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
94
static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry,
87
+VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
95
target_ulong page)
88
+VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
96
{
89
+VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
97
- return tlb_hit_page(tlb_entry->addr_read, page) ||
90
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
98
- tlb_hit_page(tlb_addr_write(tlb_entry), page) ||
91
index XXXXXXX..XXXXXXX 100644
99
- tlb_hit_page(tlb_entry->addr_code, page);
92
--- a/target/arm/mve_helper.c
100
+ return tlb_hit_page_mask_anyprot(tlb_entry, page, -1);
93
+++ b/target/arm/mve_helper.c
101
}
94
@@ -XXX,XX +XXX,XX @@ static inline uint64_t do_urshr(uint64_t x, unsigned sh)
102
103
/**
104
@@ -XXX,XX +XXX,XX @@ static inline bool tlb_entry_is_empty(const CPUTLBEntry *te)
105
}
106
107
/* Called with tlb_c.lock held */
108
-static inline bool tlb_flush_entry_locked(CPUTLBEntry *tlb_entry,
109
- target_ulong page)
110
+static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
111
+ target_ulong page,
112
+ target_ulong mask)
113
{
114
- if (tlb_hit_page_anyprot(tlb_entry, page)) {
115
+ if (tlb_hit_page_mask_anyprot(tlb_entry, page, mask)) {
116
memset(tlb_entry, -1, sizeof(*tlb_entry));
117
return true;
118
}
119
return false;
120
}
121
122
+static inline bool tlb_flush_entry_locked(CPUTLBEntry *tlb_entry,
123
+ target_ulong page)
124
+{
125
+ return tlb_flush_entry_mask_locked(tlb_entry, page, -1);
126
+}
127
+
128
/* Called with tlb_c.lock held */
129
-static inline void tlb_flush_vtlb_page_locked(CPUArchState *env, int mmu_idx,
130
- target_ulong page)
131
+static void tlb_flush_vtlb_page_mask_locked(CPUArchState *env, int mmu_idx,
132
+ target_ulong page,
133
+ target_ulong mask)
134
{
135
CPUTLBDesc *d = &env_tlb(env)->d[mmu_idx];
136
int k;
137
138
assert_cpu_is_self(env_cpu(env));
139
for (k = 0; k < CPU_VTLB_SIZE; k++) {
140
- if (tlb_flush_entry_locked(&d->vtable[k], page)) {
141
+ if (tlb_flush_entry_mask_locked(&d->vtable[k], page, mask)) {
142
tlb_n_used_entries_dec(env, mmu_idx);
143
}
144
}
95
}
145
}
96
}
146
97
147
+static inline void tlb_flush_vtlb_page_locked(CPUArchState *env, int mmu_idx,
98
+static inline int64_t do_srshr(int64_t x, unsigned sh)
148
+ target_ulong page)
149
+{
99
+{
150
+ tlb_flush_vtlb_page_mask_locked(env, mmu_idx, page, -1);
100
+ if (likely(sh < 64)) {
151
+}
101
+ return (x >> sh) + ((x >> (sh - 1)) & 1);
152
+
153
static void tlb_flush_page_locked(CPUArchState *env, int midx,
154
target_ulong page)
155
{
156
@@ -XXX,XX +XXX,XX @@ void tlb_flush_page_all_cpus_synced(CPUState *src, target_ulong addr)
157
tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
158
}
159
160
+static void tlb_flush_page_bits_locked(CPUArchState *env, int midx,
161
+ target_ulong page, unsigned bits)
162
+{
163
+ CPUTLBDesc *d = &env_tlb(env)->d[midx];
164
+ CPUTLBDescFast *f = &env_tlb(env)->f[midx];
165
+ target_ulong mask = MAKE_64BIT_MASK(0, bits);
166
+
167
+ /*
168
+ * If @bits is smaller than the tlb size, there may be multiple entries
169
+ * within the TLB; otherwise all addresses that match under @mask hit
170
+ * the same TLB entry.
171
+ *
172
+ * TODO: Perhaps allow bits to be a few bits less than the size.
173
+ * For now, just flush the entire TLB.
174
+ */
175
+ if (mask < f->mask) {
176
+ tlb_debug("forcing full flush midx %d ("
177
+ TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
178
+ midx, page, mask);
179
+ tlb_flush_one_mmuidx_locked(env, midx, get_clock_realtime());
180
+ return;
181
+ }
182
+
183
+ /* Check if we need to flush due to large pages. */
184
+ if ((page & d->large_page_mask) == d->large_page_addr) {
185
+ tlb_debug("forcing full flush midx %d ("
186
+ TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
187
+ midx, d->large_page_addr, d->large_page_mask);
188
+ tlb_flush_one_mmuidx_locked(env, midx, get_clock_realtime());
189
+ return;
190
+ }
191
+
192
+ if (tlb_flush_entry_mask_locked(tlb_entry(env, midx, page), page, mask)) {
193
+ tlb_n_used_entries_dec(env, midx);
194
+ }
195
+ tlb_flush_vtlb_page_mask_locked(env, midx, page, mask);
196
+}
197
+
198
+typedef struct {
199
+ target_ulong addr;
200
+ uint16_t idxmap;
201
+ uint16_t bits;
202
+} TLBFlushPageBitsByMMUIdxData;
203
+
204
+static void
205
+tlb_flush_page_bits_by_mmuidx_async_0(CPUState *cpu,
206
+ TLBFlushPageBitsByMMUIdxData d)
207
+{
208
+ CPUArchState *env = cpu->env_ptr;
209
+ int mmu_idx;
210
+
211
+ assert_cpu_is_self(cpu);
212
+
213
+ tlb_debug("page addr:" TARGET_FMT_lx "/%u mmu_map:0x%x\n",
214
+ d.addr, d.bits, d.idxmap);
215
+
216
+ qemu_spin_lock(&env_tlb(env)->c.lock);
217
+ for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
218
+ if ((d.idxmap >> mmu_idx) & 1) {
219
+ tlb_flush_page_bits_locked(env, mmu_idx, d.addr, d.bits);
220
+ }
221
+ }
222
+ qemu_spin_unlock(&env_tlb(env)->c.lock);
223
+
224
+ tb_flush_jmp_cache(cpu, d.addr);
225
+}
226
+
227
+static bool encode_pbm_to_runon(run_on_cpu_data *out,
228
+ TLBFlushPageBitsByMMUIdxData d)
229
+{
230
+ /* We need 6 bits to hold @bits up to 63. */
231
+ if (d.idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6)) {
232
+ *out = RUN_ON_CPU_TARGET_PTR(d.addr | (d.idxmap << 6) | d.bits);
233
+ return true;
234
+ }
235
+ return false;
236
+}
237
+
238
+static TLBFlushPageBitsByMMUIdxData
239
+decode_runon_to_pbm(run_on_cpu_data data)
240
+{
241
+ target_ulong addr_map_bits = (target_ulong) data.target_ptr;
242
+ return (TLBFlushPageBitsByMMUIdxData){
243
+ .addr = addr_map_bits & TARGET_PAGE_MASK,
244
+ .idxmap = (addr_map_bits & ~TARGET_PAGE_MASK) >> 6,
245
+ .bits = addr_map_bits & 0x3f
246
+ };
247
+}
248
+
249
+static void tlb_flush_page_bits_by_mmuidx_async_1(CPUState *cpu,
250
+ run_on_cpu_data runon)
251
+{
252
+ tlb_flush_page_bits_by_mmuidx_async_0(cpu, decode_runon_to_pbm(runon));
253
+}
254
+
255
+static void tlb_flush_page_bits_by_mmuidx_async_2(CPUState *cpu,
256
+ run_on_cpu_data data)
257
+{
258
+ TLBFlushPageBitsByMMUIdxData *d = data.host_ptr;
259
+ tlb_flush_page_bits_by_mmuidx_async_0(cpu, *d);
260
+ g_free(d);
261
+}
262
+
263
+void tlb_flush_page_bits_by_mmuidx(CPUState *cpu, target_ulong addr,
264
+ uint16_t idxmap, unsigned bits)
265
+{
266
+ TLBFlushPageBitsByMMUIdxData d;
267
+ run_on_cpu_data runon;
268
+
269
+ /* If all bits are significant, this devolves to tlb_flush_page. */
270
+ if (bits >= TARGET_LONG_BITS) {
271
+ tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
272
+ return;
273
+ }
274
+ /* If no page bits are significant, this devolves to tlb_flush. */
275
+ if (bits < TARGET_PAGE_BITS) {
276
+ tlb_flush_by_mmuidx(cpu, idxmap);
277
+ return;
278
+ }
279
+
280
+ /* This should already be page aligned */
281
+ d.addr = addr & TARGET_PAGE_MASK;
282
+ d.idxmap = idxmap;
283
+ d.bits = bits;
284
+
285
+ if (qemu_cpu_is_self(cpu)) {
286
+ tlb_flush_page_bits_by_mmuidx_async_0(cpu, d);
287
+ } else if (encode_pbm_to_runon(&runon, d)) {
288
+ async_run_on_cpu(cpu, tlb_flush_page_bits_by_mmuidx_async_1, runon);
289
+ } else {
102
+ } else {
290
+ TLBFlushPageBitsByMMUIdxData *p
103
+ /* Rounding the sign bit always produces 0. */
291
+ = g_new(TLBFlushPageBitsByMMUIdxData, 1);
104
+ return 0;
292
+
293
+ /* Otherwise allocate a structure, freed by the worker. */
294
+ *p = d;
295
+ async_run_on_cpu(cpu, tlb_flush_page_bits_by_mmuidx_async_2,
296
+ RUN_ON_CPU_HOST_PTR(p));
297
+ }
105
+ }
298
+}
106
+}
299
+
107
+
300
+void tlb_flush_page_bits_by_mmuidx_all_cpus(CPUState *src_cpu,
108
DO_VSHRN_ALL(vshrn, DO_SHR)
301
+ target_ulong addr,
109
DO_VSHRN_ALL(vrshrn, do_urshr)
302
+ uint16_t idxmap,
110
+
303
+ unsigned bits)
111
+static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max,
112
+ bool *satp)
304
+{
113
+{
305
+ TLBFlushPageBitsByMMUIdxData d;
114
+ if (val > max) {
306
+ run_on_cpu_data runon;
115
+ *satp = true;
307
+
116
+ return max;
308
+ /* If all bits are significant, this devolves to tlb_flush_page. */
117
+ } else if (val < min) {
309
+ if (bits >= TARGET_LONG_BITS) {
118
+ *satp = true;
310
+ tlb_flush_page_by_mmuidx_all_cpus(src_cpu, addr, idxmap);
119
+ return min;
311
+ return;
312
+ }
313
+ /* If no page bits are significant, this devolves to tlb_flush. */
314
+ if (bits < TARGET_PAGE_BITS) {
315
+ tlb_flush_by_mmuidx_all_cpus(src_cpu, idxmap);
316
+ return;
317
+ }
318
+
319
+ /* This should already be page aligned */
320
+ d.addr = addr & TARGET_PAGE_MASK;
321
+ d.idxmap = idxmap;
322
+ d.bits = bits;
323
+
324
+ if (encode_pbm_to_runon(&runon, d)) {
325
+ flush_all_helper(src_cpu, tlb_flush_page_bits_by_mmuidx_async_1, runon);
326
+ } else {
120
+ } else {
327
+ CPUState *dst_cpu;
121
+ return val;
328
+ TLBFlushPageBitsByMMUIdxData *p;
329
+
330
+ /* Allocate a separate data block for each destination cpu. */
331
+ CPU_FOREACH(dst_cpu) {
332
+ if (dst_cpu != src_cpu) {
333
+ p = g_new(TLBFlushPageBitsByMMUIdxData, 1);
334
+ *p = d;
335
+ async_run_on_cpu(dst_cpu,
336
+ tlb_flush_page_bits_by_mmuidx_async_2,
337
+ RUN_ON_CPU_HOST_PTR(p));
338
+ }
339
+ }
340
+ }
341
+
342
+ tlb_flush_page_bits_by_mmuidx_async_0(src_cpu, d);
343
+}
344
+
345
+void tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
346
+ target_ulong addr,
347
+ uint16_t idxmap,
348
+ unsigned bits)
349
+{
350
+ TLBFlushPageBitsByMMUIdxData d;
351
+ run_on_cpu_data runon;
352
+
353
+ /* If all bits are significant, this devolves to tlb_flush_page. */
354
+ if (bits >= TARGET_LONG_BITS) {
355
+ tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
356
+ return;
357
+ }
358
+ /* If no page bits are significant, this devolves to tlb_flush. */
359
+ if (bits < TARGET_PAGE_BITS) {
360
+ tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
361
+ return;
362
+ }
363
+
364
+ /* This should already be page aligned */
365
+ d.addr = addr & TARGET_PAGE_MASK;
366
+ d.idxmap = idxmap;
367
+ d.bits = bits;
368
+
369
+ if (encode_pbm_to_runon(&runon, d)) {
370
+ flush_all_helper(src_cpu, tlb_flush_page_bits_by_mmuidx_async_1, runon);
371
+ async_safe_run_on_cpu(src_cpu, tlb_flush_page_bits_by_mmuidx_async_1,
372
+ runon);
373
+ } else {
374
+ CPUState *dst_cpu;
375
+ TLBFlushPageBitsByMMUIdxData *p;
376
+
377
+ /* Allocate a separate data block for each destination cpu. */
378
+ CPU_FOREACH(dst_cpu) {
379
+ if (dst_cpu != src_cpu) {
380
+ p = g_new(TLBFlushPageBitsByMMUIdxData, 1);
381
+ *p = d;
382
+ async_run_on_cpu(dst_cpu, tlb_flush_page_bits_by_mmuidx_async_2,
383
+ RUN_ON_CPU_HOST_PTR(p));
384
+ }
385
+ }
386
+
387
+ p = g_new(TLBFlushPageBitsByMMUIdxData, 1);
388
+ *p = d;
389
+ async_safe_run_on_cpu(src_cpu, tlb_flush_page_bits_by_mmuidx_async_2,
390
+ RUN_ON_CPU_HOST_PTR(p));
391
+ }
122
+ }
392
+}
123
+}
393
+
124
+
394
/* update the TLBs so that writes to code in the virtual page 'addr'
125
+/* Saturating narrowing right shifts */
395
can be detected */
126
+#define DO_VSHRN_SAT(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \
396
void tlb_protect_code(ram_addr_t ram_addr)
127
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
128
+ void *vm, uint32_t shift) \
129
+ { \
130
+ LTYPE *m = vm; \
131
+ TYPE *d = vd; \
132
+ uint16_t mask = mve_element_mask(env); \
133
+ bool qc = false; \
134
+ unsigned le; \
135
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
136
+ bool sat = false; \
137
+ TYPE r = FN(m[H##LESIZE(le)], shift, &sat); \
138
+ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
139
+ qc |= sat && (mask & 1 << (TOP * ESIZE)); \
140
+ } \
141
+ if (qc) { \
142
+ env->vfp.qc[0] = qc; \
143
+ } \
144
+ mve_advance_vpt(env); \
145
+ }
146
+
147
+#define DO_VSHRN_SAT_UB(BOP, TOP, FN) \
148
+ DO_VSHRN_SAT(BOP, false, 1, uint8_t, 2, uint16_t, FN) \
149
+ DO_VSHRN_SAT(TOP, true, 1, uint8_t, 2, uint16_t, FN)
150
+
151
+#define DO_VSHRN_SAT_UH(BOP, TOP, FN) \
152
+ DO_VSHRN_SAT(BOP, false, 2, uint16_t, 4, uint32_t, FN) \
153
+ DO_VSHRN_SAT(TOP, true, 2, uint16_t, 4, uint32_t, FN)
154
+
155
+#define DO_VSHRN_SAT_SB(BOP, TOP, FN) \
156
+ DO_VSHRN_SAT(BOP, false, 1, int8_t, 2, int16_t, FN) \
157
+ DO_VSHRN_SAT(TOP, true, 1, int8_t, 2, int16_t, FN)
158
+
159
+#define DO_VSHRN_SAT_SH(BOP, TOP, FN) \
160
+ DO_VSHRN_SAT(BOP, false, 2, int16_t, 4, int32_t, FN) \
161
+ DO_VSHRN_SAT(TOP, true, 2, int16_t, 4, int32_t, FN)
162
+
163
+#define DO_SHRN_SB(N, M, SATP) \
164
+ do_sat_bhs((int64_t)(N) >> (M), INT8_MIN, INT8_MAX, SATP)
165
+#define DO_SHRN_UB(N, M, SATP) \
166
+ do_sat_bhs((uint64_t)(N) >> (M), 0, UINT8_MAX, SATP)
167
+#define DO_SHRUN_B(N, M, SATP) \
168
+ do_sat_bhs((int64_t)(N) >> (M), 0, UINT8_MAX, SATP)
169
+
170
+#define DO_SHRN_SH(N, M, SATP) \
171
+ do_sat_bhs((int64_t)(N) >> (M), INT16_MIN, INT16_MAX, SATP)
172
+#define DO_SHRN_UH(N, M, SATP) \
173
+ do_sat_bhs((uint64_t)(N) >> (M), 0, UINT16_MAX, SATP)
174
+#define DO_SHRUN_H(N, M, SATP) \
175
+ do_sat_bhs((int64_t)(N) >> (M), 0, UINT16_MAX, SATP)
176
+
177
+#define DO_RSHRN_SB(N, M, SATP) \
178
+ do_sat_bhs(do_srshr(N, M), INT8_MIN, INT8_MAX, SATP)
179
+#define DO_RSHRN_UB(N, M, SATP) \
180
+ do_sat_bhs(do_urshr(N, M), 0, UINT8_MAX, SATP)
181
+#define DO_RSHRUN_B(N, M, SATP) \
182
+ do_sat_bhs(do_srshr(N, M), 0, UINT8_MAX, SATP)
183
+
184
+#define DO_RSHRN_SH(N, M, SATP) \
185
+ do_sat_bhs(do_srshr(N, M), INT16_MIN, INT16_MAX, SATP)
186
+#define DO_RSHRN_UH(N, M, SATP) \
187
+ do_sat_bhs(do_urshr(N, M), 0, UINT16_MAX, SATP)
188
+#define DO_RSHRUN_H(N, M, SATP) \
189
+ do_sat_bhs(do_srshr(N, M), 0, UINT16_MAX, SATP)
190
+
191
+DO_VSHRN_SAT_SB(vqshrnb_sb, vqshrnt_sb, DO_SHRN_SB)
192
+DO_VSHRN_SAT_SH(vqshrnb_sh, vqshrnt_sh, DO_SHRN_SH)
193
+DO_VSHRN_SAT_UB(vqshrnb_ub, vqshrnt_ub, DO_SHRN_UB)
194
+DO_VSHRN_SAT_UH(vqshrnb_uh, vqshrnt_uh, DO_SHRN_UH)
195
+DO_VSHRN_SAT_SB(vqshrunbb, vqshruntb, DO_SHRUN_B)
196
+DO_VSHRN_SAT_SH(vqshrunbh, vqshrunth, DO_SHRUN_H)
197
+
198
+DO_VSHRN_SAT_SB(vqrshrnb_sb, vqrshrnt_sb, DO_RSHRN_SB)
199
+DO_VSHRN_SAT_SH(vqrshrnb_sh, vqrshrnt_sh, DO_RSHRN_SH)
200
+DO_VSHRN_SAT_UB(vqrshrnb_ub, vqrshrnt_ub, DO_RSHRN_UB)
201
+DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH)
202
+DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B)
203
+DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H)
204
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
205
index XXXXXXX..XXXXXXX 100644
206
--- a/target/arm/translate-mve.c
207
+++ b/target/arm/translate-mve.c
208
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_N(VSHRNB, vshrnb)
209
DO_2SHIFT_N(VSHRNT, vshrnt)
210
DO_2SHIFT_N(VRSHRNB, vrshrnb)
211
DO_2SHIFT_N(VRSHRNT, vrshrnt)
212
+DO_2SHIFT_N(VQSHRNB_S, vqshrnb_s)
213
+DO_2SHIFT_N(VQSHRNT_S, vqshrnt_s)
214
+DO_2SHIFT_N(VQSHRNB_U, vqshrnb_u)
215
+DO_2SHIFT_N(VQSHRNT_U, vqshrnt_u)
216
+DO_2SHIFT_N(VQSHRUNB, vqshrunb)
217
+DO_2SHIFT_N(VQSHRUNT, vqshrunt)
218
+DO_2SHIFT_N(VQRSHRNB_S, vqrshrnb_s)
219
+DO_2SHIFT_N(VQRSHRNT_S, vqrshrnt_s)
220
+DO_2SHIFT_N(VQRSHRNB_U, vqrshrnb_u)
221
+DO_2SHIFT_N(VQRSHRNT_U, vqrshrnt_u)
222
+DO_2SHIFT_N(VQRSHRUNB, vqrshrunb)
223
+DO_2SHIFT_N(VQRSHRUNT, vqrshrunt)
397
--
224
--
398
2.20.1
225
2.20.1
399
226
400
227
1
For AArch32, unlike the VCVT of integer to float, which honours the
1
Implement the MVE VSHLC insn, which performs a shift left of the
2
rounding mode specified by the FPSCR, VCVT of fixed-point to float is
2
entire vector with carry in bits provided from a general purpose
3
always round-to-nearest. (AArch64 fixed-point-to-float conversions
3
register and carry out bits written back to that register.
4
always honour the FPCR rounding mode.)
5
6
Implement this by providing _round_to_nearest versions of the
7
relevant helpers which set the rounding mode temporarily when making
8
the call to the underlying softfloat function.
9
10
We only need to change the VFP VCVT instructions, because the
11
standard-FPSCR value used by the Neon VCVT is always set to
12
round-to-nearest, so we don't need to do the extra work of saving
13
and restoring the rounding mode.
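
The "set the rounding mode temporarily" pattern can be sketched with a toy model. Everything here is hypothetical: fp_status and round_mode stand in for a softfloat-style status block, and toy_convert stands in for the underlying conversion routine (it rounds x / 2^shift to an integer purely so that the effect of the rounding mode is visible).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a softfloat-style status block. */
typedef enum { ROUND_NEAREST_EVEN, ROUND_TO_ZERO } round_mode;
typedef struct { round_mode rounding; } fp_status;

/* Toy conversion that honours whatever rounding mode is in *st. */
static uint32_t toy_convert(uint32_t x, unsigned shift, fp_status *st)
{
    assert(shift < 32);
    uint32_t q = x >> shift;
    if (shift && st->rounding == ROUND_NEAREST_EVEN) {
        uint32_t rem = x & ((1u << shift) - 1);
        uint32_t half = 1u << (shift - 1);
        /* Round up above the halfway point, or on a tie if q is odd. */
        if (rem > half || (rem == half && (q & 1))) {
            q++;
        }
    }
    return q;
}

/*
 * Wrapper that always rounds to nearest: save the caller's mode,
 * override it for the duration of the call, then restore it.
 */
static uint32_t toy_convert_round_to_nearest(uint32_t x, unsigned shift,
                                             fp_status *st)
{
    round_mode old = st->rounding;
    uint32_t ret;

    st->rounding = ROUND_NEAREST_EVEN;
    ret = toy_convert(x, shift, st);
    st->rounding = old;
    return ret;
}
```

A caller whose status block is set to round-to-zero still gets a round-to-nearest result from the wrapper, and finds its own rounding mode unchanged afterwards.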
14
4
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
17
Message-id: 20201013103532.13391-1-peter.maydell@linaro.org
7
Message-id: 20210628135835.6690-14-peter.maydell@linaro.org
18
---
8
---
19
target/arm/helper.h | 13 +++++++++++++
9
target/arm/helper-mve.h | 2 ++
20
target/arm/vfp_helper.c | 23 ++++++++++++++++++++++-
10
target/arm/mve.decode | 2 ++
21
target/arm/translate-vfp.c.inc | 24 ++++++++++++------------
11
target/arm/mve_helper.c | 38 ++++++++++++++++++++++++++++++++++++++
22
3 files changed, 47 insertions(+), 13 deletions(-)
12
target/arm/translate-mve.c | 30 ++++++++++++++++++++++++++++++
13
4 files changed, 72 insertions(+)
23
14
24
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
25
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/helper.h
17
--- a/target/arm/helper-mve.h
27
+++ b/target/arm/helper.h
18
+++ b/target/arm/helper-mve.h
28
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
20
DEF_HELPER_FLAGS_4(mve_vqrshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
21
DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
22
DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+DEF_HELPER_3(vfp_shtos_round_to_nearest, f32, i32, i32, ptr)
33
+DEF_HELPER_3(vfp_sltos_round_to_nearest, f32, i32, i32, ptr)
34
+DEF_HELPER_3(vfp_uhtos_round_to_nearest, f32, i32, i32, ptr)
35
+DEF_HELPER_3(vfp_ultos_round_to_nearest, f32, i32, i32, ptr)
36
+DEF_HELPER_3(vfp_shtod_round_to_nearest, f64, i64, i32, ptr)
37
+DEF_HELPER_3(vfp_sltod_round_to_nearest, f64, i64, i32, ptr)
38
+DEF_HELPER_3(vfp_uhtod_round_to_nearest, f64, i64, i32, ptr)
39
+DEF_HELPER_3(vfp_ultod_round_to_nearest, f64, i64, i32, ptr)
40
+DEF_HELPER_3(vfp_shtoh_round_to_nearest, f16, i32, i32, ptr)
41
+DEF_HELPER_3(vfp_uhtoh_round_to_nearest, f16, i32, i32, ptr)
42
+DEF_HELPER_3(vfp_sltoh_round_to_nearest, f16, i32, i32, ptr)
43
+DEF_HELPER_3(vfp_ultoh_round_to_nearest, f16, i32, i32, ptr)
44
+
23
+
45
DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
24
+DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
46
25
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
47
DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f32, TCG_CALL_NO_RWG, f32, f16, ptr, i32)
48
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
49
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/vfp_helper.c
27
--- a/target/arm/mve.decode
51
+++ b/target/arm/vfp_helper.c
28
+++ b/target/arm/mve.decode
52
@@ -XXX,XX +XXX,XX @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
29
@@ -XXX,XX +XXX,XX @@ VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
53
return float64_to_float32(x, &env->vfp.fp_status);
30
VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
54
}
31
VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
55
32
VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
56
-/* VFP3 fixed point conversion. */
33
+
57
+/*
34
+VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
58
+ * VFP3 fixed point conversion. The AArch32 versions of fix-to-float
35
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
59
+ * must always round-to-nearest; the AArch64 ones honour the FPSCR
36
index XXXXXXX..XXXXXXX 100644
60
+ * rounding mode. (For AArch32 Neon the standard-FPSCR is set to
37
--- a/target/arm/mve_helper.c
61
+ * round-to-nearest so either helper will work.) AArch32 float-to-fix
38
+++ b/target/arm/mve_helper.c
62
+ * must round-to-zero.
39
@@ -XXX,XX +XXX,XX @@ DO_VSHRN_SAT_UB(vqrshrnb_ub, vqrshrnt_ub, DO_RSHRN_UB)
63
+ */
40
DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH)
64
#define VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
41
DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B)
65
ftype HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
42
DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H)
66
void *fpstp) \
43
+
67
{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); }
44
+uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
68
45
+ uint32_t shift)
69
+#define VFP_CONV_FIX_FLOAT_ROUND(name, p, fsz, ftype, isz, itype) \
46
+{
70
+ ftype HELPER(vfp_##name##to##p##_round_to_nearest)(uint##isz##_t x, \
47
+ uint32_t *d = vd;
71
+ uint32_t shift, \
48
+ uint16_t mask = mve_element_mask(env);
72
+ void *fpstp) \
49
+ unsigned e;
73
+ { \
50
+ uint32_t r;
74
+ ftype ret; \
51
+
75
+ float_status *fpst = fpstp; \
52
+ /*
76
+ FloatRoundMode oldmode = fpst->float_rounding_mode; \
53
+ * For each 32-bit element, we shift it left, bringing in the
77
+ fpst->float_rounding_mode = float_round_nearest_even; \
54
+ * low 'shift' bits of rdm at the bottom. Bits shifted out at
78
+ ret = itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); \
55
+ * the top become the new rdm, if the predicate mask permits.
79
+ fpst->float_rounding_mode = oldmode; \
56
+ * The final rdm value is returned to update the register.
80
+ return ret; \
57
+ * shift == 0 here means "shift by 32 bits".
58
+ */
59
+ if (shift == 0) {
60
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
61
+ r = rdm;
62
+ if (mask & 1) {
63
+ rdm = d[H4(e)];
64
+ }
65
+ mergemask(&d[H4(e)], r, mask);
66
+ }
67
+ } else {
68
+ uint32_t shiftmask = MAKE_64BIT_MASK(0, shift);
69
+
70
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
71
+ r = (d[H4(e)] << shift) | (rdm & shiftmask);
72
+ if (mask & 1) {
73
+ rdm = d[H4(e)] >> (32 - shift);
74
+ }
75
+ mergemask(&d[H4(e)], r, mask);
76
+ }
77
+ }
78
+ mve_advance_vpt(env);
79
+ return rdm;
80
+}
81
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
82
index XXXXXXX..XXXXXXX 100644
83
--- a/target/arm/translate-mve.c
84
+++ b/target/arm/translate-mve.c
85
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_N(VQRSHRNB_U, vqrshrnb_u)
86
DO_2SHIFT_N(VQRSHRNT_U, vqrshrnt_u)
87
DO_2SHIFT_N(VQRSHRUNB, vqrshrunb)
88
DO_2SHIFT_N(VQRSHRUNT, vqrshrunt)
89
+
90
+static bool trans_VSHLC(DisasContext *s, arg_VSHLC *a)
91
+{
92
+ /*
93
+ * Whole Vector Left Shift with Carry. The carry is taken
94
+ * from a general purpose register and written back there.
95
+ * An imm of 0 means "shift by 32".
96
+ */
97
+ TCGv_ptr qd;
98
+ TCGv_i32 rdm;
99
+
100
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) {
101
+ return false;
102
+ }
103
+ if (a->rdm == 13 || a->rdm == 15) {
104
+ /* CONSTRAINED UNPREDICTABLE: we UNDEF */
105
+ return false;
106
+ }
107
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
108
+ return true;
81
+ }
109
+ }
82
+
110
+
83
#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, ROUND, suff) \
111
+ qd = mve_qreg_ptr(a->qd);
84
uint##isz##_t HELPER(vfp_to##name##p##suff)(ftype x, uint32_t shift, \
112
+ rdm = load_reg(s, a->rdm);
85
void *fpst) \
113
+ gen_helper_mve_vshlc(rdm, cpu_env, qd, rdm, tcg_constant_i32(a->imm));
86
@@ -XXX,XX +XXX,XX @@ uint##isz##_t HELPER(vfp_to##name##p##suff)(ftype x, uint32_t shift, \
114
+ store_reg(s, a->rdm, rdm);
87
115
+ tcg_temp_free_ptr(qd);
88
#define VFP_CONV_FIX(name, p, fsz, ftype, isz, itype) \
116
+ mve_update_eci(s);
89
VFP_CONV_FIX_FLOAT(name, p, fsz, ftype, isz, itype) \
117
+ return true;
90
+VFP_CONV_FIX_FLOAT_ROUND(name, p, fsz, ftype, isz, itype) \
118
+}
91
VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, \
92
float_round_to_zero, _round_to_zero) \
93
VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, ftype, isz, itype, \
94
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
95
index XXXXXXX..XXXXXXX 100644
96
--- a/target/arm/translate-vfp.c.inc
97
+++ b/target/arm/translate-vfp.c.inc
98
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a)
99
/* Switch on op:U:sx bits */
100
switch (a->opc) {
101
case 0:
102
- gen_helper_vfp_shtoh(vd, vd, shift, fpst);
103
+ gen_helper_vfp_shtoh_round_to_nearest(vd, vd, shift, fpst);
104
break;
105
case 1:
106
- gen_helper_vfp_sltoh(vd, vd, shift, fpst);
107
+ gen_helper_vfp_sltoh_round_to_nearest(vd, vd, shift, fpst);
108
break;
109
case 2:
110
- gen_helper_vfp_uhtoh(vd, vd, shift, fpst);
111
+ gen_helper_vfp_uhtoh_round_to_nearest(vd, vd, shift, fpst);
112
break;
113
case 3:
114
- gen_helper_vfp_ultoh(vd, vd, shift, fpst);
115
+ gen_helper_vfp_ultoh_round_to_nearest(vd, vd, shift, fpst);
116
break;
117
case 4:
118
gen_helper_vfp_toshh_round_to_zero(vd, vd, shift, fpst);
119
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
120
/* Switch on op:U:sx bits */
121
switch (a->opc) {
122
case 0:
123
- gen_helper_vfp_shtos(vd, vd, shift, fpst);
124
+ gen_helper_vfp_shtos_round_to_nearest(vd, vd, shift, fpst);
125
break;
126
case 1:
127
- gen_helper_vfp_sltos(vd, vd, shift, fpst);
128
+ gen_helper_vfp_sltos_round_to_nearest(vd, vd, shift, fpst);
129
break;
130
case 2:
131
- gen_helper_vfp_uhtos(vd, vd, shift, fpst);
132
+ gen_helper_vfp_uhtos_round_to_nearest(vd, vd, shift, fpst);
133
break;
134
case 3:
135
- gen_helper_vfp_ultos(vd, vd, shift, fpst);
136
+ gen_helper_vfp_ultos_round_to_nearest(vd, vd, shift, fpst);
137
break;
138
case 4:
139
gen_helper_vfp_toshs_round_to_zero(vd, vd, shift, fpst);
140
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
141
/* Switch on op:U:sx bits */
142
switch (a->opc) {
143
case 0:
144
- gen_helper_vfp_shtod(vd, vd, shift, fpst);
145
+ gen_helper_vfp_shtod_round_to_nearest(vd, vd, shift, fpst);
146
break;
147
case 1:
148
- gen_helper_vfp_sltod(vd, vd, shift, fpst);
149
+ gen_helper_vfp_sltod_round_to_nearest(vd, vd, shift, fpst);
150
break;
151
case 2:
152
- gen_helper_vfp_uhtod(vd, vd, shift, fpst);
153
+ gen_helper_vfp_uhtod_round_to_nearest(vd, vd, shift, fpst);
154
break;
155
case 3:
156
- gen_helper_vfp_ultod(vd, vd, shift, fpst);
157
+ gen_helper_vfp_ultod_round_to_nearest(vd, vd, shift, fpst);
158
break;
159
case 4:
160
gen_helper_vfp_toshd_round_to_zero(vd, vd, shift, fpst);
161
--
119
--
162
2.20.1
120
2.20.1
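A simplified sketch of the VSHLC data flow described in the helper comment above. It assumes four 32-bit elements and deliberately ignores predication and host-endian element indexing, which the real helper handles with mve_element_mask(), mergemask() and H4(). The function name vshlc_sketch is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, unpredicated model of VSHLC: shift each 32-bit element
 * left by 'shift', filling the low bits from 'carry', and let the bits
 * shifted out at the top become the carry into the next element.
 * shift == 0 encodes "shift by 32", i.e. each element is replaced
 * wholesale by the incoming carry. Returns the final carry value.
 */
static uint32_t vshlc_sketch(uint32_t d[4], uint32_t carry, unsigned shift)
{
    assert(shift < 32);

    for (unsigned e = 0; e < 4; e++) {
        uint32_t in = d[e];

        if (shift == 0) {
            d[e] = carry;
            carry = in;
        } else {
            uint32_t lowmask = (UINT32_C(1) << shift) - 1;

            d[e] = (in << shift) | (carry & lowmask);
            carry = in >> (32 - shift);
        }
    }
    return carry;
}
```

Treating shift == 0 as a separate case avoids a shift by 32 on a 32-bit type, which is undefined behaviour in C.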
163
121
164
122
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
While APEI is a generic ACPI feature (usable by X86 and ARM64), only
4
the 'virt' machine uses it, by enabling the RAS Virtualization. See
5
commit 2afa8c8519 ("hw/arm/virt: Introduce a RAS machine option").
6
7
Restrict the APEI tables generation code to the single user: the virt
8
machine. If another machine wants to use it, it simply has to 'select
9
ACPI_APEI' in its Kconfig.
10
11
Fixes: aa16508f1d ("ACPI: Build related register address fields via hardware error fw_cfg blob")
12
Acked-by: Michael S. Tsirkin <mst@redhat.com>
13
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
14
Acked-by: Laszlo Ersek <lersek@redhat.com>
15
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
16
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
17
Message-id: 20201008161414.2672569-1-philmd@redhat.com
18
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
---
20
default-configs/devices/arm-softmmu.mak | 1 -
21
hw/arm/Kconfig | 1 +
22
2 files changed, 1 insertion(+), 1 deletion(-)
23
24
diff --git a/default-configs/devices/arm-softmmu.mak b/default-configs/devices/arm-softmmu.mak
25
index XXXXXXX..XXXXXXX 100644
26
--- a/default-configs/devices/arm-softmmu.mak
27
+++ b/default-configs/devices/arm-softmmu.mak
28
@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
29
CONFIG_FSL_IMX6UL=y
30
CONFIG_SEMIHOSTING=y
31
CONFIG_ALLWINNER_H3=y
32
-CONFIG_ACPI_APEI=y
33
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
34
index XXXXXXX..XXXXXXX 100644
35
--- a/hw/arm/Kconfig
36
+++ b/hw/arm/Kconfig
37
@@ -XXX,XX +XXX,XX @@ config ARM_VIRT
38
select ACPI_MEMORY_HOTPLUG
39
select ACPI_HW_REDUCED
40
select ACPI_NVDIMM
41
+ select ACPI_APEI
42
43
config CHEETAH
44
bool
45
--
46
2.20.1
47
48
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
Use the BCM2835_SYSTIMER_COUNT definition instead of the
4
magic '4' value.
5
6
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
9
Message-id: 20201010203709.3116542-2-f4bug@amsat.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
include/hw/timer/bcm2835_systmr.h | 4 +++-
13
hw/timer/bcm2835_systmr.c | 3 ++-
14
2 files changed, 5 insertions(+), 2 deletions(-)
15
16
diff --git a/include/hw/timer/bcm2835_systmr.h b/include/hw/timer/bcm2835_systmr.h
17
index XXXXXXX..XXXXXXX 100644
18
--- a/include/hw/timer/bcm2835_systmr.h
19
+++ b/include/hw/timer/bcm2835_systmr.h
20
@@ -XXX,XX +XXX,XX @@
21
#define TYPE_BCM2835_SYSTIMER "bcm2835-sys-timer"
22
OBJECT_DECLARE_SIMPLE_TYPE(BCM2835SystemTimerState, BCM2835_SYSTIMER)
23
24
+#define BCM2835_SYSTIMER_COUNT 4
25
+
26
struct BCM2835SystemTimerState {
27
/*< private >*/
28
SysBusDevice parent_obj;
29
@@ -XXX,XX +XXX,XX @@ struct BCM2835SystemTimerState {
30
31
struct {
32
uint32_t status;
33
- uint32_t compare[4];
34
+ uint32_t compare[BCM2835_SYSTIMER_COUNT];
35
} reg;
36
};
37
38
diff --git a/hw/timer/bcm2835_systmr.c b/hw/timer/bcm2835_systmr.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/hw/timer/bcm2835_systmr.c
41
+++ b/hw/timer/bcm2835_systmr.c
42
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription bcm2835_systmr_vmstate = {
43
.minimum_version_id = 1,
44
.fields = (VMStateField[]) {
45
VMSTATE_UINT32(reg.status, BCM2835SystemTimerState),
46
- VMSTATE_UINT32_ARRAY(reg.compare, BCM2835SystemTimerState, 4),
47
+ VMSTATE_UINT32_ARRAY(reg.compare, BCM2835SystemTimerState,
48
+ BCM2835_SYSTIMER_COUNT),
49
VMSTATE_END_OF_LIST()
50
}
51
};
52
--
53
2.20.1
54
55
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
The variable holding the CTRL_STATUS register is misnamed
4
'status'. Rename it 'ctrl_status' to make it more obvious
5
this register is also used to control the peripheral.
6
7
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Message-id: 20201010203709.3116542-3-f4bug@amsat.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
include/hw/timer/bcm2835_systmr.h | 2 +-
14
hw/timer/bcm2835_systmr.c | 8 ++++----
15
2 files changed, 5 insertions(+), 5 deletions(-)
16
17
diff --git a/include/hw/timer/bcm2835_systmr.h b/include/hw/timer/bcm2835_systmr.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/include/hw/timer/bcm2835_systmr.h
20
+++ b/include/hw/timer/bcm2835_systmr.h
21
@@ -XXX,XX +XXX,XX @@ struct BCM2835SystemTimerState {
22
qemu_irq irq;
23
24
struct {
25
- uint32_t status;
26
+ uint32_t ctrl_status;
27
uint32_t compare[BCM2835_SYSTIMER_COUNT];
28
} reg;
29
};
30
diff --git a/hw/timer/bcm2835_systmr.c b/hw/timer/bcm2835_systmr.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/hw/timer/bcm2835_systmr.c
33
+++ b/hw/timer/bcm2835_systmr.c
34
@@ -XXX,XX +XXX,XX @@ REG32(COMPARE3, 0x18)
35
36
static void bcm2835_systmr_update_irq(BCM2835SystemTimerState *s)
37
{
38
- bool enable = !!s->reg.status;
39
+ bool enable = !!s->reg.ctrl_status;
40
41
trace_bcm2835_systmr_irq(enable);
42
qemu_set_irq(s->irq, enable);
43
@@ -XXX,XX +XXX,XX @@ static uint64_t bcm2835_systmr_read(void *opaque, hwaddr offset,
44
45
switch (offset) {
46
case A_CTRL_STATUS:
47
- r = s->reg.status;
48
+ r = s->reg.ctrl_status;
49
break;
50
case A_COMPARE0 ... A_COMPARE3:
51
r = s->reg.compare[(offset - A_COMPARE0) >> 2];
52
@@ -XXX,XX +XXX,XX @@ static void bcm2835_systmr_write(void *opaque, hwaddr offset,
53
trace_bcm2835_systmr_write(offset, value);
54
switch (offset) {
55
case A_CTRL_STATUS:
56
- s->reg.status &= ~value; /* Ack */
57
+ s->reg.ctrl_status &= ~value; /* Ack */
58
bcm2835_systmr_update_irq(s);
59
break;
60
case A_COMPARE0 ... A_COMPARE3:
61
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription bcm2835_systmr_vmstate = {
62
.version_id = 1,
63
.minimum_version_id = 1,
64
.fields = (VMStateField[]) {
65
- VMSTATE_UINT32(reg.status, BCM2835SystemTimerState),
66
+ VMSTATE_UINT32(reg.ctrl_status, BCM2835SystemTimerState),
67
VMSTATE_UINT32_ARRAY(reg.compare, BCM2835SystemTimerState,
68
BCM2835_SYSTIMER_COUNT),
69
VMSTATE_END_OF_LIST()
70
--
71
2.20.1
72
73
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
This peripheral has 1 free-running timer and 4 compare registers.
4
5
Only the free-running timer is implemented. Add support for the
6
COMPARE registers (each register is wired to an IRQ).
7
8
Reference: "BCM2835 ARM Peripherals" datasheet [*]
9
chapter 12 "System Timer":
10
11
The System Timer peripheral provides four 32-bit timer channels
12
and a single 64-bit free running counter. Each channel has an
13
output compare register, which is compared against the 32 least
14
significant bits of the free running counter values. When the
15
two values match, the system timer peripheral generates a signal
16
to indicate a match for the appropriate channel. The match signal
17
is then fed into the interrupt controller.
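As a rough illustration of the compare behaviour quoted above (a sketch, not the code in this patch), the delay until the next match can be computed with unsigned 32-bit arithmetic, so that wraparound of the low counter word needs no special case. The function name systmr_delay_us is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: microseconds until the low 32 bits of the
 * free-running counter next equal 'compare'. Unsigned subtraction
 * wraps modulo 2^32, so a compare value "behind" the counter yields
 * the time until the low word comes around to it again.
 */
static uint32_t systmr_delay_us(uint64_t counter_us, uint32_t compare)
{
    return compare - (uint32_t)counter_us;
}
```

For example, with the counter at 100 and a compare value of 150 the match is 50 us away; with the counter at 200 and a compare value of 100 the match only occurs after the low word wraps.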
18
19
This peripheral has been used since Linux 3.7, commit ee4af5696720
20
("ARM: bcm2835: add system timer").
21
22
[*] https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
23
24
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
25
Reviewed-by: Luc Michel <luc@lmichel.fr>
26
Message-id: 20201010203709.3116542-4-f4bug@amsat.org
27
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
28
---
29
include/hw/timer/bcm2835_systmr.h | 11 +++++--
30
hw/timer/bcm2835_systmr.c | 48 ++++++++++++++++++++-----------
31
hw/timer/trace-events | 6 ++--
32
3 files changed, 44 insertions(+), 21 deletions(-)
33
34
diff --git a/include/hw/timer/bcm2835_systmr.h b/include/hw/timer/bcm2835_systmr.h
35
index XXXXXXX..XXXXXXX 100644
36
--- a/include/hw/timer/bcm2835_systmr.h
37
+++ b/include/hw/timer/bcm2835_systmr.h
38
@@ -XXX,XX +XXX,XX @@
39
40
#include "hw/sysbus.h"
41
#include "hw/irq.h"
42
+#include "qemu/timer.h"
43
#include "qom/object.h"
44
45
#define TYPE_BCM2835_SYSTIMER "bcm2835-sys-timer"
46
@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(BCM2835SystemTimerState, BCM2835_SYSTIMER)
47
48
#define BCM2835_SYSTIMER_COUNT 4
49
50
+typedef struct {
51
+ unsigned id;
52
+ QEMUTimer timer;
53
+ qemu_irq irq;
54
+ BCM2835SystemTimerState *state;
55
+} BCM2835SystemTimerCompare;
56
+
57
struct BCM2835SystemTimerState {
58
/*< private >*/
59
SysBusDevice parent_obj;
60
61
/*< public >*/
62
MemoryRegion iomem;
63
- qemu_irq irq;
64
-
65
struct {
66
uint32_t ctrl_status;
67
uint32_t compare[BCM2835_SYSTIMER_COUNT];
68
} reg;
69
+ BCM2835SystemTimerCompare tmr[BCM2835_SYSTIMER_COUNT];
70
};
71
72
#endif
73
diff --git a/hw/timer/bcm2835_systmr.c b/hw/timer/bcm2835_systmr.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/hw/timer/bcm2835_systmr.c
76
+++ b/hw/timer/bcm2835_systmr.c
77
@@ -XXX,XX +XXX,XX @@ REG32(COMPARE1, 0x10)
78
REG32(COMPARE2, 0x14)
79
REG32(COMPARE3, 0x18)
80
81
-static void bcm2835_systmr_update_irq(BCM2835SystemTimerState *s)
82
+static void bcm2835_systmr_timer_expire(void *opaque)
83
{
84
- bool enable = !!s->reg.ctrl_status;
85
+ BCM2835SystemTimerCompare *tmr = opaque;
86
87
- trace_bcm2835_systmr_irq(enable);
88
- qemu_set_irq(s->irq, enable);
89
-}
90
-
91
-static void bcm2835_systmr_update_compare(BCM2835SystemTimerState *s,
92
- unsigned timer_index)
93
-{
94
- /* TODO fow now, since neither Linux nor U-boot use these timers. */
95
- qemu_log_mask(LOG_UNIMP, "COMPARE register %u not implemented\n",
96
- timer_index);
97
+ trace_bcm2835_systmr_timer_expired(tmr->id);
98
+ tmr->state->reg.ctrl_status |= 1 << tmr->id;
99
+ qemu_set_irq(tmr->irq, 1);
100
}
101
102
static uint64_t bcm2835_systmr_read(void *opaque, hwaddr offset,
103
@@ -XXX,XX +XXX,XX @@ static uint64_t bcm2835_systmr_read(void *opaque, hwaddr offset,
104
}
105
106
static void bcm2835_systmr_write(void *opaque, hwaddr offset,
107
- uint64_t value, unsigned size)
108
+ uint64_t value64, unsigned size)
109
{
110
BCM2835SystemTimerState *s = BCM2835_SYSTIMER(opaque);
111
+ int index;
112
+ uint32_t value = value64;
113
+ uint32_t triggers_delay_us;
114
+ uint64_t now;
115
116
trace_bcm2835_systmr_write(offset, value);
117
switch (offset) {
118
case A_CTRL_STATUS:
119
s->reg.ctrl_status &= ~value; /* Ack */
120
- bcm2835_systmr_update_irq(s);
121
+ for (index = 0; index < ARRAY_SIZE(s->tmr); index++) {
122
+ if (extract32(value, index, 1)) {
123
+ trace_bcm2835_systmr_irq_ack(index);
124
+ qemu_set_irq(s->tmr[index].irq, 0);
125
+ }
126
+ }
127
break;
128
case A_COMPARE0 ... A_COMPARE3:
129
- s->reg.compare[(offset - A_COMPARE0) >> 2] = value;
130
- bcm2835_systmr_update_compare(s, (offset - A_COMPARE0) >> 2);
131
+ index = (offset - A_COMPARE0) >> 2;
132
+ s->reg.compare[index] = value;
133
+ now = qemu_clock_get_us(QEMU_CLOCK_VIRTUAL);
134
+ /* Compare lower 32-bits of the free-running counter. */
135
+ triggers_delay_us = value - now;
136
+ trace_bcm2835_systmr_run(index, triggers_delay_us);
137
+ timer_mod(&s->tmr[index].timer, now + triggers_delay_us);
138
break;
139
case A_COUNTER_LOW:
140
case A_COUNTER_HIGH:
141
@@ -XXX,XX +XXX,XX @@ static void bcm2835_systmr_realize(DeviceState *dev, Error **errp)
142
memory_region_init_io(&s->iomem, OBJECT(dev), &bcm2835_systmr_ops,
143
s, "bcm2835-sys-timer", 0x20);
144
sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->iomem);
145
- sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->irq);
146
+
147
+ for (size_t i = 0; i < ARRAY_SIZE(s->tmr); i++) {
148
+ s->tmr[i].id = i;
149
+ s->tmr[i].state = s;
150
+ sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->tmr[i].irq);
151
+ timer_init_us(&s->tmr[i].timer, QEMU_CLOCK_VIRTUAL,
152
+ bcm2835_systmr_timer_expire, &s->tmr[i]);
153
+ }
154
}
155
156
static const VMStateDescription bcm2835_systmr_vmstate = {
157
diff --git a/hw/timer/trace-events b/hw/timer/trace-events
158
index XXXXXXX..XXXXXXX 100644
159
--- a/hw/timer/trace-events
160
+++ b/hw/timer/trace-events
161
@@ -XXX,XX +XXX,XX @@ nrf51_timer_write(uint8_t timer_id, uint64_t addr, uint32_t value, unsigned size
162
nrf51_timer_set_count(uint8_t timer_id, uint8_t counter_id, uint32_t value) "timer %u counter %u count 0x%" PRIx32
163
164
# bcm2835_systmr.c
165
-bcm2835_systmr_irq(bool enable) "timer irq state %u"
166
+bcm2835_systmr_timer_expired(unsigned id) "timer #%u expired"
167
+bcm2835_systmr_irq_ack(unsigned id) "timer #%u acked"
168
bcm2835_systmr_read(uint64_t offset, uint64_t data) "timer read: offset 0x%" PRIx64 " data 0x%" PRIx64
169
-bcm2835_systmr_write(uint64_t offset, uint64_t data) "timer write: offset 0x%" PRIx64 " data 0x%" PRIx64
170
+bcm2835_systmr_write(uint64_t offset, uint32_t data) "timer write: offset 0x%" PRIx64 " data 0x%" PRIx32
171
+bcm2835_systmr_run(unsigned id, uint64_t delay_us) "timer #%u expiring in %"PRIu64" us"
172
173
# avr_timer16.c
174
avr_timer16_read(uint8_t addr, uint8_t value) "timer16 read addr:%u value:%u"
175
--
176
2.20.1
177
178
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
The SYS_timer is not directly wired to the ARM core, but to the
4
SoC (peripheral) interrupt controller.
5
6
Fixes: 0e5bbd74064 ("hw/arm/bcm2835_peripherals: Use the SYS_timer")
7
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Message-id: 20201010203709.3116542-5-f4bug@amsat.org
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
hw/arm/bcm2835_peripherals.c | 13 +++++++++++--
14
1 file changed, 11 insertions(+), 2 deletions(-)
15
16
diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/arm/bcm2835_peripherals.c
19
+++ b/hw/arm/bcm2835_peripherals.c
20
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
21
memory_region_add_subregion(&s->peri_mr, ST_OFFSET,
22
sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->systmr), 0));
23
sysbus_connect_irq(SYS_BUS_DEVICE(&s->systmr), 0,
24
- qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_ARM_IRQ,
25
- INTERRUPT_ARM_TIMER));
26
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
27
+ INTERRUPT_TIMER0));
28
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->systmr), 1,
29
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
30
+ INTERRUPT_TIMER1));
31
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->systmr), 2,
32
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
33
+ INTERRUPT_TIMER2));
34
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->systmr), 3,
35
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
36
+ INTERRUPT_TIMER3));
37
38
/* UART0 */
39
qdev_prop_set_chr(DEVICE(&s->uart0), "chardev", serial_hd(0));
40
--
41
2.20.1
42
43
Deleted patch
1
From: Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
2
1
3
The current documentation is not clear about GETPC usage.
4
In particular, when used outside the top level helper function
5
it causes unexpected behavior.
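A minimal sketch of the intended pattern, assuming GCC or Clang. GETPC_SKETCH is a stand-in macro, and helper_foo, do_access and last_ra are hypothetical names, not QEMU code. The return address is captured once in the top-level helper and handed down as an ordinary argument.

```c
#include <stdint.h>

/* Stand-in for GETPC(): the return address of the *current* function. */
#define GETPC_SKETCH() ((uintptr_t)__builtin_return_address(0))

/* Records what the inner function was given, for illustration only. */
static uintptr_t last_ra;

/*
 * Hypothetical inner function. It must not evaluate GETPC_SKETCH()
 * itself: here that would give an address inside helper_foo(), not
 * the address the top-level helper was called from.
 */
static void do_access(uint32_t addr, uintptr_t ra)
{
    (void)addr;
    last_ra = ra;
}

/* Hypothetical top-level helper: capture once, then pass down. */
__attribute__((noinline))
static uintptr_t helper_foo(uint32_t addr)
{
    uintptr_t ra = GETPC_SKETCH();

    do_access(addr, ra);
    return ra;
}
```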
6
7
Signed-off-by: Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
8
Message-id: 20201015095147.1691-1-e.emanuelegiuseppe@gmail.com
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
12
docs/devel/loads-stores.rst | 8 +++++++-
13
1 file changed, 7 insertions(+), 1 deletion(-)
14
15
diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
16
index XXXXXXX..XXXXXXX 100644
17
--- a/docs/devel/loads-stores.rst
18
+++ b/docs/devel/loads-stores.rst
19
@@ -XXX,XX +XXX,XX @@ guest CPU state in case of a guest CPU exception. This is passed
20
to ``cpu_restore_state()``. Therefore the value should either be 0,
21
to indicate that the guest CPU state is already synchronized, or
22
the result of ``GETPC()`` from the top level ``HELPER(foo)``
23
-function, which is a return address into the generated code.
24
+function, which is a return address into the generated code [#gpc]_.
25
+
26
+.. [#gpc] Note that ``GETPC()`` should be used with great care: calling
27
+ it in other functions that are *not* the top level
28
+ ``HELPER(foo)`` will cause unexpected behavior. Instead, the
29
+ value of ``GETPC()`` should be read from the helper and passed
30
+ if needed to the functions that the helper calls.
31
32
Function names follow the pattern:
33
34
--
35
2.20.1
36
37
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
Add trace events for GPU and CPU IRQs.
4
5
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
6
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Message-id: 20201017180731.1165871-2-f4bug@amsat.org
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
---
10
hw/intc/bcm2835_ic.c | 4 +++-
11
hw/intc/trace-events | 4 ++++
12
2 files changed, 7 insertions(+), 1 deletion(-)
13
14
diff --git a/hw/intc/bcm2835_ic.c b/hw/intc/bcm2835_ic.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/hw/intc/bcm2835_ic.c
17
+++ b/hw/intc/bcm2835_ic.c
18
@@ -XXX,XX +XXX,XX @@
19
#include "migration/vmstate.h"
20
#include "qemu/log.h"
21
#include "qemu/module.h"
22
+#include "trace.h"
23
24
#define GPU_IRQS 64
25
#define ARM_IRQS 8
26
@@ -XXX,XX +XXX,XX @@ static void bcm2835_ic_update(BCM2835ICState *s)
27
set = (s->gpu_irq_level & s->gpu_irq_enable)
28
|| (s->arm_irq_level & s->arm_irq_enable);
29
qemu_set_irq(s->irq, set);
30
-
31
}
32
33
static void bcm2835_ic_set_gpu_irq(void *opaque, int irq, int level)
34
@@ -XXX,XX +XXX,XX @@ static void bcm2835_ic_set_gpu_irq(void *opaque, int irq, int level)
35
BCM2835ICState *s = opaque;
36
37
assert(irq >= 0 && irq < 64);
38
+ trace_bcm2835_ic_set_gpu_irq(irq, level);
39
s->gpu_irq_level = deposit64(s->gpu_irq_level, irq, 1, level != 0);
40
bcm2835_ic_update(s);
41
}
42
@@ -XXX,XX +XXX,XX @@ static void bcm2835_ic_set_arm_irq(void *opaque, int irq, int level)
43
BCM2835ICState *s = opaque;
44
45
assert(irq >= 0 && irq < 8);
46
+ trace_bcm2835_ic_set_cpu_irq(irq, level);
47
s->arm_irq_level = deposit32(s->arm_irq_level, irq, 1, level != 0);
48
bcm2835_ic_update(s);
49
}
50
diff --git a/hw/intc/trace-events b/hw/intc/trace-events
51
index XXXXXXX..XXXXXXX 100644
52
--- a/hw/intc/trace-events
53
+++ b/hw/intc/trace-events
54
@@ -XXX,XX +XXX,XX @@ nvic_sysreg_write(uint64_t addr, uint32_t value, unsigned size) "NVIC sysreg wri
55
heathrow_write(uint64_t addr, unsigned int n, uint64_t value) "0x%"PRIx64" %u: 0x%"PRIx64
56
heathrow_read(uint64_t addr, unsigned int n, uint64_t value) "0x%"PRIx64" %u: 0x%"PRIx64
57
heathrow_set_irq(int num, int level) "set_irq: num=0x%02x level=%d"
58
+
59
+# bcm2835_ic.c
60
+bcm2835_ic_set_gpu_irq(int irq, int level) "GPU irq #%d level %d"
61
+bcm2835_ic_set_cpu_irq(int irq, int level) "CPU irq #%d level %d"
62
--
63
2.20.1
64
65
Deleted patch
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
1
3
The IRQ values are defined a few lines earlier; use them instead of
4
the magic numbers.
5
6
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Message-id: 20201017180731.1165871-3-f4bug@amsat.org
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
---
11
hw/intc/bcm2836_control.c | 8 ++++----
12
1 file changed, 4 insertions(+), 4 deletions(-)
13
14
diff --git a/hw/intc/bcm2836_control.c b/hw/intc/bcm2836_control.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/hw/intc/bcm2836_control.c
17
+++ b/hw/intc/bcm2836_control.c
18
@@ -XXX,XX +XXX,XX @@ static void bcm2836_control_set_local_irq(void *opaque, int core, int local_irq,
19
20
static void bcm2836_control_set_local_irq0(void *opaque, int core, int level)
21
{
22
- bcm2836_control_set_local_irq(opaque, core, 0, level);
23
+ bcm2836_control_set_local_irq(opaque, core, IRQ_CNTPSIRQ, level);
24
}
25
26
static void bcm2836_control_set_local_irq1(void *opaque, int core, int level)
27
{
28
- bcm2836_control_set_local_irq(opaque, core, 1, level);
29
+ bcm2836_control_set_local_irq(opaque, core, IRQ_CNTPNSIRQ, level);
30
}
31
32
static void bcm2836_control_set_local_irq2(void *opaque, int core, int level)
33
{
34
- bcm2836_control_set_local_irq(opaque, core, 2, level);
35
+ bcm2836_control_set_local_irq(opaque, core, IRQ_CNTHPIRQ, level);
36
}
37
38
static void bcm2836_control_set_local_irq3(void *opaque, int core, int level)
39
{
40
- bcm2836_control_set_local_irq(opaque, core, 3, level);
41
+ bcm2836_control_set_local_irq(opaque, core, IRQ_CNTVIRQ, level);
42
}
43
44
static void bcm2836_control_set_gpu_irq(void *opaque, int irq, int level)
45
--
46
2.20.1
47
48
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
We already have the full ARMMMUIdx as computed from the
4
function parameter.
5
6
For the purpose of regime_has_2_ranges, we can ignore any
7
difference between AccType_Normal and AccType_Unpriv, which
8
would be the only difference between the passed mmu_idx
9
and arm_mmu_idx_el.
10
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
13
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
14
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
15
Message-id: 20201008162155.161886-2-richard.henderson@linaro.org
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
17
---
18
target/arm/mte_helper.c | 3 +--
19
1 file changed, 1 insertion(+), 2 deletions(-)
20
21
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
22
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/mte_helper.c
24
+++ b/target/arm/mte_helper.c
25
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
26
27
case 2:
28
/* Tag check fail causes asynchronous flag set. */
29
- mmu_idx = arm_mmu_idx_el(env, el);
30
- if (regime_has_2_ranges(mmu_idx)) {
31
+ if (regime_has_2_ranges(arm_mmu_idx)) {
32
select = extract64(dirty_ptr, 55, 1);
33
} else {
34
select = 0;
35
--
36
2.20.1
37
38
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
The reporting in AArch64.TagCheckFail only depends on PSTATE.EL,
4
and not the AccType of the operation. There are two guest
5
visible problems that affect LDTR and STTR because of this:
6
7
(1) Selecting TCF0 vs TCF1 to decide on reporting,
8
(2) Report "data abort same el" not "data abort lower el".
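The selection in point (1) can be sketched as below, using the SCTLR bit positions that appear in the diff: bits [39:38] for EL0 and bits [41:40] otherwise. extract_bits is a stand-in for QEMU's extract64(), and select_tcf is a hypothetical name. The point is that the choice depends only on the current EL, not on the access type.

```c
#include <stdint.h>

/* Stand-in for extract64(): 'len' bits of 'v' from bit 'start' (0 < len < 64). */
static uint64_t extract_bits(uint64_t v, unsigned start, unsigned len)
{
    return (v >> start) & ((UINT64_C(1) << len) - 1);
}

/* Hypothetical: choose the tag-check-fault control field by current EL only. */
static unsigned select_tcf(uint64_t sctlr, int current_el)
{
    if (current_el == 0) {
        return (unsigned)extract_bits(sctlr, 38, 2);   /* EL0 field */
    }
    return (unsigned)extract_bits(sctlr, 40, 2);       /* EL1 and above */
}
```

Under this rule an unprivileged load or store (LDTR/STTR) executed at EL1 uses the EL1 field, since PSTATE.EL is still 1.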
9
10
Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
13
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
14
Message-id: 20201008162155.161886-3-richard.henderson@linaro.org
15
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
---
17
target/arm/mte_helper.c | 10 +++-------
18
1 file changed, 3 insertions(+), 7 deletions(-)
19
20
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/target/arm/mte_helper.c
23
+++ b/target/arm/mte_helper.c
24
@@ -XXX,XX +XXX,XX @@ static void mte_check_fail(CPUARMState *env, uint32_t desc,
25
reg_el = regime_el(env, arm_mmu_idx);
26
sctlr = env->cp15.sctlr_el[reg_el];
27
28
- switch (arm_mmu_idx) {
29
- case ARMMMUIdx_E10_0:
30
- case ARMMMUIdx_E20_0:
31
- el = 0;
32
+ el = arm_current_el(env);
33
+ if (el == 0) {
34
tcf = extract64(sctlr, 38, 2);
35
- break;
36
- default:
37
- el = reg_el;
38
+ } else {
39
tcf = extract64(sctlr, 40, 2);
40
}
41
42
--
43
2.20.1
44
45
Deleted patch
1
From: Richard Henderson <richard.henderson@linaro.org>
2
1
3
Unlike many other bits in HCR_EL2, the description for this
4
bit does not contain the phrase "if ... this field behaves
5
as 0 for all purposes other than", so do not squash the bit
6
in arm_hcr_el2_eff.
7
8
Instead, replicate the E2H+TGE test in the two places that
9
require it.
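The replicated test amounts to the predicate below: a clear ATA bit only blocks allocation tag access when E2H and TGE are not both set. This is a standalone sketch. The SK_ constants mirror the HCR_EL2 bit positions as I understand them (TGE bit 27, E2H bit 34, ATA bit 56) and should be treated as assumptions, and hcr_blocks_tag_access is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed HCR_EL2 bit positions, named with an SK_ prefix as stand-ins. */
#define SK_HCR_TGE (UINT64_C(1) << 27)
#define SK_HCR_E2H (UINT64_C(1) << 34)
#define SK_HCR_ATA (UINT64_C(1) << 56)

/*
 * Hypothetical predicate: true when HCR_EL2 denies allocation tag
 * access to a lower EL. ATA == 0 is ignored when {E2H,TGE} == {1,1}.
 */
static bool hcr_blocks_tag_access(uint64_t hcr)
{
    bool e2h_tge = (hcr & SK_HCR_E2H) && (hcr & SK_HCR_TGE);

    return !(hcr & SK_HCR_ATA) && !e2h_tge;
}
```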

Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Message-id: 20201008162155.161886-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h | 9 +++++----
 target/arm/helper.c    | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline bool allocation_tag_access_enabled(CPUARMState *env, int el,
         && !(env->cp15.scr_el3 & SCR_ATA)) {
         return false;
     }
-    if (el < 2
-        && arm_feature(env, ARM_FEATURE_EL2)
-        && !(arm_hcr_el2_eff(env) & HCR_ATA)) {
-        return false;
+    if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+        uint64_t hcr = arm_hcr_el2_eff(env);
+        if (!(hcr & HCR_ATA) && (!(hcr & HCR_E2H) || !(hcr & HCR_TGE))) {
+            return false;
+        }
     }
     sctlr &= (el == 0 ? SCTLR_ATA0 : SCTLR_ATA);
     return sctlr != 0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_mte(CPUARMState *env, const ARMCPRegInfo *ri,
 {
     int el = arm_current_el(env);

-    if (el < 2 &&
-        arm_feature(env, ARM_FEATURE_EL2) &&
-        !(arm_hcr_el2_eff(env) & HCR_ATA)) {
-        return CP_ACCESS_TRAP_EL2;
+    if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+        uint64_t hcr = arm_hcr_el2_eff(env);
+        if (!(hcr & HCR_ATA) && (!(hcr & HCR_E2H) || !(hcr & HCR_TGE))) {
+            return CP_ACCESS_TRAP_EL2;
+        }
     }
     if (el < 3 &&
         arm_feature(env, ARM_FEATURE_EL3) &&
--
2.20.1

Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Commit 7998beb9c2e removed the ram_size initialization in the
arm_boot_info structure, however it is used by arm_load_kernel().

Initialize the field to fix:

  $ qemu-system-arm -M n800 -append 'console=ttyS1' \
    -kernel meego-arm-n8x0-1.0.80.20100712.1431-vmlinuz-2.6.35~rc4-129.1-n8x0
  qemu-system-arm: kernel 'meego-arm-n8x0-1.0.80.20100712.1431-vmlinuz-2.6.35~rc4-129.1-n8x0' is too large to fit in RAM (kernel size 1964608, RAM size 0)

Noticed while running the test introduced in commit 050a82f0c5b
("tests/acceptance: Add a test for the N800 and N810 arm machines").

Fixes: 7998beb9c2e ("arm/nseries: use memdev for RAM")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20201019095148.1602119-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/nseries.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/nseries.c
+++ b/hw/arm/nseries.c
@@ -XXX,XX +XXX,XX @@ static void n8x0_init(MachineState *machine,
         g_free(sz);
         exit(EXIT_FAILURE);
     }
+    binfo->ram_size = machine->ram_size;

     memory_region_add_subregion(get_system_memory(), OMAP2_Q2_BASE,
                                 machine->ram);
--
2.20.1

Deleted patch
For nested groups like:

  {
    [
      pattern 1
      pattern 2
    ]
    pattern 3
  }

the intended behaviour is that patterns 1 and 2 must not
overlap with each other; if the insn matches neither then
we fall through to pattern 3 as the next thing in the
outer overlapping group.

Currently we generate incorrect code for this situation,
because in the code path for a failed match inside the
inner non-overlapping group we generate a "return" statement,
which causes decode to stop entirely rather than continuing
to the next thing in the outer group.

Generate a "break" instead, so that decode flow behaves
as required for this nested group case.
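
To make the control-flow difference concrete, here is a hand-written sketch of the shape of decoder this nested group should produce. It is not actual decodetree output: the masks, the `trans_p1`/`trans_p2`/`trans_p3` functions and the `last_match` variable are all hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical translators: each returns true if it handled the insn. */
static bool trans_p1(uint32_t insn) { return (insn & 0x100) != 0; }
static bool trans_p2(uint32_t insn) { (void)insn; return true; }
static bool trans_p3(uint32_t insn) { (void)insn; return true; }

/* Records which pattern handled the last insn, for illustration only. */
static int last_match;

static bool decode(uint32_t insn)
{
    last_match = 0;

    /* Inner non-overlapping group: at most one case can match. */
    switch (insn & 0xf) {
    case 0x1:
        if (trans_p1(insn)) {
            last_match = 1;
            return true;
        }
        /* "return false" here would skip pattern 3; "break" does not. */
        break;
    case 0x2:
        if (trans_p2(insn)) {
            last_match = 2;
            return true;
        }
        break;
    }

    /* Next entry in the outer overlapping group. */
    if (trans_p3(insn)) {
        last_match = 3;
        return true;
    }
    return false;
}
```

With `break`, an insn that selects case 0x1 but is rejected by `trans_p1` still reaches pattern 3, which is the fall-through described above.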

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20201019151301.2046-2-peter.maydell@linaro.org
---
 scripts/decodetree.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index XXXXXXX..XXXXXXX 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -XXX,XX +XXX,XX @@ class Tree:
             output(ind, '    /* ',
                    str_match_bits(innerbits, innermask), ' */\n')
             s.output_code(i + 4, extracted, innerbits, innermask)
-            output(ind, '    return false;\n')
+            output(ind, '    break;\n')
         output(ind, '}\n')
 # end Tree

--
2.20.1

1
From v8.1M, disabled-coprocessor handling changes slightly:
1
Implement the MVE VADDLV insn; this is similar to VADDV, except
2
* coprocessors 8, 9, 14 and 15 are also governed by the
2
that it accumulates 32-bit elements into a 64-bit accumulator
3
cp10 enable bit, like cp11
3
stored in a pair of general-purpose registers.
4
* an extra range of instruction patterns is considered
5
to be inside the coprocessor space
6
7
We previously marked these up with TODO comments; implement the
8
correct behaviour.
9
10
Unfortunately there is no ID register field which indicates this
11
behaviour. We could in theory test an unrelated ID register which
12
indicates guaranteed-to-be-in-v8.1M behaviour like ID_ISAR0.CmpBranch
13
>= 3 (low-overhead-loops), but it seems better to simply define a new
14
ARM_FEATURE_V8_1M feature flag and use it for this and other
15
new-in-v8.1M behaviour that isn't identifiable from the ID registers.
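
For the VADDLV side of this comparison, the accumulation of 32-bit elements into a 64-bit accumulator can be sketched in plain C. This is an illustration only, written to mirror the `mask & 1` and `mask >>= 4` loop in the helper added by the patch; `vaddlv_signed` is a hypothetical name and the sketch ignores host endianness and predication-state updates.

```c
#include <stdint.h>

/*
 * Add the four 32-bit lanes of a 128-bit vector into a 64-bit accumulator.
 * `mask` carries one bit per byte; a lane is included when the bit for its
 * lowest byte is set.
 */
static uint64_t vaddlv_signed(const int32_t lanes[4], uint64_t acc,
                              uint16_t mask)
{
    for (unsigned e = 0; e < 4; e++, mask >>= 4) {
        if (mask & 1) {
            /* Sign-extend the lane to 64 bits before accumulating. */
            acc += (uint64_t)(int64_t)lanes[e];
        }
    }
    return acc;
}
```

The returned value corresponds to the RdaHi:RdaLo pair, with the low 32 bits going to RdaLo and the high 32 bits to RdaHi.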
16
4
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Message-id: 20201019151301.2046-3-peter.maydell@linaro.org
7
Message-id: 20210628135835.6690-15-peter.maydell@linaro.org
20
---
8
---
21
target/arm/cpu.h | 1 +
9
target/arm/helper-mve.h | 3 ++
22
target/arm/m-nocp.decode | 10 ++++++----
10
target/arm/mve.decode | 6 +++-
23
target/arm/translate-vfp.c.inc | 17 +++++++++++++++--
11
target/arm/mve_helper.c | 19 ++++++++++++
24
3 files changed, 22 insertions(+), 6 deletions(-)
12
target/arm/translate-mve.c | 63 ++++++++++++++++++++++++++++++++++++++
13
4 files changed, 90 insertions(+), 1 deletion(-)
25
14
26
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
27
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/cpu.h
17
--- a/target/arm/helper-mve.h
29
+++ b/target/arm/cpu.h
18
+++ b/target/arm/helper-mve.h
30
@@ -XXX,XX +XXX,XX @@ enum arm_features {
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
31
ARM_FEATURE_VBAR, /* has cp15 VBAR */
20
DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
32
ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
21
DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
33
ARM_FEATURE_M_MAIN, /* M profile Main Extension */
22
34
+ ARM_FEATURE_V8_1M, /* M profile extras only in v8.1M and later */
23
+DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64)
35
};
24
+DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64)
36
25
+
37
static inline int arm_feature(CPUARMState *env, int feature)
26
DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
38
diff --git a/target/arm/m-nocp.decode b/target/arm/m-nocp.decode
27
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
28
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
39
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
40
--- a/target/arm/m-nocp.decode
31
--- a/target/arm/mve.decode
41
+++ b/target/arm/m-nocp.decode
32
+++ b/target/arm/mve.decode
42
@@ -XXX,XX +XXX,XX @@
33
@@ -XXX,XX +XXX,XX @@ VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
43
# If the coprocessor is not present or disabled then we will generate
34
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
44
# the NOCP exception; otherwise we let the insn through to the main decode.
35
45
36
# Vector add across vector
46
+&nocp cp
37
-VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
38
+{
39
+ VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
40
+ VADDLV 111 u:1 1110 1 ... 1001 ... 0 1111 00 a:1 0 qm:3 0 \
41
+ rdahi=%rdahi rdalo=%rdalo
42
+}
43
44
# Predicate operations
45
%mask_22_13 22:1 13:3
46
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve_helper.c
49
+++ b/target/arm/mve_helper.c
50
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvub, 1, uint8_t)
51
DO_VADDV(vaddvuh, 2, uint16_t)
52
DO_VADDV(vaddvuw, 4, uint32_t)
53
54
+#define DO_VADDLV(OP, TYPE, LTYPE) \
55
+ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
56
+ uint64_t ra) \
57
+ { \
58
+ uint16_t mask = mve_element_mask(env); \
59
+ unsigned e; \
60
+ TYPE *m = vm; \
61
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
62
+ if (mask & 1) { \
63
+ ra += (LTYPE)m[H4(e)]; \
64
+ } \
65
+ } \
66
+ mve_advance_vpt(env); \
67
+ return ra; \
68
+ } \
47
+
69
+
48
{
70
+DO_VADDLV(vaddlv_s, int32_t, int64_t)
49
# Special cases which do not take an early NOCP: VLLDM and VLSTM
71
+DO_VADDLV(vaddlv_u, uint32_t, uint64_t)
50
VLLDM_VLSTM 1110 1100 001 l:1 rn:4 0000 1010 0000 0000
72
+
51
# TODO: VSCCLRM (new in v8.1M) is similar:
73
/* Shifts by immediate */
52
#VSCCLRM 1110 1100 1-01 1111 ---- 1011 ---- ---0
74
#define DO_2SHIFT(OP, ESIZE, TYPE, FN) \
53
75
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
54
- NOCP 111- 1110 ---- ---- ---- cp:4 ---- ----
76
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
55
- NOCP 111- 110- ---- ---- ---- cp:4 ---- ----
56
- # TODO: From v8.1M onwards we will also want this range to NOCP
57
- #NOCP_8_1 111- 1111 ---- ---- ---- ---- ---- ---- cp=10
58
+ NOCP 111- 1110 ---- ---- ---- cp:4 ---- ---- &nocp
59
+ NOCP 111- 110- ---- ---- ---- cp:4 ---- ---- &nocp
60
+ # From v8.1M onwards this range will also NOCP:
61
+ NOCP_8_1 111- 1111 ---- ---- ---- ---- ---- ---- &nocp cp=10
62
}
63
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
64
index XXXXXXX..XXXXXXX 100644
77
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/translate-vfp.c.inc
78
--- a/target/arm/translate-mve.c
66
+++ b/target/arm/translate-vfp.c.inc
79
+++ b/target/arm/translate-mve.c
67
@@ -XXX,XX +XXX,XX @@ static bool trans_VLLDM_VLSTM(DisasContext *s, arg_VLLDM_VLSTM *a)
80
@@ -XXX,XX +XXX,XX @@ static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
68
return true;
81
return true;
69
}
82
}
70
83
71
-static bool trans_NOCP(DisasContext *s, arg_NOCP *a)
84
+static bool trans_VADDLV(DisasContext *s, arg_VADDLV *a)
72
+static bool trans_NOCP(DisasContext *s, arg_nocp *a)
73
{
74
/*
75
* Handle M-profile early check for disabled coprocessor:
76
@@ -XXX,XX +XXX,XX @@ static bool trans_NOCP(DisasContext *s, arg_NOCP *a)
77
if (a->cp == 11) {
78
a->cp = 10;
79
}
80
- /* TODO: in v8.1M cp 8, 9, 14, 15 also are governed by the cp10 enable */
81
+ if (arm_dc_feature(s, ARM_FEATURE_V8_1M) &&
82
+ (a->cp == 8 || a->cp == 9 || a->cp == 14 || a->cp == 15)) {
83
+ /* in v8.1M cp 8, 9, 14, 15 also are governed by the cp10 enable */
84
+ a->cp = 10;
85
+ }
86
87
if (a->cp != 10) {
88
gen_exception_insn(s, s->pc_curr, EXCP_NOCP,
89
@@ -XXX,XX +XXX,XX @@ static bool trans_NOCP(DisasContext *s, arg_NOCP *a)
90
return false;
91
}
92
93
+static bool trans_NOCP_8_1(DisasContext *s, arg_nocp *a)
94
+{
85
+{
95
+ /* This range needs a coprocessor check for v8.1M and later only */
86
+ /*
96
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
87
+ * Vector Add Long Across Vector: accumulate the 32-bit
88
+ * elements of the vector into a 64-bit result stored in
89
+ * a pair of general-purpose registers.
90
+ * No need to check Qm's bank: it is only 3 bits in decode.
91
+ */
92
+ TCGv_ptr qm;
93
+ TCGv_i64 rda;
94
+ TCGv_i32 rdalo, rdahi;
95
+
96
+ if (!dc_isar_feature(aa32_mve, s)) {
97
+ return false;
97
+ return false;
98
+ }
98
+ }
99
+ return trans_NOCP(s, a);
99
+ /*
100
+ * rdahi == 13 is UNPREDICTABLE; rdahi == 15 is a related
101
+ * encoding; rdalo always has bit 0 clear so cannot be 13 or 15.
102
+ */
103
+ if (a->rdahi == 13 || a->rdahi == 15) {
104
+ return false;
105
+ }
106
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
107
+ return true;
108
+ }
109
+
110
+ /*
111
+ * This insn is subject to beat-wise execution. Partial execution
112
+ * of an A=0 (no-accumulate) insn which does not execute the first
113
+ * beat must start with the current value of RdaHi:RdaLo, not zero.
114
+ */
115
+ if (a->a || mve_skip_first_beat(s)) {
116
+ /* Accumulate input from RdaHi:RdaLo */
117
+ rda = tcg_temp_new_i64();
118
+ rdalo = load_reg(s, a->rdalo);
119
+ rdahi = load_reg(s, a->rdahi);
120
+ tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
121
+ tcg_temp_free_i32(rdalo);
122
+ tcg_temp_free_i32(rdahi);
123
+ } else {
124
+ /* Accumulate starting at zero */
125
+ rda = tcg_const_i64(0);
126
+ }
127
+
128
+ qm = mve_qreg_ptr(a->qm);
129
+ if (a->u) {
130
+ gen_helper_mve_vaddlv_u(rda, cpu_env, qm, rda);
131
+ } else {
132
+ gen_helper_mve_vaddlv_s(rda, cpu_env, qm, rda);
133
+ }
134
+ tcg_temp_free_ptr(qm);
135
+
136
+ rdalo = tcg_temp_new_i32();
137
+ rdahi = tcg_temp_new_i32();
138
+ tcg_gen_extrl_i64_i32(rdalo, rda);
139
+ tcg_gen_extrh_i64_i32(rdahi, rda);
140
+ store_reg(s, a->rdalo, rdalo);
141
+ store_reg(s, a->rdahi, rdahi);
142
+ tcg_temp_free_i64(rda);
143
+ mve_update_eci(s);
144
+ return true;
100
+}
145
+}
101
+
146
+
102
static bool trans_VINS(DisasContext *s, arg_VINS *a)
147
static bool do_1imm(DisasContext *s, arg_1imm *a, MVEGenOneOpImmFn *fn)
103
{
148
{
104
TCGv_i32 rd, rm;
149
TCGv_ptr qd;
105
--
150
--
106
2.20.1
151
2.20.1
107
152
108
153
1
v8.1M implements a new 'branch future' feature, which is a
1
The MVE extension to v8.1M includes some new shift instructions which
2
set of instructions that request the CPU to perform a branch
2
sit entirely within the non-coprocessor part of the encoding space
3
"in the future", when it reaches a particular execution address.
3
and which operate only on general-purpose registers. They take up
4
In hardware, the expected implementation is that the information
4
the space which was previously UNPREDICTABLE MOVS and ORRS encodings
5
about the branch location and destination is cached and then
5
with Rm == 13 or 15.
6
acted upon when execution reaches the specified address.
6
7
However the architecture permits an implementation to discard
7
Implement the long shifts by immediate, which perform shifts on a
8
this cached information at any point, and so guest code must
8
pair of general-purpose registers treated as a 64-bit quantity, with
9
always include a normal branch insn at the branch point as
9
an immediate shift count between 1 and 32.
10
a fallback. In particular, an implementation is specifically
10
11
permitted to treat all BF insns as NOPs (which is equivalent
11
Awkwardly, because the MOVS and ORRS trans functions do not UNDEF for
12
to discarding the cached information immediately).
12
the Rm==13,15 case, we need to explicitly emit code to UNDEF for the
13
13
cases where v8.1M now requires that. (Trying to change MOVS and ORRS
14
For QEMU, implementing this caching of branch information
14
is too difficult, because the functions that generate the code are
15
would be complicated and would not improve the speed of
15
shared between a dozen different kinds of arithmetic or logical
16
execution at all, so we make the IMPDEF choice to implement
16
instruction for all A32, T16 and T32 encodings, and for some insns
17
all BF insns as NOPs.
17
and some encodings Rm==13,15 are valid.)
18
18
19
We make the helper functions we need for UQSHLL and SQSHLL take
20
a 32-bit value which the helper casts to int8_t because we'll need
21
these helpers also for the shift-by-register insns, where the shift
22
count might be < 0 or > 32.
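
As a sketch of what a long shift by immediate does to the register pair, the following standalone function models the logical-shift-left case. `lsll_pair` is a hypothetical name, and the function assumes the encoded count is a 5-bit field (0..31) in which 0 stands for a shift by 32, as in the translate.c hunk of this patch.

```c
#include <stdint.h>

/*
 * Logical shift left of the 64-bit quantity held in RdaHi:RdaLo.
 * Returns the shifted 64-bit value; the caller would write bits [31:0]
 * back to RdaLo and bits [63:32] back to RdaHi.
 */
static uint64_t lsll_pair(uint32_t rdalo, uint32_t rdahi,
                          unsigned encoded_shift)
{
    /* An encoded count of 0 stands for a shift by 32. */
    unsigned shift = encoded_shift == 0 ? 32 : encoded_shift;
    uint64_t rda = ((uint64_t)rdahi << 32) | rdalo;

    /* shift is in 1..32 here, so the 64-bit shift is well defined. */
    return rda << shift;
}
```

The right-shift variants follow the same concatenate, shift, split pattern with a different 64-bit operation in the middle.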
23
24
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
25
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
26
Message-id: 20210628135835.6690-16-peter.maydell@linaro.org
21
Message-id: 20201019151301.2046-7-peter.maydell@linaro.org
22
---
27
---
23
target/arm/cpu.h | 6 ++++++
28
target/arm/helper-mve.h | 3 ++
24
target/arm/t32.decode | 13 ++++++++++++-
29
target/arm/translate.h | 1 +
25
target/arm/translate.c | 20 ++++++++++++++++++++
30
target/arm/t32.decode | 28 +++++++++++++
26
3 files changed, 38 insertions(+), 1 deletion(-)
31
target/arm/mve_helper.c | 10 +++++
27
32
target/arm/translate.c | 90 +++++++++++++++++++++++++++++++++++++++++
28
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
33
5 files changed, 132 insertions(+)
29
index XXXXXXX..XXXXXXX 100644
34
30
--- a/target/arm/cpu.h
35
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
31
+++ b/target/arm/cpu.h
36
index XXXXXXX..XXXXXXX 100644
32
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_arm_div(const ARMISARegisters *id)
37
--- a/target/arm/helper-mve.h
33
return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
38
+++ b/target/arm/helper-mve.h
34
}
39
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
40
DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
+static inline bool isar_feature_aa32_lob(const ARMISARegisters *id)
41
37
+{
42
DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
38
+ /* (M-profile) low-overhead loops and branch future */
43
+
39
+ return FIELD_EX32(id->id_isar0, ID_ISAR0, CMPBRANCH) >= 3;
44
+DEF_HELPER_FLAGS_3(mve_sqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
40
+}
45
+DEF_HELPER_FLAGS_3(mve_uqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
41
+
46
diff --git a/target/arm/translate.h b/target/arm/translate.h
42
static inline bool isar_feature_aa32_jazelle(const ARMISARegisters *id)
47
index XXXXXXX..XXXXXXX 100644
43
{
48
--- a/target/arm/translate.h
44
return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
49
+++ b/target/arm/translate.h
50
@@ -XXX,XX +XXX,XX @@ typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
51
typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
52
typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
53
typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
54
+typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
55
56
/**
57
* arm_tbflags_from_tb:
45
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
58
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
46
index XXXXXXX..XXXXXXX 100644
59
index XXXXXXX..XXXXXXX 100644
47
--- a/target/arm/t32.decode
60
--- a/target/arm/t32.decode
48
+++ b/target/arm/t32.decode
61
+++ b/target/arm/t32.decode
49
@@ -XXX,XX +XXX,XX @@ MRC 1110 1110 ... 1 .... .... .... ... 1 .... @mcr
62
@@ -XXX,XX +XXX,XX @@
50
63
&mcr !extern cp opc1 crn crm opc2 rt
51
B 1111 0. .......... 10.1 ............ @branch24
64
&mcrr !extern cp opc1 crm rt rt2
52
BL 1111 0. .......... 11.1 ............ @branch24
65
53
-BLX_i 1111 0. .......... 11.0 ............ @branch24
66
+&mve_shl_ri rdalo rdahi shim
54
+{
67
+
55
+ # BLX_i is non-M-profile only
68
+# rdahi: bits [3:1] from insn, bit 0 is 1
56
+ BLX_i 1111 0. .......... 11.0 ............ @branch24
69
+# rdalo: bits [3:1] from insn, bit 0 is 0
57
+ # M-profile only: loop and branch insns
70
+%rdahi_9 9:3 !function=times_2_plus_1
71
+%rdalo_17 17:3 !function=times_2
72
+
73
# Data-processing (register)
74
75
%imm5_12_6 12:3 6:2
76
@@ -XXX,XX +XXX,XX @@
77
@S_xrr_shi ....... .... . rn:4 .... .... .. shty:2 rm:4 \
78
&s_rrr_shi shim=%imm5_12_6 s=1 rd=0
79
80
+@mve_shl_ri ....... .... . ... . . ... ... . .. .. .... \
81
+ &mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
82
+
83
{
84
TST_xrri 1110101 0000 1 .... 0 ... 1111 .... .... @S_xrr_shi
85
AND_rrri 1110101 0000 . .... 0 ... .... .... .... @s_rrr_shi
86
}
87
BIC_rrri 1110101 0001 . .... 0 ... .... .... .... @s_rrr_shi
88
{
89
+ # The v8.1M MVE shift insns overlap in encoding with MOVS/ORRS
90
+ # and are distinguished by having Rm==13 or 15. Those are UNPREDICTABLE
91
+ # cases for MOVS/ORRS. We decode the MVE cases first, ensuring that
92
+ # they explicitly call unallocated_encoding() for cases that must UNDEF
93
+ # (eg "using a new shift insn on a v8.1M CPU without MVE"), and letting
94
+ # the rest fall through (where ORR_rrri and MOV_rxri will end up
95
+ # handling them as r13 and r15 accesses with the same semantics as A32).
58
+ [
96
+ [
59
+ # All these BF insns have boff != 0b0000; we NOP them all
97
+ LSLL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 00 1111 @mve_shl_ri
60
+ BF 1111 0 boff:4 ------- 1100 - ---------- 1 # BFL
98
+ LSRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 01 1111 @mve_shl_ri
61
+ BF 1111 0 boff:4 0 ------ 1110 - ---------- 1 # BFCSEL
99
+ ASRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 10 1111 @mve_shl_ri
62
+ BF 1111 0 boff:4 10 ----- 1110 - ---------- 1 # BF
100
+
63
+ BF 1111 0 boff:4 11 ----- 1110 0 0000000000 1 # BFX, BFLX
101
+ UQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 00 1111 @mve_shl_ri
102
+ URSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111 @mve_shl_ri
103
+ SRSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111 @mve_shl_ri
104
+ SQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111 @mve_shl_ri
64
+ ]
105
+ ]
106
+
107
MOV_rxri 1110101 0010 . 1111 0 ... .... .... .... @s_rxr_shi
108
ORR_rrri 1110101 0010 . .... 0 ... .... .... .... @s_rrr_shi
109
}
110
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
111
index XXXXXXX..XXXXXXX 100644
112
--- a/target/arm/mve_helper.c
113
+++ b/target/arm/mve_helper.c
114
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
115
mve_advance_vpt(env);
116
return rdm;
117
}
118
+
119
+uint64_t HELPER(mve_sqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
120
+{
121
+ return do_sqrshl_d(n, (int8_t)shift, false, &env->QF);
122
+}
123
+
124
+uint64_t HELPER(mve_uqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
125
+{
126
+ return do_uqrshl_d(n, (int8_t)shift, false, &env->QF);
65
+}
127
+}
66
diff --git a/target/arm/translate.c b/target/arm/translate.c
128
diff --git a/target/arm/translate.c b/target/arm/translate.c
67
index XXXXXXX..XXXXXXX 100644
129
index XXXXXXX..XXXXXXX 100644
68
--- a/target/arm/translate.c
130
--- a/target/arm/translate.c
69
+++ b/target/arm/translate.c
131
+++ b/target/arm/translate.c
70
@@ -XXX,XX +XXX,XX @@ static bool trans_BLX_suffix(DisasContext *s, arg_BLX_suffix *a)
132
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVT(DisasContext *s, arg_MOVW *a)
71
return true;
133
return true;
72
}
134
}
73
135
74
+static bool trans_BF(DisasContext *s, arg_BF *a)
136
+/*
75
+{
137
+ * v8.1M MVE wide-shifts
76
+ /*
138
+ */
77
+ * M-profile branch future insns. The architecture permits an
139
+static bool do_mve_shl_ri(DisasContext *s, arg_mve_shl_ri *a,
78
+ * implementation to implement these as NOPs (equivalent to
140
+ WideShiftImmFn *fn)
79
+ * discarding the LO_BRANCH_INFO cache immediately), and we
141
+{
80
+ * take that IMPDEF option because for QEMU a "real" implementation
142
+ TCGv_i64 rda;
81
+ * would be complicated and wouldn't execute any faster.
143
+ TCGv_i32 rdalo, rdahi;
82
+ */
144
+
83
+ if (!dc_isar_feature(aa32_lob, s)) {
145
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
146
+ /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
84
+ return false;
147
+ return false;
85
+ }
148
+ }
86
+ if (a->boff == 0) {
149
+ if (a->rdahi == 15) {
87
+ /* SEE "Related encodings" (loop insns) */
150
+ /* These are a different encoding (SQSHL/SRSHR/UQSHL/URSHR) */
88
+ return false;
151
+ return false;
89
+ }
152
+ }
90
+ /* Handle as NOP */
153
+ if (!dc_isar_feature(aa32_mve, s) ||
154
+ !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
155
+ a->rdahi == 13) {
156
+ /* RdaHi == 13 is UNPREDICTABLE; we choose to UNDEF */
157
+ unallocated_encoding(s);
158
+ return true;
159
+ }
160
+
161
+ if (a->shim == 0) {
162
+ a->shim = 32;
163
+ }
164
+
165
+ rda = tcg_temp_new_i64();
166
+ rdalo = load_reg(s, a->rdalo);
167
+ rdahi = load_reg(s, a->rdahi);
168
+ tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
169
+
170
+ fn(rda, rda, a->shim);
171
+
172
+ tcg_gen_extrl_i64_i32(rdalo, rda);
173
+ tcg_gen_extrh_i64_i32(rdahi, rda);
174
+ store_reg(s, a->rdalo, rdalo);
175
+ store_reg(s, a->rdahi, rdahi);
176
+ tcg_temp_free_i64(rda);
177
+
91
+ return true;
178
+ return true;
92
+}
179
+}
93
+
180
+
94
static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half)
181
+static bool trans_ASRL_ri(DisasContext *s, arg_mve_shl_ri *a)
95
{
182
+{
96
TCGv_i32 addr, tmp;
183
+ return do_mve_shl_ri(s, a, tcg_gen_sari_i64);
184
+}
185
+
186
+static bool trans_LSLL_ri(DisasContext *s, arg_mve_shl_ri *a)
187
+{
188
+ return do_mve_shl_ri(s, a, tcg_gen_shli_i64);
189
+}
190
+
191
+static bool trans_LSRL_ri(DisasContext *s, arg_mve_shl_ri *a)
192
+{
193
+ return do_mve_shl_ri(s, a, tcg_gen_shri_i64);
194
+}
195
+
196
+static void gen_mve_sqshll(TCGv_i64 r, TCGv_i64 n, int64_t shift)
197
+{
198
+ gen_helper_mve_sqshll(r, cpu_env, n, tcg_constant_i32(shift));
199
+}
200
+
201
+static bool trans_SQSHLL_ri(DisasContext *s, arg_mve_shl_ri *a)
202
+{
203
+ return do_mve_shl_ri(s, a, gen_mve_sqshll);
204
+}
205
+
206
+static void gen_mve_uqshll(TCGv_i64 r, TCGv_i64 n, int64_t shift)
207
+{
208
+ gen_helper_mve_uqshll(r, cpu_env, n, tcg_constant_i32(shift));
209
+}
210
+
211
+static bool trans_UQSHLL_ri(DisasContext *s, arg_mve_shl_ri *a)
212
+{
213
+ return do_mve_shl_ri(s, a, gen_mve_uqshll);
214
+}
215
+
216
+static bool trans_SRSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
217
+{
218
+ return do_mve_shl_ri(s, a, gen_srshr64_i64);
219
+}
220
+
221
+static bool trans_URSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
222
+{
223
+ return do_mve_shl_ri(s, a, gen_urshr64_i64);
224
+}
225
+
226
/*
227
* Multiply and multiply accumulate
228
*/
97
--
229
--
98
2.20.1
230
2.20.1
99
231
100
232
1
v8.1M brings four new insns to M-profile:
1
Implement the MVE long shifts by register, which perform shifts on a
2
* CSEL : Rd = cond ? Rn : Rm
2
pair of general-purpose registers treated as a 64-bit quantity, with
3
* CSINC : Rd = cond ? Rn : Rm+1
3
the shift count in another general-purpose register, which might be
4
* CSINV : Rd = cond ? Rn : ~Rm
4
either positive or negative.
5
* CSNEG : Rd = cond ? Rn : -Rm
5
6
6
Like the long-shifts-by-immediate, these encodings sit in the space
7
Implement these.
7
that was previously the UNPREDICTABLE MOVS/ORRS with Rm==13,15.
8
8
Because LSLL_rr and ASRL_rr overlap with both MOV_rxri/ORR_rrri and
9
also with CSEL (as one of the previously-UNPREDICTABLE Rm==13 cases),
10
we have to move the CSEL pattern into the same decodetree group.
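
The CSEL family referred to here selects between Rn and a transformed Rm. A minimal sketch of the four forms, as listed in the earlier CSEL patch description, is given below; the enum and `cond_select` are hypothetical names and say nothing about the instruction encoding.

```c
#include <stdbool.h>
#include <stdint.h>

enum csel_kind { SK_CSEL, SK_CSINC, SK_CSINV, SK_CSNEG };

/* Rd = cond ? Rn : f(Rm), where f depends on the instruction form. */
static uint32_t cond_select(enum csel_kind kind, bool cond,
                            uint32_t rn, uint32_t rm)
{
    if (cond) {
        return rn;
    }
    switch (kind) {
    case SK_CSINC:
        return rm + 1;      /* wraps modulo 2^32 */
    case SK_CSINV:
        return ~rm;
    case SK_CSNEG:
        return 0u - rm;     /* two's complement negation */
    case SK_CSEL:
    default:
        return rm;
    }
}
```

Unsigned arithmetic is used throughout so that the increment and negation wrap rather than overflow.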
11
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 20210628135835.6690-17-peter.maydell@linaro.org
11
Message-id: 20201019151301.2046-4-peter.maydell@linaro.org
12
---
15
---
13
target/arm/t32.decode | 3 +++
16
target/arm/helper-mve.h | 6 +++
14
target/arm/translate.c | 60 ++++++++++++++++++++++++++++++++++++++++++
17
target/arm/translate.h | 1 +
15
2 files changed, 63 insertions(+)
18
target/arm/t32.decode | 16 +++++--
16
19
target/arm/mve_helper.c | 93 +++++++++++++++++++++++++++++++++++++++++
20
target/arm/translate.c | 69 ++++++++++++++++++++++++++++++
21
5 files changed, 182 insertions(+), 3 deletions(-)
22
23
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
24
index XXXXXXX..XXXXXXX 100644
25
--- a/target/arm/helper-mve.h
26
+++ b/target/arm/helper-mve.h
27
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
29
DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
30
31
+DEF_HELPER_FLAGS_3(mve_sshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
32
+DEF_HELPER_FLAGS_3(mve_ushll, TCG_CALL_NO_RWG, i64, env, i64, i32)
33
DEF_HELPER_FLAGS_3(mve_sqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
34
DEF_HELPER_FLAGS_3(mve_uqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
35
+DEF_HELPER_FLAGS_3(mve_sqrshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
36
+DEF_HELPER_FLAGS_3(mve_uqrshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
37
+DEF_HELPER_FLAGS_3(mve_sqrshrl48, TCG_CALL_NO_RWG, i64, env, i64, i32)
38
+DEF_HELPER_FLAGS_3(mve_uqrshll48, TCG_CALL_NO_RWG, i64, env, i64, i32)
39
diff --git a/target/arm/translate.h b/target/arm/translate.h
40
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/translate.h
42
+++ b/target/arm/translate.h
43
@@ -XXX,XX +XXX,XX @@ typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
44
typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
45
typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
46
typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
47
+typedef void WideShiftFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i32);
48
49
/**
50
* arm_tbflags_from_tb:
17
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
51
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
18
index XXXXXXX..XXXXXXX 100644
52
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/t32.decode
53
--- a/target/arm/t32.decode
20
+++ b/target/arm/t32.decode
54
+++ b/target/arm/t32.decode
55
@@ -XXX,XX +XXX,XX @@
56
&mcrr !extern cp opc1 crm rt rt2
57
58
&mve_shl_ri rdalo rdahi shim
59
+&mve_shl_rr rdalo rdahi rm
60
61
# rdahi: bits [3:1] from insn, bit 0 is 1
62
# rdalo: bits [3:1] from insn, bit 0 is 0
63
@@ -XXX,XX +XXX,XX @@
64
65
@mve_shl_ri ....... .... . ... . . ... ... . .. .. .... \
66
&mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
67
+@mve_shl_rr ....... .... . ... . rm:4 ... . .. .. .... \
68
+ &mve_shl_rr rdalo=%rdalo_17 rdahi=%rdahi_9
69
70
{
71
TST_xrri 1110101 0000 1 .... 0 ... 1111 .... .... @S_xrr_shi
72
@@ -XXX,XX +XXX,XX @@ BIC_rrri 1110101 0001 . .... 0 ... .... .... .... @s_rrr_shi
73
URSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111 @mve_shl_ri
74
SRSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111 @mve_shl_ri
75
SQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111 @mve_shl_ri
76
+
77
+ LSLL_rr 1110101 0010 1 ... 0 .... ... 1 0000 1101 @mve_shl_rr
78
+ ASRL_rr 1110101 0010 1 ... 0 .... ... 1 0010 1101 @mve_shl_rr
79
+ UQRSHLL64_rr 1110101 0010 1 ... 1 .... ... 1 0000 1101 @mve_shl_rr
80
+ SQRSHRL64_rr 1110101 0010 1 ... 1 .... ... 1 0010 1101 @mve_shl_rr
81
+ UQRSHLL48_rr 1110101 0010 1 ... 1 .... ... 1 1000 1101 @mve_shl_rr
82
+ SQRSHRL48_rr 1110101 0010 1 ... 1 .... ... 1 1010 1101 @mve_shl_rr
83
]
84
85
MOV_rxri 1110101 0010 . 1111 0 ... .... .... .... @s_rxr_shi
86
ORR_rrri 1110101 0010 . .... 0 ... .... .... .... @s_rrr_shi
87
+
88
+ # v8.1M CSEL and friends
89
+ CSEL 1110101 0010 1 rn:4 10 op:2 rd:4 fcond:4 rm:4
90
}
91
{
92
MVN_rxri 1110101 0011 . 1111 0 ... .... .... .... @s_rxr_shi
21
@@ -XXX,XX +XXX,XX @@ SBC_rrri 1110101 1011 . .... 0 ... .... .... .... @s_rrr_shi
93
@@ -XXX,XX +XXX,XX @@ SBC_rrri 1110101 1011 . .... 0 ... .... .... .... @s_rrr_shi
22
}
94
}
23
RSB_rrri 1110101 1110 . .... 0 ... .... .... .... @s_rrr_shi
95
RSB_rrri 1110101 1110 . .... 0 ... .... .... .... @s_rrr_shi
24
96
25
+# v8.1M CSEL and friends
97
-# v8.1M CSEL and friends
26
+CSEL 1110101 0010 1 rn:4 10 op:2 rd:4 fcond:4 rm:4
98
-CSEL 1110101 0010 1 rn:4 10 op:2 rd:4 fcond:4 rm:4
27
+
99
-
28
# Data-processing (register-shifted register)
100
# Data-processing (register-shifted register)
29
101
30
MOV_rxrr 1111 1010 0 shty:2 s:1 rm:4 1111 rd:4 0000 rs:4 \
102
MOV_rxrr 1111 1010 0 shty:2 s:1 rm:4 1111 rd:4 0000 rs:4 \
103
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/target/arm/mve_helper.c
106
+++ b/target/arm/mve_helper.c
107
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
108
return rdm;
109
}
110
111
+uint64_t HELPER(mve_sshrl)(CPUARMState *env, uint64_t n, uint32_t shift)
112
+{
113
+ return do_sqrshl_d(n, -(int8_t)shift, false, NULL);
114
+}
115
+
116
+uint64_t HELPER(mve_ushll)(CPUARMState *env, uint64_t n, uint32_t shift)
117
+{
118
+ return do_uqrshl_d(n, (int8_t)shift, false, NULL);
119
+}
120
+
121
uint64_t HELPER(mve_sqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
122
{
123
return do_sqrshl_d(n, (int8_t)shift, false, &env->QF);
124
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mve_uqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
125
{
126
return do_uqrshl_d(n, (int8_t)shift, false, &env->QF);
127
}
128
+
129
+uint64_t HELPER(mve_sqrshrl)(CPUARMState *env, uint64_t n, uint32_t shift)
130
+{
131
+ return do_sqrshl_d(n, -(int8_t)shift, true, &env->QF);
132
+}
133
+
134
+uint64_t HELPER(mve_uqrshll)(CPUARMState *env, uint64_t n, uint32_t shift)
135
+{
136
+ return do_uqrshl_d(n, (int8_t)shift, true, &env->QF);
137
+}
138
+
139
+/* Operate on 64-bit values, but saturate at 48 bits */
140
+static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
141
+ bool round, uint32_t *sat)
142
+{
143
+ if (shift <= -48) {
144
+ /* Rounding the sign bit always produces 0. */
145
+ if (round) {
146
+ return 0;
147
+ }
148
+ return src >> 63;
149
+ } else if (shift < 0) {
150
+ if (round) {
151
+ src >>= -shift - 1;
152
+ return (src >> 1) + (src & 1);
153
+ }
154
+ return src >> -shift;
155
+ } else if (shift < 48) {
156
+ int64_t val = src << shift;
157
+ int64_t extval = sextract64(val, 0, 48);
158
+ if (!sat || val == extval) {
159
+ return extval;
160
+ }
161
+ } else if (!sat || src == 0) {
162
+ return 0;
163
+ }
164
+
165
+ *sat = 1;
166
+ return (1ULL << 47) - (src >= 0);
167
+}
168
+
169
+/* Operate on 64-bit values, but saturate at 48 bits */
170
+static inline uint64_t do_uqrshl48_d(uint64_t src, int64_t shift,
171
+ bool round, uint32_t *sat)
172
+{
173
+ uint64_t val, extval;
174
+
175
+ if (shift <= -(48 + round)) {
176
+ return 0;
177
+ } else if (shift < 0) {
178
+ if (round) {
179
+ val = src >> (-shift - 1);
180
+ val = (val >> 1) + (val & 1);
181
+ } else {
182
+ val = src >> -shift;
183
+ }
184
+ extval = extract64(val, 0, 48);
185
+ if (!sat || val == extval) {
186
+ return extval;
187
+ }
188
+ } else if (shift < 48) {
189
+ uint64_t val = src << shift;
190
+ uint64_t extval = extract64(val, 0, 48);
191
+ if (!sat || val == extval) {
192
+ return extval;
193
+ }
194
+ } else if (!sat || src == 0) {
195
+ return 0;
196
+ }
197
+
198
+ *sat = 1;
199
+ return MAKE_64BIT_MASK(0, 48);
200
+}
201
+
202
+uint64_t HELPER(mve_sqrshrl48)(CPUARMState *env, uint64_t n, uint32_t shift)
203
+{
204
+ return do_sqrshl48_d(n, -(int8_t)shift, true, &env->QF);
205
+}
206
+
207
+uint64_t HELPER(mve_uqrshll48)(CPUARMState *env, uint64_t n, uint32_t shift)
208
+{
209
+ return do_uqrshl48_d(n, (int8_t)shift, true, &env->QF);
210
+}
31
diff --git a/target/arm/translate.c b/target/arm/translate.c
211
diff --git a/target/arm/translate.c b/target/arm/translate.c
32
index XXXXXXX..XXXXXXX 100644
212
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate.c
213
--- a/target/arm/translate.c
34
+++ b/target/arm/translate.c
214
+++ b/target/arm/translate.c
35
@@ -XXX,XX +XXX,XX @@ static bool trans_IT(DisasContext *s, arg_IT *a)
215
@@ -XXX,XX +XXX,XX @@ static bool trans_URSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
36
return true;
216
return do_mve_shl_ri(s, a, gen_urshr64_i64);
37
}
217
}
38
218
39
+/* v8.1M CSEL/CSINC/CSNEG/CSINV */
219
+static bool do_mve_shl_rr(DisasContext *s, arg_mve_shl_rr *a, WideShiftFn *fn)
40
+static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
220
+{
41
+{
221
+ TCGv_i64 rda;
42
+ TCGv_i32 rn, rm, zero;
222
+ TCGv_i32 rdalo, rdahi;
43
+ DisasCompare c;
44
+
223
+
45
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
224
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
225
+ /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
46
+ return false;
226
+ return false;
47
+ }
227
+ }
48
+
228
+ if (a->rdahi == 15) {
49
+ if (a->rm == 13) {
229
+ /* These are a different encoding (SQSHL/SRSHR/UQSHL/URSHR) */
50
+ /* SEE "Related encodings" (MVE shifts) */
51
+ return false;
230
+ return false;
52
+ }
231
+ }
53
+
232
+ if (!dc_isar_feature(aa32_mve, s) ||
54
+ if (a->rd == 13 || a->rd == 15 || a->rn == 13 || a->fcond >= 14) {
233
+ !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
55
+ /* CONSTRAINED UNPREDICTABLE: we choose to UNDEF */
234
+ a->rdahi == 13 || a->rm == 13 || a->rm == 15 ||
56
+ return false;
235
+ a->rm == a->rdahi || a->rm == a->rdalo) {
57
+ }
236
+ /* These rdahi/rdalo/rm cases are UNPREDICTABLE; we choose to UNDEF */
58
+
237
+ unallocated_encoding(s);
59
+ /* In this insn input reg fields of 0b1111 mean "zero", not "PC" */
238
+ return true;
60
+ if (a->rn == 15) {
239
+ }
61
+ rn = tcg_const_i32(0);
240
+
62
+ } else {
241
+ rda = tcg_temp_new_i64();
63
+ rn = load_reg(s, a->rn);
242
+ rdalo = load_reg(s, a->rdalo);
64
+ }
243
+ rdahi = load_reg(s, a->rdahi);
65
+ if (a->rm == 15) {
244
+ tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
66
+ rm = tcg_const_i32(0);
245
+
67
+ } else {
246
+ /* The helper takes care of the sign-extension of the low 8 bits of Rm */
68
+ rm = load_reg(s, a->rm);
247
+ fn(rda, cpu_env, rda, cpu_R[a->rm]);
69
+ }
248
+
70
+
249
+ tcg_gen_extrl_i64_i32(rdalo, rda);
71
+ switch (a->op) {
250
+ tcg_gen_extrh_i64_i32(rdahi, rda);
72
+ case 0: /* CSEL */
251
+ store_reg(s, a->rdalo, rdalo);
73
+ break;
252
+ store_reg(s, a->rdahi, rdahi);
74
+ case 1: /* CSINC */
253
+ tcg_temp_free_i64(rda);
75
+ tcg_gen_addi_i32(rm, rm, 1);
76
+ break;
77
+ case 2: /* CSINV */
78
+ tcg_gen_not_i32(rm, rm);
79
+ break;
80
+ case 3: /* CSNEG */
81
+ tcg_gen_neg_i32(rm, rm);
82
+ break;
83
+ default:
84
+ g_assert_not_reached();
85
+ }
86
+
87
+ arm_test_cc(&c, a->fcond);
88
+ zero = tcg_const_i32(0);
89
+ tcg_gen_movcond_i32(c.cond, rn, c.value, zero, rn, rm);
90
+ arm_free_cc(&c);
91
+ tcg_temp_free_i32(zero);
92
+
93
+ store_reg(s, a->rd, rn);
94
+ tcg_temp_free_i32(rm);
95
+
254
+
96
+ return true;
255
+ return true;
97
+}
256
+}
98
+
257
+
258
+static bool trans_LSLL_rr(DisasContext *s, arg_mve_shl_rr *a)
259
+{
260
+ return do_mve_shl_rr(s, a, gen_helper_mve_ushll);
261
+}
262
+
263
+static bool trans_ASRL_rr(DisasContext *s, arg_mve_shl_rr *a)
264
+{
265
+ return do_mve_shl_rr(s, a, gen_helper_mve_sshrl);
266
+}
267
+
268
+static bool trans_UQRSHLL64_rr(DisasContext *s, arg_mve_shl_rr *a)
269
+{
270
+ return do_mve_shl_rr(s, a, gen_helper_mve_uqrshll);
271
+}
272
+
273
+static bool trans_SQRSHRL64_rr(DisasContext *s, arg_mve_shl_rr *a)
274
+{
275
+ return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl);
276
+}
277
+
278
+static bool trans_UQRSHLL48_rr(DisasContext *s, arg_mve_shl_rr *a)
279
+{
280
+ return do_mve_shl_rr(s, a, gen_helper_mve_uqrshll48);
281
+}
282
+
283
+static bool trans_SQRSHRL48_rr(DisasContext *s, arg_mve_shl_rr *a)
284
+{
285
+ return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl48);
286
+}
287
+
99
/*
288
/*
100
* Legacy decoder.
289
* Multiply and multiply accumulate
101
*/
290
*/
102
--
291
--
103
2.20.1
292
2.20.1
104
293
105
294
Deleted patch
The t32 decode has a group which represents a set of insns
which overlap with B_cond_thumb because they have [25:23]=111
(which is an invalid condition code field for the branch insn).
This group is currently defined using the {} overlap-OK syntax,
but it is almost entirely non-overlapping patterns. Switch
it over to use a non-overlapping group.

For this to be valid syntactically, CPS must move into the same
overlapping-group as the hint insns (CPS vs hints was the
only actual use of the overlap facility for the group).

The non-overlapping subgroup for CLREX/DSB/DMB/ISB/SB is no longer
necessary and so we can remove it (promoting those insns to
be members of the parent group).
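
As an illustration of why CPS has to share an overlap-OK group with the
hints while DSB/DMB and friends can sit directly in a non-overlapping
group, here is a minimal sketch in C. It is not decodetree code: the
Pattern type and patterns_overlap() are hypothetical names, and the
mask/value constants are transcribed by hand from the fixed bits of the
t32.decode lines in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of one fixed-bit pattern: an insn matches when
 * (insn & mask) == value.
 */
typedef struct {
    uint32_t mask;
    uint32_t value;
} Pattern;

/*
 * Two patterns overlap when at least one insn matches both, that is,
 * when they agree on every bit that both of them fix.
 */
static bool patterns_overlap(Pattern a, Pattern b)
{
    return ((a.value ^ b.value) & a.mask & b.mask) == 0;
}

/* NOP   1111 0011 1010 1111 1000 0000 ---- ---- */
static const Pattern pat_nop = { 0xffffff00u, 0xf3af8000u };
/* CPS   1111 0011 1010 1111 1000 0 imod:2 M:1 A:1 I:1 F:1 mode:5 */
static const Pattern pat_cps = { 0xfffff800u, 0xf3af8000u };
/* DSB   1111 0011 1011 1111 1000 1111 0100 ---- */
static const Pattern pat_dsb = { 0xfffffff0u, 0xf3bf8f40u };
/* DMB   1111 0011 1011 1111 1000 1111 0101 ---- */
static const Pattern pat_dmb = { 0xfffffff0u, 0xf3bf8f50u };
```

Under this model patterns_overlap(pat_nop, pat_cps) is true, so those
two need ordered matching inside {}, while DSB and DMB differ in a bit
that both fix and can live side by side inside [].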

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20201019151301.2046-5-peter.maydell@linaro.org
---
target/arm/t32.decode | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -XXX,XX +XXX,XX @@ CLZ 1111 1010 1011 ---- 1111 .... 1000 .... @rdm
{
# Group insn[25:23] = 111, which is cond=111x for the branch below,
# or unconditional, which would be illegal for the branch.
- {
- # Hints
+ [
+ # Hints, and CPS
{
YIELD 1111 0011 1010 1111 1000 0000 0000 0001
WFE 1111 0011 1010 1111 1000 0000 0000 0010
@@ -XXX,XX +XXX,XX @@ CLZ 1111 1010 1011 ---- 1111 .... 1000 .... @rdm
# The canonical nop ends in 0000 0000, but the whole rest
# of the space is "reserved hint, behaves as nop".
NOP 1111 0011 1010 1111 1000 0000 ---- ----
+
+ # If imod == '00' && M == '0' then SEE "Hint instructions", above.
+ CPS 1111 0011 1010 1111 1000 0 imod:2 M:1 A:1 I:1 F:1 mode:5 \
+ &cps
}

- # If imod == '00' && M == '0' then SEE "Hint instructions", above.
- CPS 1111 0011 1010 1111 1000 0 imod:2 M:1 A:1 I:1 F:1 mode:5 \
- &cps
-
# Miscellaneous control
- [
- CLREX 1111 0011 1011 1111 1000 1111 0010 1111
- DSB 1111 0011 1011 1111 1000 1111 0100 ----
- DMB 1111 0011 1011 1111 1000 1111 0101 ----
- ISB 1111 0011 1011 1111 1000 1111 0110 ----
- SB 1111 0011 1011 1111 1000 1111 0111 0000
- ]
+ CLREX 1111 0011 1011 1111 1000 1111 0010 1111
+ DSB 1111 0011 1011 1111 1000 1111 0100 ----
+ DMB 1111 0011 1011 1111 1000 1111 0101 ----
+ ISB 1111 0011 1011 1111 1000 1111 0110 ----
+ SB 1111 0011 1011 1111 1000 1111 0111 0000

# Note that the v7m insn overlaps both the normal and banked insn.
{
@@ -XXX,XX +XXX,XX @@ CLZ 1111 1010 1011 ---- 1111 .... 1000 .... @rdm
HVC 1111 0111 1110 .... 1000 .... .... .... \
&i imm=%imm16_16_0
UDF 1111 0111 1111 ---- 1010 ---- ---- ----
- }
+ ]
B_cond_thumb 1111 0. cond:4 ...... 10.0 ............ &ci imm=%imm21
}

--
2.20.1

1
The BLX immediate insn in the Thumb encoding always performs
1
Implement the MVE shifts by immediate, which perform shifts
2
a switch from Thumb to Arm state. This would be totally useless
2
on a single general-purpose register.
3
in M-profile which has no Arm decoder, and so the instruction
3
4
does not exist at all there. Make the encoding UNDEF for M-profile.
4
These patterns overlap with the long-shift-by-immediates,
5
5
so we have to rearrange the grouping a little here.
6
(This part of the encoding space is used for the branch-future
6
7
and low-overhead-loop insns in v8.1M.)
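
For the shift-by-immediate patch, the two details that are easy to miss
are that an encoded immediate of 0 means "shift by 32" and that the
rounding right shifts need a special case for that count. The following
is a plain-C sketch of that arithmetic, not the TCG code itself. The
function names are hypothetical, and the signed variant assumes that >>
on a negative int32_t is an arithmetic shift, which the C standard
leaves implementation-defined.

```c
#include <stdint.h>

/* Map the 5-bit immediate field to a shift count: 0 encodes 32. */
static unsigned decode_shift(unsigned shim)
{
    return shim == 0 ? 32 : shim;
}

/* Unsigned rounding shift right by sh, for 1 <= sh <= 32. */
static uint32_t urshr32(uint32_t a, unsigned sh)
{
    /* The rounding bit is the last bit shifted out. */
    uint32_t round = (a >> (sh - 1)) & 1;

    if (sh == 32) {
        /* Everything is shifted out; only the rounding bit survives. */
        return round;
    }
    return (a >> sh) + round;
}

/* Signed rounding shift right by sh, for 1 <= sh <= 32. */
static int32_t srshr32(int32_t a, unsigned sh)
{
    if (sh == 32) {
        /* |a / 2^32| <= 0.5, and -0.5 rounds up to 0. */
        return 0;
    }
    return (a >> sh) + ((a >> (sh - 1)) & 1);
}
```

For example, urshr32(5, 1) gives 3 and srshr32(-5, 1) gives -2, since
both round halves towards positive infinity.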
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Message-id: 20210628135835.6690-18-peter.maydell@linaro.org
11
Message-id: 20201019151301.2046-6-peter.maydell@linaro.org
12
---
10
---
13
target/arm/translate.c | 8 ++++++++
11
target/arm/helper-mve.h | 3 ++
14
1 file changed, 8 insertions(+)
12
target/arm/translate.h | 1 +
15
13
target/arm/t32.decode | 31 ++++++++++++++-----
14
target/arm/mve_helper.c | 10 ++++++
15
target/arm/translate.c | 68 +++++++++++++++++++++++++++++++++++++++--
16
5 files changed, 104 insertions(+), 9 deletions(-)
17
18
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/target/arm/helper-mve.h
21
+++ b/target/arm/helper-mve.h
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_sqrshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
23
DEF_HELPER_FLAGS_3(mve_uqrshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
24
DEF_HELPER_FLAGS_3(mve_sqrshrl48, TCG_CALL_NO_RWG, i64, env, i64, i32)
25
DEF_HELPER_FLAGS_3(mve_uqrshll48, TCG_CALL_NO_RWG, i64, env, i64, i32)
26
+
27
+DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
28
+DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
29
diff --git a/target/arm/translate.h b/target/arm/translate.h
30
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/translate.h
32
+++ b/target/arm/translate.h
33
@@ -XXX,XX +XXX,XX @@ typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
34
typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
35
typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
36
typedef void WideShiftFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i32);
37
+typedef void ShiftImmFn(TCGv_i32, TCGv_i32, int32_t shift);
38
39
/**
40
* arm_tbflags_from_tb:
41
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
42
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/t32.decode
44
+++ b/target/arm/t32.decode
45
@@ -XXX,XX +XXX,XX @@
46
47
&mve_shl_ri rdalo rdahi shim
48
&mve_shl_rr rdalo rdahi rm
49
+&mve_sh_ri rda shim
50
51
# rdahi: bits [3:1] from insn, bit 0 is 1
52
# rdalo: bits [3:1] from insn, bit 0 is 0
53
@@ -XXX,XX +XXX,XX @@
54
&mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
55
@mve_shl_rr ....... .... . ... . rm:4 ... . .. .. .... \
56
&mve_shl_rr rdalo=%rdalo_17 rdahi=%rdahi_9
57
+@mve_sh_ri ....... .... . rda:4 . ... ... . .. .. .... \
58
+ &mve_sh_ri shim=%imm5_12_6
59
60
{
61
TST_xrri 1110101 0000 1 .... 0 ... 1111 .... .... @S_xrr_shi
62
@@ -XXX,XX +XXX,XX @@ BIC_rrri 1110101 0001 . .... 0 ... .... .... .... @s_rrr_shi
63
# the rest fall through (where ORR_rrri and MOV_rxri will end up
64
# handling them as r13 and r15 accesses with the same semantics as A32).
65
[
66
- LSLL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 00 1111 @mve_shl_ri
67
- LSRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 01 1111 @mve_shl_ri
68
- ASRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 10 1111 @mve_shl_ri
69
+ {
70
+ UQSHL_ri 1110101 0010 1 .... 0 ... 1111 .. 00 1111 @mve_sh_ri
71
+ LSLL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 00 1111 @mve_shl_ri
72
+ UQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 00 1111 @mve_shl_ri
73
+ }
74
75
- UQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 00 1111 @mve_shl_ri
76
- URSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111 @mve_shl_ri
77
- SRSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111 @mve_shl_ri
78
- SQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111 @mve_shl_ri
79
+ {
80
+ URSHR_ri 1110101 0010 1 .... 0 ... 1111 .. 01 1111 @mve_sh_ri
81
+ LSRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 01 1111 @mve_shl_ri
82
+ URSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111 @mve_shl_ri
83
+ }
84
+
85
+ {
86
+ SRSHR_ri 1110101 0010 1 .... 0 ... 1111 .. 10 1111 @mve_sh_ri
87
+ ASRL_ri 1110101 0010 1 ... 0 0 ... ... 1 .. 10 1111 @mve_shl_ri
88
+ SRSHRL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111 @mve_shl_ri
89
+ }
90
+
91
+ {
92
+ SQSHL_ri 1110101 0010 1 .... 0 ... 1111 .. 11 1111 @mve_sh_ri
93
+ SQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111 @mve_shl_ri
94
+ }
95
96
LSLL_rr 1110101 0010 1 ... 0 .... ... 1 0000 1101 @mve_shl_rr
97
ASRL_rr 1110101 0010 1 ... 0 .... ... 1 0010 1101 @mve_shl_rr
98
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
99
index XXXXXXX..XXXXXXX 100644
100
--- a/target/arm/mve_helper.c
101
+++ b/target/arm/mve_helper.c
102
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mve_uqrshll48)(CPUARMState *env, uint64_t n, uint32_t shift)
103
{
104
return do_uqrshl48_d(n, (int8_t)shift, true, &env->QF);
105
}
106
+
107
+uint32_t HELPER(mve_uqshl)(CPUARMState *env, uint32_t n, uint32_t shift)
108
+{
109
+ return do_uqrshl_bhs(n, (int8_t)shift, 32, false, &env->QF);
110
+}
111
+
112
+uint32_t HELPER(mve_sqshl)(CPUARMState *env, uint32_t n, uint32_t shift)
113
+{
114
+ return do_sqrshl_bhs(n, (int8_t)shift, 32, false, &env->QF);
115
+}
16
diff --git a/target/arm/translate.c b/target/arm/translate.c
116
diff --git a/target/arm/translate.c b/target/arm/translate.c
17
index XXXXXXX..XXXXXXX 100644
117
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/translate.c
118
--- a/target/arm/translate.c
19
+++ b/target/arm/translate.c
119
+++ b/target/arm/translate.c
20
@@ -XXX,XX +XXX,XX @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
120
@@ -XXX,XX +XXX,XX @@ static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
21
{
121
22
TCGv_i32 tmp;
122
static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
23
123
{
24
+ /*
124
- TCGv_i32 t = tcg_temp_new_i32();
25
+ * BLX <imm> would be useless on M-profile; the encoding space
125
+ TCGv_i32 t;
26
+ * is used for other insns from v8.1M onward, and UNDEFs before that.
126
27
+ */
127
+ /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
28
+ if (arm_dc_feature(s, ARM_FEATURE_M)) {
128
+ if (sh == 32) {
129
+ tcg_gen_movi_i32(d, 0);
130
+ return;
131
+ }
132
+ t = tcg_temp_new_i32();
133
tcg_gen_extract_i32(t, a, sh - 1, 1);
134
tcg_gen_sari_i32(d, a, sh);
135
tcg_gen_add_i32(d, d, t);
136
@@ -XXX,XX +XXX,XX @@ static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
137
138
static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
139
{
140
- TCGv_i32 t = tcg_temp_new_i32();
141
+ TCGv_i32 t;
142
143
+ /* Handle shift by the input size for the benefit of trans_URSHR_ri */
144
+ if (sh == 32) {
145
+ tcg_gen_extract_i32(d, a, sh - 1, 1);
146
+ return;
147
+ }
148
+ t = tcg_temp_new_i32();
149
tcg_gen_extract_i32(t, a, sh - 1, 1);
150
tcg_gen_shri_i32(d, a, sh);
151
tcg_gen_add_i32(d, d, t);
152
@@ -XXX,XX +XXX,XX @@ static bool trans_SQRSHRL48_rr(DisasContext *s, arg_mve_shl_rr *a)
153
return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl48);
154
}
155
156
+static bool do_mve_sh_ri(DisasContext *s, arg_mve_sh_ri *a, ShiftImmFn *fn)
157
+{
158
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
159
+ /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
29
+ return false;
160
+ return false;
30
+ }
161
+ }
31
+
162
+ if (!dc_isar_feature(aa32_mve, s) ||
32
/* For A32, ARM_FEATURE_V5 is checked near the start of the uncond block. */
163
+ !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
33
if (s->thumb && (a->imm & 2)) {
164
+ a->rda == 13 || a->rda == 15) {
34
return false;
165
+ /* These rda cases are UNPREDICTABLE; we choose to UNDEF */
166
+ unallocated_encoding(s);
167
+ return true;
168
+ }
169
+
170
+ if (a->shim == 0) {
171
+ a->shim = 32;
172
+ }
173
+ fn(cpu_R[a->rda], cpu_R[a->rda], a->shim);
174
+
175
+ return true;
176
+}
177
+
178
+static bool trans_URSHR_ri(DisasContext *s, arg_mve_sh_ri *a)
179
+{
180
+ return do_mve_sh_ri(s, a, gen_urshr32_i32);
181
+}
182
+
183
+static bool trans_SRSHR_ri(DisasContext *s, arg_mve_sh_ri *a)
184
+{
185
+ return do_mve_sh_ri(s, a, gen_srshr32_i32);
186
+}
187
+
188
+static void gen_mve_sqshl(TCGv_i32 r, TCGv_i32 n, int32_t shift)
189
+{
190
+ gen_helper_mve_sqshl(r, cpu_env, n, tcg_constant_i32(shift));
191
+}
192
+
193
+static bool trans_SQSHL_ri(DisasContext *s, arg_mve_sh_ri *a)
194
+{
195
+ return do_mve_sh_ri(s, a, gen_mve_sqshl);
196
+}
197
+
198
+static void gen_mve_uqshl(TCGv_i32 r, TCGv_i32 n, int32_t shift)
199
+{
200
+ gen_helper_mve_uqshl(r, cpu_env, n, tcg_constant_i32(shift));
201
+}
202
+
203
+static bool trans_UQSHL_ri(DisasContext *s, arg_mve_sh_ri *a)
204
+{
205
+ return do_mve_sh_ri(s, a, gen_mve_uqshl);
206
+}
207
+
208
/*
209
* Multiply and multiply accumulate
210
*/
35
--
211
--
36
2.20.1
212
2.20.1
37
213
38
214
1
v8.1M's "low-overhead-loop" extension has three instructions
1
Implement the MVE shifts by register, which perform
2
for looping:
2
shifts on a single general-purpose register.
3
* DLS (start of a do-loop)
4
* WLS (start of a while-loop)
5
* LE (end of a loop)
6
7
The loop-start instructions are both simple operations to start a
8
loop whose iteration count (if any) is in LR. The loop-end
9
instruction handles "decrement iteration count and jump back to loop
10
start"; it also caches the information about the branch back to the
11
start of the loop to improve performance of the branch on subsequent
12
iterations.
13
14
As with the branch-future instructions, the architecture permits an
15
implementation to discard the LO_BRANCH_INFO cache at any time, and
16
QEMU takes the IMPDEF option to never set it in the first place
17
(equivalent to discarding it immediately), because for us a "real"
18
implementation would be unnecessary complexity.
19
20
(This implementation only provides the simple looping constructs; the
21
vector extension MVE (Helium) adds some extra variants to handle
22
looping across vectors. We'll add those later when we implement
23
MVE.)
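
The loop semantics described above can be modelled in a few lines of
plain C. This is only a sketch of the behaviour as this patch
implements it (no LO_BRANCH_INFO cache, no tail predication); the
function names are hypothetical and each function simply counts how
many times the loop body would execute for a given initial count.

```c
#include <stdint.h>

/* WLS ... LE: a while-loop, the body is skipped when the count is 0. */
static uint32_t wls_loop_body_runs(uint32_t count)
{
    uint32_t lr;
    uint32_t runs = 0;

    /* WLS: branch past the loop if Rn is zero, else LR = Rn. */
    if (count == 0) {
        return 0;
    }
    lr = count;

    for (;;) {
        runs++;             /* loop body */
        /* LE: if LR <= 1 this was the last iteration, fall through. */
        if (lr <= 1) {
            break;
        }
        lr--;               /* otherwise decrement LR and jump back */
    }
    return runs;
}

/* DLS ... LE: a do-loop, the body always runs at least once. */
static uint32_t dls_loop_body_runs(uint32_t count)
{
    uint32_t lr = count;    /* DLS: just set LR to the count */
    uint32_t runs = 0;

    do {
        runs++;             /* loop body */
        if (lr <= 1) {
            break;          /* LE: last iteration */
        }
        lr--;
    } while (1);
    return runs;
}
```

Note that in this model LR is left untouched on the final LE, matching
the "do nothing" branch in trans_LE().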
24
3
25
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
26
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
27
Message-id: 20201019151301.2046-8-peter.maydell@linaro.org
6
Message-id: 20210628135835.6690-19-peter.maydell@linaro.org
28
---
7
---
29
target/arm/t32.decode | 8 ++++
8
target/arm/helper-mve.h | 2 ++
30
target/arm/translate.c | 93 +++++++++++++++++++++++++++++++++++++++++-
9
target/arm/translate.h | 1 +
31
2 files changed, 99 insertions(+), 2 deletions(-)
10
target/arm/t32.decode | 18 ++++++++++++++----
11
target/arm/mve_helper.c | 10 ++++++++++
12
target/arm/translate.c | 30 ++++++++++++++++++++++++++++++
13
5 files changed, 57 insertions(+), 4 deletions(-)
32
14
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
18
+++ b/target/arm/helper-mve.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_uqrshll48, TCG_CALL_NO_RWG, i64, env, i64, i32)
20
21
DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
22
DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
23
+DEF_HELPER_FLAGS_3(mve_uqrshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
24
+DEF_HELPER_FLAGS_3(mve_sqrshr, TCG_CALL_NO_RWG, i32, env, i32, i32)
25
diff --git a/target/arm/translate.h b/target/arm/translate.h
26
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/translate.h
28
+++ b/target/arm/translate.h
29
@@ -XXX,XX +XXX,XX @@ typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
30
typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
31
typedef void WideShiftFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i32);
32
typedef void ShiftImmFn(TCGv_i32, TCGv_i32, int32_t shift);
33
+typedef void ShiftFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
34
35
/**
36
* arm_tbflags_from_tb:
33
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
37
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
34
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/t32.decode
39
--- a/target/arm/t32.decode
36
+++ b/target/arm/t32.decode
40
+++ b/target/arm/t32.decode
37
@@ -XXX,XX +XXX,XX @@ BL 1111 0. .......... 11.1 ............ @branch24
41
@@ -XXX,XX +XXX,XX @@
38
BF 1111 0 boff:4 10 ----- 1110 - ---------- 1 # BF
42
&mve_shl_ri rdalo rdahi shim
39
BF 1111 0 boff:4 11 ----- 1110 0 0000000000 1 # BFX, BFLX
43
&mve_shl_rr rdalo rdahi rm
44
&mve_sh_ri rda shim
45
+&mve_sh_rr rda rm
46
47
# rdahi: bits [3:1] from insn, bit 0 is 1
48
# rdalo: bits [3:1] from insn, bit 0 is 0
49
@@ -XXX,XX +XXX,XX @@
50
&mve_shl_rr rdalo=%rdalo_17 rdahi=%rdahi_9
51
@mve_sh_ri ....... .... . rda:4 . ... ... . .. .. .... \
52
&mve_sh_ri shim=%imm5_12_6
53
+@mve_sh_rr ....... .... . rda:4 rm:4 .... .... .... &mve_sh_rr
54
55
{
56
TST_xrri 1110101 0000 1 .... 0 ... 1111 .... .... @S_xrr_shi
57
@@ -XXX,XX +XXX,XX @@ BIC_rrri 1110101 0001 . .... 0 ... .... .... .... @s_rrr_shi
58
SQSHLL_ri 1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111 @mve_shl_ri
59
}
60
61
- LSLL_rr 1110101 0010 1 ... 0 .... ... 1 0000 1101 @mve_shl_rr
62
- ASRL_rr 1110101 0010 1 ... 0 .... ... 1 0010 1101 @mve_shl_rr
63
- UQRSHLL64_rr 1110101 0010 1 ... 1 .... ... 1 0000 1101 @mve_shl_rr
64
- SQRSHRL64_rr 1110101 0010 1 ... 1 .... ... 1 0010 1101 @mve_shl_rr
65
+ {
66
+ UQRSHL_rr 1110101 0010 1 .... .... 1111 0000 1101 @mve_sh_rr
67
+ LSLL_rr 1110101 0010 1 ... 0 .... ... 1 0000 1101 @mve_shl_rr
68
+ UQRSHLL64_rr 1110101 0010 1 ... 1 .... ... 1 0000 1101 @mve_shl_rr
69
+ }
70
+
71
+ {
72
+ SQRSHR_rr 1110101 0010 1 .... .... 1111 0010 1101 @mve_sh_rr
73
+ ASRL_rr 1110101 0010 1 ... 0 .... ... 1 0010 1101 @mve_shl_rr
74
+ SQRSHRL64_rr 1110101 0010 1 ... 1 .... ... 1 0010 1101 @mve_shl_rr
75
+ }
76
+
77
UQRSHLL48_rr 1110101 0010 1 ... 1 .... ... 1 1000 1101 @mve_shl_rr
78
SQRSHRL48_rr 1110101 0010 1 ... 1 .... ... 1 1010 1101 @mve_shl_rr
40
]
79
]
41
+ [
80
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
42
+ # LE and WLS immediate
81
index XXXXXXX..XXXXXXX 100644
43
+ %lob_imm 1:10 11:1 !function=times_2
82
--- a/target/arm/mve_helper.c
83
+++ b/target/arm/mve_helper.c
84
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_sqshl)(CPUARMState *env, uint32_t n, uint32_t shift)
85
{
86
return do_sqrshl_bhs(n, (int8_t)shift, 32, false, &env->QF);
87
}
44
+
88
+
45
+ DLS 1111 0 0000 100 rn:4 1110 0000 0000 0001
89
+uint32_t HELPER(mve_uqrshl)(CPUARMState *env, uint32_t n, uint32_t shift)
46
+ WLS 1111 0 0000 100 rn:4 1100 . .......... 1 imm=%lob_imm
90
+{
47
+ LE 1111 0 0000 0 f:1 0 1111 1100 . .......... 1 imm=%lob_imm
91
+ return do_uqrshl_bhs(n, (int8_t)shift, 32, true, &env->QF);
48
+ ]
92
+}
49
}
93
+
94
+uint32_t HELPER(mve_sqrshr)(CPUARMState *env, uint32_t n, uint32_t shift)
95
+{
96
+ return do_sqrshl_bhs(n, -(int8_t)shift, 32, true, &env->QF);
97
+}
50
diff --git a/target/arm/translate.c b/target/arm/translate.c
98
diff --git a/target/arm/translate.c b/target/arm/translate.c
51
index XXXXXXX..XXXXXXX 100644
99
index XXXXXXX..XXXXXXX 100644
52
--- a/target/arm/translate.c
100
--- a/target/arm/translate.c
53
+++ b/target/arm/translate.c
101
+++ b/target/arm/translate.c
54
@@ -XXX,XX +XXX,XX @@ static void gen_goto_tb(DisasContext *s, int n, target_ulong dest)
102
@@ -XXX,XX +XXX,XX @@ static bool trans_UQSHL_ri(DisasContext *s, arg_mve_sh_ri *a)
55
s->base.is_jmp = DISAS_NORETURN;
103
return do_mve_sh_ri(s, a, gen_mve_uqshl);
56
}
104
}
57
105
58
-static inline void gen_jmp (DisasContext *s, uint32_t dest)
106
+static bool do_mve_sh_rr(DisasContext *s, arg_mve_sh_rr *a, ShiftFn *fn)
59
+/* Jump, specifying which TB number to use if we gen_goto_tb() */
60
+static inline void gen_jmp_tb(DisasContext *s, uint32_t dest, int tbno)
61
{
62
if (unlikely(is_singlestepping(s))) {
63
/* An indirect jump so that we still trigger the debug exception. */
64
gen_set_pc_im(s, dest);
65
s->base.is_jmp = DISAS_JUMP;
66
} else {
67
- gen_goto_tb(s, 0, dest);
68
+ gen_goto_tb(s, tbno, dest);
69
}
70
}
71
72
+static inline void gen_jmp(DisasContext *s, uint32_t dest)
73
+{
107
+{
74
+ gen_jmp_tb(s, dest, 0);
108
+ if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
75
+}
109
+ /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
76
+
77
static inline void gen_mulxy(TCGv_i32 t0, TCGv_i32 t1, int x, int y)
78
{
79
if (x)
80
@@ -XXX,XX +XXX,XX @@ static bool trans_BF(DisasContext *s, arg_BF *a)
81
return true;
82
}
83
84
+static bool trans_DLS(DisasContext *s, arg_DLS *a)
85
+{
86
+ /* M-profile low-overhead loop start */
87
+ TCGv_i32 tmp;
88
+
89
+ if (!dc_isar_feature(aa32_lob, s)) {
90
+ return false;
110
+ return false;
91
+ }
111
+ }
92
+ if (a->rn == 13 || a->rn == 15) {
112
+ if (!dc_isar_feature(aa32_mve, s) ||
93
+ /* CONSTRAINED UNPREDICTABLE: we choose to UNDEF */
113
+ !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
94
+ return false;
114
+ a->rda == 13 || a->rda == 15 || a->rm == 13 || a->rm == 15 ||
115
+ a->rm == a->rda) {
116
+ /* These rda/rm cases are UNPREDICTABLE; we choose to UNDEF */
117
+ unallocated_encoding(s);
118
+ return true;
95
+ }
119
+ }
96
+
120
+
97
+ /* Not a while loop, no tail predication: just set LR to the count */
121
+ /* The helper takes care of the sign-extension of the low 8 bits of Rm */
98
+ tmp = load_reg(s, a->rn);
122
+ fn(cpu_R[a->rda], cpu_env, cpu_R[a->rda], cpu_R[a->rm]);
99
+ store_reg(s, 14, tmp);
100
+ return true;
123
+ return true;
101
+}
124
+}
102
+
125
+
103
+static bool trans_WLS(DisasContext *s, arg_WLS *a)
126
+static bool trans_SQRSHR_rr(DisasContext *s, arg_mve_sh_rr *a)
104
+{
127
+{
105
+ /* M-profile low-overhead while-loop start */
128
+ return do_mve_sh_rr(s, a, gen_helper_mve_sqrshr);
106
+ TCGv_i32 tmp;
107
+ TCGLabel *nextlabel;
108
+
109
+ if (!dc_isar_feature(aa32_lob, s)) {
110
+ return false;
111
+ }
112
+ if (a->rn == 13 || a->rn == 15) {
113
+ /* CONSTRAINED UNPREDICTABLE: we choose to UNDEF */
114
+ return false;
115
+ }
116
+ if (s->condexec_mask) {
117
+ /*
118
+ * WLS in an IT block is CONSTRAINED UNPREDICTABLE;
119
+ * we choose to UNDEF, because otherwise our use of
120
+ * gen_goto_tb(1) would clash with the use of TB exit 1
121
+ * in the dc->condjmp condition-failed codepath in
122
+ * arm_tr_tb_stop() and we'd get an assertion.
123
+ */
124
+ return false;
125
+ }
126
+ nextlabel = gen_new_label();
127
+ tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_R[a->rn], 0, nextlabel);
128
+ tmp = load_reg(s, a->rn);
129
+ store_reg(s, 14, tmp);
130
+ gen_jmp_tb(s, s->base.pc_next, 1);
131
+
132
+ gen_set_label(nextlabel);
133
+ gen_jmp(s, read_pc(s) + a->imm);
134
+ return true;
135
+}
129
+}
136
+
130
+
137
+static bool trans_LE(DisasContext *s, arg_LE *a)
131
+static bool trans_UQRSHL_rr(DisasContext *s, arg_mve_sh_rr *a)
138
+{
132
+{
139
+ /*
133
+ return do_mve_sh_rr(s, a, gen_helper_mve_uqrshl);
140
+ * M-profile low-overhead loop end. The architecture permits an
141
+ * implementation to discard the LO_BRANCH_INFO cache at any time,
142
+ * and we take the IMPDEF option to never set it in the first place
143
+ * (equivalent to always discarding it immediately), because for QEMU
144
+ * a "real" implementation would be complicated and wouldn't execute
145
+ * any faster.
146
+ */
147
+ TCGv_i32 tmp;
148
+
149
+ if (!dc_isar_feature(aa32_lob, s)) {
150
+ return false;
151
+ }
152
+
153
+ if (!a->f) {
154
+ /* Not loop-forever. If LR <= 1 this is the last loop: do nothing. */
155
+ arm_gen_condlabel(s);
156
+ tcg_gen_brcondi_i32(TCG_COND_LEU, cpu_R[14], 1, s->condlabel);
157
+ /* Decrement LR */
158
+ tmp = load_reg(s, 14);
159
+ tcg_gen_addi_i32(tmp, tmp, -1);
160
+ store_reg(s, 14, tmp);
161
+ }
162
+ /* Jump back to the loop start */
163
+ gen_jmp(s, read_pc(s) - a->imm);
164
+ return true;
165
+}
134
+}
166
+
135
+
167
static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half)
136
/*
168
{
137
* Multiply and multiply accumulate
169
TCGv_i32 addr, tmp;
138
*/
170
--
139
--
171
2.20.1
140
2.20.1
172
141
173
142
Deleted patch
In arm_cpu_realizefn(), if the CPU has VFP or Neon disabled then we
squash the ID register fields so that we don't advertise it to the
guest. This code was written for A-profile and needs some tweaks to
work correctly on M-profile:

 * A-profile only fields should not be zeroed on M-profile:
   - MVFR0.FPSHVEC,FPTRAP
   - MVFR1.SIMDLS,SIMDINT,SIMDSP,SIMDHP
   - MVFR2.SIMDMISC
 * M-profile only fields should be zeroed on M-profile:
   - MVFR1.FP16

In particular, because MVFR1.SIMDHP on A-profile is the same field as
MVFR1.FP16 on M-profile this code was incorrectly disabling FP16
support on an M-profile CPU (where has_neon is always false). This
isn't a visible bug yet because we don't have any M-profile CPUs with
FP16 support, but the change is necessary before we introduce any.
18
19
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Message-id: 20201019151301.2046-9-peter.maydell@linaro.org
22
---
23
target/arm/cpu.c | 29 ++++++++++++++++++-----------
24
1 file changed, 18 insertions(+), 11 deletions(-)
25
26
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/target/arm/cpu.c
29
+++ b/target/arm/cpu.c
30
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
31
u = cpu->isar.mvfr0;
32
u = FIELD_DP32(u, MVFR0, FPSP, 0);
33
u = FIELD_DP32(u, MVFR0, FPDP, 0);
34
- u = FIELD_DP32(u, MVFR0, FPTRAP, 0);
35
u = FIELD_DP32(u, MVFR0, FPDIVIDE, 0);
36
u = FIELD_DP32(u, MVFR0, FPSQRT, 0);
37
- u = FIELD_DP32(u, MVFR0, FPSHVEC, 0);
38
u = FIELD_DP32(u, MVFR0, FPROUND, 0);
39
+ if (!arm_feature(env, ARM_FEATURE_M)) {
40
+ u = FIELD_DP32(u, MVFR0, FPTRAP, 0);
41
+ u = FIELD_DP32(u, MVFR0, FPSHVEC, 0);
42
+ }
43
cpu->isar.mvfr0 = u;
44
45
u = cpu->isar.mvfr1;
46
u = FIELD_DP32(u, MVFR1, FPFTZ, 0);
47
u = FIELD_DP32(u, MVFR1, FPDNAN, 0);
48
u = FIELD_DP32(u, MVFR1, FPHP, 0);
49
+ if (arm_feature(env, ARM_FEATURE_M)) {
50
+ u = FIELD_DP32(u, MVFR1, FP16, 0);
51
+ }
52
cpu->isar.mvfr1 = u;
53
54
u = cpu->isar.mvfr2;
55
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
56
u = FIELD_DP32(u, ID_ISAR6, FHM, 0);
57
cpu->isar.id_isar6 = u;
58
59
- u = cpu->isar.mvfr1;
60
- u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
61
- u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
62
- u = FIELD_DP32(u, MVFR1, SIMDSP, 0);
63
- u = FIELD_DP32(u, MVFR1, SIMDHP, 0);
64
- cpu->isar.mvfr1 = u;
65
+ if (!arm_feature(env, ARM_FEATURE_M)) {
66
+ u = cpu->isar.mvfr1;
67
+ u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
68
+ u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
69
+ u = FIELD_DP32(u, MVFR1, SIMDSP, 0);
70
+ u = FIELD_DP32(u, MVFR1, SIMDHP, 0);
71
+ cpu->isar.mvfr1 = u;
72
73
- u = cpu->isar.mvfr2;
74
- u = FIELD_DP32(u, MVFR2, SIMDMISC, 0);
75
- cpu->isar.mvfr2 = u;
76
+ u = cpu->isar.mvfr2;
77
+ u = FIELD_DP32(u, MVFR2, SIMDMISC, 0);
78
+ cpu->isar.mvfr2 = u;
79
+ }
80
}
81
82
if (!cpu->has_neon && !cpu->has_vfp) {
83
--
84
2.20.1
85
86
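The aliasing described in the commit message above can be reproduced in isolation. The sketch below assumes the shared field sits at bits [23:20] of MVFR1; field_deposit32, field_extract32 and squash_neon_hp are hypothetical stand-ins written for this illustration, not QEMU's FIELD_DP32 machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit range shared by MVFR1.SIMDHP (A-profile) and MVFR1.FP16 (M-profile). */
#define MVFR1_HP_SHIFT  20
#define MVFR1_HP_LENGTH 4

/* Hypothetical helper: write value into reg[shift + length - 1 : shift]. */
static uint32_t field_deposit32(uint32_t reg, unsigned shift, unsigned length,
                                uint32_t value)
{
    assert(length >= 1 && length <= 32 && shift <= 32 - length);
    uint32_t mask = (uint32_t)(~0u >> (32 - length)) << shift;
    return (reg & ~mask) | ((value << shift) & mask);
}

/* Hypothetical helper: read reg[shift + length - 1 : shift]. */
static uint32_t field_extract32(uint32_t reg, unsigned shift, unsigned length)
{
    assert(length >= 1 && length <= 32 && shift <= 32 - length);
    return (reg >> shift) & (~0u >> (32 - length));
}

/*
 * Zero the Neon half-precision field, but only on A-profile.
 * On M-profile the same bits mean FP16 and must be left alone
 * when only Neon is absent.
 */
static uint32_t squash_neon_hp(uint32_t mvfr1, bool is_m_profile)
{
    if (!is_m_profile) {
        mvfr1 = field_deposit32(mvfr1, MVFR1_HP_SHIFT, MVFR1_HP_LENGTH, 0);
    }
    return mvfr1;
}
```

Calling squash_neon_hp() with is_m_profile set leaves a non-zero FP16 value intact, whereas an unconditional deposit would clear it, which is the behaviour the patch removes.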
Deleted patch
From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Fix an unlikely memory leak in load_elf_image().

Fixes: bf858897b7 ("linux-user: Re-use load_elf_image for the main binary.")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201016184207.786698-5-richard.henderson@linaro.org
Message-Id: <20201003174944.1972444-1-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 linux-user/elfload.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
                 info->brk = vaddr_em;
             }
         } else if (eppnt->p_type == PT_INTERP && pinterp_name) {
-            char *interp_name;
+            g_autofree char *interp_name = NULL;

             if (*pinterp_name) {
                 errmsg = "Multiple PT_INTERP entries";
                 goto exit_errmsg;
             }
-            interp_name = malloc(eppnt->p_filesz);
+            interp_name = g_malloc(eppnt->p_filesz);
             if (!interp_name) {
                 goto exit_perror;
             }
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
                 errmsg = "Invalid PT_INTERP entry";
                 goto exit_errmsg;
             }
-            *pinterp_name = interp_name;
+            *pinterp_name = g_steal_pointer(&interp_name);
 #ifdef TARGET_MIPS
         } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
             Mips_elf_abiflags_v0 abiflags;
@@ -XXX,XX +XXX,XX @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
     if (elf_interpreter) {
         info->load_bias = interp_info.load_bias;
         info->entry = interp_info.entry;
-        free(elf_interpreter);
+        g_free(elf_interpreter);
     }

 #ifdef USE_ELF_CORE_DUMP
--
2.20.1

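The ownership pattern the patch above relies on (free automatically on every error path, hand the buffer over on success) can be sketched without GLib. The code below is an assumption-laden illustration: AUTOFREE and steal_charp are hypothetical analogues of g_autofree and g_steal_pointer() built on the GCC/Clang cleanup attribute, and dup_nonempty is an invented example function, not code from linux-user.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: frees the pointer held by the annotated variable. */
static void free_charp(char **p)
{
    free(*p);
}

/* Hypothetical analogue of g_autofree (GCC/Clang extension). */
#define AUTOFREE __attribute__((cleanup(free_charp)))

/* Hypothetical analogue of g_steal_pointer(): return *pp and clear it. */
static char *steal_charp(char **pp)
{
    char *p = *pp;
    *pp = NULL;
    return p;
}

/*
 * Copy src into a fresh buffer. On success ownership moves to *out
 * and 0 is returned. On any error path -1 is returned and the buffer
 * is freed automatically when buf goes out of scope.
 */
static int dup_nonempty(const char *src, char **out)
{
    assert(src != NULL && out != NULL);
    size_t len = strlen(src) + 1;
    AUTOFREE char *buf = malloc(len);

    if (!buf) {
        return -1;
    }
    memcpy(buf, src, len);
    if (len == 1) {
        return -1;              /* empty string rejected: buf is freed here */
    }
    *out = steal_charp(&buf);   /* buf is now NULL, so cleanup frees nothing */
    return 0;
}
```

The early return for an empty string mirrors the "goto exit_errmsg" paths in the patch: without the automatic cleanup, each such exit would need its own free call.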
Deleted patch
From: Richard Henderson <richard.henderson@linaro.org>

Fixing this now will clarify following patches.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201016184207.786698-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 linux-user/elfload.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static void load_elf_image(const char *image_name, int image_fd,
             abi_ulong vaddr, vaddr_po, vaddr_ps, vaddr_ef, vaddr_em, vaddr_len;
             int elf_prot = 0;

-            if (eppnt->p_flags & PF_R) elf_prot = PROT_READ;
-            if (eppnt->p_flags & PF_W) elf_prot |= PROT_WRITE;
-            if (eppnt->p_flags & PF_X) elf_prot |= PROT_EXEC;
+            if (eppnt->p_flags & PF_R) {
+                elf_prot |= PROT_READ;
+            }
+            if (eppnt->p_flags & PF_W) {
+                elf_prot |= PROT_WRITE;
+            }
+            if (eppnt->p_flags & PF_X) {
+                elf_prot |= PROT_EXEC;
+            }

             vaddr = load_bias + eppnt->p_vaddr;
             vaddr_po = TARGET_ELF_PAGEOFFSET(vaddr);
--
2.20.1

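The flag translation touched by the patch above is small enough to show as a standalone function. This is a minimal sketch, assuming a POSIX system that provides <sys/mman.h>; elf_flags_to_prot and the DEMO_PF_* names are hypothetical and exist only for this example (the values are those the ELF specification assigns to PF_X, PF_W and PF_R).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>

/* ELF segment permission bits, named DEMO_* to avoid clashing with <elf.h>. */
enum {
    DEMO_PF_X = 0x1,
    DEMO_PF_W = 0x2,
    DEMO_PF_R = 0x4,
};

/* Translate ELF program header p_flags into mmap()/mprotect() PROT_* bits. */
static int elf_flags_to_prot(uint32_t p_flags)
{
    int prot = 0;

    /* Every flag is OR-ed in the same way, so the order does not matter. */
    if (p_flags & DEMO_PF_R) {
        prot |= PROT_READ;
    }
    if (p_flags & DEMO_PF_W) {
        prot |= PROT_WRITE;
    }
    if (p_flags & DEMO_PF_X) {
        prot |= PROT_EXEC;
    }
    return prot;
}

/* A few sanity checks for the mapping above. */
static void elf_flags_to_prot_selfcheck(void)
{
    assert(elf_flags_to_prot(0) == 0);
    assert(elf_flags_to_prot(DEMO_PF_R) == PROT_READ);
    assert(elf_flags_to_prot(DEMO_PF_R | DEMO_PF_W) == (PROT_READ | PROT_WRITE));
}
```

Because the accumulator starts at zero, using "|=" for the first flag behaves the same as the plain assignment it replaces; the uniform form simply makes it safe to reorder the checks or add new ones.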