Hi; this mostly contains the first slice of A64 decodetree
patches, plus some other minor pieces. It also has the
enablement of MTE for KVM guests.

thanks
-- PMM

The following changes since commit d27e7c359330ba7020bdbed7ed2316cb4cf6ffc1:

  qapi/parser: Drop two bad type hints for now (2023-05-17 10:18:33 -0700)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230518

for you to fetch changes up to 91608e2a44f36e79cb83f863b8a7bb57d2c98061:

  docs: Convert u2f.txt to rST (2023-05-18 11:40:32 +0100)

----------------------------------------------------------------
target-arm queue:
 * Fix vd == vm overlap in sve_ldff1_z
 * Add support for MTE with KVM guests
 * Add RAZ/WI handling for DBGDTR[TX|RX]
 * Start of conversion of A64 decoder to decodetree
 * Saturate L2CTLR_EL1 core count field rather than overflowing
 * vexpress: Avoid trivial memory leak of 'flashalias'
 * sbsa-ref: switch default cpu core to Neoverse-N1
 * sbsa-ref: use Bochs graphics card instead of VGA
 * MAINTAINERS: Add Marcin Juszkiewicz to sbsa-ref reviewer list
 * docs: Convert u2f.txt to rST

----------------------------------------------------------------
Alex Bennée (1):
      target/arm: add RAZ/WI handling for DBGDTR[TX|RX]

Cornelia Huck (1):
      arm/kvm: add support for MTE

Marcin Juszkiewicz (3):
      sbsa-ref: switch default cpu core to Neoverse-N1
      Maintainers: add myself as reviewer for sbsa-ref
      sbsa-ref: use Bochs graphics card instead of VGA

Peter Maydell (14):
      target/arm: Create decodetree skeleton for A64
      target/arm: Pull calls to disas_sve() and disas_sme() out of legacy decoder
      target/arm: Convert Extract instructions to decodetree
      target/arm: Convert unconditional branch immediate to decodetree
      target/arm: Convert CBZ, CBNZ to decodetree
      target/arm: Convert TBZ, TBNZ to decodetree
      target/arm: Convert conditional branch insns to decodetree
      target/arm: Convert BR, BLR, RET to decodetree
      target/arm: Convert BRA[AB]Z, BLR[AB]Z, RETA[AB] to decodetree
      target/arm: Convert BRAA, BRAB, BLRAA, BLRAB to decodetree
      target/arm: Convert ERET, ERETAA, ERETAB to decodetree
      target/arm: Saturate L2CTLR_EL1 core count field rather than overflowing
      hw/arm/vexpress: Avoid trivial memory leak of 'flashalias'
      docs: Convert u2f.txt to rST

Richard Henderson (10):
      target/arm: Fix vd == vm overlap in sve_ldff1_z
      target/arm: Split out disas_a64_legacy
      target/arm: Convert PC-rel addressing to decodetree
      target/arm: Split gen_add_CC and gen_sub_CC
      target/arm: Convert Add/subtract (immediate) to decodetree
      target/arm: Convert Add/subtract (immediate with tags) to decodetree
      target/arm: Replace bitmask64 with MAKE_64BIT_MASK
      target/arm: Convert Logical (immediate) to decodetree
      target/arm: Convert Move wide (immediate) to decodetree
      target/arm: Convert Bitfield to decodetree

 MAINTAINERS                      |    1 +
 docs/system/device-emulation.rst |    1 +
 docs/system/devices/usb-u2f.rst  |   93 +++
 docs/system/devices/usb.rst      |    2 +-
 docs/u2f.txt                     |  110 ----
 target/arm/cpu.h                 |    4 +
 target/arm/kvm_arm.h             |   19 +
 target/arm/tcg/translate.h       |    5 +
 target/arm/tcg/a64.decode        |  152 +++++
 hw/arm/sbsa-ref.c                |    4 +-
 hw/arm/vexpress.c                |   40 +-
 hw/arm/virt.c                    |   73 ++-
 target/arm/cortex-regs.c         |   11 +-
 target/arm/cpu.c                 |    9 +-
 target/arm/debug_helper.c        |   11 +-
 target/arm/kvm.c                 |   35 +
 target/arm/kvm64.c               |    5 +
 target/arm/tcg/sve_helper.c      |    6 +
 target/arm/tcg/translate-a64.c   | 1321 ++++++++++++++++----------------
 target/arm/tcg/meson.build       |    1 +
 20 files changed, 979 insertions(+), 924 deletions(-)
 create mode 100644 docs/system/devices/usb-u2f.rst
 delete mode 100644 docs/u2f.txt
 create mode 100644 target/arm/tcg/a64.decode

From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

The world outside moves to newer and newer CPU cores. Let's move the
SBSA Reference Platform to something newer as well.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230506183417.1360427-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/sbsa-ref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
 
     mc->init = sbsa_ref_init;
     mc->desc = "QEMU 'SBSA Reference' ARM Virtual Machine";
-    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a57");
+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("neoverse-n1");
     mc->max_cpus = 512;
     mc->pci_allow_0_address = true;
     mc->minimum_page_bits = 12;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

If vd == vm, copy vm to scratch, so that we can pre-zero
the output and still access the gather indices.
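
As an illustration of the hazard (not QEMU code; load_elem() is a
hypothetical stand-in for the per-element load):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    uint64_t load_elem(uint64_t index);   /* hypothetical */

    /* A gather load that pre-zeroes its output: if vd == vm, the
     * memset wipes the gather indices before they are read. */
    void gather_load(uint64_t *vd, const uint64_t *vm, size_t n)
    {
        memset(vd, 0, n * sizeof(*vd));
        for (size_t i = 0; i < n; i++) {
            vd[i] = load_elem(vm[i]);     /* reads index 0 when vd == vm */
        }
    }

Copying vm to a scratch buffer first, as the patch does, removes the
overlap.
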
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve_helper.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
     intptr_t reg_off;
     SVEHostPage info;
     target_ulong addr, in_page;
+    ARMVectorReg scratch;
 
     /* Skip to the first true predicate. */
     reg_off = find_next_active(vg, 0, reg_max, esz);
@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
         return;
     }
 
+    /* Protect against overlap between vd and vm. */
+    if (unlikely(vd == vm)) {
+        vm = memcpy(&scratch, vm, reg_max);
+    }
+
     /*
      * Probe the first element, allowing faults.
      */
-- 
2.34.1

From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

At Linaro I work on sbsa-ref and know the direction it is going.

I may not get into the code details each time.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230515143753.365591-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ SBSA-REF
 M: Radoslaw Biernacki <rad@semihalf.com>
 M: Peter Maydell <peter.maydell@linaro.org>
 R: Leif Lindholm <quic_llindhol@quicinc.com>
+R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
 L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/sbsa-ref.c
-- 
2.34.1

From: Cornelia Huck <cohuck@redhat.com>

Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable the
capability.

If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.
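
A usage sketch, assuming a host kernel that exposes KVM_CAP_ARM_MTE:
with this change MTE is requested through the existing 'virt' machine
property, along the lines of

    qemu-system-aarch64 -machine virt,mte=on -cpu host -accel kvm ...

with the rest of the command line filled in as usual; as described
above, enabling it installs a migration blocker.
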
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230428095533.21747-2-cohuck@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     |  4 +++
 target/arm/kvm_arm.h | 19 ++++++++++++
 hw/arm/virt.c        | 73 +++++++++++++++++++++++++-------------
 target/arm/cpu.c     |  9 +++---
 target/arm/kvm.c     | 35 +++++++++++++++++++++
 target/arm/kvm64.c   |  5 +++
 6 files changed, 109 insertions(+), 36 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      */
     uint32_t psci_conduit;
 
+    /* CPU has Memory Tag Extension */
+    bool has_mte;
+
     /* For v8M, initial value of the Secure VTOR */
     uint32_t init_svtor;
     /* For v8M, initial value of the Non-secure VTOR */
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
     bool prop_pauth;
     bool prop_pauth_impdef;
     bool prop_lpa2;
+    OnOffAuto prop_mte;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
     uint32_t dcz_blocksize;
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_pmu_supported(void);
  */
 bool kvm_arm_sve_supported(void);
 
+/**
+ * kvm_arm_mte_supported:
+ *
+ * Returns: true if KVM can enable MTE, and false otherwise.
+ */
+bool kvm_arm_mte_supported(void);
+
 /**
  * kvm_arm_get_max_vm_ipa_size:
  * @ms: Machine state handle
@@ -XXX,XX +XXX,XX @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp);
+
 #else
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline bool kvm_arm_steal_time_supported(void)
     return false;
 }
 
+static inline bool kvm_arm_mte_supported(void)
+{
+    return false;
+}
+
 /*
  * These functions should never actually be called without KVM support.
  */
@@ -XXX,XX +XXX,XX @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs)
     g_assert_not_reached();
 }
 
+static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    g_assert_not_reached();
+}
+
 #endif
 
 static inline const char *gic_class_name(void)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
-    if (vms->mte && (kvm_enabled() || hvf_enabled())) {
+    if (vms->mte && hvf_enabled()) {
         error_report("mach-virt: %s does not support providing "
                      "MTE to the guest CPU",
                      current_accel_name());
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
         }
 
         if (vms->mte) {
-            /* Create the memory region only once, but link to all cpus. */
-            if (!tag_sysmem) {
-                /*
-                 * The property exists only if MemTag is supported.
-                 * If it is, we must allocate the ram to back that up.
-                 */
-                if (!object_property_find(cpuobj, "tag-memory")) {
-                    error_report("MTE requested, but not supported "
-                                 "by the guest CPU");
+            if (tcg_enabled()) {
+                /* Create the memory region only once, but link to all cpus. */
+                if (!tag_sysmem) {
+                    /*
+                     * The property exists only if MemTag is supported.
+                     * If it is, we must allocate the ram to back that up.
+                     */
+                    if (!object_property_find(cpuobj, "tag-memory")) {
+                        error_report("MTE requested, but not supported "
+                                     "by the guest CPU");
+                        exit(1);
+                    }
+
+                    tag_sysmem = g_new(MemoryRegion, 1);
+                    memory_region_init(tag_sysmem, OBJECT(machine),
+                                       "tag-memory", UINT64_MAX / 32);
+
+                    if (vms->secure) {
+                        secure_tag_sysmem = g_new(MemoryRegion, 1);
+                        memory_region_init(secure_tag_sysmem, OBJECT(machine),
+                                           "secure-tag-memory",
+                                           UINT64_MAX / 32);
+
+                        /* As with ram, secure-tag takes precedence over tag. */
+                        memory_region_add_subregion_overlap(secure_tag_sysmem,
+                                                            0, tag_sysmem, -1);
+                    }
+                }
+
+                object_property_set_link(cpuobj, "tag-memory",
+                                         OBJECT(tag_sysmem), &error_abort);
+                if (vms->secure) {
+                    object_property_set_link(cpuobj, "secure-tag-memory",
+                                             OBJECT(secure_tag_sysmem),
+                                             &error_abort);
+                }
+            } else if (kvm_enabled()) {
+                if (!kvm_arm_mte_supported()) {
+                    error_report("MTE requested, but not supported by KVM");
                     exit(1);
                 }
-
-                tag_sysmem = g_new(MemoryRegion, 1);
-                memory_region_init(tag_sysmem, OBJECT(machine),
-                                   "tag-memory", UINT64_MAX / 32);
-
-                if (vms->secure) {
-                    secure_tag_sysmem = g_new(MemoryRegion, 1);
-                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
-                                       "secure-tag-memory", UINT64_MAX / 32);
-
-                    /* As with ram, secure-tag takes precedence over tag. */
-                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
-                                                        tag_sysmem, -1);
-                }
-            }
-
-            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
-                                     &error_abort);
-            if (vms->secure) {
-                object_property_set_link(cpuobj, "secure-tag-memory",
-                                         OBJECT(secure_tag_sysmem),
-                                         &error_abort);
+                kvm_arm_enable_mte(cpuobj, &error_abort);
             }
         }
 
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
                                  qdev_prop_allow_set_link_before_realize,
                                  OBJ_PROP_LINK_STRONG);
         }
+        cpu->has_mte = true;
     }
 #endif
 }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         }
         if (cpu->tag_memory) {
             error_setg(errp,
-                       "Cannot enable %s when guest CPUs has MTE enabled",
+                       "Cannot enable %s when guest CPUs has tag memory enabled",
                        current_accel_name());
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
 #ifndef CONFIG_USER_ONLY
-    if (cpu->tag_memory == NULL && cpu_isar_feature(aa64_mte, cpu)) {
+    if (!cpu->has_mte && cpu_isar_feature(aa64_mte, cpu)) {
         /*
-         * Disable the MTE feature bits if we do not have tag-memory
-         * provided by the machine.
+         * Disable the MTE feature bits if we do not have the feature
+         * setup by the machine.
          */
         cpu->isar.id_aa64pfr1 =
             FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/boards.h"
 #include "hw/irq.h"
 #include "qemu/log.h"
+#include "migration/blocker.h"
 
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
     KVM_CAP_LAST_INFO
@@ -XXX,XX +XXX,XX @@ bool kvm_arch_cpu_check_are_resettable(void)
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
+
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    static bool tried_to_enable;
+    static bool succeeded_to_enable;
+    Error *mte_migration_blocker = NULL;
+    int ret;
+
+    if (!tried_to_enable) {
+        /*
+         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
+         * sense), and we only want a single migration blocker as well.
+         */
+        tried_to_enable = true;
+
+        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
+            return;
+        }
+
+        /* TODO: add proper migration support with MTE enabled */
+        error_setg(&mte_migration_blocker,
+                   "Live migration disabled due to MTE enabled");
+        if (migrate_add_blocker(mte_migration_blocker, errp)) {
+            error_free(mte_migration_blocker);
+            return;
+        }
+        succeeded_to_enable = true;
+    }
+    if (succeeded_to_enable) {
+        object_property_set_bool(cpuobj, "has_mte", true, NULL);
+    }
+}
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_steal_time_supported(void)
     return kvm_check_extension(kvm_state, KVM_CAP_STEAL_TIME);
 }
 
+bool kvm_arm_mte_supported(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
+}
+
 QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
 
 uint32_t kvm_arm_sve_get_vls(CPUState *cs)
-- 
2.34.1

From: Alex Bennée <alex.bennee@linaro.org>

The commit b3aa2f2128 (target/arm: provide stubs for more external
debug registers) was added to handle HyperV's unconditional usage of
the Debug Communications Channel. It turns out that Linux will
similarly break if you enable CONFIG_HVC_DCC "ARM JTAG DCC console".

Extend the set of registers we RAZ/WI to avoid this.
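
For context: an ARM_CP_CONST register entry with .resetvalue = 0, as
added below, is how QEMU expresses RAZ/WI behaviour. A minimal sketch
of the semantics (illustrative only, not the actual cpreg plumbing):

    /* RAZ: every read of the register returns zero. */
    static uint64_t raz_read(void)
    {
        return 0;
    }

    /* WI: writes to the register are simply discarded. */
    static void wi_write(uint64_t value)
    {
        (void)value;
    }
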
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230516104420.407912-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/debug_helper.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .access = PL0_R, .accessfn = access_tdcc,
       .type = ARM_CP_CONST, .resetvalue = 0 },
     /*
-     * OSDTRRX_EL1/OSDTRTX_EL1 are used for save and restore of DBGDTRRX_EL0.
-     * It is a component of the Debug Communications Channel, which is not implemented.
+     * These registers belong to the Debug Communications Channel,
+     * which is not implemented. However we implement RAZ/WI behaviour
+     * with trapping to prevent spurious SIGILLs if the guest OS does
+     * access them as the support cannot be probed for.
      */
     { .name = "OSDTRRX_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 2,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
       .access = PL1_RW, .accessfn = access_tdcc,
       .type = ARM_CP_CONST, .resetvalue = 0 },
+    /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */
+    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14,
+      .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0,
+      .access = PL0_RW, .accessfn = access_tdcc,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     /*
      * OSECCR_EL1 provides a mechanism for an operating system
      * to access the contents of EDECCR. EDECCR is not implemented though,
-- 
2.34.1

From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

The Bochs display is a normal PCI Express card, so it fits better in a
system with a PCI Express bus; VGA is a simple legacy PCI card.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230505120936.1097060-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/sbsa-ref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static void create_pcie(SBSAMachineState *sms)
         }
     }
 
-    pci_create_simple(pci->bus, -1, "VGA");
+    pci_create_simple(pci->bus, -1, "bochs-display");
 
     create_smmu(sms, pci->bus);
 }
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Split out all of the decode stuff from aarch64_tr_translate_insn.
Call it disas_a64_legacy to indicate it will be replaced.
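
The new function dispatches on insn bits [28:25], the A64 major opcode
field. For reference, a simplified sketch of QEMU's extract32() (the
real one, in include/qemu/bitops.h, also asserts its arguments are in
range):

    static inline uint32_t extract32(uint32_t value, int start, int length)
    {
        /* Shift the field down, then mask off the requested width. */
        return (value >> start) & (~0U >> (32 - length));
    }

so extract32(insn, 25, 4) in the code below yields those four bits.
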
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-2-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 82 ++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 38 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
     return false;
 }
 
+/* C3.1 A64 instruction index by encoding */
+static void disas_a64_legacy(DisasContext *s, uint32_t insn)
+{
+    switch (extract32(insn, 25, 4)) {
+    case 0x0:
+        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
+    case 0x1: case 0x3: /* UNALLOCATED */
+        unallocated_encoding(s);
+        break;
+    case 0x2:
+        if (!disas_sve(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
+    case 0x8: case 0x9: /* Data processing - immediate */
+        disas_data_proc_imm(s, insn);
+        break;
+    case 0xa: case 0xb: /* Branch, exception generation and system insns */
+        disas_b_exc_sys(s, insn);
+        break;
+    case 0x4:
+    case 0x6:
+    case 0xc:
+    case 0xe: /* Loads and stores */
+        disas_ldst(s, insn);
+        break;
+    case 0x5:
+    case 0xd: /* Data processing - register */
+        disas_data_proc_reg(s, insn);
+        break;
+    case 0x7:
+    case 0xf: /* Data processing - SIMD and floating point */
+        disas_data_proc_simd_fp(s, insn);
+        break;
+    default:
+        assert(FALSE); /* all 15 cases should be handled above */
+        break;
+    }
+}
+
 static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
                                           CPUState *cpu)
 {
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_sme_fa64(s, insn);
     }
 
-    switch (extract32(insn, 25, 4)) {
-    case 0x0:
-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x1: case 0x3: /* UNALLOCATED */
-        unallocated_encoding(s);
-        break;
-    case 0x2:
-        if (!disas_sve(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x8: case 0x9: /* Data processing - immediate */
-        disas_data_proc_imm(s, insn);
-        break;
-    case 0xa: case 0xb: /* Branch, exception generation and system insns */
-        disas_b_exc_sys(s, insn);
-        break;
-    case 0x4:
-    case 0x6:
-    case 0xc:
-    case 0xe: /* Loads and stores */
-        disas_ldst(s, insn);
-        break;
-    case 0x5:
-    case 0xd: /* Data processing - register */
-        disas_data_proc_reg(s, insn);
-        break;
-    case 0x7:
-    case 0xf: /* Data processing - SIMD and floating point */
-        disas_data_proc_simd_fp(s, insn);
-        break;
-    default:
-        assert(FALSE); /* all 15 cases should be handled above */
-        break;
-    }
+    disas_a64_legacy(s, insn);
 
     /*
      * After execution of most insns, btype is reset to 0.
-- 
2.34.1

The A64 translator uses a hand-written decoder for everything except
SVE or SME. It's fairly well structured, but it's becoming obvious
that it's still more painful to add instructions to than the A32
translator, because putting a new instruction into the right place in
a hand-written decoder is much harder than adding new instruction
patterns to a decodetree file.

As the first step in conversion to decodetree, create the skeleton of
the decodetree decoder; where it does not handle instructions we will
fall back to the legacy decoder (which will be for everything at the
moment, since there are no patterns in a64.decode).
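
The resulting shape of the dispatch: decodetree generates a disas_a64()
which returns false when no pattern in a64.decode matches, so the
translator falls back as in this sketch (the wrapper name here is
illustrative; the real call site is aarch64_tr_translate_insn() in the
diff below):

    static void translate_one(DisasContext *s, uint32_t insn)
    {
        if (!disas_a64(s, insn)) {
            /* No decodetree pattern matched: use the legacy decoder. */
            disas_a64_legacy(s, insn);
        }
    }
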
5
12
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-13-peter.maydell@linaro.org
15
Message-id: 20230512144106.3608981-3-peter.maydell@linaro.org
9
---
16
---
10
target/arm/helper-mve.h | 30 +++++++++++
17
target/arm/tcg/a64.decode | 20 ++++++++++++++++++++
11
target/arm/mve.decode | 28 ++++++++++
18
target/arm/tcg/translate-a64.c | 18 +++++++++++-------
12
target/arm/mve_helper.c | 104 +++++++++++++++++++++++++++++++++++++
19
target/arm/tcg/meson.build | 1 +
13
target/arm/translate-mve.c | 12 +++++
20
3 files changed, 32 insertions(+), 7 deletions(-)
14
4 files changed, 174 insertions(+)
21
create mode 100644 target/arm/tcg/a64.decode
15
22
16
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
23
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
24
new file mode 100644
25
index XXXXXXX..XXXXXXX
26
--- /dev/null
27
+++ b/target/arm/tcg/a64.decode
28
@@ -XXX,XX +XXX,XX @@
29
+# AArch64 A64 allowed instruction decoding
30
+#
31
+# Copyright (c) 2023 Linaro, Ltd
32
+#
33
+# This library is free software; you can redistribute it and/or
34
+# modify it under the terms of the GNU Lesser General Public
35
+# License as published by the Free Software Foundation; either
36
+# version 2.1 of the License, or (at your option) any later version.
37
+#
38
+# This library is distributed in the hope that it will be useful,
39
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
40
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
41
+# Lesser General Public License for more details.
42
+#
43
+# You should have received a copy of the GNU Lesser General Public
44
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
45
+
46
+#
47
+# This file is processed by scripts/decodetree.py
48
+#
49
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
17
index XXXXXXX..XXXXXXX 100644
50
index XXXXXXX..XXXXXXX 100644
18
--- a/target/arm/helper-mve.h
51
--- a/target/arm/tcg/translate-a64.c
19
+++ b/target/arm/helper-mve.h
52
+++ b/target/arm/tcg/translate-a64.c
20
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
53
@@ -XXX,XX +XXX,XX @@ enum a64_shift_type {
21
DEF_HELPER_FLAGS_4(mve_vrshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
54
A64_SHIFT_TYPE_ROR = 3
22
DEF_HELPER_FLAGS_4(mve_vrshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
55
};
23
DEF_HELPER_FLAGS_4(mve_vrshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
56
57
+/*
58
+ * Include the generated decoders.
59
+ */
24
+
60
+
25
+DEF_HELPER_FLAGS_4(mve_vqshrnb_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
61
+#include "decode-sme-fa64.c.inc"
26
+DEF_HELPER_FLAGS_4(mve_vqshrnb_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
62
+#include "decode-a64.c.inc"
27
+DEF_HELPER_FLAGS_4(mve_vqshrnt_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(mve_vqshrnt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+
63
+
30
+DEF_HELPER_FLAGS_4(mve_vqshrnb_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
64
/* Table based decoder typedefs - used when the relevant bits for decode
31
+DEF_HELPER_FLAGS_4(mve_vqshrnb_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
65
* are too awkwardly scattered across the instruction (eg SIMD).
32
+DEF_HELPER_FLAGS_4(mve_vqshrnt_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
66
*/
33
+DEF_HELPER_FLAGS_4(mve_vqshrnt_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
67
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_simd_fp(DisasContext *s, uint32_t insn)
34
+
35
+DEF_HELPER_FLAGS_4(mve_vqshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
+DEF_HELPER_FLAGS_4(mve_vqshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_4(mve_vqshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_4(mve_vqshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+
40
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
44
+
45
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_4(mve_vqrshrnb_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_4(mve_vqrshrnt_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
49
+
50
+DEF_HELPER_FLAGS_4(mve_vqrshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_4(mve_vqrshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
53
+DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
54
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
55
index XXXXXXX..XXXXXXX 100644
56
--- a/target/arm/mve.decode
57
+++ b/target/arm/mve.decode
58
@@ -XXX,XX +XXX,XX @@ VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
59
VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
60
VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
61
VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
62
+
63
+VQSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_b
64
+VQSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_h
65
+VQSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_b
66
+VQSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_h
67
+VQSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_b
68
+VQSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 0 @2_shr_h
69
+VQSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_b
70
+VQSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 0 @2_shr_h
71
+
72
+VQSHRUNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
73
+VQSHRUNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
74
+VQSHRUNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
75
+VQSHRUNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
76
+
77
+VQRSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_b
78
+VQRSHRNB_S 111 0 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_h
79
+VQRSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_b
80
+VQRSHRNT_S 111 0 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_h
81
+VQRSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_b
82
+VQRSHRNB_U 111 1 1110 1 . ... ... ... 0 1111 0 1 . 0 ... 1 @2_shr_h
83
+VQRSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_b
84
+VQRSHRNT_U 111 1 1110 1 . ... ... ... 1 1111 0 1 . 0 ... 1 @2_shr_h
85
+
86
+VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
87
+VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
88
+VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
89
+VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
90
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
91
index XXXXXXX..XXXXXXX 100644
92
--- a/target/arm/mve_helper.c
93
+++ b/target/arm/mve_helper.c
94
@@ -XXX,XX +XXX,XX @@ static inline uint64_t do_urshr(uint64_t x, unsigned sh)
95
}
68
}
96
}
69
}
97
70
98
+static inline int64_t do_srshr(int64_t x, unsigned sh)
71
-/*
99
+{
72
- * Include the generated SME FA64 decoder.
100
+ if (likely(sh < 64)) {
73
- */
101
+ return (x >> sh) + ((x >> (sh - 1)) & 1);
74
-
102
+ } else {
75
-#include "decode-sme-fa64.c.inc"
103
+ /* Rounding the sign bit always produces 0. */
76
-
104
+ return 0;
77
static bool trans_OK(DisasContext *s, arg_OK *a)
78
{
79
return true;
80
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
81
disas_sme_fa64(s, insn);
82
}
83
84
- disas_a64_legacy(s, insn);
85
+
86
+ if (!disas_a64(s, insn)) {
87
+ disas_a64_legacy(s, insn);
105
+ }
88
+ }
106
+}
89
107
+
90
/*
108
DO_VSHRN_ALL(vshrn, DO_SHR)
91
* After execution of most insns, btype is reset to 0.
109
DO_VSHRN_ALL(vrshrn, do_urshr)
92
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
110
+
111
+static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max,
112
+ bool *satp)
113
+{
114
+ if (val > max) {
115
+ *satp = true;
116
+ return max;
117
+ } else if (val < min) {
118
+ *satp = true;
119
+ return min;
120
+ } else {
121
+ return val;
122
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ gen = [
   decodetree.process('a32-uncond.decode', extra_args: '--static-decode=disas_a32_uncond'),
   decodetree.process('t32.decode', extra_args: '--static-decode=disas_t32'),
   decodetree.process('t16.decode', extra_args: ['-w', '16', '--static-decode=disas_t16']),
+  decodetree.process('a64.decode', extra_args: ['--static-decode=disas_a64']),
 ]
 
 arm_ss.add(gen)
--
2.34.1

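As background for this build rule: scripts/decodetree.py generates the
decoder at build time, and --static-decode gives the generated entry
point internal linkage. The exact generated body depends on the patterns
in a64.decode, so the following is only an assumed sketch of its shape,
not the real output:

    /* Sketch (assumed shape) of the decodetree-generated entry point. */
    static bool disas_a64(DisasContext *s, uint32_t insn)
    {
        /* ...match insn against the a64.decode patterns; on a hit,
         * fill in the argument struct and call the trans_XXX() hook...
         */
        return false; /* no pattern matched; caller can fall back */
    }
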
The SVE and SME decode is already done by decodetree. Pull the calls
to these decoders out of the legacy decoder. This doesn't change
behaviour because all the patterns in sve.decode and sme.decode
already require the bits that the legacy decoder is decoding to have
the correct values.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-4-peter.maydell@linaro.org
---
 target/arm/tcg/translate-a64.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
 static void disas_a64_legacy(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 4)) {
-    case 0x0:
-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x1: case 0x3: /* UNALLOCATED */
-        unallocated_encoding(s);
-        break;
-    case 0x2:
-        if (!disas_sve(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
     case 0x8: case 0x9: /* Data processing - immediate */
         disas_data_proc_imm(s, insn);
         break;
@@ -XXX,XX +XXX,XX @@ static void disas_a64_legacy(DisasContext *s, uint32_t insn)
         disas_data_proc_simd_fp(s, insn);
         break;
     default:
-        assert(FALSE); /* all 15 cases should be handled above */
+        unallocated_encoding(s);
         break;
     }
 }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_sme_fa64(s, insn);
     }
 
-    if (!disas_a64(s, insn)) {
+    if (!disas_a64(s, insn) &&
+        !disas_sme(s, insn) &&
+        !disas_sve(s, insn)) {
         disas_a64_legacy(s, insn);
     }
 
--
2.34.1

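A note on why this is behaviour-preserving: every pattern in sve.decode
and sme.decode already pins the instruction bits that the legacy switch
was testing, so calling those decoders unconditionally can only match
the same encodings. For illustration only (the pattern below is an
assumed shape, not quoted from sve.decode):

    # An SVE pattern fixes insn[28:25] == 0b0010 in its opcode bits,
    # e.g. (assumed shape, for illustration):
    #   ADD_zzz   00000100 .. 1 ..... 000000 ..... .....
    # so disas_sve() simply returns false for any non-SVE instruction
    # word and the fallback to disas_a64_legacy() is unaffected.
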
From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADR and ADRP instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-5-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 13 ++++++++++++
 target/arm/tcg/translate-a64.c | 38 +++++++++++++---------------------
 2 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 #
 # This file is processed by scripts/decodetree.py
 #
+
+&ri             rd imm
+
+
+### Data Processing - Immediate
+
+# PC-rel addressing
+
+%imm_pcrel      5:s19 29:2
+@pcrel          . .. ..... ................... rd:5 &ri imm=%imm_pcrel
+
+ADR             0 .. 10000 ................... ..... @pcrel
+ADRP            1 .. 10000 ................... ..... @pcrel
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
     }
 }
 
-/* PC-rel. addressing
- *   31  30   29 28       24 23                5 4    0
- * +----+-------+-----------+-------------------+------+
- * | op | immlo | 1 0 0 0 0 |       immhi       |  Rd  |
- * +----+-------+-----------+-------------------+------+
+/*
+ * PC-rel. addressing
  */
-static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
+
+static bool trans_ADR(DisasContext *s, arg_ri *a)
 {
-    unsigned int page, rd;
-    int64_t offset;
+    gen_pc_plus_diff(s, cpu_reg(s, a->rd), a->imm);
+    return true;
+}
 
-    page = extract32(insn, 31, 1);
-    /* SignExtend(immhi:immlo) -> offset */
-    offset = sextract64(insn, 5, 19);
-    offset = offset << 2 | extract32(insn, 29, 2);
-    rd = extract32(insn, 0, 5);
+static bool trans_ADRP(DisasContext *s, arg_ri *a)
+{
+    int64_t offset = (int64_t)a->imm << 12;
 
-    if (page) {
-        /* ADRP (page based) */
-        offset <<= 12;
-        /* The page offset is ok for CF_PCREL. */
-        offset -= s->pc_curr & 0xfff;
-    }
-
-    gen_pc_plus_diff(s, cpu_reg(s, rd), offset);
+    /* The page offset is ok for CF_PCREL. */
+    offset -= s->pc_curr & 0xfff;
+    gen_pc_plus_diff(s, cpu_reg(s, a->rd), offset);
+    return true;
 }
 
 /*
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x20: case 0x21: /* PC-rel. addressing */
-        disas_pc_rel_adr(s, insn);
-        break;
     case 0x22: /* Add/subtract (immediate) */
         disas_add_sub_imm(s, insn);
         break;
--
2.34.1

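For readers unfamiliar with the two instructions: ADR adds a byte
offset to the PC, while ADRP works in 4KB pages, which is what the
"offset -= s->pc_curr & 0xfff" above implements as a diff against the
current PC. A minimal standalone sketch of the architectural results
(names are ours, not QEMU's):

    #include <stdint.h>
    /* 'imm' is the sign-extended immhi:immlo immediate. */
    static uint64_t adr_target(uint64_t pc, int64_t imm)
    {
        return pc + imm;                       /* ADR: byte-relative */
    }
    static uint64_t adrp_target(uint64_t pc, int64_t imm)
    {
        /* ADRP: page-relative; equals pc + (imm << 12) - (pc & 0xfff) */
        return (pc & ~0xfffULL) + (imm << 12);
    }
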
From: Richard Henderson <richard.henderson@linaro.org>

Split out specific 32-bit and 64-bit functions.
These carry the same signature as tcg_gen_add_i64,
and so will be easier to pass as callbacks.

Retain gen_add_CC and gen_sub_CC during conversion.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-6-peter.maydell@linaro.org
[PMM: rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 149 +++++++++++++++++++--------------
 1 file changed, 84 insertions(+), 65 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_logic_CC(int sf, TCGv_i64 result)
 }
 
 /* dest = T0 + T1; compute C, N, V and Z flags */
+static void gen_add64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    TCGv_i64 result, flag, tmp;
+    result = tcg_temp_new_i64();
+    flag = tcg_temp_new_i64();
+    tmp = tcg_temp_new_i64();
+
+    tcg_gen_movi_i64(tmp, 0);
+    tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
+
+    tcg_gen_extrl_i64_i32(cpu_CF, flag);
+
+    gen_set_NZ64(result);
+
+    tcg_gen_xor_i64(flag, result, t0);
+    tcg_gen_xor_i64(tmp, t0, t1);
+    tcg_gen_andc_i64(flag, flag, tmp);
+    tcg_gen_extrh_i64_i32(cpu_VF, flag);
+
+    tcg_gen_mov_i64(dest, result);
+}
+
+static void gen_add32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    TCGv_i32 t0_32 = tcg_temp_new_i32();
+    TCGv_i32 t1_32 = tcg_temp_new_i32();
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    tcg_gen_movi_i32(tmp, 0);
+    tcg_gen_extrl_i64_i32(t0_32, t0);
+    tcg_gen_extrl_i64_i32(t1_32, t1);
+    tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
+    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
+    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
+    tcg_gen_xor_i32(tmp, t0_32, t1_32);
+    tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
+    tcg_gen_extu_i32_i64(dest, cpu_NF);
+}
+
 static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 {
     if (sf) {
-        TCGv_i64 result, flag, tmp;
-        result = tcg_temp_new_i64();
-        flag = tcg_temp_new_i64();
-        tmp = tcg_temp_new_i64();
-
-        tcg_gen_movi_i64(tmp, 0);
-        tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
-
-        tcg_gen_extrl_i64_i32(cpu_CF, flag);
-
-        gen_set_NZ64(result);
-
-        tcg_gen_xor_i64(flag, result, t0);
-        tcg_gen_xor_i64(tmp, t0, t1);
-        tcg_gen_andc_i64(flag, flag, tmp);
-        tcg_gen_extrh_i64_i32(cpu_VF, flag);
-
-        tcg_gen_mov_i64(dest, result);
+        gen_add64_CC(dest, t0, t1);
     } else {
-        /* 32 bit arithmetic */
-        TCGv_i32 t0_32 = tcg_temp_new_i32();
-        TCGv_i32 t1_32 = tcg_temp_new_i32();
-        TCGv_i32 tmp = tcg_temp_new_i32();
-
-        tcg_gen_movi_i32(tmp, 0);
-        tcg_gen_extrl_i64_i32(t0_32, t0);
-        tcg_gen_extrl_i64_i32(t1_32, t1);
-        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
-        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
-        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
-        tcg_gen_xor_i32(tmp, t0_32, t1_32);
-        tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
-        tcg_gen_extu_i32_i64(dest, cpu_NF);
+        gen_add32_CC(dest, t0, t1);
     }
 }
 
 /* dest = T0 - T1; compute C, N, V and Z flags */
+static void gen_sub64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    /* 64 bit arithmetic */
+    TCGv_i64 result, flag, tmp;
+
+    result = tcg_temp_new_i64();
+    flag = tcg_temp_new_i64();
+    tcg_gen_sub_i64(result, t0, t1);
+
+    gen_set_NZ64(result);
+
+    tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
+    tcg_gen_extrl_i64_i32(cpu_CF, flag);
+
+    tcg_gen_xor_i64(flag, result, t0);
+    tmp = tcg_temp_new_i64();
+    tcg_gen_xor_i64(tmp, t0, t1);
+    tcg_gen_and_i64(flag, flag, tmp);
+    tcg_gen_extrh_i64_i32(cpu_VF, flag);
+    tcg_gen_mov_i64(dest, result);
+}
+
+static void gen_sub32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    /* 32 bit arithmetic */
+    TCGv_i32 t0_32 = tcg_temp_new_i32();
+    TCGv_i32 t1_32 = tcg_temp_new_i32();
+    TCGv_i32 tmp;
+
+    tcg_gen_extrl_i64_i32(t0_32, t0);
+    tcg_gen_extrl_i64_i32(t1_32, t1);
+    tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
+    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
+    tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
+    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
+    tmp = tcg_temp_new_i32();
+    tcg_gen_xor_i32(tmp, t0_32, t1_32);
+    tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
+    tcg_gen_extu_i32_i64(dest, cpu_NF);
+}
+
 static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 {
     if (sf) {
-        /* 64 bit arithmetic */
-        TCGv_i64 result, flag, tmp;
-
-        result = tcg_temp_new_i64();
-        flag = tcg_temp_new_i64();
-        tcg_gen_sub_i64(result, t0, t1);
-
-        gen_set_NZ64(result);
-
-        tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
-        tcg_gen_extrl_i64_i32(cpu_CF, flag);
-
-        tcg_gen_xor_i64(flag, result, t0);
-        tmp = tcg_temp_new_i64();
-        tcg_gen_xor_i64(tmp, t0, t1);
-        tcg_gen_and_i64(flag, flag, tmp);
-        tcg_gen_extrh_i64_i32(cpu_VF, flag);
-        tcg_gen_mov_i64(dest, result);
+        gen_sub64_CC(dest, t0, t1);
     } else {
-        /* 32 bit arithmetic */
-        TCGv_i32 t0_32 = tcg_temp_new_i32();
-        TCGv_i32 t1_32 = tcg_temp_new_i32();
-        TCGv_i32 tmp;
-
-        tcg_gen_extrl_i64_i32(t0_32, t0);
-        tcg_gen_extrl_i64_i32(t1_32, t1);
-        tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
-        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
-        tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
-        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, t0_32, t1_32);
-        tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
-        tcg_gen_extu_i32_i64(dest, cpu_NF);
+        gen_sub32_CC(dest, t0, t1);
     }
 }
 
--
2.34.1

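As a plain-C cross-check of what gen_sub32_CC emits at translate time,
the architectural NZCV flags for a 32-bit SUBS can be computed like
this (our own sketch, not QEMU code):

    #include <stdint.h>
    /* Returns the NZCV nibble for a 32-bit SUBS of a - b. */
    static unsigned subs32_flags(uint32_t a, uint32_t b)
    {
        uint32_t r = a - b;
        unsigned n = r >> 31;
        unsigned z = (r == 0);
        unsigned c = (a >= b);                  /* carry = no borrow */
        unsigned v = ((a ^ b) & (a ^ r)) >> 31; /* signed overflow */
        return (n << 3) | (z << 2) | (c << 1) | v;
    }

The (a >= b) test is exactly the TCG_COND_GEU setcond above, and the
(a ^ b) & (a ^ r) expression mirrors the xor/and pair used for cpu_VF.
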
From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADD and SUB (immediate) instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-7-peter.maydell@linaro.org
[PMM: Rebased; adjusted to use translate.h's TRANS macro]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate.h     |  5 +++
 target/arm/tcg/a64.decode      | 17 ++++++++
 target/arm/tcg/translate-a64.c | 73 ++++++++++------------------------
 3 files changed, 42 insertions(+), 53 deletions(-)

diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ static inline int rsub_8(DisasContext *s, int x)
     return 8 - x;
 }
 
+static inline int shl_12(DisasContext *s, int x)
+{
+    return x << 12;
+}
+
 static inline int neon_3same_fp_size(DisasContext *s, int x)
 {
     /* Convert 0==fp32, 1==fp16 into a MO_* value */
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 #
 
 &ri             rd imm
+&rri_sf         rd rn imm sf
 
 
 ### Data Processing - Immediate
@@ -XXX,XX +XXX,XX @@
 
 ADR             0 .. 10000 ................... ..... @pcrel
 ADRP            1 .. 10000 ................... ..... @pcrel
+
+# Add/subtract (immediate)
+
+%imm12_sh12     10:12 !function=shl_12
+@addsub_imm     sf:1 .. ...... . imm:12 rn:5 rd:5
+@addsub_imm12   sf:1 .. ...... . ............ rn:5 rd:5 imm=%imm12_sh12
+
+ADD_i           . 00 100010 0 ............ ..... ..... @addsub_imm
+ADD_i           . 00 100010 1 ............ ..... ..... @addsub_imm12
+ADDS_i          . 01 100010 0 ............ ..... ..... @addsub_imm
+ADDS_i          . 01 100010 1 ............ ..... ..... @addsub_imm12
+
+SUB_i           . 10 100010 0 ............ ..... ..... @addsub_imm
+SUB_i           . 10 100010 1 ............ ..... ..... @addsub_imm12
+SUBS_i          . 11 100010 0 ............ ..... ..... @addsub_imm
+SUBS_i          . 11 100010 1 ............ ..... ..... @addsub_imm12
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
     }
 }
 
+typedef void ArithTwoOp(TCGv_i64, TCGv_i64, TCGv_i64);
+
+static bool gen_rri(DisasContext *s, arg_rri_sf *a,
+                    bool rd_sp, bool rn_sp, ArithTwoOp *fn)
+{
+    TCGv_i64 tcg_rn = rn_sp ? cpu_reg_sp(s, a->rn) : cpu_reg(s, a->rn);
+    TCGv_i64 tcg_rd = rd_sp ? cpu_reg_sp(s, a->rd) : cpu_reg(s, a->rd);
+    TCGv_i64 tcg_imm = tcg_constant_i64(a->imm);
+
+    fn(tcg_rd, tcg_rn, tcg_imm);
+    if (!a->sf) {
+        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+    }
+    return true;
+}
+
 /*
  * PC-rel. addressing
  */
@@ -XXX,XX +XXX,XX @@ static bool trans_ADRP(DisasContext *s, arg_ri *a)
 
 /*
  * Add/subtract (immediate)
- *
- *  31 30 29 28         23 22 21         10 9   5 4   0
- * +--+--+--+-------------+--+-------------+-----+-----+
- * |sf|op| S| 1 0 0 0 1 0 |sh|    imm12    |  Rn | Rd  |
- * +--+--+--+-------------+--+-------------+-----+-----+
- *
- *    sf: 0 -> 32bit, 1 -> 64bit
- *    op: 0 -> add  , 1 -> sub
- *     S: 1 -> set flags
- *    sh: 1 -> LSL imm by 12
  */
-static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    uint64_t imm = extract32(insn, 10, 12);
-    bool shift = extract32(insn, 22, 1);
-    bool setflags = extract32(insn, 29, 1);
-    bool sub_op = extract32(insn, 30, 1);
-    bool is_64bit = extract32(insn, 31, 1);
-
-    TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
-    TCGv_i64 tcg_rd = setflags ? cpu_reg(s, rd) : cpu_reg_sp(s, rd);
-    TCGv_i64 tcg_result;
-
-    if (shift) {
-        imm <<= 12;
-    }
-
-    tcg_result = tcg_temp_new_i64();
-    if (!setflags) {
-        if (sub_op) {
-            tcg_gen_subi_i64(tcg_result, tcg_rn, imm);
-        } else {
-            tcg_gen_addi_i64(tcg_result, tcg_rn, imm);
-        }
-    } else {
-        TCGv_i64 tcg_imm = tcg_constant_i64(imm);
-        if (sub_op) {
-            gen_sub_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
-        } else {
-            gen_add_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
-        }
-    }
-
-    if (is_64bit) {
-        tcg_gen_mov_i64(tcg_rd, tcg_result);
-    } else {
-        tcg_gen_ext32u_i64(tcg_rd, tcg_result);
-    }
-}
+TRANS(ADD_i, gen_rri, a, 1, 1, tcg_gen_add_i64)
+TRANS(SUB_i, gen_rri, a, 1, 1, tcg_gen_sub_i64)
+TRANS(ADDS_i, gen_rri, a, 0, 1, a->sf ? gen_add64_CC : gen_add32_CC)
+TRANS(SUBS_i, gen_rri, a, 0, 1, a->sf ? gen_sub64_CC : gen_sub32_CC)
 
 /*
  * Add/subtract (immediate, with tags)
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x22: /* Add/subtract (immediate) */
-        disas_add_sub_imm(s, insn);
-        break;
     case 0x23: /* Add/subtract (immediate, with tags) */
         disas_add_sub_imm_with_tags(s, insn);
         break;
--
2.34.1

142
diff view generated by jsdifflib
1
The initial implementation of the MVE VRMLALDAVH and VRMLSLDAVH
1
From: Richard Henderson <richard.henderson@linaro.org>
2
insns had some bugs:
3
* the 32x32 multiply of elements was being done as 32x32->32,
4
not 32x32->64
5
* we were incorrectly maintaining the accumulator in its full
6
72-bit form across all 4 beats of the insn; in the pseudocode
7
it is squashed back into the 64 bits of the RdaHi:RdaLo
8
registers after each beat
9
2
10
In particular, fixing the second of these allows us to recast
3
Convert the ADDG and SUBG (immediate) instructions.
11
the implementation to avoid 128-bit arithmetic entirely.
12
4
13
Since the element size here is always 4, we can also drop the
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
14
parameterization of ESIZE to make the code a little more readable.
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20230512144106.3608981-8-peter.maydell@linaro.org
9
[PMM: Rebased; use TRANS_FEAT()]
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
---
13
target/arm/tcg/a64.decode | 8 +++++++
14
target/arm/tcg/translate-a64.c | 38 ++++++++++------------------------
15
2 files changed, 19 insertions(+), 27 deletions(-)
15
16
16
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
17
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
17
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
18
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
19
Message-id: 20210628135835.6690-3-peter.maydell@linaro.org
20
---
21
target/arm/mve_helper.c | 38 +++++++++++++++++++++-----------------
22
1 file changed, 21 insertions(+), 17 deletions(-)
23
24
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
25
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/mve_helper.c
19
--- a/target/arm/tcg/a64.decode
27
+++ b/target/arm/mve_helper.c
20
+++ b/target/arm/tcg/a64.decode
28
@@ -XXX,XX +XXX,XX @@
21
@@ -XXX,XX +XXX,XX @@ SUB_i . 10 100010 0 ............ ..... ..... @addsub_imm
22
SUB_i . 10 100010 1 ............ ..... ..... @addsub_imm12
23
SUBS_i . 11 100010 0 ............ ..... ..... @addsub_imm
24
SUBS_i . 11 100010 1 ............ ..... ..... @addsub_imm12
25
+
26
+# Add/subtract (immediate with tags)
27
+
28
+&rri_tag rd rn uimm6 uimm4
29
+@addsub_imm_tag . .. ...... . uimm6:6 .. uimm4:4 rn:5 rd:5 &rri_tag
30
+
31
+ADDG_i 1 00 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
32
+SUBG_i 1 10 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
33
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/tcg/translate-a64.c
36
+++ b/target/arm/tcg/translate-a64.c
37
@@ -XXX,XX +XXX,XX @@ TRANS(SUBS_i, gen_rri, a, 0, 1, a->sf ? gen_sub64_CC : gen_sub32_CC)
38
39
/*
40
* Add/subtract (immediate, with tags)
41
- *
42
- * 31 30 29 28 23 22 21 16 14 10 9 5 4 0
43
- * +--+--+--+-------------+--+---------+--+-------+-----+-----+
44
- * |sf|op| S| 1 0 0 0 1 1 |o2| uimm6 |o3| uimm4 | Rn | Rd |
45
- * +--+--+--+-------------+--+---------+--+-------+-----+-----+
46
- *
47
- * op: 0 -> add, 1 -> sub
29
*/
48
*/
30
49
-static void disas_add_sub_imm_with_tags(DisasContext *s, uint32_t insn)
31
#include "qemu/osdep.h"
50
+
32
-#include "qemu/int128.h"
51
+static bool gen_add_sub_imm_with_tags(DisasContext *s, arg_rri_tag *a,
33
#include "cpu.h"
52
+ bool sub_op)
34
#include "internals.h"
53
{
35
#include "vec_internal.h"
54
- int rd = extract32(insn, 0, 5);
36
@@ -XXX,XX +XXX,XX @@ DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=)
55
- int rn = extract32(insn, 5, 5);
37
DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=)
56
- int uimm4 = extract32(insn, 10, 4);
38
57
- int uimm6 = extract32(insn, 16, 6);
39
/*
58
- bool sub_op = extract32(insn, 30, 1);
40
- * Rounding multiply add long dual accumulate high: we must keep
59
TCGv_i64 tcg_rn, tcg_rd;
41
- * a 72-bit internal accumulator value and return the top 64 bits.
60
int imm;
42
+ * Rounding multiply add long dual accumulate high. In the pseudocode
61
43
+ * this is implemented with a 72-bit internal accumulator value of which
62
- /* Test all of sf=1, S=0, o2=0, o3=0. */
44
+ * the top 64 bits are returned. We optimize this to avoid having to
63
- if ((insn & 0xa040c000u) != 0x80000000u ||
45
+ * use 128-bit arithmetic -- we can do this because the 74-bit accumulator
64
- !dc_isar_feature(aa64_mte_insn_reg, s)) {
46
+ * is squashed back into 64-bits after each beat.
65
- unallocated_encoding(s);
47
*/
66
- return;
48
-#define DO_LDAVH(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC, TO128) \
67
- }
49
+#define DO_LDAVH(OP, TYPE, LTYPE, XCHG, SUB) \
68
-
50
uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
69
- imm = uimm6 << LOG2_TAG_GRANULE;
51
void *vm, uint64_t a) \
70
+ imm = a->uimm6 << LOG2_TAG_GRANULE;
52
{ \
71
if (sub_op) {
53
uint16_t mask = mve_element_mask(env); \
72
imm = -imm;
54
unsigned e; \
55
TYPE *n = vn, *m = vm; \
56
- Int128 acc = int128_lshift(TO128(a), 8); \
57
- for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
58
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
59
if (mask & 1) { \
60
+ LTYPE mul; \
61
if (e & 1) { \
62
- acc = ODDACC(acc, TO128(n[H##ESIZE(e - 1 * XCHG)] * \
63
- m[H##ESIZE(e)])); \
64
+ mul = (LTYPE)n[H4(e - 1 * XCHG)] * m[H4(e)]; \
65
+ if (SUB) { \
66
+ mul = -mul; \
67
+ } \
68
} else { \
69
- acc = EVENACC(acc, TO128(n[H##ESIZE(e + 1 * XCHG)] * \
70
- m[H##ESIZE(e)])); \
71
+ mul = (LTYPE)n[H4(e + 1 * XCHG)] * m[H4(e)]; \
72
} \
73
- acc = int128_add(acc, int128_make64(1 << 7)); \
74
+ mul = (mul >> 8) + ((mul >> 7) & 1); \
75
+ a += mul; \
76
} \
77
} \
78
mve_advance_vpt(env); \
79
- return int128_getlo(int128_rshift(acc, 8)); \
80
+ return a; \
81
}
73
}
82
74
83
-DO_LDAVH(vrmlaldavhsw, 4, int32_t, false, int128_add, int128_add, int128_makes64)
75
- tcg_rn = cpu_reg_sp(s, rn);
84
-DO_LDAVH(vrmlaldavhxsw, 4, int32_t, true, int128_add, int128_add, int128_makes64)
76
- tcg_rd = cpu_reg_sp(s, rd);
85
+DO_LDAVH(vrmlaldavhsw, int32_t, int64_t, false, false)
77
+ tcg_rn = cpu_reg_sp(s, a->rn);
86
+DO_LDAVH(vrmlaldavhxsw, int32_t, int64_t, true, false)
78
+ tcg_rd = cpu_reg_sp(s, a->rd);
87
79
88
-DO_LDAVH(vrmlaldavhuw, 4, uint32_t, false, int128_add, int128_add, int128_make64)
80
if (s->ata) {
89
+DO_LDAVH(vrmlaldavhuw, uint32_t, uint64_t, false, false)
81
gen_helper_addsubg(tcg_rd, cpu_env, tcg_rn,
90
82
tcg_constant_i32(imm),
91
-DO_LDAVH(vrmlsldavhsw, 4, int32_t, false, int128_add, int128_sub, int128_makes64)
83
- tcg_constant_i32(uimm4));
92
-DO_LDAVH(vrmlsldavhxsw, 4, int32_t, true, int128_add, int128_sub, int128_makes64)
84
+ tcg_constant_i32(a->uimm4));
93
+DO_LDAVH(vrmlsldavhsw, int32_t, int64_t, false, true)
85
} else {
94
+DO_LDAVH(vrmlsldavhxsw, int32_t, int64_t, true, true)
86
tcg_gen_addi_i64(tcg_rd, tcg_rn, imm);
95
87
gen_address_with_allocation_tag0(tcg_rd, tcg_rd);
96
/* Vector add across vector */
88
}
97
#define DO_VADDV(OP, ESIZE, TYPE) \
89
+ return true;
90
}
91
92
+TRANS_FEAT(ADDG_i, aa64_mte_insn_reg, gen_add_sub_imm_with_tags, a, false)
93
+TRANS_FEAT(SUBG_i, aa64_mte_insn_reg, gen_add_sub_imm_with_tags, a, true)
94
+
95
/* The input should be a value in the bottom e bits (with higher
96
* bits zero); returns that value replicated into every element
97
* of size e in a 64 bit integer.
98
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
99
static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
100
{
101
switch (extract32(insn, 23, 6)) {
102
- case 0x23: /* Add/subtract (immediate, with tags) */
103
- disas_add_sub_imm_with_tags(s, insn);
104
- break;
105
case 0x24: /* Logical (immediate) */
106
disas_logic_imm(s, insn);
107
break;
98
--
108
--
99
2.20.1
109
2.34.1
100
101
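For reference on the immediate scaling: MTE tags cover 16-byte granules
(LOG2_TAG_GRANULE is 4), so the 6-bit uimm6 field encodes a byte offset
of 0..1008 in 16-byte steps. A minimal standalone sketch of the
arithmetic (our own names):

    /* Sketch: byte offset encoded by ADDG/SUBG's uimm6 field. */
    #define LOG2_TAG_GRANULE 4                  /* 16-byte tag granule */
    static int addg_offset(int uimm6, int sub_op)
    {
        int imm = uimm6 << LOG2_TAG_GRANULE;    /* 0, 16, ..., 1008 */
        return sub_op ? -imm : imm;
    }
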
From: Richard Henderson <richard.henderson@linaro.org>

Use the bitops.h macro rather than rolling our own here.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-9-peter.maydell@linaro.org
---
 target/arm/tcg/translate-a64.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
     return mask;
 }
 
-/* Return a value with the bottom len bits set (where 0 < len <= 64) */
-static inline uint64_t bitmask64(unsigned int length)
-{
-    assert(length > 0 && length <= 64);
-    return ~0ULL >> (64 - length);
-}
-
 /* Simplified variant of pseudocode DecodeBitMasks() for the case where we
  * only require the wmask. Returns false if the imms/immr/immn are a reserved
  * value (ie should cause a guest UNDEF exception), and true if they are
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
     /* Create the value of one element: s+1 set bits rotated
      * by r within the element (which is e bits wide)...
      */
-    mask = bitmask64(s + 1);
+    mask = MAKE_64BIT_MASK(0, s + 1);
    if (r) {
         mask = (mask >> r) | (mask << (e - r));
-        mask &= bitmask64(e);
+        mask &= MAKE_64BIT_MASK(0, e);
     }
     /* ...then replicate the element over the whole 64 bit value */
     mask = bitfield_replicate(mask, e);
--
2.34.1

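For readers who don't have bitops.h to hand, include/qemu/bitops.h
defines the macro essentially as below (quoted from memory; the header
is authoritative):

    /* MAKE_64BIT_MASK(shift, length): 'length' set bits from 'shift' up */
    #define MAKE_64BIT_MASK(shift, length) \
        (((~0ULL) >> (64 - (length))) << (shift))

Unlike the removed bitmask64(), it also takes a start position, which
is why both call sites pass an explicit 0 here.
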
From: Richard Henderson <richard.henderson@linaro.org>

Convert the AND, ORR, EOR, ANDS (immediate) instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-10-peter.maydell@linaro.org
[PMM: rebased]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 15 ++++++
 target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
 2 files changed, 44 insertions(+), 65 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SUBS_i          . 11 100010 1 ............ ..... ..... @addsub_imm12
 
 ADDG_i          1 00 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
 SUBG_i          1 10 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
+
+# Logical (immediate)
+
+&rri_log        rd rn sf dbm
+@logic_imm_64   1 .. ...... dbm:13 rn:5 rd:5 &rri_log sf=1
+@logic_imm_32   0 .. ...... 0 dbm:12 rn:5 rd:5 &rri_log sf=0
+
+AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_64
+AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_32
+ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_64
+ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_32
+EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_64
+EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_32
+ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_64
+ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_32
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
     return mask;
 }
 
-/* Simplified variant of pseudocode DecodeBitMasks() for the case where we
+/*
+ * Logical (immediate)
+ */
+
+/*
+ * Simplified variant of pseudocode DecodeBitMasks() for the case where we
  * only require the wmask. Returns false if the imms/immr/immn are a reserved
  * value (ie should cause a guest UNDEF exception), and true if they are
  * valid, in which case the decoded bit pattern is written to result.
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
     return true;
 }
 
-/* Logical (immediate)
- *   31  30 29 28       23 22  21  16 15  10 9    5 4    0
- * +----+-----+-------------+---+------+------+------+------+
- * | sf | opc | 1 0 0 1 0 0 | N | immr | imms |  Rn  |  Rd  |
- * +----+-----+-------------+---+------+------+------+------+
- */
-static void disas_logic_imm(DisasContext *s, uint32_t insn)
+static bool gen_rri_log(DisasContext *s, arg_rri_log *a, bool set_cc,
+                        void (*fn)(TCGv_i64, TCGv_i64, int64_t))
 {
-    unsigned int sf, opc, is_n, immr, imms, rn, rd;
     TCGv_i64 tcg_rd, tcg_rn;
-    uint64_t wmask;
-    bool is_and = false;
+    uint64_t imm;
 
-    sf = extract32(insn, 31, 1);
-    opc = extract32(insn, 29, 2);
-    is_n = extract32(insn, 22, 1);
-    immr = extract32(insn, 16, 6);
-    imms = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    rd = extract32(insn, 0, 5);
-
-    if (!sf && is_n) {
-        unallocated_encoding(s);
-        return;
+    /* Some immediate field values are reserved. */
+    if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1),
+                                extract32(a->dbm, 0, 6),
+                                extract32(a->dbm, 6, 6))) {
+        return false;
+    }
+    if (!a->sf) {
+        imm &= 0xffffffffull;
     }
 
-    if (opc == 0x3) { /* ANDS */
-        tcg_rd = cpu_reg(s, rd);
-    } else {
-        tcg_rd = cpu_reg_sp(s, rd);
-    }
-    tcg_rn = cpu_reg(s, rn);
+    tcg_rd = set_cc ? cpu_reg(s, a->rd) : cpu_reg_sp(s, a->rd);
+    tcg_rn = cpu_reg(s, a->rn);
 
-    if (!logic_imm_decode_wmask(&wmask, is_n, imms, immr)) {
-        /* some immediate field values are reserved */
-        unallocated_encoding(s);
-        return;
+    fn(tcg_rd, tcg_rn, imm);
+    if (set_cc) {
+        gen_logic_CC(a->sf, tcg_rd);
     }
-
-    if (!sf) {
-        wmask &= 0xffffffff;
-    }
-
-    switch (opc) {
-    case 0x3: /* ANDS */
-    case 0x0: /* AND */
-        tcg_gen_andi_i64(tcg_rd, tcg_rn, wmask);
-        is_and = true;
-        break;
-    case 0x1: /* ORR */
-        tcg_gen_ori_i64(tcg_rd, tcg_rn, wmask);
-        break;
-    case 0x2: /* EOR */
-        tcg_gen_xori_i64(tcg_rd, tcg_rn, wmask);
-        break;
-    default:
-        assert(FALSE); /* must handle all above */
-        break;
-    }
-
-    if (!sf && !is_and) {
-        /* zero extend final result; we know we can skip this for AND
-         * since the immediate had the high 32 bits clear.
-         */
+    if (!a->sf) {
         tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
     }
-
-    if (opc == 3) { /* ANDS */
-        gen_logic_CC(sf, tcg_rd);
-    }
+    return true;
 }
 
+TRANS(AND_i, gen_rri_log, a, false, tcg_gen_andi_i64)
+TRANS(ORR_i, gen_rri_log, a, false, tcg_gen_ori_i64)
+TRANS(EOR_i, gen_rri_log, a, false, tcg_gen_xori_i64)
+TRANS(ANDS_i, gen_rri_log, a, true, tcg_gen_andi_i64)
+
 /*
  * Move wide (immediate)
  *
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x24: /* Logical (immediate) */
-        disas_logic_imm(s, insn);
-        break;
     case 0x25: /* Move wide (immediate) */
         disas_movw_imm(s, insn);
         break;
--
2.34.1

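A worked example of the simplified DecodeBitMasks() used above, as a
standalone sketch: take immN=0, imms=0b111100, immr=0. Then
immN:NOT(imms) = 0 000011, so the element width e is 2 bits, and
s = 0 means s+1 = 1 set bit per element:

    /* Sketch: decode of immN=0, imms=0b111100, immr=0. */
    uint64_t mask = 1;                 /* MAKE_64BIT_MASK(0, s + 1) */
    /* r == 0, so no rotation within the 2-bit element */
    for (int w = 2; w < 64; w *= 2) {  /* replicate the 2-bit element */
        mask |= mask << w;
    }
    /* mask == 0x5555555555555555, so this encoding expresses e.g.
     * AND x0, x1, #0x5555555555555555
     */
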
From: Richard Henderson <richard.henderson@linaro.org>

Convert the MOVN, MOVZ, MOVK instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-11-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 13 ++++++
 target/arm/tcg/translate-a64.c | 73 ++++++++++++++--------------------
 2 files changed, 42 insertions(+), 44 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_64
 EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_32
 ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_64
 ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_32
+
+# Move wide (immediate)
+
+&movw           rd sf imm hw
+@movw_64        1 .. ...... hw:2 imm:16 rd:5 &movw sf=1
+@movw_32        0 .. ...... 0 hw:1 imm:16 rd:5 &movw sf=0
+
+MOVN            . 00 100101 .. ................ ..... @movw_64
+MOVN            . 00 100101 .. ................ ..... @movw_32
+MOVZ            . 10 100101 .. ................ ..... @movw_64
+MOVZ            . 10 100101 .. ................ ..... @movw_32
+MOVK            . 11 100101 .. ................ ..... @movw_64
+MOVK            . 11 100101 .. ................ ..... @movw_32
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS(ANDS_i, gen_rri_log, a, true, tcg_gen_andi_i64)
 
 /*
  * Move wide (immediate)
- *
- *  31 30 29 28         23 22 21 20             5 4    0
- * +--+-----+-------------+-----+----------------+------+
- * |sf| opc | 1 0 0 1 0 1 |  hw |     imm16      |  Rd  |
- * +--+-----+-------------+-----+----------------+------+
- *
- * sf: 0 -> 32 bit, 1 -> 64 bit
- * opc: 00 -> N, 10 -> Z, 11 -> K
- * hw: shift/16 (0,16, and sf only 32, 48)
  */
-static void disas_movw_imm(DisasContext *s, uint32_t insn)
+
+static bool trans_MOVZ(DisasContext *s, arg_movw *a)
 {
-    int rd = extract32(insn, 0, 5);
-    uint64_t imm = extract32(insn, 5, 16);
-    int sf = extract32(insn, 31, 1);
-    int opc = extract32(insn, 29, 2);
-    int pos = extract32(insn, 21, 2) << 4;
-    TCGv_i64 tcg_rd = cpu_reg(s, rd);
+    int pos = a->hw << 4;
+    tcg_gen_movi_i64(cpu_reg(s, a->rd), (uint64_t)a->imm << pos);
+    return true;
+}
 
-    if (!sf && (pos >= 32)) {
-        unallocated_encoding(s);
-        return;
-    }
+static bool trans_MOVN(DisasContext *s, arg_movw *a)
+{
+    int pos = a->hw << 4;
+    uint64_t imm = a->imm;
 
-    switch (opc) {
-    case 0: /* MOVN */
-    case 2: /* MOVZ */
-        imm <<= pos;
-        if (opc == 0) {
+static bool trans_MOVN(DisasContext *s, arg_movw *a)
95
float16 nan = f16;
76
+{
96
if (float16_is_signaling_nan(f16, s)) {
77
+ int pos = a->hw << 4;
97
float_raise(float_flag_invalid, s);
78
+ uint64_t imm = a->imm;
98
- nan = float16_silence_nan(f16, s);
79
99
+ if (!s->default_nan_mode) {
80
- switch (opc) {
100
+ nan = float16_silence_nan(f16, fpstp);
81
- case 0: /* MOVN */
101
+ }
82
- case 2: /* MOVZ */
102
}
83
- imm <<= pos;
103
if (s->default_nan_mode) {
84
- if (opc == 0) {
104
nan = float16_default_nan(s);
85
- imm = ~imm;
105
@@ -XXX,XX +XXX,XX @@ float32 HELPER(rsqrte_f32)(float32 input, void *fpstp)
86
- }
106
float32 nan = f32;
87
- if (!sf) {
107
if (float32_is_signaling_nan(f32, s)) {
88
- imm &= 0xffffffffu;
108
float_raise(float_flag_invalid, s);
89
- }
109
- nan = float32_silence_nan(f32, s);
90
- tcg_gen_movi_i64(tcg_rd, imm);
110
+ if (!s->default_nan_mode) {
91
- break;
111
+ nan = float32_silence_nan(f32, fpstp);
92
- case 3: /* MOVK */
112
+ }
93
- tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_constant_i64(imm), pos, 16);
113
}
94
- if (!sf) {
114
if (s->default_nan_mode) {
95
- tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
115
nan = float32_default_nan(s);
96
- }
116
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
97
- break;
117
float64 nan = f64;
98
- default:
118
if (float64_is_signaling_nan(f64, s)) {
99
- unallocated_encoding(s);
119
float_raise(float_flag_invalid, s);
100
- break;
120
- nan = float64_silence_nan(f64, s);
101
+ imm = ~(imm << pos);
121
+ if (!s->default_nan_mode) {
102
+ if (!a->sf) {
122
+ nan = float64_silence_nan(f64, fpstp);
103
+ imm = (uint32_t)imm;
123
+ }
104
}
124
}
105
+ tcg_gen_movi_i64(cpu_reg(s, a->rd), imm);
125
if (s->default_nan_mode) {
106
+ return true;
126
nan = float64_default_nan(s);
107
+}
108
+
109
+static bool trans_MOVK(DisasContext *s, arg_movw *a)
110
+{
111
+ int pos = a->hw << 4;
112
+ TCGv_i64 tcg_rd, tcg_im;
113
+
114
+ tcg_rd = cpu_reg(s, a->rd);
115
+ tcg_im = tcg_constant_i64(a->imm);
116
+ tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_im, pos, 16);
117
+ if (!a->sf) {
118
+ tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
119
+ }
120
+ return true;
121
}
122
123
/* Bitfield
124
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
125
static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
126
{
127
switch (extract32(insn, 23, 6)) {
128
- case 0x25: /* Move wide (immediate) */
129
- disas_movw_imm(s, insn);
130
- break;
131
case 0x26: /* Bitfield */
132
disas_bitfield(s, insn);
133
break;
127
--
134
--
128
2.20.1
135
2.34.1
129
130
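As background for the NaN change above: when FPCR.DN is set, the result of
propagating any NaN is the default NaN, so silencing the input is pointless
and (via parts_silence_nan_frac()) trips an assertion. A minimal standalone
sketch of the corrected pattern follows; the struct and helper names here
are stand-ins for illustration, not QEMU's real softfloat API:

    #include <stdbool.h>
    #include <stdint.h>

    /* Stand-in for the default_nan_mode flag in QEMU's float_status. */
    typedef struct { bool default_nan_mode; } fstatus;

    static uint16_t f16_silence(uint16_t a) { return a | 0x0200; } /* quiet bit */
    static uint16_t f16_default_nan(void) { return 0x7e00; }

    /* Result NaN for a signaling-NaN input, following the fixed pattern. */
    static uint16_t propagate_snan(uint16_t a, const fstatus *fpst)
    {
        uint16_t nan = a;
        if (!fpst->default_nan_mode) {
            nan = f16_silence(a);    /* quiet the input only if we keep it */
        } else {
            nan = f16_default_nan(); /* FPCR.DN == 1: result is default NaN */
        }
        return nan;
    }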
1
Implement the MVE VADDLV insn; this is similar to VADDV, except
1
From: Richard Henderson <richard.henderson@linaro.org>
2
that it accumulates 32-bit elements into a 64-bit accumulator
3
stored in a pair of general-purpose registers.
4
2
3
Convert the BFM, SBFM, UBFM instructions.
4
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20230512144106.3608981-12-peter.maydell@linaro.org
7
Message-id: 20210628135835.6690-15-peter.maydell@linaro.org
9
[PMM: Rebased]
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
---
11
---
9
target/arm/helper-mve.h | 3 ++
12
target/arm/tcg/a64.decode | 13 +++
10
target/arm/mve.decode | 6 +++-
13
target/arm/tcg/translate-a64.c | 144 ++++++++++++++++++---------------
11
target/arm/mve_helper.c | 19 ++++++++++++
14
2 files changed, 94 insertions(+), 63 deletions(-)
12
target/arm/translate-mve.c | 63 ++++++++++++++++++++++++++++++++++++++
13
4 files changed, 90 insertions(+), 1 deletion(-)
14
15
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
16
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
18
--- a/target/arm/tcg/a64.decode
18
+++ b/target/arm/helper-mve.h
19
+++ b/target/arm/tcg/a64.decode
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ MOVZ . 10 100101 .. ................ ..... @movw_64
20
DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
21
MOVZ . 10 100101 .. ................ ..... @movw_32
21
DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
22
MOVK . 11 100101 .. ................ ..... @movw_64
22
23
MOVK . 11 100101 .. ................ ..... @movw_32
23
+DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64)
24
+
24
+DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64)
25
+# Bitfield
25
+
26
+
26
DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
27
+&bitfield rd rn sf immr imms
27
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
28
+@bitfield_64 1 .. ...... 1 immr:6 imms:6 rn:5 rd:5 &bitfield sf=1
28
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
29
+@bitfield_32 0 .. ...... 0 0 immr:5 0 imms:5 rn:5 rd:5 &bitfield sf=0
29
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
30
+
31
+SBFM . 00 100110 . ...... ...... ..... ..... @bitfield_64
32
+SBFM . 00 100110 . ...... ...... ..... ..... @bitfield_32
33
+BFM . 01 100110 . ...... ...... ..... ..... @bitfield_64
34
+BFM . 01 100110 . ...... ...... ..... ..... @bitfield_32
35
+UBFM . 10 100110 . ...... ...... ..... ..... @bitfield_64
36
+UBFM . 10 100110 . ...... ...... ..... ..... @bitfield_32
37
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
30
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
31
--- a/target/arm/mve.decode
39
--- a/target/arm/tcg/translate-a64.c
32
+++ b/target/arm/mve.decode
40
+++ b/target/arm/tcg/translate-a64.c
33
@@ -XXX,XX +XXX,XX @@ VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
41
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVK(DisasContext *s, arg_movw *a)
34
VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar
35
36
# Vector add across vector
37
-VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
38
+{
39
+ VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
40
+ VADDLV 111 u:1 1110 1 ... 1001 ... 0 1111 00 a:1 0 qm:3 0 \
41
+ rdahi=%rdahi rdalo=%rdalo
42
+}
43
44
# Predicate operations
45
%mask_22_13 22:1 13:3
46
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/mve_helper.c
49
+++ b/target/arm/mve_helper.c
50
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvub, 1, uint8_t)
51
DO_VADDV(vaddvuh, 2, uint16_t)
52
DO_VADDV(vaddvuw, 4, uint32_t)
53
54
+#define DO_VADDLV(OP, TYPE, LTYPE) \
55
+ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
56
+ uint64_t ra) \
57
+ { \
58
+ uint16_t mask = mve_element_mask(env); \
59
+ unsigned e; \
60
+ TYPE *m = vm; \
61
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) { \
62
+ if (mask & 1) { \
63
+ ra += (LTYPE)m[H4(e)]; \
64
+ } \
65
+ } \
66
+ mve_advance_vpt(env); \
67
+ return ra; \
68
+ } \
69
+
70
+DO_VADDLV(vaddlv_s, int32_t, int64_t)
71
+DO_VADDLV(vaddlv_u, uint32_t, uint64_t)
72
+
73
/* Shifts by immediate */
74
#define DO_2SHIFT(OP, ESIZE, TYPE, FN) \
75
void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
76
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
77
index XXXXXXX..XXXXXXX 100644
78
--- a/target/arm/translate-mve.c
79
+++ b/target/arm/translate-mve.c
80
@@ -XXX,XX +XXX,XX @@ static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
81
return true;
42
return true;
82
}
43
}
83
44
84
+static bool trans_VADDLV(DisasContext *s, arg_VADDLV *a)
45
-/* Bitfield
85
+{
46
- * 31 30 29 28 23 22 21 16 15 10 9 5 4 0
86
+ /*
47
- * +----+-----+-------------+---+------+------+------+------+
87
+ * Vector Add Long Across Vector: accumulate the 32-bit
48
- * | sf | opc | 1 0 0 1 1 0 | N | immr | imms | Rn | Rd |
88
+ * elements of the vector into a 64-bit result stored in
49
- * +----+-----+-------------+---+------+------+------+------+
89
+ * a pair of general-purpose registers.
50
+/*
90
+ * No need to check Qm's bank: it is only 3 bits in decode.
51
+ * Bitfield
91
+ */
52
*/
92
+ TCGv_ptr qm;
53
-static void disas_bitfield(DisasContext *s, uint32_t insn)
93
+ TCGv_i64 rda;
54
+
94
+ TCGv_i32 rdalo, rdahi;
55
+static bool trans_SBFM(DisasContext *s, arg_SBFM *a)
95
+
56
{
96
+ if (!dc_isar_feature(aa32_mve, s)) {
57
- unsigned int sf, n, opc, ri, si, rn, rd, bitsize, pos, len;
97
+ return false;
58
- TCGv_i64 tcg_rd, tcg_tmp;
59
+ TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
60
+ TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
61
+ unsigned int bitsize = a->sf ? 64 : 32;
62
+ unsigned int ri = a->immr;
63
+ unsigned int si = a->imms;
64
+ unsigned int pos, len;
65
66
- sf = extract32(insn, 31, 1);
67
- opc = extract32(insn, 29, 2);
68
- n = extract32(insn, 22, 1);
69
- ri = extract32(insn, 16, 6);
70
- si = extract32(insn, 10, 6);
71
- rn = extract32(insn, 5, 5);
72
- rd = extract32(insn, 0, 5);
73
- bitsize = sf ? 64 : 32;
74
-
75
- if (sf != n || ri >= bitsize || si >= bitsize || opc > 2) {
76
- unallocated_encoding(s);
77
- return;
78
- }
79
-
80
- tcg_rd = cpu_reg(s, rd);
81
-
82
- /* Suppress the zero-extend for !sf. Since RI and SI are constrained
83
- to be smaller than bitsize, we'll never reference data outside the
84
- low 32-bits anyway. */
85
- tcg_tmp = read_cpu_reg(s, rn, 1);
86
-
87
- /* Recognize simple(r) extractions. */
88
if (si >= ri) {
89
/* Wd<s-r:0> = Wn<s:r> */
90
len = (si - ri) + 1;
91
- if (opc == 0) { /* SBFM: ASR, SBFX, SXTB, SXTH, SXTW */
92
- tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
93
- goto done;
94
- } else if (opc == 2) { /* UBFM: UBFX, LSR, UXTB, UXTH */
95
- tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
96
- return;
97
+ tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
98
+ if (!a->sf) {
99
+ tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
100
}
101
- /* opc == 1, BFXIL fall through to deposit */
102
+ } else {
103
+ /* Wd<32+s-r,32-r> = Wn<s:0> */
104
+ len = si + 1;
105
+ pos = (bitsize - ri) & (bitsize - 1);
106
+
107
+ if (len < ri) {
108
+ /*
109
+ * Sign extend the destination field from len to fill the
110
+ * balance of the word. Let the deposit below insert all
111
+ * of those sign bits.
112
+ */
113
+ tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
114
+ len = ri;
115
+ }
116
+
117
+ /*
118
+ * We start with zero, and we haven't modified any bits outside
119
+ * bitsize, therefore no final zero-extension is needed for !sf.
120
+ */
121
+ tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
98
+ }
122
+ }
99
+ /*
100
+ * rdahi == 13 is UNPREDICTABLE; rdahi == 15 is a related
101
+ * encoding; rdalo always has bit 0 clear so cannot be 13 or 15.
102
+ */
103
+ if (a->rdahi == 13 || a->rdahi == 15) {
104
+ return false;
105
+ }
106
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
107
+ return true;
108
+ }
109
+
110
+ /*
111
+ * This insn is subject to beat-wise execution. Partial execution
112
+ * of an A=0 (no-accumulate) insn which does not execute the first
113
+ * beat must start with the current value of RdaHi:RdaLo, not zero.
114
+ */
115
+ if (a->a || mve_skip_first_beat(s)) {
116
+ /* Accumulate input from RdaHi:RdaLo */
117
+ rda = tcg_temp_new_i64();
118
+ rdalo = load_reg(s, a->rdalo);
119
+ rdahi = load_reg(s, a->rdahi);
120
+ tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
121
+ tcg_temp_free_i32(rdalo);
122
+ tcg_temp_free_i32(rdahi);
123
+ } else {
124
+ /* Accumulate starting at zero */
125
+ rda = tcg_const_i64(0);
126
+ }
127
+
128
+ qm = mve_qreg_ptr(a->qm);
129
+ if (a->u) {
130
+ gen_helper_mve_vaddlv_u(rda, cpu_env, qm, rda);
131
+ } else {
132
+ gen_helper_mve_vaddlv_s(rda, cpu_env, qm, rda);
133
+ }
134
+ tcg_temp_free_ptr(qm);
135
+
136
+ rdalo = tcg_temp_new_i32();
137
+ rdahi = tcg_temp_new_i32();
138
+ tcg_gen_extrl_i64_i32(rdalo, rda);
139
+ tcg_gen_extrh_i64_i32(rdahi, rda);
140
+ store_reg(s, a->rdalo, rdalo);
141
+ store_reg(s, a->rdahi, rdahi);
142
+ tcg_temp_free_i64(rda);
143
+ mve_update_eci(s);
144
+ return true;
123
+ return true;
145
+}
124
+}
146
+
125
+
147
static bool do_1imm(DisasContext *s, arg_1imm *a, MVEGenOneOpImmFn *fn)
126
+static bool trans_UBFM(DisasContext *s, arg_UBFM *a)
127
+{
128
+ TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
129
+ TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
130
+ unsigned int bitsize = a->sf ? 64 : 32;
131
+ unsigned int ri = a->immr;
132
+ unsigned int si = a->imms;
133
+ unsigned int pos, len;
134
+
135
+ tcg_rd = cpu_reg(s, a->rd);
136
+ tcg_tmp = read_cpu_reg(s, a->rn, 1);
137
+
138
+ if (si >= ri) {
139
+ /* Wd<s-r:0> = Wn<s:r> */
140
+ len = (si - ri) + 1;
141
+ tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
142
+ } else {
143
+ /* Wd<32+s-r,32-r> = Wn<s:0> */
144
+ len = si + 1;
145
+ pos = (bitsize - ri) & (bitsize - 1);
146
+ tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
147
+ }
148
+ return true;
149
+}
150
+
151
+static bool trans_BFM(DisasContext *s, arg_BFM *a)
152
+{
153
+ TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
154
+ TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
155
+ unsigned int bitsize = a->sf ? 64 : 32;
156
+ unsigned int ri = a->immr;
157
+ unsigned int si = a->imms;
158
+ unsigned int pos, len;
159
+
160
+ tcg_rd = cpu_reg(s, a->rd);
161
+ tcg_tmp = read_cpu_reg(s, a->rn, 1);
162
+
163
+ if (si >= ri) {
164
+ /* Wd<s-r:0> = Wn<s:r> */
165
tcg_gen_shri_i64(tcg_tmp, tcg_tmp, ri);
166
+ len = (si - ri) + 1;
167
pos = 0;
168
} else {
169
- /* Handle the ri > si case with a deposit
170
- * Wd<32+s-r,32-r> = Wn<s:0>
171
- */
172
+ /* Wd<32+s-r,32-r> = Wn<s:0> */
173
len = si + 1;
174
pos = (bitsize - ri) & (bitsize - 1);
175
}
176
177
- if (opc == 0 && len < ri) {
178
- /* SBFM: sign extend the destination field from len to fill
179
- the balance of the word. Let the deposit below insert all
180
- of those sign bits. */
181
- tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
182
- len = ri;
183
- }
184
-
185
- if (opc == 1) { /* BFM, BFXIL */
186
- tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
187
- } else {
188
- /* SBFM or UBFM: We start with zero, and we haven't modified
189
- any bits outside bitsize, therefore the zero-extension
190
- below is unneeded. */
191
- tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
192
- return;
193
- }
194
-
195
- done:
196
- if (!sf) { /* zero extend final result */
197
+ tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
198
+ if (!a->sf) {
199
tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
200
}
201
+ return true;
202
}
203
204
/* Extract
205
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
206
static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
148
{
207
{
149
TCGv_ptr qd;
208
switch (extract32(insn, 23, 6)) {
209
- case 0x26: /* Bitfield */
210
- disas_bitfield(s, insn);
211
- break;
212
case 0x27: /* Extract */
213
disas_extract(s, insn);
214
break;
150
--
215
--
151
2.20.1
216
2.34.1
152
153
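The si >= ri / si < ri split in the bitfield conversion above mirrors the
architectural pseudocode. A standalone reference model of 64-bit UBFM under
that split (an illustrative sketch, not QEMU code; the function name is
made up):

    #include <stdint.h>

    static uint64_t ubfm64(uint64_t wn, unsigned ri, unsigned si)
    {
        if (si >= ri) {
            /* Wd<s-r:0> = Wn<s:r>: an unsigned bitfield extract. */
            unsigned len = si - ri + 1;
            uint64_t mask = (len == 64) ? ~(uint64_t)0
                                        : ((uint64_t)1 << len) - 1;
            return (wn >> ri) & mask;
        } else {
            /* Wd<64+s-r,64-r> = Wn<s:0>: a deposit into a zeroed result,
             * so no final zero-extension is needed. */
            unsigned len = si + 1;
            unsigned pos = (64 - ri) & 63;
            return (wn & (((uint64_t)1 << len) - 1)) << pos;
        }
    }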
1
Implement the MVE VSHLC insn, which performs a shift left of the
1
Convert the EXTR instruction to decodetree (this is the
2
entire vector with carry in bits provided from a general purpose
2
only one in the 'Extract' class). This is the last of
3
register and carry out bits written back to that register.
3
the dp-immediate insns in the legacy decoder, so we
4
can now remove disas_data_proc_imm().
4
5
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210628135835.6690-14-peter.maydell@linaro.org
8
Message-id: 20230512144106.3608981-13-peter.maydell@linaro.org
8
---
9
---
9
target/arm/helper-mve.h | 2 ++
10
target/arm/tcg/a64.decode | 7 +++
10
target/arm/mve.decode | 2 ++
11
target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
11
target/arm/mve_helper.c | 38 ++++++++++++++++++++++++++++++++++++++
12
2 files changed, 36 insertions(+), 65 deletions(-)
12
target/arm/translate-mve.c | 30 ++++++++++++++++++++++++++++++
13
4 files changed, 72 insertions(+)
14
13
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
14
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
16
--- a/target/arm/tcg/a64.decode
18
+++ b/target/arm/helper-mve.h
17
+++ b/target/arm/tcg/a64.decode
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshrunbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
@@ -XXX,XX +XXX,XX @@ BFM . 01 100110 . ...... ...... ..... ..... @bitfield_64
20
DEF_HELPER_FLAGS_4(mve_vqrshrunbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
BFM . 01 100110 . ...... ...... ..... ..... @bitfield_32
21
DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
20
UBFM . 10 100110 . ...... ...... ..... ..... @bitfield_64
22
DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
UBFM . 10 100110 . ...... ...... ..... ..... @bitfield_32
23
+
22
+
24
+DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
23
+# Extract
25
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
24
+
25
+&extract rd rn rm imm sf
26
+
27
+EXTR 1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5 &extract sf=1
28
+EXTR 0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5 &extract sf=0
29
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
26
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
27
--- a/target/arm/mve.decode
31
--- a/target/arm/tcg/translate-a64.c
28
+++ b/target/arm/mve.decode
32
+++ b/target/arm/tcg/translate-a64.c
29
@@ -XXX,XX +XXX,XX @@ VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_b
33
@@ -XXX,XX +XXX,XX @@ static bool trans_BFM(DisasContext *s, arg_BFM *a)
30
VQRSHRUNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 0 @2_shr_h
34
return true;
31
VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b
35
}
32
VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h
36
33
+
37
-/* Extract
34
+VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
38
- * 31 30 29 28 23 22 21 20 16 15 10 9 5 4 0
35
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
39
- * +----+------+-------------+---+----+------+--------+------+------+
36
index XXXXXXX..XXXXXXX 100644
40
- * | sf | op21 | 1 0 0 1 1 1 | N | o0 | Rm | imms | Rn | Rd |
37
--- a/target/arm/mve_helper.c
41
- * +----+------+-------------+---+----+------+--------+------+------+
38
+++ b/target/arm/mve_helper.c
42
- */
39
@@ -XXX,XX +XXX,XX @@ DO_VSHRN_SAT_UB(vqrshrnb_ub, vqrshrnt_ub, DO_RSHRN_UB)
43
-static void disas_extract(DisasContext *s, uint32_t insn)
40
DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH)
44
+static bool trans_EXTR(DisasContext *s, arg_extract *a)
41
DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B)
45
{
42
DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H)
46
- unsigned int sf, n, rm, imm, rn, rd, bitsize, op21, op0;
43
+
47
+ TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
44
+uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
48
45
+ uint32_t shift)
49
- sf = extract32(insn, 31, 1);
46
+{
50
- n = extract32(insn, 22, 1);
47
+ uint32_t *d = vd;
51
- rm = extract32(insn, 16, 5);
48
+ uint16_t mask = mve_element_mask(env);
52
- imm = extract32(insn, 10, 6);
49
+ unsigned e;
53
- rn = extract32(insn, 5, 5);
50
+ uint32_t r;
54
- rd = extract32(insn, 0, 5);
51
+
55
- op21 = extract32(insn, 29, 2);
52
+ /*
56
- op0 = extract32(insn, 21, 1);
53
+ * For each 32-bit element, we shift it left, bringing in the
57
- bitsize = sf ? 64 : 32;
54
+ * low 'shift' bits of rdm at the bottom. Bits shifted out at
58
+ tcg_rd = cpu_reg(s, a->rd);
55
+ * the top become the new rdm, if the predicate mask permits.
59
56
+ * The final rdm value is returned to update the register.
60
- if (sf != n || op21 || op0 || imm >= bitsize) {
57
+ * shift == 0 here means "shift by 32 bits".
61
- unallocated_encoding(s);
58
+ */
62
- } else {
59
+ if (shift == 0) {
63
- TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
60
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
64
-
61
+ r = rdm;
65
- tcg_rd = cpu_reg(s, rd);
62
+ if (mask & 1) {
66
-
63
+ rdm = d[H4(e)];
67
- if (unlikely(imm == 0)) {
64
+ }
68
- /* tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
65
+ mergemask(&d[H4(e)], r, mask);
69
- * so an extract from bit 0 is a special case.
70
- */
71
- if (sf) {
72
- tcg_gen_mov_i64(tcg_rd, cpu_reg(s, rm));
73
- } else {
74
- tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, rm));
75
- }
76
+ if (unlikely(a->imm == 0)) {
77
+ /*
78
+ * tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
79
+ * so an extract from bit 0 is a special case.
80
+ */
81
+ if (a->sf) {
82
+ tcg_gen_mov_i64(tcg_rd, cpu_reg(s, a->rm));
83
} else {
84
- tcg_rm = cpu_reg(s, rm);
85
- tcg_rn = cpu_reg(s, rn);
86
+ tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, a->rm));
66
+ }
87
+ }
67
+ } else {
88
+ } else {
68
+ uint32_t shiftmask = MAKE_64BIT_MASK(0, shift);
89
+ tcg_rm = cpu_reg(s, a->rm);
90
+ tcg_rn = cpu_reg(s, a->rn);
91
92
- if (sf) {
93
- /* Specialization to ROR happens in EXTRACT2. */
94
- tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, imm);
95
+ if (a->sf) {
96
+ /* Specialization to ROR happens in EXTRACT2. */
97
+ tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, a->imm);
98
+ } else {
99
+ TCGv_i32 t0 = tcg_temp_new_i32();
69
+
100
+
70
+ for (e = 0; e < 16 / 4; e++, mask >>= 4) {
101
+ tcg_gen_extrl_i64_i32(t0, tcg_rm);
71
+ r = (d[H4(e)] << shift) | (rdm & shiftmask);
102
+ if (a->rm == a->rn) {
72
+ if (mask & 1) {
103
+ tcg_gen_rotri_i32(t0, t0, a->imm);
73
+ rdm = d[H4(e)] >> (32 - shift);
104
} else {
74
+ }
105
- TCGv_i32 t0 = tcg_temp_new_i32();
75
+ mergemask(&d[H4(e)], r, mask);
106
-
76
+ }
107
- tcg_gen_extrl_i64_i32(t0, tcg_rm);
77
+ }
108
- if (rm == rn) {
78
+ mve_advance_vpt(env);
109
- tcg_gen_rotri_i32(t0, t0, imm);
79
+ return rdm;
110
- } else {
80
+}
111
- TCGv_i32 t1 = tcg_temp_new_i32();
81
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
112
- tcg_gen_extrl_i64_i32(t1, tcg_rn);
82
index XXXXXXX..XXXXXXX 100644
113
- tcg_gen_extract2_i32(t0, t0, t1, imm);
83
--- a/target/arm/translate-mve.c
114
- }
84
+++ b/target/arm/translate-mve.c
115
- tcg_gen_extu_i32_i64(tcg_rd, t0);
85
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_N(VQRSHRNB_U, vqrshrnb_u)
116
+ TCGv_i32 t1 = tcg_temp_new_i32();
86
DO_2SHIFT_N(VQRSHRNT_U, vqrshrnt_u)
117
+ tcg_gen_extrl_i64_i32(t1, tcg_rn);
87
DO_2SHIFT_N(VQRSHRUNB, vqrshrunb)
118
+ tcg_gen_extract2_i32(t0, t0, t1, a->imm);
88
DO_2SHIFT_N(VQRSHRUNT, vqrshrunt)
119
}
89
+
120
+ tcg_gen_extu_i32_i64(tcg_rd, t0);
90
+static bool trans_VSHLC(DisasContext *s, arg_VSHLC *a)
121
}
91
+{
122
}
92
+ /*
123
-}
93
+ * Whole Vector Left Shift with Carry. The carry is taken
124
-
94
+ * from a general purpose register and written back there.
125
-/* Data processing - immediate */
95
+ * An imm of 0 means "shift by 32".
126
-static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
96
+ */
127
-{
97
+ TCGv_ptr qd;
128
- switch (extract32(insn, 23, 6)) {
98
+ TCGv_i32 rdm;
129
- case 0x27: /* Extract */
99
+
130
- disas_extract(s, insn);
100
+ if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) {
131
- break;
101
+ return false;
132
- default:
102
+ }
133
- unallocated_encoding(s);
103
+ if (a->rdm == 13 || a->rdm == 15) {
134
- break;
104
+ /* CONSTRAINED UNPREDICTABLE: we UNDEF */
135
- }
105
+ return false;
106
+ }
107
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
108
+ return true;
109
+ }
110
+
111
+ qd = mve_qreg_ptr(a->qd);
112
+ rdm = load_reg(s, a->rdm);
113
+ gen_helper_mve_vshlc(rdm, cpu_env, qd, rdm, tcg_constant_i32(a->imm));
114
+ store_reg(s, a->rdm, rdm);
115
+ tcg_temp_free_ptr(qd);
116
+ mve_update_eci(s);
117
+ return true;
136
+ return true;
118
+}
137
}
138
139
/* Shift a TCGv src by TCGv shift_amount, put result in dst.
140
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
141
static void disas_a64_legacy(DisasContext *s, uint32_t insn)
142
{
143
switch (extract32(insn, 25, 4)) {
144
- case 0x8: case 0x9: /* Data processing - immediate */
145
- disas_data_proc_imm(s, insn);
146
- break;
147
case 0xa: case 0xb: /* Branch, exception generation and system insns */
148
disas_b_exc_sys(s, insn);
149
break;
119
--
150
--
120
2.20.1
151
2.34.1
121
122
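The EXTR special cases above follow directly from C shift semantics: a
shift count equal to the operand width is undefined, so an extract from
bit 0 must be handled as a plain move. A standalone sketch of the 64-bit
semantics (illustrative only):

    #include <stdint.h>

    /* Rd = (Rn:Rm) >> imm, truncated to 64 bits. imm == 0 is special-cased
     * because shifting a 64-bit value by 64 is undefined behaviour. */
    static uint64_t extr64(uint64_t rn, uint64_t rm, unsigned imm)
    {
        return imm ? (rm >> imm) | (rn << (64 - imm)) : rm;
    }

When rn == rm this reduces to a rotate right by imm, which is why the
32-bit path above can use tcg_gen_rotri_i32() when a->rm == a->rn.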
1
Implement the MVE shift-vector-left-by-immediate insns VSHL, VQSHL
1
Convert the unconditional branch immediate insns B and BL to
2
and VQSHLU.
2
decodetree.
3
4
The size-and-immediate encoding here is the same as Neon, and we
5
handle it the same way neon-dp.decode does.
6
3
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20210628135835.6690-8-peter.maydell@linaro.org
6
Message-id: 20230512144106.3608981-14-peter.maydell@linaro.org
10
---
7
---
11
target/arm/helper-mve.h | 16 +++++++++++
8
target/arm/tcg/a64.decode | 9 +++++++++
12
target/arm/mve.decode | 23 +++++++++++++++
9
target/arm/tcg/translate-a64.c | 31 +++++++++++--------------------
13
target/arm/mve_helper.c | 57 ++++++++++++++++++++++++++++++++++++++
10
2 files changed, 20 insertions(+), 20 deletions(-)
14
target/arm/translate-mve.c | 51 ++++++++++++++++++++++++++++++++++
15
4 files changed, 147 insertions(+)
16
11
17
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
18
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
19
--- a/target/arm/helper-mve.h
14
--- a/target/arm/tcg/a64.decode
20
+++ b/target/arm/helper-mve.h
15
+++ b/target/arm/tcg/a64.decode
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@
22
DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
17
23
DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
18
&ri rd imm
24
DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
19
&rri_sf rd rn imm sf
20
+&i imm
21
22
23
### Data Processing - Immediate
24
@@ -XXX,XX +XXX,XX @@ UBFM . 10 100110 . ...... ...... ..... ..... @bitfield_32
25
26
EXTR 1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5 &extract sf=1
27
EXTR 0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5 &extract sf=0
25
+
28
+
26
+DEF_HELPER_FLAGS_4(mve_vshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+# Branches
27
+DEF_HELPER_FLAGS_4(mve_vshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(mve_vshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
29
+
30
+
30
+DEF_HELPER_FLAGS_4(mve_vqshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
31
+%imm26 0:s26 !function=times_4
31
+DEF_HELPER_FLAGS_4(mve_vqshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+@branch . ..... .......................... &i imm=%imm26
32
+DEF_HELPER_FLAGS_4(mve_vqshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
+
33
+
34
+DEF_HELPER_FLAGS_4(mve_vqshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
34
+B 0 00101 .......................... @branch
35
+DEF_HELPER_FLAGS_4(mve_vqshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
35
+BL 1 00101 .......................... @branch
36
+DEF_HELPER_FLAGS_4(mve_vqshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
36
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
37
+
38
+DEF_HELPER_FLAGS_4(mve_vqshlui_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_4(mve_vqshlui_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(mve_vqshlui_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
41
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
42
index XXXXXXX..XXXXXXX 100644
37
index XXXXXXX..XXXXXXX 100644
43
--- a/target/arm/mve.decode
38
--- a/target/arm/tcg/translate-a64.c
44
+++ b/target/arm/mve.decode
39
+++ b/target/arm/tcg/translate-a64.c
45
@@ -XXX,XX +XXX,XX @@
40
@@ -XXX,XX +XXX,XX @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
46
&2op qd qm qn size
41
* match up with those in the manual.
47
&2scalar qd qn rm size
42
*/
48
&1imm qd imm cmode op
43
49
+&2shift qd qm shift size
44
-/* Unconditional branch (immediate)
50
45
- * 31 30 26 25 0
51
@vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
46
- * +----+-----------+-------------------------------------+
52
# Note that both Rn and Qd are 3 bits only (no D bit)
47
- * | op | 0 0 1 0 1 | imm26 |
53
@@ -XXX,XX +XXX,XX @@
48
- * +----+-----------+-------------------------------------+
54
@2scalar .... .... .. size:2 .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
49
- */
55
@2scalar_nosz .... .... .... .... .... .... .... rm:4 &2scalar qd=%qd qn=%qn
50
-static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
56
51
+static bool trans_B(DisasContext *s, arg_i *a)
57
+@2_shl_b .... .... .. 001 shift:3 .... .... .... .... &2shift qd=%qd qm=%qm size=0
52
{
58
+@2_shl_h .... .... .. 01 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
53
- int64_t diff = sextract32(insn, 0, 26) * 4;
59
+@2_shl_w .... .... .. 1 shift:5 .... .... .... .... &2shift qd=%qd qm=%qm size=2
54
-
60
+
55
- if (insn & (1U << 31)) {
61
# Vector loads and stores
56
- /* BL Branch with link */
62
57
- gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
63
# Widening loads and narrowing stores:
58
- }
64
@@ -XXX,XX +XXX,XX @@ VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
59
-
65
# So we have a single decode line and check the cmode/op in the
60
- /* B Branch / BL Branch with link */
66
# trans function.
61
reset_btype(s);
67
Vimm_1r 111 . 1111 1 . 00 0 ... ... 0 .... 0 1 . 1 .... @1imm
62
- gen_goto_tb(s, 0, diff);
68
+
63
+ gen_goto_tb(s, 0, a->imm);
69
+# Shifts by immediate
70
+
71
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
72
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
73
+VSHLI 111 0 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
74
+
75
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_b
76
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_h
77
+VQSHLI_S 111 0 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_w
78
+
79
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_b
80
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_h
81
+VQSHLI_U 111 1 1111 1 . ... ... ... 0 0111 0 1 . 1 ... 0 @2_shl_w
82
+
83
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_b
84
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_h
85
+VQSHLUI 111 1 1111 1 . ... ... ... 0 0110 0 1 . 1 ... 0 @2_shl_w
86
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/target/arm/mve_helper.c
89
+++ b/target/arm/mve_helper.c
90
@@ -XXX,XX +XXX,XX @@ DO_2OP_SAT(vqsubsw, 4, int32_t, DO_SQSUB_W)
91
WRAP_QRSHL_HELPER(do_sqrshl_bhs, N, M, true, satp)
92
#define DO_UQRSHL_OP(N, M, satp) \
93
WRAP_QRSHL_HELPER(do_uqrshl_bhs, N, M, true, satp)
94
+#define DO_SUQSHL_OP(N, M, satp) \
95
+ WRAP_QRSHL_HELPER(do_suqrshl_bhs, N, M, false, satp)
96
97
DO_2OP_SAT_S(vqshls, DO_SQSHL_OP)
98
DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
99
@@ -XXX,XX +XXX,XX @@ DO_VADDV(vaddvsw, 4, uint32_t)
100
DO_VADDV(vaddvub, 1, uint8_t)
101
DO_VADDV(vaddvuh, 2, uint16_t)
102
DO_VADDV(vaddvuw, 4, uint32_t)
103
+
104
+/* Shifts by immediate */
105
+#define DO_2SHIFT(OP, ESIZE, TYPE, FN) \
106
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
107
+ void *vm, uint32_t shift) \
108
+ { \
109
+ TYPE *d = vd, *m = vm; \
110
+ uint16_t mask = mve_element_mask(env); \
111
+ unsigned e; \
112
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
113
+ mergemask(&d[H##ESIZE(e)], \
114
+ FN(m[H##ESIZE(e)], shift), mask); \
115
+ } \
116
+ mve_advance_vpt(env); \
117
+ }
118
+
119
+#define DO_2SHIFT_SAT(OP, ESIZE, TYPE, FN) \
120
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
121
+ void *vm, uint32_t shift) \
122
+ { \
123
+ TYPE *d = vd, *m = vm; \
124
+ uint16_t mask = mve_element_mask(env); \
125
+ unsigned e; \
126
+ bool qc = false; \
127
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
128
+ bool sat = false; \
129
+ mergemask(&d[H##ESIZE(e)], \
130
+ FN(m[H##ESIZE(e)], shift, &sat), mask); \
131
+ qc |= sat & mask & 1; \
132
+ } \
133
+ if (qc) { \
134
+ env->vfp.qc[0] = qc; \
135
+ } \
136
+ mve_advance_vpt(env); \
137
+ }
138
+
139
+/* provide unsigned 2-op shift helpers for all sizes */
140
+#define DO_2SHIFT_U(OP, FN) \
141
+ DO_2SHIFT(OP##b, 1, uint8_t, FN) \
142
+ DO_2SHIFT(OP##h, 2, uint16_t, FN) \
143
+ DO_2SHIFT(OP##w, 4, uint32_t, FN)
144
+
145
+#define DO_2SHIFT_SAT_U(OP, FN) \
146
+ DO_2SHIFT_SAT(OP##b, 1, uint8_t, FN) \
147
+ DO_2SHIFT_SAT(OP##h, 2, uint16_t, FN) \
148
+ DO_2SHIFT_SAT(OP##w, 4, uint32_t, FN)
149
+#define DO_2SHIFT_SAT_S(OP, FN) \
150
+ DO_2SHIFT_SAT(OP##b, 1, int8_t, FN) \
151
+ DO_2SHIFT_SAT(OP##h, 2, int16_t, FN) \
152
+ DO_2SHIFT_SAT(OP##w, 4, int32_t, FN)
153
+
154
+DO_2SHIFT_U(vshli_u, DO_VSHLU)
155
+DO_2SHIFT_SAT_U(vqshli_u, DO_UQSHL_OP)
156
+DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
157
+DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
158
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
159
index XXXXXXX..XXXXXXX 100644
160
--- a/target/arm/translate-mve.c
161
+++ b/target/arm/translate-mve.c
162
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
163
typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
164
typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
165
typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
166
+typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
167
typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
168
typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
169
typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
170
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm_1r(DisasContext *s, arg_1imm *a)
171
}
172
return do_1imm(s, a, fn);
173
}
174
+
175
+static bool do_2shift(DisasContext *s, arg_2shift *a, MVEGenTwoOpShiftFn fn,
176
+ bool negateshift)
177
+{
178
+ TCGv_ptr qd, qm;
179
+ int shift = a->shift;
180
+
181
+ if (!dc_isar_feature(aa32_mve, s) ||
182
+ !mve_check_qreg_bank(s, a->qd | a->qm) ||
183
+ !fn) {
184
+ return false;
185
+ }
186
+ if (!mve_eci_check(s) || !vfp_access_check(s)) {
187
+ return true;
188
+ }
189
+
190
+ /*
191
+ * When we handle a right shift insn using a left-shift helper
192
+ * which permits a negative shift count to indicate a right-shift,
193
+ * we must negate the shift count.
194
+ */
195
+ if (negateshift) {
196
+ shift = -shift;
197
+ }
198
+
199
+ qd = mve_qreg_ptr(a->qd);
200
+ qm = mve_qreg_ptr(a->qm);
201
+ fn(cpu_env, qd, qm, tcg_constant_i32(shift));
202
+ tcg_temp_free_ptr(qd);
203
+ tcg_temp_free_ptr(qm);
204
+ mve_update_eci(s);
205
+ return true;
64
+ return true;
206
+}
65
+}
207
+
66
+
208
+#define DO_2SHIFT(INSN, FN, NEGATESHIFT) \
67
+static bool trans_BL(DisasContext *s, arg_i *a)
209
+ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
68
+{
210
+ { \
69
+ gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
211
+ static MVEGenTwoOpShiftFn * const fns[] = { \
70
+ reset_btype(s);
212
+ gen_helper_mve_##FN##b, \
71
+ gen_goto_tb(s, 0, a->imm);
213
+ gen_helper_mve_##FN##h, \
72
+ return true;
214
+ gen_helper_mve_##FN##w, \
73
}
215
+ NULL, \
74
216
+ }; \
75
/* Compare and branch (immediate)
217
+ return do_2shift(s, a, fns[a->size], NEGATESHIFT); \
76
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
218
+ }
77
static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
219
+
78
{
220
+DO_2SHIFT(VSHLI, vshli_u, false)
79
switch (extract32(insn, 25, 7)) {
221
+DO_2SHIFT(VQSHLI_S, vqshli_s, false)
80
- case 0x0a: case 0x0b:
222
+DO_2SHIFT(VQSHLI_U, vqshli_u, false)
81
- case 0x4a: case 0x4b: /* Unconditional branch (immediate) */
223
+DO_2SHIFT(VQSHLUI, vqshlui_s, false)
82
- disas_uncond_b_imm(s, insn);
83
- break;
84
case 0x1a: case 0x5a: /* Compare & branch (immediate) */
85
disas_comp_b_imm(s, insn);
86
break;
224
--
87
--
225
2.20.1
88
2.34.1
226
227
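The decodetree line "%imm26 0:s26 !function=times_4" above packs both steps
of the branch-offset decode: sign-extend the 26-bit field, then scale by
the 4-byte instruction size. An equivalent standalone sketch (the function
name is illustrative):

    #include <stdint.h>

    static int64_t b_offset(uint32_t insn)
    {
        int32_t imm26 = (int32_t)(insn << 6) >> 6; /* sextract32(insn, 0, 26) */
        return (int64_t)imm26 * 4;                 /* times_4 */
    }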
1
Implement the MVE shift-right-and-narrow insns VSHRN and VRSHRN.
1
Convert the compare-and-branch-immediate insns CBZ and CBNZ
2
2
to decodetree.
3
do_urshr() is borrowed from sve_helper.c.
4
3
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20210628135835.6690-12-peter.maydell@linaro.org
6
Message-id: 20230512144106.3608981-15-peter.maydell@linaro.org
8
---
7
---
9
target/arm/helper-mve.h | 10 ++++++++++
8
target/arm/tcg/a64.decode | 5 +++++
10
target/arm/mve.decode | 11 +++++++++++
9
target/arm/tcg/translate-a64.c | 26 ++++++--------------------
11
target/arm/mve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++
10
2 files changed, 11 insertions(+), 20 deletions(-)
12
target/arm/translate-mve.c | 15 ++++++++++++++
13
4 files changed, 76 insertions(+)
14
11
15
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/helper-mve.h
14
--- a/target/arm/tcg/a64.decode
18
+++ b/target/arm/helper-mve.h
15
+++ b/target/arm/tcg/a64.decode
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vsriw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ EXTR 0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5 &extract sf=0
20
DEF_HELPER_FLAGS_4(mve_vslib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
21
DEF_HELPER_FLAGS_4(mve_vslih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
B 0 00101 .......................... @branch
22
DEF_HELPER_FLAGS_4(mve_vsliw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
BL 1 00101 .......................... @branch
23
+
20
+
24
+DEF_HELPER_FLAGS_4(mve_vshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
+%imm19 5:s19 !function=times_4
25
+DEF_HELPER_FLAGS_4(mve_vshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
+&cbz rt imm sf nz
26
+DEF_HELPER_FLAGS_4(mve_vshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(mve_vshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
28
+
23
+
29
+DEF_HELPER_FLAGS_4(mve_vrshrnbb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
24
+CBZ sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
30
+DEF_HELPER_FLAGS_4(mve_vrshrnbh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
31
+DEF_HELPER_FLAGS_4(mve_vrshrntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(mve_vrshrnth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
33
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
34
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/mve.decode
27
--- a/target/arm/tcg/translate-a64.c
36
+++ b/target/arm/mve.decode
28
+++ b/target/arm/tcg/translate-a64.c
37
@@ -XXX,XX +XXX,XX @@ VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_w
29
@@ -XXX,XX +XXX,XX @@ static bool trans_BL(DisasContext *s, arg_i *a)
38
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
30
return true;
39
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
31
}
40
VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
32
33
-/* Compare and branch (immediate)
34
- * 31 30 25 24 23 5 4 0
35
- * +----+-------------+----+---------------------+--------+
36
- * | sf | 0 1 1 0 1 0 | op | imm19 | Rt |
37
- * +----+-------------+----+---------------------+--------+
38
- */
39
-static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
41
+
40
+
42
+# Narrowing shifts (which only support b and h sizes)
41
+static bool trans_CBZ(DisasContext *s, arg_cbz *a)
43
+VSHRNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
42
{
44
+VSHRNB 111 0 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
43
- unsigned int sf, op, rt;
45
+VSHRNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
44
- int64_t diff;
46
+VSHRNT 111 0 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
45
DisasLabel match;
47
+
46
TCGv_i64 tcg_cmp;
48
+VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_b
47
49
+VRSHRNB 111 1 1110 1 . ... ... ... 0 1111 1 1 . 0 ... 1 @2_shr_h
48
- sf = extract32(insn, 31, 1);
50
+VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_b
49
- op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
51
+VRSHRNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 1 @2_shr_h
50
- rt = extract32(insn, 0, 5);
52
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
51
- diff = sextract32(insn, 5, 19) * 4;
53
index XXXXXXX..XXXXXXX 100644
52
-
54
--- a/target/arm/mve_helper.c
53
- tcg_cmp = read_cpu_reg(s, rt, sf);
55
+++ b/target/arm/mve_helper.c
54
+ tcg_cmp = read_cpu_reg(s, a->rt, a->sf);
56
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_INSERT(vsliw, 4, DO_SHL, SHL_MASK)
55
reset_btype(s);
57
56
58
DO_VSHLL_ALL(vshllb, false)
57
match = gen_disas_label(s);
59
DO_VSHLL_ALL(vshllt, true)
58
- tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
60
+
59
+ tcg_gen_brcondi_i64(a->nz ? TCG_COND_NE : TCG_COND_EQ,
61
+/*
60
tcg_cmp, 0, match.label);
62
+ * Narrowing right shifts, taking a double sized input, shifting it
61
gen_goto_tb(s, 0, 4);
63
+ * and putting the result in either the top or bottom half of the output.
62
set_disas_label(s, match);
64
+ * ESIZE, TYPE are the output, and LESIZE, LTYPE the input.
63
- gen_goto_tb(s, 1, diff);
65
+ */
64
+ gen_goto_tb(s, 1, a->imm);
66
+#define DO_VSHRN(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \
65
+ return true;
67
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
66
}
68
+ void *vm, uint32_t shift) \
67
69
+ { \
68
/* Test and branch (immediate)
70
+ LTYPE *m = vm; \
69
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
71
+ TYPE *d = vd; \
70
static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
72
+ uint16_t mask = mve_element_mask(env); \
71
{
73
+ unsigned le; \
72
switch (extract32(insn, 25, 7)) {
74
+ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \
73
- case 0x1a: case 0x5a: /* Compare & branch (immediate) */
75
+ TYPE r = FN(m[H##LESIZE(le)], shift); \
74
- disas_comp_b_imm(s, insn);
76
+ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \
75
- break;
77
+ } \
76
case 0x1b: case 0x5b: /* Test & branch (immediate) */
78
+ mve_advance_vpt(env); \
77
disas_test_b_imm(s, insn);
79
+ }
78
break;
80
+
81
+#define DO_VSHRN_ALL(OP, FN) \
82
+ DO_VSHRN(OP##bb, false, 1, uint8_t, 2, uint16_t, FN) \
83
+ DO_VSHRN(OP##bh, false, 2, uint16_t, 4, uint32_t, FN) \
84
+ DO_VSHRN(OP##tb, true, 1, uint8_t, 2, uint16_t, FN) \
85
+ DO_VSHRN(OP##th, true, 2, uint16_t, 4, uint32_t, FN)
86
+
87
+static inline uint64_t do_urshr(uint64_t x, unsigned sh)
88
+{
89
+ if (likely(sh < 64)) {
90
+ return (x >> sh) + ((x >> (sh - 1)) & 1);
91
+ } else if (sh == 64) {
92
+ return x >> 63;
93
+ } else {
94
+ return 0;
95
+ }
96
+}
97
+
98
+DO_VSHRN_ALL(vshrn, DO_SHR)
99
+DO_VSHRN_ALL(vrshrn, do_urshr)
100
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/translate-mve.c
103
+++ b/target/arm/translate-mve.c
104
@@ -XXX,XX +XXX,XX @@ DO_VSHLL(VSHLL_BS, vshllbs)
105
DO_VSHLL(VSHLL_BU, vshllbu)
106
DO_VSHLL(VSHLL_TS, vshllts)
107
DO_VSHLL(VSHLL_TU, vshlltu)
108
+
109
+#define DO_2SHIFT_N(INSN, FN) \
110
+ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
111
+ { \
112
+ static MVEGenTwoOpShiftFn * const fns[] = { \
113
+ gen_helper_mve_##FN##b, \
114
+ gen_helper_mve_##FN##h, \
115
+ }; \
116
+ return do_2shift(s, a, fns[a->size], false); \
117
+ }
118
+
119
+DO_2SHIFT_N(VSHRNB, vshrnb)
120
+DO_2SHIFT_N(VSHRNT, vshrnt)
121
+DO_2SHIFT_N(VRSHRNB, vrshrnb)
122
+DO_2SHIFT_N(VRSHRNT, vrshrnt)
123
--
79
--
124
2.20.1
80
2.34.1
125
126
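do_urshr() above avoids the obvious rounding formula because adding the
rounding constant first can carry out of 64 bits. A standalone sketch of
the identity it relies on, valid for 0 < sh < 64 (illustrative only):

    #include <stdint.h>

    /* Equal to (x + (1 << (sh - 1))) >> sh, but the rounding bit is read
     * out of x directly, so the 64-bit intermediate cannot overflow. */
    static uint64_t urshr(uint64_t x, unsigned sh)
    {
        return (x >> sh) + ((x >> (sh - 1)) & 1);
    }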
1
Implement the MVE VSRI and VSLI insns, which perform a
1
Convert the test-and-branch-immediate insns TBZ and TBNZ
2
shift-and-insert operation.
2
to decodetree.
3
3
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20210628135835.6690-11-peter.maydell@linaro.org
6
Message-id: 20230512144106.3608981-16-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-mve.h | 8 ++++++++
8
target/arm/tcg/a64.decode | 6 ++++++
9
target/arm/mve.decode | 9 ++++++++
9
target/arm/tcg/translate-a64.c | 25 +++++--------------------
10
target/arm/mve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++
10
2 files changed, 11 insertions(+), 20 deletions(-)
11
target/arm/translate-mve.c | 3 +++
12
4 files changed, 62 insertions(+)
13
11
14
diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-mve.h
14
--- a/target/arm/tcg/a64.decode
17
+++ b/target/arm/helper-mve.h
15
+++ b/target/arm/tcg/a64.decode
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vshlltsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
16
@@ -XXX,XX +XXX,XX @@ BL 1 00101 .......................... @branch
19
DEF_HELPER_FLAGS_4(mve_vshlltsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
17
&cbz rt imm sf nz
20
DEF_HELPER_FLAGS_4(mve_vshlltub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
18
21
DEF_HELPER_FLAGS_4(mve_vshlltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
19
CBZ sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
22
+
20
+
23
+DEF_HELPER_FLAGS_4(mve_vsrib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
21
+%imm14 5:s14 !function=times_4
24
+DEF_HELPER_FLAGS_4(mve_vsrih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
22
+%imm31_19 31:1 19:5
25
+DEF_HELPER_FLAGS_4(mve_vsriw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
23
+&tbz rt imm nz bitpos
26
+
24
+
27
+DEF_HELPER_FLAGS_4(mve_vslib, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
25
+TBZ . 011011 nz:1 ..... .............. rt:5 &tbz imm=%imm14 bitpos=%imm31_19
28
+DEF_HELPER_FLAGS_4(mve_vslih, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
26
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
29
+DEF_HELPER_FLAGS_4(mve_vsliw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
30
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
31
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
32
--- a/target/arm/mve.decode
28
--- a/target/arm/tcg/translate-a64.c
33
+++ b/target/arm/mve.decode
29
+++ b/target/arm/tcg/translate-a64.c
34
@@ -XXX,XX +XXX,XX @@ VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
30
@@ -XXX,XX +XXX,XX @@ static bool trans_CBZ(DisasContext *s, arg_cbz *a)
35
31
return true;
36
VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
32
}
37
VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
33
38
+
34
-/* Test and branch (immediate)
39
+# Shift-and-insert
35
- * 31 30 25 24 23 19 18 5 4 0
40
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_b
36
- * +----+-------------+----+-------+-------------+------+
41
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_h
37
- * | b5 | 0 1 1 0 1 1 | op | b40 | imm14 | Rt |
42
+VSRI 111 1 1111 1 . ... ... ... 0 0100 0 1 . 1 ... 0 @2_shr_w
38
- * +----+-------------+----+-------+-------------+------+
43
+
39
- */
44
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_b
40
-static void disas_test_b_imm(DisasContext *s, uint32_t insn)
45
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_h
41
+static bool trans_TBZ(DisasContext *s, arg_tbz *a)
46
+VSLI 111 1 1111 1 . ... ... ... 0 0101 0 1 . 1 ... 0 @2_shl_w
42
{
47
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
43
- unsigned int bit_pos, op, rt;
48
index XXXXXXX..XXXXXXX 100644
44
- int64_t diff;
49
--- a/target/arm/mve_helper.c
45
DisasLabel match;
50
+++ b/target/arm/mve_helper.c
46
TCGv_i64 tcg_cmp;
51
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
47
52
DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
48
- bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
53
DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
49
- op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
54
50
- diff = sextract32(insn, 5, 14) * 4;
55
+/* Shift-and-insert; we always work with 64 bits at a time */
51
- rt = extract32(insn, 0, 5);
56
+#define DO_2SHIFT_INSERT(OP, ESIZE, SHIFTFN, MASKFN) \
52
-
57
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, \
53
tcg_cmp = tcg_temp_new_i64();
58
+ void *vm, uint32_t shift) \
54
- tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, rt), (1ULL << bit_pos));
59
+ { \
55
+ tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, a->rt), 1ULL << a->bitpos);
60
+ uint64_t *d = vd, *m = vm; \
56
61
+ uint16_t mask; \
57
reset_btype(s);
62
+ uint64_t shiftmask; \
58
63
+ unsigned e; \
59
match = gen_disas_label(s);
64
+ if (shift == 0 || shift == ESIZE * 8) { \
60
- tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
65
+ /* \
61
+ tcg_gen_brcondi_i64(a->nz ? TCG_COND_NE : TCG_COND_EQ,
66
+ * Only VSLI can shift by 0; only VSRI can shift by <dt>. \
62
tcg_cmp, 0, match.label);
67
+ * The generic logic would give the right answer for 0 but \
63
gen_goto_tb(s, 0, 4);
68
+ * fails for <dt>. \
64
set_disas_label(s, match);
69
+ */ \
65
- gen_goto_tb(s, 1, diff);
70
+ goto done; \
66
+ gen_goto_tb(s, 1, a->imm);
71
+ } \
67
+ return true;
72
+ assert(shift < ESIZE * 8); \
68
}
73
+ mask = mve_element_mask(env); \
69
74
+ /* ESIZE / 2 gives the MO_* value if ESIZE is in [1,2,4] */ \
70
/* Conditional branch (immediate)
75
+ shiftmask = dup_const(ESIZE / 2, MASKFN(ESIZE * 8, shift)); \
71
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
76
+ for (e = 0; e < 16 / 8; e++, mask >>= 8) { \
72
static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
77
+ uint64_t r = (SHIFTFN(m[H8(e)], shift) & shiftmask) | \
73
{
78
+ (d[H8(e)] & ~shiftmask); \
74
switch (extract32(insn, 25, 7)) {
79
+ mergemask(&d[H8(e)], r, mask); \
75
- case 0x1b: case 0x5b: /* Test & branch (immediate) */
80
+ } \
76
- disas_test_b_imm(s, insn);
81
+done: \
77
- break;
82
+ mve_advance_vpt(env); \
78
case 0x2a: /* Conditional branch (immediate) */
83
+ }
79
disas_cond_b_imm(s, insn);
84
+
80
break;
85
+#define DO_SHL(N, SHIFT) ((N) << (SHIFT))
86
+#define DO_SHR(N, SHIFT) ((N) >> (SHIFT))
87
+#define SHL_MASK(EBITS, SHIFT) MAKE_64BIT_MASK((SHIFT), (EBITS) - (SHIFT))
88
+#define SHR_MASK(EBITS, SHIFT) MAKE_64BIT_MASK(0, (EBITS) - (SHIFT))
89
+
90
+DO_2SHIFT_INSERT(vsrib, 1, DO_SHR, SHR_MASK)
91
+DO_2SHIFT_INSERT(vsrih, 2, DO_SHR, SHR_MASK)
92
+DO_2SHIFT_INSERT(vsriw, 4, DO_SHR, SHR_MASK)
93
+DO_2SHIFT_INSERT(vslib, 1, DO_SHL, SHL_MASK)
94
+DO_2SHIFT_INSERT(vslih, 2, DO_SHL, SHL_MASK)
95
+DO_2SHIFT_INSERT(vsliw, 4, DO_SHL, SHL_MASK)
96
+
97
/*
98
* Long shifts taking half-sized inputs from top or bottom of the input
99
* vector and producing a double-width result. ESIZE, TYPE are for
100
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/target/arm/translate-mve.c
103
+++ b/target/arm/translate-mve.c
104
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VSHRI_U, vshli_u, true)
105
DO_2SHIFT(VRSHRI_S, vrshli_s, true)
106
DO_2SHIFT(VRSHRI_U, vrshli_u, true)
107
108
+DO_2SHIFT(VSRI, vsri, false)
109
+DO_2SHIFT(VSLI, vsli, false)
110
+
111
#define DO_VSHLL(INSN, FN) \
112
static bool trans_##INSN(DisasContext *s, arg_2shift *a) \
113
{ \
114
--
81
--
115
2.20.1
82
2.34.1
116
117
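The SHL_MASK/SHR_MASK definitions above select which destination bits the
shifted source may overwrite; everything outside the mask is preserved,
which is what makes VSRI/VSLI "insert" operations. A standalone one-lane
sketch of VSLI on bytes, valid for 0 <= shift < 8 (the function name is
made up for illustration):

    #include <stdint.h>

    static uint8_t vsli_byte(uint8_t d, uint8_t m, unsigned shift)
    {
        uint8_t mask = (uint8_t)(0xffu << shift);  /* SHL_MASK(8, shift) */
        return (uint8_t)((((unsigned)m << shift) & mask) | (d & ~mask));
    }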
1
The function asimd_imm_const() in translate-neon.c is an
1
Convert the immediate conditional branch insn B.cond to
2
implementation of the pseudocode AdvSIMDExpandImm(), which we will
2
decodetree.
3
also want for MVE. Move the implementation to translate.c, with a
4
prototype in translate.h.
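AdvSIMDExpandImm() builds a 64-bit constant from the 8-bit immediate plus
cmode/op by shifting and replicating. A standalone sketch of the simplest
leg only (cmode 0: replicate imm8 into each 32-bit element, no shift); the
other cmodes differ, and this is illustrative rather than the moved
implementation:

    #include <stdint.h>

    static uint64_t expand_imm_cmode0(uint8_t imm8)
    {
        uint64_t v = imm8;
        return v | (v << 32);  /* same constant in both 32-bit halves */
    }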
5
3
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-4-peter.maydell@linaro.org
6
Message-id: 20230512144106.3608981-17-peter.maydell@linaro.org
9
---
7
---
10
target/arm/translate.h | 16 ++++++++++
8
target/arm/tcg/a64.decode | 2 ++
11
target/arm/translate-neon.c | 63 -------------------------------------
9
target/arm/tcg/translate-a64.c | 30 ++++++------------------------
12
target/arm/translate.c | 57 +++++++++++++++++++++++++++++++++
10
2 files changed, 8 insertions(+), 24 deletions(-)
13
3 files changed, 73 insertions(+), 63 deletions(-)
14
11
15
diff --git a/target/arm/translate.h b/target/arm/translate.h
12
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate.h
14
--- a/target/arm/tcg/a64.decode
18
+++ b/target/arm/translate.h
15
+++ b/target/arm/tcg/a64.decode
19
@@ -XXX,XX +XXX,XX @@ static inline MemOp finalize_memop(DisasContext *s, MemOp opc)
16
@@ -XXX,XX +XXX,XX @@ CBZ sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
20
return opc | s->be_data;
17
&tbz rt imm nz bitpos
18
19
TBZ . 011011 nz:1 ..... .............. rt:5 &tbz imm=%imm14 bitpos=%imm31_19
20
+
21
+B_cond 0101010 0 ................... 0 cond:4 imm=%imm19
22
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
23
index XXXXXXX..XXXXXXX 100644
24
--- a/target/arm/tcg/translate-a64.c
25
+++ b/target/arm/tcg/translate-a64.c
26
@@ -XXX,XX +XXX,XX @@ static bool trans_TBZ(DisasContext *s, arg_tbz *a)
27
return true;
21
}
28
}
22
29
23
+/**
30
-/* Conditional branch (immediate)
24
+ * asimd_imm_const: Expand an encoded SIMD constant value
31
- * 31 25 24 23 5 4 3 0
25
+ *
32
- * +---------------+----+---------------------+----+------+
26
+ * Expand a SIMD constant value. This is essentially the pseudocode
33
- * | 0 1 0 1 0 1 0 | o1 | imm19 | o0 | cond |
27
+ * AdvSIMDExpandImm, except that we also perform the boolean NOT needed for
34
- * +---------------+----+---------------------+----+------+
28
+ * VMVN and VBIC (when cmode < 14 && op == 1).
35
- */
29
+ *
36
-static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
30
+ * The combination cmode == 15 op == 1 is a reserved encoding for AArch32;
37
+static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
31
+ * callers must catch this.
38
{
32
+ *
39
- unsigned int cond;
33
+ * cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 was UNPREDICTABLE in v7A but
40
- int64_t diff;
34
+ * is either not unpredictable or merely CONSTRAINED UNPREDICTABLE in v8A;
41
-
35
+ * we produce an immediate constant value of 0 in these cases.
42
- if ((insn & (1 << 4)) || (insn & (1 << 24))) {
36
+ */
43
- unallocated_encoding(s);
37
+uint64_t asimd_imm_const(uint32_t imm, int cmode, int op);
44
- return;
38
+
45
- }
39
#endif /* TARGET_ARM_TRANSLATE_H */
46
- diff = sextract32(insn, 5, 19) * 4;
40
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
47
- cond = extract32(insn, 0, 4);
41
index XXXXXXX..XXXXXXX 100644
48
-
42
--- a/target/arm/translate-neon.c
49
reset_btype(s);
43
+++ b/target/arm/translate-neon.c
50
- if (cond < 0x0e) {
44
@@ -XXX,XX +XXX,XX @@ DO_FP_2SH(VCVT_UH, gen_helper_gvec_vcvt_uh)
51
+ if (a->cond < 0x0e) {
45
DO_FP_2SH(VCVT_HS, gen_helper_gvec_vcvt_hs)
52
/* genuinely conditional branches */
46
DO_FP_2SH(VCVT_HU, gen_helper_gvec_vcvt_hu)
53
DisasLabel match = gen_disas_label(s);
47
54
- arm_gen_test_cc(cond, match.label);
48
-static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
55
+ arm_gen_test_cc(a->cond, match.label);
49
-{
56
gen_goto_tb(s, 0, 4);
50
- /*
57
set_disas_label(s, match);
51
- * Expand the encoded constant.
58
- gen_goto_tb(s, 1, diff);
52
- * Note that cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
59
+ gen_goto_tb(s, 1, a->imm);
53
- * We choose to not special-case this and will behave as if a
60
} else {
54
- * valid constant encoding of 0 had been given.
61
/* 0xe and 0xf are both "always" conditions */
55
- * cmode = 15 op = 1 must UNDEF; we assume decode has handled that.
62
- gen_goto_tb(s, 0, diff);
56
- */
63
+ gen_goto_tb(s, 0, a->imm);
57
- switch (cmode) {
64
}
58
- case 0: case 1:
65
+ return true;
59
- /* no-op */
66
}
67
68
/* HINT instruction group, including various allocated HINTs */
69
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
70
static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
71
{
72
switch (extract32(insn, 25, 7)) {
73
- case 0x2a: /* Conditional branch (immediate) */
74
- disas_cond_b_imm(s, insn);
60
- break;
75
- break;
61
- case 2: case 3:
76
case 0x6a: /* Exception generation / System */
62
- imm <<= 8;
77
if (insn & (1 << 24)) {
63
- break;
78
if (extract32(insn, 22, 2) == 0) {
64
- case 4: case 5:
65
- imm <<= 16;
66
- break;
67
- case 6: case 7:
68
- imm <<= 24;
69
- break;
70
- case 8: case 9:
71
- imm |= imm << 16;
72
- break;
73
- case 10: case 11:
74
- imm = (imm << 8) | (imm << 24);
75
- break;
76
- case 12:
77
- imm = (imm << 8) | 0xff;
78
- break;
79
- case 13:
80
- imm = (imm << 16) | 0xffff;
81
- break;
82
- case 14:
83
- if (op) {
84
- /*
85
- * This is the only case where the top and bottom 32 bits
86
- * of the encoded constant differ.
87
- */
88
- uint64_t imm64 = 0;
89
- int n;
90
-
91
- for (n = 0; n < 8; n++) {
92
- if (imm & (1 << n)) {
93
- imm64 |= (0xffULL << (n * 8));
94
- }
95
- }
96
- return imm64;
97
- }
98
- imm |= (imm << 8) | (imm << 16) | (imm << 24);
99
- break;
100
- case 15:
101
- imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
102
- | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
103
- break;
104
- }
105
- if (op) {
106
- imm = ~imm;
107
- }
108
- return dup_const(MO_32, imm);
109
-}
110
-
111
static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
112
GVecGen2iFn *fn)
113
{
114
diff --git a/target/arm/translate.c b/target/arm/translate.c
115
index XXXXXXX..XXXXXXX 100644
116
--- a/target/arm/translate.c
117
+++ b/target/arm/translate.c
118
@@ -XXX,XX +XXX,XX @@ void arm_translate_init(void)
119
a64_translate_init();
120
}
121
122
+uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
123
+{
124
+ /* Expand the encoded constant as per AdvSIMDExpandImm pseudocode */
125
+ switch (cmode) {
126
+ case 0: case 1:
127
+ /* no-op */
128
+ break;
129
+ case 2: case 3:
130
+ imm <<= 8;
131
+ break;
132
+ case 4: case 5:
133
+ imm <<= 16;
134
+ break;
135
+ case 6: case 7:
136
+ imm <<= 24;
137
+ break;
138
+ case 8: case 9:
139
+ imm |= imm << 16;
140
+ break;
141
+ case 10: case 11:
142
+ imm = (imm << 8) | (imm << 24);
143
+ break;
144
+ case 12:
145
+ imm = (imm << 8) | 0xff;
146
+ break;
147
+ case 13:
148
+ imm = (imm << 16) | 0xffff;
149
+ break;
150
+ case 14:
151
+ if (op) {
152
+ /*
153
+ * This is the only case where the top and bottom 32 bits
154
+ * of the encoded constant differ.
155
+ */
156
+ uint64_t imm64 = 0;
157
+ int n;
158
+
159
+ for (n = 0; n < 8; n++) {
160
+ if (imm & (1 << n)) {
161
+ imm64 |= (0xffULL << (n * 8));
162
+ }
163
+ }
164
+ return imm64;
165
+ }
166
+ imm |= (imm << 8) | (imm << 16) | (imm << 24);
167
+ break;
168
+ case 15:
169
+ imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
170
+ | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
171
+ break;
172
+ }
173
+ if (op) {
174
+ imm = ~imm;
175
+ }
176
+ return dup_const(MO_32, imm);
177
+}
178
+
179
/* Generate a label used for skipping this instruction */
180
void arm_gen_condlabel(DisasContext *s)
181
{
182
--
79
--
183
2.20.1
80
2.34.1
184
185
diff view generated by jsdifflib
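For reference, two worked expansions of the AdvSIMDExpandImm cases moved above. This standalone C program (not part of either patch) just replays the cmode=12 and cmode=14/op=1 arithmetic on sample inputs:

    /*
     * Standalone worked example (not from the patch): two sample
     * expansions matching the asimd_imm_const() switch above.
     */
    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* cmode=12, op=0: imm8=0xAB expands to 0x0000ABFF per 32 bits */
        uint32_t imm = (0xAB << 8) | 0xff;
        uint64_t expanded = (uint64_t)imm << 32 | imm; /* dup_const(MO_32) */
        printf("cmode=12: %#018" PRIx64 "\n", expanded);

        /* cmode=14, op=1: each imm8 bit selects a whole byte of ones */
        uint64_t imm64 = 0;
        for (int n = 0; n < 8; n++) {
            if (0x81 & (1 << n)) {             /* imm8 = 0x81 */
                imm64 |= 0xffULL << (n * 8);
            }
        }
        printf("cmode=14: %#018" PRIx64 "\n", imm64); /* 0xff000000000000ff */
        return 0;
    }

The second case is the one the function comment singles out: it is the only encoding whose top and bottom 32 bits can differ, which is why it returns directly instead of going through dup_const().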
Implement the MVE long shifts by register, which perform shifts on a
pair of general-purpose registers treated as a 64-bit quantity, with
the shift count in another general-purpose register, which might be
either positive or negative.

Like the long-shifts-by-immediate, these encodings sit in the space
that was previously the UNPREDICTABLE MOVS/ORRS with Rm==13,15.
Because LSLL_rr and ASRL_rr overlap with both MOV_rxri/ORR_rrri and
also with CSEL (as one of the previously-UNPREDICTABLE Rm==13 cases),
we have to move the CSEL pattern into the same decodetree group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-17-peter.maydell@linaro.org
---
 target/arm/helper-mve.h |  6 +++
 target/arm/translate.h  |  1 +
 target/arm/t32.decode   | 16 +++++--
 target/arm/mve_helper.c | 93 +++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c  | 69 ++++++++++++++++++++++++++++++
 5 files changed, 182 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
 
+DEF_HELPER_FLAGS_3(mve_sshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_ushll, TCG_CALL_NO_RWG, i64, env, i64, i32)
 DEF_HELPER_FLAGS_3(mve_sqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
 DEF_HELPER_FLAGS_3(mve_uqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_sqrshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_uqrshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_sqrshrl48, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_uqrshll48, TCG_CALL_NO_RWG, i64, env, i64, i32)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
 typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
+typedef void WideShiftFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i32);
 
 /**
  * arm_tbflags_from_tb:
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -XXX,XX +XXX,XX @@
 &mcrr            !extern cp opc1 crm rt rt2
 
 &mve_shl_ri      rdalo rdahi shim
+&mve_shl_rr      rdalo rdahi rm
 
 # rdahi: bits [3:1] from insn, bit 0 is 1
 # rdalo: bits [3:1] from insn, bit 0 is 0
@@ -XXX,XX +XXX,XX @@
 
 @mve_shl_ri      ....... .... . ... . . ... ... . .. .. .... \
                  &mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
+@mve_shl_rr      ....... .... . ... . rm:4 ... . .. .. .... \
+                 &mve_shl_rr rdalo=%rdalo_17 rdahi=%rdahi_9
 
 {
   TST_xrri       1110101 0000 1 .... 0 ... 1111 .... ....     @S_xrr_shi
@@ -XXX,XX +XXX,XX @@ BIC_rrri         1110101 0001 . .... 0 ... .... .... ....     @s_rrr_shi
   URSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111  @mve_shl_ri
   SRSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111  @mve_shl_ri
   SQSHLL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111  @mve_shl_ri
+
+  LSLL_rr        1110101 0010 1 ... 0 .... ... 1 0000 1101    @mve_shl_rr
+  ASRL_rr        1110101 0010 1 ... 0 .... ... 1 0010 1101    @mve_shl_rr
+  UQRSHLL64_rr   1110101 0010 1 ... 1 .... ... 1 0000 1101    @mve_shl_rr
+  SQRSHRL64_rr   1110101 0010 1 ... 1 .... ... 1 0010 1101    @mve_shl_rr
+  UQRSHLL48_rr   1110101 0010 1 ... 1 .... ... 1 1000 1101    @mve_shl_rr
+  SQRSHRL48_rr   1110101 0010 1 ... 1 .... ... 1 1010 1101    @mve_shl_rr
  ]
 
  MOV_rxri        1110101 0010 . 1111 0 ... .... .... ....     @s_rxr_shi
  ORR_rrri        1110101 0010 . .... 0 ... .... .... ....     @s_rrr_shi
+
+ # v8.1M CSEL and friends
+ CSEL            1110101 0010 1 rn:4 10 op:2 rd:4 fcond:4 rm:4
 }
 {
  MVN_rxri        1110101 0011 . 1111 0 ... .... .... ....     @s_rxr_shi
@@ -XXX,XX +XXX,XX @@ SBC_rrri         1110101 1011 . .... 0 ... .... .... ....     @s_rrr_shi
 }
 RSB_rrri         1110101 1110 . .... 0 ... .... .... ....     @s_rrr_shi
 
-# v8.1M CSEL and friends
-CSEL             1110101 0010 1 rn:4 10 op:2 rd:4 fcond:4 rm:4
-
 # Data-processing (register-shifted register)
 
 MOV_rxrr         1111 1010 0 shty:2 s:1 rm:4 1111 rd:4 0000 rs:4 \
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
     return rdm;
 }
 
+uint64_t HELPER(mve_sshrl)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_sqrshl_d(n, -(int8_t)shift, false, NULL);
+}
+
+uint64_t HELPER(mve_ushll)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_uqrshl_d(n, (int8_t)shift, false, NULL);
+}
+
 uint64_t HELPER(mve_sqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
 {
     return do_sqrshl_d(n, (int8_t)shift, false, &env->QF);
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mve_uqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
 {
     return do_uqrshl_d(n, (int8_t)shift, false, &env->QF);
 }
+
+uint64_t HELPER(mve_sqrshrl)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_sqrshl_d(n, -(int8_t)shift, true, &env->QF);
+}
+
+uint64_t HELPER(mve_uqrshll)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_uqrshl_d(n, (int8_t)shift, true, &env->QF);
+}
+
+/* Operate on 64-bit values, but saturate at 48 bits */
+static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift,
+                                    bool round, uint32_t *sat)
+{
+    if (shift <= -48) {
+        /* Rounding the sign bit always produces 0. */
+        if (round) {
+            return 0;
+        }
+        return src >> 63;
+    } else if (shift < 0) {
+        if (round) {
+            src >>= -shift - 1;
+            return (src >> 1) + (src & 1);
+        }
+        return src >> -shift;
+    } else if (shift < 48) {
+        int64_t val = src << shift;
+        int64_t extval = sextract64(val, 0, 48);
+        if (!sat || val == extval) {
+            return extval;
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return (1ULL << 47) - (src >= 0);
+}
+
+/* Operate on 64-bit values, but saturate at 48 bits */
+static inline uint64_t do_uqrshl48_d(uint64_t src, int64_t shift,
+                                     bool round, uint32_t *sat)
+{
+    uint64_t val, extval;
+
+    if (shift <= -(48 + round)) {
+        return 0;
+    } else if (shift < 0) {
+        if (round) {
+            val = src >> (-shift - 1);
+            val = (val >> 1) + (val & 1);
+        } else {
+            val = src >> -shift;
+        }
+        extval = extract64(val, 0, 48);
+        if (!sat || val == extval) {
+            return extval;
+        }
+    } else if (shift < 48) {
+        uint64_t val = src << shift;
+        uint64_t extval = extract64(val, 0, 48);
+        if (!sat || val == extval) {
+            return extval;
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return MAKE_64BIT_MASK(0, 48);
+}
+
+uint64_t HELPER(mve_sqrshrl48)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_sqrshl48_d(n, -(int8_t)shift, true, &env->QF);
+}
+
+uint64_t HELPER(mve_uqrshll48)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_uqrshl48_d(n, (int8_t)shift, true, &env->QF);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static bool trans_URSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
     return do_mve_shl_ri(s, a, gen_urshr64_i64);
 }
 
+static bool do_mve_shl_rr(DisasContext *s, arg_mve_shl_rr *a, WideShiftFn *fn)
+{
+    TCGv_i64 rda;
+    TCGv_i32 rdalo, rdahi;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
+        /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
+        return false;
+    }
+    if (a->rdahi == 15) {
+        /* These are a different encoding (SQSHL/SRSHR/UQSHL/URSHR) */
+        return false;
+    }
+    if (!dc_isar_feature(aa32_mve, s) ||
+        !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
+        a->rdahi == 13 || a->rm == 13 || a->rm == 15 ||
+        a->rm == a->rdahi || a->rm == a->rdalo) {
+        /* These rdahi/rdalo/rm cases are UNPREDICTABLE; we choose to UNDEF */
+        unallocated_encoding(s);
+        return true;
+    }
+
+    rda = tcg_temp_new_i64();
+    rdalo = load_reg(s, a->rdalo);
+    rdahi = load_reg(s, a->rdahi);
+    tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
+
+    /* The helper takes care of the sign-extension of the low 8 bits of Rm */
+    fn(rda, cpu_env, rda, cpu_R[a->rm]);
+
+    tcg_gen_extrl_i64_i32(rdalo, rda);
+    tcg_gen_extrh_i64_i32(rdahi, rda);
+    store_reg(s, a->rdalo, rdalo);
+    store_reg(s, a->rdahi, rdahi);
+    tcg_temp_free_i64(rda);
+
+    return true;
+}
+
+static bool trans_LSLL_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_ushll);
+}
+
+static bool trans_ASRL_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_sshrl);
+}
+
+static bool trans_UQRSHLL64_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_uqrshll);
+}
+
+static bool trans_SQRSHRL64_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl);
+}
+
+static bool trans_UQRSHLL48_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_uqrshll48);
+}
+
+static bool trans_SQRSHRL48_rr(DisasContext *s, arg_mve_shl_rr *a)
+{
+    return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl48);
+}
+
 /*
  * Multiply and multiply accumulate
  */
-- 
2.20.1

Convert the simple (non-pointer-auth) BR, BLR and RET insns
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-18-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  5 ++++
 target/arm/tcg/translate-a64.c | 55 ++++++++++++++++++++++++++++++----
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 # This file is processed by scripts/decodetree.py
 #
 
+&r              rn
 &ri             rd imm
 &rri_sf         rd rn imm sf
 &i              imm
@@ -XXX,XX +XXX,XX @@ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
 TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz imm=%imm14 bitpos=%imm31_19
 
 B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
+
+BR              1101011 0000 11111 000000 rn:5 00000 &r
+BLR             1101011 0001 11111 000000 rn:5 00000 &r
+RET             1101011 0010 11111 000000 rn:5 00000 &r
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
     return true;
 }
 
+static void set_btype_for_br(DisasContext *s, int rn)
+{
+    if (dc_isar_feature(aa64_bti, s)) {
+        /* BR to {x16,x17} or !guard -> 1, else 3.  */
+        set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
+    }
+}
+
+static void set_btype_for_blr(DisasContext *s)
+{
+    if (dc_isar_feature(aa64_bti, s)) {
+        /* BLR sets BTYPE to 2, regardless of source guarded page.  */
+        set_btype(s, 2);
+    }
+}
+
+static bool trans_BR(DisasContext *s, arg_r *a)
+{
+    gen_a64_set_pc(s, cpu_reg(s, a->rn));
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLR(DisasContext *s, arg_r *a)
+{
+    TCGv_i64 dst = cpu_reg(s, a->rn);
+    TCGv_i64 lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_RET(DisasContext *s, arg_r *a)
+{
+    gen_a64_set_pc(s, cpu_reg(s, a->rn));
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
     btype_mod = opc;
     switch (op3) {
     case 0:
-        /* BR, BLR, RET */
-        if (op4 != 0) {
-            goto do_unallocated;
-        }
-        dst = cpu_reg(s, rn);
-        break;
+        /* BR, BLR, RET : handled in decodetree */
+        goto do_unallocated;
 
     case 2:
     case 3:
-- 
2.34.1

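For reference, the 48-bit saturation boundary in do_sqrshl48_d() can be replayed in a standalone C sketch (not from the patch; the sext48 helper below stands in for QEMU's sextract64(val, 0, 48)):

    /*
     * Standalone sketch (not from the patch): the 48-bit saturation
     * check used by do_sqrshl48_d, shown for a positive input.
     */
    #include <stdint.h>
    #include <stdio.h>

    static int64_t sext48(int64_t v)      /* sextract64(v, 0, 48) */
    {
        return (int64_t)((uint64_t)v << 16) >> 16;
    }

    int main(void)
    {
        int64_t src = 0x200000000000LL;   /* bit 45 set */
        int64_t val = src << 1;           /* bit 46: still a valid 48-bit value */
        printf("shift 1: %#llx sat=%d\n", (long long)val, sext48(val) != val);
        val = src << 2;                   /* bit 47: sign bit of a 48-bit value */
        /* saturates to (1 << 47) - 1 for non-negative inputs */
        printf("shift 2: sat=%d -> %#llx\n", sext48(val) != val,
               (long long)((1ULL << 47) - 1));
        return 0;
    }

A value whose 48-bit sign-extension no longer equals the 64-bit result has overflowed the 48-bit range, which is exactly when the helper sets the Q flag and returns the saturated value.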
Implement the MVE shifts by immediate, which perform shifts
on a single general-purpose register.

These patterns overlap with the long-shift-by-immediates,
so we have to rearrange the grouping a little here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-18-peter.maydell@linaro.org
---
 target/arm/helper-mve.h |  3 ++
 target/arm/translate.h  |  1 +
 target/arm/t32.decode   | 31 ++++++++++++++-----
 target/arm/mve_helper.c | 10 ++++++
 target/arm/translate.c  | 68 +++++++++++++++++++++++++++++++++++++++--
 5 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_sqrshrl, TCG_CALL_NO_RWG, i64, env, i64, i32)
 DEF_HELPER_FLAGS_3(mve_uqrshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
 DEF_HELPER_FLAGS_3(mve_sqrshrl48, TCG_CALL_NO_RWG, i64, env, i64, i32)
 DEF_HELPER_FLAGS_3(mve_uqrshll48, TCG_CALL_NO_RWG, i64, env, i64, i32)
+
+DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
+DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
 typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
 typedef void WideShiftFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i32);
+typedef void ShiftImmFn(TCGv_i32, TCGv_i32, int32_t shift);
 
 /**
  * arm_tbflags_from_tb:
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -XXX,XX +XXX,XX @@
 
 &mve_shl_ri      rdalo rdahi shim
 &mve_shl_rr      rdalo rdahi rm
+&mve_sh_ri       rda shim
 
 # rdahi: bits [3:1] from insn, bit 0 is 1
 # rdalo: bits [3:1] from insn, bit 0 is 0
@@ -XXX,XX +XXX,XX @@
                  &mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
 @mve_shl_rr      ....... .... . ... . rm:4 ... . .. .. .... \
                  &mve_shl_rr rdalo=%rdalo_17 rdahi=%rdahi_9
+@mve_sh_ri       ....... .... . rda:4 . ... ... . .. .. .... \
+                 &mve_sh_ri shim=%imm5_12_6
 
 {
   TST_xrri       1110101 0000 1 .... 0 ... 1111 .... ....     @S_xrr_shi
@@ -XXX,XX +XXX,XX @@ BIC_rrri         1110101 0001 . .... 0 ... .... .... ....     @s_rrr_shi
  # the rest fall through (where ORR_rrri and MOV_rxri will end up
  # handling them as r13 and r15 accesses with the same semantics as A32).
  [
-  LSLL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 00 1111  @mve_shl_ri
-  LSRL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 01 1111  @mve_shl_ri
-  ASRL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 10 1111  @mve_shl_ri
+  {
+    UQSHL_ri     1110101 0010 1 ....   0 ...  1111 .. 00 1111 @mve_sh_ri
+    LSLL_ri      1110101 0010 1 ... 0  0 ... ... 1 .. 00 1111 @mve_shl_ri
+    UQSHLL_ri    1110101 0010 1 ... 1  0 ... ... 1 .. 00 1111 @mve_shl_ri
+  }
 
-  UQSHLL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 00 1111  @mve_shl_ri
-  URSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111  @mve_shl_ri
-  SRSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111  @mve_shl_ri
-  SQSHLL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111  @mve_shl_ri
+  {
+    URSHR_ri     1110101 0010 1 ....   0 ...  1111 .. 01 1111 @mve_sh_ri
+    LSRL_ri      1110101 0010 1 ... 0  0 ... ... 1 .. 01 1111 @mve_shl_ri
+    URSHRL_ri    1110101 0010 1 ... 1  0 ... ... 1 .. 01 1111 @mve_shl_ri
+  }
+
+  {
+    SRSHR_ri     1110101 0010 1 ....   0 ...  1111 .. 10 1111 @mve_sh_ri
+    ASRL_ri      1110101 0010 1 ... 0  0 ... ... 1 .. 10 1111 @mve_shl_ri
+    SRSHRL_ri    1110101 0010 1 ... 1  0 ... ... 1 .. 10 1111 @mve_shl_ri
+  }
+
+  {
+    SQSHL_ri     1110101 0010 1 ....   0 ...  1111 .. 11 1111 @mve_sh_ri
+    SQSHLL_ri    1110101 0010 1 ... 1  0 ... ... 1 .. 11 1111 @mve_shl_ri
+  }
 
   LSLL_rr        1110101 0010 1 ... 0 .... ... 1 0000 1101    @mve_shl_rr
   ASRL_rr        1110101 0010 1 ... 0 .... ... 1 0010 1101    @mve_shl_rr
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(mve_uqrshll48)(CPUARMState *env, uint64_t n, uint32_t shift)
 {
     return do_uqrshl48_d(n, (int8_t)shift, true, &env->QF);
 }
+
+uint32_t HELPER(mve_uqshl)(CPUARMState *env, uint32_t n, uint32_t shift)
+{
+    return do_uqrshl_bhs(n, (int8_t)shift, 32, false, &env->QF);
+}
+
+uint32_t HELPER(mve_sqshl)(CPUARMState *env, uint32_t n, uint32_t shift)
+{
+    return do_sqrshl_bhs(n, (int8_t)shift, 32, false, &env->QF);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 
 static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 {
-    TCGv_i32 t = tcg_temp_new_i32();
+    TCGv_i32 t;
 
+    /* Handle shift by the input size for the benefit of trans_SRSHR_ri */
+    if (sh == 32) {
+        tcg_gen_movi_i32(d, 0);
+        return;
+    }
+    t = tcg_temp_new_i32();
     tcg_gen_extract_i32(t, a, sh - 1, 1);
     tcg_gen_sari_i32(d, a, sh);
     tcg_gen_add_i32(d, d, t);
@@ -XXX,XX +XXX,XX @@ static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 
 static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 {
-    TCGv_i32 t = tcg_temp_new_i32();
+    TCGv_i32 t;
 
+    /* Handle shift by the input size for the benefit of trans_URSHR_ri */
+    if (sh == 32) {
+        tcg_gen_extract_i32(d, a, sh - 1, 1);
+        return;
+    }
+    t = tcg_temp_new_i32();
     tcg_gen_extract_i32(t, a, sh - 1, 1);
     tcg_gen_shri_i32(d, a, sh);
     tcg_gen_add_i32(d, d, t);
@@ -XXX,XX +XXX,XX @@ static bool trans_SQRSHRL48_rr(DisasContext *s, arg_mve_shl_rr *a)
     return do_mve_shl_rr(s, a, gen_helper_mve_sqrshrl48);
 }
 
+static bool do_mve_sh_ri(DisasContext *s, arg_mve_sh_ri *a, ShiftImmFn *fn)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
+        /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
+        return false;
+    }
+    if (!dc_isar_feature(aa32_mve, s) ||
+        !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
+        a->rda == 13 || a->rda == 15) {
+        /* These rda cases are UNPREDICTABLE; we choose to UNDEF */
+        unallocated_encoding(s);
+        return true;
+    }
+
+    if (a->shim == 0) {
+        a->shim = 32;
+    }
+    fn(cpu_R[a->rda], cpu_R[a->rda], a->shim);
+
+    return true;
+}
+
+static bool trans_URSHR_ri(DisasContext *s, arg_mve_sh_ri *a)
+{
+    return do_mve_sh_ri(s, a, gen_urshr32_i32);
+}
+
+static bool trans_SRSHR_ri(DisasContext *s, arg_mve_sh_ri *a)
+{
+    return do_mve_sh_ri(s, a, gen_srshr32_i32);
+}
+
+static void gen_mve_sqshl(TCGv_i32 r, TCGv_i32 n, int32_t shift)
+{
+    gen_helper_mve_sqshl(r, cpu_env, n, tcg_constant_i32(shift));
+}
+
+static bool trans_SQSHL_ri(DisasContext *s, arg_mve_sh_ri *a)
+{
+    return do_mve_sh_ri(s, a, gen_mve_sqshl);
+}
+
+static void gen_mve_uqshl(TCGv_i32 r, TCGv_i32 n, int32_t shift)
+{
+    gen_helper_mve_uqshl(r, cpu_env, n, tcg_constant_i32(shift));
+}
+
+static bool trans_UQSHL_ri(DisasContext *s, arg_mve_sh_ri *a)
+{
+    return do_mve_sh_ri(s, a, gen_mve_uqshl);
+}
+
 /*
  * Multiply and multiply accumulate
  */
-- 
2.20.1

Convert the single-register pointer-authentication variants of BR,
BLR, RET to decodetree.  (BRAA/BLRAA are in a different branch of
the legacy decoder and will be dealt with in the next commit.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-19-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |   7 ++
 target/arm/tcg/translate-a64.c | 132 +++++++++++++++++++--------------
 2 files changed, 84 insertions(+), 55 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
 BR              1101011 0000 11111 000000 rn:5 00000 &r
 BLR             1101011 0001 11111 000000 rn:5 00000 &r
 RET             1101011 0010 11111 000000 rn:5 00000 &r
+
+&braz           rn m
+BRAZ            1101011 0000 11111 00001 m:1 rn:5 11111 &braz  # BRAAZ, BRABZ
+BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz  # BLRAAZ, BLRABZ
+
+&reta           m
+RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta # RETAA, RETAB
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_RET(DisasContext *s, arg_r *a)
     return true;
 }
 
+static TCGv_i64 auth_branch_target(DisasContext *s, TCGv_i64 dst,
+                                   TCGv_i64 modifier, bool use_key_a)
+{
+    TCGv_i64 truedst;
+    /*
+     * Return the branch target for a BRAA/RETA/etc, which is either
+     * just the destination dst, or that value with the pauth check
+     * done and the code removed from the high bits.
+     */
+    if (!s->pauth_active) {
+        return dst;
+    }
+
+    truedst = tcg_temp_new_i64();
+    if (use_key_a) {
+        gen_helper_autia(truedst, cpu_env, dst, modifier);
+    } else {
+        gen_helper_autib(truedst, cpu_env, dst, modifier);
+    }
+    return truedst;
+}
+
+static bool trans_BRAZ(DisasContext *s, arg_braz *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
+    gen_a64_set_pc(s, dst);
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLRAZ(DisasContext *s, arg_braz *a)
+{
+    TCGv_i64 dst, lr;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
+    lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_RETA(DisasContext *s, arg_reta *a)
+{
+    TCGv_i64 dst;
+
+    dst = auth_branch_target(s, cpu_reg(s, 30), cpu_X[31], !a->m);
+    gen_a64_set_pc(s, dst);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
     }
 
     switch (opc) {
-    case 0: /* BR */
-    case 1: /* BLR */
-    case 2: /* RET */
-        btype_mod = opc;
-        switch (op3) {
-        case 0:
-            /* BR, BLR, RET : handled in decodetree */
-            goto do_unallocated;
-
-        case 2:
-        case 3:
-            if (!dc_isar_feature(aa64_pauth, s)) {
-                goto do_unallocated;
-            }
-            if (opc == 2) {
-                /* RETAA, RETAB */
-                if (rn != 0x1f || op4 != 0x1f) {
-                    goto do_unallocated;
-                }
-                rn = 30;
-                modifier = cpu_X[31];
-            } else {
-                /* BRAAZ, BRABZ, BLRAAZ, BLRABZ */
-                if (op4 != 0x1f) {
-                    goto do_unallocated;
-                }
-                modifier = tcg_constant_i64(0);
-            }
-            if (s->pauth_active) {
-                dst = tcg_temp_new_i64();
-                if (op3 == 2) {
-                    gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
-                } else {
-                    gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
-                }
-            } else {
-                dst = cpu_reg(s, rn);
-            }
-            break;
-
-        default:
-            goto do_unallocated;
-        }
-        /* BLR also needs to load return address */
-        if (opc == 1) {
-            TCGv_i64 lr = cpu_reg(s, 30);
-            if (dst == lr) {
-                TCGv_i64 tmp = tcg_temp_new_i64();
-                tcg_gen_mov_i64(tmp, dst);
-                dst = tmp;
-            }
-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
-        }
-        gen_a64_set_pc(s, dst);
-        break;
+    case 0:
+    case 1:
+    case 2:
+        /*
+         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
+         * handled in decodetree
+         */
+        goto do_unallocated;
 
     case 8: /* BRAA */
     case 9: /* BLRAA */
-- 
2.34.1

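For reference, the sh == 32 special case added to gen_urshr32_i32() above follows from the rounding-shift identity. A standalone C sketch (not from the patch; urshr32 is an illustrative name, not a QEMU function):

    /*
     * Standalone sketch (not from the patch): unsigned rounding shift
     * right is (x >> sh) plus a carry taken from bit (sh - 1), so a
     * shift by the full input width leaves only the carry bit.
     */
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t urshr32(uint32_t x, unsigned sh)
    {
        if (sh == 32) {
            return (x >> 31) & 1;   /* value fully shifted out; carry only */
        }
        return (x >> sh) + ((x >> (sh - 1)) & 1);
    }

    int main(void)
    {
        printf("%u\n", urshr32(0x80000000u, 32)); /* 1 */
        printf("%u\n", urshr32(0x7fffffffu, 32)); /* 0 */
        printf("%u\n", urshr32(6u, 2));           /* 6/4 rounds to 2 */
        return 0;
    }

This also explains why the signed variant simply returns 0 for sh == 32: rounding the sign bit of a sign-extended value always cancels it.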
The MVE extension to v8.1M includes some new shift instructions which
sit entirely within the non-coprocessor part of the encoding space
and which operate only on general-purpose registers.  They take up
the space which was previously the UNPREDICTABLE MOVS and ORRS encodings
with Rm == 13 or 15.

Implement the long shifts by immediate, which perform shifts on a
pair of general-purpose registers treated as a 64-bit quantity, with
an immediate shift count between 1 and 32.

Awkwardly, because the MOVS and ORRS trans functions do not UNDEF for
the Rm==13,15 case, we need to explicitly emit code to UNDEF for the
cases where v8.1M now requires that.  (Trying to change MOVS and ORRS
is too difficult, because the functions that generate the code are
shared between a dozen different kinds of arithmetic or logical
instruction for all A32, T16 and T32 encodings, and for some insns
and some encodings Rm==13,15 are valid.)

We make the helper functions we need for UQSHLL and SQSHLL take
a 32-bit value which the helper casts to int8_t because we'll need
these helpers also for the shift-by-register insns, where the shift
count might be < 0 or > 32.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-16-peter.maydell@linaro.org
---
 target/arm/helper-mve.h |  3 ++
 target/arm/translate.h  |  1 +
 target/arm/t32.decode   | 28 +++++++++++++++++++
 target/arm/mve_helper.c | 10 +++++++
 target/arm/translate.c  | 90 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 132 insertions(+)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vqrshruntb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vqrshrunth, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(mve_vshlc, TCG_CALL_NO_WG, i32, env, ptr, i32, i32)
+
+DEF_HELPER_FLAGS_3(mve_sqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
+DEF_HELPER_FLAGS_3(mve_uqshll, TCG_CALL_NO_RWG, i64, env, i64, i32)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
+typedef void WideShiftImmFn(TCGv_i64, TCGv_i64, int64_t shift);
 
 /**
  * arm_tbflags_from_tb:
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -XXX,XX +XXX,XX @@
 &mcr             !extern cp opc1 crn crm opc2 rt
 &mcrr            !extern cp opc1 crm rt rt2
 
+&mve_shl_ri      rdalo rdahi shim
+
+# rdahi: bits [3:1] from insn, bit 0 is 1
+# rdalo: bits [3:1] from insn, bit 0 is 0
+%rdahi_9         9:3 !function=times_2_plus_1
+%rdalo_17        17:3 !function=times_2
+
 # Data-processing (register)
 
 %imm5_12_6       12:3 6:2
@@ -XXX,XX +XXX,XX @@
 @S_xrr_shi       ....... .... . rn:4 .... .... .. shty:2 rm:4 \
                  &s_rrr_shi shim=%imm5_12_6 s=1 rd=0
 
+@mve_shl_ri      ....... .... . ... . . ... ... . .. .. .... \
+                 &mve_shl_ri shim=%imm5_12_6 rdalo=%rdalo_17 rdahi=%rdahi_9
+
 {
  TST_xrri         1110101 0000 1 .... 0 ... 1111 .... ....     @S_xrr_shi
  AND_rrri         1110101 0000 . .... 0 ... .... .... ....     @s_rrr_shi
 }
 BIC_rrri          1110101 0001 . .... 0 ... .... .... ....     @s_rrr_shi
 {
+ # The v8.1M MVE shift insns overlap in encoding with MOVS/ORRS
+ # and are distinguished by having Rm==13 or 15. Those are UNPREDICTABLE
+ # cases for MOVS/ORRS. We decode the MVE cases first, ensuring that
+ # they explicitly call unallocated_encoding() for cases that must UNDEF
+ # (eg "using a new shift insn on a v8.1M CPU without MVE"), and letting
+ # the rest fall through (where ORR_rrri and MOV_rxri will end up
+ # handling them as r13 and r15 accesses with the same semantics as A32).
+ [
+  LSLL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 00 1111  @mve_shl_ri
+  LSRL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 01 1111  @mve_shl_ri
+  ASRL_ri        1110101 0010 1 ... 0 0 ... ... 1 .. 10 1111  @mve_shl_ri
+
+  UQSHLL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 00 1111  @mve_shl_ri
+  URSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 01 1111  @mve_shl_ri
+  SRSHRL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 10 1111  @mve_shl_ri
+  SQSHLL_ri      1110101 0010 1 ... 1 0 ... ... 1 .. 11 1111  @mve_shl_ri
+ ]
+
  MOV_rxri         1110101 0010 . 1111 0 ... .... .... ....     @s_rxr_shi
  ORR_rrri         1110101 0010 . .... 0 ... .... .... ....     @s_rrr_shi
 }
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm,
     mve_advance_vpt(env);
     return rdm;
 }
+
+uint64_t HELPER(mve_sqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_sqrshl_d(n, (int8_t)shift, false, &env->QF);
+}
+
+uint64_t HELPER(mve_uqshll)(CPUARMState *env, uint64_t n, uint32_t shift)
+{
+    return do_uqrshl_d(n, (int8_t)shift, false, &env->QF);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVT(DisasContext *s, arg_MOVW *a)
     return true;
 }
 
+/*
+ * v8.1M MVE wide-shifts
+ */
+static bool do_mve_shl_ri(DisasContext *s, arg_mve_shl_ri *a,
+                          WideShiftImmFn *fn)
+{
+    TCGv_i64 rda;
+    TCGv_i32 rdalo, rdahi;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
+        /* Decode falls through to ORR/MOV UNPREDICTABLE handling */
+        return false;
+    }
+    if (a->rdahi == 15) {
+        /* These are a different encoding (SQSHL/SRSHR/UQSHL/URSHR) */
+        return false;
+    }
+    if (!dc_isar_feature(aa32_mve, s) ||
+        !arm_dc_feature(s, ARM_FEATURE_M_MAIN) ||
+        a->rdahi == 13) {
+        /* RdaHi == 13 is UNPREDICTABLE; we choose to UNDEF */
+        unallocated_encoding(s);
+        return true;
+    }
+
+    if (a->shim == 0) {
+        a->shim = 32;
+    }
+
+    rda = tcg_temp_new_i64();
+    rdalo = load_reg(s, a->rdalo);
+    rdahi = load_reg(s, a->rdahi);
+    tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
+
+    fn(rda, rda, a->shim);
+
+    tcg_gen_extrl_i64_i32(rdalo, rda);
+    tcg_gen_extrh_i64_i32(rdahi, rda);
+    store_reg(s, a->rdalo, rdalo);
+    store_reg(s, a->rdahi, rdahi);
+    tcg_temp_free_i64(rda);
+
+    return true;
+}
+
+static bool trans_ASRL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, tcg_gen_sari_i64);
+}
+
+static bool trans_LSLL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, tcg_gen_shli_i64);
+}
+
+static bool trans_LSRL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, tcg_gen_shri_i64);
+}
+
+static void gen_mve_sqshll(TCGv_i64 r, TCGv_i64 n, int64_t shift)
+{
+    gen_helper_mve_sqshll(r, cpu_env, n, tcg_constant_i32(shift));
+}
+
+static bool trans_SQSHLL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, gen_mve_sqshll);
+}
+
+static void gen_mve_uqshll(TCGv_i64 r, TCGv_i64 n, int64_t shift)
+{
+    gen_helper_mve_uqshll(r, cpu_env, n, tcg_constant_i32(shift));
+}
+
+static bool trans_UQSHLL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, gen_mve_uqshll);
+}
+
+static bool trans_SRSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, gen_srshr64_i64);
+}
+
+static bool trans_URSHRL_ri(DisasContext *s, arg_mve_shl_ri *a)
+{
+    return do_mve_shl_ri(s, a, gen_urshr64_i64);
+}
+
 /*
  * Multiply and multiply accumulate
  */
-- 
2.20.1

Convert the last four BR-with-pointer-auth insns to decodetree.
The remaining cases in the outer switch in disas_uncond_b_reg()
all return early rather than leaving the case statement, so we
can delete the now-unused code at the end of that function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-20-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  4 ++
 target/arm/tcg/translate-a64.c | 97 ++++++++++++++--------------------
 2 files changed, 43 insertions(+), 58 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz  # BLRAAZ, BLRABZ
 
 &reta           m
 RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta # RETAA, RETAB
+
+&bra            rn rm m
+BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
+BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_RETA(DisasContext *s, arg_reta *a)
     return true;
 }
 
+static bool trans_BRA(DisasContext *s, arg_bra *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), cpu_reg_sp(s, a->rm), !a->m);
+    gen_a64_set_pc(s, dst);
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLRA(DisasContext *s, arg_bra *a)
+{
+    TCGv_i64 dst, lr;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), cpu_reg_sp(s, a->rm), !a->m);
+    lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
 static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 {
     unsigned int opc, op2, op3, rn, op4;
-    unsigned btype_mod = 2;   /* 0: BR, 1: BLR, 2: other */
     TCGv_i64 dst;
     TCGv_i64 modifier;
 
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
     case 0:
     case 1:
     case 2:
+    case 8:
+    case 9:
         /*
-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
-         * handled in decodetree
+         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
+         * BRAA, BLRAA: handled in decodetree
          */
         goto do_unallocated;
 
-    case 8: /* BRAA */
-    case 9: /* BLRAA */
-        if (!dc_isar_feature(aa64_pauth, s)) {
-            goto do_unallocated;
-        }
-        if ((op3 & ~1) != 2) {
-            goto do_unallocated;
-        }
-        btype_mod = opc & 1;
-        if (s->pauth_active) {
-            dst = tcg_temp_new_i64();
-            modifier = cpu_reg_sp(s, op4);
-            if (op3 == 2) {
-                gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
-            } else {
-                gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
-            }
-        } else {
-            dst = cpu_reg(s, rn);
-        }
-        /* BLRAA also needs to load return address */
-        if (opc == 9) {
-            TCGv_i64 lr = cpu_reg(s, 30);
-            if (dst == lr) {
-                TCGv_i64 tmp = tcg_temp_new_i64();
-                tcg_gen_mov_i64(tmp, dst);
-                dst = tmp;
-            }
-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
-        }
-        gen_a64_set_pc(s, dst);
-        break;
-
     case 4: /* ERET */
         if (s->current_el == 0) {
             goto do_unallocated;
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
         unallocated_encoding(s);
         return;
     }
-
-    switch (btype_mod) {
-    case 0: /* BR */
-        if (dc_isar_feature(aa64_bti, s)) {
-            /* BR to {x16,x17} or !guard -> 1, else 3.  */
-            set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
-        }
-        break;
-
-    case 1: /* BLR */
-        if (dc_isar_feature(aa64_bti, s)) {
-            /* BLR sets BTYPE to 2, regardless of source guarded page.  */
-            set_btype(s, 2);
-        }
-        break;
-
-    default: /* RET or none of the above.  */
-        /* BTYPE will be set to 0 by normal end-of-insn processing.  */
-        break;
-    }
-
-    s->base.is_jmp = DISAS_JUMP;
 }
 
 /* Branches, exception generating and system instructions */
-- 
2.34.1

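For reference, the RdaLo/RdaHi pairing used by the long shifts can be replayed in ordinary C. This standalone sketch (not from the patch) mirrors the tcg_gen_concat_i32_i64 / extrl / extrh sequence in do_mve_shl_ri():

    /*
     * Standalone sketch (not from the patch): a long shift treats an
     * RdaLo/RdaHi register pair as one 64-bit value.
     */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t rdalo = 0x80000000u, rdahi = 0x00000001u;
        uint64_t rda = ((uint64_t)rdahi << 32) | rdalo;  /* concat */
        rda <<= 1;                                       /* LSLL #1 */
        rdalo = (uint32_t)rda;                           /* extrl */
        rdahi = (uint32_t)(rda >> 32);                   /* extrh */
        printf("lo=%#x hi=%#x\n", rdalo, rdahi);         /* lo=0 hi=3 */
        return 0;
    }

The carry from bit 31 of RdaLo into bit 0 of RdaHi is what distinguishes these wide shifts from two independent 32-bit shifts.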
Implement the MVE logical-immediate insns (VMOV, VMVN,
VORR and VBIC).  These have essentially the same encoding
as their Neon equivalents, and we implement the decode
in the same way.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-7-peter.maydell@linaro.org
---
 target/arm/helper-mve.h    |  4 +++
 target/arm/mve.decode      | 17 +++++++++++++
 target/arm/mve_helper.c    | 24 ++++++++++++++++++
 target/arm/translate-mve.c | 50 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 95 insertions(+)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64)
+DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64)
+DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64)
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -XXX,XX +XXX,XX @@
 # VQDMULL has size in bit 28: 0 for 16 bit, 1 for 32 bit
 %size_28 28:1 !function=plus_1
 
+# 1imm format immediate
+%imm_28_16_0 28:1 16:3 0:4
+
 &vldr_vstr rn qd imm p a w size l u
 &1op qd qm size
 &2op qd qm qn size
 &2scalar qd qn rm size
+&1imm qd imm cmode op
 
 @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0
 # Note that both Rn and Qd are 3 bits only (no D bit)
@@ -XXX,XX +XXX,XX @@
 @2op_nosz .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn size=0
 @2op_sz28 .... .... .... .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn \
      size=%size_28
+@1imm .... .... .... .... .... cmode:4 .. op:1 . .... &1imm qd=%qd imm=%imm_28_16_0
 
 # The _rev suffix indicates that Vn and Vm are reversed. This is
 # the case for shifts. In the Arm ARM these insns are documented
@@ -XXX,XX +XXX,XX @@ VADDV            111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo
 # Predicate operations
 %mask_22_13      22:1 13:3
 VPST             1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
+
+# Logical immediate operations (1 reg and modified-immediate)
+
+# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
+# not in a way we can conveniently represent in decodetree without
+# a lot of repetition:
+# VORR: op=0, (cmode & 1) && cmode < 12
+# VBIC: op=1, (cmode & 1) && cmode < 12
+# VMOV: everything else
+# So we have a single decode line and check the cmode/op in the
+# trans function.
+Vimm_1r 111 . 1111 1 . 00 0 ... ... 0 .... 0 1 . 1 .... @1imm
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_1OP(vnegw, 4, int32_t, DO_NEG)
 DO_1OP(vfnegh, 8, uint64_t, DO_FNEGH)
 DO_1OP(vfnegs, 8, uint64_t, DO_FNEGS)
 
+/*
+ * 1 operand immediates: Vda is destination and possibly also one source.
+ * All these insns work at 64-bit widths.
+ */
+#define DO_1OP_IMM(OP, FN)                                              \
+    void HELPER(mve_##OP)(CPUARMState *env, void *vda, uint64_t imm)    \
+    {                                                                   \
+        uint64_t *da = vda;                                             \
+        uint16_t mask = mve_element_mask(env);                          \
+        unsigned e;                                                     \
+        for (e = 0; e < 16 / 8; e++, mask >>= 8) {                      \
+            mergemask(&da[H8(e)], FN(da[H8(e)], imm), mask);            \
+        }                                                               \
+        mve_advance_vpt(env);                                           \
+    }
+
+#define DO_MOVI(N, I) (I)
+#define DO_ANDI(N, I) ((N) & (I))
+#define DO_ORRI(N, I) ((N) | (I))
+
+DO_1OP_IMM(vmovi, DO_MOVI)
+DO_1OP_IMM(vandi, DO_ANDI)
+DO_1OP_IMM(vorri, DO_ORRI)
+
 #define DO_2OP(OP, ESIZE, TYPE, FN)                                \
     void HELPER(glue(mve_, OP))(CPUARMState *env,                  \
                                 void *vd, void *vn, void *vm)      \
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -XXX,XX +XXX,XX @@ typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64);
 typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32);
+typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64);
 
 /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */
 static inline long mve_qreg_offset(unsigned reg)
@@ -XXX,XX +XXX,XX @@ static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
     mve_update_eci(s);
     return true;
 }
+
+static bool do_1imm(DisasContext *s, arg_1imm *a, MVEGenOneOpImmFn *fn)
+{
+    TCGv_ptr qd;
+    uint64_t imm;
+
+    if (!dc_isar_feature(aa32_mve, s) ||
+        !mve_check_qreg_bank(s, a->qd) ||
+        !fn) {
+        return false;
+    }
+    if (!mve_eci_check(s) || !vfp_access_check(s)) {
+        return true;
+    }
+
+    imm = asimd_imm_const(a->imm, a->cmode, a->op);
+
+    qd = mve_qreg_ptr(a->qd);
+    fn(cpu_env, qd, tcg_constant_i64(imm));
+    tcg_temp_free_ptr(qd);
+    mve_update_eci(s);
+    return true;
+}
+
+static bool trans_Vimm_1r(DisasContext *s, arg_1imm *a)
+{
+    /* Handle decode of cmode/op here between VORR/VBIC/VMOV */
+    MVEGenOneOpImmFn *fn;
+
+    if ((a->cmode & 1) && a->cmode < 12) {
+        if (a->op) {
+            /*
+             * For op=1, the immediate will be inverted by asimd_imm_const(),
+             * so the VBIC becomes a logical AND operation.
+             */
+            fn = gen_helper_mve_vandi;
+        } else {
+            fn = gen_helper_mve_vorri;
+        }
+    } else {
+        /* There is one unallocated cmode/op combination in this space */
+        if (a->cmode == 15 && a->op == 1) {
+            return false;
+        }
+        /* asimd_imm_const() sorts out VMVNI vs VMOVI for us */
+        fn = gen_helper_mve_vmovi;
+    }
+    return do_1imm(s, a, fn);
+}
-- 
2.20.1

Convert the exception-return insns ERET, ERETA and ERETB to
decodetree.  These were the last insns left in the legacy
decoder function disas_uncond_b_reg(), which allows us to
remove it.

The old decoder explicitly decoded the DRPS instruction,
only in order to call unallocated_encoding() on it, exactly
as would have happened if it hadn't decoded it.  This is
because this insn always UNDEFs unless the CPU is in
halting-debug state, which we don't emulate.  So we list
the pattern in a comment in a64.decode, but don't actively
decode it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-21-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |   8 ++
 target/arm/tcg/translate-a64.c | 163 +++++++++++----------------------
 2 files changed, 63 insertions(+), 108 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta # RETAA, RETAB
 &bra            rn rm m
 BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
 BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
+
+ERET            1101011 0100 11111 000000 11111 00000
+ERETA           1101011 0100 11111 00001 m:1 11111 11111 &reta # ERETAA, ERETAB
+
+# We don't need to decode DRPS because it always UNDEFs except when
+# the processor is in halting debug state (which we don't implement).
+# The pattern is listed here as documentation.
+# DRPS            1101011 0101 11111 000000 11111 00000
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_BLRA(DisasContext *s, arg_bra *a)
     return true;
 }
 
+static bool trans_ERET(DisasContext *s, arg_ERET *a)
+{
+    TCGv_i64 dst;
+
+    if (s->current_el == 0) {
+        return false;
+    }
+    if (s->fgt_eret) {
+        gen_exception_insn_el(s, 0, EXCP_UDEF, 0, 2);
+        return true;
+    }
+    dst = tcg_temp_new_i64();
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPUARMState, elr_el[s->current_el]));
+
+    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+
+    gen_helper_exception_return(cpu_env, dst);
+    /* Must exit loop to check un-masked IRQs */
+    s->base.is_jmp = DISAS_EXIT;
+    return true;
+}
+
+static bool trans_ERETA(DisasContext *s, arg_reta *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    if (s->current_el == 0) {
+        return false;
+    }
+    /* The FGT trap takes precedence over an auth trap. */
+    if (s->fgt_eret) {
+        gen_exception_insn_el(s, 0, EXCP_UDEF, a->m ? 3 : 2, 2);
+        return true;
+    }
+    dst = tcg_temp_new_i64();
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPUARMState, elr_el[s->current_el]));
+
+    dst = auth_branch_target(s, dst, cpu_X[31], !a->m);
+    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+
+    gen_helper_exception_return(cpu_env, dst);
+    /* Must exit loop to check un-masked IRQs */
+    s->base.is_jmp = DISAS_EXIT;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Unconditional branch (register)
- *  31           25 24   21 20   16 15   10 9    5 4     0
- * +---------------+-------+-------+-------+------+-------+
- * | 1 1 0 1 0 1 1 |  opc  |  op2  |  op3  |  Rn  |  op4  |
- * +---------------+-------+-------+-------+------+-------+
- */
-static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
-{
-    unsigned int opc, op2, op3, rn, op4;
-    TCGv_i64 dst;
-    TCGv_i64 modifier;
-
-    opc = extract32(insn, 21, 4);
-    op2 = extract32(insn, 16, 5);
-    op3 = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    op4 = extract32(insn, 0, 5);
-
-    if (op2 != 0x1f) {
-        goto do_unallocated;
-    }
-
-    switch (opc) {
-    case 0:
-    case 1:
-    case 2:
-    case 8:
-    case 9:
-        /*
-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
-         * BRAA, BLRAA: handled in decodetree
-         */
-        goto do_unallocated;
-
-    case 4: /* ERET */
-        if (s->current_el == 0) {
-            goto do_unallocated;
-        }
-        switch (op3) {
-        case 0: /* ERET */
-            if (op4 != 0) {
-                goto do_unallocated;
-            }
-            if (s->fgt_eret) {
-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
-                return;
-            }
-            dst = tcg_temp_new_i64();
-            tcg_gen_ld_i64(dst, cpu_env,
-                           offsetof(CPUARMState, elr_el[s->current_el]));
-            break;
-
-        case 2: /* ERETAA */
-        case 3: /* ERETAB */
-            if (!dc_isar_feature(aa64_pauth, s)) {
-                goto do_unallocated;
-            }
-            if (rn != 0x1f || op4 != 0x1f) {
-                goto do_unallocated;
-            }
-            /* The FGT trap takes precedence over an auth trap. */
-            if (s->fgt_eret) {
-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
-                return;
-            }
-            dst = tcg_temp_new_i64();
-            tcg_gen_ld_i64(dst, cpu_env,
-                           offsetof(CPUARMState, elr_el[s->current_el]));
-            if (s->pauth_active) {
-                modifier = cpu_X[31];
-                if (op3 == 2) {
-                    gen_helper_autia(dst, cpu_env, dst, modifier);
-                } else {
-                    gen_helper_autib(dst, cpu_env, dst, modifier);
-                }
-            }
-            break;
-
-        default:
-            goto do_unallocated;
-        }
-        if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_start();
-        }
-
-        gen_helper_exception_return(cpu_env, dst);
-        /* Must exit loop to check un-masked IRQs */
-        s->base.is_jmp = DISAS_EXIT;
-        return;
-
-    case 5: /* DRPS */
-        if (op3 != 0 || op4 != 0 || rn != 0x1f) {
-            goto do_unallocated;
-        } else {
-            unallocated_encoding(s);
-        }
-        return;
-
-    default:
-    do_unallocated:
-        unallocated_encoding(s);
-        return;
-    }
-}
-
 /* Branches, exception generating and system instructions */
 static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
             disas_exc(s, insn);
         }
         break;
-    case 0x6b: /* Unconditional branch (register) */
-        disas_uncond_b_reg(s, insn);
-        break;
     default:
         unallocated_encoding(s);
         break;
-- 
2.34.1


Implement the MVE VSHLL (vector shift left long) insn. This has two
encodings: the T1 encoding is the usual shift-by-immediate format,
and the T2 encoding is a special case where the shift count is always
equal to the element size.
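
As a reading aid, here is a minimal standalone C model of the operation
for unsigned byte inputs (a toy sketch, not the QEMU helper: it ignores
predication and QEMU's host-endian lane macros, and the function name is
invented). Each 16-bit output lane takes the byte from the bottom (B) or
top (T) half of the corresponding input halfword pair, widens it, and
shifts it left; a shift count equal to the element size is exactly the
T2 encoding described above:

    #include <stdint.h>
    #include <stdio.h>

    /* One beat's worth of lanes: 8 output halfwords from 16 input bytes */
    static void vshll_ub(uint16_t d[8], const uint8_t m[16],
                         unsigned shift, int top)
    {
        for (int le = 0; le < 8; le++) {
            d[le] = (uint16_t)((uint16_t)m[le * 2 + top] << shift);
        }
    }

    int main(void)
    {
        uint8_t m[16];
        uint16_t d[8];

        for (int i = 0; i < 16; i++) {
            m[i] = (uint8_t)(i + 1);
        }

        vshll_ub(d, m, 8, 0);   /* T2-style VSHLL.BU: shift == element size */
        for (int i = 0; i < 8; i++) {
            printf("%#06x ", d[i]);   /* bytes 0,2,4,... widened, then << 8 */
        }
        printf("\n");
        return 0;
    }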

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-10-peter.maydell@linaro.org
---
 target/arm/helper-mve.h    |  9 +++++++
 target/arm/mve.decode      | 53 +++++++++++++++++++++++++++++++++++---
 target/arm/mve_helper.c    | 32 +++++++++++++++++++++++
 target/arm/translate-mve.c | 15 +++++++++++
 4 files changed, 105 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(mve_vrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(mve_vshllbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshllbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshllbub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshllbuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshlltsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshlltsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshlltub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(mve_vshlltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -XXX,XX +XXX,XX @@
 @2_shl_h .... .... .. 01 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
 @2_shl_w .... .... .. 1 shift:5 .... .... .... .... &2shift qd=%qd qm=%qm size=2
 
+@2_shll_b .... .... ... 01 shift:3 .... .... .... .... &2shift qd=%qd qm=%qm size=0
+@2_shll_h .... .... ... 1 shift:4 .... .... .... .... &2shift qd=%qd qm=%qm size=1
+# VSHLL encoding T2 where shift == esize
+@2_shll_esize_b .... .... .... 00 .. .... .... .... .... &2shift \
+                qd=%qd qm=%qm size=0 shift=8
+@2_shll_esize_h .... .... .... 01 .. .... .... .... .... &2shift \
+                qd=%qd qm=%qm size=1 shift=16
+
 # Right shifts are encoded as N - shift, where N is the element size in bits.
 %rshift_i5 16:5 !function=rsub_32
 %rshift_i4 16:4 !function=rsub_16
@@ -XXX,XX +XXX,XX @@ VADD 1110 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
 VSUB 1111 1111 0 . .. ... 0 ... 0 1000 . 1 . 0 ... 0 @2op
 VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op
 
-VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
-VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
+# The VSHLL T2 encoding is not a @2op pattern, but is here because it
+# overlaps what would be size=0b11 VMULH/VRMULH
+{
+  VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
+  VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
 
-VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
-VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
+  VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
+}
+
+{
+  VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b
+  VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h
+
+  VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op
+}
+
+{
+  VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
+  VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
+
+  VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
+}
+
+{
+  VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b
+  VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h
+
+  VRMULH_U 111 1 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op
+}
 
 VMAX_S 111 0 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
 VMAX_U 111 1 1111 0 . .. ... 0 ... 0 0110 . 1 . 0 ... 0 @2op
@@ -XXX,XX +XXX,XX @@ VRSHRI_S 111 0 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
 VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_b
 VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h
 VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_w
+
+# VSHLL T1 encoding; the T2 VSHLL encoding is elsewhere in this file
+VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b
+VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h
+
+VSHLL_BU 111 1 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b
+VSHLL_BU 111 1 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h
+
+VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
+VSHLL_TS 111 0 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
+
+VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_b
+VSHLL_TU 111 1 1110 1 . 1 .. ... ... 1 1111 0 1 . 0 ... 0 @2_shll_h
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP)
 DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP)
 DO_2SHIFT_U(vrshli_u, DO_VRSHLU)
 DO_2SHIFT_S(vrshli_s, DO_VRSHLS)
+
+/*
+ * Long shifts taking half-sized inputs from top or bottom of the input
+ * vector and producing a double-width result. ESIZE, TYPE are for
+ * the input, and LESIZE, LTYPE for the output.
+ * Unlike the normal shift helpers, we do not handle negative shift counts,
+ * because the long shift is strictly left-only.
+ */
+#define DO_VSHLL(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE)                   \
+    void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd,             \
+                                void *vm, uint32_t shift)               \
+    {                                                                   \
+        LTYPE *d = vd;                                                  \
+        TYPE *m = vm;                                                   \
+        uint16_t mask = mve_element_mask(env);                          \
+        unsigned le;                                                    \
+        assert(shift <= 16);                                            \
+        for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) {         \
+            LTYPE r = (LTYPE)m[H##ESIZE(le * 2 + TOP)] << shift;        \
+            mergemask(&d[H##LESIZE(le)], r, mask);                      \
+        }                                                               \
+        mve_advance_vpt(env);                                           \
+    }
+
+#define DO_VSHLL_ALL(OP, TOP)                                \
+    DO_VSHLL(OP##sb, TOP, 1, int8_t, 2, int16_t)             \
+    DO_VSHLL(OP##ub, TOP, 1, uint8_t, 2, uint16_t)           \
+    DO_VSHLL(OP##sh, TOP, 2, int16_t, 4, int32_t)            \
+    DO_VSHLL(OP##uh, TOP, 2, uint16_t, 4, uint32_t)          \
+
+DO_VSHLL_ALL(vshllb, false)
+DO_VSHLL_ALL(vshllt, true)
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -XXX,XX +XXX,XX @@ DO_2SHIFT(VSHRI_S, vshli_s, true)
 DO_2SHIFT(VSHRI_U, vshli_u, true)
 DO_2SHIFT(VRSHRI_S, vrshli_s, true)
 DO_2SHIFT(VRSHRI_U, vrshli_u, true)
+
+#define DO_VSHLL(INSN, FN)                                      \
+    static bool trans_##INSN(DisasContext *s, arg_2shift *a)    \
+    {                                                           \
+        static MVEGenTwoOpShiftFn * const fns[] = {             \
+            gen_helper_mve_##FN##b,                             \
+            gen_helper_mve_##FN##h,                             \
+        };                                                      \
+        return do_2shift(s, a, fns[a->size], false);            \
+    }
+
+DO_VSHLL(VSHLL_BS, vshllbs)
+DO_VSHLL(VSHLL_BU, vshllbu)
+DO_VSHLL(VSHLL_TS, vshllts)
+DO_VSHLL(VSHLL_TU, vshlltu)
--
2.20.1

The IMPDEF sysreg L2CTLR_EL1 found on the Cortex-A35, A53, A57, A72
and which we (arguably dubiously) also provide in '-cpu max' has a
2 bit field for the number of processors in the cluster. On real
hardware this must be sufficient because it can only be configured
with up to 4 CPUs in the cluster. However on QEMU if the board code
does not explicitly configure the CPUs into clusters with the right
CPU count we default to "give the value assuming that all CPUs in
the system are in a single cluster", which might be too big to fit
in the field.

Instead of just overflowing this 2-bit field, saturate to 3 (meaning
"4 CPUs"), so at least we don't overwrite other fields in the register.
It's unlikely that any guest code really cares about the value in
this field; at least, if it does it probably also wants the system
to be more closely matching real hardware, i.e. not to have more
than 4 CPUs.

This issue has been present since the L2CTLR was first added in
commit 377a44ec8f2fac5b back in 2014. It was only noticed because
Coverity complains (CID 1509227) that the shift might overflow 32 bits
and inadvertently sign extend into the top half of the 64 bit value.
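
A standalone illustration of that failure mode and of the saturating
fix (a toy program, not the actual register read; the MIN macro is
defined locally here, whereas QEMU gets it from its own headers):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MIN(a, b) ((a) < (b) ? (a) : (b))

    int main(void)
    {
        int core_count = 200;  /* hypothetical: more CPUs than the field holds */

        /* The old expression: a 32-bit shift can set bit 31, and the
         * negative int then sign-extends when widened to 64 bits. */
        uint32_t raw = (uint32_t)(core_count - 1) << 24;   /* 0xc7000000 */
        uint64_t bad = (uint64_t)(int32_t)raw;             /* top half polluted */

        /* The fixed expression saturates first: only bits [25:24] are set */
        uint64_t good = (uint64_t)MIN(core_count - 1, 3) << 24;

        printf("bad  = 0x%016" PRIx64 "\n", bad);   /* 0xffffffffc7000000 */
        printf("good = 0x%016" PRIx64 "\n", good);  /* 0x0000000003000000 */
        return 0;
    }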

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512170223.3801643-2-peter.maydell@linaro.org
---
 target/arm/cortex-regs.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/cortex-regs.c b/target/arm/cortex-regs.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cortex-regs.c
+++ b/target/arm/cortex-regs.c
@@ -XXX,XX +XXX,XX @@ static uint64_t l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     ARMCPU *cpu = env_archcpu(env);
 
-    /* Number of cores is in [25:24]; otherwise we RAZ */
-    return (cpu->core_count - 1) << 24;
+    /*
+     * Number of cores is in [25:24]; otherwise we RAZ.
+     * If the board didn't configure the CPUs into clusters,
+     * we default to "all CPUs in one cluster", which might be
+     * more than the 4 that the hardware permits and which is
+     * all you can report in this two-bit field. Saturate to
+     * 0b11 (== 4 CPUs) rather than overflowing the field.
+     */
+    return MIN(cpu->core_count - 1, 3) << 24;
 }
 
 static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
--
2.34.1

In do_ldst(), the calculation of the offset needs to be based on the
size of the memory access, not the size of the elements in the
vector. This meant we were getting it wrong for the widening and
narrowing variants of the various VLDR and VSTR insns.
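
A toy illustration of the scaling difference (hypothetical numbers; in
the patch below the memory size is passed as the MO_8/MO_16 constants).
For a widening load such as VLDRB.U16, the immediate must be scaled by
the 1-byte memory access size, not by the 2-byte vector element size:

    #include <stdio.h>

    int main(void)
    {
        unsigned imm = 4;
        unsigned esize = 1;   /* log2 of the 16-bit element size */
        unsigned msize = 0;   /* log2 of the 8-bit memory access size */

        printf("wrong offset (element-size scaled): %u\n", imm << esize); /* 8 */
        printf("right offset (memory-size scaled):  %u\n", imm << msize); /* 4 */
        return 0;
    }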

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-2-peter.maydell@linaro.org
---
 target/arm/translate-mve.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -XXX,XX +XXX,XX @@ static bool mve_skip_first_beat(DisasContext *s)
     }
 }
 
-static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
+static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn,
+                    unsigned msize)
 {
     TCGv_i32 addr;
     uint32_t offset;
@@ -XXX,XX +XXX,XX @@ static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
         return true;
     }
 
-    offset = a->imm << a->size;
+    offset = a->imm << msize;
     if (!a->a) {
         offset = -offset;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
         { gen_helper_mve_vstrw, gen_helper_mve_vldrw },
         { NULL, NULL }
     };
-    return do_ldst(s, a, ldstfns[a->size][a->l]);
+    return do_ldst(s, a, ldstfns[a->size][a->l], a->size);
 }
 
-#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST)                  \
+#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST, MSIZE)           \
     static bool trans_##OP(DisasContext *s, arg_VLDR_VSTR *a)   \
     {                                                           \
         static MVEGenLdStFn * const ldstfns[2][2] = {           \
             { gen_helper_mve_##ST, gen_helper_mve_##SLD },      \
             { NULL, gen_helper_mve_##ULD },                     \
         };                                                      \
-        return do_ldst(s, a, ldstfns[a->u][a->l]);              \
+        return do_ldst(s, a, ldstfns[a->u][a->l], MSIZE);       \
     }
 
-DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
-DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
-DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
+DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8)
+DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8)
+DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16)
 
 static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
 {
--
2.20.1

In the vexpress board code, we allocate a new MemoryRegion at the top
of vexpress_common_init() but only set it up and use it inside the
"if (map[VE_NORFLASHALIAS] != -1)" conditional, so we leak it if not.
This isn't a very interesting leak as it's a tiny amount of memory
once at startup, but it's easy to fix.

We could silence Coverity simply by moving the g_new() into the
if() block, but this use of g_new(MemoryRegion, 1) is a legacy from
when this board model was originally written; we wouldn't do that
if we wrote it today. The MemoryRegions are conceptually a part of
the board and must not go away until the whole board is done with
(at the end of the simulation), so they belong in its state struct.

This machine already has a VexpressMachineState struct that extends
MachineState, so statically put the MemoryRegions in there instead of
dynamically allocating them separately at runtime.

Spotted by Coverity (CID 1509083).
19
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20210628135835.6690-2-peter.maydell@linaro.org
22
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
23
Message-id: 20230512170223.3801643-3-peter.maydell@linaro.org
9
---
24
---
10
target/arm/translate-mve.c | 17 +++++++++--------
25
hw/arm/vexpress.c | 40 ++++++++++++++++++++--------------------
11
1 file changed, 9 insertions(+), 8 deletions(-)
26
1 file changed, 20 insertions(+), 20 deletions(-)
12
27
13
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
28
diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
14
index XXXXXXX..XXXXXXX 100644
29
index XXXXXXX..XXXXXXX 100644
15
--- a/target/arm/translate-mve.c
30
--- a/hw/arm/vexpress.c
16
+++ b/target/arm/translate-mve.c
31
+++ b/hw/arm/vexpress.c
17
@@ -XXX,XX +XXX,XX @@ static bool mve_skip_first_beat(DisasContext *s)
32
@@ -XXX,XX +XXX,XX @@ struct VexpressMachineClass {
33
34
struct VexpressMachineState {
35
MachineState parent;
36
+ MemoryRegion vram;
37
+ MemoryRegion sram;
38
+ MemoryRegion flashalias;
39
+ MemoryRegion lowram;
40
+ MemoryRegion a15sram;
41
bool secure;
42
bool virt;
43
};
44
@@ -XXX,XX +XXX,XX @@ struct VexpressMachineState {
45
#define TYPE_VEXPRESS_A15_MACHINE MACHINE_TYPE_NAME("vexpress-a15")
46
OBJECT_DECLARE_TYPE(VexpressMachineState, VexpressMachineClass, VEXPRESS_MACHINE)
47
48
-typedef void DBoardInitFn(const VexpressMachineState *machine,
49
+typedef void DBoardInitFn(VexpressMachineState *machine,
50
ram_addr_t ram_size,
51
const char *cpu_type,
52
qemu_irq *pic);
53
@@ -XXX,XX +XXX,XX @@ static void init_cpus(MachineState *ms, const char *cpu_type,
18
}
54
}
19
}
55
}
20
56
21
-static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
57
-static void a9_daughterboard_init(const VexpressMachineState *vms,
22
+static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn,
58
+static void a9_daughterboard_init(VexpressMachineState *vms,
23
+ unsigned msize)
59
ram_addr_t ram_size,
60
const char *cpu_type,
61
qemu_irq *pic)
24
{
62
{
25
TCGv_i32 addr;
63
MachineState *machine = MACHINE(vms);
26
uint32_t offset;
64
MemoryRegion *sysmem = get_system_memory();
27
@@ -XXX,XX +XXX,XX @@ static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn)
65
- MemoryRegion *lowram = g_new(MemoryRegion, 1);
28
return true;
66
ram_addr_t low_ram_size;
67
68
if (ram_size > 0x40000000) {
69
@@ -XXX,XX +XXX,XX @@ static void a9_daughterboard_init(const VexpressMachineState *vms,
70
* address space should in theory be remappable to various
71
* things including ROM or RAM; we always map the RAM there.
72
*/
73
- memory_region_init_alias(lowram, NULL, "vexpress.lowmem", machine->ram,
74
- 0, low_ram_size);
75
- memory_region_add_subregion(sysmem, 0x0, lowram);
76
+ memory_region_init_alias(&vms->lowram, NULL, "vexpress.lowmem",
77
+ machine->ram, 0, low_ram_size);
78
+ memory_region_add_subregion(sysmem, 0x0, &vms->lowram);
79
memory_region_add_subregion(sysmem, 0x60000000, machine->ram);
80
81
/* 0x1e000000 A9MPCore (SCU) private memory region */
82
@@ -XXX,XX +XXX,XX @@ static VEDBoardInfo a9_daughterboard = {
83
.init = a9_daughterboard_init,
84
};
85
86
-static void a15_daughterboard_init(const VexpressMachineState *vms,
87
+static void a15_daughterboard_init(VexpressMachineState *vms,
88
ram_addr_t ram_size,
89
const char *cpu_type,
90
qemu_irq *pic)
91
{
92
MachineState *machine = MACHINE(vms);
93
MemoryRegion *sysmem = get_system_memory();
94
- MemoryRegion *sram = g_new(MemoryRegion, 1);
95
96
{
97
/* We have to use a separate 64 bit variable here to avoid the gcc
98
@@ -XXX,XX +XXX,XX @@ static void a15_daughterboard_init(const VexpressMachineState *vms,
99
/* 0x2b060000: SP805 watchdog: not modelled */
100
/* 0x2b0a0000: PL341 dynamic memory controller: not modelled */
101
/* 0x2e000000: system SRAM */
102
- memory_region_init_ram(sram, NULL, "vexpress.a15sram", 0x10000,
103
+ memory_region_init_ram(&vms->a15sram, NULL, "vexpress.a15sram", 0x10000,
104
&error_fatal);
105
- memory_region_add_subregion(sysmem, 0x2e000000, sram);
106
+ memory_region_add_subregion(sysmem, 0x2e000000, &vms->a15sram);
107
108
/* 0x7ffb0000: DMA330 DMA controller: not modelled */
109
/* 0x7ffd0000: PL354 static memory controller: not modelled */
110
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
111
I2CBus *i2c;
112
ram_addr_t vram_size, sram_size;
113
MemoryRegion *sysmem = get_system_memory();
114
- MemoryRegion *vram = g_new(MemoryRegion, 1);
115
- MemoryRegion *sram = g_new(MemoryRegion, 1);
116
- MemoryRegion *flashalias = g_new(MemoryRegion, 1);
117
- MemoryRegion *flash0mem;
118
const hwaddr *map = daughterboard->motherboard_map;
119
int i;
120
121
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
122
123
if (map[VE_NORFLASHALIAS] != -1) {
124
/* Map flash 0 as an alias into low memory */
125
+ MemoryRegion *flash0mem;
126
flash0mem = sysbus_mmio_get_region(SYS_BUS_DEVICE(pflash0), 0);
127
- memory_region_init_alias(flashalias, NULL, "vexpress.flashalias",
128
+ memory_region_init_alias(&vms->flashalias, NULL, "vexpress.flashalias",
129
flash0mem, 0, VEXPRESS_FLASH_SIZE);
130
- memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], flashalias);
131
+ memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], &vms->flashalias);
29
}
132
}
30
133
31
- offset = a->imm << a->size;
134
dinfo = drive_get(IF_PFLASH, 0, 1);
32
+ offset = a->imm << msize;
135
ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1", dinfo);
33
if (!a->a) {
136
34
offset = -offset;
137
sram_size = 0x2000000;
35
}
138
- memory_region_init_ram(sram, NULL, "vexpress.sram", sram_size,
36
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR(DisasContext *s, arg_VLDR_VSTR *a)
139
+ memory_region_init_ram(&vms->sram, NULL, "vexpress.sram", sram_size,
37
{ gen_helper_mve_vstrw, gen_helper_mve_vldrw },
140
&error_fatal);
38
{ NULL, NULL }
141
- memory_region_add_subregion(sysmem, map[VE_SRAM], sram);
39
};
142
+ memory_region_add_subregion(sysmem, map[VE_SRAM], &vms->sram);
40
- return do_ldst(s, a, ldstfns[a->size][a->l]);
143
41
+ return do_ldst(s, a, ldstfns[a->size][a->l], a->size);
144
vram_size = 0x800000;
42
}
145
- memory_region_init_ram(vram, NULL, "vexpress.vram", vram_size,
43
146
+ memory_region_init_ram(&vms->vram, NULL, "vexpress.vram", vram_size,
44
-#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST) \
147
&error_fatal);
45
+#define DO_VLDST_WIDE_NARROW(OP, SLD, ULD, ST, MSIZE) \
148
- memory_region_add_subregion(sysmem, map[VE_VIDEORAM], vram);
46
static bool trans_##OP(DisasContext *s, arg_VLDR_VSTR *a) \
149
+ memory_region_add_subregion(sysmem, map[VE_VIDEORAM], &vms->vram);
47
{ \
150
48
static MVEGenLdStFn * const ldstfns[2][2] = { \
151
/* 0x4e000000 LAN9118 Ethernet */
49
{ gen_helper_mve_##ST, gen_helper_mve_##SLD }, \
152
if (nd_table[0].used) {
50
{ NULL, gen_helper_mve_##ULD }, \
51
}; \
52
- return do_ldst(s, a, ldstfns[a->u][a->l]); \
53
+ return do_ldst(s, a, ldstfns[a->u][a->l], MSIZE); \
54
}
55
56
-DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h)
57
-DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w)
58
-DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w)
59
+DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8)
60
+DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8)
61
+DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16)
62
63
static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
64
{
65
--
153
--
66
2.20.1
154
2.34.1
67
155
68
156
diff view generated by jsdifflib

Use dup_const() instead of bitfield_replicate() in
disas_simd_mod_imm().

(We can't replace the other use of bitfield_replicate() in this file,
in logic_imm_decode_wmask(), because that location needs to handle 2
and 4 bit elements, which dup_const() cannot.)
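
For readers unfamiliar with it: dup_const(MO_16, imm) replicates a
16-bit element across a 64-bit constant. A standalone sketch of the
equivalent computation (not the QEMU implementation; the function name
below is invented):

    #include <assert.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Replicate a 16-bit value into all four halfwords of a uint64_t */
    static uint64_t dup16(uint16_t x)
    {
        return (uint64_t)x * 0x0001000100010001ULL;
    }

    int main(void)
    {
        uint16_t imm = 0x3c00;   /* FP16 1.0, e.g. from vfp_expand_imm() */
        uint64_t dup = dup16(imm);

        assert(dup == 0x3c003c003c003c00ULL);
        printf("0x%016" PRIx64 "\n", dup);
        return 0;
    }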

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-6-peter.maydell@linaro.org
---
 target/arm/translate-a64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
             /* FMOV (vector, immediate) - half-precision */
             imm = vfp_expand_imm(MO_16, abcdefgh);
             /* now duplicate across the lanes */
-            imm = bitfield_replicate(imm, 16);
+            imm = dup_const(MO_16, imm);
         } else {
             imm = asimd_imm_const(abcdefgh, cmode, is_neg);
         }
--
2.20.1

Convert the u2f.txt file to rST, and place it in the right place
in our manual layout. The old text didn't fit very well into our
manual style, so the new version ends up looking like a rewrite,
although some of the original text is preserved:

 * the 'building' section of the old file is removed, since we
   generally assume that users have already built QEMU
 * some rather verbose text has been cut back
 * document the passthrough device first, on the assumption
   that's most likely to be of interest to users
 * cut back on the duplication of text between sections
 * format example command lines etc with rST

As it's a short document it seemed simplest to do this all
in one go rather than try to do a minimal syntactic conversion
and then clean up the wording and layout.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230421163734.1152076-1-peter.maydell@linaro.org
---
 docs/system/device-emulation.rst |   1 +
 docs/system/devices/usb-u2f.rst  |  93 ++++++++++++++++++++++++++
 docs/system/devices/usb.rst      |   2 +-
 docs/u2f.txt                     | 110 -------------------------------
 4 files changed, 95 insertions(+), 111 deletions(-)
 create mode 100644 docs/system/devices/usb-u2f.rst
 delete mode 100644 docs/u2f.txt

diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -XXX,XX +XXX,XX @@ Emulated Devices
    devices/virtio-pmem.rst
    devices/vhost-user-rng.rst
    devices/canokey.rst
+   devices/usb-u2f.rst
    devices/igb.rst
diff --git a/docs/system/devices/usb-u2f.rst b/docs/system/devices/usb-u2f.rst
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/system/devices/usb-u2f.rst
@@ -XXX,XX +XXX,XX @@
+Universal Second Factor (U2F) USB Key Device
+============================================
+
+U2F is an open authentication standard that enables relying parties
+exposed to the internet to offer a strong second factor option for end
+user authentication.
+
+The second factor is provided by a device implementing the U2F
+protocol. In case of a USB U2F security key, it is a USB HID device
+that implements the U2F protocol.
+
+QEMU supports both pass-through of a host U2F key device to a VM,
+and software emulation of a U2F key.
+
+``u2f-passthru``
+----------------
+
+The ``u2f-passthru`` device allows you to connect a real hardware
+U2F key on your host to a guest VM. All requests made from the guest
+are passed through to the physical security key connected to the
+host machine and vice versa.
+
+In addition, the dedicated pass-through allows you to share a single
+U2F security key with several guest VMs, which is not possible with a
+simple host device assignment pass-through.
+
+You can specify the host U2F key to use with the ``hidraw``
+option, which takes the host path to a Linux ``/dev/hidrawN`` device:
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-passthru,hidraw=/dev/hidraw0
+
+If you don't specify the device, the ``u2f-passthru`` device will
+autoscan to take the first U2F device it finds on the host (this
+requires a working libudev):
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-passthru
+
+``u2f-emulated``
+----------------
+
+``u2f-emulated`` is a completely software emulated U2F device.
+It uses `libu2f-emu <https://github.com/MattGorko/libu2f-emu>`__
+for the U2F key emulation. libu2f-emu
+provides a complete implementation of the U2F protocol device part for
+all specified transports given by the FIDO Alliance.
+
+To work, an emulated U2F device must have four elements:
+
+ * ec x509 certificate
+ * ec private key
+ * counter (four bytes value)
+ * 48 bytes of entropy (random bits)
+
+To use this type of device, these have to be configured, and these
+four elements must be passed one way or another.
+
+Assuming that you have a working libu2f-emu installed on the host,
+there are three possible ways to configure the ``u2f-emulated`` device:
+
+ * ephemeral
+ * setup directory
+ * manual
+
+Ephemeral is the simplest way to configure; it lets the device generate
+all the elements it needs for a single use of the lifetime of the device.
+It is the default if you do not pass any other options to the device.
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated
+
+You can pass the device the path of a setup directory on the host
+using the ``dir`` option; the directory must contain these four files:
+
+ * ``certificate.pem``: ec x509 certificate
+ * ``private-key.pem``: ec private key
+ * ``counter``: counter value
+ * ``entropy``: 48 bytes of entropy
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated,dir=$dir
+
+You can also manually pass the device the paths to each of these files,
+if you don't want them all to be in the same directory, using the options
+
+ * ``cert``
+ * ``priv``
+ * ``counter``
+ * ``entropy``
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
diff --git a/docs/system/devices/usb.rst b/docs/system/devices/usb.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/devices/usb.rst
+++ b/docs/system/devices/usb.rst
@@ -XXX,XX +XXX,XX @@ option or the ``device_add`` monitor command. Available devices are:
    USB audio device
 
 ``u2f-{emulated,passthru}``
-   Universal Second Factor device
+   :doc:`usb-u2f`
 
 ``canokey``
    An Open-source Secure Key implementing FIDO2, OpenPGP, PIV and more.
diff --git a/docs/u2f.txt b/docs/u2f.txt
deleted file mode 100644
index XXXXXXX..XXXXXXX
--- a/docs/u2f.txt
+++ /dev/null
@@ -XXX,XX +XXX,XX @@
-QEMU U2F Key Device Documentation.
-
-Contents
-1. USB U2F key device
-2. Building
-3. Using u2f-emulated
-4. Using u2f-passthru
-5. Libu2f-emu
-
-1. USB U2F key device
-
-U2F is an open authentication standard that enables relying parties
-exposed to the internet to offer a strong second factor option for end
-user authentication.
-
-The standard brings many advantages to both parties, client and server,
-allowing to reduce over-reliance on passwords, it increases authentication
-security and simplifies passwords.
-
-The second factor is materialized by a device implementing the U2F
-protocol. In case of a USB U2F security key, it is a USB HID device
-that implements the U2F protocol.
-
-In QEMU, the USB U2F key device offers a dedicated support of U2F, allowing
-guest USB FIDO/U2F security keys operating in two possible modes:
-pass-through and emulated.
-
-The pass-through mode consists of passing all requests made from the guest
-to the physical security key connected to the host machine and vice versa.
-In addition, the dedicated pass-through allows to have a U2F security key
-shared on several guests which is not possible with a simple host device
-assignment pass-through.
-
-The emulated mode consists of completely emulating the behavior of an
-U2F device through software part. Libu2f-emu is used for that.
-
-
-2. Building
-
-To ensure the build of the u2f-emulated device variant which depends
-on libu2f-emu: configuring and building:
-
-    ./configure --enable-u2f && make
-
-The pass-through mode is built by default on Linux. To take advantage
-of the autoscan option it provides, make sure you have a working libudev
-installed on the host.
-
-
-3. Using u2f-emulated
-
-To work, an emulated U2F device must have four elements:
- * ec x509 certificate
- * ec private key
- * counter (four bytes value)
- * 48 bytes of entropy (random bits)
-
-To use this type of device, this one has to be configured, and these
-four elements must be passed one way or another.
-
-Assuming that you have a working libu2f-emu installed on the host.
-There are three possible ways of configurations:
- * ephemeral
- * setup directory
- * manual
-
-Ephemeral is the simplest way to configure, it lets the device generate
-all the elements it needs for a single use of the lifetime of the device.
-
-    qemu -usb -device u2f-emulated
-
-Setup directory allows to configure the device from a directory containing
-four files:
- * certificate.pem: ec x509 certificate
- * private-key.pem: ec private key
- * counter: counter value
- * entropy: 48 bytes of entropy
-
-    qemu -usb -device u2f-emulated,dir=$dir
-
-Manual allows to configure the device more finely by specifying each
-of the elements necessary for the device:
- * cert
- * priv
- * counter
- * entropy
-
-    qemu -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
-
-
-4. Using u2f-passthru
-
-On the host specify the u2f-passthru device with a suitable hidraw:
-
-    qemu -usb -device u2f-passthru,hidraw=/dev/hidraw0
-
-Alternately, the u2f-passthru device can autoscan to take the first
-U2F device it finds on the host (this requires a working libudev):
-
-    qemu -usb -device u2f-passthru
-
-
-5. Libu2f-emu
-
-The u2f-emulated device uses libu2f-emu for the U2F key emulation. Libu2f-emu
-implements completely the U2F protocol device part for all specified
-transport given by the FIDO Alliance.
-
-For more information about libu2f-emu see this page:
-https://github.com/MattGorko/libu2f-emu.
--
2.34.1