target-arm queue; this one has a fair scattering of more
miscellaneous things in it which I've sent out this week.
I've shoved those in as well as it seemed the least-effort
way of getting them into master; a few of them are dependencies
on arm-related patches I have brewing.

thanks
-- PMM

The following changes since commit 2702c2d3eb74e3908c0c5dbf3a71c8987595a86e:

Merge remote-tracking branch 'remotes/stsquad/tags/pull-travis-updates-140618-1' into staging (2018-06-15 12:49:36 +0100)

are available in the Git repository at:

git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180615

for you to fetch changes up to 14120108f87b3f9e1beacdf0a6096e464e62bb65:

target/arm: Allow ARMv6-M Thumb2 instructions (2018-06-15 15:23:34 +0100)

----------------------------------------------------------------
target-arm and miscellaneous queue:
* fix KVM state save/restore for GICv3 priority registers for high IRQ numbers
* hw/arm/mps2-tz: Put ethernet controller behind PPC
* hw/sh/sh7750: Convert away from old_mmio
* hw/m68k/mcf5206: Convert away from old_mmio
* hw/block/pflash_cfi02: Convert away from old_mmio
* hw/watchdog/wdt_i6300esb: Convert away from old_mmio
* hw/input/pckbd: Convert away from old_mmio
* hw/char/parallel: Convert away from old_mmio
* armv7m: refactor to get rid of armv7m_init() function
* arm: Don't crash if user tries to use a Cortex-M CPU without an NVIC
* hw/core/or-irq: Support more than 16 inputs to an OR gate
* cpu-defs.h: Document CPUIOTLBEntry 'addr' field
* cputlb: Pass cpu_transaction_failed() the correct physaddr
* CODING_STYLE: Define our preferred form for multiline comments
* Add and use new stn_*_p() and ldn_*_p() memory access functions
* target/arm: More parts of the upcoming SVE support
* aspeed_scu: Implement RNG register
* m25p80: add support for two bytes WRSR for Macronix chips
* exec.c: Handle IOMMUs being in the path of TCG CPU memory accesses
* target/arm: Allow ARMv6-M Thumb2 instructions

----------------------------------------------------------------
Cédric Le Goater (1):
m25p80: add support for two bytes WRSR for Macronix chips

Joel Stanley (1):
aspeed_scu: Implement RNG register

Julia Suvorova (1):
target/arm: Allow ARMv6-M Thumb2 instructions

Peter Maydell (21):
hw/arm/mps2-tz: Put ethernet controller behind PPC
hw/sh/sh7750: Convert away from old_mmio
hw/m68k/mcf5206: Convert away from old_mmio
hw/block/pflash_cfi02: Convert away from old_mmio
hw/watchdog/wdt_i6300esb: Convert away from old_mmio
hw/input/pckbd: Convert away from old_mmio
hw/char/parallel: Convert away from old_mmio
stellaris: Stop using armv7m_init()
hw/arm/armv7m: Remove unused armv7m_init() function
arm: Don't crash if user tries to use a Cortex-M CPU without an NVIC
hw/core/or-irq: Support more than 16 inputs to an OR gate
cpu-defs.h: Document CPUIOTLBEntry 'addr' field
cputlb: Pass cpu_transaction_failed() the correct physaddr
CODING_STYLE: Define our preferred form for multiline comments
bswap: Add new stn_*_p() and ldn_*_p() memory access functions
exec.c: Don't accidentally sign-extend 4-byte loads in subpage_read()
exec.c: Use stn_p() and ldn_p() instead of explicit switches
iommu: Add IOMMU index concept to IOMMU API
iommu: Add IOMMU index argument to notifier APIs
iommu: Add IOMMU index argument to translate method
exec.c: Handle IOMMUs in address_space_translate_for_iotlb()

Richard Henderson (18):
target/arm: Extend vec_reg_offset to larger sizes
target/arm: Implement SVE Permute - Unpredicated Group
target/arm: Implement SVE Permute - Predicates Group
target/arm: Implement SVE Permute - Interleaving Group
target/arm: Implement SVE compress active elements
target/arm: Implement SVE conditionally broadcast/extract element
target/arm: Implement SVE copy to vector (predicated)
target/arm: Implement SVE reverse within elements
target/arm: Implement SVE vector splice (predicated)
target/arm: Implement SVE Select Vectors Group
target/arm: Implement SVE Integer Compare - Vectors Group
target/arm: Implement SVE Integer Compare - Immediate Group
target/arm: Implement SVE Partition Break Group
target/arm: Implement SVE Predicate Count Group
target/arm: Implement SVE Integer Compare - Scalars Group
target/arm: Implement FDUP/DUP
target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group
target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group

Shannon Zhao (1):
arm_gicv3_kvm: kvm_dist_get/put_priority: skip the registers banked by GICR_IPRIORITYR

include/exec/cpu-all.h | 4 +
include/exec/cpu-defs.h | 9 +
include/exec/exec-all.h | 16 +-
include/exec/memory.h | 65 +-
include/hw/arm/arm.h | 8 +-
include/hw/or-irq.h | 5 +-
include/qemu/bswap.h | 52 ++
include/qom/cpu.h | 3 +
target/arm/helper-sve.h | 294 +++++++++
target/arm/helper.h | 19 +
target/arm/translate-a64.h | 26 +-
accel/tcg/cputlb.c | 59 +-
exec.c | 263 ++++----
hw/alpha/typhoon.c | 3 +-
hw/arm/armv7m.c | 28 +-
hw/arm/mps2-tz.c | 32 +-
hw/arm/smmuv3.c | 2 +-
hw/arm/stellaris.c | 12 +-
hw/block/m25p80.c | 1 +
hw/block/pflash_cfi02.c | 97 +--
hw/char/parallel.c | 50 +-
hw/core/or-irq.c | 39 +-
hw/dma/rc4030.c | 2 +-
hw/i386/amd_iommu.c | 2 +-
hw/i386/intel_iommu.c | 8 +-
hw/input/pckbd.c | 14 +-
hw/intc/arm_gicv3_kvm.c | 18 +-
hw/intc/armv7m_nvic.c | 6 +-
hw/m68k/mcf5206.c | 48 +-
hw/misc/aspeed_scu.c | 20 +
hw/ppc/spapr_iommu.c | 5 +-
hw/s390x/s390-pci-bus.c | 2 +-
hw/s390x/s390-pci-inst.c | 4 +-
hw/sh4/sh7750.c | 44 +-
hw/sparc/sun4m_iommu.c | 3 +-
hw/sparc64/sun4u_iommu.c | 2 +-
hw/vfio/common.c | 6 +-
hw/virtio/vhost.c | 7 +-
hw/watchdog/wdt_i6300esb.c | 48 +-
memory.c | 33 +-
target/arm/cpu.c | 18 +
target/arm/sve_helper.c | 1250 +++++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 1458 +++++++++++++++++++++++++++++++++++++++++++
target/arm/translate.c | 43 +-
target/arm/vec_helper.c | 69 ++
CODING_STYLE | 17 +
docs/devel/loads-stores.rst | 15 +
target/arm/sve.decode | 248 ++++++++
48 files changed, 4114 insertions(+), 363 deletions(-)


Arm queue; some of the simpler stuff, things others have reviewed (thanks!), etc.

-- PMM

The following changes since commit 5d2f557b47dfbf8f23277a5bdd8473d4607c681a:

Merge remote-tracking branch 'remotes/kraxel/tags/vga-20200605-pull-request' into staging (2020-06-05 13:53:05 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200605

for you to fetch changes up to 2c35a39eda0b16c2ed85c94cec204bf5efb97812:

target/arm: Convert Neon one-register-and-immediate insns to decodetree (2020-06-05 17:23:10 +0100)

----------------------------------------------------------------
target-arm queue:
hw/ssi/imx_spi: Handle tx burst lengths other than 8 correctly
hw/input/pxa2xx_keypad: Replace hw_error() by qemu_log_mask()
hw/arm/pxa2xx: Replace printf() call by qemu_log_mask()
target/arm: Convert crypto insns to gvec
hw/adc/stm32f2xx_adc: Correct memory region size and access size
tests/acceptance: Add a boot test for the xlnx-versal-virt machine
docs/system: Document Aspeed boards
raspi: Add model of the USB controller
target/arm: Convert 2-reg-and-shift and 1-reg-imm Neon insns to decodetree

----------------------------------------------------------------
Cédric Le Goater (1):
docs/system: Document Aspeed boards

Eden Mikitas (2):
hw/ssi/imx_spi: changed while statement to prevent underflow
hw/ssi/imx_spi: Removed unnecessary cast of rx data received from slave

Paul Zimmerman (7):
raspi: add BCM2835 SOC MPHI emulation
dwc-hsotg (dwc2) USB host controller register definitions
dwc-hsotg (dwc2) USB host controller state definitions
dwc-hsotg (dwc2) USB host controller emulation
usb: add short-packet handling to usb-storage driver
wire in the dwc-hsotg (dwc2) USB host controller emulation
raspi2 acceptance test: add test for dwc-hsotg (dwc2) USB host

Peter Maydell (9):
target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree
target/arm: Convert Neon VSHR 2-reg-shift insns to decodetree
target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree
target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree
target/arm: Convert Neon narrowing shifts with op==8 to decodetree
target/arm: Convert Neon narrowing shifts with op==9 to decodetree
target/arm: Convert Neon VSHLL, VMOVL to decodetree
target/arm: Convert VCVT fixed-point ops to decodetree
target/arm: Convert Neon one-register-and-immediate insns to decodetree

Philippe Mathieu-Daudé (3):
hw/input/pxa2xx_keypad: Replace hw_error() by qemu_log_mask()
hw/arm/pxa2xx: Replace printf() call by qemu_log_mask()
hw/adc/stm32f2xx_adc: Correct memory region size and access size

Richard Henderson (6):
target/arm: Convert aes and sm4 to gvec helpers
target/arm: Convert rax1 to gvec helpers
target/arm: Convert sha512 and sm3 to gvec helpers
target/arm: Convert sha1 and sha256 to gvec helpers
target/arm: Split helper_crypto_sha1_3reg
target/arm: Split helper_crypto_sm3tt

Thomas Huth (1):
tests/acceptance: Add a boot test for the xlnx-versal-virt machine

docs/system/arm/aspeed.rst | 85 ++
docs/system/target-arm.rst | 1 +
hw/usb/hcd-dwc2.h | 190 +++++
include/hw/arm/bcm2835_peripherals.h | 5 +-
include/hw/misc/bcm2835_mphi.h | 44 +
include/hw/usb/dwc2-regs.h | 899 ++++++++++++++++++++
target/arm/helper.h | 45 +-
target/arm/translate-a64.h | 3 +
target/arm/vec_internal.h | 33 +
target/arm/neon-dp.decode | 214 ++++-
hw/adc/stm32f2xx_adc.c | 4 +-
hw/arm/bcm2835_peripherals.c | 38 +-
hw/arm/pxa2xx.c | 66 +-
hw/input/pxa2xx_keypad.c | 10 +-
hw/misc/bcm2835_mphi.c | 191 +++++
hw/ssi/imx_spi.c | 4 +-
hw/usb/dev-storage.c | 15 +-
hw/usb/hcd-dwc2.c | 1417 ++++++++++++++++++++++++++++++++
target/arm/crypto_helper.c | 267 ++++--
target/arm/translate-a64.c | 198 ++---
target/arm/translate-neon.inc.c | 796 ++++++++++++++----
target/arm/translate.c | 539 +-----------
target/arm/vec_helper.c | 12 +-
hw/misc/Makefile.objs | 1 +
hw/usb/Kconfig | 5 +
hw/usb/Makefile.objs | 1 +
hw/usb/trace-events | 50 ++
tests/acceptance/boot_linux_console.py | 35 +-
28 files changed, 4258 insertions(+), 910 deletions(-)
create mode 100644 docs/system/arm/aspeed.rst
create mode 100644 hw/usb/hcd-dwc2.h
create mode 100644 include/hw/misc/bcm2835_mphi.h
create mode 100644 include/hw/usb/dwc2-regs.h
create mode 100644 target/arm/vec_internal.h
create mode 100644 hw/misc/bcm2835_mphi.c
create mode 100644 hw/usb/hcd-dwc2.c
diff view generated by jsdifflib
From: Joel Stanley <joel@jms.id.au>

The ASPEED SoCs contain a single register that returns random data when
read. This models that register so that guests can use it.

The random number data register has a corresponding control register,
however it returns data regardless of the state of the enabled bit, so
the model follows this behaviour.

When the qcrypto call fails we exit as the guest uses the random number
device to feed its entropy pool, which is used for cryptographic
purposes.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-id: 20180613114836.9265-1-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/misc/aspeed_scu.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)


From: Eden Mikitas <e.mikitas@gmail.com>

The while statement in question only checked if tx_burst is not 0.
tx_burst is a signed int, which is assigned the value put by the
guest driver in ECSPI_CONREG. The burst length can be anywhere
between 1 and 4096, and since tx_burst is always decremented by 8
it could possibly underflow, causing an infinite loop.

Signed-off-by: Eden Mikitas <e.mikitas@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/ssi/imx_spi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
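For reference, a minimal standalone sketch (not part of either patch; the
values are made up) of the loop-termination hazard described in the imx_spi
commit message above: with a burst length that is not a multiple of 8, a bare
"while (tx_burst)" never sees exactly zero, while "tx_burst > 0" stops on the
first non-positive value.

/* underflow-demo.c: illustrative only, not QEMU code */
#include <stdio.h>

int main(void)
{
    int tx_burst = 12;      /* guest-chosen burst length, not a multiple of 8 */
    int iterations = 0;

    /*
     * Fixed condition: stops once the counter goes non-positive.  The old
     * "while (tx_burst)" form would keep decrementing past zero (-4, -12, ...)
     * and never terminate in practice.
     */
    while (tx_burst > 0) {
        tx_burst -= 8;
        iterations++;
    }
    printf("stopped after %d iterations, tx_burst = %d\n", iterations, tx_burst);
    return 0;
}
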
diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
16
diff --git a/hw/ssi/imx_spi.c b/hw/ssi/imx_spi.c
23
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
24
--- a/hw/misc/aspeed_scu.c
18
--- a/hw/ssi/imx_spi.c
25
+++ b/hw/misc/aspeed_scu.c
19
+++ b/hw/ssi/imx_spi.c
26
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ static void imx_spi_flush_txfifo(IMXSPIState *s)
27
#include "qapi/visitor.h"
21
28
#include "qemu/bitops.h"
22
rx = 0;
29
#include "qemu/log.h"
23
30
+#include "crypto/random.h"
24
- while (tx_burst) {
31
#include "trace.h"
25
+ while (tx_burst > 0) {
32
26
uint8_t byte = tx & 0xff;
33
#define TO_REG(offset) ((offset) >> 2)
27
34
@@ -XXX,XX +XXX,XX @@ static const uint32_t ast2500_a1_resets[ASPEED_SCU_NR_REGS] = {
28
DPRINTF("writing 0x%02x\n", (uint32_t)byte);
35
[BMC_DEV_ID] = 0x00002402U
36
};
37
38
+static uint32_t aspeed_scu_get_random(void)
39
+{
40
+ Error *err = NULL;
41
+ uint32_t num;
42
+
43
+ if (qcrypto_random_bytes((uint8_t *)&num, sizeof(num), &err)) {
44
+ error_report_err(err);
45
+ exit(1);
46
+ }
47
+
48
+ return num;
49
+}
50
+
51
static uint64_t aspeed_scu_read(void *opaque, hwaddr offset, unsigned size)
52
{
53
AspeedSCUState *s = ASPEED_SCU(opaque);
54
@@ -XXX,XX +XXX,XX @@ static uint64_t aspeed_scu_read(void *opaque, hwaddr offset, unsigned size)
55
}
56
57
switch (reg) {
58
+ case RNG_DATA:
59
+ /* On hardware, RNG_DATA works regardless of
60
+ * the state of the enable bit in RNG_CTRL
61
+ */
62
+ s->regs[RNG_DATA] = aspeed_scu_get_random();
63
+ break;
64
case WAKEUP_EN:
65
qemu_log_mask(LOG_GUEST_ERROR,
66
"%s: Read of write-only offset 0x%" HWADDR_PRIx "\n",
67
--
29
--
68
2.17.1
30
2.20.1
69
31
70
32
diff view generated by jsdifflib
From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper-sve.h | 2 +
target/arm/sve_helper.c | 14 ++++
target/arm/translate-sve.c | 133 +++++++++++++++++++++++++++++++++++++
target/arm/sve.decode | 27 ++++++++
4 files changed, 176 insertions(+)


From: Eden Mikitas <e.mikitas@gmail.com>

When inserting the value retrieved (rx) from the spi slave, rx is pushed to
rx_fifo after being cast to uint8_t. rx_fifo is a fifo32, and the rx
register the driver uses is also 32 bit. This zeroes the 24 most
significant bits of rx. This proved problematic with devices that expect to
use the whole 32 bits of the rx register.

Signed-off-by: Eden Mikitas <e.mikitas@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/ssi/imx_spi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
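For reference, a small standalone illustration (not part of either patch; the
sample value is made up) of the truncation removed by the imx_spi change
above: casting the 32-bit value read back from the slave to uint8_t before
pushing it into the 32-bit FIFO silently discards the upper 24 bits.

/* cast-demo.c: illustrative only, not QEMU code */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t rx = 0xcafe1234;           /* value returned by the SPI slave */
    uint32_t pushed_old = (uint8_t)rx;  /* what the old code queued: 0x00000034 */
    uint32_t pushed_new = rx;           /* what the fixed code queues */

    printf("old: 0x%08x  new: 0x%08x\n", pushed_old, pushed_new);
    return 0;
}
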
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
16
diff --git a/hw/ssi/imx_spi.c b/hw/ssi/imx_spi.c
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
18
--- a/hw/ssi/imx_spi.c
17
+++ b/target/arm/helper-sve.h
19
+++ b/hw/ssi/imx_spi.c
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
20
@@ -XXX,XX +XXX,XX @@ static void imx_spi_flush_txfifo(IMXSPIState *s)
19
21
if (fifo32_is_full(&s->rx_fifo)) {
20
DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
s->regs[ECSPI_STATREG] |= ECSPI_STATREG_RO;
21
DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
23
} else {
22
+
24
- fifo32_push(&s->rx_fifo, (uint8_t)rx);
23
+DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
25
+ fifo32_push(&s->rx_fifo, rx);
24
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
26
}
25
index XXXXXXX..XXXXXXX 100644
27
26
--- a/target/arm/sve_helper.c
28
if (s->burst_length <= 0) {
27
+++ b/target/arm/sve_helper.c
28
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc)
29
return do_zero(vd, oprsz);
30
}
31
}
32
+
33
+uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc)
34
+{
35
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
36
+ intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
37
+ uint64_t *n = vn, *g = vg, sum = 0, mask = pred_esz_masks[esz];
38
+ intptr_t i;
39
+
40
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) {
41
+ uint64_t t = n[i] & g[i] & mask;
42
+ sum += ctpop64(t);
43
+ }
44
+ return sum;
45
+}
46
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/target/arm/translate-sve.c
49
+++ b/target/arm/translate-sve.c
50
@@ -XXX,XX +XXX,XX @@
51
#include "translate-a64.h"
52
53
54
+typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t,
55
+ TCGv_i64, uint32_t, uint32_t);
56
+
57
typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr,
58
TCGv_ptr, TCGv_i32);
59
typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
60
@@ -XXX,XX +XXX,XX @@ static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn)
61
return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns);
62
}
63
64
+/*
65
+ *** SVE Predicate Count Group
66
+ */
67
+
68
+static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg)
69
+{
70
+ unsigned psz = pred_full_reg_size(s);
71
+
72
+ if (psz <= 8) {
73
+ uint64_t psz_mask;
74
+
75
+ tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn));
76
+ if (pn != pg) {
77
+ TCGv_i64 g = tcg_temp_new_i64();
78
+ tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg));
79
+ tcg_gen_and_i64(val, val, g);
80
+ tcg_temp_free_i64(g);
81
+ }
82
+
83
+ /* Reduce the pred_esz_masks value simply to reduce the
84
+ * size of the code generated here.
85
+ */
86
+ psz_mask = MAKE_64BIT_MASK(0, psz * 8);
87
+ tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask);
88
+
89
+ tcg_gen_ctpop_i64(val, val);
90
+ } else {
91
+ TCGv_ptr t_pn = tcg_temp_new_ptr();
92
+ TCGv_ptr t_pg = tcg_temp_new_ptr();
93
+ unsigned desc;
94
+ TCGv_i32 t_desc;
95
+
96
+ desc = psz - 2;
97
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz);
98
+
99
+ tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn));
100
+ tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
101
+ t_desc = tcg_const_i32(desc);
102
+
103
+ gen_helper_sve_cntp(val, t_pn, t_pg, t_desc);
104
+ tcg_temp_free_ptr(t_pn);
105
+ tcg_temp_free_ptr(t_pg);
106
+ tcg_temp_free_i32(t_desc);
107
+ }
108
+}
109
+
110
+static bool trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn)
111
+{
112
+ if (sve_access_check(s)) {
113
+ do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg);
114
+ }
115
+ return true;
116
+}
117
+
118
+static bool trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a,
119
+ uint32_t insn)
120
+{
121
+ if (sve_access_check(s)) {
122
+ TCGv_i64 reg = cpu_reg(s, a->rd);
123
+ TCGv_i64 val = tcg_temp_new_i64();
124
+
125
+ do_cntp(s, val, a->esz, a->pg, a->pg);
126
+ if (a->d) {
127
+ tcg_gen_sub_i64(reg, reg, val);
128
+ } else {
129
+ tcg_gen_add_i64(reg, reg, val);
130
+ }
131
+ tcg_temp_free_i64(val);
132
+ }
133
+ return true;
134
+}
135
+
136
+static bool trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a,
137
+ uint32_t insn)
138
+{
139
+ if (a->esz == 0) {
140
+ return false;
141
+ }
142
+ if (sve_access_check(s)) {
143
+ unsigned vsz = vec_full_reg_size(s);
144
+ TCGv_i64 val = tcg_temp_new_i64();
145
+ GVecGen2sFn *gvec_fn = a->d ? tcg_gen_gvec_subs : tcg_gen_gvec_adds;
146
+
147
+ do_cntp(s, val, a->esz, a->pg, a->pg);
148
+ gvec_fn(a->esz, vec_full_reg_offset(s, a->rd),
149
+ vec_full_reg_offset(s, a->rn), val, vsz, vsz);
150
+ }
151
+ return true;
152
+}
153
+
154
+static bool trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a,
155
+ uint32_t insn)
156
+{
157
+ if (sve_access_check(s)) {
158
+ TCGv_i64 reg = cpu_reg(s, a->rd);
159
+ TCGv_i64 val = tcg_temp_new_i64();
160
+
161
+ do_cntp(s, val, a->esz, a->pg, a->pg);
162
+ do_sat_addsub_32(reg, val, a->u, a->d);
163
+ }
164
+ return true;
165
+}
166
+
167
+static bool trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a,
168
+ uint32_t insn)
169
+{
170
+ if (sve_access_check(s)) {
171
+ TCGv_i64 reg = cpu_reg(s, a->rd);
172
+ TCGv_i64 val = tcg_temp_new_i64();
173
+
174
+ do_cntp(s, val, a->esz, a->pg, a->pg);
175
+ do_sat_addsub_64(reg, val, a->u, a->d);
176
+ }
177
+ return true;
178
+}
179
+
180
+static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a,
181
+ uint32_t insn)
182
+{
183
+ if (a->esz == 0) {
184
+ return false;
185
+ }
186
+ if (sve_access_check(s)) {
187
+ TCGv_i64 val = tcg_temp_new_i64();
188
+ do_cntp(s, val, a->esz, a->pg, a->pg);
189
+ do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d);
190
+ }
191
+ return true;
192
+}
193
+
194
/*
195
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
196
*/
197
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
198
index XXXXXXX..XXXXXXX 100644
199
--- a/target/arm/sve.decode
200
+++ b/target/arm/sve.decode
201
@@ -XXX,XX +XXX,XX @@
202
&ptrue rd esz pat s
203
&incdec_cnt rd pat esz imm d u
204
&incdec2_cnt rd rn pat esz imm d u
205
+&incdec_pred rd pg esz d u
206
+&incdec2_pred rd rn pg esz d u
207
208
###########################################################################
209
# Named instruction formats. These are generally used to
210
@@ -XXX,XX +XXX,XX @@
211
212
# One register operand, with governing predicate, vector element size
213
@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
214
+@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz
215
216
# Two register operands with a 6-bit signed immediate.
217
@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri
218
@@ -XXX,XX +XXX,XX @@
219
@incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \
220
&incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx
221
222
+# One register, predicate.
223
+# User must fill in U and D.
224
+@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pred
225
+@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \
226
+ &incdec2_pred rn=%reg_movprfx
227
+
228
###########################################################################
229
# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
230
231
@@ -XXX,XX +XXX,XX @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s
232
# SVE propagate break to next partition
233
BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s
234
235
+### SVE Predicate Count Group
236
+
237
+# SVE predicate count
238
+CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn
239
+
240
+# SVE inc/dec register by predicate count
241
+INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred u=1
242
+
243
+# SVE inc/dec vector by predicate count
244
+INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred u=1
245
+
246
+# SVE saturating inc/dec register by predicate count
247
+SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pred
248
+SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred
249
+
250
+# SVE saturating inc/dec vector by predicate count
251
+SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred
252
+
253
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
254
255
# SVE load predicate register
256
--
29
--
257
2.17.1
30
2.20.1
258
31
259
32
diff view generated by jsdifflib
Currently we don't support board configurations that put an IOMMU
in the path of the CPU's memory transactions, and instead just
assert() if the memory region found in address_space_translate_for_iotlb()
is an IOMMUMemoryRegion.

Remove this limitation by having the function handle IOMMUs.
This is mostly straightforward, but we must make sure we have
a notifier registered for every IOMMU that a transaction has
passed through, so that we can flush the TLB appropriately
when any of the IOMMUs change their mappings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
---
include/exec/exec-all.h | 3 +-
include/qom/cpu.h | 3 +
accel/tcg/cputlb.c | 3 +-
exec.c | 135 +++++++++++++++++++++++++++++++++++++++-
4 files changed, 140 insertions(+), 4 deletions(-)


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

hw_error() calls exit(). This is a bit overkill when we can log
the accesses as unimplemented or guest error.

When fuzzing the devices, we don't want the whole process to
exit. Replace some hw_error() calls by qemu_log_mask()
(missed in commit 5a0001ec7e).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200525114123.21317-2-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/input/pxa2xx_keypad.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
18
diff --git a/hw/input/pxa2xx_keypad.c b/hw/input/pxa2xx_keypad.c
23
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
24
--- a/include/exec/exec-all.h
20
--- a/hw/input/pxa2xx_keypad.c
25
+++ b/include/exec/exec-all.h
21
+++ b/hw/input/pxa2xx_keypad.c
26
@@ -XXX,XX +XXX,XX @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr);
22
@@ -XXX,XX +XXX,XX @@
27
23
*/
28
MemoryRegionSection *
24
29
address_space_translate_for_iotlb(CPUState *cpu, int asidx, hwaddr addr,
25
#include "qemu/osdep.h"
30
- hwaddr *xlat, hwaddr *plen);
26
-#include "hw/hw.h"
31
+ hwaddr *xlat, hwaddr *plen,
27
+#include "qemu/log.h"
32
+ MemTxAttrs attrs, int *prot);
28
#include "hw/irq.h"
33
hwaddr memory_region_section_get_iotlb(CPUState *cpu,
29
#include "migration/vmstate.h"
34
MemoryRegionSection *section,
30
#include "hw/arm/pxa.h"
35
target_ulong vaddr,
31
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_keypad_read(void *opaque, hwaddr offset,
36
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
32
return s->kpkdi;
37
index XXXXXXX..XXXXXXX 100644
33
break;
38
--- a/include/qom/cpu.h
34
default:
39
+++ b/include/qom/cpu.h
35
- hw_error("%s: Bad offset " REG_FMT "\n", __func__, offset);
40
@@ -XXX,XX +XXX,XX @@ struct CPUState {
36
+ qemu_log_mask(LOG_GUEST_ERROR,
41
uint16_t pending_tlb_flush;
37
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
42
38
+ __func__, offset);
43
int hvf_fd;
44
+
45
+ /* track IOMMUs whose translations we've cached in the TCG TLB */
46
+ GArray *iommu_notifiers;
47
};
48
49
QTAILQ_HEAD(CPUTailQ, CPUState);
50
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
51
index XXXXXXX..XXXXXXX 100644
52
--- a/accel/tcg/cputlb.c
53
+++ b/accel/tcg/cputlb.c
54
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
55
}
39
}
56
40
57
sz = size;
41
return 0;
58
- section = address_space_translate_for_iotlb(cpu, asidx, paddr, &xlat, &sz);
42
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_keypad_write(void *opaque, hwaddr offset,
59
+ section = address_space_translate_for_iotlb(cpu, asidx, paddr, &xlat, &sz,
43
break;
60
+ attrs, &prot);
44
61
assert(sz >= TARGET_PAGE_SIZE);
45
default:
62
46
- hw_error("%s: Bad offset " REG_FMT "\n", __func__, offset);
63
tlb_debug("vaddr=" TARGET_FMT_lx " paddr=0x" TARGET_FMT_plx
47
+ qemu_log_mask(LOG_GUEST_ERROR,
64
diff --git a/exec.c b/exec.c
48
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
65
index XXXXXXX..XXXXXXX 100644
49
+ __func__, offset);
66
--- a/exec.c
50
}
67
+++ b/exec.c
68
@@ -XXX,XX +XXX,XX @@ MemoryRegion *flatview_translate(FlatView *fv, hwaddr addr, hwaddr *xlat,
69
return mr;
70
}
51
}
71
52
72
+typedef struct TCGIOMMUNotifier {
73
+ IOMMUNotifier n;
74
+ MemoryRegion *mr;
75
+ CPUState *cpu;
76
+ int iommu_idx;
77
+ bool active;
78
+} TCGIOMMUNotifier;
79
+
80
+static void tcg_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
81
+{
82
+ TCGIOMMUNotifier *notifier = container_of(n, TCGIOMMUNotifier, n);
83
+
84
+ if (!notifier->active) {
85
+ return;
86
+ }
87
+ tlb_flush(notifier->cpu);
88
+ notifier->active = false;
89
+ /* We leave the notifier struct on the list to avoid reallocating it later.
90
+ * Generally the number of IOMMUs a CPU deals with will be small.
91
+ * In any case we can't unregister the iommu notifier from a notify
92
+ * callback.
93
+ */
94
+}
95
+
96
+static void tcg_register_iommu_notifier(CPUState *cpu,
97
+ IOMMUMemoryRegion *iommu_mr,
98
+ int iommu_idx)
99
+{
100
+ /* Make sure this CPU has an IOMMU notifier registered for this
101
+ * IOMMU/IOMMU index combination, so that we can flush its TLB
102
+ * when the IOMMU tells us the mappings we've cached have changed.
103
+ */
104
+ MemoryRegion *mr = MEMORY_REGION(iommu_mr);
105
+ TCGIOMMUNotifier *notifier;
106
+ int i;
107
+
108
+ for (i = 0; i < cpu->iommu_notifiers->len; i++) {
109
+ notifier = &g_array_index(cpu->iommu_notifiers, TCGIOMMUNotifier, i);
110
+ if (notifier->mr == mr && notifier->iommu_idx == iommu_idx) {
111
+ break;
112
+ }
113
+ }
114
+ if (i == cpu->iommu_notifiers->len) {
115
+ /* Not found, add a new entry at the end of the array */
116
+ cpu->iommu_notifiers = g_array_set_size(cpu->iommu_notifiers, i + 1);
117
+ notifier = &g_array_index(cpu->iommu_notifiers, TCGIOMMUNotifier, i);
118
+
119
+ notifier->mr = mr;
120
+ notifier->iommu_idx = iommu_idx;
121
+ notifier->cpu = cpu;
122
+ /* Rather than trying to register interest in the specific part
123
+ * of the iommu's address space that we've accessed and then
124
+ * expand it later as subsequent accesses touch more of it, we
125
+ * just register interest in the whole thing, on the assumption
126
+ * that iommu reconfiguration will be rare.
127
+ */
128
+ iommu_notifier_init(&notifier->n,
129
+ tcg_iommu_unmap_notify,
130
+ IOMMU_NOTIFIER_UNMAP,
131
+ 0,
132
+ HWADDR_MAX,
133
+ iommu_idx);
134
+ memory_region_register_iommu_notifier(notifier->mr, &notifier->n);
135
+ }
136
+
137
+ if (!notifier->active) {
138
+ notifier->active = true;
139
+ }
140
+}
141
+
142
+static void tcg_iommu_free_notifier_list(CPUState *cpu)
143
+{
144
+ /* Destroy the CPU's notifier list */
145
+ int i;
146
+ TCGIOMMUNotifier *notifier;
147
+
148
+ for (i = 0; i < cpu->iommu_notifiers->len; i++) {
149
+ notifier = &g_array_index(cpu->iommu_notifiers, TCGIOMMUNotifier, i);
150
+ memory_region_unregister_iommu_notifier(notifier->mr, &notifier->n);
151
+ }
152
+ g_array_free(cpu->iommu_notifiers, true);
153
+}
154
+
155
/* Called from RCU critical section */
156
MemoryRegionSection *
157
address_space_translate_for_iotlb(CPUState *cpu, int asidx, hwaddr addr,
158
- hwaddr *xlat, hwaddr *plen)
159
+ hwaddr *xlat, hwaddr *plen,
160
+ MemTxAttrs attrs, int *prot)
161
{
162
MemoryRegionSection *section;
163
+ IOMMUMemoryRegion *iommu_mr;
164
+ IOMMUMemoryRegionClass *imrc;
165
+ IOMMUTLBEntry iotlb;
166
+ int iommu_idx;
167
AddressSpaceDispatch *d = atomic_rcu_read(&cpu->cpu_ases[asidx].memory_dispatch);
168
169
- section = address_space_translate_internal(d, addr, xlat, plen, false);
170
+ for (;;) {
171
+ section = address_space_translate_internal(d, addr, &addr, plen, false);
172
+
173
+ iommu_mr = memory_region_get_iommu(section->mr);
174
+ if (!iommu_mr) {
175
+ break;
176
+ }
177
+
178
+ imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
179
+
180
+ iommu_idx = imrc->attrs_to_index(iommu_mr, attrs);
181
+ tcg_register_iommu_notifier(cpu, iommu_mr, iommu_idx);
182
+ /* We need all the permissions, so pass IOMMU_NONE so the IOMMU
183
+ * doesn't short-cut its translation table walk.
184
+ */
185
+ iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, iommu_idx);
186
+ addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
187
+ | (addr & iotlb.addr_mask));
188
+ /* Update the caller's prot bits to remove permissions the IOMMU
189
+ * is giving us a failure response for. If we get down to no
190
+ * permissions left at all we can give up now.
191
+ */
192
+ if (!(iotlb.perm & IOMMU_RO)) {
193
+ *prot &= ~(PAGE_READ | PAGE_EXEC);
194
+ }
195
+ if (!(iotlb.perm & IOMMU_WO)) {
196
+ *prot &= ~PAGE_WRITE;
197
+ }
198
+
199
+ if (!*prot) {
200
+ goto translate_fail;
201
+ }
202
+
203
+ d = flatview_to_dispatch(address_space_to_flatview(iotlb.target_as));
204
+ }
205
206
assert(!memory_region_is_iommu(section->mr));
207
+ *xlat = addr;
208
return section;
209
+
210
+translate_fail:
211
+ return &d->map.sections[PHYS_SECTION_UNASSIGNED];
212
}
213
#endif
214
215
@@ -XXX,XX +XXX,XX @@ void cpu_exec_unrealizefn(CPUState *cpu)
216
if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
217
vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
218
}
219
+#ifndef CONFIG_USER_ONLY
220
+ tcg_iommu_free_notifier_list(cpu);
221
+#endif
222
}
223
224
Property cpu_common_props[] = {
225
@@ -XXX,XX +XXX,XX @@ void cpu_exec_realizefn(CPUState *cpu, Error **errp)
226
if (cc->vmsd != NULL) {
227
vmstate_register(NULL, cpu->cpu_index, cc->vmsd, cpu);
228
}
229
+
230
+ cpu->iommu_notifiers = g_array_new(false, true, sizeof(TCGIOMMUNotifier));
231
#endif
232
}
233
234
--
53
--
235
2.17.1
54
2.20.1
236
55
237
56
diff view generated by jsdifflib
Convert the pflash_cfi02 device away from using the old_mmio field
of MemoryRegionOps.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180601141223.26630-4-peter.maydell@linaro.org
---
hw/block/pflash_cfi02.c | 97 ++++++++---------------------------------
1 file changed, 18 insertions(+), 79 deletions(-)


From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Replace printf() calls by qemu_log_mask(), which is disabled
by default. This avoids flooding the terminal when fuzzing the
device.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200525114123.21317-3-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/pxa2xx.c | 66 ++++++++++++++++++++++++++++++++++++-------------
1 file changed, 49 insertions(+), 17 deletions(-)
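Note (not part of either patch): messages logged with LOG_GUEST_ERROR are
suppressed by default; they only appear when the guest_errors debug log item
is enabled on the command line, for example:

    qemu-system-arm -d guest_errors <machine and image options>

so the converted devices stay quiet in normal runs but the bad-offset
accesses can still be inspected when debugging a guest.
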
diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
15
diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
13
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
14
--- a/hw/block/pflash_cfi02.c
17
--- a/hw/arm/pxa2xx.c
15
+++ b/hw/block/pflash_cfi02.c
18
+++ b/hw/arm/pxa2xx.c
16
@@ -XXX,XX +XXX,XX @@ static void pflash_write (pflash_t *pfl, hwaddr offset,
19
@@ -XXX,XX +XXX,XX @@
17
pfl->cmd = 0;
20
#include "sysemu/blockdev.h"
18
}
21
#include "sysemu/qtest.h"
19
22
#include "qemu/cutils.h"
23
+#include "qemu/log.h"
24
25
static struct {
26
hwaddr io_base;
27
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_pm_read(void *opaque, hwaddr addr,
28
return s->pm_regs[addr >> 2];
29
default:
30
fail:
31
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
32
+ qemu_log_mask(LOG_GUEST_ERROR,
33
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
34
+ __func__, addr);
35
break;
36
}
37
return 0;
38
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_pm_write(void *opaque, hwaddr addr,
39
s->pm_regs[addr >> 2] = value;
40
break;
41
}
20
-
42
-
21
-static uint32_t pflash_readb_be(void *opaque, hwaddr addr)
43
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
22
+static uint64_t pflash_be_readfn(void *opaque, hwaddr addr, unsigned size)
44
+ qemu_log_mask(LOG_GUEST_ERROR,
23
{
45
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
24
- return pflash_read(opaque, addr, 1, 1);
46
+ __func__, addr);
25
+ return pflash_read(opaque, addr, size, 1);
47
break;
26
}
48
}
27
49
}
28
-static uint32_t pflash_readb_le(void *opaque, hwaddr addr)
50
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_cm_read(void *opaque, hwaddr addr,
29
+static void pflash_be_writefn(void *opaque, hwaddr addr,
51
return s->cm_regs[CCCR >> 2] | (3 << 28);
30
+ uint64_t value, unsigned size)
52
31
{
53
default:
32
- return pflash_read(opaque, addr, 1, 0);
54
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
33
+ pflash_write(opaque, addr, value, size, 1);
55
+ qemu_log_mask(LOG_GUEST_ERROR,
34
}
56
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
35
57
+ __func__, addr);
36
-static uint32_t pflash_readw_be(void *opaque, hwaddr addr)
58
break;
37
+static uint64_t pflash_le_readfn(void *opaque, hwaddr addr, unsigned size)
59
}
38
{
60
return 0;
39
- pflash_t *pfl = opaque;
61
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_cm_write(void *opaque, hwaddr addr,
40
-
62
break;
41
- return pflash_read(pfl, addr, 2, 1);
63
42
+ return pflash_read(opaque, addr, size, 0);
64
default:
43
}
65
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
44
66
+ qemu_log_mask(LOG_GUEST_ERROR,
45
-static uint32_t pflash_readw_le(void *opaque, hwaddr addr)
67
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
46
+static void pflash_le_writefn(void *opaque, hwaddr addr,
68
+ __func__, addr);
47
+ uint64_t value, unsigned size)
69
break;
48
{
70
}
49
- pflash_t *pfl = opaque;
71
}
50
-
72
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_mm_read(void *opaque, hwaddr addr,
51
- return pflash_read(pfl, addr, 2, 0);
73
return s->mm_regs[addr >> 2];
52
-}
74
/* fall through */
53
-
75
default:
54
-static uint32_t pflash_readl_be(void *opaque, hwaddr addr)
76
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
55
-{
77
+ qemu_log_mask(LOG_GUEST_ERROR,
56
- pflash_t *pfl = opaque;
78
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
57
-
79
+ __func__, addr);
58
- return pflash_read(pfl, addr, 4, 1);
80
break;
59
-}
81
}
60
-
82
return 0;
61
-static uint32_t pflash_readl_le(void *opaque, hwaddr addr)
83
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_mm_write(void *opaque, hwaddr addr,
62
-{
84
}
63
- pflash_t *pfl = opaque;
85
64
-
86
default:
65
- return pflash_read(pfl, addr, 4, 0);
87
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
66
-}
88
+ qemu_log_mask(LOG_GUEST_ERROR,
67
-
89
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
68
-static void pflash_writeb_be(void *opaque, hwaddr addr,
90
+ __func__, addr);
69
- uint32_t value)
91
break;
70
-{
92
}
71
- pflash_write(opaque, addr, value, 1, 1);
93
}
72
-}
94
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_ssp_read(void *opaque, hwaddr addr,
73
-
95
case SSACD:
74
-static void pflash_writeb_le(void *opaque, hwaddr addr,
96
return s->ssacd;
75
- uint32_t value)
97
default:
76
-{
98
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
77
- pflash_write(opaque, addr, value, 1, 0);
99
+ qemu_log_mask(LOG_GUEST_ERROR,
78
-}
100
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
79
-
101
+ __func__, addr);
80
-static void pflash_writew_be(void *opaque, hwaddr addr,
102
break;
81
- uint32_t value)
103
}
82
-{
104
return 0;
83
- pflash_t *pfl = opaque;
105
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_ssp_write(void *opaque, hwaddr addr,
84
-
106
break;
85
- pflash_write(pfl, addr, value, 2, 1);
107
86
-}
108
default:
87
-
109
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
88
-static void pflash_writew_le(void *opaque, hwaddr addr,
110
+ qemu_log_mask(LOG_GUEST_ERROR,
89
- uint32_t value)
111
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
90
-{
112
+ __func__, addr);
91
- pflash_t *pfl = opaque;
113
break;
92
-
114
}
93
- pflash_write(pfl, addr, value, 2, 0);
115
}
94
-}
116
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_rtc_read(void *opaque, hwaddr addr,
95
-
117
else
96
-static void pflash_writel_be(void *opaque, hwaddr addr,
118
return s->last_swcr;
97
- uint32_t value)
119
default:
98
-{
120
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
99
- pflash_t *pfl = opaque;
121
+ qemu_log_mask(LOG_GUEST_ERROR,
100
-
122
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
101
- pflash_write(pfl, addr, value, 4, 1);
123
+ __func__, addr);
102
-}
124
break;
103
-
125
}
104
-static void pflash_writel_le(void *opaque, hwaddr addr,
126
return 0;
105
- uint32_t value)
127
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_rtc_write(void *opaque, hwaddr addr,
106
-{
128
break;
107
- pflash_t *pfl = opaque;
129
108
-
130
default:
109
- pflash_write(pfl, addr, value, 4, 0);
131
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
110
+ pflash_write(opaque, addr, value, size, 0);
132
+ qemu_log_mask(LOG_GUEST_ERROR,
111
}
133
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
112
134
+ __func__, addr);
113
static const MemoryRegionOps pflash_cfi02_ops_be = {
135
}
114
- .old_mmio = {
136
}
115
- .read = { pflash_readb_be, pflash_readw_be, pflash_readl_be, },
137
116
- .write = { pflash_writeb_be, pflash_writew_be, pflash_writel_be, },
138
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_i2c_read(void *opaque, hwaddr addr,
117
- },
139
s->ibmr = 0;
118
+ .read = pflash_be_readfn,
140
return s->ibmr;
119
+ .write = pflash_be_writefn,
141
default:
120
+ .valid.min_access_size = 1,
142
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
121
+ .valid.max_access_size = 4,
143
+ qemu_log_mask(LOG_GUEST_ERROR,
122
.endianness = DEVICE_NATIVE_ENDIAN,
144
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
123
};
145
+ __func__, addr);
124
146
break;
125
static const MemoryRegionOps pflash_cfi02_ops_le = {
147
}
126
- .old_mmio = {
148
return 0;
127
- .read = { pflash_readb_le, pflash_readw_le, pflash_readl_le, },
149
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_i2c_write(void *opaque, hwaddr addr,
128
- .write = { pflash_writeb_le, pflash_writew_le, pflash_writel_le, },
150
break;
129
- },
151
130
+ .read = pflash_le_readfn,
152
default:
131
+ .write = pflash_le_writefn,
153
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
132
+ .valid.min_access_size = 1,
154
+ qemu_log_mask(LOG_GUEST_ERROR,
133
+ .valid.max_access_size = 4,
155
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
134
.endianness = DEVICE_NATIVE_ENDIAN,
156
+ __func__, addr);
135
};
157
}
158
}
159
160
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_i2s_read(void *opaque, hwaddr addr,
161
}
162
return 0;
163
default:
164
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
165
+ qemu_log_mask(LOG_GUEST_ERROR,
166
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
167
+ __func__, addr);
168
break;
169
}
170
return 0;
171
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_i2s_write(void *opaque, hwaddr addr,
172
}
173
break;
174
default:
175
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
176
+ qemu_log_mask(LOG_GUEST_ERROR,
177
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
178
+ __func__, addr);
179
}
180
}
181
182
@@ -XXX,XX +XXX,XX @@ static uint64_t pxa2xx_fir_read(void *opaque, hwaddr addr,
183
case ICFOR:
184
return s->rx_len;
185
default:
186
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
187
+ qemu_log_mask(LOG_GUEST_ERROR,
188
+ "%s: Bad read offset 0x%"HWADDR_PRIx"\n",
189
+ __func__, addr);
190
break;
191
}
192
return 0;
193
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_fir_write(void *opaque, hwaddr addr,
194
case ICFOR:
195
break;
196
default:
197
- printf("%s: Bad register " REG_FMT "\n", __func__, addr);
198
+ qemu_log_mask(LOG_GUEST_ERROR,
199
+ "%s: Bad write offset 0x%"HWADDR_PRIx"\n",
200
+ __func__, addr);
201
}
202
}
136
203
137
--
204
--
138
2.17.1
205
2.20.1
139
206
140
207
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
With this conversion, we will be able to use the same helpers
4
with sve. In particular, pass 3 vector parameters for the
5
3-operand operations; for advsimd the destination register
6
is also an input.
7
8
This also fixes a bug in which we failed to clear the high bits
9
of the SVE register after an AdvSIMD operation.
10
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Message-id: 20200514212831.31248-2-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
13
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-19-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
15
---
8
target/arm/helper-sve.h | 14 ++++++++
16
target/arm/helper.h | 6 ++--
9
target/arm/helper.h | 19 +++++++++++
17
target/arm/vec_internal.h | 33 +++++++++++++++++
10
target/arm/translate-sve.c | 42 +++++++++++++++++++++++
18
target/arm/crypto_helper.c | 72 +++++++++++++++++++++++++++-----------
11
target/arm/vec_helper.c | 69 ++++++++++++++++++++++++++++++++++++++
19
target/arm/translate-a64.c | 55 ++++++++++++++++++-----------
12
target/arm/sve.decode | 10 ++++++
20
target/arm/translate.c | 27 +++++++-------
13
5 files changed, 154 insertions(+)
21
target/arm/vec_helper.c | 12 +------
14
22
6 files changed, 138 insertions(+), 67 deletions(-)
15
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
23
create mode 100644 target/arm/vec_internal.h
16
index XXXXXXX..XXXXXXX 100644
24
17
--- a/target/arm/helper-sve.h
18
+++ b/target/arm/helper-sve.h
19
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
20
DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
21
DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
22
DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
23
+
24
+DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG,
25
+ void, ptr, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG,
27
+ void, ptr, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG,
29
+ void, ptr, ptr, ptr, ptr, i32)
30
+
31
+DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG,
32
+ void, ptr, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
34
+ void, ptr, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
36
+ void, ptr, ptr, ptr, ptr, i32)
37
diff --git a/target/arm/helper.h b/target/arm/helper.h
25
diff --git a/target/arm/helper.h b/target/arm/helper.h
38
index XXXXXXX..XXXXXXX 100644
26
index XXXXXXX..XXXXXXX 100644
39
--- a/target/arm/helper.h
27
--- a/target/arm/helper.h
40
+++ b/target/arm/helper.h
28
+++ b/target/arm/helper.h
41
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
29
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qzip8, TCG_CALL_NO_RWG, void, ptr, ptr)
42
DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
30
DEF_HELPER_FLAGS_2(neon_qzip16, TCG_CALL_NO_RWG, void, ptr, ptr)
43
void, ptr, ptr, ptr, ptr, i32)
31
DEF_HELPER_FLAGS_2(neon_qzip32, TCG_CALL_NO_RWG, void, ptr, ptr)
44
32
45
+DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
33
-DEF_HELPER_FLAGS_3(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
46
+DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
35
DEF_HELPER_FLAGS_3(crypto_aesmc, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
48
+
36
49
+DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
37
DEF_HELPER_FLAGS_4(crypto_sha1_3reg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(crypto_sm3tt, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32, i32)
51
+DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
DEF_HELPER_FLAGS_3(crypto_sm3partw1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
52
+
40
DEF_HELPER_FLAGS_3(crypto_sm3partw2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
53
+DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
41
54
+DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
42
-DEF_HELPER_FLAGS_2(crypto_sm4e, TCG_CALL_NO_RWG, void, ptr, ptr)
55
+DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
43
-DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
56
+
44
+DEF_HELPER_FLAGS_4(crypto_sm4e, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
57
+DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
45
+DEF_HELPER_FLAGS_4(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
58
+ void, ptr, ptr, ptr, ptr, i32)
46
59
+DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
47
DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
60
+ void, ptr, ptr, ptr, ptr, i32)
48
DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
61
+DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
49
diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h
62
+ void, ptr, ptr, ptr, ptr, i32)
50
new file mode 100644
63
+
51
index XXXXXXX..XXXXXXX
64
#ifdef TARGET_AARCH64
52
--- /dev/null
65
#include "helper-a64.h"
53
+++ b/target/arm/vec_internal.h
66
#include "helper-sve.h"
54
@@ -XXX,XX +XXX,XX @@
67
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
55
+/*
56
+ * ARM AdvSIMD / SVE Vector Helpers
57
+ *
58
+ * Copyright (c) 2020 Linaro
59
+ *
60
+ * This library is free software; you can redistribute it and/or
61
+ * modify it under the terms of the GNU Lesser General Public
62
+ * License as published by the Free Software Foundation; either
63
+ * version 2 of the License, or (at your option) any later version.
64
+ *
65
+ * This library is distributed in the hope that it will be useful,
66
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
67
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
68
+ * Lesser General Public License for more details.
69
+ *
70
+ * You should have received a copy of the GNU Lesser General Public
71
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
72
+ */
73
+
74
+#ifndef TARGET_ARM_VEC_INTERNALS_H
75
+#define TARGET_ARM_VEC_INTERNALS_H
76
+
77
+static inline void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
78
+{
79
+ uint64_t *d = vd + opr_sz;
80
+ uintptr_t i;
81
+
82
+ for (i = opr_sz; i < max_sz; i += 8) {
83
+ *d++ = 0;
84
+ }
85
+}
86
+
87
+#endif /* TARGET_ARM_VEC_INTERNALS_H */
88
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
68
index XXXXXXX..XXXXXXX 100644
89
index XXXXXXX..XXXXXXX 100644
69
--- a/target/arm/translate-sve.c
90
--- a/target/arm/crypto_helper.c
70
+++ b/target/arm/translate-sve.c
91
+++ b/target/arm/crypto_helper.c
71
@@ -XXX,XX +XXX,XX @@ DO_ZZI(UMIN, umin)
92
@@ -XXX,XX +XXX,XX @@
72
93
73
#undef DO_ZZI
94
#include "cpu.h"
74
95
#include "exec/helper-proto.h"
75
+/*
96
+#include "tcg/tcg-gvec-desc.h"
76
+ *** SVE Floating Point Arithmetic - Unpredicated Group
97
#include "crypto/aes.h"
77
+ */
98
+#include "vec_internal.h"
78
+
99
79
+static bool do_zzz_fp(DisasContext *s, arg_rrr_esz *a,
100
union CRYPTO_STATE {
80
+ gen_helper_gvec_3_ptr *fn)
101
uint8_t bytes[16];
81
+{
102
@@ -XXX,XX +XXX,XX @@ union CRYPTO_STATE {
82
+ if (fn == NULL) {
103
#define CR_ST_WORD(state, i) (state.words[i])
83
+ return false;
104
#endif
84
+ }
105
85
+ if (sve_access_check(s)) {
106
-void HELPER(crypto_aese)(void *vd, void *vm, uint32_t decrypt)
86
+ unsigned vsz = vec_full_reg_size(s);
107
+static void do_crypto_aese(uint64_t *rd, uint64_t *rn,
87
+ TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
108
+ uint64_t *rm, bool decrypt)
88
+ tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
109
{
89
+ vec_full_reg_offset(s, a->rn),
110
static uint8_t const * const sbox[2] = { AES_sbox, AES_isbox };
90
+ vec_full_reg_offset(s, a->rm),
111
static uint8_t const * const shift[2] = { AES_shifts, AES_ishifts };
91
+ status, vsz, vsz, 0, fn);
112
- uint64_t *rd = vd;
92
+ tcg_temp_free_ptr(status);
113
- uint64_t *rm = vm;
93
+ }
114
union CRYPTO_STATE rk = { .l = { rm[0], rm[1] } };
94
+ return true;
115
- union CRYPTO_STATE st = { .l = { rd[0], rd[1] } };
95
+}
116
+ union CRYPTO_STATE st = { .l = { rn[0], rn[1] } };
96
+
117
int i;
97
+
118
98
+#define DO_FP3(NAME, name) \
119
- assert(decrypt < 2);
99
+static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \
120
-
100
+{ \
121
/* xor state vector with round key */
101
+ static gen_helper_gvec_3_ptr * const fns[4] = { \
122
rk.l[0] ^= st.l[0];
102
+ NULL, gen_helper_gvec_##name##_h, \
123
rk.l[1] ^= st.l[1];
103
+ gen_helper_gvec_##name##_s, gen_helper_gvec_##name##_d \
124
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_aese)(void *vd, void *vm, uint32_t decrypt)
104
+ }; \
125
rd[1] = st.l[1];
105
+ return do_zzz_fp(s, a, fns[a->esz]); \
126
}
106
+}
127
107
+
128
-void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t decrypt)
108
+DO_FP3(FADD_zzz, fadd)
129
+void HELPER(crypto_aese)(void *vd, void *vn, void *vm, uint32_t desc)
109
+DO_FP3(FSUB_zzz, fsub)
130
+{
110
+DO_FP3(FMUL_zzz, fmul)
131
+ intptr_t i, opr_sz = simd_oprsz(desc);
111
+DO_FP3(FTSMUL, ftsmul)
132
+ bool decrypt = simd_data(desc);
112
+DO_FP3(FRECPS, recps)
133
+
113
+DO_FP3(FRSQRTS, rsqrts)
134
+ for (i = 0; i < opr_sz; i += 16) {
114
+
135
+ do_crypto_aese(vd + i, vn + i, vm + i, decrypt);
115
+#undef DO_FP3
136
+ }
137
+ clear_tail(vd, opr_sz, simd_maxsz(desc));
138
+}
139
+
140
+static void do_crypto_aesmc(uint64_t *rd, uint64_t *rm, bool decrypt)
141
{
142
static uint32_t const mc[][256] = { {
143
/* MixColumns lookup table */
144
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t decrypt)
145
0xbe805d9f, 0xb58d5491, 0xa89a4f83, 0xa397468d,
146
} };
147
148
- uint64_t *rd = vd;
149
- uint64_t *rm = vm;
150
union CRYPTO_STATE st = { .l = { rm[0], rm[1] } };
151
int i;
152
153
- assert(decrypt < 2);
154
-
155
for (i = 0; i < 16; i += 4) {
156
CR_ST_WORD(st, i >> 2) =
157
mc[decrypt][CR_ST_BYTE(st, i)] ^
158
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t decrypt)
159
rd[1] = st.l[1];
160
}
161
162
+void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t desc)
163
+{
164
+ intptr_t i, opr_sz = simd_oprsz(desc);
165
+ bool decrypt = simd_data(desc);
166
+
167
+ for (i = 0; i < opr_sz; i += 16) {
168
+ do_crypto_aesmc(vd + i, vm + i, decrypt);
169
+ }
170
+ clear_tail(vd, opr_sz, simd_maxsz(desc));
171
+}
116
+
172
+
117
/*
173
/*
118
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
174
* SHA-1 logical functions
119
*/
175
*/
176
@@ -XXX,XX +XXX,XX @@ static uint8_t const sm4_sbox[] = {
177
0x79, 0xee, 0x5f, 0x3e, 0xd7, 0xcb, 0x39, 0x48,
178
};
179
180
-void HELPER(crypto_sm4e)(void *vd, void *vn)
181
+static void do_crypto_sm4e(uint64_t *rd, uint64_t *rn, uint64_t *rm)
182
{
183
- uint64_t *rd = vd;
184
- uint64_t *rn = vn;
185
- union CRYPTO_STATE d = { .l = { rd[0], rd[1] } };
186
- union CRYPTO_STATE n = { .l = { rn[0], rn[1] } };
187
+ union CRYPTO_STATE d = { .l = { rn[0], rn[1] } };
188
+ union CRYPTO_STATE n = { .l = { rm[0], rm[1] } };
189
uint32_t t, i;
190
191
for (i = 0; i < 4; i++) {
192
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm4e)(void *vd, void *vn)
193
rd[1] = d.l[1];
194
}
195
196
-void HELPER(crypto_sm4ekey)(void *vd, void *vn, void* vm)
197
+void HELPER(crypto_sm4e)(void *vd, void *vn, void *vm, uint32_t desc)
198
+{
199
+ intptr_t i, opr_sz = simd_oprsz(desc);
200
+
201
+ for (i = 0; i < opr_sz; i += 16) {
202
+ do_crypto_sm4e(vd + i, vn + i, vm + i);
203
+ }
204
+ clear_tail(vd, opr_sz, simd_maxsz(desc));
205
+}
206
+
207
+static void do_crypto_sm4ekey(uint64_t *rd, uint64_t *rn, uint64_t *rm)
208
{
209
- uint64_t *rd = vd;
210
- uint64_t *rn = vn;
211
- uint64_t *rm = vm;
212
union CRYPTO_STATE d;
213
union CRYPTO_STATE n = { .l = { rn[0], rn[1] } };
214
union CRYPTO_STATE m = { .l = { rm[0], rm[1] } };
215
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm4ekey)(void *vd, void *vn, void* vm)
216
rd[0] = d.l[0];
217
rd[1] = d.l[1];
218
}
219
+
220
+void HELPER(crypto_sm4ekey)(void *vd, void *vn, void* vm, uint32_t desc)
221
+{
222
+ intptr_t i, opr_sz = simd_oprsz(desc);
223
+
224
+ for (i = 0; i < opr_sz; i += 16) {
225
+ do_crypto_sm4ekey(vd + i, vn + i, vm + i);
226
+ }
227
+ clear_tail(vd, opr_sz, simd_maxsz(desc));
228
+}
229
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
230
index XXXXXXX..XXXXXXX 100644
231
--- a/target/arm/translate-a64.c
232
+++ b/target/arm/translate-a64.c
233
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
234
is_q ? 16 : 8, vec_full_reg_size(s));
235
}
236
237
+/* Expand a 2-operand operation using an out-of-line helper. */
238
+static void gen_gvec_op2_ool(DisasContext *s, bool is_q, int rd,
239
+ int rn, int data, gen_helper_gvec_2 *fn)
240
+{
241
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
242
+ vec_full_reg_offset(s, rn),
243
+ is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
244
+}
245
+
246
/* Expand a 3-operand operation using an out-of-line helper. */
247
static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
248
int rn, int rm, int data, gen_helper_gvec_3 *fn)
249
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
250
int rn = extract32(insn, 5, 5);
251
int rd = extract32(insn, 0, 5);
252
int decrypt;
253
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
254
- TCGv_i32 tcg_decrypt;
255
- CryptoThreeOpIntFn *genfn;
256
+ gen_helper_gvec_2 *genfn2 = NULL;
257
+ gen_helper_gvec_3 *genfn3 = NULL;
258
259
if (!dc_isar_feature(aa64_aes, s) || size != 0) {
260
unallocated_encoding(s);
261
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
262
switch (opcode) {
263
case 0x4: /* AESE */
264
decrypt = 0;
265
- genfn = gen_helper_crypto_aese;
266
+ genfn3 = gen_helper_crypto_aese;
267
break;
268
case 0x6: /* AESMC */
269
decrypt = 0;
270
- genfn = gen_helper_crypto_aesmc;
271
+ genfn2 = gen_helper_crypto_aesmc;
272
break;
273
case 0x5: /* AESD */
274
decrypt = 1;
275
- genfn = gen_helper_crypto_aese;
276
+ genfn3 = gen_helper_crypto_aese;
277
break;
278
case 0x7: /* AESIMC */
279
decrypt = 1;
280
- genfn = gen_helper_crypto_aesmc;
281
+ genfn2 = gen_helper_crypto_aesmc;
282
break;
283
default:
284
unallocated_encoding(s);
285
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
286
if (!fp_access_check(s)) {
287
return;
288
}
289
-
290
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
291
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
292
- tcg_decrypt = tcg_const_i32(decrypt);
293
-
294
- genfn(tcg_rd_ptr, tcg_rn_ptr, tcg_decrypt);
295
-
296
- tcg_temp_free_ptr(tcg_rd_ptr);
297
- tcg_temp_free_ptr(tcg_rn_ptr);
298
- tcg_temp_free_i32(tcg_decrypt);
299
+ if (genfn2) {
300
+ gen_gvec_op2_ool(s, true, rd, rn, decrypt, genfn2);
301
+ } else {
302
+ gen_gvec_op3_ool(s, true, rd, rd, rn, decrypt, genfn3);
303
+ }
304
}
305
306
/* Crypto three-reg SHA
307
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
308
int rn = extract32(insn, 5, 5);
309
int rd = extract32(insn, 0, 5);
310
bool feature;
311
- CryptoThreeOpFn *genfn;
312
+ CryptoThreeOpFn *genfn = NULL;
313
+ gen_helper_gvec_3 *oolfn = NULL;
314
315
if (o == 0) {
316
switch (opcode) {
317
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
318
break;
319
case 2: /* SM4EKEY */
320
feature = dc_isar_feature(aa64_sm4, s);
321
- genfn = gen_helper_crypto_sm4ekey;
322
+ oolfn = gen_helper_crypto_sm4ekey;
323
break;
324
default:
325
unallocated_encoding(s);
326
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
327
return;
328
}
329
330
+ if (oolfn) {
331
+ gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn);
332
+ return;
333
+ }
334
+
335
if (genfn) {
336
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
337
338
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
339
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
340
bool feature;
341
CryptoTwoOpFn *genfn;
342
+ gen_helper_gvec_3 *oolfn = NULL;
343
344
switch (opcode) {
345
case 0: /* SHA512SU0 */
346
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
347
break;
348
case 1: /* SM4E */
349
feature = dc_isar_feature(aa64_sm4, s);
350
- genfn = gen_helper_crypto_sm4e;
351
+ oolfn = gen_helper_crypto_sm4e;
352
break;
353
default:
354
unallocated_encoding(s);
355
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
356
return;
357
}
358
359
+ if (oolfn) {
360
+ gen_gvec_op3_ool(s, true, rd, rd, rn, 0, oolfn);
361
+ return;
362
+ }
363
+
364
tcg_rd_ptr = vec_full_reg_ptr(s, rd);
365
tcg_rn_ptr = vec_full_reg_ptr(s, rn);
366
367
diff --git a/target/arm/translate.c b/target/arm/translate.c
368
index XXXXXXX..XXXXXXX 100644
369
--- a/target/arm/translate.c
370
+++ b/target/arm/translate.c
371
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
372
if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
373
return 1;
374
}
375
- ptr1 = vfp_reg_ptr(true, rd);
376
- ptr2 = vfp_reg_ptr(true, rm);
377
-
378
- /* Bit 6 is the lowest opcode bit; it distinguishes between
379
- * encryption (AESE/AESMC) and decryption (AESD/AESIMC)
380
- */
381
- tmp3 = tcg_const_i32(extract32(insn, 6, 1));
382
-
383
+ /*
384
+ * Bit 6 is the lowest opcode bit; it distinguishes
385
+ * between encryption (AESE/AESMC) and decryption
386
+ * (AESD/AESIMC).
387
+ */
388
if (op == NEON_2RM_AESE) {
389
- gen_helper_crypto_aese(ptr1, ptr2, tmp3);
390
+ tcg_gen_gvec_3_ool(vfp_reg_offset(true, rd),
391
+ vfp_reg_offset(true, rd),
392
+ vfp_reg_offset(true, rm),
393
+ 16, 16, extract32(insn, 6, 1),
394
+ gen_helper_crypto_aese);
395
} else {
396
- gen_helper_crypto_aesmc(ptr1, ptr2, tmp3);
397
+ tcg_gen_gvec_2_ool(vfp_reg_offset(true, rd),
398
+ vfp_reg_offset(true, rm),
399
+ 16, 16, extract32(insn, 6, 1),
400
+ gen_helper_crypto_aesmc);
401
}
402
- tcg_temp_free_ptr(ptr1);
403
- tcg_temp_free_ptr(ptr2);
404
- tcg_temp_free_i32(tmp3);
405
break;
406
case NEON_2RM_SHA1H:
407
if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
120
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
408
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
121
index XXXXXXX..XXXXXXX 100644
409
index XXXXXXX..XXXXXXX 100644
122
--- a/target/arm/vec_helper.c
410
--- a/target/arm/vec_helper.c
123
+++ b/target/arm/vec_helper.c
411
+++ b/target/arm/vec_helper.c
124
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
412
@@ -XXX,XX +XXX,XX @@
125
}
413
#include "exec/helper-proto.h"
126
clear_tail(d, opr_sz, simd_maxsz(desc));
414
#include "tcg/tcg-gvec-desc.h"
127
}
415
#include "fpu/softfloat.h"
128
+
416
-
129
+/* Floating-point trigonometric starting value.
417
+#include "vec_internal.h"
130
+ * See the ARM ARM pseudocode function FPTrigSMul.
418
131
+ */
419
/* Note that vector data is stored in host-endian 64-bit chunks,
132
+static float16 float16_ftsmul(float16 op1, uint16_t op2, float_status *stat)
420
so addressing units smaller than that needs a host-endian fixup. */
133
+{
421
@@ -XXX,XX +XXX,XX @@
134
+ float16 result = float16_mul(op1, op1, stat);
422
#define H4(x) (x)
135
+ if (!float16_is_any_nan(result)) {
423
#endif
136
+ result = float16_set_sign(result, op2 & 1);
424
137
+ }
425
-static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
138
+ return result;
426
-{
139
+}
427
- uint64_t *d = vd + opr_sz;
140
+
428
- uintptr_t i;
141
+static float32 float32_ftsmul(float32 op1, uint32_t op2, float_status *stat)
429
-
142
+{
430
- for (i = opr_sz; i < max_sz; i += 8) {
143
+ float32 result = float32_mul(op1, op1, stat);
431
- *d++ = 0;
144
+ if (!float32_is_any_nan(result)) {
432
- }
145
+ result = float32_set_sign(result, op2 & 1);
433
-}
146
+ }
434
-
147
+ return result;
435
/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
148
+}
436
static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
149
+
437
int16_t src3, uint32_t *sat)
150
+static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
151
+{
152
+ float64 result = float64_mul(op1, op1, stat);
153
+ if (!float64_is_any_nan(result)) {
154
+ result = float64_set_sign(result, op2 & 1);
155
+ }
156
+ return result;
157
+}
158
+
159
+#define DO_3OP(NAME, FUNC, TYPE) \
160
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
161
+{ \
162
+ intptr_t i, oprsz = simd_oprsz(desc); \
163
+ TYPE *d = vd, *n = vn, *m = vm; \
164
+ for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
165
+ d[i] = FUNC(n[i], m[i], stat); \
166
+ } \
167
+}
168
+
169
+DO_3OP(gvec_fadd_h, float16_add, float16)
170
+DO_3OP(gvec_fadd_s, float32_add, float32)
171
+DO_3OP(gvec_fadd_d, float64_add, float64)
172
+
173
+DO_3OP(gvec_fsub_h, float16_sub, float16)
174
+DO_3OP(gvec_fsub_s, float32_sub, float32)
175
+DO_3OP(gvec_fsub_d, float64_sub, float64)
176
+
177
+DO_3OP(gvec_fmul_h, float16_mul, float16)
178
+DO_3OP(gvec_fmul_s, float32_mul, float32)
179
+DO_3OP(gvec_fmul_d, float64_mul, float64)
180
+
181
+DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
182
+DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
183
+DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
184
+
185
+#ifdef TARGET_AARCH64
186
+
187
+DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
188
+DO_3OP(gvec_recps_s, helper_recpsf_f32, float32)
189
+DO_3OP(gvec_recps_d, helper_recpsf_f64, float64)
190
+
191
+DO_3OP(gvec_rsqrts_h, helper_rsqrtsf_f16, float16)
192
+DO_3OP(gvec_rsqrts_s, helper_rsqrtsf_f32, float32)
193
+DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
194
+
195
+#endif
196
+#undef DO_3OP
197
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
198
index XXXXXXX..XXXXXXX 100644
199
--- a/target/arm/sve.decode
200
+++ b/target/arm/sve.decode
201
@@ -XXX,XX +XXX,XX @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
202
# SVE integer multiply immediate (unpredicated)
203
MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
204
205
+### SVE Floating Point Arithmetic - Unpredicated Group
206
+
207
+# SVE floating-point arithmetic (unpredicated)
208
+FADD_zzz 01100101 .. 0 ..... 000 000 ..... ..... @rd_rn_rm
209
+FSUB_zzz 01100101 .. 0 ..... 000 001 ..... ..... @rd_rn_rm
210
+FMUL_zzz 01100101 .. 0 ..... 000 010 ..... ..... @rd_rn_rm
211
+FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm
212
+FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm
213
+FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm
214
+
215
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
216
217
# SVE load predicate register
218
--
2.17.1

--
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180613015641.5667-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h | 14 +++++++++++++
 target/arm/sve_helper.c | 41 +++++++++++++++++++++++++++++++-------
 target/arm/translate-sve.c | 38 +++++++++++++++++++++++++++++++++++
 target/arm/sve.decode | 7 +++++++
 4 files changed, 93 insertions(+), 7 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

With this conversion, we will be able to use the same helpers
with sve. This also fixes a bug in which we failed to clear
the high bits of the SVE register after an AdvSIMD operation.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200514212831.31248-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h | 2 ++
 target/arm/translate-a64.h | 3 ++
 target/arm/crypto_helper.c | 11 +++++++
 target/arm/translate-a64.c | 59 ++++++++++++++------------------------
 4 files changed, 47 insertions(+), 28 deletions(-)
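(Editorial aside, not part of either series: a minimal sketch of the out-of-line gvec convention the converted helpers use. The "example_op" name below is made up; simd_oprsz()/simd_maxsz()/simd_data() are the descriptor accessors from tcg-gvec-desc.h, and clear_tail() is the routine whose body appears in the vec_helper.c hunk earlier in this view. The point is that the translator now tells the helper both how many bytes to operate on and how wide the whole register is, so the helper can zero the unused tail, which is the SVE high-bits fix mentioned above.

/*
 * Sketch only, assuming the usual gvec descriptor accessors.
 * "example_op" is a hypothetical operation, not one of the patched helpers.
 */
void HELPER(example_op)(void *vd, void *vn, void *vm, uint32_t desc)
{
    intptr_t i, opr_sz = simd_oprsz(desc);   /* bytes the guest op touches */
    intptr_t max_sz = simd_maxsz(desc);      /* full width of the register */

    for (i = 0; i < opr_sz; i += 16) {
        /* ... process one 128-bit block of vd/vn/vm here ... */
    }
    clear_tail(vd, opr_sz, max_sz);          /* zero bytes opr_sz..max_sz-1 */
}

On the translator side such a helper is reached through tcg_gen_gvec_2_ool()/tcg_gen_gvec_3_ool(), as in the gen_gvec_op2_ool() wrapper added earlier in this view, which passes is_q ? 16 : 8 for the AdvSIMD operation size and vec_full_reg_size(s) for the register width.)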
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
20
--- a/target/arm/helper.h
17
+++ b/target/arm/helper-sve.h
21
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crypto_sm3partw2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
19
23
DEF_HELPER_FLAGS_4(crypto_sm4e, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
24
DEF_HELPER_FLAGS_4(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
25
22
+DEF_HELPER_FLAGS_4(sve_revb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(crypto_rax1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_4(sve_revb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_revb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+
27
+
26
+DEF_HELPER_FLAGS_4(sve_revh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
27
+DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
30
31
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
32
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/translate-a64.h
34
+++ b/target/arm/translate-a64.h
35
@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
36
37
bool disas_sve(DisasContext *, uint32_t);
38
39
+void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
40
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
28
+
41
+
29
+DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
#endif /* TARGET_ARM_TRANSLATE_A64_H */
43
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/target/arm/crypto_helper.c
46
+++ b/target/arm/crypto_helper.c
47
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm4ekey)(void *vd, void *vn, void* vm, uint32_t desc)
48
}
49
clear_tail(vd, opr_sz, simd_maxsz(desc));
50
}
30
+
51
+
31
+DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
52
+void HELPER(crypto_rax1)(void *vd, void *vn, void *vm, uint32_t desc)
32
+DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
53
+{
33
+DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
54
+ intptr_t i, opr_sz = simd_oprsz(desc);
34
+DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
55
+ uint64_t *d = vd, *n = vn, *m = vm;
35
+
56
+
36
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
57
+ for (i = 0; i < opr_sz / 8; ++i) {
37
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
58
+ d[i] = n[i] ^ rol64(m[i], 1);
38
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
59
+ }
39
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
60
+ clear_tail(vd, opr_sz, simd_maxsz(desc));
61
+}
62
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
40
index XXXXXXX..XXXXXXX 100644
63
index XXXXXXX..XXXXXXX 100644
41
--- a/target/arm/sve_helper.c
64
--- a/target/arm/translate-a64.c
42
+++ b/target/arm/sve_helper.c
65
+++ b/target/arm/translate-a64.c
43
@@ -XXX,XX +XXX,XX @@ static inline uint64_t expand_pred_s(uint8_t byte)
66
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
44
return word[byte & 0x11];
67
tcg_temp_free_ptr(tcg_rn_ptr);
45
}
68
}
46
69
47
+/* Swap 16-bit words within a 32-bit word. */
70
+static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
48
+static inline uint32_t hswap32(uint32_t h)
49
+{
71
+{
50
+ return rol32(h, 16);
72
+ tcg_gen_rotli_i64(d, m, 1);
73
+ tcg_gen_xor_i64(d, d, n);
51
+}
74
+}
52
+
75
+
53
+/* Swap 16-bit words within a 64-bit word. */
76
+static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m)
54
+static inline uint64_t hswap64(uint64_t h)
55
+{
77
+{
56
+ uint64_t m = 0x0000ffff0000ffffull;
78
+ tcg_gen_rotli_vec(vece, d, m, 1);
57
+ h = rol64(h, 32);
79
+ tcg_gen_xor_vec(vece, d, d, n);
58
+ return ((h & m) << 16) | ((h >> 16) & m);
59
+}
80
+}
60
+
81
+
61
+/* Swap 32-bit words within a 64-bit word. */
82
+void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
62
+static inline uint64_t wswap64(uint64_t h)
83
+ uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
63
+{
84
+{
64
+ return rol64(h, 32);
85
+ static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 };
86
+ static const GVecGen3 op = {
87
+ .fni8 = gen_rax1_i64,
88
+ .fniv = gen_rax1_vec,
89
+ .opt_opc = vecop_list,
90
+ .fno = gen_helper_crypto_rax1,
91
+ .vece = MO_64,
92
+ };
93
+ tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op);
65
+}
94
+}
66
+
95
+
67
#define LOGICAL_PPPP(NAME, FUNC) \
96
/* Crypto three-reg SHA512
68
void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
97
* 31 21 20 16 15 14 13 12 11 10 9 5 4 0
69
{ \
98
* +-----------------------+------+---+---+-----+--------+------+------+
70
@@ -XXX,XX +XXX,XX @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
99
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
71
DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
100
bool feature;
72
DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)
101
CryptoThreeOpFn *genfn = NULL;
73
102
gen_helper_gvec_3 *oolfn = NULL;
74
+DO_ZPZ(sve_revb_h, uint16_t, H1_2, bswap16)
103
+ GVecGen3Fn *gvecfn = NULL;
75
+DO_ZPZ(sve_revb_s, uint32_t, H1_4, bswap32)
104
76
+DO_ZPZ_D(sve_revb_d, uint64_t, bswap64)
105
if (o == 0) {
77
+
106
switch (opcode) {
78
+DO_ZPZ(sve_revh_s, uint32_t, H1_4, hswap32)
107
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
79
+DO_ZPZ_D(sve_revh_d, uint64_t, hswap64)
108
break;
80
+
109
case 3: /* RAX1 */
81
+DO_ZPZ_D(sve_revw_d, uint64_t, wswap64)
110
feature = dc_isar_feature(aa64_sha3, s);
82
+
111
- genfn = NULL;
83
+DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8)
112
+ gvecfn = gen_gvec_rax1;
84
+DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16)
113
break;
85
+DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32)
114
default:
86
+DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64)
115
g_assert_not_reached();
87
+
116
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
88
/* Three-operand expander, unpredicated, in which the third operand is "wide".
117
89
*/
118
if (oolfn) {
90
#define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \
119
gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn);
91
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc)
120
- return;
121
- }
122
-
123
- if (genfn) {
124
+ } else if (gvecfn) {
125
+ gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64);
126
+ } else {
127
TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
128
129
tcg_rd_ptr = vec_full_reg_ptr(s, rd);
130
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
131
tcg_temp_free_ptr(tcg_rd_ptr);
132
tcg_temp_free_ptr(tcg_rn_ptr);
133
tcg_temp_free_ptr(tcg_rm_ptr);
134
- } else {
135
- TCGv_i64 tcg_op1, tcg_op2, tcg_res[2];
136
- int pass;
137
-
138
- tcg_op1 = tcg_temp_new_i64();
139
- tcg_op2 = tcg_temp_new_i64();
140
- tcg_res[0] = tcg_temp_new_i64();
141
- tcg_res[1] = tcg_temp_new_i64();
142
-
143
- for (pass = 0; pass < 2; pass++) {
144
- read_vec_element(s, tcg_op1, rn, pass, MO_64);
145
- read_vec_element(s, tcg_op2, rm, pass, MO_64);
146
-
147
- tcg_gen_rotli_i64(tcg_res[pass], tcg_op2, 1);
148
- tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
149
- }
150
- write_vec_element(s, tcg_res[0], rd, 0, MO_64);
151
- write_vec_element(s, tcg_res[1], rd, 1, MO_64);
152
-
153
- tcg_temp_free_i64(tcg_op1);
154
- tcg_temp_free_i64(tcg_op2);
155
- tcg_temp_free_i64(tcg_res[0]);
156
- tcg_temp_free_i64(tcg_res[1]);
92
}
157
}
93
}
158
}
94
159
95
-static inline uint64_t hswap64(uint64_t h)
96
-{
97
- uint64_t m = 0x0000ffff0000ffffull;
98
- h = rol64(h, 32);
99
- return ((h & m) << 16) | ((h >> 16) & m);
100
-}
101
-
102
void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc)
103
{
104
intptr_t i, j, opr_sz = simd_oprsz(desc);
105
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
106
index XXXXXXX..XXXXXXX 100644
107
--- a/target/arm/translate-sve.c
108
+++ b/target/arm/translate-sve.c
109
@@ -XXX,XX +XXX,XX @@ static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
110
return true;
111
}
112
113
+static bool trans_REVB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
114
+{
115
+ static gen_helper_gvec_3 * const fns[4] = {
116
+ NULL,
117
+ gen_helper_sve_revb_h,
118
+ gen_helper_sve_revb_s,
119
+ gen_helper_sve_revb_d,
120
+ };
121
+ return do_zpz_ool(s, a, fns[a->esz]);
122
+}
123
+
124
+static bool trans_REVH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
125
+{
126
+ static gen_helper_gvec_3 * const fns[4] = {
127
+ NULL,
128
+ NULL,
129
+ gen_helper_sve_revh_s,
130
+ gen_helper_sve_revh_d,
131
+ };
132
+ return do_zpz_ool(s, a, fns[a->esz]);
133
+}
134
+
135
+static bool trans_REVW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
136
+{
137
+ return do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_revw_d : NULL);
138
+}
139
+
140
+static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
141
+{
142
+ static gen_helper_gvec_3 * const fns[4] = {
143
+ gen_helper_sve_rbit_b,
144
+ gen_helper_sve_rbit_h,
145
+ gen_helper_sve_rbit_s,
146
+ gen_helper_sve_rbit_d,
147
+ };
148
+ return do_zpz_ool(s, a, fns[a->esz]);
149
+}
150
+
151
/*
152
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
153
*/
154
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
155
index XXXXXXX..XXXXXXX 100644
156
--- a/target/arm/sve.decode
157
+++ b/target/arm/sve.decode
158
@@ -XXX,XX +XXX,XX @@ CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn
159
# SVE copy element from general register to vector (predicated)
160
CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn
161
162
+# SVE reverse within elements
163
+# Note esz >= operation size
164
+REVB 00000101 .. 1001 00 100 ... ..... ..... @rd_pg_rn
165
+REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
166
+REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
167
+RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
168
+
169
### SVE Predicate Logical Operations Group
170
171
# SVE predicate logical operations
172
--
160
--
173
2.17.1
161
2.20.1
174
162
175
163
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Do not yet convert the helpers to loop over opr_sz, but the
4
descriptor allows the vector tail to be cleared. Which fixes
5
an existing bug vs SVE.
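
(Editorial aside, not part of the patch: a hypothetical concrete case of what the descriptor buys even before the loops are converted. Suppose the guest has a 64-byte SVE vector length and executes a 16-byte AdvSIMD crypto operation; the descriptor the translator passes down is equivalent to

    uint32_t desc = simd_desc(16, 64, 0);   /* oprsz = 16, maxsz = 64, data = 0 */

so simd_oprsz(desc) is 16 and simd_maxsz(desc) is 64, and the helper's clear_tail_16() call zeroes bytes 16..63 of the destination, which the old TCGv_ptr-based helpers never did.)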
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200514212831.31248-4-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-7-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/helper-sve.h | 2 +
12
target/arm/helper.h | 15 +++++++-----
9
target/arm/sve_helper.c | 12 ++
13
target/arm/crypto_helper.c | 37 +++++++++++++++++++++++-----
10
target/arm/translate-sve.c | 328 +++++++++++++++++++++++++++++++++++++
14
target/arm/translate-a64.c | 50 ++++++++++++--------------------------
11
target/arm/sve.decode | 20 +++
15
3 files changed, 55 insertions(+), 47 deletions(-)
12
4 files changed, 362 insertions(+)
16
13
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
19
--- a/target/arm/helper.h
17
+++ b/target/arm/helper-sve.h
20
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crypto_sha256h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
19
DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
DEF_HELPER_FLAGS_2(crypto_sha256su0, TCG_CALL_NO_RWG, void, ptr, ptr)
20
DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
DEF_HELPER_FLAGS_3(crypto_sha256su1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
21
24
22
+DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
25
-DEF_HELPER_FLAGS_3(crypto_sha512h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
23
+
26
-DEF_HELPER_FLAGS_3(crypto_sha512h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
24
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
-DEF_HELPER_FLAGS_2(crypto_sha512su0, TCG_CALL_NO_RWG, void, ptr, ptr)
25
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
-DEF_HELPER_FLAGS_3(crypto_sha512su1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
26
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(crypto_sha512h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
30
+DEF_HELPER_FLAGS_4(crypto_sha512h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_3(crypto_sha512su0, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_4(crypto_sha512su1, TCG_CALL_NO_RWG,
33
+ void, ptr, ptr, ptr, i32)
34
35
DEF_HELPER_FLAGS_5(crypto_sm3tt, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32, i32)
36
-DEF_HELPER_FLAGS_3(crypto_sm3partw1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
37
-DEF_HELPER_FLAGS_3(crypto_sm3partw2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
38
+DEF_HELPER_FLAGS_4(crypto_sm3partw1, TCG_CALL_NO_RWG,
39
+ void, ptr, ptr, ptr, i32)
40
+DEF_HELPER_FLAGS_4(crypto_sm3partw2, TCG_CALL_NO_RWG,
41
+ void, ptr, ptr, ptr, i32)
42
43
DEF_HELPER_FLAGS_4(crypto_sm4e, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
44
DEF_HELPER_FLAGS_4(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
45
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
28
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
29
--- a/target/arm/sve_helper.c
47
--- a/target/arm/crypto_helper.c
30
+++ b/target/arm/sve_helper.c
48
+++ b/target/arm/crypto_helper.c
31
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
49
@@ -XXX,XX +XXX,XX @@ union CRYPTO_STATE {
32
d[j] = 0;
50
#define CR_ST_WORD(state, i) (state.words[i])
51
#endif
52
53
+/*
54
+ * The caller has not been converted to full gvec, and so only
55
+ * modifies the low 16 bytes of the vector register.
56
+ */
57
+static void clear_tail_16(void *vd, uint32_t desc)
58
+{
59
+ int opr_sz = simd_oprsz(desc);
60
+ int max_sz = simd_maxsz(desc);
61
+
62
+ assert(opr_sz == 16);
63
+ clear_tail(vd, opr_sz, max_sz);
64
+}
65
+
66
static void do_crypto_aese(uint64_t *rd, uint64_t *rn,
67
uint64_t *rm, bool decrypt)
68
{
69
@@ -XXX,XX +XXX,XX @@ static uint64_t s1_512(uint64_t x)
70
return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6);
71
}
72
73
-void HELPER(crypto_sha512h)(void *vd, void *vn, void *vm)
74
+void HELPER(crypto_sha512h)(void *vd, void *vn, void *vm, uint32_t desc)
75
{
76
uint64_t *rd = vd;
77
uint64_t *rn = vn;
78
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha512h)(void *vd, void *vn, void *vm)
79
80
rd[0] = d0;
81
rd[1] = d1;
82
+
83
+ clear_tail_16(vd, desc);
84
}
85
86
-void HELPER(crypto_sha512h2)(void *vd, void *vn, void *vm)
87
+void HELPER(crypto_sha512h2)(void *vd, void *vn, void *vm, uint32_t desc)
88
{
89
uint64_t *rd = vd;
90
uint64_t *rn = vn;
91
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha512h2)(void *vd, void *vn, void *vm)
92
93
rd[0] = d0;
94
rd[1] = d1;
95
+
96
+ clear_tail_16(vd, desc);
97
}
98
99
-void HELPER(crypto_sha512su0)(void *vd, void *vn)
100
+void HELPER(crypto_sha512su0)(void *vd, void *vn, uint32_t desc)
101
{
102
uint64_t *rd = vd;
103
uint64_t *rn = vn;
104
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha512su0)(void *vd, void *vn)
105
106
rd[0] = d0;
107
rd[1] = d1;
108
+
109
+ clear_tail_16(vd, desc);
110
}
111
112
-void HELPER(crypto_sha512su1)(void *vd, void *vn, void *vm)
113
+void HELPER(crypto_sha512su1)(void *vd, void *vn, void *vm, uint32_t desc)
114
{
115
uint64_t *rd = vd;
116
uint64_t *rn = vn;
117
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha512su1)(void *vd, void *vn, void *vm)
118
119
rd[0] += s1_512(rn[0]) + rm[0];
120
rd[1] += s1_512(rn[1]) + rm[1];
121
+
122
+ clear_tail_16(vd, desc);
123
}
124
125
-void HELPER(crypto_sm3partw1)(void *vd, void *vn, void *vm)
126
+void HELPER(crypto_sm3partw1)(void *vd, void *vn, void *vm, uint32_t desc)
127
{
128
uint64_t *rd = vd;
129
uint64_t *rn = vn;
130
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm3partw1)(void *vd, void *vn, void *vm)
131
132
rd[0] = d.l[0];
133
rd[1] = d.l[1];
134
+
135
+ clear_tail_16(vd, desc);
136
}
137
138
-void HELPER(crypto_sm3partw2)(void *vd, void *vn, void *vm)
139
+void HELPER(crypto_sm3partw2)(void *vd, void *vn, void *vm, uint32_t desc)
140
{
141
uint64_t *rd = vd;
142
uint64_t *rn = vn;
143
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm3partw2)(void *vd, void *vn, void *vm)
144
145
rd[0] = d.l[0];
146
rd[1] = d.l[1];
147
+
148
+ clear_tail_16(vd, desc);
149
}
150
151
void HELPER(crypto_sm3tt)(void *vd, void *vn, void *vm, uint32_t imm2,
152
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
153
index XXXXXXX..XXXXXXX 100644
154
--- a/target/arm/translate-a64.c
155
+++ b/target/arm/translate-a64.c
156
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
157
int rn = extract32(insn, 5, 5);
158
int rd = extract32(insn, 0, 5);
159
bool feature;
160
- CryptoThreeOpFn *genfn = NULL;
161
gen_helper_gvec_3 *oolfn = NULL;
162
GVecGen3Fn *gvecfn = NULL;
163
164
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
165
switch (opcode) {
166
case 0: /* SHA512H */
167
feature = dc_isar_feature(aa64_sha512, s);
168
- genfn = gen_helper_crypto_sha512h;
169
+ oolfn = gen_helper_crypto_sha512h;
170
break;
171
case 1: /* SHA512H2 */
172
feature = dc_isar_feature(aa64_sha512, s);
173
- genfn = gen_helper_crypto_sha512h2;
174
+ oolfn = gen_helper_crypto_sha512h2;
175
break;
176
case 2: /* SHA512SU1 */
177
feature = dc_isar_feature(aa64_sha512, s);
178
- genfn = gen_helper_crypto_sha512su1;
179
+ oolfn = gen_helper_crypto_sha512su1;
180
break;
181
case 3: /* RAX1 */
182
feature = dc_isar_feature(aa64_sha3, s);
183
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
184
switch (opcode) {
185
case 0: /* SM3PARTW1 */
186
feature = dc_isar_feature(aa64_sm3, s);
187
- genfn = gen_helper_crypto_sm3partw1;
188
+ oolfn = gen_helper_crypto_sm3partw1;
189
break;
190
case 1: /* SM3PARTW2 */
191
feature = dc_isar_feature(aa64_sm3, s);
192
- genfn = gen_helper_crypto_sm3partw2;
193
+ oolfn = gen_helper_crypto_sm3partw2;
194
break;
195
case 2: /* SM4EKEY */
196
feature = dc_isar_feature(aa64_sm4, s);
197
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
198
199
if (oolfn) {
200
gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn);
201
- } else if (gvecfn) {
202
- gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64);
203
} else {
204
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
205
-
206
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
207
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
208
- tcg_rm_ptr = vec_full_reg_ptr(s, rm);
209
-
210
- genfn(tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr);
211
-
212
- tcg_temp_free_ptr(tcg_rd_ptr);
213
- tcg_temp_free_ptr(tcg_rn_ptr);
214
- tcg_temp_free_ptr(tcg_rm_ptr);
215
+ gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64);
33
}
216
}
34
}
217
}
35
+
218
36
+/* Similar to the ARM LastActiveElement pseudocode function, except the
219
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
37
+ * result is multiplied by the element size. This includes the not found
220
int opcode = extract32(insn, 10, 2);
38
+ * indication; e.g. not found for esz=3 is -8.
221
int rn = extract32(insn, 5, 5);
39
+ */
222
int rd = extract32(insn, 0, 5);
40
+int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
223
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
41
+{
224
bool feature;
42
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
225
- CryptoTwoOpFn *genfn;
43
+ intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
226
- gen_helper_gvec_3 *oolfn = NULL;
44
+
227
45
+ return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
228
switch (opcode) {
46
+}
229
case 0: /* SHA512SU0 */
47
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
230
feature = dc_isar_feature(aa64_sha512, s);
48
index XXXXXXX..XXXXXXX 100644
231
- genfn = gen_helper_crypto_sha512su0;
49
--- a/target/arm/translate-sve.c
232
break;
50
+++ b/target/arm/translate-sve.c
233
case 1: /* SM4E */
51
@@ -XXX,XX +XXX,XX @@ static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
234
feature = dc_isar_feature(aa64_sm4, s);
52
return do_zpz_ool(s, a, fns[a->esz]);
235
- oolfn = gen_helper_crypto_sm4e;
53
}
236
break;
54
237
default:
55
+/* Call the helper that computes the ARM LastActiveElement pseudocode
238
unallocated_encoding(s);
56
+ * function, scaled by the element size. This includes the not found
239
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
57
+ * indication; e.g. not found for esz=3 is -8.
240
return;
58
+ */
241
}
59
+static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg)
242
60
+{
243
- if (oolfn) {
61
+ /* Predicate sizes may be smaller and cannot use simd_desc. We cannot
244
- gen_gvec_op3_ool(s, true, rd, rd, rn, 0, oolfn);
62
+ * round up, as we do elsewhere, because we need the exact size.
245
- return;
63
+ */
246
+ switch (opcode) {
64
+ TCGv_ptr t_p = tcg_temp_new_ptr();
247
+ case 0: /* SHA512SU0 */
65
+ TCGv_i32 t_desc;
248
+ gen_gvec_op2_ool(s, true, rd, rn, 0, gen_helper_crypto_sha512su0);
66
+ unsigned vsz = pred_full_reg_size(s);
67
+ unsigned desc;
68
+
69
+ desc = vsz - 2;
70
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz);
71
+
72
+ tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg));
73
+ t_desc = tcg_const_i32(desc);
74
+
75
+ gen_helper_sve_last_active_element(ret, t_p, t_desc);
76
+
77
+ tcg_temp_free_i32(t_desc);
78
+ tcg_temp_free_ptr(t_p);
79
+}
80
+
81
+/* Increment LAST to the offset of the next element in the vector,
82
+ * wrapping around to 0.
83
+ */
84
+static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz)
85
+{
86
+ unsigned vsz = vec_full_reg_size(s);
87
+
88
+ tcg_gen_addi_i32(last, last, 1 << esz);
89
+ if (is_power_of_2(vsz)) {
90
+ tcg_gen_andi_i32(last, last, vsz - 1);
91
+ } else {
92
+ TCGv_i32 max = tcg_const_i32(vsz);
93
+ TCGv_i32 zero = tcg_const_i32(0);
94
+ tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last);
95
+ tcg_temp_free_i32(max);
96
+ tcg_temp_free_i32(zero);
97
+ }
98
+}
99
+
100
+/* If LAST < 0, set LAST to the offset of the last element in the vector. */
101
+static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz)
102
+{
103
+ unsigned vsz = vec_full_reg_size(s);
104
+
105
+ if (is_power_of_2(vsz)) {
106
+ tcg_gen_andi_i32(last, last, vsz - 1);
107
+ } else {
108
+ TCGv_i32 max = tcg_const_i32(vsz - (1 << esz));
109
+ TCGv_i32 zero = tcg_const_i32(0);
110
+ tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last);
111
+ tcg_temp_free_i32(max);
112
+ tcg_temp_free_i32(zero);
113
+ }
114
+}
115
+
116
+/* Load an unsigned element of ESZ from BASE+OFS. */
117
+static TCGv_i64 load_esz(TCGv_ptr base, int ofs, int esz)
118
+{
119
+ TCGv_i64 r = tcg_temp_new_i64();
120
+
121
+ switch (esz) {
122
+ case 0:
123
+ tcg_gen_ld8u_i64(r, base, ofs);
124
+ break;
249
+ break;
125
+ case 1:
250
+ case 1: /* SM4E */
126
+ tcg_gen_ld16u_i64(r, base, ofs);
251
+ gen_gvec_op3_ool(s, true, rd, rd, rn, 0, gen_helper_crypto_sm4e);
127
+ break;
128
+ case 2:
129
+ tcg_gen_ld32u_i64(r, base, ofs);
130
+ break;
131
+ case 3:
132
+ tcg_gen_ld_i64(r, base, ofs);
133
+ break;
252
+ break;
134
+ default:
253
+ default:
135
+ g_assert_not_reached();
254
+ g_assert_not_reached();
136
+ }
255
}
137
+ return r;
256
-
138
+}
257
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
139
+
258
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
140
+/* Load an unsigned element of ESZ from RM[LAST]. */
259
-
141
+static TCGv_i64 load_last_active(DisasContext *s, TCGv_i32 last,
260
- genfn(tcg_rd_ptr, tcg_rn_ptr);
142
+ int rm, int esz)
261
-
143
+{
262
- tcg_temp_free_ptr(tcg_rd_ptr);
144
+ TCGv_ptr p = tcg_temp_new_ptr();
263
- tcg_temp_free_ptr(tcg_rn_ptr);
145
+ TCGv_i64 r;
264
}
146
+
265
147
+ /* Convert offset into vector into offset into ENV.
266
/* Crypto four-register
148
+ * The final adjustment for the vector register base
149
+ * is added via constant offset to the load.
150
+ */
151
+#ifdef HOST_WORDS_BIGENDIAN
152
+ /* Adjust for element ordering. See vec_reg_offset. */
153
+ if (esz < 3) {
154
+ tcg_gen_xori_i32(last, last, 8 - (1 << esz));
155
+ }
156
+#endif
157
+ tcg_gen_ext_i32_ptr(p, last);
158
+ tcg_gen_add_ptr(p, p, cpu_env);
159
+
160
+ r = load_esz(p, vec_full_reg_offset(s, rm), esz);
161
+ tcg_temp_free_ptr(p);
162
+
163
+ return r;
164
+}
165
+
166
+/* Compute CLAST for a Zreg. */
167
+static bool do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before)
168
+{
169
+ TCGv_i32 last;
170
+ TCGLabel *over;
171
+ TCGv_i64 ele;
172
+ unsigned vsz, esz = a->esz;
173
+
174
+ if (!sve_access_check(s)) {
175
+ return true;
176
+ }
177
+
178
+ last = tcg_temp_local_new_i32();
179
+ over = gen_new_label();
180
+
181
+ find_last_active(s, last, esz, a->pg);
182
+
183
+ /* There is of course no movcond for a 2048-bit vector,
184
+ * so we must branch over the actual store.
185
+ */
186
+ tcg_gen_brcondi_i32(TCG_COND_LT, last, 0, over);
187
+
188
+ if (!before) {
189
+ incr_last_active(s, last, esz);
190
+ }
191
+
192
+ ele = load_last_active(s, last, a->rm, esz);
193
+ tcg_temp_free_i32(last);
194
+
195
+ vsz = vec_full_reg_size(s);
196
+ tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), vsz, vsz, ele);
197
+ tcg_temp_free_i64(ele);
198
+
199
+ /* If this insn used MOVPRFX, we may need a second move. */
200
+ if (a->rd != a->rn) {
201
+ TCGLabel *done = gen_new_label();
202
+ tcg_gen_br(done);
203
+
204
+ gen_set_label(over);
205
+ do_mov_z(s, a->rd, a->rn);
206
+
207
+ gen_set_label(done);
208
+ } else {
209
+ gen_set_label(over);
210
+ }
211
+ return true;
212
+}
213
+
214
+static bool trans_CLASTA_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
215
+{
216
+ return do_clast_vector(s, a, false);
217
+}
218
+
219
+static bool trans_CLASTB_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
220
+{
221
+ return do_clast_vector(s, a, true);
222
+}
223
+
224
+/* Compute CLAST for a scalar. */
225
+static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm,
226
+ bool before, TCGv_i64 reg_val)
227
+{
228
+ TCGv_i32 last = tcg_temp_new_i32();
229
+ TCGv_i64 ele, cmp, zero;
230
+
231
+ find_last_active(s, last, esz, pg);
232
+
233
+ /* Extend the original value of last prior to incrementing. */
234
+ cmp = tcg_temp_new_i64();
235
+ tcg_gen_ext_i32_i64(cmp, last);
236
+
237
+ if (!before) {
238
+ incr_last_active(s, last, esz);
239
+ }
240
+
241
+ /* The conceit here is that while last < 0 indicates not found, after
242
+ * adjusting for cpu_env->vfp.zregs[rm], it is still a valid address
243
+ * from which we can load garbage. We then discard the garbage with
244
+ * a conditional move.
245
+ */
246
+ ele = load_last_active(s, last, rm, esz);
247
+ tcg_temp_free_i32(last);
248
+
249
+ zero = tcg_const_i64(0);
250
+ tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val);
251
+
252
+ tcg_temp_free_i64(zero);
253
+ tcg_temp_free_i64(cmp);
254
+ tcg_temp_free_i64(ele);
255
+}
256
+
257
+/* Compute CLAST for a Vreg. */
258
+static bool do_clast_fp(DisasContext *s, arg_rpr_esz *a, bool before)
259
+{
260
+ if (sve_access_check(s)) {
261
+ int esz = a->esz;
262
+ int ofs = vec_reg_offset(s, a->rd, 0, esz);
263
+ TCGv_i64 reg = load_esz(cpu_env, ofs, esz);
264
+
265
+ do_clast_scalar(s, esz, a->pg, a->rn, before, reg);
266
+ write_fp_dreg(s, a->rd, reg);
267
+ tcg_temp_free_i64(reg);
268
+ }
269
+ return true;
270
+}
271
+
272
+static bool trans_CLASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
273
+{
274
+ return do_clast_fp(s, a, false);
275
+}
276
+
277
+static bool trans_CLASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
278
+{
279
+ return do_clast_fp(s, a, true);
280
+}
281
+
282
+/* Compute CLAST for a Xreg. */
283
+static bool do_clast_general(DisasContext *s, arg_rpr_esz *a, bool before)
284
+{
285
+ TCGv_i64 reg;
286
+
287
+ if (!sve_access_check(s)) {
288
+ return true;
289
+ }
290
+
291
+ reg = cpu_reg(s, a->rd);
292
+ switch (a->esz) {
293
+ case 0:
294
+ tcg_gen_ext8u_i64(reg, reg);
295
+ break;
296
+ case 1:
297
+ tcg_gen_ext16u_i64(reg, reg);
298
+ break;
299
+ case 2:
300
+ tcg_gen_ext32u_i64(reg, reg);
301
+ break;
302
+ case 3:
303
+ break;
304
+ default:
305
+ g_assert_not_reached();
306
+ }
307
+
308
+ do_clast_scalar(s, a->esz, a->pg, a->rn, before, reg);
309
+ return true;
310
+}
311
+
312
+static bool trans_CLASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
313
+{
314
+ return do_clast_general(s, a, false);
315
+}
316
+
317
+static bool trans_CLASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
318
+{
319
+ return do_clast_general(s, a, true);
320
+}
321
+
322
+/* Compute LAST for a scalar. */
323
+static TCGv_i64 do_last_scalar(DisasContext *s, int esz,
324
+ int pg, int rm, bool before)
325
+{
326
+ TCGv_i32 last = tcg_temp_new_i32();
327
+ TCGv_i64 ret;
328
+
329
+ find_last_active(s, last, esz, pg);
330
+ if (before) {
331
+ wrap_last_active(s, last, esz);
332
+ } else {
333
+ incr_last_active(s, last, esz);
334
+ }
335
+
336
+ ret = load_last_active(s, last, rm, esz);
337
+ tcg_temp_free_i32(last);
338
+ return ret;
339
+}
340
+
341
+/* Compute LAST for a Vreg. */
342
+static bool do_last_fp(DisasContext *s, arg_rpr_esz *a, bool before)
343
+{
344
+ if (sve_access_check(s)) {
345
+ TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
346
+ write_fp_dreg(s, a->rd, val);
347
+ tcg_temp_free_i64(val);
348
+ }
349
+ return true;
350
+}
351
+
352
+static bool trans_LASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
353
+{
354
+ return do_last_fp(s, a, false);
355
+}
356
+
357
+static bool trans_LASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
358
+{
359
+ return do_last_fp(s, a, true);
360
+}
361
+
362
+/* Compute LAST for a Xreg. */
363
+static bool do_last_general(DisasContext *s, arg_rpr_esz *a, bool before)
364
+{
365
+ if (sve_access_check(s)) {
366
+ TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
367
+ tcg_gen_mov_i64(cpu_reg(s, a->rd), val);
368
+ tcg_temp_free_i64(val);
369
+ }
370
+ return true;
371
+}
372
+
373
+static bool trans_LASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
374
+{
375
+ return do_last_general(s, a, false);
376
+}
377
+
378
+static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
379
+{
380
+ return do_last_general(s, a, true);
381
+}
382
+
383
/*
384
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
385
*/
386
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
387
index XXXXXXX..XXXXXXX 100644
388
--- a/target/arm/sve.decode
389
+++ b/target/arm/sve.decode
390
@@ -XXX,XX +XXX,XX @@ TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm
391
# Note esz >= 2
392
COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn
393
394
+# SVE conditionally broadcast element to vector
395
+CLASTA_z 00000101 .. 10100 0 100 ... ..... ..... @rdn_pg_rm
396
+CLASTB_z 00000101 .. 10100 1 100 ... ..... ..... @rdn_pg_rm
397
+
398
+# SVE conditionally copy element to SIMD&FP scalar
399
+CLASTA_v 00000101 .. 10101 0 100 ... ..... ..... @rd_pg_rn
400
+CLASTB_v 00000101 .. 10101 1 100 ... ..... ..... @rd_pg_rn
401
+
402
+# SVE conditionally copy element to general register
403
+CLASTA_r 00000101 .. 11000 0 101 ... ..... ..... @rd_pg_rn
404
+CLASTB_r 00000101 .. 11000 1 101 ... ..... ..... @rd_pg_rn
405
+
406
+# SVE copy element to SIMD&FP scalar register
407
+LASTA_v 00000101 .. 10001 0 100 ... ..... ..... @rd_pg_rn
408
+LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn
409
+
410
+# SVE copy element to general register
411
+LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn
412
+LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn
413
+
414
### SVE Predicate Logical Operations Group
415
416
# SVE predicate logical operations
417
--
267
--
418
2.17.1
268
2.20.1
419
269
420
270
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Do not yet convert the helpers to loop over opr_sz, but the
4
descriptor allows the vector tail to be cleared. Which fixes
5
an existing bug vs SVE.
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200514212831.31248-5-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-12-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/helper-sve.h | 115 +++++++++++++++++++++++
12
target/arm/helper.h | 12 ++--
9
target/arm/sve_helper.c | 187 +++++++++++++++++++++++++++++++++++++
13
target/arm/neon-dp.decode | 12 ++--
10
target/arm/translate-sve.c | 91 ++++++++++++++++++
14
target/arm/crypto_helper.c | 24 +++++--
11
target/arm/sve.decode | 24 +++++
15
target/arm/translate-a64.c | 34 ++++-----
12
4 files changed, 417 insertions(+)
16
target/arm/translate-neon.inc.c | 124 +++++---------------------------
17
target/arm/translate.c | 24 ++-----
18
6 files changed, 67 insertions(+), 163 deletions(-)
13
19
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
20
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
22
--- a/target/arm/helper.h
17
+++ b/target/arm/helper-sve.h
23
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
25
DEF_HELPER_FLAGS_3(crypto_aesmc, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
26
21
27
DEF_HELPER_FLAGS_4(crypto_sha1_3reg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG,
28
-DEF_HELPER_FLAGS_2(crypto_sha1h, TCG_CALL_NO_RWG, void, ptr, ptr)
23
+ i32, ptr, ptr, ptr, ptr, i32)
29
-DEF_HELPER_FLAGS_2(crypto_sha1su1, TCG_CALL_NO_RWG, void, ptr, ptr)
24
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_b, TCG_CALL_NO_RWG,
30
+DEF_HELPER_FLAGS_3(crypto_sha1h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
+ i32, ptr, ptr, ptr, ptr, i32)
31
+DEF_HELPER_FLAGS_3(crypto_sha1su1, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_b, TCG_CALL_NO_RWG,
32
27
+ i32, ptr, ptr, ptr, ptr, i32)
33
-DEF_HELPER_FLAGS_3(crypto_sha256h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
28
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_b, TCG_CALL_NO_RWG,
34
-DEF_HELPER_FLAGS_3(crypto_sha256h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
29
+ i32, ptr, ptr, ptr, ptr, i32)
35
-DEF_HELPER_FLAGS_2(crypto_sha256su0, TCG_CALL_NO_RWG, void, ptr, ptr)
30
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_b, TCG_CALL_NO_RWG,
36
-DEF_HELPER_FLAGS_3(crypto_sha256su1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
31
+ i32, ptr, ptr, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_4(crypto_sha256h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
32
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_b, TCG_CALL_NO_RWG,
38
+DEF_HELPER_FLAGS_4(crypto_sha256h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+ i32, ptr, ptr, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_3(crypto_sha256su0, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
34
+
40
+DEF_HELPER_FLAGS_4(crypto_sha256su1, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_h, TCG_CALL_NO_RWG,
41
36
+ i32, ptr, ptr, ptr, ptr, i32)
42
DEF_HELPER_FLAGS_4(crypto_sha512h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
37
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_h, TCG_CALL_NO_RWG,
43
DEF_HELPER_FLAGS_4(crypto_sha512h2, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
38
+ i32, ptr, ptr, ptr, ptr, i32)
44
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
39
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_h, TCG_CALL_NO_RWG,
45
index XXXXXXX..XXXXXXX 100644
40
+ i32, ptr, ptr, ptr, ptr, i32)
46
--- a/target/arm/neon-dp.decode
41
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_h, TCG_CALL_NO_RWG,
47
+++ b/target/arm/neon-dp.decode
42
+ i32, ptr, ptr, ptr, ptr, i32)
48
@@ -XXX,XX +XXX,XX @@ VPADD_3s 1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
43
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_h, TCG_CALL_NO_RWG,
49
44
+ i32, ptr, ptr, ptr, ptr, i32)
50
VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
45
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_h, TCG_CALL_NO_RWG,
51
46
+ i32, ptr, ptr, ptr, ptr, i32)
52
+@3same_crypto .... .... .... .... .... .... .... .... \
47
+
53
+ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0 q=1
48
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_s, TCG_CALL_NO_RWG,
54
+
49
+ i32, ptr, ptr, ptr, ptr, i32)
55
SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
50
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_s, TCG_CALL_NO_RWG,
56
vm=%vm_dp vn=%vn_dp vd=%vd_dp
51
+ i32, ptr, ptr, ptr, ptr, i32)
57
-SHA256H_3s 1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
52
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_s, TCG_CALL_NO_RWG,
58
- vm=%vm_dp vn=%vn_dp vd=%vd_dp
53
+ i32, ptr, ptr, ptr, ptr, i32)
59
-SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
54
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_s, TCG_CALL_NO_RWG,
60
- vm=%vm_dp vn=%vn_dp vd=%vd_dp
55
+ i32, ptr, ptr, ptr, ptr, i32)
61
-SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
56
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_s, TCG_CALL_NO_RWG,
62
- vm=%vm_dp vn=%vn_dp vd=%vd_dp
57
+ i32, ptr, ptr, ptr, ptr, i32)
63
+SHA256H_3s 1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... @3same_crypto
58
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_s, TCG_CALL_NO_RWG,
64
+SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... @3same_crypto
59
+ i32, ptr, ptr, ptr, ptr, i32)
65
+SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... @3same_crypto
60
+
66
61
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_d, TCG_CALL_NO_RWG,
67
VFMA_fp_3s 1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
62
+ i32, ptr, ptr, ptr, ptr, i32)
68
VFMS_fp_3s 1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
63
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_d, TCG_CALL_NO_RWG,
69
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
64
+ i32, ptr, ptr, ptr, ptr, i32)
70
index XXXXXXX..XXXXXXX 100644
65
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_d, TCG_CALL_NO_RWG,
71
--- a/target/arm/crypto_helper.c
66
+ i32, ptr, ptr, ptr, ptr, i32)
72
+++ b/target/arm/crypto_helper.c
67
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_d, TCG_CALL_NO_RWG,
73
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha1_3reg)(void *vd, void *vn, void *vm, uint32_t op)
68
+ i32, ptr, ptr, ptr, ptr, i32)
74
rd[1] = d.l[1];
69
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_d, TCG_CALL_NO_RWG,
75
}
70
+ i32, ptr, ptr, ptr, ptr, i32)
76
71
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_d, TCG_CALL_NO_RWG,
77
-void HELPER(crypto_sha1h)(void *vd, void *vm)
72
+ i32, ptr, ptr, ptr, ptr, i32)
78
+void HELPER(crypto_sha1h)(void *vd, void *vm, uint32_t desc)
73
+
79
{
74
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_b, TCG_CALL_NO_RWG,
80
uint64_t *rd = vd;
75
+ i32, ptr, ptr, ptr, ptr, i32)
81
uint64_t *rm = vm;
76
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_b, TCG_CALL_NO_RWG,
82
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha1h)(void *vd, void *vm)
77
+ i32, ptr, ptr, ptr, ptr, i32)
83
78
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_b, TCG_CALL_NO_RWG,
84
rd[0] = m.l[0];
79
+ i32, ptr, ptr, ptr, ptr, i32)
85
rd[1] = m.l[1];
80
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_b, TCG_CALL_NO_RWG,
86
+
81
+ i32, ptr, ptr, ptr, ptr, i32)
87
+ clear_tail_16(vd, desc);
82
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_b, TCG_CALL_NO_RWG,
88
}
83
+ i32, ptr, ptr, ptr, ptr, i32)
89
84
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_b, TCG_CALL_NO_RWG,
90
-void HELPER(crypto_sha1su1)(void *vd, void *vm)
85
+ i32, ptr, ptr, ptr, ptr, i32)
91
+void HELPER(crypto_sha1su1)(void *vd, void *vm, uint32_t desc)
86
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_b, TCG_CALL_NO_RWG,
92
{
87
+ i32, ptr, ptr, ptr, ptr, i32)
93
uint64_t *rd = vd;
88
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_b, TCG_CALL_NO_RWG,
94
uint64_t *rm = vm;
89
+ i32, ptr, ptr, ptr, ptr, i32)
95
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha1su1)(void *vd, void *vm)
90
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_b, TCG_CALL_NO_RWG,
96
91
+ i32, ptr, ptr, ptr, ptr, i32)
97
rd[0] = d.l[0];
92
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_b, TCG_CALL_NO_RWG,
98
rd[1] = d.l[1];
93
+ i32, ptr, ptr, ptr, ptr, i32)
99
+
94
+
100
+ clear_tail_16(vd, desc);
95
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_h, TCG_CALL_NO_RWG,
101
}
96
+ i32, ptr, ptr, ptr, ptr, i32)
102
97
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_h, TCG_CALL_NO_RWG,
103
/*
98
+ i32, ptr, ptr, ptr, ptr, i32)
104
@@ -XXX,XX +XXX,XX @@ static uint32_t s1(uint32_t x)
99
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_h, TCG_CALL_NO_RWG,
105
return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10);
100
+ i32, ptr, ptr, ptr, ptr, i32)
106
}
101
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_h, TCG_CALL_NO_RWG,
107
102
+ i32, ptr, ptr, ptr, ptr, i32)
108
-void HELPER(crypto_sha256h)(void *vd, void *vn, void *vm)
103
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_h, TCG_CALL_NO_RWG,
109
+void HELPER(crypto_sha256h)(void *vd, void *vn, void *vm, uint32_t desc)
104
+ i32, ptr, ptr, ptr, ptr, i32)
110
{
105
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_h, TCG_CALL_NO_RWG,
111
uint64_t *rd = vd;
106
+ i32, ptr, ptr, ptr, ptr, i32)
112
uint64_t *rn = vn;
107
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_h, TCG_CALL_NO_RWG,
113
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha256h)(void *vd, void *vn, void *vm)
108
+ i32, ptr, ptr, ptr, ptr, i32)
114
109
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_h, TCG_CALL_NO_RWG,
115
rd[0] = d.l[0];
110
+ i32, ptr, ptr, ptr, ptr, i32)
116
rd[1] = d.l[1];
111
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_h, TCG_CALL_NO_RWG,
117
+
112
+ i32, ptr, ptr, ptr, ptr, i32)
118
+ clear_tail_16(vd, desc);
113
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_h, TCG_CALL_NO_RWG,
119
}
114
+ i32, ptr, ptr, ptr, ptr, i32)
120
115
+
121
-void HELPER(crypto_sha256h2)(void *vd, void *vn, void *vm)
116
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_s, TCG_CALL_NO_RWG,
122
+void HELPER(crypto_sha256h2)(void *vd, void *vn, void *vm, uint32_t desc)
117
+ i32, ptr, ptr, ptr, ptr, i32)
123
{
118
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_s, TCG_CALL_NO_RWG,
124
uint64_t *rd = vd;
119
+ i32, ptr, ptr, ptr, ptr, i32)
125
uint64_t *rn = vn;
120
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_s, TCG_CALL_NO_RWG,
126
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha256h2)(void *vd, void *vn, void *vm)
121
+ i32, ptr, ptr, ptr, ptr, i32)
127
122
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_s, TCG_CALL_NO_RWG,
128
rd[0] = d.l[0];
123
+ i32, ptr, ptr, ptr, ptr, i32)
129
rd[1] = d.l[1];
124
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_s, TCG_CALL_NO_RWG,
130
+
125
+ i32, ptr, ptr, ptr, ptr, i32)
131
+ clear_tail_16(vd, desc);
126
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_s, TCG_CALL_NO_RWG,
132
}
127
+ i32, ptr, ptr, ptr, ptr, i32)
133
128
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_s, TCG_CALL_NO_RWG,
134
-void HELPER(crypto_sha256su0)(void *vd, void *vm)
129
+ i32, ptr, ptr, ptr, ptr, i32)
135
+void HELPER(crypto_sha256su0)(void *vd, void *vm, uint32_t desc)
130
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_s, TCG_CALL_NO_RWG,
136
{
131
+ i32, ptr, ptr, ptr, ptr, i32)
137
uint64_t *rd = vd;
132
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG,
138
uint64_t *rm = vm;
133
+ i32, ptr, ptr, ptr, ptr, i32)
139
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha256su0)(void *vd, void *vm)
134
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG,
140
135
+ i32, ptr, ptr, ptr, ptr, i32)
141
rd[0] = d.l[0];
136
+
142
rd[1] = d.l[1];
137
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
143
+
138
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
144
+ clear_tail_16(vd, desc);
139
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
145
}
140
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
146
141
index XXXXXXX..XXXXXXX 100644
147
-void HELPER(crypto_sha256su1)(void *vd, void *vn, void *vm)
142
--- a/target/arm/sve_helper.c
148
+void HELPER(crypto_sha256su1)(void *vd, void *vn, void *vm, uint32_t desc)
143
+++ b/target/arm/sve_helper.c
149
{
144
@@ -XXX,XX +XXX,XX @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
150
uint64_t *rd = vd;
145
return flags;
151
uint64_t *rn = vn;
146
}
152
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sha256su1)(void *vd, void *vn, void *vm)
147
153
148
+/* This is an iterative function, called for each Pd and Pg word
154
rd[0] = d.l[0];
149
+ * moving backward.
155
rd[1] = d.l[1];
150
+ */
156
+
151
+static uint32_t iter_predtest_bwd(uint64_t d, uint64_t g, uint32_t flags)
157
+ clear_tail_16(vd, desc);
152
+{
158
}
153
+ if (likely(g)) {
159
154
+ /* Compute C from first (i.e last) !(D & G).
160
/*
155
+ Use bit 2 to signal first G bit seen. */
161
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
156
+ if (!(flags & 4)) {
162
index XXXXXXX..XXXXXXX 100644
157
+ flags += 4 - 1; /* add bit 2, subtract C from PREDTEST_INIT */
163
--- a/target/arm/translate-a64.c
158
+ flags |= (d & pow2floor(g)) == 0;
164
+++ b/target/arm/translate-a64.c
159
+ }
165
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
160
+
166
int rm = extract32(insn, 16, 5);
161
+ /* Accumulate Z from each D & G. */
167
int rn = extract32(insn, 5, 5);
162
+ flags |= ((d & g) != 0) << 1;
168
int rd = extract32(insn, 0, 5);
163
+
169
- CryptoThreeOpFn *genfn;
164
+ /* Compute N from last (i.e first) D & G. Replace previous. */
170
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
165
+ flags = deposit32(flags, 31, 1, (d & (g & -g)) != 0);
171
+ gen_helper_gvec_3 *genfn;
172
bool feature;
173
174
if (size != 0) {
175
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
176
return;
177
}
178
179
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
180
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
181
- tcg_rm_ptr = vec_full_reg_ptr(s, rm);
182
-
183
if (genfn) {
184
- genfn(tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr);
185
+ gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn);
186
} else {
187
TCGv_i32 tcg_opcode = tcg_const_i32(opcode);
188
+ TCGv_ptr tcg_rd_ptr = vec_full_reg_ptr(s, rd);
189
+ TCGv_ptr tcg_rn_ptr = vec_full_reg_ptr(s, rn);
190
+ TCGv_ptr tcg_rm_ptr = vec_full_reg_ptr(s, rm);
191
192
gen_helper_crypto_sha1_3reg(tcg_rd_ptr, tcg_rn_ptr,
193
tcg_rm_ptr, tcg_opcode);
194
- tcg_temp_free_i32(tcg_opcode);
195
- }
196
197
- tcg_temp_free_ptr(tcg_rd_ptr);
198
- tcg_temp_free_ptr(tcg_rn_ptr);
199
- tcg_temp_free_ptr(tcg_rm_ptr);
200
+ tcg_temp_free_i32(tcg_opcode);
201
+ tcg_temp_free_ptr(tcg_rd_ptr);
202
+ tcg_temp_free_ptr(tcg_rn_ptr);
203
+ tcg_temp_free_ptr(tcg_rm_ptr);
166
+ }
204
+ }
167
+ return flags;
205
}
168
+}
206
169
+
207
/* Crypto two-reg SHA
170
/* The same for a single word predicate. */
208
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
171
uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g)
209
int opcode = extract32(insn, 12, 5);
172
{
210
int rn = extract32(insn, 5, 5);
173
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
211
int rd = extract32(insn, 0, 5);
174
d[i] = (pg[H1(i)] & 1 ? nn : mm);
212
- CryptoTwoOpFn *genfn;
213
+ gen_helper_gvec_2 *genfn;
214
bool feature;
215
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
216
217
if (size != 0) {
218
unallocated_encoding(s);
219
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
220
if (!fp_access_check(s)) {
221
return;
175
}
222
}
176
}
223
-
177
+
224
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
178
+/* Two operand comparison controlled by a predicate.
225
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
179
+ * ??? It is very tempting to want to be able to expand this inline
226
-
180
+ * with x86 instructions, e.g.
227
- genfn(tcg_rd_ptr, tcg_rn_ptr);
181
+ *
228
-
182
+ * vcmpeqw zm, zn, %ymm0
229
- tcg_temp_free_ptr(tcg_rd_ptr);
183
+ * vpmovmskb %ymm0, %eax
230
- tcg_temp_free_ptr(tcg_rn_ptr);
184
+ * and $0x5555, %eax
231
+ gen_gvec_op2_ool(s, true, rd, rn, 0, genfn);
185
+ * and pg, %eax
232
}
186
+ *
233
187
+ * or even aarch64, e.g.
234
static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)
188
+ *
235
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
189
+ * // mask = 4000 1000 0400 0100 0040 0010 0004 0001
236
index XXXXXXX..XXXXXXX 100644
190
+ * cmeq v0.8h, zn, zm
237
--- a/target/arm/translate-neon.inc.c
191
+ * and v0.8h, v0.8h, mask
238
+++ b/target/arm/translate-neon.inc.c
192
+ * addv h0, v0.8h
239
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
193
+ * and v0.8b, pg
240
DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
194
+ *
241
DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
195
+ * However, coming up with an abstraction that allows vector inputs and
242
196
+ * a scalar output, and also handles the byte-ordering of sub-uint64_t
243
-static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
197
+ * scalar outputs, is tricky.
244
- uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
198
+ */
245
-{
199
+#define DO_CMP_PPZZ(NAME, TYPE, OP, H, MASK) \
246
- tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz,
200
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
247
- 0, gen_helper_gvec_pmul_b);
201
+{ \
248
-}
202
+ intptr_t opr_sz = simd_oprsz(desc); \
249
+#define WRAP_OOL_FN(WRAPNAME, FUNC) \
203
+ uint32_t flags = PREDTEST_INIT; \
250
+ static void WRAPNAME(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, \
204
+ intptr_t i = opr_sz; \
251
+ uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz) \
205
+ do { \
252
+ { \
206
+ uint64_t out = 0, pg; \
253
+ tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, 0, FUNC); \
207
+ do { \
254
+ }
208
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
255
+
209
+ TYPE nn = *(TYPE *)(vn + H(i)); \
256
+WRAP_OOL_FN(gen_VMUL_p_3s, gen_helper_gvec_pmul_b)
210
+ TYPE mm = *(TYPE *)(vm + H(i)); \
257
211
+ out |= nn OP mm; \
258
static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
212
+ } while (i & 63); \
259
{
213
+ pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
260
@@ -XXX,XX +XXX,XX @@ static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
214
+ out &= pg; \
215
+ *(uint64_t *)(vd + (i >> 3)) = out; \
216
+ flags = iter_predtest_bwd(out, pg, flags); \
217
+ } while (i > 0); \
218
+ return flags; \
219
+}
220
+
221
+#define DO_CMP_PPZZ_B(NAME, TYPE, OP) \
222
+ DO_CMP_PPZZ(NAME, TYPE, OP, H1, 0xffffffffffffffffull)
223
+#define DO_CMP_PPZZ_H(NAME, TYPE, OP) \
224
+ DO_CMP_PPZZ(NAME, TYPE, OP, H1_2, 0x5555555555555555ull)
225
+#define DO_CMP_PPZZ_S(NAME, TYPE, OP) \
226
+ DO_CMP_PPZZ(NAME, TYPE, OP, H1_4, 0x1111111111111111ull)
227
+#define DO_CMP_PPZZ_D(NAME, TYPE, OP) \
228
+ DO_CMP_PPZZ(NAME, TYPE, OP, , 0x0101010101010101ull)
229
+
230
+DO_CMP_PPZZ_B(sve_cmpeq_ppzz_b, uint8_t, ==)
231
+DO_CMP_PPZZ_H(sve_cmpeq_ppzz_h, uint16_t, ==)
232
+DO_CMP_PPZZ_S(sve_cmpeq_ppzz_s, uint32_t, ==)
233
+DO_CMP_PPZZ_D(sve_cmpeq_ppzz_d, uint64_t, ==)
234
+
235
+DO_CMP_PPZZ_B(sve_cmpne_ppzz_b, uint8_t, !=)
236
+DO_CMP_PPZZ_H(sve_cmpne_ppzz_h, uint16_t, !=)
237
+DO_CMP_PPZZ_S(sve_cmpne_ppzz_s, uint32_t, !=)
238
+DO_CMP_PPZZ_D(sve_cmpne_ppzz_d, uint64_t, !=)
239
+
240
+DO_CMP_PPZZ_B(sve_cmpgt_ppzz_b, int8_t, >)
241
+DO_CMP_PPZZ_H(sve_cmpgt_ppzz_h, int16_t, >)
242
+DO_CMP_PPZZ_S(sve_cmpgt_ppzz_s, int32_t, >)
243
+DO_CMP_PPZZ_D(sve_cmpgt_ppzz_d, int64_t, >)
244
+
245
+DO_CMP_PPZZ_B(sve_cmpge_ppzz_b, int8_t, >=)
246
+DO_CMP_PPZZ_H(sve_cmpge_ppzz_h, int16_t, >=)
247
+DO_CMP_PPZZ_S(sve_cmpge_ppzz_s, int32_t, >=)
248
+DO_CMP_PPZZ_D(sve_cmpge_ppzz_d, int64_t, >=)
249
+
250
+DO_CMP_PPZZ_B(sve_cmphi_ppzz_b, uint8_t, >)
251
+DO_CMP_PPZZ_H(sve_cmphi_ppzz_h, uint16_t, >)
252
+DO_CMP_PPZZ_S(sve_cmphi_ppzz_s, uint32_t, >)
253
+DO_CMP_PPZZ_D(sve_cmphi_ppzz_d, uint64_t, >)
254
+
255
+DO_CMP_PPZZ_B(sve_cmphs_ppzz_b, uint8_t, >=)
256
+DO_CMP_PPZZ_H(sve_cmphs_ppzz_h, uint16_t, >=)
257
+DO_CMP_PPZZ_S(sve_cmphs_ppzz_s, uint32_t, >=)
258
+DO_CMP_PPZZ_D(sve_cmphs_ppzz_d, uint64_t, >=)
259
+
260
+#undef DO_CMP_PPZZ_B
261
+#undef DO_CMP_PPZZ_H
262
+#undef DO_CMP_PPZZ_S
263
+#undef DO_CMP_PPZZ_D
264
+#undef DO_CMP_PPZZ
265
+
266
+/* Similar, but the second source is "wide". */
267
+#define DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H, MASK) \
268
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
269
+{ \
270
+ intptr_t opr_sz = simd_oprsz(desc); \
271
+ uint32_t flags = PREDTEST_INIT; \
272
+ intptr_t i = opr_sz; \
273
+ do { \
274
+ uint64_t out = 0, pg; \
275
+ do { \
276
+ TYPEW mm = *(TYPEW *)(vm + i - 8); \
277
+ do { \
278
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
279
+ TYPE nn = *(TYPE *)(vn + H(i)); \
280
+ out |= nn OP mm; \
281
+ } while (i & 7); \
282
+ } while (i & 63); \
283
+ pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
284
+ out &= pg; \
285
+ *(uint64_t *)(vd + (i >> 3)) = out; \
286
+ flags = iter_predtest_bwd(out, pg, flags); \
287
+ } while (i > 0); \
288
+ return flags; \
289
+}
290
+
291
+#define DO_CMP_PPZW_B(NAME, TYPE, TYPEW, OP) \
292
+ DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1, 0xffffffffffffffffull)
293
+#define DO_CMP_PPZW_H(NAME, TYPE, TYPEW, OP) \
294
+ DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_2, 0x5555555555555555ull)
295
+#define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \
296
+ DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0x1111111111111111ull)
297
+
298
+DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t, uint64_t, ==)
299
+DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==)
300
+DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==)
301
+
302
+DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t, uint64_t, !=)
303
+DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=)
304
+DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=)
305
+
306
+DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t, int64_t, >)
307
+DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t, int64_t, >)
308
+DO_CMP_PPZW_S(sve_cmpgt_ppzw_s, int32_t, int64_t, >)
309
+
310
+DO_CMP_PPZW_B(sve_cmpge_ppzw_b, int8_t, int64_t, >=)
311
+DO_CMP_PPZW_H(sve_cmpge_ppzw_h, int16_t, int64_t, >=)
312
+DO_CMP_PPZW_S(sve_cmpge_ppzw_s, int32_t, int64_t, >=)
313
+
314
+DO_CMP_PPZW_B(sve_cmphi_ppzw_b, uint8_t, uint64_t, >)
315
+DO_CMP_PPZW_H(sve_cmphi_ppzw_h, uint16_t, uint64_t, >)
316
+DO_CMP_PPZW_S(sve_cmphi_ppzw_s, uint32_t, uint64_t, >)
317
+
318
+DO_CMP_PPZW_B(sve_cmphs_ppzw_b, uint8_t, uint64_t, >=)
319
+DO_CMP_PPZW_H(sve_cmphs_ppzw_h, uint16_t, uint64_t, >=)
320
+DO_CMP_PPZW_S(sve_cmphs_ppzw_s, uint32_t, uint64_t, >=)
321
+
322
+DO_CMP_PPZW_B(sve_cmplt_ppzw_b, int8_t, int64_t, <)
323
+DO_CMP_PPZW_H(sve_cmplt_ppzw_h, int16_t, int64_t, <)
324
+DO_CMP_PPZW_S(sve_cmplt_ppzw_s, int32_t, int64_t, <)
325
+
326
+DO_CMP_PPZW_B(sve_cmple_ppzw_b, int8_t, int64_t, <=)
327
+DO_CMP_PPZW_H(sve_cmple_ppzw_h, int16_t, int64_t, <=)
328
+DO_CMP_PPZW_S(sve_cmple_ppzw_s, int32_t, int64_t, <=)
329
+
330
+DO_CMP_PPZW_B(sve_cmplo_ppzw_b, uint8_t, uint64_t, <)
331
+DO_CMP_PPZW_H(sve_cmplo_ppzw_h, uint16_t, uint64_t, <)
332
+DO_CMP_PPZW_S(sve_cmplo_ppzw_s, uint32_t, uint64_t, <)
333
+
334
+DO_CMP_PPZW_B(sve_cmpls_ppzw_b, uint8_t, uint64_t, <=)
335
+DO_CMP_PPZW_H(sve_cmpls_ppzw_h, uint16_t, uint64_t, <=)
336
+DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=)
337
+
338
+#undef DO_CMP_PPZW_B
339
+#undef DO_CMP_PPZW_H
340
+#undef DO_CMP_PPZW_S
341
+#undef DO_CMP_PPZW
342
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
343
index XXXXXXX..XXXXXXX 100644
344
--- a/target/arm/translate-sve.c
345
+++ b/target/arm/translate-sve.c
346
@@ -XXX,XX +XXX,XX @@
347
#include "trace-tcg.h"
348
#include "translate-a64.h"
349
350
+
351
+typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
352
+ TCGv_ptr, TCGv_ptr, TCGv_i32);
353
+
354
/*
355
* Helpers for extracting complex instruction fields.
356
*/
357
@@ -XXX,XX +XXX,XX @@ static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
358
return true;
261
return true;
359
}
262
}
360
263
361
+/*
264
-static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
362
+ *** SVE Integer Compare - Vectors Group
265
-{
363
+ */
266
- TCGv_ptr ptr1, ptr2, ptr3;
364
+
267
-
365
+static bool do_ppzz_flags(DisasContext *s, arg_rprr_esz *a,
268
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
366
+ gen_helper_gvec_flags_4 *gen_fn)
269
- !dc_isar_feature(aa32_sha2, s)) {
367
+{
270
- return false;
368
+ TCGv_ptr pd, zn, zm, pg;
271
+#define DO_SHA2(NAME, FUNC) \
369
+ unsigned vsz;
272
+ WRAP_OOL_FN(gen_##NAME##_3s, FUNC) \
370
+ TCGv_i32 t;
273
+ static bool trans_##NAME##_3s(DisasContext *s, arg_3same *a) \
371
+
274
+ { \
372
+ if (gen_fn == NULL) {
275
+ if (!dc_isar_feature(aa32_sha2, s)) { \
373
+ return false;
276
+ return false; \
374
+ }
277
+ } \
375
+ if (!sve_access_check(s)) {
278
+ return do_3same(s, a, gen_##NAME##_3s); \
376
+ return true;
279
}
377
+ }
280
378
+
281
- /* UNDEF accesses to D16-D31 if they don't exist. */
379
+ vsz = vec_full_reg_size(s);
282
- if (!dc_isar_feature(aa32_simd_r32, s) &&
380
+ t = tcg_const_i32(simd_desc(vsz, vsz, 0));
283
- ((a->vd | a->vn | a->vm) & 0x10)) {
381
+ pd = tcg_temp_new_ptr();
284
- return false;
382
+ zn = tcg_temp_new_ptr();
285
- }
383
+ zm = tcg_temp_new_ptr();
286
-
384
+ pg = tcg_temp_new_ptr();
287
- if ((a->vn | a->vm | a->vd) & 1) {
385
+
288
- return false;
386
+ tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd));
289
- }
387
+ tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn));
290
-
388
+ tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm));
291
- if (!vfp_access_check(s)) {
389
+ tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
292
- return true;
390
+
293
- }
391
+ gen_fn(t, pd, zn, zm, pg, t);
294
-
392
+
295
- ptr1 = vfp_reg_ptr(true, a->vd);
393
+ tcg_temp_free_ptr(pd);
296
- ptr2 = vfp_reg_ptr(true, a->vn);
394
+ tcg_temp_free_ptr(zn);
297
- ptr3 = vfp_reg_ptr(true, a->vm);
395
+ tcg_temp_free_ptr(zm);
298
- gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
396
+ tcg_temp_free_ptr(pg);
299
- tcg_temp_free_ptr(ptr1);
397
+
300
- tcg_temp_free_ptr(ptr2);
398
+ do_pred_flags(t);
301
- tcg_temp_free_ptr(ptr3);
399
+
302
-
400
+ tcg_temp_free_i32(t);
303
- return true;
401
+ return true;
304
-}
402
+}
305
-
403
+
306
-static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
404
+#define DO_PPZZ(NAME, name) \
307
-{
405
+static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \
308
- TCGv_ptr ptr1, ptr2, ptr3;
406
+ uint32_t insn) \
309
-
407
+{ \
310
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
408
+ static gen_helper_gvec_flags_4 * const fns[4] = { \
311
- !dc_isar_feature(aa32_sha2, s)) {
409
+ gen_helper_sve_##name##_ppzz_b, gen_helper_sve_##name##_ppzz_h, \
312
- return false;
410
+ gen_helper_sve_##name##_ppzz_s, gen_helper_sve_##name##_ppzz_d, \
313
- }
411
+ }; \
314
-
412
+ return do_ppzz_flags(s, a, fns[a->esz]); \
315
- /* UNDEF accesses to D16-D31 if they don't exist. */
413
+}
316
- if (!dc_isar_feature(aa32_simd_r32, s) &&
414
+
317
- ((a->vd | a->vn | a->vm) & 0x10)) {
415
+DO_PPZZ(CMPEQ, cmpeq)
318
- return false;
416
+DO_PPZZ(CMPNE, cmpne)
319
- }
417
+DO_PPZZ(CMPGT, cmpgt)
320
-
418
+DO_PPZZ(CMPGE, cmpge)
321
- if ((a->vn | a->vm | a->vd) & 1) {
419
+DO_PPZZ(CMPHI, cmphi)
322
- return false;
420
+DO_PPZZ(CMPHS, cmphs)
323
- }
421
+
324
-
422
+#undef DO_PPZZ
325
- if (!vfp_access_check(s)) {
423
+
326
- return true;
424
+#define DO_PPZW(NAME, name) \
327
- }
425
+static bool trans_##NAME##_ppzw(DisasContext *s, arg_rprr_esz *a, \
328
-
426
+ uint32_t insn) \
329
- ptr1 = vfp_reg_ptr(true, a->vd);
427
+{ \
330
- ptr2 = vfp_reg_ptr(true, a->vn);
428
+ static gen_helper_gvec_flags_4 * const fns[4] = { \
331
- ptr3 = vfp_reg_ptr(true, a->vm);
429
+ gen_helper_sve_##name##_ppzw_b, gen_helper_sve_##name##_ppzw_h, \
332
- gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
430
+ gen_helper_sve_##name##_ppzw_s, NULL \
333
- tcg_temp_free_ptr(ptr1);
431
+ }; \
334
- tcg_temp_free_ptr(ptr2);
432
+ return do_ppzz_flags(s, a, fns[a->esz]); \
335
- tcg_temp_free_ptr(ptr3);
433
+}
336
-
434
+
337
- return true;
435
+DO_PPZW(CMPEQ, cmpeq)
338
-}
436
+DO_PPZW(CMPNE, cmpne)
339
-
437
+DO_PPZW(CMPGT, cmpgt)
340
-static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
438
+DO_PPZW(CMPGE, cmpge)
341
-{
439
+DO_PPZW(CMPHI, cmphi)
342
- TCGv_ptr ptr1, ptr2, ptr3;
440
+DO_PPZW(CMPHS, cmphs)
343
-
441
+DO_PPZW(CMPLT, cmplt)
344
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
442
+DO_PPZW(CMPLE, cmple)
345
- !dc_isar_feature(aa32_sha2, s)) {
443
+DO_PPZW(CMPLO, cmplo)
346
- return false;
444
+DO_PPZW(CMPLS, cmpls)
347
- }
445
+
348
-
446
+#undef DO_PPZW
349
- /* UNDEF accesses to D16-D31 if they don't exist. */
447
+
350
- if (!dc_isar_feature(aa32_simd_r32, s) &&
448
/*
351
- ((a->vd | a->vn | a->vm) & 0x10)) {
449
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
352
- return false;
450
*/
353
- }
451
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
354
-
452
index XXXXXXX..XXXXXXX 100644
355
- if ((a->vn | a->vm | a->vd) & 1) {
453
--- a/target/arm/sve.decode
356
- return false;
454
+++ b/target/arm/sve.decode
357
- }
455
@@ -XXX,XX +XXX,XX @@
358
-
456
@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
359
- if (!vfp_access_check(s)) {
457
&rprr_esz rm=%reg_movprfx
360
- return true;
458
@rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz
361
- }
459
+@pd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 . rd:4 &rprr_esz
362
-
460
363
- ptr1 = vfp_reg_ptr(true, a->vd);
461
# Three register operand, with governing predicate, vector element size
364
- ptr2 = vfp_reg_ptr(true, a->vn);
462
@rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
365
- ptr3 = vfp_reg_ptr(true, a->vm);
463
@@ -XXX,XX +XXX,XX @@ SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
366
- gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
464
# SVE select vector elements (predicated)
367
- tcg_temp_free_ptr(ptr1);
465
SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm
368
- tcg_temp_free_ptr(ptr2);
466
369
- tcg_temp_free_ptr(ptr3);
467
+### SVE Integer Compare - Vectors Group
370
-
468
+
371
- return true;
469
+# SVE integer compare_vectors
372
-}
470
+CMPHS_ppzz 00100100 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_rm
373
+DO_SHA2(SHA256H, gen_helper_crypto_sha256h)
471
+CMPHI_ppzz 00100100 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_rm
374
+DO_SHA2(SHA256H2, gen_helper_crypto_sha256h2)
472
+CMPGE_ppzz 00100100 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_rm
375
+DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1)
473
+CMPGT_ppzz 00100100 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_rm
376
474
+CMPEQ_ppzz 00100100 .. 0 ..... 101 ... ..... 0 .... @pd_pg_rn_rm
377
#define DO_3SAME_64(INSN, FUNC) \
475
+CMPNE_ppzz 00100100 .. 0 ..... 101 ... ..... 1 .... @pd_pg_rn_rm
378
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
476
+
379
diff --git a/target/arm/translate.c b/target/arm/translate.c
477
+# SVE integer compare with wide elements
380
index XXXXXXX..XXXXXXX 100644
478
+# Note these require esz != 3.
381
--- a/target/arm/translate.c
479
+CMPEQ_ppzw 00100100 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_rm
382
+++ b/target/arm/translate.c
480
+CMPNE_ppzw 00100100 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_rm
383
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
481
+CMPGE_ppzw 00100100 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm
384
int vec_size;
482
+CMPGT_ppzw 00100100 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm
385
uint32_t imm;
483
+CMPLT_ppzw 00100100 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm
386
TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
484
+CMPLE_ppzw 00100100 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm
387
- TCGv_ptr ptr1, ptr2;
485
+CMPHS_ppzw 00100100 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm
388
+ TCGv_ptr ptr1;
486
+CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm
389
TCGv_i64 tmp64;
487
+CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm
390
488
+CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm
391
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
489
+
392
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
490
### SVE Predicate Logical Operations Group
393
if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
491
394
return 1;
492
# SVE predicate logical operations
395
}
396
- ptr1 = vfp_reg_ptr(true, rd);
397
- ptr2 = vfp_reg_ptr(true, rm);
398
-
399
- gen_helper_crypto_sha1h(ptr1, ptr2);
400
-
401
- tcg_temp_free_ptr(ptr1);
402
- tcg_temp_free_ptr(ptr2);
403
+ tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
404
+ gen_helper_crypto_sha1h);
405
break;
406
case NEON_2RM_SHA1SU1:
407
if ((rm | rd) & 1) {
408
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
409
} else if (!dc_isar_feature(aa32_sha1, s)) {
410
return 1;
411
}
412
- ptr1 = vfp_reg_ptr(true, rd);
413
- ptr2 = vfp_reg_ptr(true, rm);
414
- if (q) {
415
- gen_helper_crypto_sha256su0(ptr1, ptr2);
416
- } else {
417
- gen_helper_crypto_sha1su1(ptr1, ptr2);
418
- }
419
- tcg_temp_free_ptr(ptr1);
420
- tcg_temp_free_ptr(ptr2);
421
+ tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
422
+ q ? gen_helper_crypto_sha256su0
423
+ : gen_helper_crypto_sha1su1);
424
break;
425
-
426
case NEON_2RM_VMVN:
427
tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
428
break;
493
--
429
--
494
2.17.1
430
2.20.1
495
431
496
432
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Rather than passing an opcode to a helper, fully decode the
4
operation at translate time. Use clear_tail_16 to zap the
5
balance of the SVE register with the AdvSIMD write.
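
clear_tail_16() itself is not shown in this patch; as a rough sketch (an
assumption about its behaviour rather than code from the series, using
QEMU's simd_oprsz()/simd_maxsz() descriptor accessors), it zeroes
everything past the 16-byte AdvSIMD result so the rest of the SVE
register reads back as zero:

    /* Sketch only: assumed behaviour of clear_tail_16(). */
    static inline void clear_tail_16_sketch(void *vd, uint32_t desc)
    {
        intptr_t opr_sz = simd_oprsz(desc);  /* bytes produced by this op */
        intptr_t max_sz = simd_maxsz(desc);  /* full vector length in bytes */

        assert(opr_sz == 16);                /* crypto results are 128-bit */
        memset((char *)vd + opr_sz, 0, max_sz - opr_sz);
    }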
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200514212831.31248-6-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-18-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/helper-sve.h | 25 +++++++
12
target/arm/helper.h | 5 +-
9
target/arm/sve_helper.c | 41 +++++++++++
13
target/arm/neon-dp.decode | 6 +-
10
target/arm/translate-sve.c | 144 +++++++++++++++++++++++++++++++++++++
14
target/arm/crypto_helper.c | 99 +++++++++++++++++++++------------
11
target/arm/sve.decode | 26 +++++++
15
target/arm/translate-a64.c | 29 ++++------
12
4 files changed, 236 insertions(+)
16
target/arm/translate-neon.inc.c | 46 ++++-----------
13
17
5 files changed, 93 insertions(+), 92 deletions(-)
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
18
15
index XXXXXXX..XXXXXXX 100644
19
diff --git a/target/arm/helper.h b/target/arm/helper.h
16
--- a/target/arm/helper-sve.h
20
index XXXXXXX..XXXXXXX 100644
17
+++ b/target/arm/helper-sve.h
21
--- a/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
22
+++ b/target/arm/helper.h
19
DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
23
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qzip32, TCG_CALL_NO_RWG, void, ptr, ptr)
20
24
DEF_HELPER_FLAGS_4(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32)
25
DEF_HELPER_FLAGS_3(crypto_aesmc, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
22
+
26
23
+DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
27
-DEF_HELPER_FLAGS_4(crypto_sha1_3reg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
28
+DEF_HELPER_FLAGS_4(crypto_sha1su0, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_4(sve_subri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
29
+DEF_HELPER_FLAGS_4(crypto_sha1c, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(sve_subri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
30
+DEF_HELPER_FLAGS_4(crypto_sha1p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
+
31
+DEF_HELPER_FLAGS_4(crypto_sha1m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(sve_smaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
32
DEF_HELPER_FLAGS_3(crypto_sha1h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(sve_smaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
33
DEF_HELPER_FLAGS_3(crypto_sha1su1, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(sve_smaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
34
31
+DEF_HELPER_FLAGS_4(sve_smaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
35
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
32
+
36
index XXXXXXX..XXXXXXX 100644
33
+DEF_HELPER_FLAGS_4(sve_smini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
37
--- a/target/arm/neon-dp.decode
34
+DEF_HELPER_FLAGS_4(sve_smini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
38
+++ b/target/arm/neon-dp.decode
35
+DEF_HELPER_FLAGS_4(sve_smini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
39
@@ -XXX,XX +XXX,XX @@ VQRDMLAH_3s 1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
36
+DEF_HELPER_FLAGS_4(sve_smini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
40
@3same_crypto .... .... .... .... .... .... .... .... \
37
+
41
&3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0 q=1
38
+DEF_HELPER_FLAGS_4(sve_umaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
42
39
+DEF_HELPER_FLAGS_4(sve_umaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
43
-SHA1_3s 1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
40
+DEF_HELPER_FLAGS_4(sve_umaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
44
- vm=%vm_dp vn=%vn_dp vd=%vd_dp
41
+DEF_HELPER_FLAGS_4(sve_umaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
45
+SHA1C_3s 1111 001 0 0 . 00 .... .... 1100 . 1 . 0 .... @3same_crypto
42
+
46
+SHA1P_3s 1111 001 0 0 . 01 .... .... 1100 . 1 . 0 .... @3same_crypto
43
+DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
47
+SHA1M_3s 1111 001 0 0 . 10 .... .... 1100 . 1 . 0 .... @3same_crypto
44
+DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
48
+SHA1SU0_3s 1111 001 0 0 . 11 .... .... 1100 . 1 . 0 .... @3same_crypto
45
+DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
49
SHA256H_3s 1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... @3same_crypto
46
+DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
50
SHA256H2_3s 1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... @3same_crypto
47
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
51
SHA256SU1_3s 1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... @3same_crypto
48
index XXXXXXX..XXXXXXX 100644
52
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
49
--- a/target/arm/sve_helper.c
53
index XXXXXXX..XXXXXXX 100644
50
+++ b/target/arm/sve_helper.c
54
--- a/target/arm/crypto_helper.c
51
@@ -XXX,XX +XXX,XX @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
55
+++ b/target/arm/crypto_helper.c
52
#undef DO_VPZ
56
@@ -XXX,XX +XXX,XX @@ union CRYPTO_STATE {
53
#undef DO_VPZ_D
57
};
54
58
55
+/* Two vector operand, one scalar operand, unpredicated. */
59
#ifdef HOST_WORDS_BIGENDIAN
56
+#define DO_ZZI(NAME, TYPE, OP) \
60
-#define CR_ST_BYTE(state, i) (state.bytes[(15 - (i)) ^ 8])
57
+void HELPER(NAME)(void *vd, void *vn, uint64_t s64, uint32_t desc) \
61
-#define CR_ST_WORD(state, i) (state.words[(3 - (i)) ^ 2])
58
+{ \
62
+#define CR_ST_BYTE(state, i) ((state).bytes[(15 - (i)) ^ 8])
59
+ intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \
63
+#define CR_ST_WORD(state, i) ((state).words[(3 - (i)) ^ 2])
60
+ TYPE s = s64, *d = vd, *n = vn; \
64
#else
61
+ for (i = 0; i < opr_sz; ++i) { \
65
-#define CR_ST_BYTE(state, i) (state.bytes[i])
62
+ d[i] = OP(n[i], s); \
66
-#define CR_ST_WORD(state, i) (state.words[i])
63
+ } \
67
+#define CR_ST_BYTE(state, i) ((state).bytes[i])
64
+}
68
+#define CR_ST_WORD(state, i) ((state).words[i])
65
+
69
#endif
66
+#define DO_SUBR(X, Y) (Y - X)
70
67
+
71
/*
68
+DO_ZZI(sve_subri_b, uint8_t, DO_SUBR)
72
@@ -XXX,XX +XXX,XX @@ static uint32_t maj(uint32_t x, uint32_t y, uint32_t z)
69
+DO_ZZI(sve_subri_h, uint16_t, DO_SUBR)
73
return (x & y) | ((x | y) & z);
70
+DO_ZZI(sve_subri_s, uint32_t, DO_SUBR)
71
+DO_ZZI(sve_subri_d, uint64_t, DO_SUBR)
72
+
73
+DO_ZZI(sve_smaxi_b, int8_t, DO_MAX)
74
+DO_ZZI(sve_smaxi_h, int16_t, DO_MAX)
75
+DO_ZZI(sve_smaxi_s, int32_t, DO_MAX)
76
+DO_ZZI(sve_smaxi_d, int64_t, DO_MAX)
77
+
78
+DO_ZZI(sve_smini_b, int8_t, DO_MIN)
79
+DO_ZZI(sve_smini_h, int16_t, DO_MIN)
80
+DO_ZZI(sve_smini_s, int32_t, DO_MIN)
81
+DO_ZZI(sve_smini_d, int64_t, DO_MIN)
82
+
83
+DO_ZZI(sve_umaxi_b, uint8_t, DO_MAX)
84
+DO_ZZI(sve_umaxi_h, uint16_t, DO_MAX)
85
+DO_ZZI(sve_umaxi_s, uint32_t, DO_MAX)
86
+DO_ZZI(sve_umaxi_d, uint64_t, DO_MAX)
87
+
88
+DO_ZZI(sve_umini_b, uint8_t, DO_MIN)
89
+DO_ZZI(sve_umini_h, uint16_t, DO_MIN)
90
+DO_ZZI(sve_umini_s, uint32_t, DO_MIN)
91
+DO_ZZI(sve_umini_d, uint64_t, DO_MIN)
92
+
93
+#undef DO_ZZI
94
+
95
#undef DO_AND
96
#undef DO_ORR
97
#undef DO_EOR
98
@@ -XXX,XX +XXX,XX @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
99
#undef DO_ASR
100
#undef DO_LSR
101
#undef DO_LSL
102
+#undef DO_SUBR
103
104
/* Similar to the ARM LastActiveElement pseudocode function, except the
105
result is multiplied by the element size. This includes the not found
106
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
107
index XXXXXXX..XXXXXXX 100644
108
--- a/target/arm/translate-sve.c
109
+++ b/target/arm/translate-sve.c
110
@@ -XXX,XX +XXX,XX @@ static inline int expand_imm_sh8s(int x)
111
return (int8_t)x << (x & 0x100 ? 8 : 0);
112
}
74
}
113
75
114
+static inline int expand_imm_sh8u(int x)
76
-void HELPER(crypto_sha1_3reg)(void *vd, void *vn, void *vm, uint32_t op)
115
+{
77
+void HELPER(crypto_sha1su0)(void *vd, void *vn, void *vm, uint32_t desc)
116
+ return (uint8_t)x << (x & 0x100 ? 8 : 0);
78
+{
117
+}
79
+ uint64_t *d = vd, *n = vn, *m = vm;
118
+
80
+ uint64_t d0, d1;
119
/*
81
+
120
* Include the generated decoder.
82
+ d0 = d[1] ^ d[0] ^ m[0];
121
*/
83
+ d1 = n[0] ^ d[1] ^ m[1];
122
@@ -XXX,XX +XXX,XX @@ static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn)
84
+ d[0] = d0;
123
return true;
85
+ d[1] = d1;
86
+
87
+ clear_tail_16(vd, desc);
88
+}
89
+
90
+static inline void crypto_sha1_3reg(uint64_t *rd, uint64_t *rn,
91
+ uint64_t *rm, uint32_t desc,
92
+ uint32_t (*fn)(union CRYPTO_STATE *d))
93
{
94
- uint64_t *rd = vd;
95
- uint64_t *rn = vn;
96
- uint64_t *rm = vm;
97
union CRYPTO_STATE d = { .l = { rd[0], rd[1] } };
98
union CRYPTO_STATE n = { .l = { rn[0], rn[1] } };
99
union CRYPTO_STATE m = { .l = { rm[0], rm[1] } };
100
+ int i;
101
102
- if (op == 3) { /* sha1su0 */
103
- d.l[0] ^= d.l[1] ^ m.l[0];
104
- d.l[1] ^= n.l[0] ^ m.l[1];
105
- } else {
106
- int i;
107
+ for (i = 0; i < 4; i++) {
108
+ uint32_t t = fn(&d);
109
110
- for (i = 0; i < 4; i++) {
111
- uint32_t t;
112
+ t += rol32(CR_ST_WORD(d, 0), 5) + CR_ST_WORD(n, 0)
113
+ + CR_ST_WORD(m, i);
114
115
- switch (op) {
116
- case 0: /* sha1c */
117
- t = cho(CR_ST_WORD(d, 1), CR_ST_WORD(d, 2), CR_ST_WORD(d, 3));
118
- break;
119
- case 1: /* sha1p */
120
- t = par(CR_ST_WORD(d, 1), CR_ST_WORD(d, 2), CR_ST_WORD(d, 3));
121
- break;
122
- case 2: /* sha1m */
123
- t = maj(CR_ST_WORD(d, 1), CR_ST_WORD(d, 2), CR_ST_WORD(d, 3));
124
- break;
125
- default:
126
- g_assert_not_reached();
127
- }
128
- t += rol32(CR_ST_WORD(d, 0), 5) + CR_ST_WORD(n, 0)
129
- + CR_ST_WORD(m, i);
130
-
131
- CR_ST_WORD(n, 0) = CR_ST_WORD(d, 3);
132
- CR_ST_WORD(d, 3) = CR_ST_WORD(d, 2);
133
- CR_ST_WORD(d, 2) = ror32(CR_ST_WORD(d, 1), 2);
134
- CR_ST_WORD(d, 1) = CR_ST_WORD(d, 0);
135
- CR_ST_WORD(d, 0) = t;
136
- }
137
+ CR_ST_WORD(n, 0) = CR_ST_WORD(d, 3);
138
+ CR_ST_WORD(d, 3) = CR_ST_WORD(d, 2);
139
+ CR_ST_WORD(d, 2) = ror32(CR_ST_WORD(d, 1), 2);
140
+ CR_ST_WORD(d, 1) = CR_ST_WORD(d, 0);
141
+ CR_ST_WORD(d, 0) = t;
142
}
143
rd[0] = d.l[0];
144
rd[1] = d.l[1];
145
+
146
+ clear_tail_16(rd, desc);
147
+}
148
+
149
+static uint32_t do_sha1c(union CRYPTO_STATE *d)
150
+{
151
+ return cho(CR_ST_WORD(*d, 1), CR_ST_WORD(*d, 2), CR_ST_WORD(*d, 3));
152
+}
153
+
154
+void HELPER(crypto_sha1c)(void *vd, void *vn, void *vm, uint32_t desc)
155
+{
156
+ crypto_sha1_3reg(vd, vn, vm, desc, do_sha1c);
157
+}
158
+
159
+static uint32_t do_sha1p(union CRYPTO_STATE *d)
160
+{
161
+ return par(CR_ST_WORD(*d, 1), CR_ST_WORD(*d, 2), CR_ST_WORD(*d, 3));
162
+}
163
+
164
+void HELPER(crypto_sha1p)(void *vd, void *vn, void *vm, uint32_t desc)
165
+{
166
+ crypto_sha1_3reg(vd, vn, vm, desc, do_sha1p);
167
+}
168
+
169
+static uint32_t do_sha1m(union CRYPTO_STATE *d)
170
+{
171
+ return maj(CR_ST_WORD(*d, 1), CR_ST_WORD(*d, 2), CR_ST_WORD(*d, 3));
172
+}
173
+
174
+void HELPER(crypto_sha1m)(void *vd, void *vn, void *vm, uint32_t desc)
175
+{
176
+ crypto_sha1_3reg(vd, vn, vm, desc, do_sha1m);
124
}
177
}
125
178
126
+static bool trans_ADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
179
void HELPER(crypto_sha1h)(void *vd, void *vm, uint32_t desc)
127
+{
180
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
128
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
181
index XXXXXXX..XXXXXXX 100644
129
+ return false;
182
--- a/target/arm/translate-a64.c
130
+ }
183
+++ b/target/arm/translate-a64.c
131
+ if (sve_access_check(s)) {
184
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
132
+ unsigned vsz = vec_full_reg_size(s);
185
133
+ tcg_gen_gvec_addi(a->esz, vec_full_reg_offset(s, a->rd),
186
switch (opcode) {
134
+ vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
187
case 0: /* SHA1C */
135
+ }
188
+ genfn = gen_helper_crypto_sha1c;
136
+ return true;
189
+ feature = dc_isar_feature(aa64_sha1, s);
137
+}
190
+ break;
138
+
191
case 1: /* SHA1P */
139
+static bool trans_SUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
192
+ genfn = gen_helper_crypto_sha1p;
140
+{
193
+ feature = dc_isar_feature(aa64_sha1, s);
141
+ a->imm = -a->imm;
194
+ break;
142
+ return trans_ADD_zzi(s, a, insn);
195
case 2: /* SHA1M */
143
+}
196
+ genfn = gen_helper_crypto_sha1m;
144
+
197
+ feature = dc_isar_feature(aa64_sha1, s);
145
+static bool trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
198
+ break;
146
+{
199
case 3: /* SHA1SU0 */
147
+ static const GVecGen2s op[4] = {
200
- genfn = NULL;
148
+ { .fni8 = tcg_gen_vec_sub8_i64,
201
+ genfn = gen_helper_crypto_sha1su0;
149
+ .fniv = tcg_gen_sub_vec,
202
feature = dc_isar_feature(aa64_sha1, s);
150
+ .fno = gen_helper_sve_subri_b,
203
break;
151
+ .opc = INDEX_op_sub_vec,
204
case 4: /* SHA256H */
152
+ .vece = MO_8,
205
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
153
+ .scalar_first = true },
206
if (!fp_access_check(s)) {
154
+ { .fni8 = tcg_gen_vec_sub16_i64,
207
return;
155
+ .fniv = tcg_gen_sub_vec,
208
}
156
+ .fno = gen_helper_sve_subri_h,
209
-
157
+ .opc = INDEX_op_sub_vec,
210
- if (genfn) {
158
+ .vece = MO_16,
211
- gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn);
159
+ .scalar_first = true },
212
- } else {
160
+ { .fni4 = tcg_gen_sub_i32,
213
- TCGv_i32 tcg_opcode = tcg_const_i32(opcode);
161
+ .fniv = tcg_gen_sub_vec,
214
- TCGv_ptr tcg_rd_ptr = vec_full_reg_ptr(s, rd);
162
+ .fno = gen_helper_sve_subri_s,
215
- TCGv_ptr tcg_rn_ptr = vec_full_reg_ptr(s, rn);
163
+ .opc = INDEX_op_sub_vec,
216
- TCGv_ptr tcg_rm_ptr = vec_full_reg_ptr(s, rm);
164
+ .vece = MO_32,
217
-
165
+ .scalar_first = true },
218
- gen_helper_crypto_sha1_3reg(tcg_rd_ptr, tcg_rn_ptr,
166
+ { .fni8 = tcg_gen_sub_i64,
219
- tcg_rm_ptr, tcg_opcode);
167
+ .fniv = tcg_gen_sub_vec,
220
-
168
+ .fno = gen_helper_sve_subri_d,
221
- tcg_temp_free_i32(tcg_opcode);
169
+ .opc = INDEX_op_sub_vec,
222
- tcg_temp_free_ptr(tcg_rd_ptr);
170
+ .prefer_i64 = TCG_TARGET_REG_BITS == 64,
223
- tcg_temp_free_ptr(tcg_rn_ptr);
171
+ .vece = MO_64,
224
- tcg_temp_free_ptr(tcg_rm_ptr);
172
+ .scalar_first = true }
225
- }
173
+ };
226
+ gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn);
174
+
227
}
175
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
228
176
+ return false;
229
/* Crypto two-reg SHA
177
+ }
230
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
178
+ if (sve_access_check(s)) {
231
index XXXXXXX..XXXXXXX 100644
179
+ unsigned vsz = vec_full_reg_size(s);
232
--- a/target/arm/translate-neon.inc.c
180
+ TCGv_i64 c = tcg_const_i64(a->imm);
233
+++ b/target/arm/translate-neon.inc.c
181
+ tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd),
234
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
182
+ vec_full_reg_offset(s, a->rn),
235
DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
183
+ vsz, vsz, c, &op[a->esz]);
236
DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
184
+ tcg_temp_free_i64(c);
237
185
+ }
238
-static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
186
+ return true;
239
-{
187
+}
240
- TCGv_ptr ptr1, ptr2, ptr3;
188
+
241
- TCGv_i32 tmp;
189
+static bool trans_MUL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
242
-
190
+{
243
- if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
191
+ if (sve_access_check(s)) {
244
- !dc_isar_feature(aa32_sha1, s)) {
192
+ unsigned vsz = vec_full_reg_size(s);
245
- return false;
193
+ tcg_gen_gvec_muli(a->esz, vec_full_reg_offset(s, a->rd),
246
+#define DO_SHA1(NAME, FUNC) \
194
+ vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
247
+ WRAP_OOL_FN(gen_##NAME##_3s, FUNC) \
195
+ }
248
+ static bool trans_##NAME##_3s(DisasContext *s, arg_3same *a) \
196
+ return true;
249
+ { \
197
+}
250
+ if (!dc_isar_feature(aa32_sha1, s)) { \
198
+
251
+ return false; \
199
+static bool do_zzi_sat(DisasContext *s, arg_rri_esz *a, uint32_t insn,
252
+ } \
200
+ bool u, bool d)
253
+ return do_3same(s, a, gen_##NAME##_3s); \
201
+{
254
}
202
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
255
203
+ return false;
256
- /* UNDEF accesses to D16-D31 if they don't exist. */
204
+ }
257
- if (!dc_isar_feature(aa32_simd_r32, s) &&
205
+ if (sve_access_check(s)) {
258
- ((a->vd | a->vn | a->vm) & 0x10)) {
206
+ TCGv_i64 val = tcg_const_i64(a->imm);
259
- return false;
207
+ do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d);
260
- }
208
+ tcg_temp_free_i64(val);
261
-
209
+ }
262
- if ((a->vn | a->vm | a->vd) & 1) {
210
+ return true;
263
- return false;
211
+}
264
- }
212
+
265
-
213
+static bool trans_SQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
266
- if (!vfp_access_check(s)) {
214
+{
267
- return true;
215
+ return do_zzi_sat(s, a, insn, false, false);
268
- }
216
+}
269
-
217
+
270
- ptr1 = vfp_reg_ptr(true, a->vd);
218
+static bool trans_UQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
271
- ptr2 = vfp_reg_ptr(true, a->vn);
219
+{
272
- ptr3 = vfp_reg_ptr(true, a->vm);
220
+ return do_zzi_sat(s, a, insn, true, false);
273
- tmp = tcg_const_i32(a->optype);
221
+}
274
- gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
222
+
275
- tcg_temp_free_i32(tmp);
223
+static bool trans_SQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
276
- tcg_temp_free_ptr(ptr1);
224
+{
277
- tcg_temp_free_ptr(ptr2);
225
+ return do_zzi_sat(s, a, insn, false, true);
278
- tcg_temp_free_ptr(ptr3);
226
+}
279
-
227
+
280
- return true;
228
+static bool trans_UQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
281
-}
229
+{
282
+DO_SHA1(SHA1C, gen_helper_crypto_sha1c)
230
+ return do_zzi_sat(s, a, insn, true, true);
283
+DO_SHA1(SHA1P, gen_helper_crypto_sha1p)
231
+}
284
+DO_SHA1(SHA1M, gen_helper_crypto_sha1m)
232
+
285
+DO_SHA1(SHA1SU0, gen_helper_crypto_sha1su0)
233
+static bool do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i *fn)
286
234
+{
287
#define DO_SHA2(NAME, FUNC) \
235
+ if (sve_access_check(s)) {
288
WRAP_OOL_FN(gen_##NAME##_3s, FUNC) \
236
+ unsigned vsz = vec_full_reg_size(s);
237
+ TCGv_i64 c = tcg_const_i64(a->imm);
238
+
239
+ tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
240
+ vec_full_reg_offset(s, a->rn),
241
+ c, vsz, vsz, 0, fn);
242
+ tcg_temp_free_i64(c);
243
+ }
244
+ return true;
245
+}
246
+
247
+#define DO_ZZI(NAME, name) \
248
+static bool trans_##NAME##_zzi(DisasContext *s, arg_rri_esz *a, \
249
+ uint32_t insn) \
250
+{ \
251
+ static gen_helper_gvec_2i * const fns[4] = { \
252
+ gen_helper_sve_##name##i_b, gen_helper_sve_##name##i_h, \
253
+ gen_helper_sve_##name##i_s, gen_helper_sve_##name##i_d, \
254
+ }; \
255
+ return do_zzi_ool(s, a, fns[a->esz]); \
256
+}
257
+
258
+DO_ZZI(SMAX, smax)
259
+DO_ZZI(UMAX, umax)
260
+DO_ZZI(SMIN, smin)
261
+DO_ZZI(UMIN, umin)
262
+
263
+#undef DO_ZZI
264
+
265
/*
266
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
267
*/
268
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
269
index XXXXXXX..XXXXXXX 100644
270
--- a/target/arm/sve.decode
271
+++ b/target/arm/sve.decode
272
@@ -XXX,XX +XXX,XX @@
273
274
# Signed 8-bit immediate, optionally shifted left by 8.
275
%sh8_i8s 5:9 !function=expand_imm_sh8s
276
+# Unsigned 8-bit immediate, optionally shifted left by 8.
277
+%sh8_i8u 5:9 !function=expand_imm_sh8u
278
279
# Either a copy of rd (at bit 0), or a different source
280
# as propagated via the MOVPRFX instruction.
281
@@ -XXX,XX +XXX,XX @@
282
@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz
283
@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \
284
&rrr_esz rn=%reg_movprfx
285
+@rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \
286
+ &rri_esz rn=%reg_movprfx imm=%sh8_i8u
287
+@rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \
288
+ &rri_esz rn=%reg_movprfx
289
+@rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \
290
+ &rri_esz rn=%reg_movprfx
291
292
# Three operand with "memory" size, aka immediate left shift
293
@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri
294
@@ -XXX,XX +XXX,XX @@ FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5
295
# SVE broadcast integer immediate (unpredicated)
296
DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=%sh8_i8s
297
298
+# SVE integer add/subtract immediate (unpredicated)
299
+ADD_zzi 00100101 .. 100 000 11 . ........ ..... @rdn_sh_i8u
300
+SUB_zzi 00100101 .. 100 001 11 . ........ ..... @rdn_sh_i8u
301
+SUBR_zzi 00100101 .. 100 011 11 . ........ ..... @rdn_sh_i8u
302
+SQADD_zzi 00100101 .. 100 100 11 . ........ ..... @rdn_sh_i8u
303
+UQADD_zzi 00100101 .. 100 101 11 . ........ ..... @rdn_sh_i8u
304
+SQSUB_zzi 00100101 .. 100 110 11 . ........ ..... @rdn_sh_i8u
305
+UQSUB_zzi 00100101 .. 100 111 11 . ........ ..... @rdn_sh_i8u
306
+
307
+# SVE integer min/max immediate (unpredicated)
308
+SMAX_zzi 00100101 .. 101 000 110 ........ ..... @rdn_i8s
309
+UMAX_zzi 00100101 .. 101 001 110 ........ ..... @rdn_i8u
310
+SMIN_zzi 00100101 .. 101 010 110 ........ ..... @rdn_i8s
311
+UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
312
+
313
+# SVE integer multiply immediate (unpredicated)
314
+MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
315
+
316
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
317
318
# SVE load predicate register
319
--
289
--
320
2.17.1
290
2.20.1
321
291
322
292
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Richard Henderson <richard.henderson@linaro.org>
2
2
3
Rather than passing an opcode to a helper, fully decode the
4
operation at translate time. Use clear_tail_16 to zap the
5
balance of the SVE register with the AdvSIMD write.
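
For orientation, the DO_SM3TT() macro added in the crypto_helper.c hunk
below expands to one small wrapper per operation; a hand-expanded sketch
of one instance (illustrative only, the real name goes through QEMU's
HELPER() naming macro):

    /* Hand-expanded sketch of DO_SM3TT(crypto_sm3tt1a, 0). */
    void helper_crypto_sm3tt1a(void *vd, void *vn, void *vm, uint32_t desc)
    {
        /* The opcode (0 here) is a compile-time constant, so the opcode
         * checks inside crypto_sm3tt() fold away during compilation. */
        crypto_sm3tt(vd, vn, vm, desc, 0);
    }

Because crypto_sm3tt() is QEMU_ALWAYS_INLINE and each wrapper passes a
constant opcode, the dispatch is resolved at build time, which is
presumably why the impossible branch can use qemu_build_not_reached()
rather than a runtime assertion.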
6
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-id: 20200514212831.31248-7-richard.henderson@linaro.org
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-6-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
11
---
8
target/arm/helper-sve.h | 3 +++
12
target/arm/helper.h | 5 ++++-
9
target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++++++++++
13
target/arm/crypto_helper.c | 24 ++++++++++++++++++------
10
target/arm/translate-sve.c | 12 ++++++++++++
14
target/arm/translate-a64.c | 21 +++++----------------
11
target/arm/sve.decode | 6 ++++++
15
3 files changed, 27 insertions(+), 23 deletions(-)
12
4 files changed, 55 insertions(+)
13
16
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
17
diff --git a/target/arm/helper.h b/target/arm/helper.h
15
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
19
--- a/target/arm/helper.h
17
+++ b/target/arm/helper-sve.h
20
+++ b/target/arm/helper.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crypto_sha512su0, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
DEF_HELPER_FLAGS_4(crypto_sha512su1, TCG_CALL_NO_RWG,
20
DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
void, ptr, ptr, ptr, i32)
21
24
22
+DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
-DEF_HELPER_FLAGS_5(crypto_sm3tt, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32, i32)
23
+DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+DEF_HELPER_FLAGS_4(crypto_sm3tt1a, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
27
+DEF_HELPER_FLAGS_4(crypto_sm3tt1b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(crypto_sm3tt2a, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(crypto_sm3tt2b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
DEF_HELPER_FLAGS_4(crypto_sm3partw1, TCG_CALL_NO_RWG,
31
void, ptr, ptr, ptr, i32)
32
DEF_HELPER_FLAGS_4(crypto_sm3partw2, TCG_CALL_NO_RWG,
33
diff --git a/target/arm/crypto_helper.c b/target/arm/crypto_helper.c
34
index XXXXXXX..XXXXXXX 100644
35
--- a/target/arm/crypto_helper.c
36
+++ b/target/arm/crypto_helper.c
37
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm3partw2)(void *vd, void *vn, void *vm, uint32_t desc)
38
clear_tail_16(vd, desc);
39
}
40
41
-void HELPER(crypto_sm3tt)(void *vd, void *vn, void *vm, uint32_t imm2,
42
- uint32_t opcode)
43
+static inline void QEMU_ALWAYS_INLINE
44
+crypto_sm3tt(uint64_t *rd, uint64_t *rn, uint64_t *rm,
45
+ uint32_t desc, uint32_t opcode)
46
{
47
- uint64_t *rd = vd;
48
- uint64_t *rn = vn;
49
- uint64_t *rm = vm;
50
union CRYPTO_STATE d = { .l = { rd[0], rd[1] } };
51
union CRYPTO_STATE n = { .l = { rn[0], rn[1] } };
52
union CRYPTO_STATE m = { .l = { rm[0], rm[1] } };
53
+ uint32_t imm2 = simd_data(desc);
54
uint32_t t;
55
56
assert(imm2 < 4);
57
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm3tt)(void *vd, void *vn, void *vm, uint32_t imm2,
58
/* SM3TT2B */
59
t = cho(CR_ST_WORD(d, 3), CR_ST_WORD(d, 2), CR_ST_WORD(d, 1));
60
} else {
61
- g_assert_not_reached();
62
+ qemu_build_not_reached();
63
}
64
65
t += CR_ST_WORD(d, 0) + CR_ST_WORD(m, imm2);
66
@@ -XXX,XX +XXX,XX @@ void HELPER(crypto_sm3tt)(void *vd, void *vn, void *vm, uint32_t imm2,
67
68
rd[0] = d.l[0];
69
rd[1] = d.l[1];
24
+
70
+
25
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
71
+ clear_tail_16(rd, desc);
26
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
72
}
27
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
73
28
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
74
+#define DO_SM3TT(NAME, OPCODE) \
75
+ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
76
+ { crypto_sm3tt(vd, vn, vm, desc, OPCODE); }
77
+
78
+DO_SM3TT(crypto_sm3tt1a, 0)
79
+DO_SM3TT(crypto_sm3tt1b, 1)
80
+DO_SM3TT(crypto_sm3tt2a, 2)
81
+DO_SM3TT(crypto_sm3tt2b, 3)
82
+
83
+#undef DO_SM3TT
84
+
85
static uint8_t const sm4_sbox[] = {
86
0xd6, 0x90, 0xe9, 0xfe, 0xcc, 0xe1, 0x3d, 0xb7,
87
0x16, 0xb6, 0x14, 0xc2, 0x28, 0xfb, 0x2c, 0x05,
88
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
29
index XXXXXXX..XXXXXXX 100644
89
index XXXXXXX..XXXXXXX 100644
30
--- a/target/arm/sve_helper.c
90
--- a/target/arm/translate-a64.c
31
+++ b/target/arm/sve_helper.c
91
+++ b/target/arm/translate-a64.c
32
@@ -XXX,XX +XXX,XX @@ DO_TRN(sve_trn_d, uint64_t, )
92
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
33
#undef DO_ZIP
93
*/
34
#undef DO_UZP
94
static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
35
#undef DO_TRN
95
{
36
+
96
+ static gen_helper_gvec_3 * const fns[4] = {
37
+void HELPER(sve_compact_s)(void *vd, void *vn, void *vg, uint32_t desc)
97
+ gen_helper_crypto_sm3tt1a, gen_helper_crypto_sm3tt1b,
38
+{
98
+ gen_helper_crypto_sm3tt2a, gen_helper_crypto_sm3tt2b,
39
+ intptr_t i, j, opr_sz = simd_oprsz(desc) / 4;
99
+ };
40
+ uint32_t *d = vd, *n = vn;
100
int opcode = extract32(insn, 10, 2);
41
+ uint8_t *pg = vg;
101
int imm2 = extract32(insn, 12, 2);
42
+
102
int rm = extract32(insn, 16, 5);
43
+ for (i = j = 0; i < opr_sz; i++) {
103
int rn = extract32(insn, 5, 5);
44
+ if (pg[H1(i / 2)] & (i & 1 ? 0x10 : 0x01)) {
104
int rd = extract32(insn, 0, 5);
45
+ d[H4(j)] = n[H4(i)];
105
- TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
46
+ j++;
106
- TCGv_i32 tcg_imm2, tcg_opcode;
47
+ }
107
48
+ }
108
if (!dc_isar_feature(aa64_sm3, s)) {
49
+ for (; j < opr_sz; j++) {
109
unallocated_encoding(s);
50
+ d[H4(j)] = 0;
110
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
51
+ }
111
return;
52
+}
112
}
53
+
113
54
+void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
114
- tcg_rd_ptr = vec_full_reg_ptr(s, rd);
55
+{
115
- tcg_rn_ptr = vec_full_reg_ptr(s, rn);
56
+ intptr_t i, j, opr_sz = simd_oprsz(desc) / 8;
116
- tcg_rm_ptr = vec_full_reg_ptr(s, rm);
57
+ uint64_t *d = vd, *n = vn;
117
- tcg_imm2 = tcg_const_i32(imm2);
58
+ uint8_t *pg = vg;
118
- tcg_opcode = tcg_const_i32(opcode);
59
+
119
-
60
+ for (i = j = 0; i < opr_sz; i++) {
120
- gen_helper_crypto_sm3tt(tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr, tcg_imm2,
61
+ if (pg[H1(i)] & 1) {
121
- tcg_opcode);
62
+ d[j] = n[i];
122
-
63
+ j++;
123
- tcg_temp_free_ptr(tcg_rd_ptr);
64
+ }
124
- tcg_temp_free_ptr(tcg_rn_ptr);
65
+ }
125
- tcg_temp_free_ptr(tcg_rm_ptr);
66
+ for (; j < opr_sz; j++) {
126
- tcg_temp_free_i32(tcg_imm2);
67
+ d[j] = 0;
127
- tcg_temp_free_i32(tcg_opcode);
68
+ }
128
+ gen_gvec_op3_ool(s, true, rd, rn, rm, imm2, fns[opcode]);
69
+}
70
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/target/arm/translate-sve.c
73
+++ b/target/arm/translate-sve.c
74
@@ -XXX,XX +XXX,XX @@ static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
75
return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
76
}
129
}
77
130
78
+/*
131
/* C3.6 Data processing - SIMD, inc Crypto
79
+ *** SVE Permute Vector - Predicated Group
80
+ */
81
+
82
+static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
83
+{
84
+ static gen_helper_gvec_3 * const fns[4] = {
85
+ NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d
86
+ };
87
+ return do_zpz_ool(s, a, fns[a->esz]);
88
+}
89
+
90
/*
91
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
92
*/
93
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
94
index XXXXXXX..XXXXXXX 100644
95
--- a/target/arm/sve.decode
96
+++ b/target/arm/sve.decode
97
@@ -XXX,XX +XXX,XX @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm
98
TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm
99
TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm
100
101
+### SVE Permute - Predicated Group
102
+
103
+# SVE compress active elements
104
+# Note esz >= 2
105
+COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn
106
+
107
### SVE Predicate Logical Operations Group
108
109
# SVE predicate logical operations
110
--
132
--
111
2.17.1
133
2.20.1
112
134
113
135
1
Convert the pckbd device away from using the old_mmio field
1
From: Philippe Mathieu-Daudé <f4bug@amsat.org>
2
of MemoryRegionOps. This change only affects the memory-mapped
3
variant of the i8042, which is used by the Unicore32 'puv3'
4
board and the MIPS Jazz boards 'magnum' and 'pica61'.
5
2
3
The ADC region size is 256B, split as:
4
- [0x00 - 0x4f] defined
5
- [0x50 - 0xff] reserved
6
7
All registers are 32-bit (thus when the datasheet mentions the
8
last defined register is 0x4c, it means its address range is
9
0x4c .. 0x4f.
10
11
This model implementation is also 32-bit. Set MemoryRegionOps
12
'impl' fields.
13
14
See:
15
'RM0033 Reference manual Rev 8', Table 10.13.18 "ADC register map".
16
17
Reported-by: Seth Kintigh <skintigh@gmail.com>
18
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
19
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
20
Message-id: 20200603055915.17678-1-f4bug@amsat.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Message-id: 20180601141223.26630-6-peter.maydell@linaro.org
9
---
22
---
10
hw/input/pckbd.c | 14 ++++++++------
23
hw/adc/stm32f2xx_adc.c | 4 +++-
11
1 file changed, 8 insertions(+), 6 deletions(-)
24
1 file changed, 3 insertions(+), 1 deletion(-)
12
25
13
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
26
diff --git a/hw/adc/stm32f2xx_adc.c b/hw/adc/stm32f2xx_adc.c
14
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
15
--- a/hw/input/pckbd.c
28
--- a/hw/adc/stm32f2xx_adc.c
16
+++ b/hw/input/pckbd.c
29
+++ b/hw/adc/stm32f2xx_adc.c
17
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_kbd = {
30
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps stm32f2xx_adc_ops = {
31
.read = stm32f2xx_adc_read,
32
.write = stm32f2xx_adc_write,
33
.endianness = DEVICE_NATIVE_ENDIAN,
34
+ .impl.min_access_size = 4,
35
+ .impl.max_access_size = 4,
18
};
36
};
19
37
20
/* Memory mapped interface */
38
static const VMStateDescription vmstate_stm32f2xx_adc = {
21
-static uint32_t kbd_mm_readb (void *opaque, hwaddr addr)
39
@@ -XXX,XX +XXX,XX @@ static void stm32f2xx_adc_init(Object *obj)
22
+static uint64_t kbd_mm_readfn(void *opaque, hwaddr addr, unsigned size)
40
sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->irq);
23
{
41
24
KBDState *s = opaque;
42
memory_region_init_io(&s->mmio, obj, &stm32f2xx_adc_ops, s,
25
43
- TYPE_STM32F2XX_ADC, 0xFF);
26
@@ -XXX,XX +XXX,XX @@ static uint32_t kbd_mm_readb (void *opaque, hwaddr addr)
44
+ TYPE_STM32F2XX_ADC, 0x100);
27
return kbd_read_data(s, 0, 1) & 0xff;
45
sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->mmio);
28
}
46
}
29
47
30
-static void kbd_mm_writeb (void *opaque, hwaddr addr, uint32_t value)
31
+static void kbd_mm_writefn(void *opaque, hwaddr addr,
32
+ uint64_t value, unsigned size)
33
{
34
KBDState *s = opaque;
35
36
@@ -XXX,XX +XXX,XX @@ static void kbd_mm_writeb (void *opaque, hwaddr addr, uint32_t value)
37
kbd_write_data(s, 0, value & 0xff, 1);
38
}
39
40
+
41
static const MemoryRegionOps i8042_mmio_ops = {
42
+ .read = kbd_mm_readfn,
43
+ .write = kbd_mm_writefn,
44
+ .valid.min_access_size = 1,
45
+ .valid.max_access_size = 4,
46
.endianness = DEVICE_NATIVE_ENDIAN,
47
- .old_mmio = {
48
- .read = { kbd_mm_readb, kbd_mm_readb, kbd_mm_readb },
49
- .write = { kbd_mm_writeb, kbd_mm_writeb, kbd_mm_writeb },
50
- },
51
};
52
53
void i8042_mm_init(qemu_irq kbd_irq, qemu_irq mouse_irq,
54
--
48
--
55
2.17.1
49
2.20.1
56
50
57
51
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Thomas Huth <thuth@redhat.com>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
As described by Edgar here:
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
5
Message-id: 20180613015641.5667-11-richard.henderson@linaro.org
5
https://www.mail-archive.com/qemu-devel@nongnu.org/msg605124.html
6
7
we can use the Ubuntu kernel for testing the xlnx-versal-virt machine.
8
So let's add a boot test for this now.
9
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Signed-off-by: Thomas Huth <thuth@redhat.com>
13
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
14
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
15
Message-id: 20200525141237.15243-1-thuth@redhat.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
17
---
8
target/arm/helper-sve.h | 9 +++++++
18
tests/acceptance/boot_linux_console.py | 26 ++++++++++++++++++++++++++
9
target/arm/sve_helper.c | 55 ++++++++++++++++++++++++++++++++++++++
19
1 file changed, 26 insertions(+)
10
target/arm/translate-sve.c | 2 ++
11
target/arm/sve.decode | 6 +++++
12
4 files changed, 72 insertions(+)
13
20
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
21
diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
15
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
23
--- a/tests/acceptance/boot_linux_console.py
17
+++ b/target/arm/helper-sve.h
24
+++ b/tests/acceptance/boot_linux_console.py
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
25
@@ -XXX,XX +XXX,XX @@ class BootLinuxConsole(LinuxKernelTest):
19
DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
26
console_pattern = 'Kernel command line: %s' % kernel_command_line
20
void, ptr, ptr, ptr, ptr, i32)
27
self.wait_for_console_pattern(console_pattern)
21
28
22
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_b, TCG_CALL_NO_RWG,
29
+ def test_aarch64_xlnx_versal_virt(self):
23
+ void, ptr, ptr, ptr, ptr, i32)
30
+ """
24
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_h, TCG_CALL_NO_RWG,
31
+ :avocado: tags=arch:aarch64
25
+ void, ptr, ptr, ptr, ptr, i32)
32
+ :avocado: tags=machine:xlnx-versal-virt
26
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
33
+ :avocado: tags=device:pl011
27
+ void, ptr, ptr, ptr, ptr, i32)
34
+ :avocado: tags=device:arm_gicv3
28
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
35
+ """
29
+ void, ptr, ptr, ptr, ptr, i32)
36
+ kernel_url = ('http://ports.ubuntu.com/ubuntu-ports/dists/'
37
+ 'bionic-updates/main/installer-arm64/current/images/'
38
+ 'netboot/ubuntu-installer/arm64/linux')
39
+ kernel_hash = '5bfc54cf7ed8157d93f6e5b0241e727b6dc22c50'
40
+ kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
30
+
41
+
31
DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
42
+ initrd_url = ('http://ports.ubuntu.com/ubuntu-ports/dists/'
32
void, ptr, ptr, ptr, ptr, i32)
43
+ 'bionic-updates/main/installer-arm64/current/images/'
33
DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
44
+ 'netboot/ubuntu-installer/arm64/initrd.gz')
34
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
45
+ initrd_hash = 'd385d3e88d53e2004c5d43cbe668b458a094f772'
35
index XXXXXXX..XXXXXXX 100644
46
+ initrd_path = self.fetch_asset(initrd_url, asset_hash=initrd_hash)
36
--- a/target/arm/sve_helper.c
37
+++ b/target/arm/sve_helper.c
38
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
39
}
40
swap_memmove(vd + len, vm, opr_sz * 8 - len);
41
}
42
+
47
+
43
+void HELPER(sve_sel_zpzz_b)(void *vd, void *vn, void *vm,
48
+ self.vm.set_console()
44
+ void *vg, uint32_t desc)
49
+ self.vm.add_args('-m', '2G',
45
+{
50
+ '-kernel', kernel_path,
46
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
51
+ '-initrd', initrd_path)
47
+ uint64_t *d = vd, *n = vn, *m = vm;
52
+ self.vm.launch()
48
+ uint8_t *pg = vg;
53
+ self.wait_for_console_pattern('Checked W+X mappings: passed')
49
+
54
+
50
+ for (i = 0; i < opr_sz; i += 1) {
55
def test_arm_virt(self):
51
+ uint64_t nn = n[i], mm = m[i];
56
"""
52
+ uint64_t pp = expand_pred_b(pg[H1(i)]);
57
:avocado: tags=arch:arm
53
+ d[i] = (nn & pp) | (mm & ~pp);
54
+ }
55
+}
56
+
57
+void HELPER(sve_sel_zpzz_h)(void *vd, void *vn, void *vm,
58
+ void *vg, uint32_t desc)
59
+{
60
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
61
+ uint64_t *d = vd, *n = vn, *m = vm;
62
+ uint8_t *pg = vg;
63
+
64
+ for (i = 0; i < opr_sz; i += 1) {
65
+ uint64_t nn = n[i], mm = m[i];
66
+ uint64_t pp = expand_pred_h(pg[H1(i)]);
67
+ d[i] = (nn & pp) | (mm & ~pp);
68
+ }
69
+}
70
+
71
+void HELPER(sve_sel_zpzz_s)(void *vd, void *vn, void *vm,
72
+ void *vg, uint32_t desc)
73
+{
74
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
75
+ uint64_t *d = vd, *n = vn, *m = vm;
76
+ uint8_t *pg = vg;
77
+
78
+ for (i = 0; i < opr_sz; i += 1) {
79
+ uint64_t nn = n[i], mm = m[i];
80
+ uint64_t pp = expand_pred_s(pg[H1(i)]);
81
+ d[i] = (nn & pp) | (mm & ~pp);
82
+ }
83
+}
84
+
85
+void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
86
+ void *vg, uint32_t desc)
87
+{
88
+ intptr_t i, opr_sz = simd_oprsz(desc) / 8;
89
+ uint64_t *d = vd, *n = vn, *m = vm;
90
+ uint8_t *pg = vg;
91
+
92
+ for (i = 0; i < opr_sz; i += 1) {
93
+ uint64_t nn = n[i], mm = m[i];
94
+ d[i] = (pg[H1(i)] & 1 ? nn : mm);
95
+ }
96
+}
97
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
98
index XXXXXXX..XXXXXXX 100644
99
--- a/target/arm/translate-sve.c
100
+++ b/target/arm/translate-sve.c
101
@@ -XXX,XX +XXX,XX @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
102
return do_zpzz_ool(s, a, fns[a->esz]);
103
}
104
105
+DO_ZPZZ(SEL, sel)
106
+
107
#undef DO_ZPZZ
108
109
/*
110
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
111
index XXXXXXX..XXXXXXX 100644
112
--- a/target/arm/sve.decode
113
+++ b/target/arm/sve.decode
114
@@ -XXX,XX +XXX,XX @@
115
&rprr_esz rn=%reg_movprfx
116
@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
117
&rprr_esz rm=%reg_movprfx
118
+@rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz
119
120
# Three register operand, with governing predicate, vector element size
121
@rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
122
@@ -XXX,XX +XXX,XX @@ RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
123
# SVE vector splice (predicated)
124
SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
125
126
+### SVE Select Vectors Group
127
+
128
+# SVE select vector elements (predicated)
129
+SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm
130
+
131
### SVE Predicate Logical Operations Group
132
133
# SVE predicate logical operations
134
--
58
--
135
2.17.1
59
2.20.1
136
60
137
61
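
The sve_sel_zpzz_* helpers above all reduce to the same bitwise merge:
expand the governing predicate into a per-lane mask, then take each lane
from either Zn or Zm. The standalone sketch below replays the byte-element
case; this expand_pred_b() is a naive stand-in for QEMU's table-driven
version, and the input values are made up for illustration:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Expand each of the low 8 predicate bits into an all-ones byte lane. */
    static uint64_t expand_pred_b(uint8_t pg)
    {
        uint64_t mask = 0;
        for (int i = 0; i < 8; i++) {
            if (pg & (1u << i)) {
                mask |= 0xffull << (i * 8);
            }
        }
        return mask;
    }

    int main(void)
    {
        uint64_t nn = 0x1111111111111111ull;  /* Zn */
        uint64_t mm = 0x2222222222222222ull;  /* Zm */
        uint8_t pg = 0x0f;                    /* low four byte lanes active */

        uint64_t pp = expand_pred_b(pg);
        uint64_t d = (nn & pp) | (mm & ~pp);  /* same merge as the helper */

        printf("%016" PRIx64 "\n", d);        /* prints 2222222211111111 */
        return 0;
    }
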
1
From: Cédric Le Goater <clg@kaod.org>
1
From: Cédric Le Goater <clg@kaod.org>
2
2
3
On Macronix chips, two bytes can be written to the WRSR. The first byte will
4
configure the status register and the second the configuration
5
register. It is important to save the configuration value as it
6
contains the dummy cycle setting when using dual or quad IO mode.
7
8
Signed-off-by: Cédric Le Goater <clg@kaod.org>
3
Signed-off-by: Cédric Le Goater <clg@kaod.org>
9
Acked-by: Alistair Francis <alistair.francis@wdc.com>
4
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
5
Message-id: 20200602135050.593692-1-clg@kaod.org
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
7
---
12
hw/block/m25p80.c | 1 +
8
docs/system/arm/aspeed.rst | 85 ++++++++++++++++++++++++++++++++++++++
13
1 file changed, 1 insertion(+)
9
docs/system/target-arm.rst | 1 +
10
2 files changed, 86 insertions(+)
11
create mode 100644 docs/system/arm/aspeed.rst
14
12
15
diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
13
diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
14
new file mode 100644
15
index XXXXXXX..XXXXXXX
16
--- /dev/null
17
+++ b/docs/system/arm/aspeed.rst
18
@@ -XXX,XX +XXX,XX @@
19
+Aspeed family boards (``*-bmc``, ``ast2500-evb``, ``ast2600-evb``)
20
+==================================================================
21
+
22
+The QEMU Aspeed machines model BMCs of various OpenPOWER systems and
23
+Aspeed evaluation boards. They are based on different releases of the
24
+Aspeed SoC : the AST2400 integrating an ARM926EJ-S CPU (400MHz), the
25
+AST2500 with an ARM1176JZS CPU (800MHz) and more recently the AST2600
26
+with dual-core ARM Cortex-A7 CPUs (1.2GHz).
27
+
28
+The SoC comes with RAM, Gigabit Ethernet, USB, SD/MMC, SPI, I2C,
29
+etc.
30
+
31
+AST2400 SoC based machines :
32
+
33
+- ``palmetto-bmc`` OpenPOWER Palmetto POWER8 BMC
34
+
35
+AST2500 SoC based machines :
36
+
37
+- ``ast2500-evb`` Aspeed AST2500 Evaluation board
38
+- ``romulus-bmc`` OpenPOWER Romulus POWER9 BMC
39
+- ``witherspoon-bmc`` OpenPOWER Witherspoon POWER9 BMC
40
+- ``sonorapass-bmc`` OCP SonoraPass BMC
41
+- ``swift-bmc`` OpenPOWER Swift BMC POWER9
42
+
43
+AST2600 SoC based machines :
44
+
45
+- ``ast2600-evb`` Aspeed AST2600 Evaluation board (Cortex A7)
46
+- ``tacoma-bmc`` OpenPOWER Witherspoon POWER9 AST2600 BMC
47
+
48
+Supported devices
49
+-----------------
50
+
51
+ * SMP (for the AST2600 Cortex-A7)
52
+ * Interrupt Controller (VIC)
53
+ * Timer Controller
54
+ * RTC Controller
55
+ * I2C Controller
56
+ * System Control Unit (SCU)
57
+ * SRAM mapping
58
+ * X-DMA Controller (basic interface)
59
+ * Static Memory Controller (SMC or FMC) - Only SPI Flash support
60
+ * SPI Memory Controller
61
+ * USB 2.0 Controller
62
+ * SD/MMC storage controllers
63
+ * SDRAM controller (dummy interface for basic settings and training)
64
+ * Watchdog Controller
65
+ * GPIO Controller (Master only)
66
+ * UART
67
+ * Ethernet controllers
68
+
69
+
70
+Missing devices
71
+---------------
72
+
73
+ * Coprocessor support
74
+ * ADC (out of tree implementation)
75
+ * PWM and Fan Controller
76
+ * LPC Bus Controller
77
+ * Slave GPIO Controller
78
+ * Super I/O Controller
79
+ * Hash/Crypto Engine
80
+ * PCI-Express 1 Controller
81
+ * Graphic Display Controller
82
+ * PECI Controller
83
+ * MCTP Controller
84
+ * Mailbox Controller
85
+ * Virtual UART
86
+ * eSPI Controller
87
+ * I3C Controller
88
+
89
+Boot options
90
+------------
91
+
92
+The Aspeed machines can be started using the -kernel option to load a
93
+Linux kernel or from a firmware image which can be downloaded from the
94
+OpenPOWER jenkins :
95
+
96
+ https://openpower.xyz/
97
+
98
+The image should be attached as an MTD drive. Run :
99
+
100
+.. code-block:: bash
101
+
102
+ $ qemu-system-arm -M romulus-bmc -nic user \
103
+    -drive file=flash-romulus,format=raw,if=mtd -nographic
104
diff --git a/docs/system/target-arm.rst b/docs/system/target-arm.rst
16
index XXXXXXX..XXXXXXX 100644
105
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/block/m25p80.c
106
--- a/docs/system/target-arm.rst
18
+++ b/hw/block/m25p80.c
107
+++ b/docs/system/target-arm.rst
19
@@ -XXX,XX +XXX,XX @@ static void complete_collecting_data(Flash *s)
108
@@ -XXX,XX +XXX,XX @@ undocumented; you can get a complete list by running
20
case MAN_MACRONIX:
109
arm/realview
21
s->quad_enable = extract32(s->data[0], 6, 1);
110
arm/versatile
22
if (s->len > 1) {
111
arm/vexpress
23
+ s->volatile_cfg = s->data[1];
112
+ arm/aspeed
24
s->four_bytes_address_mode = extract32(s->data[1], 5, 1);
113
arm/musicpal
25
}
114
arm/nseries
26
break;
115
arm/orangepi
27
--
116
--
28
2.17.1
117
2.20.1
29
118
30
119
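
The WRSR handling above amounts to pulling individual bits out of the two
payload bytes and keeping the second byte around as the configuration
register. The hedged sketch below replays that decode outside QEMU; the
one-line extract32() is a simplified stand-in for the helper in
qemu/bitops.h, and the sample payload bytes are invented:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Simplified stand-in for QEMU's extract32() (no range assertions). */
    static uint32_t extract32(uint32_t value, int start, int length)
    {
        return (value >> start) & (~0u >> (32 - length));
    }

    int main(void)
    {
        /* Hypothetical WRSR payload: status register byte, then config byte */
        uint8_t data[2] = { 0x40, 0x20 };

        bool quad_enable = extract32(data[0], 6, 1);      /* SR bit 6 */
        uint8_t volatile_cfg = data[1];                   /* keep the whole CR */
        bool four_byte_mode = extract32(data[1], 5, 1);   /* CR bit 5 */

        printf("QE=%d CR=0x%02x 4BYTE=%d\n",
               quad_enable, volatile_cfg, four_byte_mode);
        return 0;
    }
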
1
For the IoTKit MPC support, we need to wire together the
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
interrupt outputs of 17 MPCs; this exceeds the current
2
3
value of MAX_OR_LINES. Increase MAX_OR_LINES to 32 (which
3
Add BCM2835 SOC MPHI (Message-based Parallel Host Interface)
4
should be enough for anyone).
4
emulation. It is very basic, only providing the FIQ interrupt
5
5
needed to allow the dwc-otg USB host controller driver in the
6
The tricky part is retaining the migration compatibility for
6
Raspbian kernel to function.
7
existing OR gates; we add a subsection which is only used
7
8
for larger OR gates, and define it such that we can freely
8
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
9
increase MAX_OR_LINES in future (or even move to a dynamically
9
Acked-by: Philippe Mathieu-Daude <f4bug@amsat.org>
10
allocated levels[] array without an upper size limit) without
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
breaking compatibility.
11
Message-id: 20200520235349.21215-2-pauldzim@gmail.com
12
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
15
Message-id: 20180604152941.20374-10-peter.maydell@linaro.org
16
---
13
---
17
include/hw/or-irq.h | 5 ++++-
14
include/hw/arm/bcm2835_peripherals.h | 2 +
18
hw/core/or-irq.c | 39 +++++++++++++++++++++++++++++++++++++--
15
include/hw/misc/bcm2835_mphi.h | 44 ++++++
19
2 files changed, 41 insertions(+), 3 deletions(-)
16
hw/arm/bcm2835_peripherals.c | 17 +++
20
17
hw/misc/bcm2835_mphi.c | 191 +++++++++++++++++++++++++++
21
diff --git a/include/hw/or-irq.h b/include/hw/or-irq.h
18
hw/misc/Makefile.objs | 1 +
19
5 files changed, 255 insertions(+)
20
create mode 100644 include/hw/misc/bcm2835_mphi.h
21
create mode 100644 hw/misc/bcm2835_mphi.c
22
23
diff --git a/include/hw/arm/bcm2835_peripherals.h b/include/hw/arm/bcm2835_peripherals.h
22
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
23
--- a/include/hw/or-irq.h
25
--- a/include/hw/arm/bcm2835_peripherals.h
24
+++ b/include/hw/or-irq.h
26
+++ b/include/hw/arm/bcm2835_peripherals.h
25
@@ -XXX,XX +XXX,XX @@
27
@@ -XXX,XX +XXX,XX @@
26
28
#include "hw/misc/bcm2835_property.h"
27
#define TYPE_OR_IRQ "or-irq"
29
#include "hw/misc/bcm2835_rng.h"
28
30
#include "hw/misc/bcm2835_mbox.h"
29
-#define MAX_OR_LINES 16
31
+#include "hw/misc/bcm2835_mphi.h"
30
+/* This can safely be increased if necessary without breaking
32
#include "hw/misc/bcm2835_thermal.h"
31
+ * migration compatibility (as long as it remains greater than 15).
33
#include "hw/sd/sdhci.h"
34
#include "hw/sd/bcm2835_sdhost.h"
35
@@ -XXX,XX +XXX,XX @@ typedef struct BCM2835PeripheralState {
36
qemu_irq irq, fiq;
37
38
BCM2835SystemTimerState systmr;
39
+ BCM2835MphiState mphi;
40
UnimplementedDeviceState armtmr;
41
UnimplementedDeviceState cprman;
42
UnimplementedDeviceState a2w;
43
diff --git a/include/hw/misc/bcm2835_mphi.h b/include/hw/misc/bcm2835_mphi.h
44
new file mode 100644
45
index XXXXXXX..XXXXXXX
46
--- /dev/null
47
+++ b/include/hw/misc/bcm2835_mphi.h
48
@@ -XXX,XX +XXX,XX @@
49
+/*
50
+ * BCM2835 SOC MPHI state definitions
51
+ *
52
+ * Copyright (c) 2020 Paul Zimmerman <pauldzim@gmail.com>
53
+ *
54
+ * This program is free software; you can redistribute it and/or modify
55
+ * it under the terms of the GNU General Public License as published by
56
+ * the Free Software Foundation; either version 2 of the License, or
57
+ * (at your option) any later version.
58
+ *
59
+ * This program is distributed in the hope that it will be useful,
60
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
61
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
62
+ * GNU General Public License for more details.
32
+ */
63
+ */
33
+#define MAX_OR_LINES 32
64
+
34
65
+#ifndef HW_MISC_BCM2835_MPHI_H
35
typedef struct OrIRQState qemu_or_irq;
66
+#define HW_MISC_BCM2835_MPHI_H
36
67
+
37
diff --git a/hw/core/or-irq.c b/hw/core/or-irq.c
68
+#include "hw/irq.h"
69
+#include "hw/sysbus.h"
70
+
71
+#define MPHI_MMIO_SIZE 0x1000
72
+
73
+typedef struct BCM2835MphiState BCM2835MphiState;
74
+
75
+struct BCM2835MphiState {
76
+ SysBusDevice parent_obj;
77
+ qemu_irq irq;
78
+ MemoryRegion iomem;
79
+
80
+ uint32_t outdda;
81
+ uint32_t outddb;
82
+ uint32_t ctrl;
83
+ uint32_t intstat;
84
+ uint32_t swirq;
85
+};
86
+
87
+#define TYPE_BCM2835_MPHI "bcm2835-mphi"
88
+
89
+#define BCM2835_MPHI(obj) \
90
+ OBJECT_CHECK(BCM2835MphiState, (obj), TYPE_BCM2835_MPHI)
91
+
92
+#endif
93
diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
38
index XXXXXXX..XXXXXXX 100644
94
index XXXXXXX..XXXXXXX 100644
39
--- a/hw/core/or-irq.c
95
--- a/hw/arm/bcm2835_peripherals.c
40
+++ b/hw/core/or-irq.c
96
+++ b/hw/arm/bcm2835_peripherals.c
41
@@ -XXX,XX +XXX,XX @@ static void or_irq_init(Object *obj)
97
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_init(Object *obj)
42
qdev_init_gpio_out(DEVICE(obj), &s->out_irq, 1);
98
OBJECT(&s->sdhci.sdbus));
99
object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhost",
100
OBJECT(&s->sdhost.sdbus));
101
+
102
+ /* Mphi */
103
+ sysbus_init_child_obj(obj, "mphi", &s->mphi, sizeof(s->mphi),
104
+ TYPE_BCM2835_MPHI);
43
}
105
}
44
106
45
+/* The original version of this device had a fixed 16 entries in its
107
static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
46
+ * VMState array; devices with more inputs than this need to
108
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
47
+ * migrate the extra lines via a subsection.
109
48
+ * The subsection migrates as much of the levels[] array as is needed
110
object_property_add_alias(OBJECT(s), "sd-bus", OBJECT(&s->gpio), "sd-bus");
49
+ * (including repeating the first 16 elements), to avoid the awkwardness
111
50
+ * of splitting it in two to meet the requirements of VMSTATE_VARRAY_UINT16.
112
+ /* Mphi */
113
+ object_property_set_bool(OBJECT(&s->mphi), true, "realized", &err);
114
+ if (err) {
115
+ error_propagate(errp, err);
116
+ return;
117
+ }
118
+
119
+ memory_region_add_subregion(&s->peri_mr, MPHI_OFFSET,
120
+ sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->mphi), 0));
121
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->mphi), 0,
122
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
123
+ INTERRUPT_HOSTPORT));
124
+
125
create_unimp(s, &s->armtmr, "bcm2835-sp804", ARMCTRL_TIMER0_1_OFFSET, 0x40);
126
create_unimp(s, &s->cprman, "bcm2835-cprman", CPRMAN_OFFSET, 0x1000);
127
create_unimp(s, &s->a2w, "bcm2835-a2w", A2W_OFFSET, 0x1000);
128
diff --git a/hw/misc/bcm2835_mphi.c b/hw/misc/bcm2835_mphi.c
129
new file mode 100644
130
index XXXXXXX..XXXXXXX
131
--- /dev/null
132
+++ b/hw/misc/bcm2835_mphi.c
133
@@ -XXX,XX +XXX,XX @@
134
+/*
135
+ * BCM2835 SOC MPHI emulation
136
+ *
137
+ * Very basic emulation, only providing the FIQ interrupt needed to
138
+ * allow the dwc-otg USB host controller driver in the Raspbian kernel
139
+ * to function.
140
+ *
141
+ * Copyright (c) 2020 Paul Zimmerman <pauldzim@gmail.com>
142
+ *
143
+ * This program is free software; you can redistribute it and/or modify
144
+ * it under the terms of the GNU General Public License as published by
145
+ * the Free Software Foundation; either version 2 of the License, or
146
+ * (at your option) any later version.
147
+ *
148
+ * This program is distributed in the hope that it will be useful,
149
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
150
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
151
+ * GNU General Public License for more details.
51
+ */
152
+ */
52
+#define OLD_MAX_OR_LINES 16
153
+
53
+#if MAX_OR_LINES < OLD_MAX_OR_LINES
154
+#include "qemu/osdep.h"
54
+#error MAX_OR_LINES must be at least 16 for migration compatibility
155
+#include "qapi/error.h"
55
+#endif
156
+#include "hw/misc/bcm2835_mphi.h"
56
+
157
+#include "migration/vmstate.h"
57
+static bool vmstate_extras_needed(void *opaque)
158
+#include "qemu/error-report.h"
58
+{
159
+#include "qemu/log.h"
59
+ qemu_or_irq *s = OR_IRQ(opaque);
160
+#include "qemu/main-loop.h"
60
+
161
+
61
+ return s->num_lines >= OLD_MAX_OR_LINES;
162
+static inline void mphi_raise_irq(BCM2835MphiState *s)
62
+}
163
+{
63
+
164
+ qemu_set_irq(s->irq, 1);
64
+static const VMStateDescription vmstate_or_irq_extras = {
165
+}
65
+ .name = "or-irq-extras",
166
+
167
+static inline void mphi_lower_irq(BCM2835MphiState *s)
168
+{
169
+ qemu_set_irq(s->irq, 0);
170
+}
171
+
172
+static uint64_t mphi_reg_read(void *ptr, hwaddr addr, unsigned size)
173
+{
174
+ BCM2835MphiState *s = ptr;
175
+ uint32_t val = 0;
176
+
177
+ switch (addr) {
178
+ case 0x28: /* outdda */
179
+ val = s->outdda;
180
+ break;
181
+ case 0x2c: /* outddb */
182
+ val = s->outddb;
183
+ break;
184
+ case 0x4c: /* ctrl */
185
+ val = s->ctrl;
186
+ val |= 1 << 17;
187
+ break;
188
+ case 0x50: /* intstat */
189
+ val = s->intstat;
190
+ break;
191
+ case 0x1f0: /* swirq_set */
192
+ val = s->swirq;
193
+ break;
194
+ case 0x1f4: /* swirq_clr */
195
+ val = s->swirq;
196
+ break;
197
+ default:
198
+ qemu_log_mask(LOG_UNIMP, "read from unknown register");
199
+ break;
200
+ }
201
+
202
+ return val;
203
+}
204
+
205
+static void mphi_reg_write(void *ptr, hwaddr addr, uint64_t val, unsigned size)
206
+{
207
+ BCM2835MphiState *s = ptr;
208
+ int do_irq = 0;
209
+
210
+ switch (addr) {
211
+ case 0x28: /* outdda */
212
+ s->outdda = val;
213
+ break;
214
+ case 0x2c: /* outddb */
215
+ s->outddb = val;
216
+ if (val & (1 << 29)) {
217
+ do_irq = 1;
218
+ }
219
+ break;
220
+ case 0x4c: /* ctrl */
221
+ s->ctrl = val;
222
+ if (val & (1 << 16)) {
223
+ do_irq = -1;
224
+ }
225
+ break;
226
+ case 0x50: /* intstat */
227
+ s->intstat = val;
228
+ if (val & ((1 << 16) | (1 << 29))) {
229
+ do_irq = -1;
230
+ }
231
+ break;
232
+ case 0x1f0: /* swirq_set */
233
+ s->swirq |= val;
234
+ do_irq = 1;
235
+ break;
236
+ case 0x1f4: /* swirq_clr */
237
+ s->swirq &= ~val;
238
+ do_irq = -1;
239
+ break;
240
+ default:
241
+ qemu_log_mask(LOG_UNIMP, "write to unknown register");
242
+ return;
243
+ }
244
+
245
+ if (do_irq > 0) {
246
+ mphi_raise_irq(s);
247
+ } else if (do_irq < 0) {
248
+ mphi_lower_irq(s);
249
+ }
250
+}
251
+
252
+static const MemoryRegionOps mphi_mmio_ops = {
253
+ .read = mphi_reg_read,
254
+ .write = mphi_reg_write,
255
+ .impl.min_access_size = 4,
256
+ .impl.max_access_size = 4,
257
+ .endianness = DEVICE_LITTLE_ENDIAN,
258
+};
259
+
260
+static void mphi_reset(DeviceState *dev)
261
+{
262
+ BCM2835MphiState *s = BCM2835_MPHI(dev);
263
+
264
+ s->outdda = 0;
265
+ s->outddb = 0;
266
+ s->ctrl = 0;
267
+ s->intstat = 0;
268
+ s->swirq = 0;
269
+}
270
+
271
+static void mphi_realize(DeviceState *dev, Error **errp)
272
+{
273
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
274
+ BCM2835MphiState *s = BCM2835_MPHI(dev);
275
+
276
+ sysbus_init_irq(sbd, &s->irq);
277
+}
278
+
279
+static void mphi_init(Object *obj)
280
+{
281
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
282
+ BCM2835MphiState *s = BCM2835_MPHI(obj);
283
+
284
+ memory_region_init_io(&s->iomem, obj, &mphi_mmio_ops, s, "mphi", MPHI_MMIO_SIZE);
285
+ sysbus_init_mmio(sbd, &s->iomem);
286
+}
287
+
288
+const VMStateDescription vmstate_mphi_state = {
289
+ .name = "mphi",
66
+ .version_id = 1,
290
+ .version_id = 1,
67
+ .minimum_version_id = 1,
291
+ .minimum_version_id = 1,
68
+ .needed = vmstate_extras_needed,
69
+ .fields = (VMStateField[]) {
292
+ .fields = (VMStateField[]) {
70
+ VMSTATE_VARRAY_UINT16_UNSAFE(levels, qemu_or_irq, num_lines, 0,
293
+ VMSTATE_UINT32(outdda, BCM2835MphiState),
71
+ vmstate_info_bool, bool),
294
+ VMSTATE_UINT32(outddb, BCM2835MphiState),
72
+ VMSTATE_END_OF_LIST(),
295
+ VMSTATE_UINT32(ctrl, BCM2835MphiState),
73
+ },
296
+ VMSTATE_UINT32(intstat, BCM2835MphiState),
297
+ VMSTATE_UINT32(swirq, BCM2835MphiState),
298
+ VMSTATE_END_OF_LIST()
299
+ }
74
+};
300
+};
75
+
301
+
76
static const VMStateDescription vmstate_or_irq = {
302
+static void mphi_class_init(ObjectClass *klass, void *data)
77
.name = TYPE_OR_IRQ,
303
+{
78
.version_id = 1,
304
+ DeviceClass *dc = DEVICE_CLASS(klass);
79
.minimum_version_id = 1,
305
+
80
.fields = (VMStateField[]) {
306
+ dc->realize = mphi_realize;
81
- VMSTATE_BOOL_ARRAY(levels, qemu_or_irq, MAX_OR_LINES),
307
+ dc->reset = mphi_reset;
82
+ VMSTATE_BOOL_SUB_ARRAY(levels, qemu_or_irq, 0, OLD_MAX_OR_LINES),
308
+ dc->vmsd = &vmstate_mphi_state;
83
VMSTATE_END_OF_LIST(),
309
+}
84
- }
310
+
85
+ },
311
+static const TypeInfo bcm2835_mphi_type_info = {
86
+ .subsections = (const VMStateDescription*[]) {
312
+ .name = TYPE_BCM2835_MPHI,
87
+ &vmstate_or_irq_extras,
313
+ .parent = TYPE_SYS_BUS_DEVICE,
88
+ NULL
314
+ .instance_size = sizeof(BCM2835MphiState),
89
+ },
315
+ .instance_init = mphi_init,
90
};
316
+ .class_init = mphi_class_init,
91
317
+};
92
static Property or_irq_properties[] = {
318
+
319
+static void bcm2835_mphi_register_types(void)
320
+{
321
+ type_register_static(&bcm2835_mphi_type_info);
322
+}
323
+
324
+type_init(bcm2835_mphi_register_types)
325
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
326
index XXXXXXX..XXXXXXX 100644
327
--- a/hw/misc/Makefile.objs
328
+++ b/hw/misc/Makefile.objs
329
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_OMAP) += omap_l4.o
330
common-obj-$(CONFIG_OMAP) += omap_sdrc.o
331
common-obj-$(CONFIG_OMAP) += omap_tap.o
332
common-obj-$(CONFIG_RASPI) += bcm2835_mbox.o
333
+common-obj-$(CONFIG_RASPI) += bcm2835_mphi.o
334
common-obj-$(CONFIG_RASPI) += bcm2835_property.o
335
common-obj-$(CONFIG_RASPI) += bcm2835_rng.o
336
common-obj-$(CONFIG_RASPI) += bcm2835_thermal.o
93
--
337
--
94
2.17.1
338
2.20.1
95
339
96
340
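
The migration-compatibility trick used for the OR gate above follows the
usual VMState pattern: a fixed main section plus an optional subsection
whose .needed callback decides whether it goes on the wire, so older
streams stay loadable. A minimal sketch for a hypothetical device is below;
the vmstate macros and struct fields are the real migration API, but the
DemoDevState type and its members are invented:

    #include "qemu/osdep.h"
    #include "hw/qdev-core.h"
    #include "migration/vmstate.h"

    typedef struct DemoDevState {
        DeviceState parent_obj;
        uint32_t base_reg;     /* always migrated */
        uint32_t extra_reg;    /* only meaningful on newer configurations */
        bool has_extra;
    } DemoDevState;

    static bool demo_extra_needed(void *opaque)
    {
        DemoDevState *s = opaque;

        /* Only emit the subsection when there is something to say, so
         * streams exchanged with older versions remain compatible. */
        return s->has_extra;
    }

    static const VMStateDescription vmstate_demo_extra = {
        .name = "demo-dev/extra",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = demo_extra_needed,
        .fields = (VMStateField[]) {
            VMSTATE_UINT32(extra_reg, DemoDevState),
            VMSTATE_END_OF_LIST()
        }
    };

    static const VMStateDescription vmstate_demo = {
        .name = "demo-dev",
        .version_id = 1,
        .minimum_version_id = 1,
        .fields = (VMStateField[]) {
            VMSTATE_UINT32(base_reg, DemoDevState),
            VMSTATE_END_OF_LIST()
        },
        .subsections = (const VMStateDescription*[]) {
            &vmstate_demo_extra,
            NULL
        }
    };
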
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
2
3
Import the dwc-hsotg (dwc2) register definitions file from the
4
Linux kernel. This is a copy of drivers/usb/dwc2/hw.h from the
5
mainline Linux kernel, the only changes being to the header, and
6
two instances of 'u32' changed to 'uint32_t' to allow it to
7
compile. Checkpatch throws a boatload of errors due to the tab
8
indentation, but I would rather import it as-is than reformat it.
9
10
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
11
Message-id: 20200520235349.21215-3-pauldzim@gmail.com
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
12
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-3-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
14
---
8
target/arm/helper-sve.h | 23 +++++++
15
include/hw/usb/dwc2-regs.h | 899 +++++++++++++++++++++++++++++++++++++
9
target/arm/sve_helper.c | 114 +++++++++++++++++++++++++++++++
16
1 file changed, 899 insertions(+)
10
target/arm/translate-sve.c | 133 +++++++++++++++++++++++++++++++++++++
17
create mode 100644 include/hw/usb/dwc2-regs.h
11
target/arm/sve.decode | 27 ++++++++
12
4 files changed, 297 insertions(+)
13
18
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
19
diff --git a/include/hw/usb/dwc2-regs.h b/include/hw/usb/dwc2-regs.h
15
index XXXXXXX..XXXXXXX 100644
20
new file mode 100644
16
--- a/target/arm/helper-sve.h
21
index XXXXXXX..XXXXXXX
17
+++ b/target/arm/helper-sve.h
22
--- /dev/null
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
23
+++ b/include/hw/usb/dwc2-regs.h
19
24
@@ -XXX,XX +XXX,XX @@
20
DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */
21
22
+DEF_HELPER_FLAGS_4(sve_insr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
23
+DEF_HELPER_FLAGS_4(sve_insr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
24
+DEF_HELPER_FLAGS_4(sve_insr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
25
+DEF_HELPER_FLAGS_4(sve_insr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
26
+
27
+DEF_HELPER_FLAGS_3(sve_rev_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_3(sve_rev_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_3(sve_rev_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_3(sve_rev_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
31
+
32
+DEF_HELPER_FLAGS_4(sve_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+
37
+DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
38
+DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
39
+DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
40
+
41
+DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
42
+DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
43
+DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
44
+
45
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
46
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
47
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
48
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/target/arm/sve_helper.c
51
+++ b/target/arm/sve_helper.c
52
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc)
53
memcpy(vd + n_siz, &tmp, n_ofs);
54
}
55
}
56
+
57
+#define DO_INSR(NAME, TYPE, H) \
58
+void HELPER(NAME)(void *vd, void *vn, uint64_t val, uint32_t desc) \
59
+{ \
60
+ intptr_t opr_sz = simd_oprsz(desc); \
61
+ swap_memmove(vd + sizeof(TYPE), vn, opr_sz - sizeof(TYPE)); \
62
+ *(TYPE *)(vd + H(0)) = val; \
63
+}
64
+
65
+DO_INSR(sve_insr_b, uint8_t, H1)
66
+DO_INSR(sve_insr_h, uint16_t, H1_2)
67
+DO_INSR(sve_insr_s, uint32_t, H1_4)
68
+DO_INSR(sve_insr_d, uint64_t, )
69
+
70
+#undef DO_INSR
71
+
72
+void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc)
73
+{
74
+ intptr_t i, j, opr_sz = simd_oprsz(desc);
75
+ for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) {
76
+ uint64_t f = *(uint64_t *)(vn + i);
77
+ uint64_t b = *(uint64_t *)(vn + j);
78
+ *(uint64_t *)(vd + i) = bswap64(b);
79
+ *(uint64_t *)(vd + j) = bswap64(f);
80
+ }
81
+}
82
+
83
+static inline uint64_t hswap64(uint64_t h)
84
+{
85
+ uint64_t m = 0x0000ffff0000ffffull;
86
+ h = rol64(h, 32);
87
+ return ((h & m) << 16) | ((h >> 16) & m);
88
+}
89
+
90
+void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc)
91
+{
92
+ intptr_t i, j, opr_sz = simd_oprsz(desc);
93
+ for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) {
94
+ uint64_t f = *(uint64_t *)(vn + i);
95
+ uint64_t b = *(uint64_t *)(vn + j);
96
+ *(uint64_t *)(vd + i) = hswap64(b);
97
+ *(uint64_t *)(vd + j) = hswap64(f);
98
+ }
99
+}
100
+
101
+void HELPER(sve_rev_s)(void *vd, void *vn, uint32_t desc)
102
+{
103
+ intptr_t i, j, opr_sz = simd_oprsz(desc);
104
+ for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) {
105
+ uint64_t f = *(uint64_t *)(vn + i);
106
+ uint64_t b = *(uint64_t *)(vn + j);
107
+ *(uint64_t *)(vd + i) = rol64(b, 32);
108
+ *(uint64_t *)(vd + j) = rol64(f, 32);
109
+ }
110
+}
111
+
112
+void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc)
113
+{
114
+ intptr_t i, j, opr_sz = simd_oprsz(desc);
115
+ for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) {
116
+ uint64_t f = *(uint64_t *)(vn + i);
117
+ uint64_t b = *(uint64_t *)(vn + j);
118
+ *(uint64_t *)(vd + i) = b;
119
+ *(uint64_t *)(vd + j) = f;
120
+ }
121
+}
122
+
123
+#define DO_TBL(NAME, TYPE, H) \
124
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
125
+{ \
126
+ intptr_t i, opr_sz = simd_oprsz(desc); \
127
+ uintptr_t elem = opr_sz / sizeof(TYPE); \
128
+ TYPE *d = vd, *n = vn, *m = vm; \
129
+ ARMVectorReg tmp; \
130
+ if (unlikely(vd == vn)) { \
131
+ n = memcpy(&tmp, vn, opr_sz); \
132
+ } \
133
+ for (i = 0; i < elem; i++) { \
134
+ TYPE j = m[H(i)]; \
135
+ d[H(i)] = j < elem ? n[H(j)] : 0; \
136
+ } \
137
+}
138
+
139
+DO_TBL(sve_tbl_b, uint8_t, H1)
140
+DO_TBL(sve_tbl_h, uint16_t, H2)
141
+DO_TBL(sve_tbl_s, uint32_t, H4)
142
+DO_TBL(sve_tbl_d, uint64_t, )
143
+
144
+#undef TBL
145
+
146
+#define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \
147
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
148
+{ \
149
+ intptr_t i, opr_sz = simd_oprsz(desc); \
150
+ TYPED *d = vd; \
151
+ TYPES *n = vn; \
152
+ ARMVectorReg tmp; \
153
+ if (unlikely(vn - vd < opr_sz)) { \
154
+ n = memcpy(&tmp, n, opr_sz / 2); \
155
+ } \
156
+ for (i = 0; i < opr_sz / sizeof(TYPED); i++) { \
157
+ d[HD(i)] = n[HS(i)]; \
158
+ } \
159
+}
160
+
161
+DO_UNPK(sve_sunpk_h, int16_t, int8_t, H2, H1)
162
+DO_UNPK(sve_sunpk_s, int32_t, int16_t, H4, H2)
163
+DO_UNPK(sve_sunpk_d, int64_t, int32_t, , H4)
164
+
165
+DO_UNPK(sve_uunpk_h, uint16_t, uint8_t, H2, H1)
166
+DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2)
167
+DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4)
168
+
169
+#undef DO_UNPK
170
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
171
index XXXXXXX..XXXXXXX 100644
172
--- a/target/arm/translate-sve.c
173
+++ b/target/arm/translate-sve.c
174
@@ -XXX,XX +XXX,XX @@ static bool trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn)
175
return true;
176
}
177
178
+/*
26
+/*
179
+ *** SVE Permute - Unpredicated Group
27
+ * Imported from the Linux kernel file drivers/usb/dwc2/hw.h, commit
28
+ * a89bae709b3492b478480a2c9734e7e9393b279c ("usb: dwc2: Move
29
+ * UTMI_PHY_DATA defines closer")
30
+ *
31
+ * hw.h - DesignWare HS OTG Controller hardware definitions
32
+ *
33
+ * Copyright 2004-2013 Synopsys, Inc.
34
+ *
35
+ * Redistribution and use in source and binary forms, with or without
36
+ * modification, are permitted provided that the following conditions
37
+ * are met:
38
+ * 1. Redistributions of source code must retain the above copyright
39
+ * notice, this list of conditions, and the following disclaimer,
40
+ * without modification.
41
+ * 2. Redistributions in binary form must reproduce the above copyright
42
+ * notice, this list of conditions and the following disclaimer in the
43
+ * documentation and/or other materials provided with the distribution.
44
+ * 3. The names of the above-listed copyright holders may not be used
45
+ * to endorse or promote products derived from this software without
46
+ * specific prior written permission.
47
+ *
48
+ * ALTERNATIVELY, this software may be distributed under the terms of the
49
+ * GNU General Public License ("GPL") as published by the Free Software
50
+ * Foundation; either version 2 of the License, or (at your option) any
51
+ * later version.
52
+ *
53
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
54
+ * IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
55
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
56
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
57
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
58
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
59
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
60
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
61
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
62
+ * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
63
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
180
+ */
64
+ */
181
+
65
+
182
+static bool trans_DUP_s(DisasContext *s, arg_DUP_s *a, uint32_t insn)
66
+#ifndef __DWC2_HW_H__
183
+{
67
+#define __DWC2_HW_H__
184
+ if (sve_access_check(s)) {
68
+
185
+ unsigned vsz = vec_full_reg_size(s);
69
+#define HSOTG_REG(x)    (x)
186
+ tcg_gen_gvec_dup_i64(a->esz, vec_full_reg_offset(s, a->rd),
70
+
187
+ vsz, vsz, cpu_reg_sp(s, a->rn));
71
+#define GOTGCTL                HSOTG_REG(0x000)
188
+ }
72
+#define GOTGCTL_CHIRPEN            BIT(27)
189
+ return true;
73
+#define GOTGCTL_MULT_VALID_BC_MASK    (0x1f << 22)
190
+}
74
+#define GOTGCTL_MULT_VALID_BC_SHIFT    22
191
+
75
+#define GOTGCTL_OTGVER            BIT(20)
192
+static bool trans_DUP_x(DisasContext *s, arg_DUP_x *a, uint32_t insn)
76
+#define GOTGCTL_BSESVLD            BIT(19)
193
+{
77
+#define GOTGCTL_ASESVLD            BIT(18)
194
+ if ((a->imm & 0x1f) == 0) {
78
+#define GOTGCTL_DBNC_SHORT        BIT(17)
195
+ return false;
79
+#define GOTGCTL_CONID_B            BIT(16)
196
+ }
80
+#define GOTGCTL_DBNCE_FLTR_BYPASS    BIT(15)
197
+ if (sve_access_check(s)) {
81
+#define GOTGCTL_DEVHNPEN        BIT(11)
198
+ unsigned vsz = vec_full_reg_size(s);
82
+#define GOTGCTL_HSTSETHNPEN        BIT(10)
199
+ unsigned dofs = vec_full_reg_offset(s, a->rd);
83
+#define GOTGCTL_HNPREQ            BIT(9)
200
+ unsigned esz, index;
84
+#define GOTGCTL_HSTNEGSCS        BIT(8)
201
+
85
+#define GOTGCTL_SESREQ            BIT(1)
202
+ esz = ctz32(a->imm);
86
+#define GOTGCTL_SESREQSCS        BIT(0)
203
+ index = a->imm >> (esz + 1);
87
+
204
+
88
+#define GOTGINT                HSOTG_REG(0x004)
205
+ if ((index << esz) < vsz) {
89
+#define GOTGINT_DBNCE_DONE        BIT(19)
206
+ unsigned nofs = vec_reg_offset(s, a->rn, index, esz);
90
+#define GOTGINT_A_DEV_TOUT_CHG        BIT(18)
207
+ tcg_gen_gvec_dup_mem(esz, dofs, nofs, vsz, vsz);
91
+#define GOTGINT_HST_NEG_DET        BIT(17)
208
+ } else {
92
+#define GOTGINT_HST_NEG_SUC_STS_CHNG    BIT(9)
209
+ tcg_gen_gvec_dup64i(dofs, vsz, vsz, 0);
93
+#define GOTGINT_SES_REQ_SUC_STS_CHNG    BIT(8)
210
+ }
94
+#define GOTGINT_SES_END_DET        BIT(2)
211
+ }
95
+
212
+ return true;
96
+#define GAHBCFG                HSOTG_REG(0x008)
213
+}
97
+#define GAHBCFG_AHB_SINGLE        BIT(23)
214
+
98
+#define GAHBCFG_NOTI_ALL_DMA_WRIT    BIT(22)
215
+static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val)
99
+#define GAHBCFG_REM_MEM_SUPP        BIT(21)
216
+{
100
+#define GAHBCFG_P_TXF_EMP_LVL        BIT(8)
217
+ typedef void gen_insr(TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32);
101
+#define GAHBCFG_NP_TXF_EMP_LVL        BIT(7)
218
+ static gen_insr * const fns[4] = {
102
+#define GAHBCFG_DMA_EN            BIT(5)
219
+ gen_helper_sve_insr_b, gen_helper_sve_insr_h,
103
+#define GAHBCFG_HBSTLEN_MASK        (0xf << 1)
220
+ gen_helper_sve_insr_s, gen_helper_sve_insr_d,
104
+#define GAHBCFG_HBSTLEN_SHIFT        1
221
+ };
105
+#define GAHBCFG_HBSTLEN_SINGLE        0
222
+ unsigned vsz = vec_full_reg_size(s);
106
+#define GAHBCFG_HBSTLEN_INCR        1
223
+ TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
107
+#define GAHBCFG_HBSTLEN_INCR4        3
224
+ TCGv_ptr t_zd = tcg_temp_new_ptr();
108
+#define GAHBCFG_HBSTLEN_INCR8        5
225
+ TCGv_ptr t_zn = tcg_temp_new_ptr();
109
+#define GAHBCFG_HBSTLEN_INCR16        7
226
+
110
+#define GAHBCFG_GLBL_INTR_EN        BIT(0)
227
+ tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, a->rd));
111
+#define GAHBCFG_CTRL_MASK        (GAHBCFG_P_TXF_EMP_LVL | \
228
+ tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
112
+                     GAHBCFG_NP_TXF_EMP_LVL | \
229
+
113
+                     GAHBCFG_DMA_EN | \
230
+ fns[a->esz](t_zd, t_zn, val, desc);
114
+                     GAHBCFG_GLBL_INTR_EN)
231
+
115
+
232
+ tcg_temp_free_ptr(t_zd);
116
+#define GUSBCFG                HSOTG_REG(0x00C)
233
+ tcg_temp_free_ptr(t_zn);
117
+#define GUSBCFG_FORCEDEVMODE        BIT(30)
234
+ tcg_temp_free_i32(desc);
118
+#define GUSBCFG_FORCEHOSTMODE        BIT(29)
235
+}
119
+#define GUSBCFG_TXENDDELAY        BIT(28)
236
+
120
+#define GUSBCFG_ICTRAFFICPULLREMOVE    BIT(27)
237
+static bool trans_INSR_f(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
121
+#define GUSBCFG_ICUSBCAP        BIT(26)
238
+{
122
+#define GUSBCFG_ULPI_INT_PROT_DIS    BIT(25)
239
+ if (sve_access_check(s)) {
123
+#define GUSBCFG_INDICATORPASSTHROUGH    BIT(24)
240
+ TCGv_i64 t = tcg_temp_new_i64();
124
+#define GUSBCFG_INDICATORCOMPLEMENT    BIT(23)
241
+ tcg_gen_ld_i64(t, cpu_env, vec_reg_offset(s, a->rm, 0, MO_64));
125
+#define GUSBCFG_TERMSELDLPULSE        BIT(22)
242
+ do_insr_i64(s, a, t);
126
+#define GUSBCFG_ULPI_INT_VBUS_IND    BIT(21)
243
+ tcg_temp_free_i64(t);
127
+#define GUSBCFG_ULPI_EXT_VBUS_DRV    BIT(20)
244
+ }
128
+#define GUSBCFG_ULPI_CLK_SUSP_M        BIT(19)
245
+ return true;
129
+#define GUSBCFG_ULPI_AUTO_RES        BIT(18)
246
+}
130
+#define GUSBCFG_ULPI_FS_LS        BIT(17)
247
+
131
+#define GUSBCFG_OTG_UTMI_FS_SEL        BIT(16)
248
+static bool trans_INSR_r(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
132
+#define GUSBCFG_PHY_LP_CLK_SEL        BIT(15)
249
+{
133
+#define GUSBCFG_USBTRDTIM_MASK        (0xf << 10)
250
+ if (sve_access_check(s)) {
134
+#define GUSBCFG_USBTRDTIM_SHIFT        10
251
+ do_insr_i64(s, a, cpu_reg(s, a->rm));
135
+#define GUSBCFG_HNPCAP            BIT(9)
252
+ }
136
+#define GUSBCFG_SRPCAP            BIT(8)
253
+ return true;
137
+#define GUSBCFG_DDRSEL            BIT(7)
254
+}
138
+#define GUSBCFG_PHYSEL            BIT(6)
255
+
139
+#define GUSBCFG_FSINTF            BIT(5)
256
+static bool trans_REV_v(DisasContext *s, arg_rr_esz *a, uint32_t insn)
140
+#define GUSBCFG_ULPI_UTMI_SEL        BIT(4)
257
+{
141
+#define GUSBCFG_PHYIF16            BIT(3)
258
+ static gen_helper_gvec_2 * const fns[4] = {
142
+#define GUSBCFG_PHYIF8            (0 << 3)
259
+ gen_helper_sve_rev_b, gen_helper_sve_rev_h,
143
+#define GUSBCFG_TOUTCAL_MASK        (0x7 << 0)
260
+ gen_helper_sve_rev_s, gen_helper_sve_rev_d
144
+#define GUSBCFG_TOUTCAL_SHIFT        0
261
+ };
145
+#define GUSBCFG_TOUTCAL_LIMIT        0x7
262
+
146
+#define GUSBCFG_TOUTCAL(_x)        ((_x) << 0)
263
+ if (sve_access_check(s)) {
147
+
264
+ unsigned vsz = vec_full_reg_size(s);
148
+#define GRSTCTL                HSOTG_REG(0x010)
265
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd),
149
+#define GRSTCTL_AHBIDLE            BIT(31)
266
+ vec_full_reg_offset(s, a->rn),
150
+#define GRSTCTL_DMAREQ            BIT(30)
267
+ vsz, vsz, 0, fns[a->esz]);
151
+#define GRSTCTL_TXFNUM_MASK        (0x1f << 6)
268
+ }
152
+#define GRSTCTL_TXFNUM_SHIFT        6
269
+ return true;
153
+#define GRSTCTL_TXFNUM_LIMIT        0x1f
270
+}
154
+#define GRSTCTL_TXFNUM(_x)        ((_x) << 6)
271
+
155
+#define GRSTCTL_TXFFLSH            BIT(5)
272
+static bool trans_TBL(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
156
+#define GRSTCTL_RXFFLSH            BIT(4)
273
+{
157
+#define GRSTCTL_IN_TKNQ_FLSH        BIT(3)
274
+ static gen_helper_gvec_3 * const fns[4] = {
158
+#define GRSTCTL_FRMCNTRRST        BIT(2)
275
+ gen_helper_sve_tbl_b, gen_helper_sve_tbl_h,
159
+#define GRSTCTL_HSFTRST            BIT(1)
276
+ gen_helper_sve_tbl_s, gen_helper_sve_tbl_d
160
+#define GRSTCTL_CSFTRST            BIT(0)
277
+ };
161
+
278
+
162
+#define GINTSTS                HSOTG_REG(0x014)
279
+ if (sve_access_check(s)) {
163
+#define GINTMSK                HSOTG_REG(0x018)
280
+ unsigned vsz = vec_full_reg_size(s);
164
+#define GINTSTS_WKUPINT            BIT(31)
281
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
165
+#define GINTSTS_SESSREQINT        BIT(30)
282
+ vec_full_reg_offset(s, a->rn),
166
+#define GINTSTS_DISCONNINT        BIT(29)
283
+ vec_full_reg_offset(s, a->rm),
167
+#define GINTSTS_CONIDSTSCHNG        BIT(28)
284
+ vsz, vsz, 0, fns[a->esz]);
168
+#define GINTSTS_LPMTRANRCVD        BIT(27)
285
+ }
169
+#define GINTSTS_PTXFEMP            BIT(26)
286
+ return true;
170
+#define GINTSTS_HCHINT            BIT(25)
287
+}
171
+#define GINTSTS_PRTINT            BIT(24)
288
+
172
+#define GINTSTS_RESETDET        BIT(23)
289
+static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn)
173
+#define GINTSTS_FET_SUSP        BIT(22)
290
+{
174
+#define GINTSTS_INCOMPL_IP        BIT(21)
291
+ static gen_helper_gvec_2 * const fns[4][2] = {
175
+#define GINTSTS_INCOMPL_SOOUT        BIT(21)
292
+ { NULL, NULL },
176
+#define GINTSTS_INCOMPL_SOIN        BIT(20)
293
+ { gen_helper_sve_sunpk_h, gen_helper_sve_uunpk_h },
177
+#define GINTSTS_OEPINT            BIT(19)
294
+ { gen_helper_sve_sunpk_s, gen_helper_sve_uunpk_s },
178
+#define GINTSTS_IEPINT            BIT(18)
295
+ { gen_helper_sve_sunpk_d, gen_helper_sve_uunpk_d },
179
+#define GINTSTS_EPMIS            BIT(17)
296
+ };
180
+#define GINTSTS_RESTOREDONE        BIT(16)
297
+
181
+#define GINTSTS_EOPF            BIT(15)
298
+ if (a->esz == 0) {
182
+#define GINTSTS_ISOUTDROP        BIT(14)
299
+ return false;
183
+#define GINTSTS_ENUMDONE        BIT(13)
300
+ }
184
+#define GINTSTS_USBRST            BIT(12)
301
+ if (sve_access_check(s)) {
185
+#define GINTSTS_USBSUSP            BIT(11)
302
+ unsigned vsz = vec_full_reg_size(s);
186
+#define GINTSTS_ERLYSUSP        BIT(10)
303
+ tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd),
187
+#define GINTSTS_I2CINT            BIT(9)
304
+ vec_full_reg_offset(s, a->rn)
188
+#define GINTSTS_ULPI_CK_INT        BIT(8)
305
+ + (a->h ? vsz / 2 : 0),
189
+#define GINTSTS_GOUTNAKEFF        BIT(7)
306
+ vsz, vsz, 0, fns[a->esz][a->u]);
190
+#define GINTSTS_GINNAKEFF        BIT(6)
307
+ }
191
+#define GINTSTS_NPTXFEMP        BIT(5)
308
+ return true;
192
+#define GINTSTS_RXFLVL            BIT(4)
309
+}
193
+#define GINTSTS_SOF            BIT(3)
310
+
194
+#define GINTSTS_OTGINT            BIT(2)
311
/*
195
+#define GINTSTS_MODEMIS            BIT(1)
312
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
196
+#define GINTSTS_CURMODE_HOST        BIT(0)
313
*/
197
+
314
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
198
+#define GRXSTSR                HSOTG_REG(0x01C)
315
index XXXXXXX..XXXXXXX 100644
199
+#define GRXSTSP                HSOTG_REG(0x020)
316
--- a/target/arm/sve.decode
200
+#define GRXSTS_FN_MASK            (0x7f << 25)
317
+++ b/target/arm/sve.decode
201
+#define GRXSTS_FN_SHIFT            25
318
@@ -XXX,XX +XXX,XX @@
202
+#define GRXSTS_PKTSTS_MASK        (0xf << 17)
319
203
+#define GRXSTS_PKTSTS_SHIFT        17
320
%imm4_16_p1 16:4 !function=plus1
204
+#define GRXSTS_PKTSTS_GLOBALOUTNAK    1
321
%imm6_22_5 22:1 5:5
205
+#define GRXSTS_PKTSTS_OUTRX        2
322
+%imm7_22_16 22:2 16:5
206
+#define GRXSTS_PKTSTS_HCHIN        2
323
%imm8_16_10 16:5 10:3
207
+#define GRXSTS_PKTSTS_OUTDONE        3
324
%imm9_16_10 16:s6 10:3
208
+#define GRXSTS_PKTSTS_HCHIN_XFER_COMP    3
325
209
+#define GRXSTS_PKTSTS_SETUPDONE        4
326
@@ -XXX,XX +XXX,XX @@
210
+#define GRXSTS_PKTSTS_DATATOGGLEERR    5
327
211
+#define GRXSTS_PKTSTS_SETUPRX        6
328
# Three operand, vector element size
212
+#define GRXSTS_PKTSTS_HCHHALTED        7
329
@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz
213
+#define GRXSTS_HCHNUM_MASK        (0xf << 0)
330
+@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \
214
+#define GRXSTS_HCHNUM_SHIFT        0
331
+ &rrr_esz rn=%reg_movprfx
215
+#define GRXSTS_DPID_MASK        (0x3 << 15)
332
216
+#define GRXSTS_DPID_SHIFT        15
333
# Three operand with "memory" size, aka immediate left shift
217
+#define GRXSTS_BYTECNT_MASK        (0x7ff << 4)
334
@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri
218
+#define GRXSTS_BYTECNT_SHIFT        4
335
@@ -XXX,XX +XXX,XX @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s
219
+#define GRXSTS_EPNUM_MASK        (0xf << 0)
336
EXT 00000101 001 ..... 000 ... rm:5 rd:5 \
220
+#define GRXSTS_EPNUM_SHIFT        0
337
&rrri rn=%reg_movprfx imm=%imm8_16_10
221
+
338
222
+#define GRXFSIZ                HSOTG_REG(0x024)
339
+### SVE Permute - Unpredicated Group
223
+#define GRXFSIZ_DEPTH_MASK        (0xffff << 0)
340
+
224
+#define GRXFSIZ_DEPTH_SHIFT        0
341
+# SVE broadcast general register
225
+
342
+DUP_s 00000101 .. 1 00000 001110 ..... ..... @rd_rn
226
+#define GNPTXFSIZ            HSOTG_REG(0x028)
343
+
227
+/* Use FIFOSIZE_* constants to access this register */
344
+# SVE broadcast indexed element
228
+
345
+DUP_x 00000101 .. 1 ..... 001000 rn:5 rd:5 \
229
+#define GNPTXSTS            HSOTG_REG(0x02C)
346
+ &rri imm=%imm7_22_16
230
+#define GNPTXSTS_NP_TXQ_TOP_MASK        (0x7f << 24)
347
+
231
+#define GNPTXSTS_NP_TXQ_TOP_SHIFT        24
348
+# SVE insert SIMD&FP scalar register
232
+#define GNPTXSTS_NP_TXQ_SPC_AVAIL_MASK        (0xff << 16)
349
+INSR_f 00000101 .. 1 10100 001110 ..... ..... @rdn_rm
233
+#define GNPTXSTS_NP_TXQ_SPC_AVAIL_SHIFT        16
350
+
234
+#define GNPTXSTS_NP_TXQ_SPC_AVAIL_GET(_v)    (((_v) >> 16) & 0xff)
351
+# SVE insert general register
235
+#define GNPTXSTS_NP_TXF_SPC_AVAIL_MASK        (0xffff << 0)
352
+INSR_r 00000101 .. 1 00100 001110 ..... ..... @rdn_rm
236
+#define GNPTXSTS_NP_TXF_SPC_AVAIL_SHIFT        0
353
+
237
+#define GNPTXSTS_NP_TXF_SPC_AVAIL_GET(_v)    (((_v) >> 0) & 0xffff)
354
+# SVE reverse vector elements
238
+
355
+REV_v 00000101 .. 1 11000 001110 ..... ..... @rd_rn
239
+#define GI2CCTL                HSOTG_REG(0x0030)
356
+
240
+#define GI2CCTL_BSYDNE            BIT(31)
357
+# SVE vector table lookup
241
+#define GI2CCTL_RW            BIT(30)
358
+TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm
242
+#define GI2CCTL_I2CDATSE0        BIT(28)
359
+
243
+#define GI2CCTL_I2CDEVADDR_MASK        (0x3 << 26)
360
+# SVE unpack vector elements
244
+#define GI2CCTL_I2CDEVADDR_SHIFT    26
361
+UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5
245
+#define GI2CCTL_I2CSUSPCTL        BIT(25)
362
+
246
+#define GI2CCTL_ACK            BIT(24)
363
### SVE Predicate Logical Operations Group
247
+#define GI2CCTL_I2CEN            BIT(23)
364
248
+#define GI2CCTL_ADDR_MASK        (0x7f << 16)
365
# SVE predicate logical operations
249
+#define GI2CCTL_ADDR_SHIFT        16
250
+#define GI2CCTL_REGADDR_MASK        (0xff << 8)
251
+#define GI2CCTL_REGADDR_SHIFT        8
252
+#define GI2CCTL_RWDATA_MASK        (0xff << 0)
253
+#define GI2CCTL_RWDATA_SHIFT        0
254
+
255
+#define GPVNDCTL            HSOTG_REG(0x0034)
256
+#define GGPIO                HSOTG_REG(0x0038)
257
+#define GGPIO_STM32_OTG_GCCFG_PWRDWN    BIT(16)
258
+
259
+#define GUID                HSOTG_REG(0x003c)
260
+#define GSNPSID                HSOTG_REG(0x0040)
261
+#define GHWCFG1                HSOTG_REG(0x0044)
262
+#define GSNPSID_ID_MASK            GENMASK(31, 16)
263
+
264
+#define GHWCFG2                HSOTG_REG(0x0048)
265
+#define GHWCFG2_OTG_ENABLE_IC_USB        BIT(31)
266
+#define GHWCFG2_DEV_TOKEN_Q_DEPTH_MASK        (0x1f << 26)
267
+#define GHWCFG2_DEV_TOKEN_Q_DEPTH_SHIFT        26
268
+#define GHWCFG2_HOST_PERIO_TX_Q_DEPTH_MASK    (0x3 << 24)
269
+#define GHWCFG2_HOST_PERIO_TX_Q_DEPTH_SHIFT    24
270
+#define GHWCFG2_NONPERIO_TX_Q_DEPTH_MASK    (0x3 << 22)
271
+#define GHWCFG2_NONPERIO_TX_Q_DEPTH_SHIFT    22
272
+#define GHWCFG2_MULTI_PROC_INT            BIT(20)
273
+#define GHWCFG2_DYNAMIC_FIFO            BIT(19)
274
+#define GHWCFG2_PERIO_EP_SUPPORTED        BIT(18)
275
+#define GHWCFG2_NUM_HOST_CHAN_MASK        (0xf << 14)
276
+#define GHWCFG2_NUM_HOST_CHAN_SHIFT        14
277
+#define GHWCFG2_NUM_DEV_EP_MASK            (0xf << 10)
278
+#define GHWCFG2_NUM_DEV_EP_SHIFT        10
279
+#define GHWCFG2_FS_PHY_TYPE_MASK        (0x3 << 8)
280
+#define GHWCFG2_FS_PHY_TYPE_SHIFT        8
281
+#define GHWCFG2_FS_PHY_TYPE_NOT_SUPPORTED    0
282
+#define GHWCFG2_FS_PHY_TYPE_DEDICATED        1
283
+#define GHWCFG2_FS_PHY_TYPE_SHARED_UTMI        2
284
+#define GHWCFG2_FS_PHY_TYPE_SHARED_ULPI        3
285
+#define GHWCFG2_HS_PHY_TYPE_MASK        (0x3 << 6)
286
+#define GHWCFG2_HS_PHY_TYPE_SHIFT        6
287
+#define GHWCFG2_HS_PHY_TYPE_NOT_SUPPORTED    0
288
+#define GHWCFG2_HS_PHY_TYPE_UTMI        1
289
+#define GHWCFG2_HS_PHY_TYPE_ULPI        2
290
+#define GHWCFG2_HS_PHY_TYPE_UTMI_ULPI        3
291
+#define GHWCFG2_POINT2POINT            BIT(5)
292
+#define GHWCFG2_ARCHITECTURE_MASK        (0x3 << 3)
293
+#define GHWCFG2_ARCHITECTURE_SHIFT        3
294
+#define GHWCFG2_SLAVE_ONLY_ARCH            0
295
+#define GHWCFG2_EXT_DMA_ARCH            1
296
+#define GHWCFG2_INT_DMA_ARCH            2
297
+#define GHWCFG2_OP_MODE_MASK            (0x7 << 0)
298
+#define GHWCFG2_OP_MODE_SHIFT            0
299
+#define GHWCFG2_OP_MODE_HNP_SRP_CAPABLE        0
300
+#define GHWCFG2_OP_MODE_SRP_ONLY_CAPABLE    1
301
+#define GHWCFG2_OP_MODE_NO_HNP_SRP_CAPABLE    2
302
+#define GHWCFG2_OP_MODE_SRP_CAPABLE_DEVICE    3
303
+#define GHWCFG2_OP_MODE_NO_SRP_CAPABLE_DEVICE    4
304
+#define GHWCFG2_OP_MODE_SRP_CAPABLE_HOST    5
305
+#define GHWCFG2_OP_MODE_NO_SRP_CAPABLE_HOST    6
306
+#define GHWCFG2_OP_MODE_UNDEFINED        7
307
+
308
+#define GHWCFG3                HSOTG_REG(0x004c)
309
+#define GHWCFG3_DFIFO_DEPTH_MASK        (0xffff << 16)
310
+#define GHWCFG3_DFIFO_DEPTH_SHIFT        16
311
+#define GHWCFG3_OTG_LPM_EN            BIT(15)
312
+#define GHWCFG3_BC_SUPPORT            BIT(14)
313
+#define GHWCFG3_OTG_ENABLE_HSIC            BIT(13)
314
+#define GHWCFG3_ADP_SUPP            BIT(12)
315
+#define GHWCFG3_SYNCH_RESET_TYPE        BIT(11)
316
+#define GHWCFG3_OPTIONAL_FEATURES        BIT(10)
317
+#define GHWCFG3_VENDOR_CTRL_IF            BIT(9)
318
+#define GHWCFG3_I2C                BIT(8)
319
+#define GHWCFG3_OTG_FUNC            BIT(7)
320
+#define GHWCFG3_PACKET_SIZE_CNTR_WIDTH_MASK    (0x7 << 4)
321
+#define GHWCFG3_PACKET_SIZE_CNTR_WIDTH_SHIFT    4
322
+#define GHWCFG3_XFER_SIZE_CNTR_WIDTH_MASK    (0xf << 0)
323
+#define GHWCFG3_XFER_SIZE_CNTR_WIDTH_SHIFT    0
324
+
325
+#define GHWCFG4                HSOTG_REG(0x0050)
326
+#define GHWCFG4_DESC_DMA_DYN            BIT(31)
327
+#define GHWCFG4_DESC_DMA            BIT(30)
328
+#define GHWCFG4_NUM_IN_EPS_MASK            (0xf << 26)
329
+#define GHWCFG4_NUM_IN_EPS_SHIFT        26
330
+#define GHWCFG4_DED_FIFO_EN            BIT(25)
331
+#define GHWCFG4_DED_FIFO_SHIFT        25
332
+#define GHWCFG4_SESSION_END_FILT_EN        BIT(24)
333
+#define GHWCFG4_B_VALID_FILT_EN            BIT(23)
334
+#define GHWCFG4_A_VALID_FILT_EN            BIT(22)
335
+#define GHWCFG4_VBUS_VALID_FILT_EN        BIT(21)
336
+#define GHWCFG4_IDDIG_FILT_EN            BIT(20)
337
+#define GHWCFG4_NUM_DEV_MODE_CTRL_EP_MASK    (0xf << 16)
338
+#define GHWCFG4_NUM_DEV_MODE_CTRL_EP_SHIFT    16
339
+#define GHWCFG4_UTMI_PHY_DATA_WIDTH_MASK    (0x3 << 14)
340
+#define GHWCFG4_UTMI_PHY_DATA_WIDTH_SHIFT    14
341
+#define GHWCFG4_UTMI_PHY_DATA_WIDTH_8        0
342
+#define GHWCFG4_UTMI_PHY_DATA_WIDTH_16        1
343
+#define GHWCFG4_UTMI_PHY_DATA_WIDTH_8_OR_16    2
344
+#define GHWCFG4_ACG_SUPPORTED            BIT(12)
345
+#define GHWCFG4_IPG_ISOC_SUPPORTED        BIT(11)
346
+#define GHWCFG4_SERVICE_INTERVAL_SUPPORTED BIT(10)
347
+#define GHWCFG4_XHIBER                BIT(7)
348
+#define GHWCFG4_HIBER                BIT(6)
349
+#define GHWCFG4_MIN_AHB_FREQ            BIT(5)
350
+#define GHWCFG4_POWER_OPTIMIZ            BIT(4)
351
+#define GHWCFG4_NUM_DEV_PERIO_IN_EP_MASK    (0xf << 0)
352
+#define GHWCFG4_NUM_DEV_PERIO_IN_EP_SHIFT    0
353
+
354
+#define GLPMCFG                HSOTG_REG(0x0054)
355
+#define GLPMCFG_INVSELHSIC        BIT(31)
356
+#define GLPMCFG_HSICCON            BIT(30)
357
+#define GLPMCFG_RSTRSLPSTS        BIT(29)
358
+#define GLPMCFG_ENBESL            BIT(28)
359
+#define GLPMCFG_LPM_RETRYCNT_STS_MASK    (0x7 << 25)
360
+#define GLPMCFG_LPM_RETRYCNT_STS_SHIFT    25
361
+#define GLPMCFG_SNDLPM            BIT(24)
362
+#define GLPMCFG_RETRY_CNT_MASK        (0x7 << 21)
363
+#define GLPMCFG_RETRY_CNT_SHIFT        21
364
+#define GLPMCFG_LPM_REJECT_CTRL_CONTROL    BIT(21)
365
+#define GLPMCFG_LPM_ACCEPT_CTRL_ISOC    BIT(22)
366
+#define GLPMCFG_LPM_CHNL_INDX_MASK    (0xf << 17)
367
+#define GLPMCFG_LPM_CHNL_INDX_SHIFT    17
368
+#define GLPMCFG_L1RESUMEOK        BIT(16)
369
+#define GLPMCFG_SLPSTS            BIT(15)
370
+#define GLPMCFG_COREL1RES_MASK        (0x3 << 13)
371
+#define GLPMCFG_COREL1RES_SHIFT        13
372
+#define GLPMCFG_HIRD_THRES_MASK        (0x1f << 8)
373
+#define GLPMCFG_HIRD_THRES_SHIFT    8
374
+#define GLPMCFG_HIRD_THRES_EN        (0x10 << 8)
375
+#define GLPMCFG_ENBLSLPM        BIT(7)
376
+#define GLPMCFG_BREMOTEWAKE        BIT(6)
377
+#define GLPMCFG_HIRD_MASK        (0xf << 2)
378
+#define GLPMCFG_HIRD_SHIFT        2
379
+#define GLPMCFG_APPL1RES        BIT(1)
380
+#define GLPMCFG_LPMCAP            BIT(0)
381
+
382
+#define GPWRDN                HSOTG_REG(0x0058)
383
+#define GPWRDN_MULT_VAL_ID_BC_MASK    (0x1f << 24)
384
+#define GPWRDN_MULT_VAL_ID_BC_SHIFT    24
385
+#define GPWRDN_ADP_INT            BIT(23)
386
+#define GPWRDN_BSESSVLD            BIT(22)
387
+#define GPWRDN_IDSTS            BIT(21)
388
+#define GPWRDN_LINESTATE_MASK        (0x3 << 19)
389
+#define GPWRDN_LINESTATE_SHIFT        19
390
+#define GPWRDN_STS_CHGINT_MSK        BIT(18)
391
+#define GPWRDN_STS_CHGINT        BIT(17)
392
+#define GPWRDN_SRP_DET_MSK        BIT(16)
393
+#define GPWRDN_SRP_DET            BIT(15)
394
+#define GPWRDN_CONNECT_DET_MSK        BIT(14)
395
+#define GPWRDN_CONNECT_DET        BIT(13)
396
+#define GPWRDN_DISCONN_DET_MSK        BIT(12)
397
+#define GPWRDN_DISCONN_DET        BIT(11)
398
+#define GPWRDN_RST_DET_MSK        BIT(10)
399
+#define GPWRDN_RST_DET            BIT(9)
400
+#define GPWRDN_LNSTSCHG_MSK        BIT(8)
401
+#define GPWRDN_LNSTSCHG            BIT(7)
402
+#define GPWRDN_DIS_VBUS            BIT(6)
403
+#define GPWRDN_PWRDNSWTCH        BIT(5)
404
+#define GPWRDN_PWRDNRSTN        BIT(4)
405
+#define GPWRDN_PWRDNCLMP        BIT(3)
406
+#define GPWRDN_RESTORE            BIT(2)
407
+#define GPWRDN_PMUACTV            BIT(1)
408
+#define GPWRDN_PMUINTSEL        BIT(0)
409
+
410
+#define GDFIFOCFG            HSOTG_REG(0x005c)
411
+#define GDFIFOCFG_EPINFOBASE_MASK    (0xffff << 16)
412
+#define GDFIFOCFG_EPINFOBASE_SHIFT    16
413
+#define GDFIFOCFG_GDFIFOCFG_MASK    (0xffff << 0)
414
+#define GDFIFOCFG_GDFIFOCFG_SHIFT    0
415
+
416
+#define ADPCTL                HSOTG_REG(0x0060)
417
+#define ADPCTL_AR_MASK            (0x3 << 27)
418
+#define ADPCTL_AR_SHIFT            27
419
+#define ADPCTL_ADP_TMOUT_INT_MSK    BIT(26)
420
+#define ADPCTL_ADP_SNS_INT_MSK        BIT(25)
421
+#define ADPCTL_ADP_PRB_INT_MSK        BIT(24)
422
+#define ADPCTL_ADP_TMOUT_INT        BIT(23)
423
+#define ADPCTL_ADP_SNS_INT        BIT(22)
424
+#define ADPCTL_ADP_PRB_INT        BIT(21)
425
+#define ADPCTL_ADPENA            BIT(20)
426
+#define ADPCTL_ADPRES            BIT(19)
427
+#define ADPCTL_ENASNS            BIT(18)
428
+#define ADPCTL_ENAPRB            BIT(17)
429
+#define ADPCTL_RTIM_MASK        (0x7ff << 6)
430
+#define ADPCTL_RTIM_SHIFT        6
431
+#define ADPCTL_PRB_PER_MASK        (0x3 << 4)
432
+#define ADPCTL_PRB_PER_SHIFT        4
433
+#define ADPCTL_PRB_DELTA_MASK        (0x3 << 2)
434
+#define ADPCTL_PRB_DELTA_SHIFT        2
435
+#define ADPCTL_PRB_DSCHRG_MASK        (0x3 << 0)
436
+#define ADPCTL_PRB_DSCHRG_SHIFT        0
437
+
438
+#define GREFCLK                 HSOTG_REG(0x0064)
439
+#define GREFCLK_REFCLKPER_MASK         (0x1ffff << 15)
440
+#define GREFCLK_REFCLKPER_SHIFT         15
441
+#define GREFCLK_REF_CLK_MODE         BIT(14)
442
+#define GREFCLK_SOF_CNT_WKUP_ALERT_MASK     (0x3ff)
443
+#define GREFCLK_SOF_CNT_WKUP_ALERT_SHIFT 0
444
+
445
+#define GINTMSK2            HSOTG_REG(0x0068)
446
+#define GINTMSK2_WKUP_ALERT_INT_MSK    BIT(0)
447
+
448
+#define GINTSTS2            HSOTG_REG(0x006c)
449
+#define GINTSTS2_WKUP_ALERT_INT        BIT(0)
450
+
451
+#define HPTXFSIZ            HSOTG_REG(0x100)
452
+/* Use FIFOSIZE_* constants to access this register */
453
+
454
+#define DPTXFSIZN(_a)            HSOTG_REG(0x104 + (((_a) - 1) * 4))
455
+/* Use FIFOSIZE_* constants to access this register */
456
+
457
+/* These apply to the GNPTXFSIZ, HPTXFSIZ and DPTXFSIZN registers */
458
+#define FIFOSIZE_DEPTH_MASK        (0xffff << 16)
459
+#define FIFOSIZE_DEPTH_SHIFT        16
460
+#define FIFOSIZE_STARTADDR_MASK        (0xffff << 0)
461
+#define FIFOSIZE_STARTADDR_SHIFT    0
462
+#define FIFOSIZE_DEPTH_GET(_x)        (((_x) >> 16) & 0xffff)
463
+
464
+/* Device mode registers */
465
+
466
+#define DCFG                HSOTG_REG(0x800)
467
+#define DCFG_DESCDMA_EN            BIT(23)
468
+#define DCFG_EPMISCNT_MASK        (0x1f << 18)
469
+#define DCFG_EPMISCNT_SHIFT        18
470
+#define DCFG_EPMISCNT_LIMIT        0x1f
471
+#define DCFG_EPMISCNT(_x)        ((_x) << 18)
472
+#define DCFG_IPG_ISOC_SUPPORDED        BIT(17)
473
+#define DCFG_PERFRINT_MASK        (0x3 << 11)
474
+#define DCFG_PERFRINT_SHIFT        11
475
+#define DCFG_PERFRINT_LIMIT        0x3
476
+#define DCFG_PERFRINT(_x)        ((_x) << 11)
477
+#define DCFG_DEVADDR_MASK        (0x7f << 4)
478
+#define DCFG_DEVADDR_SHIFT        4
479
+#define DCFG_DEVADDR_LIMIT        0x7f
480
+#define DCFG_DEVADDR(_x)        ((_x) << 4)
481
+#define DCFG_NZ_STS_OUT_HSHK        BIT(2)
482
+#define DCFG_DEVSPD_MASK        (0x3 << 0)
483
+#define DCFG_DEVSPD_SHIFT        0
484
+#define DCFG_DEVSPD_HS            0
485
+#define DCFG_DEVSPD_FS            1
486
+#define DCFG_DEVSPD_LS            2
487
+#define DCFG_DEVSPD_FS48        3
488
+
489
+#define DCTL                HSOTG_REG(0x804)
490
+#define DCTL_SERVICE_INTERVAL_SUPPORTED BIT(19)
491
+#define DCTL_PWRONPRGDONE        BIT(11)
492
+#define DCTL_CGOUTNAK            BIT(10)
493
+#define DCTL_SGOUTNAK            BIT(9)
494
+#define DCTL_CGNPINNAK            BIT(8)
495
+#define DCTL_SGNPINNAK            BIT(7)
496
+#define DCTL_TSTCTL_MASK        (0x7 << 4)
497
+#define DCTL_TSTCTL_SHIFT        4
498
+#define DCTL_GOUTNAKSTS            BIT(3)
499
+#define DCTL_GNPINNAKSTS        BIT(2)
500
+#define DCTL_SFTDISCON            BIT(1)
501
+#define DCTL_RMTWKUPSIG            BIT(0)
502
+
503
+#define DSTS                HSOTG_REG(0x808)
504
+#define DSTS_SOFFN_MASK            (0x3fff << 8)
505
+#define DSTS_SOFFN_SHIFT        8
506
+#define DSTS_SOFFN_LIMIT        0x3fff
507
+#define DSTS_SOFFN(_x)            ((_x) << 8)
508
+#define DSTS_ERRATICERR            BIT(3)
509
+#define DSTS_ENUMSPD_MASK        (0x3 << 1)
510
+#define DSTS_ENUMSPD_SHIFT        1
511
+#define DSTS_ENUMSPD_HS            0
512
+#define DSTS_ENUMSPD_FS            1
513
+#define DSTS_ENUMSPD_LS            2
514
+#define DSTS_ENUMSPD_FS48        3
515
+#define DSTS_SUSPSTS            BIT(0)
516
+
517
+#define DIEPMSK                HSOTG_REG(0x810)
518
+#define DIEPMSK_NAKMSK            BIT(13)
519
+#define DIEPMSK_BNAININTRMSK        BIT(9)
520
+#define DIEPMSK_TXFIFOUNDRNMSK        BIT(8)
521
+#define DIEPMSK_TXFIFOEMPTY        BIT(7)
522
+#define DIEPMSK_INEPNAKEFFMSK        BIT(6)
523
+#define DIEPMSK_INTKNEPMISMSK        BIT(5)
524
+#define DIEPMSK_INTKNTXFEMPMSK        BIT(4)
525
+#define DIEPMSK_TIMEOUTMSK        BIT(3)
526
+#define DIEPMSK_AHBERRMSK        BIT(2)
527
+#define DIEPMSK_EPDISBLDMSK        BIT(1)
528
+#define DIEPMSK_XFERCOMPLMSK        BIT(0)
529
+
530
+#define DOEPMSK                HSOTG_REG(0x814)
531
+#define DOEPMSK_BNAMSK            BIT(9)
532
+#define DOEPMSK_BACK2BACKSETUP        BIT(6)
533
+#define DOEPMSK_STSPHSERCVDMSK        BIT(5)
534
+#define DOEPMSK_OUTTKNEPDISMSK        BIT(4)
535
+#define DOEPMSK_SETUPMSK        BIT(3)
536
+#define DOEPMSK_AHBERRMSK        BIT(2)
537
+#define DOEPMSK_EPDISBLDMSK        BIT(1)
538
+#define DOEPMSK_XFERCOMPLMSK        BIT(0)
539
+
540
+#define DAINT                HSOTG_REG(0x818)
541
+#define DAINTMSK            HSOTG_REG(0x81C)
542
+#define DAINT_OUTEP_SHIFT        16
543
+#define DAINT_OUTEP(_x)            (1 << ((_x) + 16))
544
+#define DAINT_INEP(_x)            (1 << (_x))
545
+
546
+#define DTKNQR1                HSOTG_REG(0x820)
547
+#define DTKNQR2                HSOTG_REG(0x824)
548
+#define DTKNQR3                HSOTG_REG(0x830)
549
+#define DTKNQR4                HSOTG_REG(0x834)
550
+#define DIEPEMPMSK            HSOTG_REG(0x834)
551
+
552
+#define DVBUSDIS            HSOTG_REG(0x828)
553
+#define DVBUSPULSE            HSOTG_REG(0x82C)
554
+
555
+#define DIEPCTL0            HSOTG_REG(0x900)
556
+#define DIEPCTL(_a)            HSOTG_REG(0x900 + ((_a) * 0x20))
557
+
558
+#define DOEPCTL0            HSOTG_REG(0xB00)
559
+#define DOEPCTL(_a)            HSOTG_REG(0xB00 + ((_a) * 0x20))
560
+
561
+/* EP0 specialness:
562
+ * bits[29..28] - reserved (no SetD0PID, SetD1PID)
563
+ * bits[25..22] - should always be zero, this isn't a periodic endpoint
564
+ * bits[10..0] - MPS setting different for EP0
565
+ */
566
+#define D0EPCTL_MPS_MASK        (0x3 << 0)
567
+#define D0EPCTL_MPS_SHIFT        0
568
+#define D0EPCTL_MPS_64            0
569
+#define D0EPCTL_MPS_32            1
570
+#define D0EPCTL_MPS_16            2
571
+#define D0EPCTL_MPS_8            3
572
+
573
+#define DXEPCTL_EPENA            BIT(31)
574
+#define DXEPCTL_EPDIS            BIT(30)
575
+#define DXEPCTL_SETD1PID        BIT(29)
576
+#define DXEPCTL_SETODDFR        BIT(29)
577
+#define DXEPCTL_SETD0PID        BIT(28)
578
+#define DXEPCTL_SETEVENFR        BIT(28)
579
+#define DXEPCTL_SNAK            BIT(27)
580
+#define DXEPCTL_CNAK            BIT(26)
581
+#define DXEPCTL_TXFNUM_MASK        (0xf << 22)
582
+#define DXEPCTL_TXFNUM_SHIFT        22
583
+#define DXEPCTL_TXFNUM_LIMIT        0xf
584
+#define DXEPCTL_TXFNUM(_x)        ((_x) << 22)
585
+#define DXEPCTL_STALL            BIT(21)
586
+#define DXEPCTL_SNP            BIT(20)
587
+#define DXEPCTL_EPTYPE_MASK        (0x3 << 18)
588
+#define DXEPCTL_EPTYPE_CONTROL        (0x0 << 18)
589
+#define DXEPCTL_EPTYPE_ISO        (0x1 << 18)
590
+#define DXEPCTL_EPTYPE_BULK        (0x2 << 18)
591
+#define DXEPCTL_EPTYPE_INTERRUPT    (0x3 << 18)
592
+
593
+#define DXEPCTL_NAKSTS            BIT(17)
594
+#define DXEPCTL_DPID            BIT(16)
595
+#define DXEPCTL_EOFRNUM            BIT(16)
596
+#define DXEPCTL_USBACTEP        BIT(15)
597
+#define DXEPCTL_NEXTEP_MASK        (0xf << 11)
598
+#define DXEPCTL_NEXTEP_SHIFT        11
599
+#define DXEPCTL_NEXTEP_LIMIT        0xf
600
+#define DXEPCTL_NEXTEP(_x)        ((_x) << 11)
601
+#define DXEPCTL_MPS_MASK        (0x7ff << 0)
602
+#define DXEPCTL_MPS_SHIFT        0
603
+#define DXEPCTL_MPS_LIMIT        0x7ff
604
+#define DXEPCTL_MPS(_x)            ((_x) << 0)
605
+
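Tying together the EP0-specific encoding described in the comment above with the generic DXEPCTL_MPS field, here is a minimal sketch (not part of the patch) of how the two maximum-packet-size encodings would typically be decoded; ep0_mps_bytes() and dxepctl_mps_bytes() are illustrative helper names and assume <stdint.h> plus the definitions above:

static inline unsigned ep0_mps_bytes(uint32_t d0epctl)
{
    /* EP0 encodes MPS as 0 = 64, 1 = 32, 2 = 16, 3 = 8 bytes */
    return 64u >> ((d0epctl & D0EPCTL_MPS_MASK) >> D0EPCTL_MPS_SHIFT);
}

static inline unsigned dxepctl_mps_bytes(uint32_t dxepctl)
{
    /* all other endpoints carry the MPS value directly in bits [10:0] */
    return (dxepctl & DXEPCTL_MPS_MASK) >> DXEPCTL_MPS_SHIFT;
}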
606
+#define DIEPINT(_a)            HSOTG_REG(0x908 + ((_a) * 0x20))
607
+#define DOEPINT(_a)            HSOTG_REG(0xB08 + ((_a) * 0x20))
608
+#define DXEPINT_SETUP_RCVD        BIT(15)
609
+#define DXEPINT_NYETINTRPT        BIT(14)
610
+#define DXEPINT_NAKINTRPT        BIT(13)
611
+#define DXEPINT_BBLEERRINTRPT        BIT(12)
612
+#define DXEPINT_PKTDRPSTS        BIT(11)
613
+#define DXEPINT_BNAINTR            BIT(9)
614
+#define DXEPINT_TXFIFOUNDRN        BIT(8)
615
+#define DXEPINT_OUTPKTERR        BIT(8)
616
+#define DXEPINT_TXFEMP            BIT(7)
617
+#define DXEPINT_INEPNAKEFF        BIT(6)
618
+#define DXEPINT_BACK2BACKSETUP        BIT(6)
619
+#define DXEPINT_INTKNEPMIS        BIT(5)
620
+#define DXEPINT_STSPHSERCVD        BIT(5)
621
+#define DXEPINT_INTKNTXFEMP        BIT(4)
622
+#define DXEPINT_OUTTKNEPDIS        BIT(4)
623
+#define DXEPINT_TIMEOUT            BIT(3)
624
+#define DXEPINT_SETUP            BIT(3)
625
+#define DXEPINT_AHBERR            BIT(2)
626
+#define DXEPINT_EPDISBLD        BIT(1)
627
+#define DXEPINT_XFERCOMPL        BIT(0)
628
+
629
+#define DIEPTSIZ0            HSOTG_REG(0x910)
630
+#define DIEPTSIZ0_PKTCNT_MASK        (0x3 << 19)
631
+#define DIEPTSIZ0_PKTCNT_SHIFT        19
632
+#define DIEPTSIZ0_PKTCNT_LIMIT        0x3
633
+#define DIEPTSIZ0_PKTCNT(_x)        ((_x) << 19)
634
+#define DIEPTSIZ0_XFERSIZE_MASK        (0x7f << 0)
635
+#define DIEPTSIZ0_XFERSIZE_SHIFT    0
636
+#define DIEPTSIZ0_XFERSIZE_LIMIT    0x7f
637
+#define DIEPTSIZ0_XFERSIZE(_x)        ((_x) << 0)
638
+
639
+#define DOEPTSIZ0            HSOTG_REG(0xB10)
640
+#define DOEPTSIZ0_SUPCNT_MASK        (0x3 << 29)
641
+#define DOEPTSIZ0_SUPCNT_SHIFT        29
642
+#define DOEPTSIZ0_SUPCNT_LIMIT        0x3
643
+#define DOEPTSIZ0_SUPCNT(_x)        ((_x) << 29)
644
+#define DOEPTSIZ0_PKTCNT        BIT(19)
645
+#define DOEPTSIZ0_XFERSIZE_MASK        (0x7f << 0)
646
+#define DOEPTSIZ0_XFERSIZE_SHIFT    0
647
+
648
+#define DIEPTSIZ(_a)            HSOTG_REG(0x910 + ((_a) * 0x20))
649
+#define DOEPTSIZ(_a)            HSOTG_REG(0xB10 + ((_a) * 0x20))
650
+#define DXEPTSIZ_MC_MASK        (0x3 << 29)
651
+#define DXEPTSIZ_MC_SHIFT        29
652
+#define DXEPTSIZ_MC_LIMIT        0x3
653
+#define DXEPTSIZ_MC(_x)            ((_x) << 29)
654
+#define DXEPTSIZ_PKTCNT_MASK        (0x3ff << 19)
655
+#define DXEPTSIZ_PKTCNT_SHIFT        19
656
+#define DXEPTSIZ_PKTCNT_LIMIT        0x3ff
657
+#define DXEPTSIZ_PKTCNT_GET(_v)        (((_v) >> 19) & 0x3ff)
658
+#define DXEPTSIZ_PKTCNT(_x)        ((_x) << 19)
659
+#define DXEPTSIZ_XFERSIZE_MASK        (0x7ffff << 0)
660
+#define DXEPTSIZ_XFERSIZE_SHIFT        0
661
+#define DXEPTSIZ_XFERSIZE_LIMIT        0x7ffff
662
+#define DXEPTSIZ_XFERSIZE_GET(_v)    (((_v) >> 0) & 0x7ffff)
663
+#define DXEPTSIZ_XFERSIZE(_x)        ((_x) << 0)
664
+
665
+#define DIEPDMA(_a)            HSOTG_REG(0x914 + ((_a) * 0x20))
666
+#define DOEPDMA(_a)            HSOTG_REG(0xB14 + ((_a) * 0x20))
667
+
668
+#define DTXFSTS(_a)            HSOTG_REG(0x918 + ((_a) * 0x20))
669
+
670
+#define PCGCTL                HSOTG_REG(0x0e00)
671
+#define PCGCTL_IF_DEV_MODE        BIT(31)
672
+#define PCGCTL_P2HD_PRT_SPD_MASK    (0x3 << 29)
673
+#define PCGCTL_P2HD_PRT_SPD_SHIFT    29
674
+#define PCGCTL_P2HD_DEV_ENUM_SPD_MASK    (0x3 << 27)
675
+#define PCGCTL_P2HD_DEV_ENUM_SPD_SHIFT    27
676
+#define PCGCTL_MAC_DEV_ADDR_MASK    (0x7f << 20)
677
+#define PCGCTL_MAC_DEV_ADDR_SHIFT    20
678
+#define PCGCTL_MAX_TERMSEL        BIT(19)
679
+#define PCGCTL_MAX_XCVRSELECT_MASK    (0x3 << 17)
680
+#define PCGCTL_MAX_XCVRSELECT_SHIFT    17
681
+#define PCGCTL_PORT_POWER        BIT(16)
682
+#define PCGCTL_PRT_CLK_SEL_MASK        (0x3 << 14)
683
+#define PCGCTL_PRT_CLK_SEL_SHIFT    14
684
+#define PCGCTL_ESS_REG_RESTORED        BIT(13)
685
+#define PCGCTL_EXTND_HIBER_SWITCH    BIT(12)
686
+#define PCGCTL_EXTND_HIBER_PWRCLMP    BIT(11)
687
+#define PCGCTL_ENBL_EXTND_HIBER        BIT(10)
688
+#define PCGCTL_RESTOREMODE        BIT(9)
689
+#define PCGCTL_RESETAFTSUSP        BIT(8)
690
+#define PCGCTL_DEEP_SLEEP        BIT(7)
691
+#define PCGCTL_PHY_IN_SLEEP        BIT(6)
692
+#define PCGCTL_ENBL_SLEEP_GATING    BIT(5)
693
+#define PCGCTL_RSTPDWNMODULE        BIT(3)
694
+#define PCGCTL_PWRCLMP            BIT(2)
695
+#define PCGCTL_GATEHCLK            BIT(1)
696
+#define PCGCTL_STOPPCLK            BIT(0)
697
+
698
+#define PCGCCTL1 HSOTG_REG(0xe04)
699
+#define PCGCCTL1_TIMER (0x3 << 1)
700
+#define PCGCCTL1_GATEEN BIT(0)
701
+
702
+#define EPFIFO(_a)            HSOTG_REG(0x1000 + ((_a) * 0x1000))
703
+
704
+/* Host Mode Registers */
705
+
706
+#define HCFG                HSOTG_REG(0x0400)
707
+#define HCFG_MODECHTIMEN        BIT(31)
708
+#define HCFG_PERSCHEDENA        BIT(26)
709
+#define HCFG_FRLISTEN_MASK        (0x3 << 24)
710
+#define HCFG_FRLISTEN_SHIFT        24
711
+#define HCFG_FRLISTEN_8                (0 << 24)
712
+#define FRLISTEN_8_SIZE                8
713
+#define HCFG_FRLISTEN_16            BIT(24)
714
+#define FRLISTEN_16_SIZE            16
715
+#define HCFG_FRLISTEN_32            (2 << 24)
716
+#define FRLISTEN_32_SIZE            32
717
+#define HCFG_FRLISTEN_64            (3 << 24)
718
+#define FRLISTEN_64_SIZE            64
719
+#define HCFG_DESCDMA            BIT(23)
720
+#define HCFG_RESVALID_MASK        (0xff << 8)
721
+#define HCFG_RESVALID_SHIFT        8
722
+#define HCFG_ENA32KHZ            BIT(7)
723
+#define HCFG_FSLSSUPP            BIT(2)
724
+#define HCFG_FSLSPCLKSEL_MASK        (0x3 << 0)
725
+#define HCFG_FSLSPCLKSEL_SHIFT        0
726
+#define HCFG_FSLSPCLKSEL_30_60_MHZ    0
727
+#define HCFG_FSLSPCLKSEL_48_MHZ        1
728
+#define HCFG_FSLSPCLKSEL_6_MHZ        2
729
+
730
+#define HFIR                HSOTG_REG(0x0404)
731
+#define HFIR_FRINT_MASK            (0xffff << 0)
732
+#define HFIR_FRINT_SHIFT        0
733
+#define HFIR_RLDCTRL            BIT(16)
734
+
735
+#define HFNUM                HSOTG_REG(0x0408)
736
+#define HFNUM_FRREM_MASK        (0xffff << 16)
737
+#define HFNUM_FRREM_SHIFT        16
738
+#define HFNUM_FRNUM_MASK        (0xffff << 0)
739
+#define HFNUM_FRNUM_SHIFT        0
740
+#define HFNUM_MAX_FRNUM            0x3fff
741
+
742
+#define HPTXSTS                HSOTG_REG(0x0410)
743
+#define TXSTS_QTOP_ODD            BIT(31)
744
+#define TXSTS_QTOP_CHNEP_MASK        (0xf << 27)
745
+#define TXSTS_QTOP_CHNEP_SHIFT        27
746
+#define TXSTS_QTOP_TOKEN_MASK        (0x3 << 25)
747
+#define TXSTS_QTOP_TOKEN_SHIFT        25
748
+#define TXSTS_QTOP_TERMINATE        BIT(24)
749
+#define TXSTS_QSPCAVAIL_MASK        (0xff << 16)
750
+#define TXSTS_QSPCAVAIL_SHIFT        16
751
+#define TXSTS_FSPCAVAIL_MASK        (0xffff << 0)
752
+#define TXSTS_FSPCAVAIL_SHIFT        0
753
+
754
+#define HAINT                HSOTG_REG(0x0414)
755
+#define HAINTMSK            HSOTG_REG(0x0418)
756
+#define HFLBADDR            HSOTG_REG(0x041c)
757
+
758
+#define HPRT0                HSOTG_REG(0x0440)
759
+#define HPRT0_SPD_MASK            (0x3 << 17)
760
+#define HPRT0_SPD_SHIFT            17
761
+#define HPRT0_SPD_HIGH_SPEED        0
762
+#define HPRT0_SPD_FULL_SPEED        1
763
+#define HPRT0_SPD_LOW_SPEED        2
764
+#define HPRT0_TSTCTL_MASK        (0xf << 13)
765
+#define HPRT0_TSTCTL_SHIFT        13
766
+#define HPRT0_PWR            BIT(12)
767
+#define HPRT0_LNSTS_MASK        (0x3 << 10)
768
+#define HPRT0_LNSTS_SHIFT        10
769
+#define HPRT0_RST            BIT(8)
770
+#define HPRT0_SUSP            BIT(7)
771
+#define HPRT0_RES            BIT(6)
772
+#define HPRT0_OVRCURRCHG        BIT(5)
773
+#define HPRT0_OVRCURRACT        BIT(4)
774
+#define HPRT0_ENACHG            BIT(3)
775
+#define HPRT0_ENA            BIT(2)
776
+#define HPRT0_CONNDET            BIT(1)
777
+#define HPRT0_CONNSTS            BIT(0)
778
+
779
+#define HCCHAR(_ch)            HSOTG_REG(0x0500 + 0x20 * (_ch))
780
+#define HCCHAR_CHENA            BIT(31)
781
+#define HCCHAR_CHDIS            BIT(30)
782
+#define HCCHAR_ODDFRM            BIT(29)
783
+#define HCCHAR_DEVADDR_MASK        (0x7f << 22)
784
+#define HCCHAR_DEVADDR_SHIFT        22
785
+#define HCCHAR_MULTICNT_MASK        (0x3 << 20)
786
+#define HCCHAR_MULTICNT_SHIFT        20
787
+#define HCCHAR_EPTYPE_MASK        (0x3 << 18)
788
+#define HCCHAR_EPTYPE_SHIFT        18
789
+#define HCCHAR_LSPDDEV            BIT(17)
790
+#define HCCHAR_EPDIR            BIT(15)
791
+#define HCCHAR_EPNUM_MASK        (0xf << 11)
792
+#define HCCHAR_EPNUM_SHIFT        11
793
+#define HCCHAR_MPS_MASK            (0x7ff << 0)
794
+#define HCCHAR_MPS_SHIFT        0
795
+
796
+#define HCSPLT(_ch)            HSOTG_REG(0x0504 + 0x20 * (_ch))
797
+#define HCSPLT_SPLTENA            BIT(31)
798
+#define HCSPLT_COMPSPLT            BIT(16)
799
+#define HCSPLT_XACTPOS_MASK        (0x3 << 14)
800
+#define HCSPLT_XACTPOS_SHIFT        14
801
+#define HCSPLT_XACTPOS_MID        0
802
+#define HCSPLT_XACTPOS_END        1
803
+#define HCSPLT_XACTPOS_BEGIN        2
804
+#define HCSPLT_XACTPOS_ALL        3
805
+#define HCSPLT_HUBADDR_MASK        (0x7f << 7)
806
+#define HCSPLT_HUBADDR_SHIFT        7
807
+#define HCSPLT_PRTADDR_MASK        (0x7f << 0)
808
+#define HCSPLT_PRTADDR_SHIFT        0
809
+
810
+#define HCINT(_ch)            HSOTG_REG(0x0508 + 0x20 * (_ch))
811
+#define HCINTMSK(_ch)            HSOTG_REG(0x050c + 0x20 * (_ch))
812
+#define HCINTMSK_RESERVED14_31        (0x3ffff << 14)
813
+#define HCINTMSK_FRM_LIST_ROLL        BIT(13)
814
+#define HCINTMSK_XCS_XACT        BIT(12)
815
+#define HCINTMSK_BNA            BIT(11)
816
+#define HCINTMSK_DATATGLERR        BIT(10)
817
+#define HCINTMSK_FRMOVRUN        BIT(9)
818
+#define HCINTMSK_BBLERR            BIT(8)
819
+#define HCINTMSK_XACTERR        BIT(7)
820
+#define HCINTMSK_NYET            BIT(6)
821
+#define HCINTMSK_ACK            BIT(5)
822
+#define HCINTMSK_NAK            BIT(4)
823
+#define HCINTMSK_STALL            BIT(3)
824
+#define HCINTMSK_AHBERR            BIT(2)
825
+#define HCINTMSK_CHHLTD            BIT(1)
826
+#define HCINTMSK_XFERCOMPL        BIT(0)
827
+
828
+#define HCTSIZ(_ch)            HSOTG_REG(0x0510 + 0x20 * (_ch))
829
+#define TSIZ_DOPNG            BIT(31)
830
+#define TSIZ_SC_MC_PID_MASK        (0x3 << 29)
831
+#define TSIZ_SC_MC_PID_SHIFT        29
832
+#define TSIZ_SC_MC_PID_DATA0        0
833
+#define TSIZ_SC_MC_PID_DATA2        1
834
+#define TSIZ_SC_MC_PID_DATA1        2
835
+#define TSIZ_SC_MC_PID_MDATA        3
836
+#define TSIZ_SC_MC_PID_SETUP        3
837
+#define TSIZ_PKTCNT_MASK        (0x3ff << 19)
838
+#define TSIZ_PKTCNT_SHIFT        19
839
+#define TSIZ_NTD_MASK            (0xff << 8)
840
+#define TSIZ_NTD_SHIFT            8
841
+#define TSIZ_SCHINFO_MASK        (0xff << 0)
842
+#define TSIZ_SCHINFO_SHIFT        0
843
+#define TSIZ_XFERSIZE_MASK        (0x7ffff << 0)
844
+#define TSIZ_XFERSIZE_SHIFT        0
845
+
846
+#define HCDMA(_ch)            HSOTG_REG(0x0514 + 0x20 * (_ch))
847
+
848
+#define HCDMAB(_ch)            HSOTG_REG(0x051c + 0x20 * (_ch))
849
+
850
+#define HCFIFO(_ch)            HSOTG_REG(0x1000 + 0x1000 * (_ch))
851
+
852
+/**
853
+ * struct dwc2_dma_desc - DMA descriptor structure,
854
+ * used for both host and gadget modes
855
+ *
856
+ * @status: DMA descriptor status quadlet
857
+ * @buf: DMA descriptor data buffer pointer
858
+ *
859
+ * DMA Descriptor structure contains two quadlets:
860
+ * Status quadlet and Data buffer pointer.
861
+ */
862
+struct dwc2_dma_desc {
863
+    uint32_t status;
864
+    uint32_t buf;
865
+} __packed;
866
+
867
+/* Host Mode DMA descriptor status quadlet */
868
+
869
+#define HOST_DMA_A            BIT(31)
870
+#define HOST_DMA_STS_MASK        (0x3 << 28)
871
+#define HOST_DMA_STS_SHIFT        28
872
+#define HOST_DMA_STS_PKTERR        BIT(28)
873
+#define HOST_DMA_EOL            BIT(26)
874
+#define HOST_DMA_IOC            BIT(25)
875
+#define HOST_DMA_SUP            BIT(24)
876
+#define HOST_DMA_ALT_QTD        BIT(23)
877
+#define HOST_DMA_QTD_OFFSET_MASK    (0x3f << 17)
878
+#define HOST_DMA_QTD_OFFSET_SHIFT    17
879
+#define HOST_DMA_ISOC_NBYTES_MASK    (0xfff << 0)
880
+#define HOST_DMA_ISOC_NBYTES_SHIFT    0
881
+#define HOST_DMA_NBYTES_MASK        (0x1ffff << 0)
882
+#define HOST_DMA_NBYTES_SHIFT        0
883
+#define HOST_DMA_NBYTES_LIMIT        131071
884
+
885
+/* Device Mode DMA descriptor status quadlet */
886
+
887
+#define DEV_DMA_BUFF_STS_MASK        (0x3 << 30)
888
+#define DEV_DMA_BUFF_STS_SHIFT        30
889
+#define DEV_DMA_BUFF_STS_HREADY        0
890
+#define DEV_DMA_BUFF_STS_DMABUSY    1
891
+#define DEV_DMA_BUFF_STS_DMADONE    2
892
+#define DEV_DMA_BUFF_STS_HBUSY        3
893
+#define DEV_DMA_STS_MASK        (0x3 << 28)
894
+#define DEV_DMA_STS_SHIFT        28
895
+#define DEV_DMA_STS_SUCC        0
896
+#define DEV_DMA_STS_BUFF_FLUSH        1
897
+#define DEV_DMA_STS_BUFF_ERR        3
898
+#define DEV_DMA_L            BIT(27)
899
+#define DEV_DMA_SHORT            BIT(26)
900
+#define DEV_DMA_IOC            BIT(25)
901
+#define DEV_DMA_SR            BIT(24)
902
+#define DEV_DMA_MTRF            BIT(23)
903
+#define DEV_DMA_ISOC_PID_MASK        (0x3 << 23)
904
+#define DEV_DMA_ISOC_PID_SHIFT        23
905
+#define DEV_DMA_ISOC_PID_DATA0        0
906
+#define DEV_DMA_ISOC_PID_DATA2        1
907
+#define DEV_DMA_ISOC_PID_DATA1        2
908
+#define DEV_DMA_ISOC_PID_MDATA        3
909
+#define DEV_DMA_ISOC_FRNUM_MASK        (0x7ff << 12)
910
+#define DEV_DMA_ISOC_FRNUM_SHIFT    12
911
+#define DEV_DMA_ISOC_TX_NBYTES_MASK    (0xfff << 0)
912
+#define DEV_DMA_ISOC_TX_NBYTES_LIMIT    0xfff
913
+#define DEV_DMA_ISOC_RX_NBYTES_MASK    (0x7ff << 0)
914
+#define DEV_DMA_ISOC_RX_NBYTES_LIMIT    0x7ff
915
+#define DEV_DMA_ISOC_NBYTES_SHIFT    0
916
+#define DEV_DMA_NBYTES_MASK        (0xffff << 0)
917
+#define DEV_DMA_NBYTES_SHIFT        0
918
+#define DEV_DMA_NBYTES_LIMIT        0xffff
919
+
920
+#define MAX_DMA_DESC_NUM_GENERIC    64
921
+#define MAX_DMA_DESC_NUM_HS_ISOC    256
922
+
923
+#endif /* __DWC2_HW_H__ */
--
2.17.1

--
2.20.1

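The register fields in dwc2-regs.h consistently follow the *_MASK/*_SHIFT pairing shown above, and the hcd-dwc2.c patch later in this series wraps the same pattern in its get_field()/set_field() macros. A minimal, self-contained sketch of that access pattern, using the HPRT0_SPD field as an example (hprt0_get_speed() and hprt0_set_speed() are illustrative names, not part of the patch):

#include <stdint.h>

#define HPRT0_SPD_MASK        (0x3 << 17)
#define HPRT0_SPD_SHIFT       17
#define HPRT0_SPD_FULL_SPEED  1

static inline uint32_t hprt0_get_speed(uint32_t hprt0)
{
    return (hprt0 & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
}

static inline uint32_t hprt0_set_speed(uint32_t hprt0, uint32_t spd)
{
    hprt0 &= ~HPRT0_SPD_MASK;                           /* clear the field */
    hprt0 |= (spd << HPRT0_SPD_SHIFT) & HPRT0_SPD_MASK; /* insert new value */
    return hprt0;
}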
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
2
3
Add the dwc-hsotg (dwc2) USB host controller state definitions.
4
Mostly based on hw/usb/hcd-ehci.h.
5
6
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
7
Message-id: 20200520235349.21215-4-pauldzim@gmail.com
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-5-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
10
---
8
target/arm/helper-sve.h | 15 ++++++++
11
hw/usb/hcd-dwc2.h | 190 ++++++++++++++++++++++++++++++++++++++++++++++
9
target/arm/sve_helper.c | 72 ++++++++++++++++++++++++++++++++++++
12
1 file changed, 190 insertions(+)
10
target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
13
create mode 100644 hw/usb/hcd-dwc2.h
11
target/arm/sve.decode | 10 +++++
14
12
4 files changed, 172 insertions(+)
15
diff --git a/hw/usb/hcd-dwc2.h b/hw/usb/hcd-dwc2.h
13
16
new file mode 100644
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
17
index XXXXXXX..XXXXXXX
15
index XXXXXXX..XXXXXXX 100644
18
--- /dev/null
16
--- a/target/arm/helper-sve.h
19
+++ b/hw/usb/hcd-dwc2.h
17
+++ b/target/arm/helper-sve.h
20
@@ -XXX,XX +XXX,XX @@
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
20
DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
21
22
+DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
24
+DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
25
+DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
26
+
27
+DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+
32
+DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
33
+DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
34
+DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
35
+DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
36
+
37
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
38
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
39
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
40
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/sve_helper.c
43
+++ b/target/arm/sve_helper.c
44
@@ -XXX,XX +XXX,XX @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc)
45
}
46
}
47
}
48
+
49
+#define DO_ZIP(NAME, TYPE, H) \
50
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
51
+{ \
52
+ intptr_t oprsz = simd_oprsz(desc); \
53
+ intptr_t i, oprsz_2 = oprsz / 2; \
54
+ ARMVectorReg tmp_n, tmp_m; \
55
+ /* We produce output faster than we consume input. \
56
+ Therefore we must be mindful of possible overlap. */ \
57
+ if (unlikely((vn - vd) < (uintptr_t)oprsz)) { \
58
+ vn = memcpy(&tmp_n, vn, oprsz_2); \
59
+ } \
60
+ if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \
61
+ vm = memcpy(&tmp_m, vm, oprsz_2); \
62
+ } \
63
+ for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
64
+ *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \
65
+ *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \
66
+ } \
67
+}
68
+
69
+DO_ZIP(sve_zip_b, uint8_t, H1)
70
+DO_ZIP(sve_zip_h, uint16_t, H1_2)
71
+DO_ZIP(sve_zip_s, uint32_t, H1_4)
72
+DO_ZIP(sve_zip_d, uint64_t, )
73
+
74
+#define DO_UZP(NAME, TYPE, H) \
75
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
76
+{ \
77
+ intptr_t oprsz = simd_oprsz(desc); \
78
+ intptr_t oprsz_2 = oprsz / 2; \
79
+ intptr_t odd_ofs = simd_data(desc); \
80
+ intptr_t i; \
81
+ ARMVectorReg tmp_m; \
82
+ if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \
83
+ vm = memcpy(&tmp_m, vm, oprsz); \
84
+ } \
85
+ for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
86
+ *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \
87
+ } \
88
+ for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
89
+ *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \
90
+ } \
91
+}
92
+
93
+DO_UZP(sve_uzp_b, uint8_t, H1)
94
+DO_UZP(sve_uzp_h, uint16_t, H1_2)
95
+DO_UZP(sve_uzp_s, uint32_t, H1_4)
96
+DO_UZP(sve_uzp_d, uint64_t, )
97
+
98
+#define DO_TRN(NAME, TYPE, H) \
99
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
100
+{ \
101
+ intptr_t oprsz = simd_oprsz(desc); \
102
+ intptr_t odd_ofs = simd_data(desc); \
103
+ intptr_t i; \
104
+ for (i = 0; i < oprsz; i += 2 * sizeof(TYPE)) { \
105
+ TYPE ae = *(TYPE *)(vn + H(i + odd_ofs)); \
106
+ TYPE be = *(TYPE *)(vm + H(i + odd_ofs)); \
107
+ *(TYPE *)(vd + H(i + 0)) = ae; \
108
+ *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \
109
+ } \
110
+}
111
+
112
+DO_TRN(sve_trn_b, uint8_t, H1)
113
+DO_TRN(sve_trn_h, uint16_t, H1_2)
114
+DO_TRN(sve_trn_s, uint32_t, H1_4)
115
+DO_TRN(sve_trn_d, uint64_t, )
116
+
117
+#undef DO_ZIP
118
+#undef DO_UZP
119
+#undef DO_TRN
120
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
121
index XXXXXXX..XXXXXXX 100644
122
--- a/target/arm/translate-sve.c
123
+++ b/target/arm/translate-sve.c
124
@@ -XXX,XX +XXX,XX @@ static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn)
125
return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p);
126
}
127
128
+/*
21
+/*
129
+ *** SVE Permute - Interleaving Group
22
+ * dwc-hsotg (dwc2) USB host controller state definitions
23
+ *
24
+ * Based on hw/usb/hcd-ehci.h
25
+ *
26
+ * Copyright (c) 2020 Paul Zimmerman <pauldzim@gmail.com>
27
+ *
28
+ * This program is free software; you can redistribute it and/or modify
29
+ * it under the terms of the GNU General Public License as published by
30
+ * the Free Software Foundation; either version 2 of the License, or
31
+ * (at your option) any later version.
32
+ *
33
+ * This program is distributed in the hope that it will be useful,
34
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
35
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
36
+ * GNU General Public License for more details.
130
+ */
37
+ */
131
+
38
+
132
+static bool do_zip(DisasContext *s, arg_rrr_esz *a, bool high)
39
+#ifndef HW_USB_DWC2_H
133
+{
40
+#define HW_USB_DWC2_H
134
+ static gen_helper_gvec_3 * const fns[4] = {
41
+
135
+ gen_helper_sve_zip_b, gen_helper_sve_zip_h,
42
+#include "qemu/timer.h"
136
+ gen_helper_sve_zip_s, gen_helper_sve_zip_d,
43
+#include "hw/irq.h"
137
+ };
44
+#include "hw/sysbus.h"
138
+
45
+#include "hw/usb.h"
139
+ if (sve_access_check(s)) {
46
+#include "sysemu/dma.h"
140
+ unsigned vsz = vec_full_reg_size(s);
47
+
141
+ unsigned high_ofs = high ? vsz / 2 : 0;
48
+#define DWC2_MMIO_SIZE 0x11000
142
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
49
+
143
+ vec_full_reg_offset(s, a->rn) + high_ofs,
50
+#define DWC2_NB_CHAN 8 /* Number of host channels */
144
+ vec_full_reg_offset(s, a->rm) + high_ofs,
51
+#define DWC2_MAX_XFER_SIZE 65536 /* Max transfer size expected in HCTSIZ */
145
+ vsz, vsz, 0, fns[a->esz]);
52
+
146
+ }
53
+typedef struct DWC2Packet DWC2Packet;
147
+ return true;
54
+typedef struct DWC2State DWC2State;
148
+}
55
+typedef struct DWC2Class DWC2Class;
149
+
56
+
150
+static bool do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data,
57
+enum async_state {
151
+ gen_helper_gvec_3 *fn)
58
+ DWC2_ASYNC_NONE = 0,
152
+{
59
+ DWC2_ASYNC_INITIALIZED,
153
+ if (sve_access_check(s)) {
60
+ DWC2_ASYNC_INFLIGHT,
154
+ unsigned vsz = vec_full_reg_size(s);
61
+ DWC2_ASYNC_FINISHED,
155
+ tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
62
+};
156
+ vec_full_reg_offset(s, a->rn),
63
+
157
+ vec_full_reg_offset(s, a->rm),
64
+struct DWC2Packet {
158
+ vsz, vsz, data, fn);
65
+ USBPacket packet;
159
+ }
66
+ uint32_t devadr;
160
+ return true;
67
+ uint32_t epnum;
161
+}
68
+ uint32_t epdir;
162
+
69
+ uint32_t mps;
163
+static bool trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
70
+ uint32_t pid;
164
+{
71
+ uint32_t index;
165
+ return do_zip(s, a, false);
72
+ uint32_t pcnt;
166
+}
73
+ uint32_t len;
167
+
74
+ int32_t async;
168
+static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
75
+ bool small;
169
+{
76
+ bool needs_service;
170
+ return do_zip(s, a, true);
77
+};
171
+}
78
+
172
+
79
+struct DWC2State {
173
+static gen_helper_gvec_3 * const uzp_fns[4] = {
80
+ /*< private >*/
174
+ gen_helper_sve_uzp_b, gen_helper_sve_uzp_h,
81
+ SysBusDevice parent_obj;
175
+ gen_helper_sve_uzp_s, gen_helper_sve_uzp_d,
82
+
176
+};
83
+ /*< public >*/
177
+
84
+ USBBus bus;
178
+static bool trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
85
+ qemu_irq irq;
179
+{
86
+ MemoryRegion *dma_mr;
180
+ return do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]);
87
+ AddressSpace dma_as;
181
+}
88
+ MemoryRegion container;
182
+
89
+ MemoryRegion hsotg;
183
+static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
90
+ MemoryRegion fifos;
184
+{
91
+
185
+ return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]);
92
+ union {
186
+}
93
+#define DWC2_GLBREG_SIZE 0x70
187
+
94
+ uint32_t glbreg[DWC2_GLBREG_SIZE / sizeof(uint32_t)];
188
+static gen_helper_gvec_3 * const trn_fns[4] = {
95
+ struct {
189
+ gen_helper_sve_trn_b, gen_helper_sve_trn_h,
96
+ uint32_t gotgctl; /* 00 */
190
+ gen_helper_sve_trn_s, gen_helper_sve_trn_d,
97
+ uint32_t gotgint; /* 04 */
191
+};
98
+ uint32_t gahbcfg; /* 08 */
192
+
99
+ uint32_t gusbcfg; /* 0c */
193
+static bool trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
100
+ uint32_t grstctl; /* 10 */
194
+{
101
+ uint32_t gintsts; /* 14 */
195
+ return do_zzz_data_ool(s, a, 0, trn_fns[a->esz]);
102
+ uint32_t gintmsk; /* 18 */
196
+}
103
+ uint32_t grxstsr; /* 1c */
197
+
104
+ uint32_t grxstsp; /* 20 */
198
+static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
105
+ uint32_t grxfsiz; /* 24 */
199
+{
106
+ uint32_t gnptxfsiz; /* 28 */
200
+ return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
107
+ uint32_t gnptxsts; /* 2c */
201
+}
108
+ uint32_t gi2cctl; /* 30 */
202
+
109
+ uint32_t gpvndctl; /* 34 */
203
/*
110
+ uint32_t ggpio; /* 38 */
204
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
111
+ uint32_t guid; /* 3c */
205
*/
112
+ uint32_t gsnpsid; /* 40 */
206
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
113
+ uint32_t ghwcfg1; /* 44 */
207
index XXXXXXX..XXXXXXX 100644
114
+ uint32_t ghwcfg2; /* 48 */
208
--- a/target/arm/sve.decode
115
+ uint32_t ghwcfg3; /* 4c */
209
+++ b/target/arm/sve.decode
116
+ uint32_t ghwcfg4; /* 50 */
210
@@ -XXX,XX +XXX,XX @@ REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn
117
+ uint32_t glpmcfg; /* 54 */
211
PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0
118
+ uint32_t gpwrdn; /* 58 */
212
PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0
119
+ uint32_t gdfifocfg; /* 5c */
213
120
+ uint32_t gadpctl; /* 60 */
214
+### SVE Permute - Interleaving Group
121
+ uint32_t grefclk; /* 64 */
215
+
122
+ uint32_t gintmsk2; /* 68 */
216
+# SVE permute vector elements
123
+ uint32_t gintsts2; /* 6c */
217
+ZIP1_z 00000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm
124
+ };
218
+ZIP2_z 00000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm
125
+ };
219
+UZP1_z 00000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm
126
+
220
+UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm
127
+ union {
221
+TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm
128
+#define DWC2_FSZREG_SIZE 0x04
222
+TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm
129
+ uint32_t fszreg[DWC2_FSZREG_SIZE / sizeof(uint32_t)];
223
+
130
+ struct {
224
### SVE Predicate Logical Operations Group
131
+ uint32_t hptxfsiz; /* 100 */
225
132
+ };
226
# SVE predicate logical operations
133
+ };
134
+
135
+ union {
136
+#define DWC2_HREG0_SIZE 0x44
137
+ uint32_t hreg0[DWC2_HREG0_SIZE / sizeof(uint32_t)];
138
+ struct {
139
+ uint32_t hcfg; /* 400 */
140
+ uint32_t hfir; /* 404 */
141
+ uint32_t hfnum; /* 408 */
142
+ uint32_t rsvd0; /* 40c */
143
+ uint32_t hptxsts; /* 410 */
144
+ uint32_t haint; /* 414 */
145
+ uint32_t haintmsk; /* 418 */
146
+ uint32_t hflbaddr; /* 41c */
147
+ uint32_t rsvd1[8]; /* 420-43c */
148
+ uint32_t hprt0; /* 440 */
149
+ };
150
+ };
151
+
152
+#define DWC2_HREG1_SIZE (0x20 * DWC2_NB_CHAN)
153
+ uint32_t hreg1[DWC2_HREG1_SIZE / sizeof(uint32_t)];
154
+
155
+#define hcchar(_ch) hreg1[((_ch) << 3) + 0] /* 500, 520, ... */
156
+#define hcsplt(_ch) hreg1[((_ch) << 3) + 1] /* 504, 524, ... */
157
+#define hcint(_ch) hreg1[((_ch) << 3) + 2] /* 508, 528, ... */
158
+#define hcintmsk(_ch) hreg1[((_ch) << 3) + 3] /* 50c, 52c, ... */
159
+#define hctsiz(_ch) hreg1[((_ch) << 3) + 4] /* 510, 530, ... */
160
+#define hcdma(_ch) hreg1[((_ch) << 3) + 5] /* 514, 534, ... */
161
+#define hcdmab(_ch) hreg1[((_ch) << 3) + 7] /* 51c, 53c, ... */
162
+
163
+ union {
164
+#define DWC2_PCGREG_SIZE 0x08
165
+ uint32_t pcgreg[DWC2_PCGREG_SIZE / sizeof(uint32_t)];
166
+ struct {
167
+ uint32_t pcgctl; /* e00 */
168
+ uint32_t pcgcctl1; /* e04 */
169
+ };
170
+ };
171
+
172
+ /* TODO - implement FIFO registers for slave mode */
173
+#define DWC2_HFIFO_SIZE (0x1000 * DWC2_NB_CHAN)
174
+
175
+ /*
176
+ * Internal state
177
+ */
178
+ QEMUTimer *eof_timer;
179
+ QEMUTimer *frame_timer;
180
+ QEMUBH *async_bh;
181
+ int64_t sof_time;
182
+ int64_t usb_frame_time;
183
+ int64_t usb_bit_time;
184
+ uint32_t usb_version;
185
+ uint16_t frame_number;
186
+ uint16_t fi;
187
+ uint16_t next_chan;
188
+ bool working;
189
+ USBPort uport;
190
+ DWC2Packet packet[DWC2_NB_CHAN]; /* one packet per chan */
191
+ uint8_t usb_buf[DWC2_NB_CHAN][DWC2_MAX_XFER_SIZE]; /* one buffer per chan */
192
+};
193
+
194
+struct DWC2Class {
195
+ /*< private >*/
196
+ SysBusDeviceClass parent_class;
197
+ ResettablePhases parent_phases;
198
+
199
+ /*< public >*/
200
+};
201
+
202
+#define TYPE_DWC2_USB "dwc2-usb"
203
+#define DWC2_USB(obj) \
204
+ OBJECT_CHECK(DWC2State, (obj), TYPE_DWC2_USB)
205
+#define DWC2_CLASS(klass) \
206
+ OBJECT_CLASS_CHECK(DWC2Class, (klass), TYPE_DWC2_USB)
207
+#define DWC2_GET_CLASS(obj) \
208
+ OBJECT_GET_CLASS(DWC2Class, (obj), TYPE_DWC2_USB)
209
+
210
+#endif
227
--
2.17.1

--
2.20.1

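The hcchar()/hcsplt()/hcint()/... accessor macros in hcd-dwc2.h above flatten the per-channel host registers (offset 0x500 + 0x20 * channel) into the hreg1[] word array, and the controller code recovers the channel number as index >> 3. A small sketch of that mapping, assuming the layout defined above (hreg1_index() is an illustrative helper, not part of the patch):

#include <assert.h>
#include <stdint.h>

#define DWC2_NB_CHAN 8

/* Map an MMIO offset inside the host-channel block (0x500 onwards,
 * 0x20 bytes = 8 words per channel) to a word index into hreg1[]. */
static inline unsigned hreg1_index(uint32_t addr)
{
    unsigned index = (addr - 0x500) / sizeof(uint32_t);

    assert((index >> 3) < DWC2_NB_CHAN); /* index >> 3 is the channel number */
    return index;                        /* index & 7 selects HCCHAR..HCDMAB */
}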
1
Convert the mcf5206 device away from using the old_mmio field
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
of MemoryRegionOps. This device is used by the an5206 board.
3
2
3
Add the dwc-hsotg (dwc2) USB host controller emulation code.
4
Based on hw/usb/hcd-ehci.c and hw/usb/hcd-ohci.c.
5
6
Note that to use this with the dwc-otg driver in the Raspbian
7
kernel, you must pass the option "dwc_otg.fiq_fsm_enable=0" on
8
the kernel command line.
9
10
Emulation of slave mode and of descriptor-DMA mode has not been
11
implemented yet. These modes are seldom used.
12
13
I have used some on-line sources of information while developing
14
this emulation, including:
15
16
http://www.capital-micro.com/PDF/CME-M7_Family_User_Guide_EN.pdf
17
which has a pretty complete description of the controller starting
18
on page 370.
19
20
https://sourceforge.net/p/wive-ng/wive-ng-mt/ci/master/tree/docs/DataSheets/RT3050_5x_V2.0_081408_0902.pdf
21
which has a description of the controller registers starting on
22
page 130.
23
24
Thanks to Felippe Mathieu-Daude for providing a cleaner method
25
of implementing the memory regions for the controller registers.
26
27
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
28
Message-id: 20200520235349.21215-5-pauldzim@gmail.com
29
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
30
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Acked-by: Thomas Huth <huth@tuxfamily.org>
6
Message-id: 20180601141223.26630-3-peter.maydell@linaro.org
7
---
31
---
8
hw/m68k/mcf5206.c | 48 +++++++++++++++++++++++++++++++++++------------
32
hw/usb/hcd-dwc2.c | 1417 ++++++++++++++++++++++++++++++++++++++++++
9
1 file changed, 36 insertions(+), 12 deletions(-)
33
hw/usb/Kconfig | 5 +
34
hw/usb/Makefile.objs | 1 +
35
hw/usb/trace-events | 50 ++
36
4 files changed, 1473 insertions(+)
37
create mode 100644 hw/usb/hcd-dwc2.c
10
38
11
diff --git a/hw/m68k/mcf5206.c b/hw/m68k/mcf5206.c
39
diff --git a/hw/usb/hcd-dwc2.c b/hw/usb/hcd-dwc2.c
12
index XXXXXXX..XXXXXXX 100644
40
new file mode 100644
13
--- a/hw/m68k/mcf5206.c
41
index XXXXXXX..XXXXXXX
14
+++ b/hw/m68k/mcf5206.c
42
--- /dev/null
15
@@ -XXX,XX +XXX,XX @@ static void m5206_mbar_writel(void *opaque, hwaddr offset,
43
+++ b/hw/usb/hcd-dwc2.c
16
m5206_mbar_write(s, offset, value, 4);
44
@@ -XXX,XX +XXX,XX @@
17
}
45
+/*
18
46
+ * dwc-hsotg (dwc2) USB host controller emulation
19
+static uint64_t m5206_mbar_readfn(void *opaque, hwaddr addr, unsigned size)
47
+ *
20
+{
48
+ * Based on hw/usb/hcd-ehci.c and hw/usb/hcd-ohci.c
21
+ switch (size) {
49
+ *
22
+ case 1:
50
+ * Note that to use this emulation with the dwc-otg driver in the
23
+ return m5206_mbar_readb(opaque, addr);
51
+ * Raspbian kernel, you must pass the option "dwc_otg.fiq_fsm_enable=0"
24
+ case 2:
52
+ * on the kernel command line.
25
+ return m5206_mbar_readw(opaque, addr);
53
+ *
26
+ case 4:
54
+ * Some useful documentation used to develop this emulation can be
27
+ return m5206_mbar_readl(opaque, addr);
55
+ * found online (as of April 2020) at:
56
+ *
57
+ * http://www.capital-micro.com/PDF/CME-M7_Family_User_Guide_EN.pdf
58
+ * which has a pretty complete description of the controller starting
59
+ * on page 370.
60
+ *
61
+ * https://sourceforge.net/p/wive-ng/wive-ng-mt/ci/master/tree/docs/DataSheets/RT3050_5x_V2.0_081408_0902.pdf
62
+ * which has a description of the controller registers starting on
63
+ * page 130.
64
+ *
65
+ * Copyright (c) 2020 Paul Zimmerman <pauldzim@gmail.com>
66
+ *
67
+ * This program is free software; you can redistribute it and/or modify
68
+ * it under the terms of the GNU General Public License as published by
69
+ * the Free Software Foundation; either version 2 of the License, or
70
+ * (at your option) any later version.
71
+ *
72
+ * This program is distributed in the hope that it will be useful,
73
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
74
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
75
+ * GNU General Public License for more details.
76
+ */
77
+
78
+#include "qemu/osdep.h"
79
+#include "qemu/units.h"
80
+#include "qapi/error.h"
81
+#include "hw/usb/dwc2-regs.h"
82
+#include "hw/usb/hcd-dwc2.h"
83
+#include "migration/vmstate.h"
84
+#include "trace.h"
85
+#include "qemu/log.h"
86
+#include "qemu/error-report.h"
87
+#include "qemu/main-loop.h"
88
+#include "hw/qdev-properties.h"
89
+
90
+#define USB_HZ_FS 12000000
91
+#define USB_HZ_HS 96000000
92
+#define USB_FRMINTVL 12000
93
+
94
+/* nifty macros from Arnon's EHCI version */
95
+#define get_field(data, field) \
96
+ (((data) & field##_MASK) >> field##_SHIFT)
97
+
98
+#define set_field(data, newval, field) do { \
99
+ uint32_t val = *(data); \
100
+ val &= ~field##_MASK; \
101
+ val |= ((newval) << field##_SHIFT) & field##_MASK; \
102
+ *(data) = val; \
103
+} while (0)
104
+
105
+#define get_bit(data, bitmask) \
106
+ (!!((data) & (bitmask)))
107
+
108
+/* update irq line */
109
+static inline void dwc2_update_irq(DWC2State *s)
110
+{
111
+ static int oldlevel;
112
+ int level = 0;
113
+
114
+ if ((s->gintsts & s->gintmsk) && (s->gahbcfg & GAHBCFG_GLBL_INTR_EN)) {
115
+ level = 1;
116
+ }
117
+ if (level != oldlevel) {
118
+ oldlevel = level;
119
+ trace_usb_dwc2_update_irq(level);
120
+ qemu_set_irq(s->irq, level);
121
+ }
122
+}
123
+
124
+/* flag interrupt condition */
125
+static inline void dwc2_raise_global_irq(DWC2State *s, uint32_t intr)
126
+{
127
+ if (!(s->gintsts & intr)) {
128
+ s->gintsts |= intr;
129
+ trace_usb_dwc2_raise_global_irq(intr);
130
+ dwc2_update_irq(s);
131
+ }
132
+}
133
+
134
+static inline void dwc2_lower_global_irq(DWC2State *s, uint32_t intr)
135
+{
136
+ if (s->gintsts & intr) {
137
+ s->gintsts &= ~intr;
138
+ trace_usb_dwc2_lower_global_irq(intr);
139
+ dwc2_update_irq(s);
140
+ }
141
+}
142
+
143
+static inline void dwc2_raise_host_irq(DWC2State *s, uint32_t host_intr)
144
+{
145
+ if (!(s->haint & host_intr)) {
146
+ s->haint |= host_intr;
147
+ s->haint &= 0xffff;
148
+ trace_usb_dwc2_raise_host_irq(host_intr);
149
+ if (s->haint & s->haintmsk) {
150
+ dwc2_raise_global_irq(s, GINTSTS_HCHINT);
151
+ }
152
+ }
153
+}
154
+
155
+static inline void dwc2_lower_host_irq(DWC2State *s, uint32_t host_intr)
156
+{
157
+ if (s->haint & host_intr) {
158
+ s->haint &= ~host_intr;
159
+ trace_usb_dwc2_lower_host_irq(host_intr);
160
+ if (!(s->haint & s->haintmsk)) {
161
+ dwc2_lower_global_irq(s, GINTSTS_HCHINT);
162
+ }
163
+ }
164
+}
165
+
166
+static inline void dwc2_update_hc_irq(DWC2State *s, int index)
167
+{
168
+ uint32_t host_intr = 1 << (index >> 3);
169
+
170
+ if (s->hreg1[index + 2] & s->hreg1[index + 3]) {
171
+ dwc2_raise_host_irq(s, host_intr);
172
+ } else {
173
+ dwc2_lower_host_irq(s, host_intr);
174
+ }
175
+}
176
+
177
+/* set a timer for EOF */
178
+static void dwc2_eof_timer(DWC2State *s)
179
+{
180
+ timer_mod(s->eof_timer, s->sof_time + s->usb_frame_time);
181
+}
182
+
183
+/* Set a timer for EOF and generate SOF event */
184
+static void dwc2_sof(DWC2State *s)
185
+{
186
+ s->sof_time += s->usb_frame_time;
187
+ trace_usb_dwc2_sof(s->sof_time);
188
+ dwc2_eof_timer(s);
189
+ dwc2_raise_global_irq(s, GINTSTS_SOF);
190
+}
191
+
192
+/* Do frame processing on frame boundary */
193
+static void dwc2_frame_boundary(void *opaque)
194
+{
195
+ DWC2State *s = opaque;
196
+ int64_t now;
197
+ uint16_t frcnt;
198
+
199
+ now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
200
+
201
+ /* Frame boundary, so do EOF stuff here */
202
+
203
+ /* Increment frame number */
204
+ frcnt = (uint16_t)((now - s->sof_time) / s->fi);
205
+ s->frame_number = (s->frame_number + frcnt) & 0xffff;
206
+ s->hfnum = s->frame_number & HFNUM_MAX_FRNUM;
207
+
208
+ /* Do SOF stuff here */
209
+ dwc2_sof(s);
210
+}
211
+
212
+/* Start sending SOF tokens on the USB bus */
213
+static void dwc2_bus_start(DWC2State *s)
214
+{
215
+ trace_usb_dwc2_bus_start();
216
+ s->sof_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
217
+ dwc2_eof_timer(s);
218
+}
219
+
220
+/* Stop sending SOF tokens on the USB bus */
221
+static void dwc2_bus_stop(DWC2State *s)
222
+{
223
+ trace_usb_dwc2_bus_stop();
224
+ timer_del(s->eof_timer);
225
+}
226
+
227
+static USBDevice *dwc2_find_device(DWC2State *s, uint8_t addr)
228
+{
229
+ USBDevice *dev;
230
+
231
+ trace_usb_dwc2_find_device(addr);
232
+
233
+ if (!(s->hprt0 & HPRT0_ENA)) {
234
+ trace_usb_dwc2_port_disabled(0);
235
+ } else {
236
+ dev = usb_find_device(&s->uport, addr);
237
+ if (dev != NULL) {
238
+ trace_usb_dwc2_device_found(0);
239
+ return dev;
240
+ }
241
+ }
242
+
243
+ trace_usb_dwc2_device_not_found();
244
+ return NULL;
245
+}
246
+
247
+static const char *pstatus[] = {
248
+ "USB_RET_SUCCESS", "USB_RET_NODEV", "USB_RET_NAK", "USB_RET_STALL",
249
+ "USB_RET_BABBLE", "USB_RET_IOERROR", "USB_RET_ASYNC",
250
+ "USB_RET_ADD_TO_QUEUE", "USB_RET_REMOVE_FROM_QUEUE"
251
+};
252
+
253
+static uint32_t pintr[] = {
254
+ HCINTMSK_XFERCOMPL, HCINTMSK_XACTERR, HCINTMSK_NAK, HCINTMSK_STALL,
255
+ HCINTMSK_BBLERR, HCINTMSK_XACTERR, HCINTMSK_XACTERR, HCINTMSK_XACTERR,
256
+ HCINTMSK_XACTERR
257
+};
258
+
259
+static const char *types[] = {
260
+ "Ctrl", "Isoc", "Bulk", "Intr"
261
+};
262
+
263
+static const char *dirs[] = {
264
+ "Out", "In"
265
+};
266
+
267
+static void dwc2_handle_packet(DWC2State *s, uint32_t devadr, USBDevice *dev,
268
+ USBEndpoint *ep, uint32_t index, bool send)
269
+{
270
+ DWC2Packet *p;
271
+ uint32_t hcchar = s->hreg1[index];
272
+ uint32_t hctsiz = s->hreg1[index + 4];
273
+ uint32_t hcdma = s->hreg1[index + 5];
274
+ uint32_t chan, epnum, epdir, eptype, mps, pid, pcnt, len, tlen, intr = 0;
275
+ uint32_t tpcnt, stsidx, actual = 0;
276
+ bool do_intr = false, done = false;
277
+
278
+ epnum = get_field(hcchar, HCCHAR_EPNUM);
279
+ epdir = get_bit(hcchar, HCCHAR_EPDIR);
280
+ eptype = get_field(hcchar, HCCHAR_EPTYPE);
281
+ mps = get_field(hcchar, HCCHAR_MPS);
282
+ pid = get_field(hctsiz, TSIZ_SC_MC_PID);
283
+ pcnt = get_field(hctsiz, TSIZ_PKTCNT);
284
+ len = get_field(hctsiz, TSIZ_XFERSIZE);
285
+ assert(len <= DWC2_MAX_XFER_SIZE);
286
+ chan = index >> 3;
287
+ p = &s->packet[chan];
288
+
289
+ trace_usb_dwc2_handle_packet(chan, dev, &p->packet, epnum, types[eptype],
290
+ dirs[epdir], mps, len, pcnt);
291
+
292
+ if (eptype == USB_ENDPOINT_XFER_CONTROL && pid == TSIZ_SC_MC_PID_SETUP) {
293
+ pid = USB_TOKEN_SETUP;
294
+ } else {
295
+ pid = epdir ? USB_TOKEN_IN : USB_TOKEN_OUT;
296
+ }
297
+
298
+ if (send) {
299
+ tlen = len;
300
+ if (p->small) {
301
+ if (tlen > mps) {
302
+ tlen = mps;
303
+ }
304
+ }
305
+
306
+ if (pid != USB_TOKEN_IN) {
307
+ trace_usb_dwc2_memory_read(hcdma, tlen);
308
+ if (dma_memory_read(&s->dma_as, hcdma,
309
+ s->usb_buf[chan], tlen) != MEMTX_OK) {
310
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: dma_memory_read failed\n",
311
+ __func__);
312
+ }
313
+ }
314
+
315
+ usb_packet_init(&p->packet);
316
+ usb_packet_setup(&p->packet, pid, ep, 0, hcdma,
317
+ pid != USB_TOKEN_IN, true);
318
+ usb_packet_addbuf(&p->packet, s->usb_buf[chan], tlen);
319
+ p->async = DWC2_ASYNC_NONE;
320
+ usb_handle_packet(dev, &p->packet);
321
+ } else {
322
+ tlen = p->len;
323
+ }
324
+
325
+ stsidx = -p->packet.status;
326
+ assert(stsidx < sizeof(pstatus) / sizeof(*pstatus));
327
+ actual = p->packet.actual_length;
328
+ trace_usb_dwc2_packet_status(pstatus[stsidx], actual);
329
+
330
+babble:
331
+ if (p->packet.status != USB_RET_SUCCESS &&
332
+ p->packet.status != USB_RET_NAK &&
333
+ p->packet.status != USB_RET_STALL &&
334
+ p->packet.status != USB_RET_ASYNC) {
335
+ trace_usb_dwc2_packet_error(pstatus[stsidx]);
336
+ }
337
+
338
+ if (p->packet.status == USB_RET_ASYNC) {
339
+ trace_usb_dwc2_async_packet(&p->packet, chan, dev, epnum,
340
+ dirs[epdir], tlen);
341
+ usb_device_flush_ep_queue(dev, ep);
342
+ assert(p->async != DWC2_ASYNC_INFLIGHT);
343
+ p->devadr = devadr;
344
+ p->epnum = epnum;
345
+ p->epdir = epdir;
346
+ p->mps = mps;
347
+ p->pid = pid;
348
+ p->index = index;
349
+ p->pcnt = pcnt;
350
+ p->len = tlen;
351
+ p->async = DWC2_ASYNC_INFLIGHT;
352
+ p->needs_service = false;
353
+ return;
354
+ }
355
+
356
+ if (p->packet.status == USB_RET_SUCCESS) {
357
+ if (actual > tlen) {
358
+ p->packet.status = USB_RET_BABBLE;
359
+ goto babble;
360
+ }
361
+
362
+ if (pid == USB_TOKEN_IN) {
363
+ trace_usb_dwc2_memory_write(hcdma, actual);
364
+ if (dma_memory_write(&s->dma_as, hcdma, s->usb_buf[chan],
365
+ actual) != MEMTX_OK) {
366
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: dma_memory_write failed\n",
367
+ __func__);
368
+ }
369
+ }
370
+
371
+ tpcnt = actual / mps;
372
+ if (actual % mps) {
373
+ tpcnt++;
374
+ if (pid == USB_TOKEN_IN) {
375
+ done = true;
376
+ }
377
+ }
378
+
379
+ pcnt -= tpcnt < pcnt ? tpcnt : pcnt;
380
+ set_field(&hctsiz, pcnt, TSIZ_PKTCNT);
381
+ len -= actual < len ? actual : len;
382
+ set_field(&hctsiz, len, TSIZ_XFERSIZE);
383
+ s->hreg1[index + 4] = hctsiz;
384
+ hcdma += actual;
385
+ s->hreg1[index + 5] = hcdma;
386
+
387
+ if (!pcnt || len == 0 || actual == 0) {
388
+ done = true;
389
+ }
390
+ } else {
391
+ intr |= pintr[stsidx];
392
+ if (p->packet.status == USB_RET_NAK &&
393
+ (eptype == USB_ENDPOINT_XFER_CONTROL ||
394
+ eptype == USB_ENDPOINT_XFER_BULK)) {
395
+ /*
396
+ * for ctrl/bulk, automatically retry on NAK,
397
+ * but send the interrupt anyway
398
+ */
399
+ intr &= ~HCINTMSK_RESERVED14_31;
400
+ s->hreg1[index + 2] |= intr;
401
+ do_intr = true;
402
+ } else {
403
+ intr |= HCINTMSK_CHHLTD;
404
+ done = true;
405
+ }
406
+ }
407
+
408
+ usb_packet_cleanup(&p->packet);
409
+
410
+ if (done) {
411
+ hcchar &= ~HCCHAR_CHENA;
412
+ s->hreg1[index] = hcchar;
413
+ if (!(intr & HCINTMSK_CHHLTD)) {
414
+ intr |= HCINTMSK_CHHLTD | HCINTMSK_XFERCOMPL;
415
+ }
416
+ intr &= ~HCINTMSK_RESERVED14_31;
417
+ s->hreg1[index + 2] |= intr;
418
+ p->needs_service = false;
419
+ trace_usb_dwc2_packet_done(pstatus[stsidx], actual, len, pcnt);
420
+ dwc2_update_hc_irq(s, index);
421
+ return;
422
+ }
423
+
424
+ p->devadr = devadr;
425
+ p->epnum = epnum;
426
+ p->epdir = epdir;
427
+ p->mps = mps;
428
+ p->pid = pid;
429
+ p->index = index;
430
+ p->pcnt = pcnt;
431
+ p->len = len;
432
+ p->needs_service = true;
433
+ trace_usb_dwc2_packet_next(pstatus[stsidx], len, pcnt);
434
+ if (do_intr) {
435
+ dwc2_update_hc_irq(s, index);
436
+ }
437
+}
438
+
439
+/* Attach or detach a device on root hub */
440
+
441
+static const char *speeds[] = {
442
+ "low", "full", "high"
443
+};
444
+
445
+static void dwc2_attach(USBPort *port)
446
+{
447
+ DWC2State *s = port->opaque;
448
+ int hispd = 0;
449
+
450
+ trace_usb_dwc2_attach(port);
451
+ assert(port->index == 0);
452
+
453
+ if (!port->dev || !port->dev->attached) {
454
+ return;
455
+ }
456
+
457
+ assert(port->dev->speed <= USB_SPEED_HIGH);
458
+ trace_usb_dwc2_attach_speed(speeds[port->dev->speed]);
459
+ s->hprt0 &= ~HPRT0_SPD_MASK;
460
+
461
+ switch (port->dev->speed) {
462
+ case USB_SPEED_LOW:
463
+ s->hprt0 |= HPRT0_SPD_LOW_SPEED << HPRT0_SPD_SHIFT;
464
+ break;
465
+ case USB_SPEED_FULL:
466
+ s->hprt0 |= HPRT0_SPD_FULL_SPEED << HPRT0_SPD_SHIFT;
467
+ break;
468
+ case USB_SPEED_HIGH:
469
+ s->hprt0 |= HPRT0_SPD_HIGH_SPEED << HPRT0_SPD_SHIFT;
470
+ hispd = 1;
471
+ break;
472
+ }
473
+
474
+ if (hispd) {
475
+ s->usb_frame_time = NANOSECONDS_PER_SECOND / 8000; /* 125000 */
476
+ if (NANOSECONDS_PER_SECOND >= USB_HZ_HS) {
477
+ s->usb_bit_time = NANOSECONDS_PER_SECOND / USB_HZ_HS; /* 10.4 */
478
+ } else {
479
+ s->usb_bit_time = 1;
480
+ }
481
+ } else {
482
+ s->usb_frame_time = NANOSECONDS_PER_SECOND / 1000; /* 1000000 */
483
+ if (NANOSECONDS_PER_SECOND >= USB_HZ_FS) {
484
+ s->usb_bit_time = NANOSECONDS_PER_SECOND / USB_HZ_FS; /* 83.3 */
485
+ } else {
486
+ s->usb_bit_time = 1;
487
+ }
488
+ }
489
+
490
+ s->fi = USB_FRMINTVL - 1;
491
+ s->hprt0 |= HPRT0_CONNDET | HPRT0_CONNSTS;
492
+
493
+ dwc2_bus_start(s);
494
+ dwc2_raise_global_irq(s, GINTSTS_PRTINT);
495
+}
496
+
497
+static void dwc2_detach(USBPort *port)
498
+{
499
+ DWC2State *s = port->opaque;
500
+
501
+ trace_usb_dwc2_detach(port);
502
+ assert(port->index == 0);
503
+
504
+ dwc2_bus_stop(s);
505
+
506
+ s->hprt0 &= ~(HPRT0_SPD_MASK | HPRT0_SUSP | HPRT0_ENA | HPRT0_CONNSTS);
507
+ s->hprt0 |= HPRT0_CONNDET | HPRT0_ENACHG;
508
+
509
+ dwc2_raise_global_irq(s, GINTSTS_PRTINT);
510
+}
511
+
512
+static void dwc2_child_detach(USBPort *port, USBDevice *child)
513
+{
514
+ trace_usb_dwc2_child_detach(port, child);
515
+ assert(port->index == 0);
516
+}
517
+
518
+static void dwc2_wakeup(USBPort *port)
519
+{
520
+ DWC2State *s = port->opaque;
521
+
522
+ trace_usb_dwc2_wakeup(port);
523
+ assert(port->index == 0);
524
+
525
+ if (s->hprt0 & HPRT0_SUSP) {
526
+ s->hprt0 |= HPRT0_RES;
527
+ dwc2_raise_global_irq(s, GINTSTS_PRTINT);
528
+ }
529
+
530
+ qemu_bh_schedule(s->async_bh);
531
+}
532
+
533
+static void dwc2_async_packet_complete(USBPort *port, USBPacket *packet)
534
+{
535
+ DWC2State *s = port->opaque;
536
+ DWC2Packet *p;
537
+ USBDevice *dev;
538
+ USBEndpoint *ep;
539
+
540
+ assert(port->index == 0);
541
+ p = container_of(packet, DWC2Packet, packet);
542
+ dev = dwc2_find_device(s, p->devadr);
543
+ ep = usb_ep_get(dev, p->pid, p->epnum);
544
+ trace_usb_dwc2_async_packet_complete(port, packet, p->index >> 3, dev,
545
+ p->epnum, dirs[p->epdir], p->len);
546
+ assert(p->async == DWC2_ASYNC_INFLIGHT);
547
+
548
+ if (packet->status == USB_RET_REMOVE_FROM_QUEUE) {
549
+ usb_cancel_packet(packet);
550
+ usb_packet_cleanup(packet);
551
+ return;
552
+ }
553
+
554
+ dwc2_handle_packet(s, p->devadr, dev, ep, p->index, false);
555
+
556
+ p->async = DWC2_ASYNC_FINISHED;
557
+ qemu_bh_schedule(s->async_bh);
558
+}
559
+
560
+static USBPortOps dwc2_port_ops = {
561
+ .attach = dwc2_attach,
562
+ .detach = dwc2_detach,
563
+ .child_detach = dwc2_child_detach,
564
+ .wakeup = dwc2_wakeup,
565
+ .complete = dwc2_async_packet_complete,
566
+};
567
+
568
+static uint32_t dwc2_get_frame_remaining(DWC2State *s)
569
+{
570
+ uint32_t fr = 0;
571
+ int64_t tks;
572
+
573
+ tks = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->sof_time;
574
+ if (tks < 0) {
575
+ tks = 0;
576
+ }
577
+
578
+ /* avoid muldiv if possible */
579
+ if (tks >= s->usb_frame_time) {
580
+ goto out;
581
+ }
582
+ if (tks < s->usb_bit_time) {
583
+ fr = s->fi;
584
+ goto out;
585
+ }
586
+
587
+ /* tks = number of ns since SOF, divided by 83 (fs) or 10 (hs) */
588
+ tks = tks / s->usb_bit_time;
589
+ if (tks >= (int64_t)s->fi) {
590
+ goto out;
591
+ }
592
+
593
+ /* remaining = frame interval minus tks */
594
+ fr = (uint32_t)((int64_t)s->fi - tks);
595
+
596
+out:
597
+ return fr;
598
+}
599
+
600
+static void dwc2_work_bh(void *opaque)
601
+{
602
+ DWC2State *s = opaque;
603
+ DWC2Packet *p;
604
+ USBDevice *dev;
605
+ USBEndpoint *ep;
606
+ int64_t t_now, expire_time;
607
+ int chan;
608
+ bool found = false;
609
+
610
+ trace_usb_dwc2_work_bh();
611
+ if (s->working) {
612
+ return;
613
+ }
614
+ s->working = true;
615
+
616
+ t_now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
617
+ chan = s->next_chan;
618
+
619
+ do {
620
+ p = &s->packet[chan];
621
+ if (p->needs_service) {
622
+ dev = dwc2_find_device(s, p->devadr);
623
+ ep = usb_ep_get(dev, p->pid, p->epnum);
624
+ trace_usb_dwc2_work_bh_service(s->next_chan, chan, dev, p->epnum);
625
+ dwc2_handle_packet(s, p->devadr, dev, ep, p->index, true);
626
+ found = true;
627
+ }
628
+ if (++chan == DWC2_NB_CHAN) {
629
+ chan = 0;
630
+ }
631
+ if (found) {
632
+ s->next_chan = chan;
633
+ trace_usb_dwc2_work_bh_next(chan);
634
+ }
635
+ } while (chan != s->next_chan);
636
+
637
+ if (found) {
638
+ expire_time = t_now + NANOSECONDS_PER_SECOND / 4000;
639
+ timer_mod(s->frame_timer, expire_time);
640
+ }
641
+ s->working = false;
642
+}
643
+
644
+static void dwc2_enable_chan(DWC2State *s, uint32_t index)
645
+{
646
+ USBDevice *dev;
647
+ USBEndpoint *ep;
648
+ uint32_t hcchar;
649
+ uint32_t hctsiz;
650
+ uint32_t devadr, epnum, epdir, eptype, pid, len;
651
+ DWC2Packet *p;
652
+
653
+ assert((index >> 3) < DWC2_NB_CHAN);
654
+ p = &s->packet[index >> 3];
655
+ hcchar = s->hreg1[index];
656
+ hctsiz = s->hreg1[index + 4];
657
+ devadr = get_field(hcchar, HCCHAR_DEVADDR);
658
+ epnum = get_field(hcchar, HCCHAR_EPNUM);
659
+ epdir = get_bit(hcchar, HCCHAR_EPDIR);
660
+ eptype = get_field(hcchar, HCCHAR_EPTYPE);
661
+ pid = get_field(hctsiz, TSIZ_SC_MC_PID);
662
+ len = get_field(hctsiz, TSIZ_XFERSIZE);
663
+
664
+ dev = dwc2_find_device(s, devadr);
665
+
666
+ trace_usb_dwc2_enable_chan(index >> 3, dev, &p->packet, epnum);
667
+ if (dev == NULL) {
668
+ return;
669
+ }
670
+
671
+ if (eptype == USB_ENDPOINT_XFER_CONTROL && pid == TSIZ_SC_MC_PID_SETUP) {
672
+ pid = USB_TOKEN_SETUP;
673
+ } else {
674
+ pid = epdir ? USB_TOKEN_IN : USB_TOKEN_OUT;
675
+ }
676
+
677
+ ep = usb_ep_get(dev, pid, epnum);
678
+
679
+ /*
680
+ * Hack: Networking doesn't like us delivering large transfers, it kind
681
+ * of works but the latency is horrible. So if the transfer is <= the mtu
682
+ * size, we take that as a hint that this might be a network transfer,
683
+ * and do the transfer packet-by-packet.
684
+ */
685
+ if (len > 1536) {
686
+ p->small = false;
687
+ } else {
688
+ p->small = true;
689
+ }
690
+
691
+ dwc2_handle_packet(s, devadr, dev, ep, index, true);
692
+ qemu_bh_schedule(s->async_bh);
693
+}
694
+
695
+static const char *glbregnm[] = {
696
+ "GOTGCTL ", "GOTGINT ", "GAHBCFG ", "GUSBCFG ", "GRSTCTL ",
697
+ "GINTSTS ", "GINTMSK ", "GRXSTSR ", "GRXSTSP ", "GRXFSIZ ",
698
+ "GNPTXFSIZ", "GNPTXSTS ", "GI2CCTL ", "GPVNDCTL ", "GGPIO ",
699
+ "GUID ", "GSNPSID ", "GHWCFG1 ", "GHWCFG2 ", "GHWCFG3 ",
700
+ "GHWCFG4 ", "GLPMCFG ", "GPWRDN ", "GDFIFOCFG", "GADPCTL ",
701
+ "GREFCLK ", "GINTMSK2 ", "GINTSTS2 "
702
+};
703
+
704
+static uint64_t dwc2_glbreg_read(void *ptr, hwaddr addr, int index,
705
+ unsigned size)
706
+{
707
+ DWC2State *s = ptr;
708
+ uint32_t val;
709
+
710
+ assert(addr <= GINTSTS2);
711
+ val = s->glbreg[index];
712
+
713
+ switch (addr) {
714
+ case GRSTCTL:
715
+ /* clear any self-clearing bits that were set */
716
+ val &= ~(GRSTCTL_TXFFLSH | GRSTCTL_RXFFLSH | GRSTCTL_IN_TKNQ_FLSH |
717
+ GRSTCTL_FRMCNTRRST | GRSTCTL_HSFTRST | GRSTCTL_CSFTRST);
718
+ s->glbreg[index] = val;
719
+ break;
720
+ default:
721
+ break;
722
+ }
723
+
724
+ trace_usb_dwc2_glbreg_read(addr, glbregnm[index], val);
725
+ return val;
726
+}
727
+
728
+static void dwc2_glbreg_write(void *ptr, hwaddr addr, int index, uint64_t val,
729
+ unsigned size)
730
+{
731
+ DWC2State *s = ptr;
732
+ uint64_t orig = val;
733
+ uint32_t *mmio;
734
+ uint32_t old;
735
+ int iflg = 0;
736
+
737
+ assert(addr <= GINTSTS2);
738
+ mmio = &s->glbreg[index];
739
+ old = *mmio;
740
+
741
+ switch (addr) {
742
+ case GOTGCTL:
743
+ /* don't allow setting of read-only bits */
744
+ val &= ~(GOTGCTL_MULT_VALID_BC_MASK | GOTGCTL_BSESVLD |
745
+ GOTGCTL_ASESVLD | GOTGCTL_DBNC_SHORT | GOTGCTL_CONID_B |
746
+ GOTGCTL_HSTNEGSCS | GOTGCTL_SESREQSCS);
747
+ /* don't allow clearing of read-only bits */
748
+ val |= old & (GOTGCTL_MULT_VALID_BC_MASK | GOTGCTL_BSESVLD |
749
+ GOTGCTL_ASESVLD | GOTGCTL_DBNC_SHORT | GOTGCTL_CONID_B |
750
+ GOTGCTL_HSTNEGSCS | GOTGCTL_SESREQSCS);
751
+ break;
752
+ case GAHBCFG:
753
+ if ((val & GAHBCFG_GLBL_INTR_EN) && !(old & GAHBCFG_GLBL_INTR_EN)) {
754
+ iflg = 1;
755
+ }
756
+ break;
757
+ case GRSTCTL:
758
+ val |= GRSTCTL_AHBIDLE;
759
+ val &= ~GRSTCTL_DMAREQ;
760
+ if (!(old & GRSTCTL_TXFFLSH) && (val & GRSTCTL_TXFFLSH)) {
761
+ /* TODO - TX fifo flush */
762
+ qemu_log_mask(LOG_UNIMP, "Tx FIFO flush not implemented\n");
763
+ }
764
+ if (!(old & GRSTCTL_RXFFLSH) && (val & GRSTCTL_RXFFLSH)) {
765
+ /* TODO - RX fifo flush */
766
+ qemu_log_mask(LOG_UNIMP, "Rx FIFO flush not implemented\n");
767
+ }
768
+ if (!(old & GRSTCTL_IN_TKNQ_FLSH) && (val & GRSTCTL_IN_TKNQ_FLSH)) {
769
+ /* TODO - device IN token queue flush */
770
+ qemu_log_mask(LOG_UNIMP, "Token queue flush not implemented\n");
771
+ }
772
+ if (!(old & GRSTCTL_FRMCNTRRST) && (val & GRSTCTL_FRMCNTRRST)) {
773
+ /* TODO - host frame counter reset */
774
+ qemu_log_mask(LOG_UNIMP, "Frame counter reset not implemented\n");
775
+ }
776
+ if (!(old & GRSTCTL_HSFTRST) && (val & GRSTCTL_HSFTRST)) {
777
+ /* TODO - host soft reset */
778
+ qemu_log_mask(LOG_UNIMP, "Host soft reset not implemented\n");
779
+ }
780
+ if (!(old & GRSTCTL_CSFTRST) && (val & GRSTCTL_CSFTRST)) {
781
+ /* TODO - core soft reset */
782
+ qemu_log_mask(LOG_UNIMP, "Core soft reset not implemented\n");
783
+ }
784
+ /* don't allow clearing of self-clearing bits */
785
+ val |= old & (GRSTCTL_TXFFLSH | GRSTCTL_RXFFLSH |
786
+ GRSTCTL_IN_TKNQ_FLSH | GRSTCTL_FRMCNTRRST |
787
+ GRSTCTL_HSFTRST | GRSTCTL_CSFTRST);
788
+ break;
789
+ case GINTSTS:
790
+ /* clear the write-1-to-clear bits */
791
+ val |= ~old;
792
+ val = ~val;
793
+ /* don't allow clearing of read-only bits */
794
+ val |= old & (GINTSTS_PTXFEMP | GINTSTS_HCHINT | GINTSTS_PRTINT |
795
+ GINTSTS_OEPINT | GINTSTS_IEPINT | GINTSTS_GOUTNAKEFF |
796
+ GINTSTS_GINNAKEFF | GINTSTS_NPTXFEMP | GINTSTS_RXFLVL |
797
+ GINTSTS_OTGINT | GINTSTS_CURMODE_HOST);
798
+ iflg = 1;
799
+ break;
800
+ case GINTMSK:
801
+ iflg = 1;
802
+ break;
803
+ default:
804
+ break;
805
+ }
806
+
807
+ trace_usb_dwc2_glbreg_write(addr, glbregnm[index], orig, old, val);
808
+ *mmio = val;
809
+
810
+ if (iflg) {
811
+ dwc2_update_irq(s);
812
+ }
813
+}
814
+
815
+static uint64_t dwc2_fszreg_read(void *ptr, hwaddr addr, int index,
816
+ unsigned size)
817
+{
818
+ DWC2State *s = ptr;
819
+ uint32_t val;
820
+
821
+ assert(addr == HPTXFSIZ);
822
+ val = s->fszreg[index];
823
+
824
+ trace_usb_dwc2_fszreg_read(addr, val);
825
+ return val;
826
+}
827
+
828
+static void dwc2_fszreg_write(void *ptr, hwaddr addr, int index, uint64_t val,
829
+ unsigned size)
830
+{
831
+ DWC2State *s = ptr;
832
+ uint64_t orig = val;
833
+ uint32_t *mmio;
834
+ uint32_t old;
835
+
836
+ assert(addr == HPTXFSIZ);
837
+ mmio = &s->fszreg[index];
838
+ old = *mmio;
839
+
840
+ trace_usb_dwc2_fszreg_write(addr, orig, old, val);
841
+ *mmio = val;
842
+}
843
+
844
+static const char *hreg0nm[] = {
845
+ "HCFG ", "HFIR ", "HFNUM ", "<rsvd> ", "HPTXSTS ",
846
+ "HAINT ", "HAINTMSK ", "HFLBADDR ", "<rsvd> ", "<rsvd> ",
847
+ "<rsvd> ", "<rsvd> ", "<rsvd> ", "<rsvd> ", "<rsvd> ",
848
+ "<rsvd> ", "HPRT0 "
849
+};
850
+
851
+static uint64_t dwc2_hreg0_read(void *ptr, hwaddr addr, int index,
852
+ unsigned size)
853
+{
854
+ DWC2State *s = ptr;
855
+ uint32_t val;
856
+
857
+ assert(addr >= HCFG && addr <= HPRT0);
858
+ val = s->hreg0[index];
859
+
860
+ switch (addr) {
861
+ case HFNUM:
862
+ val = (dwc2_get_frame_remaining(s) << HFNUM_FRREM_SHIFT) |
863
+ (s->hfnum << HFNUM_FRNUM_SHIFT);
864
+ break;
865
+ default:
866
+ break;
867
+ }
868
+
869
+ trace_usb_dwc2_hreg0_read(addr, hreg0nm[index], val);
870
+ return val;
871
+}
872
+
873
+static void dwc2_hreg0_write(void *ptr, hwaddr addr, int index, uint64_t val,
874
+ unsigned size)
875
+{
876
+ DWC2State *s = ptr;
877
+ USBDevice *dev = s->uport.dev;
878
+ uint64_t orig = val;
879
+ uint32_t *mmio;
880
+ uint32_t tval, told, old;
881
+ int prst = 0;
882
+ int iflg = 0;
883
+
884
+ assert(addr >= HCFG && addr <= HPRT0);
885
+ mmio = &s->hreg0[index];
886
+ old = *mmio;
887
+
888
+ switch (addr) {
889
+ case HFIR:
890
+ break;
891
+ case HFNUM:
892
+ case HPTXSTS:
893
+ case HAINT:
894
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: write to read-only register\n",
895
+ __func__);
896
+ return;
897
+ case HAINTMSK:
898
+ val &= 0xffff;
899
+ break;
900
+ case HPRT0:
901
+ /* don't allow clearing of read-only bits */
902
+ val |= old & (HPRT0_SPD_MASK | HPRT0_LNSTS_MASK | HPRT0_OVRCURRACT |
903
+ HPRT0_CONNSTS);
904
+ /* don't allow clearing of self-clearing bits */
905
+ val |= old & (HPRT0_SUSP | HPRT0_RES);
906
+ /* don't allow setting of self-setting bits */
907
+ if (!(old & HPRT0_ENA) && (val & HPRT0_ENA)) {
908
+ val &= ~HPRT0_ENA;
909
+ }
910
+ /* clear the write-1-to-clear bits */
911
+ tval = val & (HPRT0_OVRCURRCHG | HPRT0_ENACHG | HPRT0_ENA |
912
+ HPRT0_CONNDET);
913
+ told = old & (HPRT0_OVRCURRCHG | HPRT0_ENACHG | HPRT0_ENA |
914
+ HPRT0_CONNDET);
915
+ tval |= ~told;
916
+ tval = ~tval;
917
+ tval &= (HPRT0_OVRCURRCHG | HPRT0_ENACHG | HPRT0_ENA |
918
+ HPRT0_CONNDET);
919
+ val &= ~(HPRT0_OVRCURRCHG | HPRT0_ENACHG | HPRT0_ENA |
920
+ HPRT0_CONNDET);
921
+ val |= tval;
922
+ if (!(val & HPRT0_RST) && (old & HPRT0_RST)) {
923
+ if (dev && dev->attached) {
924
+ val |= HPRT0_ENA | HPRT0_ENACHG;
925
+ prst = 1;
926
+ }
927
+ }
928
+ if (val & (HPRT0_OVRCURRCHG | HPRT0_ENACHG | HPRT0_CONNDET)) {
929
+ iflg = 1;
930
+ } else {
931
+ iflg = -1;
932
+ }
933
+ break;
934
+ default:
935
+ break;
936
+ }
937
+
938
+ if (prst) {
939
+ trace_usb_dwc2_hreg0_write(addr, hreg0nm[index], orig, old,
940
+ val & ~HPRT0_CONNDET);
941
+ trace_usb_dwc2_hreg0_action("call usb_port_reset");
942
+ usb_port_reset(&s->uport);
943
+ val &= ~HPRT0_CONNDET;
944
+ } else {
945
+ trace_usb_dwc2_hreg0_write(addr, hreg0nm[index], orig, old, val);
946
+ }
947
+
948
+ *mmio = val;
949
+
950
+ if (iflg > 0) {
951
+ trace_usb_dwc2_hreg0_action("enable PRTINT");
952
+ dwc2_raise_global_irq(s, GINTSTS_PRTINT);
953
+ } else if (iflg < 0) {
954
+ trace_usb_dwc2_hreg0_action("disable PRTINT");
955
+ dwc2_lower_global_irq(s, GINTSTS_PRTINT);
956
+ }
957
+}
958
+
959
+static const char *hreg1nm[] = {
960
+ "HCCHAR ", "HCSPLT ", "HCINT ", "HCINTMSK", "HCTSIZ ", "HCDMA ",
961
+ "<rsvd> ", "HCDMAB "
962
+};
963
+
964
+static uint64_t dwc2_hreg1_read(void *ptr, hwaddr addr, int index,
965
+ unsigned size)
966
+{
967
+ DWC2State *s = ptr;
968
+ uint32_t val;
969
+
970
+ assert(addr >= HCCHAR(0) && addr <= HCDMAB(DWC2_NB_CHAN - 1));
971
+ val = s->hreg1[index];
972
+
973
+ trace_usb_dwc2_hreg1_read(addr, hreg1nm[index & 7], addr >> 5, val);
974
+ return val;
975
+}
976
+
977
+static void dwc2_hreg1_write(void *ptr, hwaddr addr, int index, uint64_t val,
978
+ unsigned size)
979
+{
980
+ DWC2State *s = ptr;
981
+ uint64_t orig = val;
982
+ uint32_t *mmio;
983
+ uint32_t old;
984
+ int iflg = 0;
985
+ int enflg = 0;
986
+ int disflg = 0;
987
+
988
+ assert(addr >= HCCHAR(0) && addr <= HCDMAB(DWC2_NB_CHAN - 1));
989
+ mmio = &s->hreg1[index];
990
+ old = *mmio;
991
+
992
+ switch (HSOTG_REG(0x500) + (addr & 0x1c)) {
993
+ case HCCHAR(0):
994
+ if ((val & HCCHAR_CHDIS) && !(old & HCCHAR_CHDIS)) {
995
+ val &= ~(HCCHAR_CHENA | HCCHAR_CHDIS);
996
+ disflg = 1;
997
+ } else {
998
+ val |= old & HCCHAR_CHDIS;
999
+ if ((val & HCCHAR_CHENA) && !(old & HCCHAR_CHENA)) {
1000
+ val &= ~HCCHAR_CHDIS;
1001
+ enflg = 1;
1002
+ } else {
1003
+ val |= old & HCCHAR_CHENA;
1004
+ }
1005
+ }
1006
+ break;
1007
+ case HCINT(0):
1008
+ /* clear the write-1-to-clear bits */
1009
+ val |= ~old;
1010
+ val = ~val;
1011
+ val &= ~HCINTMSK_RESERVED14_31;
1012
+ iflg = 1;
1013
+ break;
1014
+ case HCINTMSK(0):
1015
+ val &= ~HCINTMSK_RESERVED14_31;
1016
+ iflg = 1;
1017
+ break;
1018
+ case HCDMAB(0):
1019
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: write to read-only register\n",
1020
+ __func__);
1021
+ return;
1022
+ default:
1023
+ break;
1024
+ }
1025
+
1026
+ trace_usb_dwc2_hreg1_write(addr, hreg1nm[index & 7], index >> 3, orig,
1027
+ old, val);
1028
+ *mmio = val;
1029
+
1030
+ if (disflg) {
1031
+ /* set ChHltd in HCINT */
1032
+ s->hreg1[(index & ~7) + 2] |= HCINTMSK_CHHLTD;
1033
+ iflg = 1;
1034
+ }
1035
+
1036
+ if (enflg) {
1037
+ dwc2_enable_chan(s, index & ~7);
1038
+ }
1039
+
1040
+ if (iflg) {
1041
+ dwc2_update_hc_irq(s, index & ~7);
1042
+ }
1043
+}
1044
+
1045
+static const char *pcgregnm[] = {
1046
+ "PCGCTL ", "PCGCCTL1 "
1047
+};
1048
+
1049
+static uint64_t dwc2_pcgreg_read(void *ptr, hwaddr addr, int index,
1050
+ unsigned size)
1051
+{
1052
+ DWC2State *s = ptr;
1053
+ uint32_t val;
1054
+
1055
+ assert(addr >= PCGCTL && addr <= PCGCCTL1);
1056
+ val = s->pcgreg[index];
1057
+
1058
+ trace_usb_dwc2_pcgreg_read(addr, pcgregnm[index], val);
1059
+ return val;
1060
+}
1061
+
1062
+static void dwc2_pcgreg_write(void *ptr, hwaddr addr, int index,
1063
+ uint64_t val, unsigned size)
1064
+{
1065
+ DWC2State *s = ptr;
1066
+ uint64_t orig = val;
1067
+ uint32_t *mmio;
1068
+ uint32_t old;
1069
+
1070
+ assert(addr >= PCGCTL && addr <= PCGCCTL1);
1071
+ mmio = &s->pcgreg[index];
1072
+ old = *mmio;
1073
+
1074
+ trace_usb_dwc2_pcgreg_write(addr, pcgregnm[index], orig, old, val);
1075
+ *mmio = val;
1076
+}
1077
+
1078
+static uint64_t dwc2_hsotg_read(void *ptr, hwaddr addr, unsigned size)
1079
+{
1080
+ uint64_t val;
1081
+
1082
+ switch (addr) {
1083
+ case HSOTG_REG(0x000) ... HSOTG_REG(0x0fc):
1084
+ val = dwc2_glbreg_read(ptr, addr, (addr - HSOTG_REG(0x000)) >> 2, size);
1085
+ break;
1086
+ case HSOTG_REG(0x100):
1087
+ val = dwc2_fszreg_read(ptr, addr, (addr - HSOTG_REG(0x100)) >> 2, size);
1088
+ break;
1089
+ case HSOTG_REG(0x104) ... HSOTG_REG(0x3fc):
1090
+ /* Gadget-mode registers, just return 0 for now */
1091
+ val = 0;
1092
+ break;
1093
+ case HSOTG_REG(0x400) ... HSOTG_REG(0x4fc):
1094
+ val = dwc2_hreg0_read(ptr, addr, (addr - HSOTG_REG(0x400)) >> 2, size);
1095
+ break;
1096
+ case HSOTG_REG(0x500) ... HSOTG_REG(0x7fc):
1097
+ val = dwc2_hreg1_read(ptr, addr, (addr - HSOTG_REG(0x500)) >> 2, size);
1098
+ break;
1099
+ case HSOTG_REG(0x800) ... HSOTG_REG(0xdfc):
1100
+ /* Gadget-mode registers, just return 0 for now */
1101
+ val = 0;
1102
+ break;
1103
+ case HSOTG_REG(0xe00) ... HSOTG_REG(0xffc):
1104
+ val = dwc2_pcgreg_read(ptr, addr, (addr - HSOTG_REG(0xe00)) >> 2, size);
1105
+ break;
28
+ default:
1106
+ default:
29
+ g_assert_not_reached();
1107
+ g_assert_not_reached();
30
+ }
1108
+ }
31
+}
1109
+
32
+
1110
+ return val;
33
+static void m5206_mbar_writefn(void *opaque, hwaddr addr,
1111
+}
34
+ uint64_t value, unsigned size)
1112
+
35
+{
1113
+static void dwc2_hsotg_write(void *ptr, hwaddr addr, uint64_t val,
36
+ switch (size) {
1114
+ unsigned size)
37
+ case 1:
1115
+{
38
+ m5206_mbar_writeb(opaque, addr, value);
1116
+ switch (addr) {
39
+ break;
1117
+ case HSOTG_REG(0x000) ... HSOTG_REG(0x0fc):
40
+ case 2:
1118
+ dwc2_glbreg_write(ptr, addr, (addr - HSOTG_REG(0x000)) >> 2, val, size);
41
+ m5206_mbar_writew(opaque, addr, value);
1119
+ break;
42
+ break;
1120
+ case HSOTG_REG(0x100):
43
+ case 4:
1121
+ dwc2_fszreg_write(ptr, addr, (addr - HSOTG_REG(0x100)) >> 2, val, size);
44
+ m5206_mbar_writel(opaque, addr, value);
1122
+ break;
1123
+ case HSOTG_REG(0x104) ... HSOTG_REG(0x3fc):
1124
+ /* Gadget-mode registers, do nothing for now */
1125
+ break;
1126
+ case HSOTG_REG(0x400) ... HSOTG_REG(0x4fc):
1127
+ dwc2_hreg0_write(ptr, addr, (addr - HSOTG_REG(0x400)) >> 2, val, size);
1128
+ break;
1129
+ case HSOTG_REG(0x500) ... HSOTG_REG(0x7fc):
1130
+ dwc2_hreg1_write(ptr, addr, (addr - HSOTG_REG(0x500)) >> 2, val, size);
1131
+ break;
1132
+ case HSOTG_REG(0x800) ... HSOTG_REG(0xdfc):
1133
+ /* Gadget-mode registers, do nothing for now */
1134
+ break;
1135
+ case HSOTG_REG(0xe00) ... HSOTG_REG(0xffc):
1136
+ dwc2_pcgreg_write(ptr, addr, (addr - HSOTG_REG(0xe00)) >> 2, val, size);
45
+ break;
1137
+ break;
46
+ default:
1138
+ default:
47
+ g_assert_not_reached();
1139
+ g_assert_not_reached();
48
+ }
1140
+ }
49
+}
1141
+}
50
+
1142
+
51
static const MemoryRegionOps m5206_mbar_ops = {
1143
+static const MemoryRegionOps dwc2_mmio_hsotg_ops = {
52
- .old_mmio = {
1144
+ .read = dwc2_hsotg_read,
53
- .read = {
1145
+ .write = dwc2_hsotg_write,
54
- m5206_mbar_readb,
1146
+ .impl.min_access_size = 4,
55
- m5206_mbar_readw,
1147
+ .impl.max_access_size = 4,
56
- m5206_mbar_readl,
1148
+ .endianness = DEVICE_LITTLE_ENDIAN,
57
- },
1149
+};
58
- .write = {
1150
+
59
- m5206_mbar_writeb,
1151
+static uint64_t dwc2_hreg2_read(void *ptr, hwaddr addr, unsigned size)
60
- m5206_mbar_writew,
1152
+{
61
- m5206_mbar_writel,
1153
+ /* TODO - implement FIFOs to support slave mode */
62
- },
1154
+ trace_usb_dwc2_hreg2_read(addr, addr >> 12, 0);
63
- },
1155
+ qemu_log_mask(LOG_UNIMP, "FIFO read not implemented\n");
64
+ .read = m5206_mbar_readfn,
1156
+ return 0;
65
+ .write = m5206_mbar_writefn,
1157
+}
66
+ .valid.min_access_size = 1,
1158
+
67
+ .valid.max_access_size = 4,
1159
+static void dwc2_hreg2_write(void *ptr, hwaddr addr, uint64_t val,
68
.endianness = DEVICE_NATIVE_ENDIAN,
1160
+ unsigned size)
69
};
1161
+{
70
1162
+ uint64_t orig = val;
1163
+
1164
+ /* TODO - implement FIFOs to support slave mode */
1165
+ trace_usb_dwc2_hreg2_write(addr, addr >> 12, orig, 0, val);
1166
+ qemu_log_mask(LOG_UNIMP, "FIFO write not implemented\n");
1167
+}
1168
+
1169
+static const MemoryRegionOps dwc2_mmio_hreg2_ops = {
1170
+ .read = dwc2_hreg2_read,
1171
+ .write = dwc2_hreg2_write,
1172
+ .impl.min_access_size = 4,
1173
+ .impl.max_access_size = 4,
1174
+ .endianness = DEVICE_LITTLE_ENDIAN,
1175
+};
1176
+
1177
+static void dwc2_wakeup_endpoint(USBBus *bus, USBEndpoint *ep,
1178
+ unsigned int stream)
1179
+{
1180
+ DWC2State *s = container_of(bus, DWC2State, bus);
1181
+
1182
+ trace_usb_dwc2_wakeup_endpoint(ep, stream);
1183
+
1184
+ /* TODO - do something here? */
1185
+ qemu_bh_schedule(s->async_bh);
1186
+}
1187
+
1188
+static USBBusOps dwc2_bus_ops = {
1189
+ .wakeup_endpoint = dwc2_wakeup_endpoint,
1190
+};
1191
+
1192
+static void dwc2_work_timer(void *opaque)
1193
+{
1194
+ DWC2State *s = opaque;
1195
+
1196
+ trace_usb_dwc2_work_timer();
1197
+ qemu_bh_schedule(s->async_bh);
1198
+}
1199
+
1200
+static void dwc2_reset_enter(Object *obj, ResetType type)
1201
+{
1202
+ DWC2Class *c = DWC2_GET_CLASS(obj);
1203
+ DWC2State *s = DWC2_USB(obj);
1204
+ int i;
1205
+
1206
+ trace_usb_dwc2_reset_enter();
1207
+
1208
+ if (c->parent_phases.enter) {
1209
+ c->parent_phases.enter(obj, type);
1210
+ }
1211
+
1212
+ timer_del(s->frame_timer);
1213
+ qemu_bh_cancel(s->async_bh);
1214
+
1215
+ if (s->uport.dev && s->uport.dev->attached) {
1216
+ usb_detach(&s->uport);
1217
+ }
1218
+
1219
+ dwc2_bus_stop(s);
1220
+
1221
+ s->gotgctl = GOTGCTL_BSESVLD | GOTGCTL_ASESVLD | GOTGCTL_CONID_B;
1222
+ s->gotgint = 0;
1223
+ s->gahbcfg = 0;
1224
+ s->gusbcfg = 5 << GUSBCFG_USBTRDTIM_SHIFT;
1225
+ s->grstctl = GRSTCTL_AHBIDLE;
1226
+ s->gintsts = GINTSTS_CONIDSTSCHNG | GINTSTS_PTXFEMP | GINTSTS_NPTXFEMP |
1227
+ GINTSTS_CURMODE_HOST;
1228
+ s->gintmsk = 0;
1229
+ s->grxstsr = 0;
1230
+ s->grxstsp = 0;
1231
+ s->grxfsiz = 1024;
1232
+ s->gnptxfsiz = 1024 << FIFOSIZE_DEPTH_SHIFT;
1233
+ s->gnptxsts = (4 << FIFOSIZE_DEPTH_SHIFT) | 1024;
1234
+ s->gi2cctl = GI2CCTL_I2CDATSE0 | GI2CCTL_ACK;
1235
+ s->gpvndctl = 0;
1236
+ s->ggpio = 0;
1237
+ s->guid = 0;
1238
+ s->gsnpsid = 0x4f54294a;
1239
+ s->ghwcfg1 = 0;
1240
+ s->ghwcfg2 = (8 << GHWCFG2_DEV_TOKEN_Q_DEPTH_SHIFT) |
1241
+ (4 << GHWCFG2_HOST_PERIO_TX_Q_DEPTH_SHIFT) |
1242
+ (4 << GHWCFG2_NONPERIO_TX_Q_DEPTH_SHIFT) |
1243
+ GHWCFG2_DYNAMIC_FIFO |
1244
+ GHWCFG2_PERIO_EP_SUPPORTED |
1245
+ ((DWC2_NB_CHAN - 1) << GHWCFG2_NUM_HOST_CHAN_SHIFT) |
1246
+ (GHWCFG2_INT_DMA_ARCH << GHWCFG2_ARCHITECTURE_SHIFT) |
1247
+ (GHWCFG2_OP_MODE_NO_SRP_CAPABLE_HOST << GHWCFG2_OP_MODE_SHIFT);
1248
+ s->ghwcfg3 = (4096 << GHWCFG3_DFIFO_DEPTH_SHIFT) |
1249
+ (4 << GHWCFG3_PACKET_SIZE_CNTR_WIDTH_SHIFT) |
1250
+ (4 << GHWCFG3_XFER_SIZE_CNTR_WIDTH_SHIFT);
1251
+ s->ghwcfg4 = 0;
1252
+ s->glpmcfg = 0;
1253
+ s->gpwrdn = GPWRDN_PWRDNRSTN;
1254
+ s->gdfifocfg = 0;
1255
+ s->gadpctl = 0;
1256
+ s->grefclk = 0;
1257
+ s->gintmsk2 = 0;
1258
+ s->gintsts2 = 0;
1259
+
1260
+ s->hptxfsiz = 500 << FIFOSIZE_DEPTH_SHIFT;
1261
+
1262
+ s->hcfg = 2 << HCFG_RESVALID_SHIFT;
1263
+ s->hfir = 60000;
1264
+ s->hfnum = 0x3fff;
1265
+ s->hptxsts = (16 << TXSTS_QSPCAVAIL_SHIFT) | 32768;
1266
+ s->haint = 0;
1267
+ s->haintmsk = 0;
1268
+ s->hprt0 = 0;
1269
+
1270
+ memset(s->hreg1, 0, sizeof(s->hreg1));
1271
+ memset(s->pcgreg, 0, sizeof(s->pcgreg));
1272
+
1273
+ s->sof_time = 0;
1274
+ s->frame_number = 0;
1275
+ s->fi = USB_FRMINTVL - 1;
1276
+ s->next_chan = 0;
1277
+ s->working = false;
1278
+
1279
+ for (i = 0; i < DWC2_NB_CHAN; i++) {
1280
+ s->packet[i].needs_service = false;
1281
+ }
1282
+}
1283
+
1284
+static void dwc2_reset_hold(Object *obj)
1285
+{
1286
+ DWC2Class *c = DWC2_GET_CLASS(obj);
1287
+ DWC2State *s = DWC2_USB(obj);
1288
+
1289
+ trace_usb_dwc2_reset_hold();
1290
+
1291
+ if (c->parent_phases.hold) {
1292
+ c->parent_phases.hold(obj);
1293
+ }
1294
+
1295
+ dwc2_update_irq(s);
1296
+}
1297
+
1298
+static void dwc2_reset_exit(Object *obj)
1299
+{
1300
+ DWC2Class *c = DWC2_GET_CLASS(obj);
1301
+ DWC2State *s = DWC2_USB(obj);
1302
+
1303
+ trace_usb_dwc2_reset_exit();
1304
+
1305
+ if (c->parent_phases.exit) {
1306
+ c->parent_phases.exit(obj);
1307
+ }
1308
+
1309
+ s->hprt0 = HPRT0_PWR;
1310
+ if (s->uport.dev && s->uport.dev->attached) {
1311
+ usb_attach(&s->uport);
1312
+ usb_device_reset(s->uport.dev);
1313
+ }
1314
+}
1315
+
1316
+static void dwc2_realize(DeviceState *dev, Error **errp)
1317
+{
1318
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
1319
+ DWC2State *s = DWC2_USB(dev);
1320
+ Object *obj;
1321
+ Error *err = NULL;
1322
+
1323
+ obj = object_property_get_link(OBJECT(dev), "dma-mr", &err);
1324
+ if (err) {
1325
+ error_setg(errp, "dwc2: required dma-mr link not found: %s",
1326
+ error_get_pretty(err));
1327
+ return;
1328
+ }
1329
+ assert(obj != NULL);
1330
+
1331
+ s->dma_mr = MEMORY_REGION(obj);
1332
+ address_space_init(&s->dma_as, s->dma_mr, "dwc2");
1333
+
1334
+ usb_bus_new(&s->bus, sizeof(s->bus), &dwc2_bus_ops, dev);
1335
+ usb_register_port(&s->bus, &s->uport, s, 0, &dwc2_port_ops,
1336
+ USB_SPEED_MASK_LOW | USB_SPEED_MASK_FULL |
1337
+ (s->usb_version == 2 ? USB_SPEED_MASK_HIGH : 0));
1338
+ s->uport.dev = 0;
1339
+
1340
+ s->usb_frame_time = NANOSECONDS_PER_SECOND / 1000; /* 1000000 */
1341
+ if (NANOSECONDS_PER_SECOND >= USB_HZ_FS) {
1342
+ s->usb_bit_time = NANOSECONDS_PER_SECOND / USB_HZ_FS; /* 83.3 */
1343
+ } else {
1344
+ s->usb_bit_time = 1;
1345
+ }
1346
+
1347
+ s->fi = USB_FRMINTVL - 1;
1348
+ s->eof_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_frame_boundary, s);
1349
+ s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_work_timer, s);
1350
+ s->async_bh = qemu_bh_new(dwc2_work_bh, s);
1351
+
1352
+ sysbus_init_irq(sbd, &s->irq);
1353
+}
1354
+
1355
+static void dwc2_init(Object *obj)
1356
+{
1357
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
1358
+ DWC2State *s = DWC2_USB(obj);
1359
+
1360
+ memory_region_init(&s->container, obj, "dwc2", DWC2_MMIO_SIZE);
1361
+ sysbus_init_mmio(sbd, &s->container);
1362
+
1363
+ memory_region_init_io(&s->hsotg, obj, &dwc2_mmio_hsotg_ops, s,
1364
+ "dwc2-io", 4 * KiB);
1365
+ memory_region_add_subregion(&s->container, 0x0000, &s->hsotg);
1366
+
1367
+ memory_region_init_io(&s->fifos, obj, &dwc2_mmio_hreg2_ops, s,
1368
+ "dwc2-fifo", 64 * KiB);
1369
+ memory_region_add_subregion(&s->container, 0x1000, &s->fifos);
1370
+}
1371
+
1372
+static const VMStateDescription vmstate_dwc2_state_packet = {
1373
+ .name = "dwc2/packet",
1374
+ .version_id = 1,
1375
+ .minimum_version_id = 1,
1376
+ .fields = (VMStateField[]) {
1377
+ VMSTATE_UINT32(devadr, DWC2Packet),
1378
+ VMSTATE_UINT32(epnum, DWC2Packet),
1379
+ VMSTATE_UINT32(epdir, DWC2Packet),
1380
+ VMSTATE_UINT32(mps, DWC2Packet),
1381
+ VMSTATE_UINT32(pid, DWC2Packet),
1382
+ VMSTATE_UINT32(index, DWC2Packet),
1383
+ VMSTATE_UINT32(pcnt, DWC2Packet),
1384
+ VMSTATE_UINT32(len, DWC2Packet),
1385
+ VMSTATE_INT32(async, DWC2Packet),
1386
+ VMSTATE_BOOL(small, DWC2Packet),
1387
+ VMSTATE_BOOL(needs_service, DWC2Packet),
1388
+ VMSTATE_END_OF_LIST()
1389
+ },
1390
+};
1391
+
1392
+const VMStateDescription vmstate_dwc2_state = {
1393
+ .name = "dwc2",
1394
+ .version_id = 1,
1395
+ .minimum_version_id = 1,
1396
+ .fields = (VMStateField[]) {
1397
+ VMSTATE_UINT32_ARRAY(glbreg, DWC2State,
1398
+ DWC2_GLBREG_SIZE / sizeof(uint32_t)),
1399
+ VMSTATE_UINT32_ARRAY(fszreg, DWC2State,
1400
+ DWC2_FSZREG_SIZE / sizeof(uint32_t)),
1401
+ VMSTATE_UINT32_ARRAY(hreg0, DWC2State,
1402
+ DWC2_HREG0_SIZE / sizeof(uint32_t)),
1403
+ VMSTATE_UINT32_ARRAY(hreg1, DWC2State,
1404
+ DWC2_HREG1_SIZE / sizeof(uint32_t)),
1405
+ VMSTATE_UINT32_ARRAY(pcgreg, DWC2State,
1406
+ DWC2_PCGREG_SIZE / sizeof(uint32_t)),
1407
+
1408
+ VMSTATE_TIMER_PTR(eof_timer, DWC2State),
1409
+ VMSTATE_TIMER_PTR(frame_timer, DWC2State),
1410
+ VMSTATE_INT64(sof_time, DWC2State),
1411
+ VMSTATE_INT64(usb_frame_time, DWC2State),
1412
+ VMSTATE_INT64(usb_bit_time, DWC2State),
1413
+ VMSTATE_UINT32(usb_version, DWC2State),
1414
+ VMSTATE_UINT16(frame_number, DWC2State),
1415
+ VMSTATE_UINT16(fi, DWC2State),
1416
+ VMSTATE_UINT16(next_chan, DWC2State),
1417
+ VMSTATE_BOOL(working, DWC2State),
1418
+
1419
+ VMSTATE_STRUCT_ARRAY(packet, DWC2State, DWC2_NB_CHAN, 1,
1420
+ vmstate_dwc2_state_packet, DWC2Packet),
1421
+ VMSTATE_UINT8_2DARRAY(usb_buf, DWC2State, DWC2_NB_CHAN,
1422
+ DWC2_MAX_XFER_SIZE),
1423
+
1424
+ VMSTATE_END_OF_LIST()
1425
+ }
1426
+};
1427
+
1428
+static Property dwc2_usb_properties[] = {
1429
+ DEFINE_PROP_UINT32("usb_version", DWC2State, usb_version, 2),
1430
+ DEFINE_PROP_END_OF_LIST(),
1431
+};
1432
+
1433
+static void dwc2_class_init(ObjectClass *klass, void *data)
1434
+{
1435
+ DeviceClass *dc = DEVICE_CLASS(klass);
1436
+ DWC2Class *c = DWC2_CLASS(klass);
1437
+ ResettableClass *rc = RESETTABLE_CLASS(klass);
1438
+
1439
+ dc->realize = dwc2_realize;
1440
+ dc->vmsd = &vmstate_dwc2_state;
1441
+ set_bit(DEVICE_CATEGORY_USB, dc->categories);
1442
+ device_class_set_props(dc, dwc2_usb_properties);
1443
+ resettable_class_set_parent_phases(rc, dwc2_reset_enter, dwc2_reset_hold,
1444
+ dwc2_reset_exit, &c->parent_phases);
1445
+}
1446
+
1447
+static const TypeInfo dwc2_usb_type_info = {
1448
+ .name = TYPE_DWC2_USB,
1449
+ .parent = TYPE_SYS_BUS_DEVICE,
1450
+ .instance_size = sizeof(DWC2State),
1451
+ .instance_init = dwc2_init,
1452
+ .class_size = sizeof(DWC2Class),
1453
+ .class_init = dwc2_class_init,
1454
+};
1455
+
1456
+static void dwc2_usb_register_types(void)
1457
+{
1458
+ type_register_static(&dwc2_usb_type_info);
1459
+}
1460
+
1461
+type_init(dwc2_usb_register_types)
1462
diff --git a/hw/usb/Kconfig b/hw/usb/Kconfig
1463
index XXXXXXX..XXXXXXX 100644
1464
--- a/hw/usb/Kconfig
1465
+++ b/hw/usb/Kconfig
1466
@@ -XXX,XX +XXX,XX @@ config USB_MUSB
1467
bool
1468
select USB
1469
1470
+config USB_DWC2
1471
+ bool
1472
+ default y
1473
+ select USB
1474
+
1475
config TUSB6010
1476
bool
1477
select USB_MUSB
1478
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
1479
index XXXXXXX..XXXXXXX 100644
1480
--- a/hw/usb/Makefile.objs
1481
+++ b/hw/usb/Makefile.objs
1482
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_USB_EHCI_SYSBUS) += hcd-ehci-sysbus.o
1483
common-obj-$(CONFIG_USB_XHCI) += hcd-xhci.o
1484
common-obj-$(CONFIG_USB_XHCI_NEC) += hcd-xhci-nec.o
1485
common-obj-$(CONFIG_USB_MUSB) += hcd-musb.o
1486
+common-obj-$(CONFIG_USB_DWC2) += hcd-dwc2.o
1487
1488
common-obj-$(CONFIG_TUSB6010) += tusb6010.o
1489
common-obj-$(CONFIG_IMX) += chipidea.o
1490
diff --git a/hw/usb/trace-events b/hw/usb/trace-events
1491
index XXXXXXX..XXXXXXX 100644
1492
--- a/hw/usb/trace-events
1493
+++ b/hw/usb/trace-events
1494
@@ -XXX,XX +XXX,XX @@ usb_xhci_xfer_error(void *xfer, uint32_t ret) "%p: ret %d"
1495
usb_xhci_unimplemented(const char *item, int nr) "%s (0x%x)"
1496
usb_xhci_enforced_limit(const char *item) "%s"
1497
1498
+# hcd-dwc2.c
1499
+usb_dwc2_update_irq(uint32_t level) "level=%d"
1500
+usb_dwc2_raise_global_irq(uint32_t intr) "0x%08x"
1501
+usb_dwc2_lower_global_irq(uint32_t intr) "0x%08x"
1502
+usb_dwc2_raise_host_irq(uint32_t intr) "0x%04x"
1503
+usb_dwc2_lower_host_irq(uint32_t intr) "0x%04x"
1504
+usb_dwc2_sof(int64_t next) "next SOF %" PRId64
1505
+usb_dwc2_bus_start(void) "start SOFs"
1506
+usb_dwc2_bus_stop(void) "stop SOFs"
1507
+usb_dwc2_find_device(uint8_t addr) "%d"
1508
+usb_dwc2_port_disabled(uint32_t pnum) "port %d disabled"
1509
+usb_dwc2_device_found(uint32_t pnum) "device found on port %d"
1510
+usb_dwc2_device_not_found(void) "device not found"
1511
+usb_dwc2_handle_packet(uint32_t chan, void *dev, void *pkt, uint32_t ep, const char *type, const char *dir, uint32_t mps, uint32_t len, uint32_t pcnt) "ch %d dev %p pkt %p ep %d type %s dir %s mps %d len %d pcnt %d"
1512
+usb_dwc2_memory_read(uint32_t addr, uint32_t len) "addr %d len %d"
1513
+usb_dwc2_packet_status(const char *status, uint32_t len) "status %s len %d"
1514
+usb_dwc2_packet_error(const char *status) "ERROR %s"
1515
+usb_dwc2_async_packet(void *pkt, uint32_t chan, void *dev, uint32_t ep, const char *dir, uint32_t len) "pkt %p ch %d dev %p ep %d %s len %d"
1516
+usb_dwc2_memory_write(uint32_t addr, uint32_t len) "addr %d len %d"
1517
+usb_dwc2_packet_done(const char *status, uint32_t actual, uint32_t len, uint32_t pcnt) "status %s actual %d len %d pcnt %d"
1518
+usb_dwc2_packet_next(const char *status, uint32_t len, uint32_t pcnt) "status %s len %d pcnt %d"
1519
+usb_dwc2_attach(void *port) "port %p"
1520
+usb_dwc2_attach_speed(const char *speed) "%s-speed device attached"
1521
+usb_dwc2_detach(void *port) "port %p"
1522
+usb_dwc2_child_detach(void *port, void *child) "port %p child %p"
1523
+usb_dwc2_wakeup(void *port) "port %p"
1524
+usb_dwc2_async_packet_complete(void *port, void *pkt, uint32_t chan, void *dev, uint32_t ep, const char *dir, uint32_t len) "port %p packet %p ch %d dev %p ep %d %s len %d"
1525
+usb_dwc2_work_bh(void) ""
1526
+usb_dwc2_work_bh_service(uint32_t first, uint32_t current, void *dev, uint32_t ep) "first %d servicing %d dev %p ep %d"
1527
+usb_dwc2_work_bh_next(uint32_t chan) "next %d"
1528
+usb_dwc2_enable_chan(uint32_t chan, void *dev, void *pkt, uint32_t ep) "ch %d dev %p pkt %p ep %d"
1529
+usb_dwc2_glbreg_read(uint64_t addr, const char *reg, uint32_t val) " 0x%04" PRIx64 " %s val 0x%08x"
1530
+usb_dwc2_glbreg_write(uint64_t addr, const char *reg, uint64_t val, uint32_t old, uint64_t result) "0x%04" PRIx64 " %s val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1531
+usb_dwc2_fszreg_read(uint64_t addr, uint32_t val) " 0x%04" PRIx64 " HPTXFSIZ val 0x%08x"
1532
+usb_dwc2_fszreg_write(uint64_t addr, uint64_t val, uint32_t old, uint64_t result) "0x%04" PRIx64 " HPTXFSIZ val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1533
+usb_dwc2_hreg0_read(uint64_t addr, const char *reg, uint32_t val) " 0x%04" PRIx64 " %s val 0x%08x"
1534
+usb_dwc2_hreg0_write(uint64_t addr, const char *reg, uint64_t val, uint32_t old, uint64_t result) " 0x%04" PRIx64 " %s val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1535
+usb_dwc2_hreg1_read(uint64_t addr, const char *reg, uint64_t chan, uint32_t val) " 0x%04" PRIx64 " %s%" PRId64 " val 0x%08x"
1536
+usb_dwc2_hreg1_write(uint64_t addr, const char *reg, uint64_t chan, uint64_t val, uint32_t old, uint64_t result) " 0x%04" PRIx64 " %s%" PRId64 " val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1537
+usb_dwc2_pcgreg_read(uint64_t addr, const char *reg, uint32_t val) " 0x%04" PRIx64 " %s val 0x%08x"
1538
+usb_dwc2_pcgreg_write(uint64_t addr, const char *reg, uint64_t val, uint32_t old, uint64_t result) "0x%04" PRIx64 " %s val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1539
+usb_dwc2_hreg2_read(uint64_t addr, uint64_t fifo, uint32_t val) " 0x%04" PRIx64 " FIFO%" PRId64 " val 0x%08x"
1540
+usb_dwc2_hreg2_write(uint64_t addr, uint64_t fifo, uint64_t val, uint32_t old, uint64_t result) " 0x%04" PRIx64 " FIFO%" PRId64 " val 0x%08" PRIx64 " old 0x%08x result 0x%08" PRIx64
1541
+usb_dwc2_hreg0_action(const char *s) "%s"
1542
+usb_dwc2_wakeup_endpoint(void *ep, uint32_t stream) "endp %p stream %d"
1543
+usb_dwc2_work_timer(void) ""
1544
+usb_dwc2_reset_enter(void) "=== RESET enter ==="
1545
+usb_dwc2_reset_hold(void) "=== RESET hold ==="
1546
+usb_dwc2_reset_exit(void) "=== RESET exit ==="
1547
+
1548
# desc.c
1549
usb_desc_device(int addr, int len, int ret) "dev %d query device, len %d, ret %d"
1550
usb_desc_device_qualifier(int addr, int len, int ret) "dev %d query device qualifier, len %d, ret %d"
71
--
1551
--
72
2.17.1
1552
2.20.1
73
1553
74
1554
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
2
3
Rearrange the arithmetic so that we are agnostic about the total size
3
The dwc-hsotg (dwc2) USB host depends on a short packet to
4
of the vector and the size of the element. This will allow us to index
4
indicate the end of an IN transfer. The usb-storage driver
5
up to the 32nd byte and with 16-byte elements.
5
currently doesn't provide this, so fix it.
6
6
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
I have tested this change rather extensively using a PC
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
emulation with xhci, ehci, and uhci controllers, and have
9
Message-id: 20180613015641.5667-2-richard.henderson@linaro.org
9
not observed any regressions.
10
11
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
12
Message-id: 20200520235349.21215-6-pauldzim@gmail.com
10
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
11
---
14
---
12
target/arm/translate-a64.h | 26 +++++++++++++++++---------
15
hw/usb/dev-storage.c | 15 ++++++++++++++-
13
1 file changed, 17 insertions(+), 9 deletions(-)
16
1 file changed, 14 insertions(+), 1 deletion(-)
14
17
15
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
18
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
16
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
17
--- a/target/arm/translate-a64.h
20
--- a/hw/usb/dev-storage.c
18
+++ b/target/arm/translate-a64.h
21
+++ b/hw/usb/dev-storage.c
19
@@ -XXX,XX +XXX,XX @@ static inline void assert_fp_access_checked(DisasContext *s)
22
@@ -XXX,XX +XXX,XX @@ static void usb_msd_copy_data(MSDState *s, USBPacket *p)
20
static inline int vec_reg_offset(DisasContext *s, int regno,
23
usb_packet_copy(p, scsi_req_get_buf(s->req) + s->scsi_off, len);
21
int element, TCGMemOp size)
24
s->scsi_len -= len;
22
{
25
s->scsi_off += len;
23
- int offs = 0;
26
+ if (len > s->data_len) {
24
+ int element_size = 1 << size;
27
+ len = s->data_len;
25
+ int offs = element * element_size;
26
#ifdef HOST_WORDS_BIGENDIAN
27
/* This is complicated slightly because vfp.zregs[n].d[0] is
28
- * still the low half and vfp.zregs[n].d[1] the high half
29
- * of the 128 bit vector, even on big endian systems.
30
- * Calculate the offset assuming a fully bigendian 128 bits,
31
- * then XOR to account for the order of the two 64 bit halves.
32
+ * still the lowest and vfp.zregs[n].d[15] the highest of the
33
+ * 256 byte vector, even on big endian systems.
34
+ *
35
+ * Calculate the offset assuming fully little-endian,
36
+ * then XOR to account for the order of the 8-byte units.
37
+ *
38
+ * For 16 byte elements, the two 8 byte halves will not form a
39
+ * host int128 if the host is bigendian, since they're in the
40
+ * wrong order. However the only 16 byte operation we have is
41
+ * a move, so we can ignore this for the moment. More complicated
42
+ * operations will have to special case loading and storing from
43
+ * the zregs array.
44
*/
45
- offs += (16 - ((element + 1) * (1 << size)));
46
- offs ^= 8;
47
-#else
48
- offs += element * (1 << size);
49
+ if (element_size < 8) {
50
+ offs ^= 8 - element_size;
51
+ }
28
+ }
52
#endif
29
s->data_len -= len;
53
offs += offsetof(CPUARMState, vfp.zregs[regno]);
30
if (s->scsi_len == 0 || s->data_len == 0) {
54
assert_fp_access_checked(s);
31
scsi_req_continue(s->req);
32
@@ -XXX,XX +XXX,XX @@ static void usb_msd_command_complete(SCSIRequest *req, uint32_t status, size_t r
33
if (s->data_len) {
34
int len = (p->iov.size - p->actual_length);
35
usb_packet_skip(p, len);
36
+ if (len > s->data_len) {
37
+ len = s->data_len;
38
+ }
39
s->data_len -= len;
40
}
41
if (s->data_len == 0) {
42
@@ -XXX,XX +XXX,XX @@ static void usb_msd_handle_data(USBDevice *dev, USBPacket *p)
43
int len = p->iov.size - p->actual_length;
44
if (len) {
45
usb_packet_skip(p, len);
46
+ if (len > s->data_len) {
47
+ len = s->data_len;
48
+ }
49
s->data_len -= len;
50
if (s->data_len == 0) {
51
s->mode = USB_MSDM_CSW;
52
@@ -XXX,XX +XXX,XX @@ static void usb_msd_handle_data(USBDevice *dev, USBPacket *p)
53
int len = p->iov.size - p->actual_length;
54
if (len) {
55
usb_packet_skip(p, len);
56
+ if (len > s->data_len) {
57
+ len = s->data_len;
58
+ }
59
s->data_len -= len;
60
if (s->data_len == 0) {
61
s->mode = USB_MSDM_CSW;
62
}
63
}
64
}
65
- if (p->actual_length < p->iov.size) {
66
+ if (p->actual_length < p->iov.size && (p->short_not_ok ||
67
+ s->scsi_len >= p->ep->max_packet_size)) {
68
DPRINTF("Deferring packet %p [wait data-in]\n", p);
69
s->packet = p;
70
p->status = USB_RET_ASYNC;
55
--
71
--
56
2.17.1
72
2.20.1
57
73
58
74
diff view generated by jsdifflib
1
From: Richard Henderson <richard.henderson@linaro.org>
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Wire the dwc-hsotg (dwc2) emulation into QEMU.
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
5
Message-id: 20180613015641.5667-10-richard.henderson@linaro.org
5
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
6
Reviewed-by: Philippe Mathieu-Daude <f4bug@amsat.org>
7
Message-id: 20200520235349.21215-7-pauldzim@gmail.com
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
---
9
---
8
target/arm/helper-sve.h | 2 ++
10
include/hw/arm/bcm2835_peripherals.h | 3 ++-
9
target/arm/sve_helper.c | 37 +++++++++++++++++++++++++++++++++++++
11
hw/arm/bcm2835_peripherals.c | 21 ++++++++++++++++++++-
10
target/arm/translate-sve.c | 13 +++++++++++++
12
2 files changed, 22 insertions(+), 2 deletions(-)
11
target/arm/sve.decode | 3 +++
12
4 files changed, 55 insertions(+)
13
13
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
14
diff --git a/include/hw/arm/bcm2835_peripherals.h b/include/hw/arm/bcm2835_peripherals.h
15
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
16
--- a/include/hw/arm/bcm2835_peripherals.h
17
+++ b/target/arm/helper-sve.h
17
+++ b/include/hw/arm/bcm2835_peripherals.h
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
18
@@ -XXX,XX +XXX,XX @@
19
DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
19
#include "hw/sd/bcm2835_sdhost.h"
20
DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
20
#include "hw/gpio/bcm2835_gpio.h"
21
21
#include "hw/timer/bcm2835_systmr.h"
22
+DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+#include "hw/usb/hcd-dwc2.h"
23
#include "hw/misc/unimp.h"
24
25
#define TYPE_BCM2835_PERIPHERALS "bcm2835-peripherals"
26
@@ -XXX,XX +XXX,XX @@ typedef struct BCM2835PeripheralState {
27
UnimplementedDeviceState ave0;
28
UnimplementedDeviceState bscsl;
29
UnimplementedDeviceState smi;
30
- UnimplementedDeviceState dwc2;
31
+ DWC2State dwc2;
32
UnimplementedDeviceState sdramc;
33
} BCM2835PeripheralState;
34
35
diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/hw/arm/bcm2835_peripherals.c
38
+++ b/hw/arm/bcm2835_peripherals.c
39
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_init(Object *obj)
40
/* Mphi */
41
sysbus_init_child_obj(obj, "mphi", &s->mphi, sizeof(s->mphi),
42
TYPE_BCM2835_MPHI);
23
+
43
+
24
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
44
+ /* DWC2 */
25
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
45
+ sysbus_init_child_obj(obj, "dwc2", &s->dwc2, sizeof(s->dwc2),
26
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
46
+ TYPE_DWC2_USB);
27
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
47
+
28
index XXXXXXX..XXXXXXX 100644
48
+ object_property_add_const_link(OBJECT(&s->dwc2), "dma-mr",
29
--- a/target/arm/sve_helper.c
49
+ OBJECT(&s->gpu_bus_mr));
30
+++ b/target/arm/sve_helper.c
31
@@ -XXX,XX +XXX,XX @@ int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
32
33
return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
34
}
50
}
35
+
51
36
+void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
52
static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
37
+{
53
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
38
+ intptr_t opr_sz = simd_oprsz(desc) / 8;
54
qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
39
+ int esz = simd_data(desc);
55
INTERRUPT_HOSTPORT));
40
+ uint64_t pg, first_g, last_g, len, mask = pred_esz_masks[esz];
56
41
+ intptr_t i, first_i, last_i;
57
+ /* DWC2 */
42
+ ARMVectorReg tmp;
58
+ object_property_set_bool(OBJECT(&s->dwc2), true, "realized", &err);
43
+
59
+ if (err) {
44
+ first_i = last_i = 0;
60
+ error_propagate(errp, err);
45
+ first_g = last_g = 0;
61
+ return;
46
+
47
+ /* Find the extent of the active elements within VG. */
48
+ for (i = QEMU_ALIGN_UP(opr_sz, 8) - 8; i >= 0; i -= 8) {
49
+ pg = *(uint64_t *)(vg + i) & mask;
50
+ if (pg) {
51
+ if (last_g == 0) {
52
+ last_g = pg;
53
+ last_i = i;
54
+ }
55
+ first_g = pg;
56
+ first_i = i;
57
+ }
58
+ }
62
+ }
59
+
63
+
60
+ len = 0;
64
+ memory_region_add_subregion(&s->peri_mr, USB_OTG_OFFSET,
61
+ if (first_g != 0) {
65
+ sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->dwc2), 0));
62
+ first_i = first_i * 8 + ctz64(first_g);
66
+ sysbus_connect_irq(SYS_BUS_DEVICE(&s->dwc2), 0,
63
+ last_i = last_i * 8 + 63 - clz64(last_g);
67
+ qdev_get_gpio_in_named(DEVICE(&s->ic), BCM2835_IC_GPU_IRQ,
64
+ len = last_i - first_i + (1 << esz);
68
+ INTERRUPT_USB));
65
+ if (vd == vm) {
69
+
66
+ vm = memcpy(&tmp, vm, opr_sz * 8);
70
create_unimp(s, &s->armtmr, "bcm2835-sp804", ARMCTRL_TIMER0_1_OFFSET, 0x40);
67
+ }
71
create_unimp(s, &s->cprman, "bcm2835-cprman", CPRMAN_OFFSET, 0x1000);
68
+ swap_memmove(vd, vn + first_i, len);
72
create_unimp(s, &s->a2w, "bcm2835-a2w", A2W_OFFSET, 0x1000);
69
+ }
73
@@ -XXX,XX +XXX,XX @@ static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
70
+ swap_memmove(vd + len, vm, opr_sz * 8 - len);
74
create_unimp(s, &s->otp, "bcm2835-otp", OTP_OFFSET, 0x80);
71
+}
75
create_unimp(s, &s->dbus, "bcm2835-dbus", DBUS_OFFSET, 0x8000);
72
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
76
create_unimp(s, &s->ave0, "bcm2835-ave0", AVE0_OFFSET, 0x8000);
73
index XXXXXXX..XXXXXXX 100644
77
- create_unimp(s, &s->dwc2, "dwc-usb2", USB_OTG_OFFSET, 0x1000);
74
--- a/target/arm/translate-sve.c
78
create_unimp(s, &s->sdramc, "bcm2835-sdramc", SDRAMC_OFFSET, 0x100);
75
+++ b/target/arm/translate-sve.c
76
@@ -XXX,XX +XXX,XX @@ static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
77
return do_zpz_ool(s, a, fns[a->esz]);
78
}
79
}
79
80
80
+static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
81
+{
82
+ if (sve_access_check(s)) {
83
+ unsigned vsz = vec_full_reg_size(s);
84
+ tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
85
+ vec_full_reg_offset(s, a->rn),
86
+ vec_full_reg_offset(s, a->rm),
87
+ pred_full_reg_offset(s, a->pg),
88
+ vsz, vsz, a->esz, gen_helper_sve_splice);
89
+ }
90
+ return true;
91
+}
92
+
93
/*
94
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
95
*/
96
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
97
index XXXXXXX..XXXXXXX 100644
98
--- a/target/arm/sve.decode
99
+++ b/target/arm/sve.decode
100
@@ -XXX,XX +XXX,XX @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
101
REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
102
RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
103
104
+# SVE vector splice (predicated)
105
+SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
106
+
107
### SVE Predicate Logical Operations Group
108
109
# SVE predicate logical operations
110
--
81
--
111
2.17.1
82
2.20.1
112
83
113
84
diff view generated by jsdifflib
1
From: Shannon Zhao <zhaoshenglong@huawei.com>
1
From: Paul Zimmerman <pauldzim@gmail.com>
2
2
3
While the for_each_dist_irq_reg loop starts from GIC_INTERNAL, it forgot to
3
Add a check for functional dwc-hsotg (dwc2) USB host emulation to
4
offset the data array and index. This will overlap the GICR registers'
4
the Raspi 2 acceptance test.
5
values and leave the last GIC_INTERNAL IRQs' registers out of the update.
6
5
7
Fixes: 367b9f527becdd20ddf116e17a3c0c2bbc486920
6
Signed-off-by: Paul Zimmerman <pauldzim@gmail.com>
8
Cc: qemu-stable@nongnu.org
7
Reviewed-by: Philippe Mathieu-Daude <f4bug@amsat.org>
9
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Message-id: 20200520235349.21215-8-pauldzim@gmail.com
10
Reviewed-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
12
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
13
---
10
---
14
hw/intc/arm_gicv3_kvm.c | 18 ++++++++++++++++--
11
tests/acceptance/boot_linux_console.py | 9 +++++++--
15
1 file changed, 16 insertions(+), 2 deletions(-)
12
1 file changed, 7 insertions(+), 2 deletions(-)
16
13
17
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
14
diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
18
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
19
--- a/hw/intc/arm_gicv3_kvm.c
16
--- a/tests/acceptance/boot_linux_console.py
20
+++ b/hw/intc/arm_gicv3_kvm.c
17
+++ b/tests/acceptance/boot_linux_console.py
21
@@ -XXX,XX +XXX,XX @@ static void kvm_dist_get_priority(GICv3State *s, uint32_t offset, uint8_t *bmp)
18
@@ -XXX,XX +XXX,XX @@ class BootLinuxConsole(LinuxKernelTest):
22
uint32_t reg, *field;
19
23
int irq;
20
self.vm.set_console()
24
21
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
25
- field = (uint32_t *)bmp;
22
- serial_kernel_cmdline[uart_id])
26
+ /* For the KVM GICv3, affinity routing is always enabled, and the first 8
23
+ serial_kernel_cmdline[uart_id] +
27
+ * GICD_IPRIORITYR<n> registers are always RAZ/WI. The corresponding
24
+ ' root=/dev/mmcblk0p2 rootwait ' +
28
+ * functionality is replaced by GICR_IPRIORITYR<n>. It doesn't need to
25
+ 'dwc_otg.fiq_fsm_enable=0')
29
+ * sync them. So it needs to skip the field of GIC_INTERNAL irqs in bmp and
26
self.vm.add_args('-kernel', kernel_path,
30
+ * offset.
27
'-dtb', dtb_path,
31
+ */
28
- '-append', kernel_command_line)
32
+ field = (uint32_t *)(bmp + GIC_INTERNAL);
29
+ '-append', kernel_command_line,
33
+ offset += (GIC_INTERNAL * 8) / 8;
30
+ '-device', 'usb-kbd')
34
for_each_dist_irq_reg(irq, s->num_irq, 8) {
31
self.vm.launch()
35
kvm_gicd_access(s, offset, &reg, false);
32
console_pattern = 'Kernel command line: %s' % kernel_command_line
36
*field = reg;
33
self.wait_for_console_pattern(console_pattern)
37
@@ -XXX,XX +XXX,XX @@ static void kvm_dist_put_priority(GICv3State *s, uint32_t offset, uint8_t *bmp)
34
+ console_pattern = 'Product: QEMU USB Keyboard'
38
uint32_t reg, *field;
35
+ self.wait_for_console_pattern(console_pattern)
39
int irq;
36
40
37
def test_arm_raspi2_uart0(self):
41
- field = (uint32_t *)bmp;
38
"""
42
+ /* For the KVM GICv3, affinity routing is always enabled, and the first 8
43
+ * GICD_IPRIORITYR<n> registers are always RAZ/WI. The corresponding
44
+ * functionality is replaced by GICR_IPRIORITYR<n>. It doesn't need to
45
+ * sync them. So it needs to skip the field of GIC_INTERNAL irqs in bmp and
46
+ * offset.
47
+ */
48
+ field = (uint32_t *)(bmp + GIC_INTERNAL);
49
+ offset += (GIC_INTERNAL * 8) / 8;
50
for_each_dist_irq_reg(irq, s->num_irq, 8) {
51
reg = *field;
52
kvm_gicd_access(s, offset, &reg, true);
53
--
39
--
54
2.17.1
40
2.20.1
55
41
56
42
diff view generated by jsdifflib
Deleted patch
1
The ethernet controller in the AN505 MPC FPGA image is behind
2
the same AHB Peripheral Protection Controller that handles
3
the graphics and GPIOs. (In the documentation this is clear
4
in the block diagram but the ethernet controller was omitted
5
from the table listing devices connected to the PPC.)
6
The ethernet sits behind AHB PPCEXP0 interface 5. We had
7
incorrectly claimed that this was a "gpio4", but there are
8
only 4 GPIOs in this image.
9
1
10
Correct the QEMU model to match the hardware.
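
Concretely, the ethernet becomes entry 5 in the AHB PPCEXP0 device table, backed
by a new helper that creates the LAN9118 and hands its MMIO region to the PPC.
A condensed view of the new code (the full version is in the diff below):

    { "eth", make_eth_dev, NULL, 0x42000000, 0x100000 },

    static MemoryRegion *make_eth_dev(MPS2TZMachineState *mms, void *opaque,
                                      const char *name, hwaddr size)
    {
        SysBusDevice *s;
        DeviceState *iotkitdev = DEVICE(&mms->iotkit);
        NICInfo *nd = &nd_table[0];

        qemu_check_nic_model(nd, "lan9118");   /* LAN9220 in hardware; LAN9118 is compatible */
        mms->lan9118 = qdev_create(NULL, "lan9118");
        qdev_set_nic_properties(mms->lan9118, nd);
        qdev_init_nofail(mms->lan9118);

        s = SYS_BUS_DEVICE(mms->lan9118);
        sysbus_connect_irq(s, 0, qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 16));
        return sysbus_mmio_get_region(s, 0);
    }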
11
12
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Message-id: 20180515171446.10834-1-peter.maydell@linaro.org
15
---
16
hw/arm/mps2-tz.c | 32 +++++++++++++++++++++++---------
17
1 file changed, 23 insertions(+), 9 deletions(-)
18
19
diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/hw/arm/mps2-tz.c
22
+++ b/hw/arm/mps2-tz.c
23
@@ -XXX,XX +XXX,XX @@ typedef struct {
24
UnimplementedDeviceState spi[5];
25
UnimplementedDeviceState i2c[4];
26
UnimplementedDeviceState i2s_audio;
27
- UnimplementedDeviceState gpio[5];
28
+ UnimplementedDeviceState gpio[4];
29
UnimplementedDeviceState dma[4];
30
UnimplementedDeviceState gfx;
31
CMSDKAPBUART uart[5];
32
SplitIRQ sec_resp_splitter;
33
qemu_or_irq uart_irq_orgate;
34
+ DeviceState *lan9118;
35
} MPS2TZMachineState;
36
37
#define TYPE_MPS2TZ_MACHINE "mps2tz"
38
@@ -XXX,XX +XXX,XX @@ static MemoryRegion *make_fpgaio(MPS2TZMachineState *mms, void *opaque,
39
return sysbus_mmio_get_region(SYS_BUS_DEVICE(fpgaio), 0);
40
}
41
42
+static MemoryRegion *make_eth_dev(MPS2TZMachineState *mms, void *opaque,
43
+ const char *name, hwaddr size)
44
+{
45
+ SysBusDevice *s;
46
+ DeviceState *iotkitdev = DEVICE(&mms->iotkit);
47
+ NICInfo *nd = &nd_table[0];
48
+
49
+ /* In hardware this is a LAN9220; the LAN9118 is software compatible
50
+ * except that it doesn't support the checksum-offload feature.
51
+ */
52
+ qemu_check_nic_model(nd, "lan9118");
53
+ mms->lan9118 = qdev_create(NULL, "lan9118");
54
+ qdev_set_nic_properties(mms->lan9118, nd);
55
+ qdev_init_nofail(mms->lan9118);
56
+
57
+ s = SYS_BUS_DEVICE(mms->lan9118);
58
+ sysbus_connect_irq(s, 0, qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 16));
59
+ return sysbus_mmio_get_region(s, 0);
60
+}
61
+
62
static void mps2tz_common_init(MachineState *machine)
63
{
64
MPS2TZMachineState *mms = MPS2TZ_MACHINE(machine);
65
@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
66
{ "gpio1", make_unimp_dev, &mms->gpio[1], 0x40101000, 0x1000 },
67
{ "gpio2", make_unimp_dev, &mms->gpio[2], 0x40102000, 0x1000 },
68
{ "gpio3", make_unimp_dev, &mms->gpio[3], 0x40103000, 0x1000 },
69
- { "gpio4", make_unimp_dev, &mms->gpio[4], 0x40104000, 0x1000 },
70
+ { "eth", make_eth_dev, NULL, 0x42000000, 0x100000 },
71
},
72
}, {
73
.name = "ahb_ppcexp1",
74
@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
75
"cfg_sec_resp", 0));
76
}
77
78
- /* In hardware this is a LAN9220; the LAN9118 is software compatible
79
- * except that it doesn't support the checksum-offload feature.
80
- * The ethernet controller is not behind a PPC.
81
- */
82
- lan9118_init(&nd_table[0], 0x42000000,
83
- qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 16));
84
-
85
create_unimplemented_device("FPGA NS PC", 0x48007000, 0x1000);
86
87
armv7m_load_kernel(ARM_CPU(first_cpu), machine->kernel_filename, 0x400000);
88
--
89
2.17.1
90
91
diff view generated by jsdifflib
Deleted patch
1
Convert the sh7750 device away from using the old_mmio field
2
of MemoryRegionOps. This device is used by the sh4 r2d board.
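
The conversion replaces the per-width old_mmio callback tables with a single
read/write pair that dispatches on the access size; the wdt_i6300esb conversion
later in this queue uses the same shape. Condensed from the diff below:

    static uint64_t sh7750_mem_readfn(void *opaque, hwaddr addr, unsigned size)
    {
        switch (size) {
        case 1:
            return sh7750_mem_readb(opaque, addr);
        case 2:
            return sh7750_mem_readw(opaque, addr);
        case 4:
            return sh7750_mem_readl(opaque, addr);
        default:
            g_assert_not_reached();
        }
    }

    static const MemoryRegionOps sh7750_mem_ops = {
        .read = sh7750_mem_readfn,
        .write = sh7750_mem_writefn,  /* same switch-on-size dispatch to the writeb/w/l helpers */
        .valid.min_access_size = 1,
        .valid.max_access_size = 4,
        .endianness = DEVICE_NATIVE_ENDIAN,
    };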
3
1
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Message-id: 20180601141223.26630-2-peter.maydell@linaro.org
7
---
8
hw/sh4/sh7750.c | 44 ++++++++++++++++++++++++++++++++++++--------
9
1 file changed, 36 insertions(+), 8 deletions(-)
10
11
diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/hw/sh4/sh7750.c
14
+++ b/hw/sh4/sh7750.c
15
@@ -XXX,XX +XXX,XX @@ static void sh7750_mem_writel(void *opaque, hwaddr addr,
16
}
17
}
18
19
+static uint64_t sh7750_mem_readfn(void *opaque, hwaddr addr, unsigned size)
20
+{
21
+ switch (size) {
22
+ case 1:
23
+ return sh7750_mem_readb(opaque, addr);
24
+ case 2:
25
+ return sh7750_mem_readw(opaque, addr);
26
+ case 4:
27
+ return sh7750_mem_readl(opaque, addr);
28
+ default:
29
+ g_assert_not_reached();
30
+ }
31
+}
32
+
33
+static void sh7750_mem_writefn(void *opaque, hwaddr addr,
34
+ uint64_t value, unsigned size)
35
+{
36
+ switch (size) {
37
+ case 1:
38
+ sh7750_mem_writeb(opaque, addr, value);
39
+ break;
40
+ case 2:
41
+ sh7750_mem_writew(opaque, addr, value);
42
+ break;
43
+ case 4:
44
+ sh7750_mem_writel(opaque, addr, value);
45
+ break;
46
+ default:
47
+ g_assert_not_reached();
48
+ }
49
+}
50
+
51
static const MemoryRegionOps sh7750_mem_ops = {
52
- .old_mmio = {
53
- .read = {sh7750_mem_readb,
54
- sh7750_mem_readw,
55
- sh7750_mem_readl },
56
- .write = {sh7750_mem_writeb,
57
- sh7750_mem_writew,
58
- sh7750_mem_writel },
59
- },
60
+ .read = sh7750_mem_readfn,
61
+ .write = sh7750_mem_writefn,
62
+ .valid.min_access_size = 1,
63
+ .valid.max_access_size = 4,
64
.endianness = DEVICE_NATIVE_ENDIAN,
65
};
66
67
--
68
2.17.1
69
70
diff view generated by jsdifflib
Deleted patch
1
Convert the wdt_i6300esb device away from using the old_mmio field
2
of MemoryRegionOps.
3
1
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Message-id: 20180601141223.26630-5-peter.maydell@linaro.org
7
---
8
hw/watchdog/wdt_i6300esb.c | 48 ++++++++++++++++++++++++++++----------
9
1 file changed, 36 insertions(+), 12 deletions(-)
10
11
diff --git a/hw/watchdog/wdt_i6300esb.c b/hw/watchdog/wdt_i6300esb.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/hw/watchdog/wdt_i6300esb.c
14
+++ b/hw/watchdog/wdt_i6300esb.c
15
@@ -XXX,XX +XXX,XX @@ static void i6300esb_mem_writel(void *vp, hwaddr addr, uint32_t val)
16
}
17
}
18
19
+static uint64_t i6300esb_mem_readfn(void *opaque, hwaddr addr, unsigned size)
20
+{
21
+ switch (size) {
22
+ case 1:
23
+ return i6300esb_mem_readb(opaque, addr);
24
+ case 2:
25
+ return i6300esb_mem_readw(opaque, addr);
26
+ case 4:
27
+ return i6300esb_mem_readl(opaque, addr);
28
+ default:
29
+ g_assert_not_reached();
30
+ }
31
+}
32
+
33
+static void i6300esb_mem_writefn(void *opaque, hwaddr addr,
34
+ uint64_t value, unsigned size)
35
+{
36
+ switch (size) {
37
+ case 1:
38
+ i6300esb_mem_writeb(opaque, addr, value);
39
+ break;
40
+ case 2:
41
+ i6300esb_mem_writew(opaque, addr, value);
42
+ break;
43
+ case 4:
44
+ i6300esb_mem_writel(opaque, addr, value);
45
+ break;
46
+ default:
47
+ g_assert_not_reached();
48
+ }
49
+}
50
+
51
static const MemoryRegionOps i6300esb_ops = {
52
- .old_mmio = {
53
- .read = {
54
- i6300esb_mem_readb,
55
- i6300esb_mem_readw,
56
- i6300esb_mem_readl,
57
- },
58
- .write = {
59
- i6300esb_mem_writeb,
60
- i6300esb_mem_writew,
61
- i6300esb_mem_writel,
62
- },
63
- },
64
+ .read = i6300esb_mem_readfn,
65
+ .write = i6300esb_mem_writefn,
66
+ .valid.min_access_size = 1,
67
+ .valid.max_access_size = 4,
68
.endianness = DEVICE_LITTLE_ENDIAN,
69
};
70
71
--
72
2.17.1
73
74
diff view generated by jsdifflib
Deleted patch
1
Convert the parallel device away from using the old_mmio field
2
of MemoryRegionOps. This change only affects the memory-mapped
3
variant, which is used by the MIPS Jazz boards 'magnum' and 'pica61'.
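
Because the memory-mapped parallel port already funnels every access through
parallel_ioport_read_sw()/parallel_ioport_write_sw(), this conversion needs no
size switch; the new callbacks simply mask the value to the access width
(condensed from the diff below):

    static uint64_t parallel_mm_readfn(void *opaque, hwaddr addr, unsigned size)
    {
        ParallelState *s = opaque;

        return parallel_ioport_read_sw(s, addr >> s->it_shift) &
            MAKE_64BIT_MASK(0, size * 8);
    }

    static void parallel_mm_writefn(void *opaque, hwaddr addr,
                                    uint64_t value, unsigned size)
    {
        ParallelState *s = opaque;

        parallel_ioport_write_sw(s, addr >> s->it_shift,
                                 value & MAKE_64BIT_MASK(0, size * 8));
    }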
4
1
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
7
Message-id: 20180601141223.26630-7-peter.maydell@linaro.org
8
---
9
hw/char/parallel.c | 50 ++++++++++------------------------------------
10
1 file changed, 11 insertions(+), 39 deletions(-)
11
12
diff --git a/hw/char/parallel.c b/hw/char/parallel.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/hw/char/parallel.c
15
+++ b/hw/char/parallel.c
16
@@ -XXX,XX +XXX,XX @@ static void parallel_isa_realizefn(DeviceState *dev, Error **errp)
17
}
18
19
/* Memory mapped interface */
20
-static uint32_t parallel_mm_readb (void *opaque, hwaddr addr)
21
+static uint64_t parallel_mm_readfn(void *opaque, hwaddr addr, unsigned size)
22
{
23
ParallelState *s = opaque;
24
25
- return parallel_ioport_read_sw(s, addr >> s->it_shift) & 0xFF;
26
+ return parallel_ioport_read_sw(s, addr >> s->it_shift) &
27
+ MAKE_64BIT_MASK(0, size * 8);
28
}
29
30
-static void parallel_mm_writeb (void *opaque,
31
- hwaddr addr, uint32_t value)
32
+static void parallel_mm_writefn(void *opaque, hwaddr addr,
33
+ uint64_t value, unsigned size)
34
{
35
ParallelState *s = opaque;
36
37
- parallel_ioport_write_sw(s, addr >> s->it_shift, value & 0xFF);
38
-}
39
-
40
-static uint32_t parallel_mm_readw (void *opaque, hwaddr addr)
41
-{
42
- ParallelState *s = opaque;
43
-
44
- return parallel_ioport_read_sw(s, addr >> s->it_shift) & 0xFFFF;
45
-}
46
-
47
-static void parallel_mm_writew (void *opaque,
48
- hwaddr addr, uint32_t value)
49
-{
50
- ParallelState *s = opaque;
51
-
52
- parallel_ioport_write_sw(s, addr >> s->it_shift, value & 0xFFFF);
53
-}
54
-
55
-static uint32_t parallel_mm_readl (void *opaque, hwaddr addr)
56
-{
57
- ParallelState *s = opaque;
58
-
59
- return parallel_ioport_read_sw(s, addr >> s->it_shift);
60
-}
61
-
62
-static void parallel_mm_writel (void *opaque,
63
- hwaddr addr, uint32_t value)
64
-{
65
- ParallelState *s = opaque;
66
-
67
- parallel_ioport_write_sw(s, addr >> s->it_shift, value);
68
+ parallel_ioport_write_sw(s, addr >> s->it_shift,
69
+ value & MAKE_64BIT_MASK(0, size * 8));
70
}
71
72
static const MemoryRegionOps parallel_mm_ops = {
73
- .old_mmio = {
74
- .read = { parallel_mm_readb, parallel_mm_readw, parallel_mm_readl },
75
- .write = { parallel_mm_writeb, parallel_mm_writew, parallel_mm_writel },
76
- },
77
+ .read = parallel_mm_readfn,
78
+ .write = parallel_mm_writefn,
79
+ .valid.min_access_size = 1,
80
+ .valid.max_access_size = 4,
81
.endianness = DEVICE_NATIVE_ENDIAN,
82
};
83

--
2.17.1

The stellaris board is still using the legacy armv7m_init() function,
which predates conversion of the ARMv7M into a proper QOM container
object. Make the board code directly create the ARMv7M object instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601144328.23817-2-peter.maydell@linaro.org
---
 hw/arm/stellaris.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/hw/arm/stellaris.c
16
+++ b/hw/arm/stellaris.c
17
@@ -XXX,XX +XXX,XX @@
18
#include "qemu/log.h"
19
#include "exec/address-spaces.h"
20
#include "sysemu/sysemu.h"
21
+#include "hw/arm/armv7m.h"
22
#include "hw/char/pl011.h"
23
#include "hw/misc/unimp.h"
24
#include "cpu.h"
25
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
26
&error_fatal);
27
memory_region_add_subregion(system_memory, 0x20000000, sram);
28
29
- nvic = armv7m_init(system_memory, flash_size, NUM_IRQ_LINES,
30
- ms->kernel_filename, ms->cpu_type);
31
+ nvic = qdev_create(NULL, TYPE_ARMV7M);
32
+ qdev_prop_set_uint32(nvic, "num-irq", NUM_IRQ_LINES);
33
+ qdev_prop_set_string(nvic, "cpu-type", ms->cpu_type);
34
+ object_property_set_link(OBJECT(nvic), OBJECT(get_system_memory()),
35
+ "memory", &error_abort);
36
+ /* This will exit with an error if the user passed us a bad cpu_type */
37
+ qdev_init_nofail(nvic);
38
39
qdev_connect_gpio_out_named(nvic, "SYSRESETREQ", 0,
40
qemu_allocate_irq(&do_sys_reset, NULL, 0));
41
@@ -XXX,XX +XXX,XX @@ static void stellaris_init(MachineState *ms, stellaris_board_info *board)
42
create_unimplemented_device("analogue-comparator", 0x4003c000, 0x1000);
43
create_unimplemented_device("hibernation", 0x400fc000, 0x1000);
44
create_unimplemented_device("flash-control", 0x400fd000, 0x1000);
45
+
46
+ armv7m_load_kernel(ARM_CPU(first_cpu), ms->kernel_filename, flash_size);
47
}
48
/* FIXME: Figure out how to generate these from stellaris_boards. */
--
2.17.1

Remove the now-unused armv7m_init() function. This was a legacy from
before we properly QOMified ARMv7M, and it has some flaws:

 * it combines work that needs to be done by an SoC object (creating
   and initializing the TYPE_ARMV7M object) with work that needs to
   be done by the board model (setting the system up to load the ELF
   file specified with -kernel)
 * TYPE_ARMV7M creation failure is fatal, but an SoC object wants to
   arrange to propagate the failure outward
 * it uses allocate-and-create via qdev_create() whereas the current
   preferred style for SoC objects is to do creation in-place

Board and SoC models can instead do the two jobs this function
was doing themselves, in the right places and with whatever their
preferred style/error handling is.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180601144328.23817-3-peter.maydell@linaro.org
---
 include/hw/arm/arm.h |  8 ++------
 hw/arm/armv7m.c      | 21 ---------------------
 2 files changed, 2 insertions(+), 27 deletions(-)

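As a rough sketch of the difference between the two styles mentioned above (illustrative only, assuming the SoC embeds an ARMv7MState in its own state struct 's'):

    /* allocate-and-create style */
    DeviceState *armv7m = qdev_create(NULL, TYPE_ARMV7M);

    /* in-place style preferred for SoC objects */
    object_initialize(&s->armv7m, sizeof(s->armv7m), TYPE_ARMV7M);
    qdev_set_parent_bus(DEVICE(&s->armv7m), sysbus_get_default());
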
diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
27
index XXXXXXX..XXXXXXX 100644
28
--- a/include/hw/arm/arm.h
29
+++ b/include/hw/arm/arm.h
30
@@ -XXX,XX +XXX,XX @@ typedef enum {
31
ARM_ENDIANNESS_BE32,
32
} arm_endianness;
33
34
-/* armv7m.c */
35
-DeviceState *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
36
- const char *kernel_filename, const char *cpu_type);
37
/**
38
* armv7m_load_kernel:
39
* @cpu: CPU
40
@@ -XXX,XX +XXX,XX @@ DeviceState *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
41
* @mem_size: mem_size: maximum image size to load
42
*
43
* Load the guest image for an ARMv7M system. This must be called by
44
- * any ARMv7M board, either directly or via armv7m_init(). (This is
45
- * necessary to ensure that the CPU resets correctly on system reset,
46
- * as well as for kernel loading.)
47
+ * any ARMv7M board. (This is necessary to ensure that the CPU resets
48
+ * correctly on system reset, as well as for kernel loading.)
49
*/
50
void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size);
51
52
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/hw/arm/armv7m.c
55
+++ b/hw/arm/armv7m.c
56
@@ -XXX,XX +XXX,XX @@ static void armv7m_reset(void *opaque)
57
cpu_reset(CPU(cpu));
58
}
59
60
-/* Init CPU and memory for a v7-M based board.
61
- mem_size is in bytes.
62
- Returns the ARMv7M device. */
63
-
64
-DeviceState *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
65
- const char *kernel_filename, const char *cpu_type)
66
-{
67
- DeviceState *armv7m;
68
-
69
- armv7m = qdev_create(NULL, TYPE_ARMV7M);
70
- qdev_prop_set_uint32(armv7m, "num-irq", num_irq);
71
- qdev_prop_set_string(armv7m, "cpu-type", cpu_type);
72
- object_property_set_link(OBJECT(armv7m), OBJECT(get_system_memory()),
73
- "memory", &error_abort);
74
- /* This will exit with an error if the user passed us a bad cpu_type */
75
- qdev_init_nofail(armv7m);
76
-
77
- armv7m_load_kernel(ARM_CPU(first_cpu), kernel_filename, mem_size);
78
- return armv7m;
79
-}
80
-
81
void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
82
{
83
int image_size;
--
2.17.1

The Cortex-M CPU and its NVIC are two intimately intertwined parts of
the same hardware; it is not possible to use one without the other.
Unfortunately a lot of our board models don't do any sanity checking
on the CPU type the user asks for, so a command line like
    qemu-system-arm -M versatilepb -cpu cortex-m3
will create an M3 without an NVIC, and coredump immediately.
In the other direction, trying a non-M-profile CPU in an M-profile
board won't blow up, but doesn't do anything useful either:
    qemu-system-arm -M lm3s6965evb -cpu arm926

Add some checking in the NVIC and CPU realize functions that the
user isn't trying to use an NVIC without an M-profile CPU or
an M-profile CPU without an NVIC, so we can produce a helpful
error message rather than a core dump.

Fixes: https://bugs.launchpad.net/qemu/+bug/1766896
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180601160355.15393-1-peter.maydell@linaro.org
---
 hw/arm/armv7m.c       |  7 ++++++-
 hw/intc/armv7m_nvic.c |  6 +++++-
 target/arm/cpu.c      | 18 ++++++++++++++++++
 3 files changed, 29 insertions(+), 2 deletions(-)

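With these checks in place, the first problem command line above should fail with a clean error rather than a core dump; roughly (illustrative output):

    $ qemu-system-arm -M versatilepb -cpu cortex-m3
    qemu-system-arm: This board cannot be used with Cortex-M CPUs
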
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/hw/arm/armv7m.c
29
+++ b/hw/arm/armv7m.c
30
@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
31
return;
32
}
33
}
34
+
35
+ /* Tell the CPU where the NVIC is; it will fail realize if it doesn't
36
+ * have one.
37
+ */
38
+ s->cpu->env.nvic = &s->nvic;
39
+
40
object_property_set_bool(OBJECT(s->cpu), true, "realized", &err);
41
if (err != NULL) {
42
error_propagate(errp, err);
43
@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
44
sbd = SYS_BUS_DEVICE(&s->nvic);
45
sysbus_connect_irq(sbd, 0,
46
qdev_get_gpio_in(DEVICE(s->cpu), ARM_CPU_IRQ));
47
- s->cpu->env.nvic = &s->nvic;
48
49
memory_region_add_subregion(&s->container, 0xe000e000,
50
sysbus_mmio_get_region(sbd, 0));
51
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/hw/intc/armv7m_nvic.c
54
+++ b/hw/intc/armv7m_nvic.c
55
@@ -XXX,XX +XXX,XX @@ static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
56
int regionlen;
57
58
s->cpu = ARM_CPU(qemu_get_cpu(0));
59
- assert(s->cpu);
60
+
61
+ if (!s->cpu || !arm_feature(&s->cpu->env, ARM_FEATURE_M)) {
62
+ error_setg(errp, "The NVIC can only be used with a Cortex-M CPU");
63
+ return;
64
+ }
65
66
if (s->num_irq > NVIC_MAX_IRQ) {
67
error_setg(errp, "num-irq %d exceeds NVIC maximum", s->num_irq);
68
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/target/arm/cpu.c
71
+++ b/target/arm/cpu.c
72
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
73
return;
74
}
75
76
+#ifndef CONFIG_USER_ONLY
77
+ /* The NVIC and M-profile CPU are two halves of a single piece of
78
+ * hardware; trying to use one without the other is a command line
79
+ * error and will result in segfaults if not caught here.
80
+ */
81
+ if (arm_feature(env, ARM_FEATURE_M)) {
82
+ if (!env->nvic) {
83
+ error_setg(errp, "This board cannot be used with Cortex-M CPUs");
84
+ return;
85
+ }
86
+ } else {
87
+ if (env->nvic) {
88
+ error_setg(errp, "This board can only be used with Cortex-M CPUs");
89
+ return;
90
+ }
91
+ }
92
+#endif
93
+
94
cpu_exec_realizefn(cs, &local_err);
95
if (local_err != NULL) {
96
error_propagate(errp, local_err);
--
2.17.1

The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious
use; add a comment documenting it (reverse-engineered from what
the code that sets it is doing).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-2-peter.maydell@linaro.org
---
 include/exec/cpu-defs.h |  9 +++++++++
 accel/tcg/cputlb.c      | 12 ++++++++++++
 2 files changed, 21 insertions(+)

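To make the encoding concrete, a minimal sketch of how a consumer splits the field (hypothetical helper, not part of this patch; it mirrors what io_readx()/io_writex() do with the entry):

    static void decode_iotlb_addr(CPUIOTLBEntry *entry, target_ulong vaddr,
                                  int *section_idx, hwaddr *offset)
    {
        /* low TARGET_PAGE_BITS: index of the physical section */
        *section_idx = entry->addr & ~TARGET_PAGE_MASK;
        /* upper bits plus the access vaddr: ram_addr_t or MR offset */
        *offset = (entry->addr & TARGET_PAGE_MASK) + vaddr;
    }
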
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/include/exec/cpu-defs.h
17
+++ b/include/exec/cpu-defs.h
18
@@ -XXX,XX +XXX,XX @@ QEMU_BUILD_BUG_ON(sizeof(CPUTLBEntry) != (1 << CPU_TLB_ENTRY_BITS));
19
* structs into one.)
20
*/
21
typedef struct CPUIOTLBEntry {
22
+ /*
23
+ * @addr contains:
24
+ * - in the lower TARGET_PAGE_BITS, a physical section number
25
+ * - with the lower TARGET_PAGE_BITS masked off, an offset which
26
+ * must be added to the virtual address to obtain:
27
+ * + the ram_addr_t of the target RAM (if the physical section
28
+ * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
29
+ * + the offset within the target MemoryRegion (otherwise)
30
+ */
31
hwaddr addr;
32
MemTxAttrs attrs;
33
} CPUIOTLBEntry;
34
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
35
index XXXXXXX..XXXXXXX 100644
36
--- a/accel/tcg/cputlb.c
37
+++ b/accel/tcg/cputlb.c
38
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
39
env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
40
41
/* refill the tlb */
42
+ /*
43
+ * At this point iotlb contains a physical section number in the lower
44
+ * TARGET_PAGE_BITS, and either
45
+ * + the ram_addr_t of the page base of the target RAM (if NOTDIRTY or ROM)
46
+ * + the offset within section->mr of the page base (otherwise)
47
+ * We subtract the vaddr (which is page aligned and thus won't
48
+ * disturb the low bits) to give an offset which can be added to the
49
+ * (non-page-aligned) vaddr of the eventual memory access to get
50
+ * the MemoryRegion offset for the access. Note that the vaddr we
51
+ * subtract here is that of the page base, and not the same as the
52
+ * vaddr we add back in io_readx()/io_writex()/get_page_addr_code().
53
+ */
54
env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
55
env->iotlb[mmu_idx][index].attrs = attrs;
56
--
2.17.1

The API for cpu_transaction_failed() says that it takes the physical
address for the failed transaction. However we were actually passing
it the offset within the target MemoryRegion. We don't currently
have any target CPU implementations of this hook that require the
physical address; fix this bug so we don't get confused if we ever
do add one.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-3-peter.maydell@linaro.org
---
 include/exec/exec-all.h | 13 ++++++++++--
 accel/tcg/cputlb.c      | 44 +++++++++++++++++++++++++++------------
 exec.c                  |  5 +++--
 3 files changed, 45 insertions(+), 17 deletions(-)

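A quick worked example of the corrected arithmetic, with made-up numbers:

    /*
     * Suppose a section starts 0x2000 bytes into its MemoryRegion and is
     * mapped at guest physical address 0x10002000:
     *   offset_within_region        = 0x2000
     *   offset_within_address_space = 0x10002000
     * A failed access at mr_offset 0x2100 is now reported as
     *   physaddr = 0x2100 + 0x10002000 - 0x2000 = 0x10002100
     * rather than as the bare MemoryRegion offset 0x2100.
     */
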
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/include/exec/exec-all.h
22
+++ b/include/exec/exec-all.h
23
@@ -XXX,XX +XXX,XX @@ void tb_lock_reset(void);
24
25
#if !defined(CONFIG_USER_ONLY)
26
27
-struct MemoryRegion *iotlb_to_region(CPUState *cpu,
28
- hwaddr index, MemTxAttrs attrs);
29
+/**
30
+ * iotlb_to_section:
31
+ * @cpu: CPU performing the access
32
+ * @index: TCG CPU IOTLB entry
33
+ *
34
+ * Given a TCG CPU IOTLB entry, return the MemoryRegionSection that
35
+ * it refers to. @index will have been initially created and returned
36
+ * by memory_region_section_get_iotlb().
37
+ */
38
+struct MemoryRegionSection *iotlb_to_section(CPUState *cpu,
39
+ hwaddr index, MemTxAttrs attrs);
40
41
void tlb_fill(CPUState *cpu, target_ulong addr, int size,
42
MMUAccessType access_type, int mmu_idx, uintptr_t retaddr);
43
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
44
index XXXXXXX..XXXXXXX 100644
45
--- a/accel/tcg/cputlb.c
46
+++ b/accel/tcg/cputlb.c
47
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
48
target_ulong addr, uintptr_t retaddr, int size)
49
{
50
CPUState *cpu = ENV_GET_CPU(env);
51
- hwaddr physaddr = iotlbentry->addr;
52
- MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
53
+ hwaddr mr_offset;
54
+ MemoryRegionSection *section;
55
+ MemoryRegion *mr;
56
uint64_t val;
57
bool locked = false;
58
MemTxResult r;
59
60
- physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
61
+ section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
62
+ mr = section->mr;
63
+ mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
64
cpu->mem_io_pc = retaddr;
65
if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
66
cpu_io_recompile(cpu, retaddr);
67
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
68
qemu_mutex_lock_iothread();
69
locked = true;
70
}
71
- r = memory_region_dispatch_read(mr, physaddr,
72
+ r = memory_region_dispatch_read(mr, mr_offset,
73
&val, size, iotlbentry->attrs);
74
if (r != MEMTX_OK) {
75
+ hwaddr physaddr = mr_offset +
76
+ section->offset_within_address_space -
77
+ section->offset_within_region;
78
+
79
cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_LOAD,
80
mmu_idx, iotlbentry->attrs, r, retaddr);
81
}
82
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
83
uintptr_t retaddr, int size)
84
{
85
CPUState *cpu = ENV_GET_CPU(env);
86
- hwaddr physaddr = iotlbentry->addr;
87
- MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
88
+ hwaddr mr_offset;
89
+ MemoryRegionSection *section;
90
+ MemoryRegion *mr;
91
bool locked = false;
92
MemTxResult r;
93
94
- physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
95
+ section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
96
+ mr = section->mr;
97
+ mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
98
if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
99
cpu_io_recompile(cpu, retaddr);
100
}
101
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
102
qemu_mutex_lock_iothread();
103
locked = true;
104
}
105
- r = memory_region_dispatch_write(mr, physaddr,
106
+ r = memory_region_dispatch_write(mr, mr_offset,
107
val, size, iotlbentry->attrs);
108
if (r != MEMTX_OK) {
109
+ hwaddr physaddr = mr_offset +
110
+ section->offset_within_address_space -
111
+ section->offset_within_region;
112
+
113
cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_STORE,
114
mmu_idx, iotlbentry->attrs, r, retaddr);
115
}
116
@@ -XXX,XX +XXX,XX @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
117
*/
118
tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
119
{
120
- int mmu_idx, index, pd;
121
+ int mmu_idx, index;
122
void *p;
123
MemoryRegion *mr;
124
+ MemoryRegionSection *section;
125
CPUState *cpu = ENV_GET_CPU(env);
126
CPUIOTLBEntry *iotlbentry;
127
- hwaddr physaddr;
128
+ hwaddr physaddr, mr_offset;
129
130
index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
131
mmu_idx = cpu_mmu_index(env, true);
132
@@ -XXX,XX +XXX,XX @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
133
}
134
}
135
iotlbentry = &env->iotlb[mmu_idx][index];
136
- pd = iotlbentry->addr & ~TARGET_PAGE_MASK;
137
- mr = iotlb_to_region(cpu, pd, iotlbentry->attrs);
138
+ section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
139
+ mr = section->mr;
140
if (memory_region_is_unassigned(mr)) {
141
qemu_mutex_lock_iothread();
142
if (memory_region_request_mmio_ptr(mr, addr)) {
143
@@ -XXX,XX +XXX,XX @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
144
* and use the MemTXResult it produced). However it is the
145
* simplest place we have currently available for the check.
146
*/
147
- physaddr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
148
+ mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
149
+ physaddr = mr_offset +
150
+ section->offset_within_address_space -
151
+ section->offset_within_region;
152
cpu_transaction_failed(cpu, physaddr, addr, 0, MMU_INST_FETCH, mmu_idx,
153
iotlbentry->attrs, MEMTX_DECODE_ERROR, 0);
154
155
diff --git a/exec.c b/exec.c
156
index XXXXXXX..XXXXXXX 100644
157
--- a/exec.c
158
+++ b/exec.c
159
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps readonly_mem_ops = {
160
},
161
};
162
163
-MemoryRegion *iotlb_to_region(CPUState *cpu, hwaddr index, MemTxAttrs attrs)
164
+MemoryRegionSection *iotlb_to_section(CPUState *cpu,
165
+ hwaddr index, MemTxAttrs attrs)
166
{
167
int asidx = cpu_asidx_from_attrs(cpu, attrs);
168
CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
169
AddressSpaceDispatch *d = atomic_rcu_read(&cpuas->memory_dispatch);
170
MemoryRegionSection *sections = d->map.sections;
171
172
- return sections[index & ~TARGET_PAGE_MASK].mr;
173
+ return &sections[index & ~TARGET_PAGE_MASK];
174
}
175
176
static void io_mem_init(void)
--
2.17.1

The codebase has a bit of a mix of different multiline
comment styles. State a preference for the Linux kernel
style:
    /*
     * Star on the left for each line.
     * Leading slash-star and trailing star-slash
     * each go on a line of their own.
     */

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180611141716.3813-1-peter.maydell@linaro.org
---
 CODING_STYLE | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/CODING_STYLE b/CODING_STYLE
24
index XXXXXXX..XXXXXXX 100644
25
--- a/CODING_STYLE
26
+++ b/CODING_STYLE
27
@@ -XXX,XX +XXX,XX @@ We use traditional C-style /* */ comments and avoid // comments.
28
Rationale: The // form is valid in C99, so this is purely a matter of
29
consistency of style. The checkpatch script will warn you about this.
30
31
+Multiline comment blocks should have a row of stars on the left,
32
+and the initial /* and terminating */ both on their own lines:
33
+ /*
34
+ * like
35
+ * this
36
+ */
37
+This is the same format required by the Linux kernel coding style.
38
+
39
+(Some of the existing comments in the codebase use the GNU Coding
40
+Standards form which does not have stars on the left, or other
41
+variations; avoid these when writing new comments, but don't worry
42
+about converting to the preferred form unless you're editing that
43
+comment anyway.)
44
+
45
+Rationale: Consistency, and ease of visually picking out a multiline
46
+comment from the surrounding code.
47
+
48
8. trace-events style
49
50
8.1 0x prefix
--
2.17.1

There's a common pattern in QEMU where a function needs to perform
a data load or store of an N byte integer in a particular endianness.
At the moment this is handled by doing a switch() on the size and
calling the appropriate ld*_p or st*_p function for each size.

Provide a new family of functions ldn_*_p() and stn_*_p() which
take the size as an argument and do the switch() themselves.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-2-peter.maydell@linaro.org
---
 include/exec/cpu-all.h      |  4 +++
 include/qemu/bswap.h        | 52 +++++++++++++++++++++++++++++++++++++
 docs/devel/loads-stores.rst | 15 +++++++++++
 3 files changed, 71 insertions(+)

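For illustration, a typical before/after at a call site (a sketch, not code taken from this patch):

    /* before: open-coded switch on the access size */
    switch (size) {
    case 1:
        val = ldub_p(buf);
        break;
    case 2:
        val = lduw_le_p(buf);
        break;
    case 4:
        val = (uint32_t)ldl_le_p(buf);
        break;
    case 8:
        val = ldq_le_p(buf);
        break;
    default:
        g_assert_not_reached();
    }

    /* after: the size is just another argument */
    val = ldn_le_p(buf, size);
    stn_le_p(buf, size, val);
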
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/include/exec/cpu-all.h
22
+++ b/include/exec/cpu-all.h
23
@@ -XXX,XX +XXX,XX @@ static inline void tswap64s(uint64_t *s)
24
#define stq_p(p, v) stq_be_p(p, v)
25
#define stfl_p(p, v) stfl_be_p(p, v)
26
#define stfq_p(p, v) stfq_be_p(p, v)
27
+#define ldn_p(p, sz) ldn_be_p(p, sz)
28
+#define stn_p(p, sz, v) stn_be_p(p, sz, v)
29
#else
30
#define lduw_p(p) lduw_le_p(p)
31
#define ldsw_p(p) ldsw_le_p(p)
32
@@ -XXX,XX +XXX,XX @@ static inline void tswap64s(uint64_t *s)
33
#define stq_p(p, v) stq_le_p(p, v)
34
#define stfl_p(p, v) stfl_le_p(p, v)
35
#define stfq_p(p, v) stfq_le_p(p, v)
36
+#define ldn_p(p, sz) ldn_le_p(p, sz)
37
+#define stn_p(p, sz, v) stn_le_p(p, sz, v)
38
#endif
39
40
/* MMU memory access macros */
41
diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
42
index XXXXXXX..XXXXXXX 100644
43
--- a/include/qemu/bswap.h
44
+++ b/include/qemu/bswap.h
45
@@ -XXX,XX +XXX,XX @@ typedef union {
46
* For accessors that take a guest address rather than a
47
* host address, see the cpu_{ld,st}_* accessors defined in
48
* cpu_ldst.h.
49
+ *
50
+ * For cases where the size to be used is not fixed at compile time,
51
+ * there are
52
+ * stn{endian}_p(ptr, sz, val)
53
+ * which stores @val to @ptr as an @endian-order number @sz bytes in size
54
+ * and
55
+ * ldn{endian}_p(ptr, sz)
56
+ * which loads @sz bytes from @ptr as an unsigned @endian-order number
57
+ * and returns it in a uint64_t.
58
*/
59
60
static inline int ldub_p(const void *ptr)
61
@@ -XXX,XX +XXX,XX @@ static inline unsigned long leul_to_cpu(unsigned long v)
62
#endif
63
}
64
65
+/* Store v to p as a sz byte value in host order */
66
+#define DO_STN_LDN_P(END) \
67
+ static inline void stn_## END ## _p(void *ptr, int sz, uint64_t v) \
68
+ { \
69
+ switch (sz) { \
70
+ case 1: \
71
+ stb_p(ptr, v); \
72
+ break; \
73
+ case 2: \
74
+ stw_ ## END ## _p(ptr, v); \
75
+ break; \
76
+ case 4: \
77
+ stl_ ## END ## _p(ptr, v); \
78
+ break; \
79
+ case 8: \
80
+ stq_ ## END ## _p(ptr, v); \
81
+ break; \
82
+ default: \
83
+ g_assert_not_reached(); \
84
+ } \
85
+ } \
86
+ static inline uint64_t ldn_## END ## _p(const void *ptr, int sz) \
87
+ { \
88
+ switch (sz) { \
89
+ case 1: \
90
+ return ldub_p(ptr); \
91
+ case 2: \
92
+ return lduw_ ## END ## _p(ptr); \
93
+ case 4: \
94
+ return (uint32_t)ldl_ ## END ## _p(ptr); \
95
+ case 8: \
96
+ return ldq_ ## END ## _p(ptr); \
97
+ default: \
98
+ g_assert_not_reached(); \
99
+ } \
100
+ }
101
+
102
+DO_STN_LDN_P(he)
103
+DO_STN_LDN_P(le)
104
+DO_STN_LDN_P(be)
105
+
106
+#undef DO_STN_LDN_P
107
+
108
#undef le_bswap
109
#undef be_bswap
110
#undef le_bswaps
111
diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
112
index XXXXXXX..XXXXXXX 100644
113
--- a/docs/devel/loads-stores.rst
114
+++ b/docs/devel/loads-stores.rst
115
@@ -XXX,XX +XXX,XX @@ The ``_{endian}`` infix is omitted for target-endian accesses.
116
The target endian accessors are only available to source
117
files which are built per-target.
118
119
+There are also functions which take the size as an argument:
120
+
121
+load: ``ldn{endian}_p(ptr, sz)``
122
+
123
+which performs an unsigned load of ``sz`` bytes from ``ptr``
124
+as an ``{endian}`` order value and returns it in a uint64_t.
125
+
126
+store: ``stn{endian}_p(ptr, sz, val)``
127
+
128
+which stores ``val`` to ``ptr`` as an ``{endian}`` order value
129
+of size ``sz`` bytes.
130
+
131
+
132
Regexes for git grep
133
- ``\<ldf\?[us]\?[bwlq]\(_[hbl]e\)\?_p\>``
134
- ``\<stf\?[bwlq]\(_[hbl]e\)\?_p\>``
135
+ - ``\<ldn_\([hbl]e\)?_p\>``
136
+ - ``\<stn_\([hbl]e\)?_p\>``
137
138
``cpu_{ld,st}_*``
139
~~~~~~~~~~~~~~~~~
140
--
141
2.17.1
142
143

In subpage_read() we perform a load of the data into a local buffer
which we then access using ldub_p(), lduw_p(), ldl_p() or ldq_p()
depending on its size, storing the result into the uint64_t *data.
Since ldl_p() returns an 'int', this means that for the 4-byte
case we will sign-extend the data, whereas for 1 and 2 byte
reads we zero-extend it.

This ought not to matter since the caller will likely ignore values in
the high bytes of the data, but add a cast so that we're consistent.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-3-peter.maydell@linaro.org
---
 exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

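A standalone illustration of the behaviour described above (illustrative values only):

    uint8_t buf[4] = { 0xff, 0xff, 0xff, 0xff };
    uint64_t data;

    data = ldl_p(buf);           /* int is sign-extended: 0xffffffffffffffff */
    data = (uint32_t)ldl_p(buf); /* with the cast:        0x00000000ffffffff */
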
diff --git a/exec.c b/exec.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/exec.c
21
+++ b/exec.c
22
@@ -XXX,XX +XXX,XX @@ static MemTxResult subpage_read(void *opaque, hwaddr addr, uint64_t *data,
23
*data = lduw_p(buf);
24
return MEMTX_OK;
25
case 4:
26
- *data = ldl_p(buf);
27
+ *data = (uint32_t)ldl_p(buf);
28
return MEMTX_OK;
29
case 8:
*data = ldq_p(buf);
--
2.17.1

Now we have stn_p() and ldn_p() we can use them in various
functions in exec.c that used to have their own switch-on-size code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-4-peter.maydell@linaro.org
---
 exec.c | 112 +++++----------------------------------------------------
 1 file changed, 8 insertions(+), 104 deletions(-)

diff --git a/exec.c b/exec.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/exec.c
15
+++ b/exec.c
16
@@ -XXX,XX +XXX,XX @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
17
memory_notdirty_write_prepare(&ndi, current_cpu, current_cpu->mem_io_vaddr,
18
ram_addr, size);
19
20
- switch (size) {
21
- case 1:
22
- stb_p(qemu_map_ram_ptr(NULL, ram_addr), val);
23
- break;
24
- case 2:
25
- stw_p(qemu_map_ram_ptr(NULL, ram_addr), val);
26
- break;
27
- case 4:
28
- stl_p(qemu_map_ram_ptr(NULL, ram_addr), val);
29
- break;
30
- case 8:
31
- stq_p(qemu_map_ram_ptr(NULL, ram_addr), val);
32
- break;
33
- default:
34
- abort();
35
- }
36
+ stn_p(qemu_map_ram_ptr(NULL, ram_addr), size, val);
37
memory_notdirty_write_complete(&ndi);
38
}
39
40
@@ -XXX,XX +XXX,XX @@ static MemTxResult subpage_read(void *opaque, hwaddr addr, uint64_t *data,
41
if (res) {
42
return res;
43
}
44
- switch (len) {
45
- case 1:
46
- *data = ldub_p(buf);
47
- return MEMTX_OK;
48
- case 2:
49
- *data = lduw_p(buf);
50
- return MEMTX_OK;
51
- case 4:
52
- *data = (uint32_t)ldl_p(buf);
53
- return MEMTX_OK;
54
- case 8:
55
- *data = ldq_p(buf);
56
- return MEMTX_OK;
57
- default:
58
- abort();
59
- }
60
+ *data = ldn_p(buf, len);
61
+ return MEMTX_OK;
62
}
63
64
static MemTxResult subpage_write(void *opaque, hwaddr addr,
65
@@ -XXX,XX +XXX,XX @@ static MemTxResult subpage_write(void *opaque, hwaddr addr,
66
" value %"PRIx64"\n",
67
__func__, subpage, len, addr, value);
68
#endif
69
- switch (len) {
70
- case 1:
71
- stb_p(buf, value);
72
- break;
73
- case 2:
74
- stw_p(buf, value);
75
- break;
76
- case 4:
77
- stl_p(buf, value);
78
- break;
79
- case 8:
80
- stq_p(buf, value);
81
- break;
82
- default:
83
- abort();
84
- }
85
+ stn_p(buf, len, value);
86
return flatview_write(subpage->fv, addr + subpage->base, attrs, buf, len);
87
}
88
89
@@ -XXX,XX +XXX,XX @@ static MemTxResult flatview_write_continue(FlatView *fv, hwaddr addr,
90
l = memory_access_size(mr, l, addr1);
91
/* XXX: could force current_cpu to NULL to avoid
92
potential bugs */
93
- switch (l) {
94
- case 8:
95
- /* 64 bit write access */
96
- val = ldq_p(buf);
97
- result |= memory_region_dispatch_write(mr, addr1, val, 8,
98
- attrs);
99
- break;
100
- case 4:
101
- /* 32 bit write access */
102
- val = (uint32_t)ldl_p(buf);
103
- result |= memory_region_dispatch_write(mr, addr1, val, 4,
104
- attrs);
105
- break;
106
- case 2:
107
- /* 16 bit write access */
108
- val = lduw_p(buf);
109
- result |= memory_region_dispatch_write(mr, addr1, val, 2,
110
- attrs);
111
- break;
112
- case 1:
113
- /* 8 bit write access */
114
- val = ldub_p(buf);
115
- result |= memory_region_dispatch_write(mr, addr1, val, 1,
116
- attrs);
117
- break;
118
- default:
119
- abort();
120
- }
121
+ val = ldn_p(buf, l);
122
+ result |= memory_region_dispatch_write(mr, addr1, val, l, attrs);
123
} else {
124
/* RAM case */
125
ptr = qemu_ram_ptr_length(mr->ram_block, addr1, &l, false);
126
@@ -XXX,XX +XXX,XX @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr addr,
127
/* I/O case */
128
release_lock |= prepare_mmio_access(mr);
129
l = memory_access_size(mr, l, addr1);
130
- switch (l) {
131
- case 8:
132
- /* 64 bit read access */
133
- result |= memory_region_dispatch_read(mr, addr1, &val, 8,
134
- attrs);
135
- stq_p(buf, val);
136
- break;
137
- case 4:
138
- /* 32 bit read access */
139
- result |= memory_region_dispatch_read(mr, addr1, &val, 4,
140
- attrs);
141
- stl_p(buf, val);
142
- break;
143
- case 2:
144
- /* 16 bit read access */
145
- result |= memory_region_dispatch_read(mr, addr1, &val, 2,
146
- attrs);
147
- stw_p(buf, val);
148
- break;
149
- case 1:
150
- /* 8 bit read access */
151
- result |= memory_region_dispatch_read(mr, addr1, &val, 1,
152
- attrs);
153
- stb_p(buf, val);
154
- break;
155
- default:
156
- abort();
157
- }
158
+ result |= memory_region_dispatch_read(mr, addr1, &val, l, attrs);
159
+ stn_p(buf, l, val);
160
} else {
161
/* RAM case */
162
ptr = qemu_ram_ptr_length(mr->ram_block, addr1, &l, false);
163
--
164
2.17.1
165
166
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the VSHL and VSLI insns from the Neon 2-registers-and-a-shift
2
group to decodetree.
2
3
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-14-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200522145520.6778-2-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-sve.h | 18 +++
8
target/arm/neon-dp.decode | 25 ++++++++++++++++++++++
9
target/arm/sve_helper.c | 248 +++++++++++++++++++++++++++++++++++++
9
target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 106 ++++++++++++++++
10
target/arm/translate.c | 18 +++++++---------
11
target/arm/sve.decode | 19 +++
11
3 files changed, 71 insertions(+), 10 deletions(-)
12
4 files changed, 391 insertions(+)
13
12
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
15
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/helper-sve.h
16
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
17
@@ -XXX,XX +XXX,XX @@ VRECPS_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
19
DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
18
VRSQRTS_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
20
DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG,
19
VMAXNM_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
21
void, ptr, ptr, ptr, ptr, i32)
20
VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
22
+
21
+
23
+DEF_HELPER_FLAGS_5(sve_brkpa, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
22
+######################################################################
24
+DEF_HELPER_FLAGS_5(sve_brkpb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
23
+# 2-reg-and-shift grouping:
25
+DEF_HELPER_FLAGS_5(sve_brkpas, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32)
24
+# 1111 001 U 1 D immH:3 immL:3 Vd:4 opc:4 L Q M 1 Vm:4
26
+DEF_HELPER_FLAGS_5(sve_brkpbs, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32)
25
+######################################################################
26
+&2reg_shift vm vd q shift size
27
+
27
+
28
+DEF_HELPER_FLAGS_4(sve_brka_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
28
+@2reg_shl_d .... ... . . . shift:6 .... .... 1 q:1 . . .... \
29
+DEF_HELPER_FLAGS_4(sve_brkb_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
29
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3
30
+DEF_HELPER_FLAGS_4(sve_brka_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
30
+@2reg_shl_s .... ... . . . 1 shift:5 .... .... 0 q:1 . . .... \
31
+DEF_HELPER_FLAGS_4(sve_brkb_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
31
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2
32
+@2reg_shl_h .... ... . . . 01 shift:4 .... .... 0 q:1 . . .... \
33
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1
34
+@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
35
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0
32
+
36
+
33
+DEF_HELPER_FLAGS_4(sve_brkas_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
37
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
34
+DEF_HELPER_FLAGS_4(sve_brkbs_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
38
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
35
+DEF_HELPER_FLAGS_4(sve_brkas_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
39
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
36
+DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
40
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
37
+
41
+
38
+DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
42
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
39
+DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
43
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
40
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
44
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
45
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
46
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
41
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
42
--- a/target/arm/sve_helper.c
48
--- a/target/arm/translate-neon.inc.c
43
+++ b/target/arm/sve_helper.c
49
+++ b/target/arm/translate-neon.inc.c
44
@@ -XXX,XX +XXX,XX @@ DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=)
50
@@ -XXX,XX +XXX,XX @@ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
45
#undef DO_CMP_PPZI_S
51
DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
46
#undef DO_CMP_PPZI_D
52
DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
47
#undef DO_CMP_PPZI
53
DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
48
+
54
+
49
+/* Similar to the ARM LastActive pseudocode function. */
55
+static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
50
+static bool last_active_pred(void *vd, void *vg, intptr_t oprsz)
51
+{
56
+{
52
+ intptr_t i;
57
+ /* Handle a 2-reg-shift insn which can be vectorized. */
58
+ int vec_size = a->q ? 16 : 8;
59
+ int rd_ofs = neon_reg_offset(a->vd, 0);
60
+ int rm_ofs = neon_reg_offset(a->vm, 0);
53
+
61
+
54
+ for (i = QEMU_ALIGN_UP(oprsz, 8) - 8; i >= 0; i -= 8) {
62
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
55
+ uint64_t pg = *(uint64_t *)(vg + i);
63
+ return false;
56
+ if (pg) {
57
+ return (pow2floor(pg) & *(uint64_t *)(vd + i)) != 0;
58
+ }
59
+ }
60
+ return 0;
61
+}
62
+
63
+/* Compute a mask into RETB that is true for all G, up to and including
64
+ * (if after) or excluding (if !after) the first G & N.
65
+ * Return true if BRK found.
66
+ */
67
+static bool compute_brk(uint64_t *retb, uint64_t n, uint64_t g,
68
+ bool brk, bool after)
69
+{
70
+ uint64_t b;
71
+
72
+ if (brk) {
73
+ b = 0;
74
+ } else if ((g & n) == 0) {
75
+ /* For all G, no N are set; break not found. */
76
+ b = g;
77
+ } else {
78
+ /* Break somewhere in N. Locate it. */
79
+ b = g & n; /* guard true, pred true */
80
+ b = b & -b; /* first such */
81
+ if (after) {
82
+ b = b | (b - 1); /* break after same */
83
+ } else {
84
+ b = b - 1; /* break before same */
85
+ }
86
+ brk = true;
87
+ }
64
+ }
88
+
65
+
89
+ *retb = b;
66
+ /* UNDEF accesses to D16-D31 if they don't exist. */
90
+ return brk;
67
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
91
+}
68
+ ((a->vd | a->vm) & 0x10)) {
69
+ return false;
70
+ }
92
+
71
+
93
+/* Compute a zeroing BRK. */
72
+ if ((a->vm | a->vd) & a->q) {
94
+static void compute_brk_z(uint64_t *d, uint64_t *n, uint64_t *g,
73
+ return false;
95
+ intptr_t oprsz, bool after)
74
+ }
96
+{
97
+ bool brk = false;
98
+ intptr_t i;
99
+
75
+
100
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) {
76
+ if (!vfp_access_check(s)) {
101
+ uint64_t this_b, this_g = g[i];
102
+
103
+ brk = compute_brk(&this_b, n[i], this_g, brk, after);
104
+ d[i] = this_b & this_g;
105
+ }
106
+}
107
+
108
+/* Likewise, but also compute flags. */
109
+static uint32_t compute_brks_z(uint64_t *d, uint64_t *n, uint64_t *g,
110
+ intptr_t oprsz, bool after)
111
+{
112
+ uint32_t flags = PREDTEST_INIT;
113
+ bool brk = false;
114
+ intptr_t i;
115
+
116
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) {
117
+ uint64_t this_b, this_d, this_g = g[i];
118
+
119
+ brk = compute_brk(&this_b, n[i], this_g, brk, after);
120
+ d[i] = this_d = this_b & this_g;
121
+ flags = iter_predtest_fwd(this_d, this_g, flags);
122
+ }
123
+ return flags;
124
+}
125
+
126
+/* Compute a merging BRK. */
127
+static void compute_brk_m(uint64_t *d, uint64_t *n, uint64_t *g,
128
+ intptr_t oprsz, bool after)
129
+{
130
+ bool brk = false;
131
+ intptr_t i;
132
+
133
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) {
134
+ uint64_t this_b, this_g = g[i];
135
+
136
+ brk = compute_brk(&this_b, n[i], this_g, brk, after);
137
+ d[i] = (this_b & this_g) | (d[i] & ~this_g);
138
+ }
139
+}
140
+
141
+/* Likewise, but also compute flags. */
142
+static uint32_t compute_brks_m(uint64_t *d, uint64_t *n, uint64_t *g,
143
+ intptr_t oprsz, bool after)
144
+{
145
+ uint32_t flags = PREDTEST_INIT;
146
+ bool brk = false;
147
+ intptr_t i;
148
+
149
+ for (i = 0; i < oprsz / 8; ++i) {
150
+ uint64_t this_b, this_d = d[i], this_g = g[i];
151
+
152
+ brk = compute_brk(&this_b, n[i], this_g, brk, after);
153
+ d[i] = this_d = (this_b & this_g) | (this_d & ~this_g);
154
+ flags = iter_predtest_fwd(this_d, this_g, flags);
155
+ }
156
+ return flags;
157
+}
158
+
159
+static uint32_t do_zero(ARMPredicateReg *d, intptr_t oprsz)
160
+{
161
+ /* It is quicker to zero the whole predicate than loop on OPRSZ.
162
+ * The compiler should turn this into 4 64-bit integer stores.
163
+ */
164
+ memset(d, 0, sizeof(ARMPredicateReg));
165
+ return PREDTEST_INIT;
166
+}
167
+
168
+void HELPER(sve_brkpa)(void *vd, void *vn, void *vm, void *vg,
169
+ uint32_t pred_desc)
170
+{
171
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
172
+ if (last_active_pred(vn, vg, oprsz)) {
173
+ compute_brk_z(vd, vm, vg, oprsz, true);
174
+ } else {
175
+ do_zero(vd, oprsz);
176
+ }
177
+}
178
+
179
+uint32_t HELPER(sve_brkpas)(void *vd, void *vn, void *vm, void *vg,
180
+ uint32_t pred_desc)
181
+{
182
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
183
+ if (last_active_pred(vn, vg, oprsz)) {
184
+ return compute_brks_z(vd, vm, vg, oprsz, true);
185
+ } else {
186
+ return do_zero(vd, oprsz);
187
+ }
188
+}
189
+
190
+void HELPER(sve_brkpb)(void *vd, void *vn, void *vm, void *vg,
191
+ uint32_t pred_desc)
192
+{
193
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
194
+ if (last_active_pred(vn, vg, oprsz)) {
195
+ compute_brk_z(vd, vm, vg, oprsz, false);
196
+ } else {
197
+ do_zero(vd, oprsz);
198
+ }
199
+}
200
+
201
+uint32_t HELPER(sve_brkpbs)(void *vd, void *vn, void *vm, void *vg,
202
+ uint32_t pred_desc)
203
+{
204
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
205
+ if (last_active_pred(vn, vg, oprsz)) {
206
+ return compute_brks_z(vd, vm, vg, oprsz, false);
207
+ } else {
208
+ return do_zero(vd, oprsz);
209
+ }
210
+}
211
+
212
+void HELPER(sve_brka_z)(void *vd, void *vn, void *vg, uint32_t pred_desc)
213
+{
214
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
215
+ compute_brk_z(vd, vn, vg, oprsz, true);
216
+}
217
+
218
+uint32_t HELPER(sve_brkas_z)(void *vd, void *vn, void *vg, uint32_t pred_desc)
219
+{
220
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
221
+ return compute_brks_z(vd, vn, vg, oprsz, true);
222
+}
223
+
224
+void HELPER(sve_brkb_z)(void *vd, void *vn, void *vg, uint32_t pred_desc)
225
+{
226
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
227
+ compute_brk_z(vd, vn, vg, oprsz, false);
228
+}
229
+
230
+uint32_t HELPER(sve_brkbs_z)(void *vd, void *vn, void *vg, uint32_t pred_desc)
231
+{
232
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
233
+ return compute_brks_z(vd, vn, vg, oprsz, false);
234
+}
235
+
236
+void HELPER(sve_brka_m)(void *vd, void *vn, void *vg, uint32_t pred_desc)
237
+{
238
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
239
+ compute_brk_m(vd, vn, vg, oprsz, true);
240
+}
241
+
242
+uint32_t HELPER(sve_brkas_m)(void *vd, void *vn, void *vg, uint32_t pred_desc)
243
+{
244
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
245
+ return compute_brks_m(vd, vn, vg, oprsz, true);
246
+}
247
+
248
+void HELPER(sve_brkb_m)(void *vd, void *vn, void *vg, uint32_t pred_desc)
249
+{
250
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
251
+ compute_brk_m(vd, vn, vg, oprsz, false);
252
+}
253
+
254
+uint32_t HELPER(sve_brkbs_m)(void *vd, void *vn, void *vg, uint32_t pred_desc)
255
+{
256
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
257
+ return compute_brks_m(vd, vn, vg, oprsz, false);
258
+}
259
+
260
+void HELPER(sve_brkn)(void *vd, void *vn, void *vg, uint32_t pred_desc)
261
+{
262
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
263
+
264
+ if (!last_active_pred(vn, vg, oprsz)) {
265
+ do_zero(vd, oprsz);
266
+ }
267
+}
268
+
269
+/* As if PredTest(Ones(PL), D, esz). */
270
+static uint32_t predtest_ones(ARMPredicateReg *d, intptr_t oprsz,
271
+ uint64_t esz_mask)
272
+{
273
+ uint32_t flags = PREDTEST_INIT;
274
+ intptr_t i;
275
+
276
+ for (i = 0; i < oprsz / 8; i++) {
277
+ flags = iter_predtest_fwd(d->p[i], esz_mask, flags);
278
+ }
279
+ if (oprsz & 7) {
280
+ uint64_t mask = ~(-1ULL << (8 * (oprsz & 7)));
281
+ flags = iter_predtest_fwd(d->p[i], esz_mask & mask, flags);
282
+ }
283
+ return flags;
284
+}
285
+
286
+uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc)
287
+{
288
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
289
+
290
+ if (last_active_pred(vn, vg, oprsz)) {
291
+ return predtest_ones(vd, oprsz, -1);
292
+ } else {
293
+ return do_zero(vd, oprsz);
294
+ }
295
+}
296
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
297
index XXXXXXX..XXXXXXX 100644
298
--- a/target/arm/translate-sve.c
299
+++ b/target/arm/translate-sve.c
300
@@ -XXX,XX +XXX,XX @@ DO_PPZI(CMPLS, cmpls)
301
302
#undef DO_PPZI
303
304
+/*
305
+ *** SVE Partition Break Group
306
+ */
307
+
308
+static bool do_brk3(DisasContext *s, arg_rprr_s *a,
309
+ gen_helper_gvec_4 *fn, gen_helper_gvec_flags_4 *fn_s)
310
+{
311
+ if (!sve_access_check(s)) {
312
+ return true;
77
+ return true;
313
+ }
78
+ }
314
+
79
+
315
+ unsigned vsz = pred_full_reg_size(s);
80
+ fn(a->size, rd_ofs, rm_ofs, a->shift, vec_size, vec_size);
316
+
317
+ /* Predicate sizes may be smaller and cannot use simd_desc. */
318
+ TCGv_ptr d = tcg_temp_new_ptr();
319
+ TCGv_ptr n = tcg_temp_new_ptr();
320
+ TCGv_ptr m = tcg_temp_new_ptr();
321
+ TCGv_ptr g = tcg_temp_new_ptr();
322
+ TCGv_i32 t = tcg_const_i32(vsz - 2);
323
+
324
+ tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd));
325
+ tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn));
326
+ tcg_gen_addi_ptr(m, cpu_env, pred_full_reg_offset(s, a->rm));
327
+ tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg));
328
+
329
+ if (a->s) {
330
+ fn_s(t, d, n, m, g, t);
331
+ do_pred_flags(t);
332
+ } else {
333
+ fn(d, n, m, g, t);
334
+ }
335
+ tcg_temp_free_ptr(d);
336
+ tcg_temp_free_ptr(n);
337
+ tcg_temp_free_ptr(m);
338
+ tcg_temp_free_ptr(g);
339
+ tcg_temp_free_i32(t);
340
+ return true;
81
+ return true;
341
+}
82
+}
342
+
83
+
343
+static bool do_brk2(DisasContext *s, arg_rpr_s *a,
84
+#define DO_2SH(INSN, FUNC) \
344
+ gen_helper_gvec_3 *fn, gen_helper_gvec_flags_3 *fn_s)
85
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
345
+{
86
+ { \
346
+ if (!sve_access_check(s)) {
87
+ return do_vector_2sh(s, a, FUNC); \
347
+ return true;
88
+ } \
348
+ }
349
+
89
+
350
+ unsigned vsz = pred_full_reg_size(s);
90
+DO_2SH(VSHL, tcg_gen_gvec_shli)
91
+DO_2SH(VSLI, gen_gvec_sli)
92
diff --git a/target/arm/translate.c b/target/arm/translate.c
93
index XXXXXXX..XXXXXXX 100644
94
--- a/target/arm/translate.c
95
+++ b/target/arm/translate.c
96
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
97
if ((insn & 0x00380080) != 0) {
98
/* Two registers and shift. */
99
op = (insn >> 8) & 0xf;
351
+
100
+
352
+ /* Predicate sizes may be smaller and cannot use simd_desc. */
101
+ switch (op) {
353
+ TCGv_ptr d = tcg_temp_new_ptr();
102
+ case 5: /* VSHL, VSLI */
354
+ TCGv_ptr n = tcg_temp_new_ptr();
103
+ return 1; /* handled by decodetree */
355
+ TCGv_ptr g = tcg_temp_new_ptr();
104
+ default:
356
+ TCGv_i32 t = tcg_const_i32(vsz - 2);
105
+ break;
106
+ }
357
+
107
+
358
+ tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd));
108
if (insn & (1 << 7)) {
359
+ tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn));
109
/* 64-bit shift. */
360
+ tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg));
110
if (op > 7) {
361
+
111
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
362
+ if (a->s) {
112
gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
363
+ fn_s(t, d, n, g, t);
113
vec_size, vec_size);
364
+ do_pred_flags(t);
114
return 0;
365
+ } else {
115
-
366
+ fn(d, n, g, t);
116
- case 5: /* VSHL, VSLI */
367
+ }
117
- if (u) { /* VSLI */
368
+ tcg_temp_free_ptr(d);
118
- gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
369
+ tcg_temp_free_ptr(n);
119
- vec_size, vec_size);
370
+ tcg_temp_free_ptr(g);
120
- } else { /* VSHL */
371
+ tcg_temp_free_i32(t);
121
- tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
372
+ return true;
122
- vec_size, vec_size);
373
+}
123
- }
374
+
124
- return 0;
375
+static bool trans_BRKPA(DisasContext *s, arg_rprr_s *a, uint32_t insn)
125
}
376
+{
126
377
+ return do_brk3(s, a, gen_helper_sve_brkpa, gen_helper_sve_brkpas);
127
if (size == 3) {
378
+}
379
+
380
+static bool trans_BRKPB(DisasContext *s, arg_rprr_s *a, uint32_t insn)
381
+{
382
+ return do_brk3(s, a, gen_helper_sve_brkpb, gen_helper_sve_brkpbs);
383
+}
384
+
385
+static bool trans_BRKA_m(DisasContext *s, arg_rpr_s *a, uint32_t insn)
386
+{
387
+ return do_brk2(s, a, gen_helper_sve_brka_m, gen_helper_sve_brkas_m);
388
+}
389
+
390
+static bool trans_BRKB_m(DisasContext *s, arg_rpr_s *a, uint32_t insn)
391
+{
392
+ return do_brk2(s, a, gen_helper_sve_brkb_m, gen_helper_sve_brkbs_m);
393
+}
394
+
395
+static bool trans_BRKA_z(DisasContext *s, arg_rpr_s *a, uint32_t insn)
396
+{
397
+ return do_brk2(s, a, gen_helper_sve_brka_z, gen_helper_sve_brkas_z);
398
+}
399
+
400
+static bool trans_BRKB_z(DisasContext *s, arg_rpr_s *a, uint32_t insn)
401
+{
402
+ return do_brk2(s, a, gen_helper_sve_brkb_z, gen_helper_sve_brkbs_z);
403
+}
404
+
405
+static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn)
406
+{
407
+ return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns);
408
+}
409
+
410
/*
411
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
412
*/
413
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
414
index XXXXXXX..XXXXXXX 100644
415
--- a/target/arm/sve.decode
416
+++ b/target/arm/sve.decode
417
@@ -XXX,XX +XXX,XX @@
418
&rri_esz rd rn imm esz
419
&rrr_esz rd rn rm esz
420
&rpr_esz rd pg rn esz
421
+&rpr_s rd pg rn s
422
&rprr_s rd pg rn rm s
423
&rprr_esz rd pg rn rm esz
424
&rprrr_esz rd pg rn rm ra esz
425
@@ -XXX,XX +XXX,XX @@
426
@pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz
427
@rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz
428
429
+# Two operand with governing predicate, flags setting
430
+@pd_pg_pn_s ........ . s:1 ...... .. pg:4 . rn:4 . rd:4 &rpr_s
431
+
432
# Three operand with unused vector element size
433
@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
434
435
@@ -XXX,XX +XXX,XX @@ PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0
436
# SVE predicate next active
437
PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn
438
439
+### SVE Partition Break Group
440
+
441
+# SVE propagate break from previous partition
442
+BRKPA 00100101 0. 00 .... 11 .... 0 .... 0 .... @pd_pg_pn_pm_s
443
+BRKPB 00100101 0. 00 .... 11 .... 0 .... 1 .... @pd_pg_pn_pm_s
444
+
445
+# SVE partition break condition
446
+BRKA_z 00100101 0. 01000001 .... 0 .... 0 .... @pd_pg_pn_s
447
+BRKB_z 00100101 1. 01000001 .... 0 .... 0 .... @pd_pg_pn_s
448
+BRKA_m 00100101 0. 01000001 .... 0 .... 1 .... @pd_pg_pn_s
449
+BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s
450
+
451
+# SVE propagate break to next partition
452
+BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s
453
+
454
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
455
456
# SVE load predicate register
457
--
128
--
458
2.17.1
129
2.20.1
459
130
460
131
1
If an IOMMU supports mappings that care about the memory
1
Convert the VSHR 2-reg-shift insns to decodetree.
2
transaction attributes, then it no longer has a unique
2
3
address -> output mapping, but more than one. We can
3
Note that unlike the legacy decoder, we present the right shift
4
represent these using an IOMMU index, analogous to TCG's
4
amount to the trans_ function as a positive integer.
5
mmu indexes.
6
5
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Message-id: 20200522145520.6778-3-peter.maydell@linaro.org
10
Message-id: 20180604152941.20374-2-peter.maydell@linaro.org
11
---
9
---
12
include/exec/memory.h | 55 +++++++++++++++++++++++++++++++++++++++++++
10
target/arm/neon-dp.decode | 25 ++++++++++++++++++++
13
memory.c | 23 ++++++++++++++++++
11
target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
14
2 files changed, 78 insertions(+)
12
target/arm/translate.c | 21 +----------------
13
3 files changed, 67 insertions(+), 20 deletions(-)
15
14
16
diff --git a/include/exec/memory.h b/include/exec/memory.h
15
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
17
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
18
--- a/include/exec/memory.h
17
--- a/target/arm/neon-dp.decode
19
+++ b/include/exec/memory.h
18
+++ b/target/arm/neon-dp.decode
20
@@ -XXX,XX +XXX,XX @@ enum IOMMUMemoryRegionAttr {
19
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
21
* to report whenever mappings are changed, by calling
20
######################################################################
22
* memory_region_notify_iommu() (or, if necessary, by calling
21
&2reg_shift vm vd q shift size
23
* memory_region_notify_one() for each registered notifier).
22
24
+ *
23
+# Right shifts are encoded as N - shift, where N is the element size in bits.
25
+ * Conceptually an IOMMU provides a mapping from input address
24
+%neon_rshift_i6 16:6 !function=rsub_64
26
+ * to an output TLB entry. If the IOMMU is aware of memory transaction
25
+%neon_rshift_i5 16:5 !function=rsub_32
27
+ * attributes and the output TLB entry depends on the transaction
26
+%neon_rshift_i4 16:4 !function=rsub_16
28
+ * attributes, we represent this using IOMMU indexes. Each index
27
+%neon_rshift_i3 16:3 !function=rsub_8
29
+ * selects a particular translation table that the IOMMU has:
30
+ * @attrs_to_index returns the IOMMU index for a set of transaction attributes
31
+ * @translate takes an input address and an IOMMU index
32
+ * and the mapping returned can only depend on the input address and the
33
+ * IOMMU index.
34
+ *
35
+ * Most IOMMUs don't care about the transaction attributes and support
36
+ * only a single IOMMU index. A more complex IOMMU might have one index
37
+ * for secure transactions and one for non-secure transactions.
38
*/
39
typedef struct IOMMUMemoryRegionClass {
40
/* private */
41
@@ -XXX,XX +XXX,XX @@ typedef struct IOMMUMemoryRegionClass {
42
*/
43
int (*get_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr attr,
44
void *data);
45
+
28
+
46
+ /* Return the IOMMU index to use for a given set of transaction attributes.
29
+@2reg_shr_d .... ... . . . ...... .... .... 1 q:1 . . .... \
47
+ *
30
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3 shift=%neon_rshift_i6
48
+ * Optional method: if an IOMMU only supports a single IOMMU index then
31
+@2reg_shr_s .... ... . . . 1 ..... .... .... 0 q:1 . . .... \
49
+ * the default implementation of memory_region_iommu_attrs_to_index()
32
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 shift=%neon_rshift_i5
50
+ * will return 0.
33
+@2reg_shr_h .... ... . . . 01 .... .... .... 0 q:1 . . .... \
51
+ *
34
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 shift=%neon_rshift_i4
52
+ * The indexes supported by an IOMMU must be contiguous, starting at 0.
35
+@2reg_shr_b .... ... . . . 001 ... .... .... 0 q:1 . . .... \
53
+ *
36
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 shift=%neon_rshift_i3
54
+ * @iommu: the IOMMUMemoryRegion
55
+ * @attrs: memory transaction attributes
56
+ */
57
+ int (*attrs_to_index)(IOMMUMemoryRegion *iommu, MemTxAttrs attrs);
58
+
37
+
59
+ /* Return the number of IOMMU indexes this IOMMU supports.
38
@2reg_shl_d .... ... . . . shift:6 .... .... 1 q:1 . . .... \
60
+ *
39
&2reg_shift vm=%vm_dp vd=%vd_dp size=3
61
+ * Optional method: if this method is not provided, then
40
@2reg_shl_s .... ... . . . 1 shift:5 .... .... 0 q:1 . . .... \
62
+ * memory_region_iommu_num_indexes() will return 1, indicating that
41
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
63
+ * only a single IOMMU index is supported.
42
@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
64
+ *
43
&2reg_shift vm=%vm_dp vd=%vd_dp size=0
65
+ * @iommu: the IOMMUMemoryRegion
44
66
+ */
45
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
67
+ int (*num_indexes)(IOMMUMemoryRegion *iommu);
46
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
68
} IOMMUMemoryRegionClass;
47
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
69
48
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
70
typedef struct CoalescedMemoryRange CoalescedMemoryRange;
71
@@ -XXX,XX +XXX,XX @@ int memory_region_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
72
enum IOMMUMemoryRegionAttr attr,
73
void *data);
74
75
+/**
76
+ * memory_region_iommu_attrs_to_index: return the IOMMU index to
77
+ * use for translations with the given memory transaction attributes.
78
+ *
79
+ * @iommu_mr: the memory region
80
+ * @attrs: the memory transaction attributes
81
+ */
82
+int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
83
+ MemTxAttrs attrs);
84
+
49
+
85
+/**
50
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
86
+ * memory_region_iommu_num_indexes: return the total number of IOMMU
51
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
87
+ * indexes that this IOMMU supports.
52
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
88
+ *
53
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
89
+ * @iommu_mr: the memory region
90
+ */
91
+int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
92
+
54
+
93
/**
55
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
94
* memory_region_name: get a memory region's name
56
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
95
*
57
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
96
diff --git a/memory.c b/memory.c
58
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
97
index XXXXXXX..XXXXXXX 100644
59
index XXXXXXX..XXXXXXX 100644
98
--- a/memory.c
60
--- a/target/arm/translate-neon.inc.c
99
+++ b/memory.c
61
+++ b/target/arm/translate-neon.inc.c
100
@@ -XXX,XX +XXX,XX @@ int memory_region_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
62
@@ -XXX,XX +XXX,XX @@ static inline int plus1(DisasContext *s, int x)
101
return imrc->get_attr(iommu_mr, attr, data);
63
return x + 1;
102
}
64
}
103
65
104
+int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
66
+static inline int rsub_64(DisasContext *s, int x)
105
+ MemTxAttrs attrs)
106
+{
67
+{
107
+ IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
68
+ return 64 - x;
108
+
109
+ if (!imrc->attrs_to_index) {
110
+ return 0;
111
+ }
112
+
113
+ return imrc->attrs_to_index(iommu_mr, attrs);
114
+}
69
+}
115
+
70
+
116
+int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr)
71
+static inline int rsub_32(DisasContext *s, int x)
117
+{
72
+{
118
+ IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
73
+ return 32 - x;
119
+
74
+}
120
+ if (!imrc->num_indexes) {
75
+static inline int rsub_16(DisasContext *s, int x)
121
+ return 1;
76
+{
122
+ }
77
+ return 16 - x;
123
+
78
+}
124
+ return imrc->num_indexes(iommu_mr);
79
+static inline int rsub_8(DisasContext *s, int x)
80
+{
81
+ return 8 - x;
125
+}
82
+}
126
+
83
+
127
void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
84
/* Include the generated Neon decoder */
128
{
85
#include "decode-neon-dp.inc.c"
129
uint8_t mask = 1 << client;
86
#include "decode-neon-ls.inc.c"
87
@@ -XXX,XX +XXX,XX @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
88
89
DO_2SH(VSHL, tcg_gen_gvec_shli)
90
DO_2SH(VSLI, gen_gvec_sli)
91
+
92
+static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
93
+{
94
+ /* Signed shift out of range results in all-sign-bits */
95
+ a->shift = MIN(a->shift, (8 << a->size) - 1);
96
+ return do_vector_2sh(s, a, tcg_gen_gvec_sari);
97
+}
98
+
99
+static void gen_zero_rd_2sh(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
100
+ int64_t shift, uint32_t oprsz, uint32_t maxsz)
101
+{
102
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, oprsz, maxsz, 0);
103
+}
104
+
105
+static bool trans_VSHR_U_2sh(DisasContext *s, arg_2reg_shift *a)
106
+{
107
+ /* Shift out of range is architecturally valid and results in zero. */
108
+ if (a->shift >= (8 << a->size)) {
109
+ return do_vector_2sh(s, a, gen_zero_rd_2sh);
110
+ } else {
111
+ return do_vector_2sh(s, a, tcg_gen_gvec_shri);
112
+ }
113
+}
114
diff --git a/target/arm/translate.c b/target/arm/translate.c
115
index XXXXXXX..XXXXXXX 100644
116
--- a/target/arm/translate.c
117
+++ b/target/arm/translate.c
118
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
119
op = (insn >> 8) & 0xf;
120
121
switch (op) {
122
+ case 0: /* VSHR */
123
case 5: /* VSHL, VSLI */
124
return 1; /* handled by decodetree */
125
default:
126
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
127
}
128
129
switch (op) {
130
- case 0: /* VSHR */
131
- /* Right shift comes here negative. */
132
- shift = -shift;
133
- /* Shifts larger than the element size are architecturally
134
- * valid. Unsigned results in all zeros; signed results
135
- * in all sign bits.
136
- */
137
- if (!u) {
138
- tcg_gen_gvec_sari(size, rd_ofs, rm_ofs,
139
- MIN(shift, (8 << size) - 1),
140
- vec_size, vec_size);
141
- } else if (shift >= 8 << size) {
142
- tcg_gen_gvec_dup_imm(MO_8, rd_ofs, vec_size,
143
- vec_size, 0);
144
- } else {
145
- tcg_gen_gvec_shri(size, rd_ofs, rm_ofs, shift,
146
- vec_size, vec_size);
147
- }
148
- return 0;
149
-
150
case 1: /* VSRA */
151
/* Right shift comes here negative. */
152
shift = -shift;
130
--
153
--
131
2.17.1
154
2.20.1
132
155
133
156
1
From: Julia Suvorova <jusual@mail.ru>
1
Convert the VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree.
2
(These are the last instructions in the group that are vectorized;
3
the rest all require looping over each element.)
2
4
3
ARMv6-M supports 6 Thumb2 instructions. This patch checks for these
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
instructions and allows their execution.
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5
Like Thumb2 cores, ARMv6-M always interprets the BL instruction as 32-bit.
7
Message-id: 20200522145520.6778-4-peter.maydell@linaro.org
8
---
9
target/arm/neon-dp.decode | 35 ++++++++++++++++++++++
10
target/arm/translate-neon.inc.c | 7 +++++
11
target/arm/translate.c | 52 +++------------------------------
12
3 files changed, 46 insertions(+), 48 deletions(-)
6
13
7
This patch is required for future Cortex-M0 support.
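As a concrete illustration (not part of the patch itself): the usual
"dsb sy" encoding 0xf3bf8f4f matches the dsb entry of the tables added
below, since

    (0xf3bf8f4f & 0xfff0d0f0) == 0xf3b08040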
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
8
15
index XXXXXXX..XXXXXXX 100644
9
Signed-off-by: Julia Suvorova <jusual@mail.ru>
16
--- a/target/arm/neon-dp.decode
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
17
+++ b/target/arm/neon-dp.decode
11
Message-id: 20180612204632.28780-1-jusual@mail.ru
18
@@ -XXX,XX +XXX,XX @@ VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
12
[PMM: move armv6m_insn[] and armv6m_mask[] closer to
19
VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
13
point of use, and mark 'const'. Check for M-and-not-v7
20
VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
14
rather than M-and-v6.]
21
15
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
22
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_d
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_s
17
---
24
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_h
18
target/arm/translate.c | 43 +++++++++++++++++++++++++++++++++++++-----
25
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_b
19
1 file changed, 38 insertions(+), 5 deletions(-)
26
+
20
27
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_d
28
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_s
29
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_h
30
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_b
31
+
32
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_d
33
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_s
34
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_h
35
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_b
36
+
37
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_d
38
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_s
39
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_h
40
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_b
41
+
42
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_d
43
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_s
44
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_h
45
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_b
46
+
47
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_d
48
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_s
49
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_h
50
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_b
51
+
52
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_d
53
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_s
54
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_h
55
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_b
56
+
57
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
58
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
59
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
60
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/translate-neon.inc.c
63
+++ b/target/arm/translate-neon.inc.c
64
@@ -XXX,XX +XXX,XX @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
65
66
DO_2SH(VSHL, tcg_gen_gvec_shli)
67
DO_2SH(VSLI, gen_gvec_sli)
68
+DO_2SH(VSRI, gen_gvec_sri)
69
+DO_2SH(VSRA_S, gen_gvec_ssra)
70
+DO_2SH(VSRA_U, gen_gvec_usra)
71
+DO_2SH(VRSHR_S, gen_gvec_srshr)
72
+DO_2SH(VRSHR_U, gen_gvec_urshr)
73
+DO_2SH(VRSRA_S, gen_gvec_srsra)
74
+DO_2SH(VRSRA_U, gen_gvec_ursra)
75
76
static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
77
{
21
diff --git a/target/arm/translate.c b/target/arm/translate.c
78
diff --git a/target/arm/translate.c b/target/arm/translate.c
22
index XXXXXXX..XXXXXXX 100644
79
index XXXXXXX..XXXXXXX 100644
23
--- a/target/arm/translate.c
80
--- a/target/arm/translate.c
24
+++ b/target/arm/translate.c
81
+++ b/target/arm/translate.c
25
@@ -XXX,XX +XXX,XX @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t insn)
82
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
26
* end up actually treating this as two 16-bit insns, though,
83
27
* if it's half of a bl/blx pair that might span a page boundary.
84
switch (op) {
28
*/
85
case 0: /* VSHR */
29
- if (arm_dc_feature(s, ARM_FEATURE_THUMB2)) {
86
+ case 1: /* VSRA */
30
+ if (arm_dc_feature(s, ARM_FEATURE_THUMB2) ||
87
+ case 2: /* VRSHR */
31
+ arm_dc_feature(s, ARM_FEATURE_M)) {
88
+ case 3: /* VRSRA */
32
/* Thumb2 cores (including all M profile ones) always treat
89
+ case 4: /* VSRI */
33
* 32-bit insns as 32-bit.
90
case 5: /* VSHL, VSLI */
34
*/
91
return 1; /* handled by decodetree */
35
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
92
default:
36
int conds;
93
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
37
int logic_cc;
94
shift = shift - (1 << (size + 3));
38
95
}
39
- /* The only 32 bit insn that's allowed for Thumb1 is the combined
96
40
- * BL/BLX prefix and suffix.
97
- switch (op) {
41
+ /*
98
- case 1: /* VSRA */
42
+ * ARMv6-M supports a limited subset of Thumb2 instructions.
99
- /* Right shift comes here negative. */
43
+ * Other Thumb1 architectures allow only 32-bit
100
- shift = -shift;
44
+ * combined BL/BLX prefix and suffix.
101
- if (u) {
45
*/
102
- gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
46
- if ((insn & 0xf800e800) != 0xf000e800) {
103
- vec_size, vec_size);
47
+ if (arm_dc_feature(s, ARM_FEATURE_M) &&
104
- } else {
48
+ !arm_dc_feature(s, ARM_FEATURE_V7)) {
105
- gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
49
+ int i;
106
- vec_size, vec_size);
50
+ bool found = false;
107
- }
51
+ const uint32_t armv6m_insn[] = {0xf3808000 /* msr */,
108
- return 0;
52
+ 0xf3b08040 /* dsb */,
109
-
53
+ 0xf3b08050 /* dmb */,
110
- case 2: /* VRSHR */
54
+ 0xf3b08060 /* isb */,
111
- /* Right shift comes here negative. */
55
+ 0xf3e08000 /* mrs */,
112
- shift = -shift;
56
+ 0xf000d000 /* bl */};
113
- if (u) {
57
+ const uint32_t armv6m_mask[] = {0xffe0d000,
114
- gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
58
+ 0xfff0d0f0,
115
- vec_size, vec_size);
59
+ 0xfff0d0f0,
116
- } else {
60
+ 0xfff0d0f0,
117
- gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
61
+ 0xffe0d000,
118
- vec_size, vec_size);
62
+ 0xf800d000};
119
- }
63
+
120
- return 0;
64
+ for (i = 0; i < ARRAY_SIZE(armv6m_insn); i++) {
121
-
65
+ if ((insn & armv6m_mask[i]) == armv6m_insn[i]) {
122
- case 3: /* VRSRA */
66
+ found = true;
123
- /* Right shift comes here negative. */
67
+ break;
124
- shift = -shift;
68
+ }
125
- if (u) {
69
+ }
126
- gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
70
+ if (!found) {
127
- vec_size, vec_size);
71
+ goto illegal_op;
128
- } else {
72
+ }
129
- gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
73
+ } else if ((insn & 0xf800e800) != 0xf000e800) {
130
- vec_size, vec_size);
74
ARCH(6T2);
131
- }
75
}
132
- return 0;
76
133
-
77
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
134
- case 4: /* VSRI */
78
}
135
- if (!u) {
79
break;
136
- return 1;
80
case 3: /* Special control operations. */
137
- }
81
- ARCH(7);
138
- /* Right shift comes here negative. */
82
+ if (!arm_dc_feature(s, ARM_FEATURE_V7) &&
139
- shift = -shift;
83
+ !(arm_dc_feature(s, ARM_FEATURE_V6) &&
140
- gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
84
+ arm_dc_feature(s, ARM_FEATURE_M))) {
141
- vec_size, vec_size);
85
+ goto illegal_op;
142
- return 0;
86
+ }
143
- }
87
op = (insn >> 4) & 0xf;
144
-
88
switch (op) {
145
if (size == 3) {
89
case 2: /* clrex */
146
count = q + 1;
147
} else {
90
--
148
--
91
2.17.1
149
2.20.1
92
150
93
151
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the VQSHLU and QVSHL 2-reg-shift insns to decodetree.
2
2
These are the last of the simple shift-by-immediate insns.
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Message-id: 20180613015641.5667-4-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200522145520.6778-5-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-sve.h | 6 +
8
target/arm/neon-dp.decode | 15 +++++
9
target/arm/sve_helper.c | 290 +++++++++++++++++++++++++++++++++++++
9
target/arm/translate-neon.inc.c | 108 +++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 120 +++++++++++++++
10
target/arm/translate.c | 110 +-------------------------------
11
target/arm/sve.decode | 18 +++
11
3 files changed, 126 insertions(+), 107 deletions(-)
12
4 files changed, 434 insertions(+)
12
13
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
15
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/helper-sve.h
16
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
17
@@ -XXX,XX +XXX,XX @@ VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
19
DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
18
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
20
DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
19
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
21
20
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
22
+DEF_HELPER_FLAGS_4(sve_zip_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
21
+
23
+DEF_HELPER_FLAGS_4(sve_uzp_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
22
+VQSHLU_64_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
24
+DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
23
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_s
25
+DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
24
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_h
26
+DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
25
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_b
27
+
26
+
28
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
27
+VQSHL_S_64_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
29
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
28
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
30
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
29
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
31
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
30
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
31
+
32
+VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
33
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
34
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
35
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
36
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
32
index XXXXXXX..XXXXXXX 100644
37
index XXXXXXX..XXXXXXX 100644
33
--- a/target/arm/sve_helper.c
38
--- a/target/arm/translate-neon.inc.c
34
+++ b/target/arm/sve_helper.c
39
+++ b/target/arm/translate-neon.inc.c
35
@@ -XXX,XX +XXX,XX @@ DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2)
40
@@ -XXX,XX +XXX,XX @@ static bool trans_VSHR_U_2sh(DisasContext *s, arg_2reg_shift *a)
36
DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4)
41
return do_vector_2sh(s, a, tcg_gen_gvec_shri);
37
42
}
38
#undef DO_UNPK
43
}
39
+
44
+
40
+/* Mask of bits included in the even numbered predicates of width esz.
45
+static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
41
+ * We also use this for expand_bits/compress_bits, and so extend the
46
+ NeonGenTwo64OpEnvFn *fn)
42
+ * same pattern out to 16-bit units.
43
+ */
44
+static const uint64_t even_bit_esz_masks[5] = {
45
+ 0x5555555555555555ull,
46
+ 0x3333333333333333ull,
47
+ 0x0f0f0f0f0f0f0f0full,
48
+ 0x00ff00ff00ff00ffull,
49
+ 0x0000ffff0000ffffull,
50
+};
51
+
52
+/* Zero-extend units of 2**N bits to units of 2**(N+1) bits.
53
+ * For N==0, this corresponds to the operation that in qemu/bitops.h
54
+ * we call half_shuffle64; this algorithm is from Hacker's Delight,
55
+ * section 7-2 Shuffling Bits.
56
+ */
57
+static uint64_t expand_bits(uint64_t x, int n)
58
+{
47
+{
59
+ int i;
48
+ /*
60
+
49
+ * 2-reg-and-shift operations, size == 3 case, where the
61
+ x &= 0xffffffffu;
50
+ * function needs to be passed cpu_env.
62
+ for (i = 4; i >= n; i--) {
51
+ */
63
+ int sh = 1 << i;
52
+ TCGv_i64 constimm;
64
+ x = ((x << sh) | x) & even_bit_esz_masks[i];
53
+ int pass;
65
+ }
54
+
66
+ return x;
55
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
67
+}
56
+ return false;
68
+
57
+ }
69
+/* Compress units of 2**(N+1) bits to units of 2**N bits.
58
+
70
+ * For N==0, this corresponds to the operation that in qemu/bitops.h
59
+ /* UNDEF accesses to D16-D31 if they don't exist. */
71
+ * we call half_unshuffle64; this algorithm is from Hacker's Delight,
60
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
72
+ * section 7-2 Shuffling Bits, where it is called an inverse half shuffle.
61
+ ((a->vd | a->vm) & 0x10)) {
73
+ */
62
+ return false;
74
+static uint64_t compress_bits(uint64_t x, int n)
63
+ }
75
+{
64
+
76
+ int i;
65
+ if ((a->vm | a->vd) & a->q) {
77
+
66
+ return false;
78
+ for (i = n; i <= 4; i++) {
67
+ }
79
+ int sh = 1 << i;
68
+
80
+ x &= even_bit_esz_masks[i];
69
+ if (!vfp_access_check(s)) {
81
+ x = (x >> sh) | x;
82
+ }
83
+ return x & 0xffffffffu;
84
+}
85
+
86
+void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
87
+{
88
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
89
+ int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
90
+ intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
91
+ uint64_t *d = vd;
92
+ intptr_t i;
93
+
94
+ if (oprsz <= 8) {
95
+ uint64_t nn = *(uint64_t *)vn;
96
+ uint64_t mm = *(uint64_t *)vm;
97
+ int half = 4 * oprsz;
98
+
99
+ nn = extract64(nn, high * half, half);
100
+ mm = extract64(mm, high * half, half);
101
+ nn = expand_bits(nn, esz);
102
+ mm = expand_bits(mm, esz);
103
+ d[0] = nn + (mm << (1 << esz));
104
+ } else {
105
+ ARMPredicateReg tmp_n, tmp_m;
106
+
107
+ /* We produce output faster than we consume input.
108
+ Therefore we must be mindful of possible overlap. */
109
+ if ((vn - vd) < (uintptr_t)oprsz) {
110
+ vn = memcpy(&tmp_n, vn, oprsz);
111
+ }
112
+ if ((vm - vd) < (uintptr_t)oprsz) {
113
+ vm = memcpy(&tmp_m, vm, oprsz);
114
+ }
115
+ if (high) {
116
+ high = oprsz >> 1;
117
+ }
118
+
119
+ if ((high & 3) == 0) {
120
+ uint32_t *n = vn, *m = vm;
121
+ high >>= 2;
122
+
123
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
124
+ uint64_t nn = n[H4(high + i)];
125
+ uint64_t mm = m[H4(high + i)];
126
+
127
+ nn = expand_bits(nn, esz);
128
+ mm = expand_bits(mm, esz);
129
+ d[i] = nn + (mm << (1 << esz));
130
+ }
131
+ } else {
132
+ uint8_t *n = vn, *m = vm;
133
+ uint16_t *d16 = vd;
134
+
135
+ for (i = 0; i < oprsz / 2; i++) {
136
+ uint16_t nn = n[H1(high + i)];
137
+ uint16_t mm = m[H1(high + i)];
138
+
139
+ nn = expand_bits(nn, esz);
140
+ mm = expand_bits(mm, esz);
141
+ d16[H2(i)] = nn + (mm << (1 << esz));
142
+ }
143
+ }
144
+ }
145
+}
146
+
147
+void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
148
+{
149
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
150
+ int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
151
+ int odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1) << esz;
152
+ uint64_t *d = vd, *n = vn, *m = vm;
153
+ uint64_t l, h;
154
+ intptr_t i;
155
+
156
+ if (oprsz <= 8) {
157
+ l = compress_bits(n[0] >> odd, esz);
158
+ h = compress_bits(m[0] >> odd, esz);
159
+ d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz);
160
+ } else {
161
+ ARMPredicateReg tmp_m;
162
+ intptr_t oprsz_16 = oprsz / 16;
163
+
164
+ if ((vm - vd) < (uintptr_t)oprsz) {
165
+ m = memcpy(&tmp_m, vm, oprsz);
166
+ }
167
+
168
+ for (i = 0; i < oprsz_16; i++) {
169
+ l = n[2 * i + 0];
170
+ h = n[2 * i + 1];
171
+ l = compress_bits(l >> odd, esz);
172
+ h = compress_bits(h >> odd, esz);
173
+ d[i] = l + (h << 32);
174
+ }
175
+
176
+ /* For VL which is not a power of 2, the results from M do not
177
+ align nicely with the uint64_t for D. Put the aligned results
178
+ from M into TMP_M and then copy it into place afterward. */
179
+ if (oprsz & 15) {
180
+ d[i] = compress_bits(n[2 * i] >> odd, esz);
181
+
182
+ for (i = 0; i < oprsz_16; i++) {
183
+ l = m[2 * i + 0];
184
+ h = m[2 * i + 1];
185
+ l = compress_bits(l >> odd, esz);
186
+ h = compress_bits(h >> odd, esz);
187
+ tmp_m.p[i] = l + (h << 32);
188
+ }
189
+ tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz);
190
+
191
+ swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2);
192
+ } else {
193
+ for (i = 0; i < oprsz_16; i++) {
194
+ l = m[2 * i + 0];
195
+ h = m[2 * i + 1];
196
+ l = compress_bits(l >> odd, esz);
197
+ h = compress_bits(h >> odd, esz);
198
+ d[oprsz_16 + i] = l + (h << 32);
199
+ }
200
+ }
201
+ }
202
+}
203
+
204
+void HELPER(sve_trn_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
205
+{
206
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
207
+ uintptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
208
+ bool odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
209
+ uint64_t *d = vd, *n = vn, *m = vm;
210
+ uint64_t mask;
211
+ int shr, shl;
212
+ intptr_t i;
213
+
214
+ shl = 1 << esz;
215
+ shr = 0;
216
+ mask = even_bit_esz_masks[esz];
217
+ if (odd) {
218
+ mask <<= shl;
219
+ shr = shl;
220
+ shl = 0;
221
+ }
222
+
223
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
224
+ uint64_t nn = (n[i] & mask) >> shr;
225
+ uint64_t mm = (m[i] & mask) << shl;
226
+ d[i] = nn + mm;
227
+ }
228
+}
229
+
230
+/* Reverse units of 2**N bits. */
231
+static uint64_t reverse_bits_64(uint64_t x, int n)
232
+{
233
+ int i, sh;
234
+
235
+ x = bswap64(x);
236
+ for (i = 2, sh = 4; i >= n; i--, sh >>= 1) {
237
+ uint64_t mask = even_bit_esz_masks[i];
238
+ x = ((x & mask) << sh) | ((x >> sh) & mask);
239
+ }
240
+ return x;
241
+}
242
+
243
+static uint8_t reverse_bits_8(uint8_t x, int n)
244
+{
245
+ static const uint8_t mask[3] = { 0x55, 0x33, 0x0f };
246
+ int i, sh;
247
+
248
+ for (i = 2, sh = 4; i >= n; i--, sh >>= 1) {
249
+ x = ((x & mask[i]) << sh) | ((x >> sh) & mask[i]);
250
+ }
251
+ return x;
252
+}
253
+
254
+void HELPER(sve_rev_p)(void *vd, void *vn, uint32_t pred_desc)
255
+{
256
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
257
+ int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
258
+ intptr_t i, oprsz_2 = oprsz / 2;
259
+
260
+ if (oprsz <= 8) {
261
+ uint64_t l = *(uint64_t *)vn;
262
+ l = reverse_bits_64(l << (64 - 8 * oprsz), esz);
263
+ *(uint64_t *)vd = l;
264
+ } else if ((oprsz & 15) == 0) {
265
+ for (i = 0; i < oprsz_2; i += 8) {
266
+ intptr_t ih = oprsz - 8 - i;
267
+ uint64_t l = reverse_bits_64(*(uint64_t *)(vn + i), esz);
268
+ uint64_t h = reverse_bits_64(*(uint64_t *)(vn + ih), esz);
269
+ *(uint64_t *)(vd + i) = h;
270
+ *(uint64_t *)(vd + ih) = l;
271
+ }
272
+ } else {
273
+ for (i = 0; i < oprsz_2; i += 1) {
274
+ intptr_t il = H1(i);
275
+ intptr_t ih = H1(oprsz - 1 - i);
276
+ uint8_t l = reverse_bits_8(*(uint8_t *)(vn + il), esz);
277
+ uint8_t h = reverse_bits_8(*(uint8_t *)(vn + ih), esz);
278
+ *(uint8_t *)(vd + il) = h;
279
+ *(uint8_t *)(vd + ih) = l;
280
+ }
281
+ }
282
+}
283
+
284
+void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc)
285
+{
286
+ intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
287
+ intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
288
+ uint64_t *d = vd;
289
+ intptr_t i;
290
+
291
+ if (oprsz <= 8) {
292
+ uint64_t nn = *(uint64_t *)vn;
293
+ int half = 4 * oprsz;
294
+
295
+ nn = extract64(nn, high * half, half);
296
+ nn = expand_bits(nn, 0);
297
+ d[0] = nn;
298
+ } else {
299
+ ARMPredicateReg tmp_n;
300
+
301
+ /* We produce output faster than we consume input.
302
+ Therefore we must be mindful of possible overlap. */
303
+ if ((vn - vd) < (uintptr_t)oprsz) {
304
+ vn = memcpy(&tmp_n, vn, oprsz);
305
+ }
306
+ if (high) {
307
+ high = oprsz >> 1;
308
+ }
309
+
310
+ if ((high & 3) == 0) {
311
+ uint32_t *n = vn;
312
+ high >>= 2;
313
+
314
+ for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
315
+ uint64_t nn = n[H4(high + i)];
316
+ d[i] = expand_bits(nn, 0);
317
+ }
318
+ } else {
319
+ uint16_t *d16 = vd;
320
+ uint8_t *n = vn;
321
+
322
+ for (i = 0; i < oprsz / 2; i++) {
323
+ uint16_t nn = n[H1(high + i)];
324
+ d16[H2(i)] = expand_bits(nn, 0);
325
+ }
326
+ }
327
+ }
328
+}
329
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
330
index XXXXXXX..XXXXXXX 100644
331
--- a/target/arm/translate-sve.c
332
+++ b/target/arm/translate-sve.c
333
@@ -XXX,XX +XXX,XX @@ static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn)
334
return true;
335
}
336
337
+/*
338
+ *** SVE Permute - Predicates Group
339
+ */
340
+
341
+static bool do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd,
342
+ gen_helper_gvec_3 *fn)
343
+{
344
+ if (!sve_access_check(s)) {
345
+ return true;
70
+ return true;
346
+ }
71
+ }
347
+
72
+
348
+ unsigned vsz = pred_full_reg_size(s);
73
+ /*
349
+
74
+ * To avoid excessive duplication of ops we implement shift
350
+ /* Predicate sizes may be smaller and cannot use simd_desc.
75
+ * by immediate using the variable shift operations.
351
+ We cannot round up, as we do elsewhere, because we need
76
+ */
352
+ the exact size for ZIP2 and REV. We retain the style for
77
+ constimm = tcg_const_i64(dup_const(a->size, a->shift));
353
+ the other helpers for consistency. */
78
+
354
+ TCGv_ptr t_d = tcg_temp_new_ptr();
79
+ for (pass = 0; pass < a->q + 1; pass++) {
355
+ TCGv_ptr t_n = tcg_temp_new_ptr();
80
+ TCGv_i64 tmp = tcg_temp_new_i64();
356
+ TCGv_ptr t_m = tcg_temp_new_ptr();
81
+
357
+ TCGv_i32 t_desc;
82
+ neon_load_reg64(tmp, a->vm + pass);
358
+ int desc;
83
+ fn(tmp, cpu_env, tmp, constimm);
359
+
84
+ neon_store_reg64(tmp, a->vd + pass);
360
+ desc = vsz - 2;
85
+ }
361
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
86
+ tcg_temp_free_i64(constimm);
362
+ desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd);
363
+
364
+ tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
365
+ tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn));
366
+ tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm));
367
+ t_desc = tcg_const_i32(desc);
368
+
369
+ fn(t_d, t_n, t_m, t_desc);
370
+
371
+ tcg_temp_free_ptr(t_d);
372
+ tcg_temp_free_ptr(t_n);
373
+ tcg_temp_free_ptr(t_m);
374
+ tcg_temp_free_i32(t_desc);
375
+ return true;
87
+ return true;
376
+}
88
+}
377
+
89
+
378
+static bool do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd,
90
+static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
379
+ gen_helper_gvec_2 *fn)
91
+ NeonGenTwoOpEnvFn *fn)
380
+{
92
+{
381
+ if (!sve_access_check(s)) {
93
+ /*
94
+ * 2-reg-and-shift operations, size < 3 case, where the
95
+ * helper needs to be passed cpu_env.
96
+ */
97
+ TCGv_i32 constimm;
98
+ int pass;
99
+
100
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
101
+ return false;
102
+ }
103
+
104
+ /* UNDEF accesses to D16-D31 if they don't exist. */
105
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
106
+ ((a->vd | a->vm) & 0x10)) {
107
+ return false;
108
+ }
109
+
110
+ if ((a->vm | a->vd) & a->q) {
111
+ return false;
112
+ }
113
+
114
+ if (!vfp_access_check(s)) {
382
+ return true;
115
+ return true;
383
+ }
116
+ }
384
+
117
+
385
+ unsigned vsz = pred_full_reg_size(s);
118
+ /*
386
+ TCGv_ptr t_d = tcg_temp_new_ptr();
119
+ * To avoid excessive duplication of ops we implement shift
387
+ TCGv_ptr t_n = tcg_temp_new_ptr();
120
+ * by immediate using the variable shift operations.
388
+ TCGv_i32 t_desc;
121
+ */
389
+ int desc;
122
+ constimm = tcg_const_i32(dup_const(a->size, a->shift));
390
+
123
+
391
+ tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
124
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
392
+ tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn));
125
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
393
+
126
+ fn(tmp, cpu_env, tmp, constimm);
394
+ /* Predicate sizes may be smaller and cannot use simd_desc.
127
+ neon_store_reg(a->vd, pass, tmp);
395
+ We cannot round up, as we do elsewhere, because we need
128
+ }
396
+ the exact size for ZIP2 and REV. We retain the style for
129
+ tcg_temp_free_i32(constimm);
397
+ the other helpers for consistency. */
398
+
399
+ desc = vsz - 2;
400
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
401
+ desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd);
402
+ t_desc = tcg_const_i32(desc);
403
+
404
+ fn(t_d, t_n, t_desc);
405
+
406
+ tcg_temp_free_i32(t_desc);
407
+ tcg_temp_free_ptr(t_d);
408
+ tcg_temp_free_ptr(t_n);
409
+ return true;
130
+ return true;
410
+}
131
+}
411
+
132
+
412
+static bool trans_ZIP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
133
+#define DO_2SHIFT_ENV(INSN, FUNC) \
413
+{
134
+ static bool trans_##INSN##_64_2sh(DisasContext *s, arg_2reg_shift *a) \
414
+ return do_perm_pred3(s, a, 0, gen_helper_sve_zip_p);
135
+ { \
415
+}
136
+ return do_2shift_env_64(s, a, gen_helper_neon_##FUNC##64); \
416
+
137
+ } \
417
+static bool trans_ZIP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
138
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
418
+{
139
+ { \
419
+ return do_perm_pred3(s, a, 1, gen_helper_sve_zip_p);
140
+ static NeonGenTwoOpEnvFn * const fns[] = { \
420
+}
141
+ gen_helper_neon_##FUNC##8, \
421
+
142
+ gen_helper_neon_##FUNC##16, \
422
+static bool trans_UZP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
143
+ gen_helper_neon_##FUNC##32, \
423
+{
144
+ }; \
424
+ return do_perm_pred3(s, a, 0, gen_helper_sve_uzp_p);
145
+ assert(a->size < ARRAY_SIZE(fns)); \
425
+}
146
+ return do_2shift_env_32(s, a, fns[a->size]); \
426
+
147
+ }
427
+static bool trans_UZP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
148
+
428
+{
149
+DO_2SHIFT_ENV(VQSHLU, qshlu_s)
429
+ return do_perm_pred3(s, a, 1, gen_helper_sve_uzp_p);
150
+DO_2SHIFT_ENV(VQSHL_U, qshl_u)
430
+}
151
+DO_2SHIFT_ENV(VQSHL_S, qshl_s)
431
+
152
diff --git a/target/arm/translate.c b/target/arm/translate.c
432
+static bool trans_TRN1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
433
+{
434
+ return do_perm_pred3(s, a, 0, gen_helper_sve_trn_p);
435
+}
436
+
437
+static bool trans_TRN2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
438
+{
439
+ return do_perm_pred3(s, a, 1, gen_helper_sve_trn_p);
440
+}
441
+
442
+static bool trans_REV_p(DisasContext *s, arg_rr_esz *a, uint32_t insn)
443
+{
444
+ return do_perm_pred2(s, a, 0, gen_helper_sve_rev_p);
445
+}
446
+
447
+static bool trans_PUNPKLO(DisasContext *s, arg_PUNPKLO *a, uint32_t insn)
448
+{
449
+ return do_perm_pred2(s, a, 0, gen_helper_sve_punpk_p);
450
+}
451
+
452
+static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn)
453
+{
454
+ return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p);
455
+}
456
+
457
/*
458
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
459
*/
460
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
461
index XXXXXXX..XXXXXXX 100644
153
index XXXXXXX..XXXXXXX 100644
462
--- a/target/arm/sve.decode
154
--- a/target/arm/translate.c
463
+++ b/target/arm/sve.decode
155
+++ b/target/arm/translate.c
464
@@ -XXX,XX +XXX,XX @@
156
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
465
157
}
466
# Three operand, vector element size
158
}
467
@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz
159
468
+@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz
160
-#define GEN_NEON_INTEGER_OP_ENV(name) do { \
469
@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \
161
- switch ((size << 1) | u) { \
470
&rrr_esz rn=%reg_movprfx
162
- case 0: \
471
163
- gen_helper_neon_##name##_s8(tmp, cpu_env, tmp, tmp2); \
472
@@ -XXX,XX +XXX,XX @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm
164
- break; \
473
# SVE unpack vector elements
165
- case 1: \
474
UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5
166
- gen_helper_neon_##name##_u8(tmp, cpu_env, tmp, tmp2); \
475
167
- break; \
476
+### SVE Permute - Predicates Group
168
- case 2: \
477
+
169
- gen_helper_neon_##name##_s16(tmp, cpu_env, tmp, tmp2); \
478
+# SVE permute predicate elements
170
- break; \
479
+ZIP1_p 00000101 .. 10 .... 010 000 0 .... 0 .... @pd_pn_pm
171
- case 3: \
480
+ZIP2_p 00000101 .. 10 .... 010 001 0 .... 0 .... @pd_pn_pm
172
- gen_helper_neon_##name##_u16(tmp, cpu_env, tmp, tmp2); \
481
+UZP1_p 00000101 .. 10 .... 010 010 0 .... 0 .... @pd_pn_pm
173
- break; \
482
+UZP2_p 00000101 .. 10 .... 010 011 0 .... 0 .... @pd_pn_pm
174
- case 4: \
483
+TRN1_p 00000101 .. 10 .... 010 100 0 .... 0 .... @pd_pn_pm
175
- gen_helper_neon_##name##_s32(tmp, cpu_env, tmp, tmp2); \
484
+TRN2_p 00000101 .. 10 .... 010 101 0 .... 0 .... @pd_pn_pm
176
- break; \
485
+
177
- case 5: \
486
+# SVE reverse predicate elements
178
- gen_helper_neon_##name##_u32(tmp, cpu_env, tmp, tmp2); \
487
+REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn
179
- break; \
488
+
180
- default: return 1; \
489
+# SVE unpack predicate elements
181
- }} while (0)
490
+PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0
182
-
491
+PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0
183
static TCGv_i32 neon_load_scratch(int scratch)
492
+
184
{
493
### SVE Predicate Logical Operations Group
185
TCGv_i32 tmp = tcg_temp_new_i32();
494
186
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
495
# SVE predicate logical operations
187
int size;
188
int shift;
189
int pass;
190
- int count;
191
int u;
192
int vec_size;
193
uint32_t imm;
194
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
195
case 3: /* VRSRA */
196
case 4: /* VSRI */
197
case 5: /* VSHL, VSLI */
198
+ case 6: /* VQSHLU */
199
+ case 7: /* VQSHL */
200
return 1; /* handled by decodetree */
201
default:
202
break;
203
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
204
size--;
205
}
206
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
207
- if (op < 8) {
208
- /* Shift by immediate:
209
- VSHR, VSRA, VRSHR, VRSRA, VSRI, VSHL, VQSHL, VQSHLU. */
210
- if (q && ((rd | rm) & 1)) {
211
- return 1;
212
- }
213
- if (!u && (op == 4 || op == 6)) {
214
- return 1;
215
- }
216
- /* Right shifts are encoded as N - shift, where N is the
217
- element size in bits. */
218
- if (op <= 4) {
219
- shift = shift - (1 << (size + 3));
220
- }
221
-
222
- if (size == 3) {
223
- count = q + 1;
224
- } else {
225
- count = q ? 4: 2;
226
- }
227
-
228
- /* To avoid excessive duplication of ops we implement shift
229
- * by immediate using the variable shift operations.
230
- */
231
- imm = dup_const(size, shift);
232
-
233
- for (pass = 0; pass < count; pass++) {
234
- if (size == 3) {
235
- neon_load_reg64(cpu_V0, rm + pass);
236
- tcg_gen_movi_i64(cpu_V1, imm);
237
- switch (op) {
238
- case 6: /* VQSHLU */
239
- gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
240
- cpu_V0, cpu_V1);
241
- break;
242
- case 7: /* VQSHL */
243
- if (u) {
244
- gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
245
- cpu_V0, cpu_V1);
246
- } else {
247
- gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
248
- cpu_V0, cpu_V1);
249
- }
250
- break;
251
- default:
252
- g_assert_not_reached();
253
- }
254
- neon_store_reg64(cpu_V0, rd + pass);
255
- } else { /* size < 3 */
256
- /* Operands in T0 and T1. */
257
- tmp = neon_load_reg(rm, pass);
258
- tmp2 = tcg_temp_new_i32();
259
- tcg_gen_movi_i32(tmp2, imm);
260
- switch (op) {
261
- case 6: /* VQSHLU */
262
- switch (size) {
263
- case 0:
264
- gen_helper_neon_qshlu_s8(tmp, cpu_env,
265
- tmp, tmp2);
266
- break;
267
- case 1:
268
- gen_helper_neon_qshlu_s16(tmp, cpu_env,
269
- tmp, tmp2);
270
- break;
271
- case 2:
272
- gen_helper_neon_qshlu_s32(tmp, cpu_env,
273
- tmp, tmp2);
274
- break;
275
- default:
276
- abort();
277
- }
278
- break;
279
- case 7: /* VQSHL */
280
- GEN_NEON_INTEGER_OP_ENV(qshl);
281
- break;
282
- default:
283
- g_assert_not_reached();
284
- }
285
- tcg_temp_free_i32(tmp2);
286
- neon_store_reg(rd, pass, tmp);
287
- }
288
- } /* for pass */
289
- } else if (op < 10) {
290
+ if (op < 10) {
291
/* Shift by immediate and narrow:
292
VSHRN, VRSHRN, VQSHRN, VQRSHRN. */
293
int input_unsigned = (op == 8) ? !u : u;
496
--
294
--
497
2.17.1
295
2.20.1
498
296
499
297
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the Neon narrowing shifts where op==8 to decodetree:
2
2
* VSHRN
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
* VRSHRN
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
* VQSHRUN
5
Message-id: 20180613015641.5667-8-richard.henderson@linaro.org
5
* VQRSHRUN
6
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Message-id: 20200522145520.6778-6-peter.maydell@linaro.org
7
---
10
---
8
target/arm/translate-sve.c | 19 +++++++++++++++++++
11
target/arm/neon-dp.decode | 27 ++++++
9
target/arm/sve.decode | 6 ++++++
12
target/arm/translate-neon.inc.c | 167 ++++++++++++++++++++++++++++++++
10
2 files changed, 25 insertions(+)
13
target/arm/translate.c | 1 +
11
14
3 files changed, 195 insertions(+)
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
15
16
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
13
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
18
--- a/target/arm/neon-dp.decode
15
+++ b/target/arm/translate-sve.c
19
+++ b/target/arm/neon-dp.decode
16
@@ -XXX,XX +XXX,XX @@ static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
20
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
17
return do_last_general(s, a, true);
21
@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
18
}
22
&2reg_shift vm=%vm_dp vd=%vd_dp size=0
19
23
20
+static bool trans_CPY_m_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
24
+# Narrowing right shifts: here the Q bit is part of the opcode decode
21
+{
25
+@2reg_shrn_d .... ... . . . 1 ..... .... .... 0 . . . .... \
22
+ if (sve_access_check(s)) {
26
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3 q=0 \
23
+ do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, cpu_reg_sp(s, a->rn));
27
+ shift=%neon_rshift_i5
24
+ }
28
+@2reg_shrn_s .... ... . . . 01 .... .... .... 0 . . . .... \
29
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 q=0 \
30
+ shift=%neon_rshift_i4
31
+@2reg_shrn_h .... ... . . . 001 ... .... .... 0 . . . .... \
32
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0 \
33
+ shift=%neon_rshift_i3
34
+
35
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
36
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
37
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
38
@@ -XXX,XX +XXX,XX @@ VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
39
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
40
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
41
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
42
+
43
+VSHRN_64_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_d
44
+VSHRN_32_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_s
45
+VSHRN_16_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
46
+
47
+VRSHRN_64_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
48
+VRSHRN_32_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
49
+VRSHRN_16_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
50
+
51
+VQSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_d
52
+VQSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_s
53
+VQSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
54
+
55
+VQRSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
56
+VQRSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
57
+VQRSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
58
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
59
index XXXXXXX..XXXXXXX 100644
60
--- a/target/arm/translate-neon.inc.c
61
+++ b/target/arm/translate-neon.inc.c
62
@@ -XXX,XX +XXX,XX @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
63
DO_2SHIFT_ENV(VQSHLU, qshlu_s)
64
DO_2SHIFT_ENV(VQSHL_U, qshl_u)
65
DO_2SHIFT_ENV(VQSHL_S, qshl_s)
66
+
67
+static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
68
+ NeonGenTwo64OpFn *shiftfn,
69
+ NeonGenNarrowEnvFn *narrowfn)
70
+{
71
+ /* 2-reg-and-shift narrowing-shift operations, size == 3 case */
72
+ TCGv_i64 constimm, rm1, rm2;
73
+ TCGv_i32 rd;
74
+
75
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
76
+ return false;
77
+ }
78
+
79
+ /* UNDEF accesses to D16-D31 if they don't exist. */
80
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
81
+ ((a->vd | a->vm) & 0x10)) {
82
+ return false;
83
+ }
84
+
85
+ if (a->vm & 1) {
86
+ return false;
87
+ }
88
+
89
+ if (!vfp_access_check(s)) {
90
+ return true;
91
+ }
92
+
93
+ /*
94
+ * This is always a right shift, and the shiftfn is always a
95
+ * left-shift helper, which thus needs the negated shift count.
96
+ */
97
+ constimm = tcg_const_i64(-a->shift);
98
+ rm1 = tcg_temp_new_i64();
99
+ rm2 = tcg_temp_new_i64();
100
+
101
+ /* Load both inputs first to avoid potential overwrite if rm == rd */
102
+ neon_load_reg64(rm1, a->vm);
103
+ neon_load_reg64(rm2, a->vm + 1);
104
+
105
+ shiftfn(rm1, rm1, constimm);
106
+ rd = tcg_temp_new_i32();
107
+ narrowfn(rd, cpu_env, rm1);
108
+ neon_store_reg(a->vd, 0, rd);
109
+
110
+ shiftfn(rm2, rm2, constimm);
111
+ rd = tcg_temp_new_i32();
112
+ narrowfn(rd, cpu_env, rm2);
113
+ neon_store_reg(a->vd, 1, rd);
114
+
115
+ tcg_temp_free_i64(rm1);
116
+ tcg_temp_free_i64(rm2);
117
+ tcg_temp_free_i64(constimm);
118
+
25
+ return true;
119
+ return true;
26
+}
120
+}
27
+
121
+
28
+static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
122
+static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
29
+{
123
+ NeonGenTwoOpFn *shiftfn,
30
+ if (sve_access_check(s)) {
124
+ NeonGenNarrowEnvFn *narrowfn)
31
+ int ofs = vec_reg_offset(s, a->rn, 0, a->esz);
125
+{
32
+ TCGv_i64 t = load_esz(cpu_env, ofs, a->esz);
126
+ /* 2-reg-and-shift narrowing-shift operations, size < 3 case */
33
+ do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, t);
127
+ TCGv_i32 constimm, rm1, rm2, rm3, rm4;
34
+ tcg_temp_free_i64(t);
128
+ TCGv_i64 rtmp;
35
+ }
129
+ uint32_t imm;
130
+
131
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
132
+ return false;
133
+ }
134
+
135
+ /* UNDEF accesses to D16-D31 if they don't exist. */
136
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
137
+ ((a->vd | a->vm) & 0x10)) {
138
+ return false;
139
+ }
140
+
141
+ if (a->vm & 1) {
142
+ return false;
143
+ }
144
+
145
+ if (!vfp_access_check(s)) {
146
+ return true;
147
+ }
148
+
149
+ /*
150
+ * This is always a right shift, and the shiftfn is always a
151
+ * left-shift helper, which thus needs the negated shift count
152
+ * duplicated into each lane of the immediate value.
153
+ */
154
+ if (a->size == 1) {
155
+ imm = (uint16_t)(-a->shift);
156
+ imm |= imm << 16;
157
+ } else {
158
+ /* size == 2 */
159
+ imm = -a->shift;
160
+ }
161
+ constimm = tcg_const_i32(imm);
162
+
163
+ /* Load all inputs first to avoid potential overwrite */
164
+ rm1 = neon_load_reg(a->vm, 0);
165
+ rm2 = neon_load_reg(a->vm, 1);
166
+ rm3 = neon_load_reg(a->vm + 1, 0);
167
+ rm4 = neon_load_reg(a->vm + 1, 1);
168
+ rtmp = tcg_temp_new_i64();
169
+
170
+ shiftfn(rm1, rm1, constimm);
171
+ shiftfn(rm2, rm2, constimm);
172
+
173
+ tcg_gen_concat_i32_i64(rtmp, rm1, rm2);
174
+ tcg_temp_free_i32(rm2);
175
+
176
+ narrowfn(rm1, cpu_env, rtmp);
177
+ neon_store_reg(a->vd, 0, rm1);
178
+
179
+ shiftfn(rm3, rm3, constimm);
180
+ shiftfn(rm4, rm4, constimm);
181
+ tcg_temp_free_i32(constimm);
182
+
183
+ tcg_gen_concat_i32_i64(rtmp, rm3, rm4);
184
+ tcg_temp_free_i32(rm4);
185
+
186
+ narrowfn(rm3, cpu_env, rtmp);
187
+ tcg_temp_free_i64(rtmp);
188
+ neon_store_reg(a->vd, 1, rm3);
36
+ return true;
189
+ return true;
37
+}
190
+}
38
+
191
+
39
/*
192
+#define DO_2SN_64(INSN, FUNC, NARROWFUNC) \
40
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
193
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
41
*/
194
+ { \
42
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
195
+ return do_2shift_narrow_64(s, a, FUNC, NARROWFUNC); \
196
+ }
197
+#define DO_2SN_32(INSN, FUNC, NARROWFUNC) \
198
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
199
+ { \
200
+ return do_2shift_narrow_32(s, a, FUNC, NARROWFUNC); \
201
+ }
202
+
203
+static void gen_neon_narrow_u32(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
204
+{
205
+ tcg_gen_extrl_i64_i32(dest, src);
206
+}
207
+
208
+static void gen_neon_narrow_u16(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
209
+{
210
+ gen_helper_neon_narrow_u16(dest, src);
211
+}
212
+
213
+static void gen_neon_narrow_u8(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
214
+{
215
+ gen_helper_neon_narrow_u8(dest, src);
216
+}
217
+
218
+DO_2SN_64(VSHRN_64, gen_ushl_i64, gen_neon_narrow_u32)
219
+DO_2SN_32(VSHRN_32, gen_ushl_i32, gen_neon_narrow_u16)
220
+DO_2SN_32(VSHRN_16, gen_helper_neon_shl_u16, gen_neon_narrow_u8)
221
+
222
+DO_2SN_64(VRSHRN_64, gen_helper_neon_rshl_u64, gen_neon_narrow_u32)
223
+DO_2SN_32(VRSHRN_32, gen_helper_neon_rshl_u32, gen_neon_narrow_u16)
224
+DO_2SN_32(VRSHRN_16, gen_helper_neon_rshl_u16, gen_neon_narrow_u8)
225
+
226
+DO_2SN_64(VQSHRUN_64, gen_sshl_i64, gen_helper_neon_unarrow_sat32)
227
+DO_2SN_32(VQSHRUN_32, gen_sshl_i32, gen_helper_neon_unarrow_sat16)
228
+DO_2SN_32(VQSHRUN_16, gen_helper_neon_shl_s16, gen_helper_neon_unarrow_sat8)
229
+
230
+DO_2SN_64(VQRSHRUN_64, gen_helper_neon_rshl_s64, gen_helper_neon_unarrow_sat32)
231
+DO_2SN_32(VQRSHRUN_32, gen_helper_neon_rshl_s32, gen_helper_neon_unarrow_sat16)
232
+DO_2SN_32(VQRSHRUN_16, gen_helper_neon_rshl_s16, gen_helper_neon_unarrow_sat8)
233
diff --git a/target/arm/translate.c b/target/arm/translate.c
43
index XXXXXXX..XXXXXXX 100644
234
index XXXXXXX..XXXXXXX 100644
44
--- a/target/arm/sve.decode
235
--- a/target/arm/translate.c
45
+++ b/target/arm/sve.decode
236
+++ b/target/arm/translate.c
46
@@ -XXX,XX +XXX,XX @@ LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn
237
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
47
LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn
238
case 5: /* VSHL, VSLI */
48
LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn
239
case 6: /* VQSHLU */
49
240
case 7: /* VQSHL */
50
+# SVE copy element from SIMD&FP scalar register
241
+ case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
51
+CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn
242
return 1; /* handled by decodetree */
52
+
243
default:
53
+# SVE copy element from general register to vector (predicated)
244
break;
54
+CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn
55
+
56
### SVE Predicate Logical Operations Group
57
58
# SVE predicate logical operations
59
--
245
--
60
2.17.1
246
2.20.1
61
247
62
248
1
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
1
Convert the remaining Neon narrowing shifts to decodetree:
2
When initializing a notifier with iommu_notifier_init(), the caller
2
* VQSHRN
3
must pass the IOMMU index that it is interested in. When a change
3
* VQRSHRN
4
happens, the IOMMU implementation must pass
5
memory_region_notify_iommu() the IOMMU index that has changed and
6
that notifiers must be called for.
7
8
IOMMUs which support only a single index don't need to change.
9
Callers which only really support working with IOMMUs with a single
10
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
11
memory_region_iommu_attrs_to_index().
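For a caller that only needs the default translation, the intended usage
is roughly the sketch below (the callback name and the iommu_mr pointer
are placeholders, not something added by this series):

    /* Ask the IOMMU which index corresponds to "don't care" attributes */
    int idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                 MEMTXATTRS_UNSPECIFIED);
    IOMMUNotifier n;

    /* Watch the whole address range, but only for that one IOMMU index */
    iommu_notifier_init(&n, my_notify_cb, IOMMU_NOTIFIER_ALL,
                        0, HWADDR_MAX, idx);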
12
4
13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
14
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
15
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Message-id: 20200522145520.6778-7-peter.maydell@linaro.org
16
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
17
---
8
---
18
include/exec/memory.h | 7 ++++++-
9
target/arm/neon-dp.decode | 20 ++++++
19
hw/i386/intel_iommu.c | 6 +++---
10
target/arm/translate-neon.inc.c | 15 +++++
20
hw/ppc/spapr_iommu.c | 2 +-
11
target/arm/translate.c | 110 +-------------------------------
21
hw/s390x/s390-pci-inst.c | 4 ++--
12
3 files changed, 37 insertions(+), 108 deletions(-)
22
hw/vfio/common.c | 6 +++++-
13
23
hw/virtio/vhost.c | 7 ++++++-
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
24
memory.c | 8 +++++++-
25
7 files changed, 30 insertions(+), 10 deletions(-)
26
27
diff --git a/include/exec/memory.h b/include/exec/memory.h
28
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
29
--- a/include/exec/memory.h
16
--- a/target/arm/neon-dp.decode
30
+++ b/include/exec/memory.h
17
+++ b/target/arm/neon-dp.decode
31
@@ -XXX,XX +XXX,XX @@ struct IOMMUNotifier {
18
@@ -XXX,XX +XXX,XX @@ VQSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
32
/* Notify for address space range start <= addr <= end */
19
VQRSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
33
hwaddr start;
20
VQRSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
34
hwaddr end;
21
VQRSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
35
+ int iommu_idx;
22
+
36
QLIST_ENTRY(IOMMUNotifier) node;
23
+# VQSHRN with signed input
37
};
24
+VQSHRN_S64_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_d
38
typedef struct IOMMUNotifier IOMMUNotifier;
25
+VQSHRN_S32_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_s
39
26
+VQSHRN_S16_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
40
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
27
+
41
IOMMUNotifierFlag flags,
28
+# VQRSHRN with signed input
42
- hwaddr start, hwaddr end)
29
+VQRSHRN_S64_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
43
+ hwaddr start, hwaddr end,
30
+VQRSHRN_S32_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
44
+ int iommu_idx)
31
+VQRSHRN_S16_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
45
{
32
+
46
n->notify = fn;
33
+# VQSHRN with unsigned input
47
n->notifier_flags = flags;
34
+VQSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_d
48
n->start = start;
35
+VQSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_s
49
n->end = end;
36
+VQSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
50
+ n->iommu_idx = iommu_idx;
37
+
51
}
38
+# VQRSHRN with unsigned input
52
39
+VQRSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
53
/*
40
+VQRSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
54
@@ -XXX,XX +XXX,XX @@ uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
41
+VQRSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
55
* should be notified with an UNMAP followed by a MAP.
42
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
56
*
57
* @iommu_mr: the memory region that was changed
58
+ * @iommu_idx: the IOMMU index for the translation table which has changed
59
* @entry: the new entry in the IOMMU translation table. The entry
60
* replaces all old entries for the same virtual I/O address range.
61
* Deleted entries have .@perm == 0.
62
*/
63
void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
64
+ int iommu_idx,
65
IOMMUTLBEntry entry);
66
67
/**
68
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
69
index XXXXXXX..XXXXXXX 100644
43
index XXXXXXX..XXXXXXX 100644
70
--- a/hw/i386/intel_iommu.c
44
--- a/target/arm/translate-neon.inc.c
71
+++ b/hw/i386/intel_iommu.c
45
+++ b/target/arm/translate-neon.inc.c
72
@@ -XXX,XX +XXX,XX @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
46
@@ -XXX,XX +XXX,XX @@ DO_2SN_32(VQSHRUN_16, gen_helper_neon_shl_s16, gen_helper_neon_unarrow_sat8)
73
static int vtd_sync_shadow_page_hook(IOMMUTLBEntry *entry,
47
DO_2SN_64(VQRSHRUN_64, gen_helper_neon_rshl_s64, gen_helper_neon_unarrow_sat32)
74
void *private)
48
DO_2SN_32(VQRSHRUN_32, gen_helper_neon_rshl_s32, gen_helper_neon_unarrow_sat16)
75
{
49
DO_2SN_32(VQRSHRUN_16, gen_helper_neon_rshl_s16, gen_helper_neon_unarrow_sat8)
76
- memory_region_notify_iommu((IOMMUMemoryRegion *)private, *entry);
50
+DO_2SN_64(VQSHRN_S64, gen_sshl_i64, gen_helper_neon_narrow_sat_s32)
77
+ memory_region_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
51
+DO_2SN_32(VQSHRN_S32, gen_sshl_i32, gen_helper_neon_narrow_sat_s16)
78
return 0;
52
+DO_2SN_32(VQSHRN_S16, gen_helper_neon_shl_s16, gen_helper_neon_narrow_sat_s8)
79
}
53
+
80
54
+DO_2SN_64(VQRSHRN_S64, gen_helper_neon_rshl_s64, gen_helper_neon_narrow_sat_s32)
81
@@ -XXX,XX +XXX,XX @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
55
+DO_2SN_32(VQRSHRN_S32, gen_helper_neon_rshl_s32, gen_helper_neon_narrow_sat_s16)
82
.addr_mask = size - 1,
56
+DO_2SN_32(VQRSHRN_S16, gen_helper_neon_rshl_s16, gen_helper_neon_narrow_sat_s8)
83
.perm = IOMMU_NONE,
57
+
84
};
58
+DO_2SN_64(VQSHRN_U64, gen_ushl_i64, gen_helper_neon_narrow_sat_u32)
85
- memory_region_notify_iommu(&vtd_as->iommu, entry);
59
+DO_2SN_32(VQSHRN_U32, gen_ushl_i32, gen_helper_neon_narrow_sat_u16)
86
+ memory_region_notify_iommu(&vtd_as->iommu, 0, entry);
60
+DO_2SN_32(VQSHRN_U16, gen_helper_neon_shl_u16, gen_helper_neon_narrow_sat_u8)
87
}
61
+
88
}
62
+DO_2SN_64(VQRSHRN_U64, gen_helper_neon_rshl_u64, gen_helper_neon_narrow_sat_u32)
89
}
63
+DO_2SN_32(VQRSHRN_U32, gen_helper_neon_rshl_u32, gen_helper_neon_narrow_sat_u16)
90
@@ -XXX,XX +XXX,XX @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
64
+DO_2SN_32(VQRSHRN_U16, gen_helper_neon_rshl_u16, gen_helper_neon_narrow_sat_u8)
91
entry.iova = addr;
65
diff --git a/target/arm/translate.c b/target/arm/translate.c
92
entry.perm = IOMMU_NONE;
93
entry.translated_addr = 0;
94
- memory_region_notify_iommu(&vtd_dev_as->iommu, entry);
95
+ memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
96
97
done:
98
return true;
99
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
100
index XXXXXXX..XXXXXXX 100644
66
index XXXXXXX..XXXXXXX 100644
101
--- a/hw/ppc/spapr_iommu.c
67
--- a/target/arm/translate.c
102
+++ b/hw/ppc/spapr_iommu.c
68
+++ b/target/arm/translate.c
103
@@ -XXX,XX +XXX,XX @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
69
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_unarrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
104
entry.translated_addr = tce & page_mask;
105
entry.addr_mask = ~page_mask;
106
entry.perm = spapr_tce_iommu_access_flags(tce);
107
- memory_region_notify_iommu(&tcet->iommu, entry);
108
+ memory_region_notify_iommu(&tcet->iommu, 0, entry);
109
110
return H_SUCCESS;
111
}
112
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
113
index XXXXXXX..XXXXXXX 100644
114
--- a/hw/s390x/s390-pci-inst.c
115
+++ b/hw/s390x/s390-pci-inst.c
116
@@ -XXX,XX +XXX,XX @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
117
}
118
119
notify.perm = IOMMU_NONE;
120
- memory_region_notify_iommu(&iommu->iommu_mr, notify);
121
+ memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
122
notify.perm = entry->perm;
123
}
124
125
@@ -XXX,XX +XXX,XX @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
126
g_hash_table_replace(iommu->iotlb, &cache->iova, cache);
127
}
128
129
- memory_region_notify_iommu(&iommu->iommu_mr, notify);
130
+ memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
131
}
132
133
int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
134
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
135
index XXXXXXX..XXXXXXX 100644
136
--- a/hw/vfio/common.c
137
+++ b/hw/vfio/common.c
138
@@ -XXX,XX +XXX,XX @@ static void vfio_listener_region_add(MemoryListener *listener,
139
if (memory_region_is_iommu(section->mr)) {
140
VFIOGuestIOMMU *giommu;
141
IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
142
+ int iommu_idx;
143
144
trace_vfio_listener_region_add_iommu(iova, end);
145
/*
146
@@ -XXX,XX +XXX,XX @@ static void vfio_listener_region_add(MemoryListener *listener,
147
llend = int128_add(int128_make64(section->offset_within_region),
148
section->size);
149
llend = int128_sub(llend, int128_one());
150
+ iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
151
+ MEMTXATTRS_UNSPECIFIED);
152
iommu_notifier_init(&giommu->n, vfio_iommu_map_notify,
153
IOMMU_NOTIFIER_ALL,
154
section->offset_within_region,
155
- int128_get64(llend));
156
+ int128_get64(llend),
157
+ iommu_idx);
158
QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
159
160
memory_region_register_iommu_notifier(section->mr, &giommu->n);
161
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
162
index XXXXXXX..XXXXXXX 100644
163
--- a/hw/virtio/vhost.c
164
+++ b/hw/virtio/vhost.c
165
@@ -XXX,XX +XXX,XX @@ static void vhost_iommu_region_add(MemoryListener *listener,
166
iommu_listener);
167
struct vhost_iommu *iommu;
168
Int128 end;
169
+ int iommu_idx;
170
+ IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
171
172
if (!memory_region_is_iommu(section->mr)) {
173
return;
174
@@ -XXX,XX +XXX,XX @@ static void vhost_iommu_region_add(MemoryListener *listener,
175
end = int128_add(int128_make64(section->offset_within_region),
176
section->size);
177
end = int128_sub(end, int128_one());
178
+ iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
179
+ MEMTXATTRS_UNSPECIFIED);
180
iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
181
IOMMU_NOTIFIER_UNMAP,
182
section->offset_within_region,
183
- int128_get64(end));
184
+ int128_get64(end),
185
+ iommu_idx);
186
iommu->mr = section->mr;
187
iommu->iommu_offset = section->offset_within_address_space -
188
section->offset_within_region;
189
diff --git a/memory.c b/memory.c
190
index XXXXXXX..XXXXXXX 100644
191
--- a/memory.c
192
+++ b/memory.c
193
@@ -XXX,XX +XXX,XX @@ void memory_region_register_iommu_notifier(MemoryRegion *mr,
194
iommu_mr = IOMMU_MEMORY_REGION(mr);
195
assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
196
assert(n->start <= n->end);
197
+ assert(n->iommu_idx >= 0 &&
198
+ n->iommu_idx < memory_region_iommu_num_indexes(iommu_mr));
199
+
200
QLIST_INSERT_HEAD(&iommu_mr->iommu_notify, n, node);
201
memory_region_update_iommu_notify_flags(iommu_mr);
202
}
203
@@ -XXX,XX +XXX,XX @@ void memory_region_notify_one(IOMMUNotifier *notifier,
204
}
205
206
void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
207
+ int iommu_idx,
208
IOMMUTLBEntry entry)
209
{
210
IOMMUNotifier *iommu_notifier;
211
@@ -XXX,XX +XXX,XX @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
212
assert(memory_region_is_iommu(MEMORY_REGION(iommu_mr)));
213
214
IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
215
- memory_region_notify_one(iommu_notifier, &entry);
216
+ if (iommu_notifier->iommu_idx == iommu_idx) {
217
+ memory_region_notify_one(iommu_notifier, &entry);
218
+ }
219
}
70
}
220
}
71
}
221
72
73
-static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift,
74
- int q, int u)
75
-{
76
- if (q) {
77
- if (u) {
78
- switch (size) {
79
- case 1: gen_helper_neon_rshl_u16(var, var, shift); break;
80
- case 2: gen_helper_neon_rshl_u32(var, var, shift); break;
81
- default: abort();
82
- }
83
- } else {
84
- switch (size) {
85
- case 1: gen_helper_neon_rshl_s16(var, var, shift); break;
86
- case 2: gen_helper_neon_rshl_s32(var, var, shift); break;
87
- default: abort();
88
- }
89
- }
90
- } else {
91
- if (u) {
92
- switch (size) {
93
- case 1: gen_helper_neon_shl_u16(var, var, shift); break;
94
- case 2: gen_ushl_i32(var, var, shift); break;
95
- default: abort();
96
- }
97
- } else {
98
- switch (size) {
99
- case 1: gen_helper_neon_shl_s16(var, var, shift); break;
100
- case 2: gen_sshl_i32(var, var, shift); break;
101
- default: abort();
102
- }
103
- }
104
- }
105
-}
106
-
107
static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
108
{
109
if (u) {
110
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
111
case 6: /* VQSHLU */
112
case 7: /* VQSHL */
113
case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
114
+ case 9: /* VQSHRN, VQRSHRN */
115
return 1; /* handled by decodetree */
116
default:
117
break;
118
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
119
size--;
120
}
121
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
122
- if (op < 10) {
123
- /* Shift by immediate and narrow:
124
- VSHRN, VRSHRN, VQSHRN, VQRSHRN. */
125
- int input_unsigned = (op == 8) ? !u : u;
126
- if (rm & 1) {
127
- return 1;
128
- }
129
- shift = shift - (1 << (size + 3));
130
- size++;
131
- if (size == 3) {
132
- tmp64 = tcg_const_i64(shift);
133
- neon_load_reg64(cpu_V0, rm);
134
- neon_load_reg64(cpu_V1, rm + 1);
135
- for (pass = 0; pass < 2; pass++) {
136
- TCGv_i64 in;
137
- if (pass == 0) {
138
- in = cpu_V0;
139
- } else {
140
- in = cpu_V1;
141
- }
142
- if (q) {
143
- if (input_unsigned) {
144
- gen_helper_neon_rshl_u64(cpu_V0, in, tmp64);
145
- } else {
146
- gen_helper_neon_rshl_s64(cpu_V0, in, tmp64);
147
- }
148
- } else {
149
- if (input_unsigned) {
150
- gen_ushl_i64(cpu_V0, in, tmp64);
151
- } else {
152
- gen_sshl_i64(cpu_V0, in, tmp64);
153
- }
154
- }
155
- tmp = tcg_temp_new_i32();
156
- gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
157
- neon_store_reg(rd, pass, tmp);
158
- } /* for pass */
159
- tcg_temp_free_i64(tmp64);
160
- } else {
161
- if (size == 1) {
162
- imm = (uint16_t)shift;
163
- imm |= imm << 16;
164
- } else {
165
- /* size == 2 */
166
- imm = (uint32_t)shift;
167
- }
168
- tmp2 = tcg_const_i32(imm);
169
- tmp4 = neon_load_reg(rm + 1, 0);
170
- tmp5 = neon_load_reg(rm + 1, 1);
171
- for (pass = 0; pass < 2; pass++) {
172
- if (pass == 0) {
173
- tmp = neon_load_reg(rm, 0);
174
- } else {
175
- tmp = tmp4;
176
- }
177
- gen_neon_shift_narrow(size, tmp, tmp2, q,
178
- input_unsigned);
179
- if (pass == 0) {
180
- tmp3 = neon_load_reg(rm, 1);
181
- } else {
182
- tmp3 = tmp5;
183
- }
184
- gen_neon_shift_narrow(size, tmp3, tmp2, q,
185
- input_unsigned);
186
- tcg_gen_concat_i32_i64(cpu_V0, tmp, tmp3);
187
- tcg_temp_free_i32(tmp);
188
- tcg_temp_free_i32(tmp3);
189
- tmp = tcg_temp_new_i32();
190
- gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
191
- neon_store_reg(rd, pass, tmp);
192
- } /* for pass */
193
- tcg_temp_free_i32(tmp2);
194
- }
195
- } else if (op == 10) {
196
+ if (op == 10) {
197
/* VSHLL, VMOVL */
198
if (q || (rd & 1)) {
199
return 1;
222
--
2.17.1

--
2.20.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the VSHLL and VMOVL insns from the 2-reg-shift group
2
2
to decodetree. Since the loop always has two passes, we unroll
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
it to avoid the awkward reassignment of one TCGv to another.
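A rough standalone sketch of the widen_mask computation used by the new do_vshll_2sh() in the diff below (make_mask() and dup16() are local stand-ins for QEMU's MAKE_64BIT_MASK() and dup_const(); the 8-bit-element, shift-by-3 case is just an assumed example, not taken from the patch):

    #include <stdint.h>
    #include <stdio.h>

    /* Local stand-ins for QEMU's MAKE_64BIT_MASK() and dup_const(). */
    static uint64_t make_mask(int len) { return ~0ULL >> (64 - len); }
    static uint64_t dup16(uint64_t x) { return (x & 0xffff) * 0x0001000100010001ULL; }

    int main(void)
    {
        int esize = 8, shift = 3;               /* size=0: 8-bit inputs widened to 16 bits */
        uint64_t widen_mask = make_mask(esize); /* 0x00ff */

        widen_mask >>= esize - shift;           /* 0x0007 */
        widen_mask = dup16(widen_mask);         /* 0x0007000700070007 */
        /*
         * After the whole 64-bit widened register is shifted left by 'shift',
         * ANDing with ~widen_mask clears the bits that leaked in from the
         * neighbouring narrow element.
         */
        printf("clear mask: %016llx\n", (unsigned long long)~widen_mask);
        return 0;
    }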
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
5
Message-id: 20180613015641.5667-13-richard.henderson@linaro.org
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
7
Message-id: 20200522145520.6778-8-peter.maydell@linaro.org
7
---
8
---
8
target/arm/helper-sve.h | 44 +++++++++++++++++++
9
target/arm/neon-dp.decode | 16 +++++++
9
target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
10
target/arm/translate-sve.c | 66 ++++++++++++++++++++++++++++
11
target/arm/translate.c | 46 +------------------
11
target/arm/sve.decode | 23 ++++++++++
12
3 files changed, 99 insertions(+), 44 deletions(-)
12
4 files changed, 221 insertions(+)
13
13
14
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
15
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
16
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/helper-sve.h
17
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG,
18
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
19
DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG,
19
&2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0 \
20
i32, ptr, ptr, ptr, ptr, i32)
20
shift=%neon_rshift_i3
21
21
22
+DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
22
+# Long left shifts: again Q is part of opcode decode
23
+DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
23
+@2reg_shll_s .... ... . . . 1 shift:5 .... .... 0 . . . .... \
24
+DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
24
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 q=0
25
+DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
25
+@2reg_shll_h .... ... . . . 01 shift:4 .... .... 0 . . . .... \
26
+DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
26
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0
27
+DEF_HELPER_FLAGS_4(sve_cmple_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
27
+@2reg_shll_b .... ... . . . 001 shift:3 .... .... 0 . . . .... \
28
+DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
28
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 q=0
29
+DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
29
+
30
+DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
30
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
31
+DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
31
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
32
+
32
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
33
+DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
33
@@ -XXX,XX +XXX,XX @@ VQSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
34
+DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
34
VQRSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
35
+DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
35
VQRSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
36
+DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
36
VQRSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
37
+DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
37
+
38
+DEF_HELPER_FLAGS_4(sve_cmple_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
38
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
39
+DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
39
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
40
+DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
40
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
41
+DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
41
+
42
+DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
42
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
43
+
43
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
44
+DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
44
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
45
+DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
45
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
46
+DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
47
+DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
48
+DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
49
+DEF_HELPER_FLAGS_4(sve_cmple_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
50
+DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
51
+DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
52
+DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
53
+DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
54
+
55
+DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
56
+DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
57
+DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
58
+DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
59
+DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
60
+DEF_HELPER_FLAGS_4(sve_cmple_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
61
+DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
62
+DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
63
+DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
64
+DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
65
+
66
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
67
DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
68
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
69
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
70
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
71
--- a/target/arm/sve_helper.c
47
--- a/target/arm/translate-neon.inc.c
72
+++ b/target/arm/sve_helper.c
48
+++ b/target/arm/translate-neon.inc.c
73
@@ -XXX,XX +XXX,XX @@ DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=)
49
@@ -XXX,XX +XXX,XX @@ DO_2SN_32(VQSHRN_U16, gen_helper_neon_shl_u16, gen_helper_neon_narrow_sat_u8)
74
#undef DO_CMP_PPZW_H
50
DO_2SN_64(VQRSHRN_U64, gen_helper_neon_rshl_u64, gen_helper_neon_narrow_sat_u32)
75
#undef DO_CMP_PPZW_S
51
DO_2SN_32(VQRSHRN_U32, gen_helper_neon_rshl_u32, gen_helper_neon_narrow_sat_u16)
76
#undef DO_CMP_PPZW
52
DO_2SN_32(VQRSHRN_U16, gen_helper_neon_rshl_u16, gen_helper_neon_narrow_sat_u8)
77
+
53
+
78
+/* Similar, but the second source is immediate. */
54
+static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
79
+#define DO_CMP_PPZI(NAME, TYPE, OP, H, MASK) \
55
+ NeonGenWidenFn *widenfn, bool u)
80
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
81
+{ \
82
+ intptr_t opr_sz = simd_oprsz(desc); \
83
+ uint32_t flags = PREDTEST_INIT; \
84
+ TYPE mm = simd_data(desc); \
85
+ intptr_t i = opr_sz; \
86
+ do { \
87
+ uint64_t out = 0, pg; \
88
+ do { \
89
+ i -= sizeof(TYPE), out <<= sizeof(TYPE); \
90
+ TYPE nn = *(TYPE *)(vn + H(i)); \
91
+ out |= nn OP mm; \
92
+ } while (i & 63); \
93
+ pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
94
+ out &= pg; \
95
+ *(uint64_t *)(vd + (i >> 3)) = out; \
96
+ flags = iter_predtest_bwd(out, pg, flags); \
97
+ } while (i > 0); \
98
+ return flags; \
99
+}
100
+
101
+#define DO_CMP_PPZI_B(NAME, TYPE, OP) \
102
+ DO_CMP_PPZI(NAME, TYPE, OP, H1, 0xffffffffffffffffull)
103
+#define DO_CMP_PPZI_H(NAME, TYPE, OP) \
104
+ DO_CMP_PPZI(NAME, TYPE, OP, H1_2, 0x5555555555555555ull)
105
+#define DO_CMP_PPZI_S(NAME, TYPE, OP) \
106
+ DO_CMP_PPZI(NAME, TYPE, OP, H1_4, 0x1111111111111111ull)
107
+#define DO_CMP_PPZI_D(NAME, TYPE, OP) \
108
+ DO_CMP_PPZI(NAME, TYPE, OP, , 0x0101010101010101ull)
109
+
110
+DO_CMP_PPZI_B(sve_cmpeq_ppzi_b, uint8_t, ==)
111
+DO_CMP_PPZI_H(sve_cmpeq_ppzi_h, uint16_t, ==)
112
+DO_CMP_PPZI_S(sve_cmpeq_ppzi_s, uint32_t, ==)
113
+DO_CMP_PPZI_D(sve_cmpeq_ppzi_d, uint64_t, ==)
114
+
115
+DO_CMP_PPZI_B(sve_cmpne_ppzi_b, uint8_t, !=)
116
+DO_CMP_PPZI_H(sve_cmpne_ppzi_h, uint16_t, !=)
117
+DO_CMP_PPZI_S(sve_cmpne_ppzi_s, uint32_t, !=)
118
+DO_CMP_PPZI_D(sve_cmpne_ppzi_d, uint64_t, !=)
119
+
120
+DO_CMP_PPZI_B(sve_cmpgt_ppzi_b, int8_t, >)
121
+DO_CMP_PPZI_H(sve_cmpgt_ppzi_h, int16_t, >)
122
+DO_CMP_PPZI_S(sve_cmpgt_ppzi_s, int32_t, >)
123
+DO_CMP_PPZI_D(sve_cmpgt_ppzi_d, int64_t, >)
124
+
125
+DO_CMP_PPZI_B(sve_cmpge_ppzi_b, int8_t, >=)
126
+DO_CMP_PPZI_H(sve_cmpge_ppzi_h, int16_t, >=)
127
+DO_CMP_PPZI_S(sve_cmpge_ppzi_s, int32_t, >=)
128
+DO_CMP_PPZI_D(sve_cmpge_ppzi_d, int64_t, >=)
129
+
130
+DO_CMP_PPZI_B(sve_cmphi_ppzi_b, uint8_t, >)
131
+DO_CMP_PPZI_H(sve_cmphi_ppzi_h, uint16_t, >)
132
+DO_CMP_PPZI_S(sve_cmphi_ppzi_s, uint32_t, >)
133
+DO_CMP_PPZI_D(sve_cmphi_ppzi_d, uint64_t, >)
134
+
135
+DO_CMP_PPZI_B(sve_cmphs_ppzi_b, uint8_t, >=)
136
+DO_CMP_PPZI_H(sve_cmphs_ppzi_h, uint16_t, >=)
137
+DO_CMP_PPZI_S(sve_cmphs_ppzi_s, uint32_t, >=)
138
+DO_CMP_PPZI_D(sve_cmphs_ppzi_d, uint64_t, >=)
139
+
140
+DO_CMP_PPZI_B(sve_cmplt_ppzi_b, int8_t, <)
141
+DO_CMP_PPZI_H(sve_cmplt_ppzi_h, int16_t, <)
142
+DO_CMP_PPZI_S(sve_cmplt_ppzi_s, int32_t, <)
143
+DO_CMP_PPZI_D(sve_cmplt_ppzi_d, int64_t, <)
144
+
145
+DO_CMP_PPZI_B(sve_cmple_ppzi_b, int8_t, <=)
146
+DO_CMP_PPZI_H(sve_cmple_ppzi_h, int16_t, <=)
147
+DO_CMP_PPZI_S(sve_cmple_ppzi_s, int32_t, <=)
148
+DO_CMP_PPZI_D(sve_cmple_ppzi_d, int64_t, <=)
149
+
150
+DO_CMP_PPZI_B(sve_cmplo_ppzi_b, uint8_t, <)
151
+DO_CMP_PPZI_H(sve_cmplo_ppzi_h, uint16_t, <)
152
+DO_CMP_PPZI_S(sve_cmplo_ppzi_s, uint32_t, <)
153
+DO_CMP_PPZI_D(sve_cmplo_ppzi_d, uint64_t, <)
154
+
155
+DO_CMP_PPZI_B(sve_cmpls_ppzi_b, uint8_t, <=)
156
+DO_CMP_PPZI_H(sve_cmpls_ppzi_h, uint16_t, <=)
157
+DO_CMP_PPZI_S(sve_cmpls_ppzi_s, uint32_t, <=)
158
+DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=)
159
+
160
+#undef DO_CMP_PPZI_B
161
+#undef DO_CMP_PPZI_H
162
+#undef DO_CMP_PPZI_S
163
+#undef DO_CMP_PPZI_D
164
+#undef DO_CMP_PPZI
165
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
166
index XXXXXXX..XXXXXXX 100644
167
--- a/target/arm/translate-sve.c
168
+++ b/target/arm/translate-sve.c
169
@@ -XXX,XX +XXX,XX @@
170
#include "translate-a64.h"
171
172
173
+typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr,
174
+ TCGv_ptr, TCGv_i32);
175
typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
176
TCGv_ptr, TCGv_ptr, TCGv_i32);
177
178
@@ -XXX,XX +XXX,XX @@ DO_PPZW(CMPLS, cmpls)
179
180
#undef DO_PPZW
181
182
+/*
183
+ *** SVE Integer Compare - Immediate Groups
184
+ */
185
+
186
+static bool do_ppzi_flags(DisasContext *s, arg_rpri_esz *a,
187
+ gen_helper_gvec_flags_3 *gen_fn)
188
+{
56
+{
189
+ TCGv_ptr pd, zn, pg;
57
+ TCGv_i64 tmp;
190
+ unsigned vsz;
58
+ TCGv_i32 rm0, rm1;
191
+ TCGv_i32 t;
59
+ uint64_t widen_mask = 0;
192
+
60
+
193
+ if (gen_fn == NULL) {
61
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
194
+ return false;
62
+ return false;
195
+ }
63
+ }
196
+ if (!sve_access_check(s)) {
64
+
65
+ /* UNDEF accesses to D16-D31 if they don't exist. */
66
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
67
+ ((a->vd | a->vm) & 0x10)) {
68
+ return false;
69
+ }
70
+
71
+ if (a->vd & 1) {
72
+ return false;
73
+ }
74
+
75
+ if (!vfp_access_check(s)) {
197
+ return true;
76
+ return true;
198
+ }
77
+ }
199
+
78
+
200
+ vsz = vec_full_reg_size(s);
79
+ /*
201
+ t = tcg_const_i32(simd_desc(vsz, vsz, a->imm));
80
+ * This is a widen-and-shift operation. The shift is always less
202
+ pd = tcg_temp_new_ptr();
81
+ * than the width of the source type, so after widening the input
203
+ zn = tcg_temp_new_ptr();
82
+ * vector we can simply shift the whole 64-bit widened register,
204
+ pg = tcg_temp_new_ptr();
83
+ * and then clear the potential overflow bits resulting from left
205
+
84
+ * bits of the narrow input appearing as right bits of the left
206
+ tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd));
85
+ * neighbour narrow input. Calculate a mask of bits to clear.
207
+ tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn));
86
+ */
208
+ tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
87
+ if ((a->shift != 0) && (a->size < 2 || u)) {
209
+
88
+ int esize = 8 << a->size;
210
+ gen_fn(t, pd, zn, pg, t);
89
+ widen_mask = MAKE_64BIT_MASK(0, esize);
211
+
90
+ widen_mask >>= esize - a->shift;
212
+ tcg_temp_free_ptr(pd);
91
+ widen_mask = dup_const(a->size + 1, widen_mask);
213
+ tcg_temp_free_ptr(zn);
92
+ }
214
+ tcg_temp_free_ptr(pg);
93
+
215
+
94
+ rm0 = neon_load_reg(a->vm, 0);
216
+ do_pred_flags(t);
95
+ rm1 = neon_load_reg(a->vm, 1);
217
+
96
+ tmp = tcg_temp_new_i64();
218
+ tcg_temp_free_i32(t);
97
+
98
+ widenfn(tmp, rm0);
99
+ if (a->shift != 0) {
100
+ tcg_gen_shli_i64(tmp, tmp, a->shift);
101
+ tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
102
+ }
103
+ neon_store_reg64(tmp, a->vd);
104
+
105
+ widenfn(tmp, rm1);
106
+ if (a->shift != 0) {
107
+ tcg_gen_shli_i64(tmp, tmp, a->shift);
108
+ tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
109
+ }
110
+ neon_store_reg64(tmp, a->vd + 1);
111
+ tcg_temp_free_i64(tmp);
219
+ return true;
112
+ return true;
220
+}
113
+}
221
+
114
+
222
+#define DO_PPZI(NAME, name) \
115
+static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
223
+static bool trans_##NAME##_ppzi(DisasContext *s, arg_rpri_esz *a, \
116
+{
224
+ uint32_t insn) \
117
+ NeonGenWidenFn *widenfn[] = {
225
+{ \
118
+ gen_helper_neon_widen_s8,
226
+ static gen_helper_gvec_flags_3 * const fns[4] = { \
119
+ gen_helper_neon_widen_s16,
227
+ gen_helper_sve_##name##_ppzi_b, gen_helper_sve_##name##_ppzi_h, \
120
+ tcg_gen_ext_i32_i64,
228
+ gen_helper_sve_##name##_ppzi_s, gen_helper_sve_##name##_ppzi_d, \
121
+ };
229
+ }; \
122
+ return do_vshll_2sh(s, a, widenfn[a->size], false);
230
+ return do_ppzi_flags(s, a, fns[a->esz]); \
231
+}
123
+}
232
+
124
+
233
+DO_PPZI(CMPEQ, cmpeq)
125
+static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
234
+DO_PPZI(CMPNE, cmpne)
126
+{
235
+DO_PPZI(CMPGT, cmpgt)
127
+ NeonGenWidenFn *widenfn[] = {
236
+DO_PPZI(CMPGE, cmpge)
128
+ gen_helper_neon_widen_u8,
237
+DO_PPZI(CMPHI, cmphi)
129
+ gen_helper_neon_widen_u16,
238
+DO_PPZI(CMPHS, cmphs)
130
+ tcg_gen_extu_i32_i64,
239
+DO_PPZI(CMPLT, cmplt)
131
+ };
240
+DO_PPZI(CMPLE, cmple)
132
+ return do_vshll_2sh(s, a, widenfn[a->size], true);
241
+DO_PPZI(CMPLO, cmplo)
133
+}
242
+DO_PPZI(CMPLS, cmpls)
134
diff --git a/target/arm/translate.c b/target/arm/translate.c
243
+
244
+#undef DO_PPZI
245
+
246
/*
247
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
248
*/
249
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
250
index XXXXXXX..XXXXXXX 100644
135
index XXXXXXX..XXXXXXX 100644
251
--- a/target/arm/sve.decode
136
--- a/target/arm/translate.c
252
+++ b/target/arm/sve.decode
137
+++ b/target/arm/translate.c
253
@@ -XXX,XX +XXX,XX @@
138
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
254
@rdn_dbm ........ .. .... dbm:13 rd:5 \
139
case 7: /* VQSHL */
255
&rr_dbm rn=%reg_movprfx
140
case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
256
141
case 9: /* VQSHRN, VQRSHRN */
257
+# Predicate output, vector and immediate input,
142
+ case 10: /* VSHLL, including VMOVL */
258
+# controlling predicate, element size.
143
return 1; /* handled by decodetree */
259
+@pd_pg_rn_i7 ........ esz:2 . imm:7 . pg:3 rn:5 . rd:4 &rpri_esz
144
default:
260
+@pd_pg_rn_i5 ........ esz:2 . imm:s5 ... pg:3 rn:5 . rd:4 &rpri_esz
145
break;
261
+
146
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
262
# Basic Load/Store with 9-bit immediate offset
147
size--;
263
@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
148
}
264
&rri imm=%imm9_16_10
149
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
265
@@ -XXX,XX +XXX,XX @@ CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm
150
- if (op == 10) {
266
CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm
151
- /* VSHLL, VMOVL */
267
CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm
152
- if (q || (rd & 1)) {
268
153
- return 1;
269
+### SVE Integer Compare - Unsigned Immediate Group
154
- }
270
+
155
- tmp = neon_load_reg(rm, 0);
271
+# SVE integer compare with unsigned immediate
156
- tmp2 = neon_load_reg(rm, 1);
272
+CMPHS_ppzi 00100100 .. 1 ....... 0 ... ..... 0 .... @pd_pg_rn_i7
157
- for (pass = 0; pass < 2; pass++) {
273
+CMPHI_ppzi 00100100 .. 1 ....... 0 ... ..... 1 .... @pd_pg_rn_i7
158
- if (pass == 1)
274
+CMPLO_ppzi 00100100 .. 1 ....... 1 ... ..... 0 .... @pd_pg_rn_i7
159
- tmp = tmp2;
275
+CMPLS_ppzi 00100100 .. 1 ....... 1 ... ..... 1 .... @pd_pg_rn_i7
160
-
276
+
161
- gen_neon_widen(cpu_V0, tmp, size, u);
277
+### SVE Integer Compare - Signed Immediate Group
162
-
278
+
163
- if (shift != 0) {
279
+# SVE integer compare with signed immediate
164
- /* The shift is less than the width of the source
280
+CMPGE_ppzi 00100101 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_i5
165
- type, so we can just shift the whole register. */
281
+CMPGT_ppzi 00100101 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_i5
166
- tcg_gen_shli_i64(cpu_V0, cpu_V0, shift);
282
+CMPLT_ppzi 00100101 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_i5
167
- /* Widen the result of shift: we need to clear
283
+CMPLE_ppzi 00100101 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_i5
168
- * the potential overflow bits resulting from
284
+CMPEQ_ppzi 00100101 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_i5
169
- * left bits of the narrow input appearing as
285
+CMPNE_ppzi 00100101 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_i5
170
- * right bits of left the neighbour narrow
286
+
171
- * input. */
287
### SVE Predicate Logical Operations Group
172
- if (size < 2 || !u) {
288
173
- uint64_t imm64;
289
# SVE predicate logical operations
174
- if (size == 0) {
175
- imm = (0xffu >> (8 - shift));
176
- imm |= imm << 16;
177
- } else if (size == 1) {
178
- imm = 0xffff >> (16 - shift);
179
- } else {
180
- /* size == 2 */
181
- imm = 0xffffffff >> (32 - shift);
182
- }
183
- if (size < 2) {
184
- imm64 = imm | (((uint64_t)imm) << 32);
185
- } else {
186
- imm64 = imm;
187
- }
188
- tcg_gen_andi_i64(cpu_V0, cpu_V0, ~imm64);
189
- }
190
- }
191
- neon_store_reg64(cpu_V0, rd + pass);
192
- }
193
- } else if (op >= 14) {
194
+ if (op >= 14) {
195
/* VCVT fixed-point. */
196
TCGv_ptr fpst;
197
TCGv_i32 shiftv;
290
--
2.17.1

--
2.20.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the VCVT fixed-point conversion operations in the
2
Neon 2-regs-and-shift group to decodetree.
2
3
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Message-id: 20180613015641.5667-16-richard.henderson@linaro.org
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-id: 20200522145520.6778-9-peter.maydell@linaro.org
7
---
7
---
8
target/arm/helper-sve.h | 2 +
8
target/arm/neon-dp.decode | 11 +++++
9
target/arm/sve_helper.c | 31 ++++++++++++
9
target/arm/translate-neon.inc.c | 49 +++++++++++++++++++++
10
target/arm/translate-sve.c | 99 ++++++++++++++++++++++++++++++++++++++
10
target/arm/translate.c | 75 +--------------------------------
11
target/arm/sve.decode | 8 +++
11
3 files changed, 62 insertions(+), 73 deletions(-)
12
4 files changed, 140 insertions(+)
13
12
14
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
13
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/target/arm/helper-sve.h
15
--- a/target/arm/neon-dp.decode
17
+++ b/target/arm/helper-sve.h
16
+++ b/target/arm/neon-dp.decode
18
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
17
@@ -XXX,XX +XXX,XX @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
19
DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
18
@2reg_shll_b .... ... . . . 001 shift:3 .... .... 0 . . . .... \
20
19
&2reg_shift vm=%vm_dp vd=%vd_dp size=0 q=0
21
DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
20
21
+# We use size=0 for fp32 and size=1 for fp16 to match the 3-same encodings.
22
+@2reg_vcvt .... ... . . . 1 ..... .... .... . q:1 . . .... \
23
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 shift=%neon_rshift_i5
22
+
24
+
23
+DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32)
25
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
24
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
26
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
27
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
28
@@ -XXX,XX +XXX,XX @@ VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
29
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
30
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
31
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
32
+
33
+# VCVT fixed<->float conversions
34
+# TODO: FP16 fixed<->float conversions are opc==0b1100 and 0b1101
35
+VCVT_SF_2sh 1111 001 0 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
36
+VCVT_UF_2sh 1111 001 1 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
37
+VCVT_FS_2sh 1111 001 0 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
38
+VCVT_FU_2sh 1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
39
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
25
index XXXXXXX..XXXXXXX 100644
40
index XXXXXXX..XXXXXXX 100644
26
--- a/target/arm/sve_helper.c
41
--- a/target/arm/translate-neon.inc.c
27
+++ b/target/arm/sve_helper.c
42
+++ b/target/arm/translate-neon.inc.c
28
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc)
43
@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
29
}
44
};
30
return sum;
45
return do_vshll_2sh(s, a, widenfn[a->size], true);
31
}
46
}
32
+
47
+
33
+uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
48
+static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
49
+ NeonGenTwoSingleOPFn *fn)
34
+{
50
+{
35
+ uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
51
+ /* FP operations in 2-reg-and-shift group */
36
+ intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
52
+ TCGv_i32 tmp, shiftv;
37
+ uint64_t esz_mask = pred_esz_masks[esz];
53
+ TCGv_ptr fpstatus;
38
+ ARMPredicateReg *d = vd;
54
+ int pass;
39
+ uint32_t flags;
40
+ intptr_t i;
41
+
55
+
42
+ /* Begin with a zero predicate register. */
56
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
43
+ flags = do_zero(d, oprsz);
57
+ return false;
44
+ if (count == 0) {
45
+ return flags;
46
+ }
58
+ }
47
+
59
+
48
+ /* Scale from predicate element count to bits. */
60
+ /* UNDEF accesses to D16-D31 if they don't exist. */
49
+ count <<= esz;
61
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
50
+ /* Bound to the bits in the predicate. */
62
+ ((a->vd | a->vm) & 0x10)) {
51
+ count = MIN(count, oprsz * 8);
63
+ return false;
52
+
53
+ /* Set all of the requested bits. */
54
+ for (i = 0; i < count / 64; ++i) {
55
+ d->p[i] = esz_mask;
56
+ }
57
+ if (count & 63) {
58
+ d->p[i] = MAKE_64BIT_MASK(0, count & 63) & esz_mask;
59
+ }
64
+ }
60
+
65
+
61
+ return predtest_ones(d, oprsz, esz_mask);
66
+ if ((a->vm | a->vd) & a->q) {
62
+}
67
+ return false;
63
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
68
+ }
64
index XXXXXXX..XXXXXXX 100644
65
--- a/target/arm/translate-sve.c
66
+++ b/target/arm/translate-sve.c
67
@@ -XXX,XX +XXX,XX @@ static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a,
68
return true;
69
}
70
71
+/*
72
+ *** SVE Integer Compare Scalars Group
73
+ */
74
+
69
+
75
+static bool trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn)
70
+ if (!vfp_access_check(s)) {
76
+{
77
+ if (!sve_access_check(s)) {
78
+ return true;
71
+ return true;
79
+ }
72
+ }
80
+
73
+
81
+ TCGCond cond = (a->ne ? TCG_COND_NE : TCG_COND_EQ);
74
+ fpstatus = get_fpstatus_ptr(1);
82
+ TCGv_i64 rn = read_cpu_reg(s, a->rn, a->sf);
75
+ shiftv = tcg_const_i32(a->shift);
83
+ TCGv_i64 rm = read_cpu_reg(s, a->rm, a->sf);
76
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
84
+ TCGv_i64 cmp = tcg_temp_new_i64();
77
+ tmp = neon_load_reg(a->vm, pass);
85
+
78
+ fn(tmp, tmp, shiftv, fpstatus);
86
+ tcg_gen_setcond_i64(cond, cmp, rn, rm);
79
+ neon_store_reg(a->vd, pass, tmp);
87
+ tcg_gen_extrl_i64_i32(cpu_NF, cmp);
80
+ }
88
+ tcg_temp_free_i64(cmp);
81
+ tcg_temp_free_ptr(fpstatus);
89
+
82
+ tcg_temp_free_i32(shiftv);
90
+ /* VF = !NF & !CF. */
91
+ tcg_gen_xori_i32(cpu_VF, cpu_NF, 1);
92
+ tcg_gen_andc_i32(cpu_VF, cpu_VF, cpu_CF);
93
+
94
+ /* Both NF and VF actually look at bit 31. */
95
+ tcg_gen_neg_i32(cpu_NF, cpu_NF);
96
+ tcg_gen_neg_i32(cpu_VF, cpu_VF);
97
+ return true;
83
+ return true;
98
+}
84
+}
99
+
85
+
100
+static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
86
+#define DO_FP_2SH(INSN, FUNC) \
101
+{
87
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
102
+ if (!sve_access_check(s)) {
88
+ { \
103
+ return true;
89
+ return do_fp_2sh(s, a, FUNC); \
104
+ }
90
+ }
105
+
91
+
106
+ TCGv_i64 op0 = read_cpu_reg(s, a->rn, 1);
92
+DO_FP_2SH(VCVT_SF, gen_helper_vfp_sltos)
107
+ TCGv_i64 op1 = read_cpu_reg(s, a->rm, 1);
93
+DO_FP_2SH(VCVT_UF, gen_helper_vfp_ultos)
108
+ TCGv_i64 t0 = tcg_temp_new_i64();
94
+DO_FP_2SH(VCVT_FS, gen_helper_vfp_tosls_round_to_zero)
109
+ TCGv_i64 t1 = tcg_temp_new_i64();
95
+DO_FP_2SH(VCVT_FU, gen_helper_vfp_touls_round_to_zero)
110
+ TCGv_i32 t2, t3;
96
diff --git a/target/arm/translate.c b/target/arm/translate.c
111
+ TCGv_ptr ptr;
112
+ unsigned desc, vsz = vec_full_reg_size(s);
113
+ TCGCond cond;
114
+
115
+ if (!a->sf) {
116
+ if (a->u) {
117
+ tcg_gen_ext32u_i64(op0, op0);
118
+ tcg_gen_ext32u_i64(op1, op1);
119
+ } else {
120
+ tcg_gen_ext32s_i64(op0, op0);
121
+ tcg_gen_ext32s_i64(op1, op1);
122
+ }
123
+ }
124
+
125
+ /* For the helper, compress the different conditions into a computation
126
+ * of how many iterations for which the condition is true.
127
+ *
128
+ * This is slightly complicated by 0 <= UINT64_MAX, which is nominally
129
+ * 2**64 iterations, overflowing to 0. Of course, predicate registers
130
+ * aren't that large, so any value >= predicate size is sufficient.
131
+ */
132
+ tcg_gen_sub_i64(t0, op1, op0);
133
+
134
+ /* t0 = MIN(op1 - op0, vsz). */
135
+ tcg_gen_movi_i64(t1, vsz);
136
+ tcg_gen_umin_i64(t0, t0, t1);
137
+ if (a->eq) {
138
+ /* Equality means one more iteration. */
139
+ tcg_gen_addi_i64(t0, t0, 1);
140
+ }
141
+
142
+ /* t0 = (condition true ? t0 : 0). */
143
+ cond = (a->u
144
+ ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU)
145
+ : (a->eq ? TCG_COND_LE : TCG_COND_LT));
146
+ tcg_gen_movi_i64(t1, 0);
147
+ tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1);
148
+
149
+ t2 = tcg_temp_new_i32();
150
+ tcg_gen_extrl_i64_i32(t2, t0);
151
+ tcg_temp_free_i64(t0);
152
+ tcg_temp_free_i64(t1);
153
+
154
+ desc = (vsz / 8) - 2;
155
+ desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
156
+ t3 = tcg_const_i32(desc);
157
+
158
+ ptr = tcg_temp_new_ptr();
159
+ tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd));
160
+
161
+ gen_helper_sve_while(t2, ptr, t2, t3);
162
+ do_pred_flags(t2);
163
+
164
+ tcg_temp_free_ptr(ptr);
165
+ tcg_temp_free_i32(t2);
166
+ tcg_temp_free_i32(t3);
167
+ return true;
168
+}
169
+
170
/*
171
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
172
*/
173
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
174
index XXXXXXX..XXXXXXX 100644
97
index XXXXXXX..XXXXXXX 100644
175
--- a/target/arm/sve.decode
98
--- a/target/arm/translate.c
176
+++ b/target/arm/sve.decode
99
+++ b/target/arm/translate.c
177
@@ -XXX,XX +XXX,XX @@ SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred
100
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
178
# SVE saturating inc/dec vector by predicate count
101
int q;
179
SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred
102
int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
180
103
int size;
181
+### SVE Integer Compare - Scalars Group
104
- int shift;
182
+
105
int pass;
183
+# SVE conditionally terminate scalars
106
int u;
184
+CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000
107
int vec_size;
185
+
108
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
186
+# SVE integer compare scalar count and limit
109
return 1;
187
+WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4
110
} else if (insn & (1 << 4)) {
188
+
111
if ((insn & 0x00380080) != 0) {
189
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
112
- /* Two registers and shift. */
190
113
- op = (insn >> 8) & 0xf;
191
# SVE load predicate register
114
-
115
- switch (op) {
116
- case 0: /* VSHR */
117
- case 1: /* VSRA */
118
- case 2: /* VRSHR */
119
- case 3: /* VRSRA */
120
- case 4: /* VSRI */
121
- case 5: /* VSHL, VSLI */
122
- case 6: /* VQSHLU */
123
- case 7: /* VQSHL */
124
- case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
125
- case 9: /* VQSHRN, VQRSHRN */
126
- case 10: /* VSHLL, including VMOVL */
127
- return 1; /* handled by decodetree */
128
- default:
129
- break;
130
- }
131
-
132
- if (insn & (1 << 7)) {
133
- /* 64-bit shift. */
134
- if (op > 7) {
135
- return 1;
136
- }
137
- size = 3;
138
- } else {
139
- size = 2;
140
- while ((insn & (1 << (size + 19))) == 0)
141
- size--;
142
- }
143
- shift = (insn >> 16) & ((1 << (3 + size)) - 1);
144
- if (op >= 14) {
145
- /* VCVT fixed-point. */
146
- TCGv_ptr fpst;
147
- TCGv_i32 shiftv;
148
- VFPGenFixPointFn *fn;
149
-
150
- if (!(insn & (1 << 21)) || (q && ((rd | rm) & 1))) {
151
- return 1;
152
- }
153
-
154
- if (!(op & 1)) {
155
- if (u) {
156
- fn = gen_helper_vfp_ultos;
157
- } else {
158
- fn = gen_helper_vfp_sltos;
159
- }
160
- } else {
161
- if (u) {
162
- fn = gen_helper_vfp_touls_round_to_zero;
163
- } else {
164
- fn = gen_helper_vfp_tosls_round_to_zero;
165
- }
166
- }
167
-
168
- /* We have already masked out the must-be-1 top bit of imm6,
169
- * hence this 32-shift where the ARM ARM has 64-imm6.
170
- */
171
- shift = 32 - shift;
172
- fpst = get_fpstatus_ptr(1);
173
- shiftv = tcg_const_i32(shift);
174
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
175
- TCGv_i32 tmpf = neon_load_reg(rm, pass);
176
- fn(tmpf, tmpf, shiftv, fpst);
177
- neon_store_reg(rd, pass, tmpf);
178
- }
179
- tcg_temp_free_ptr(fpst);
180
- tcg_temp_free_i32(shiftv);
181
- } else {
182
- return 1;
183
- }
184
+ /* Two registers and shift: handled by decodetree */
185
+ return 1;
186
} else { /* (insn & 0x00380080) == 0 */
187
int invert, reg_ofs, vec_size;
188
192
--
2.17.1

--
2.20.1
1
From: Richard Henderson <richard.henderson@linaro.org>
1
Convert the insns in the one-register-and-immediate group to decodetree.
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
In the new decode, our asimd_imm_const() function returns a 64-bit value
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
rather than a 32-bit one, which means we don't need to treat cmode=14 op=1
5
Message-id: 20180613015641.5667-17-richard.henderson@linaro.org
5
as a special case in the decoder (it is the only encoding where the two
6
halves of the 64-bit value are different).
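As a standalone illustration of that point (not part of the patch; the helper name and example value here are mine), the cmode=14 op=1 encoding turns each set bit of the 8-bit immediate into an 0xff byte of the 64-bit constant, which is why the two 32-bit halves can differ:

    #include <inttypes.h>
    #include <stdio.h>

    /* Mirrors the cmode=14 op=1 expansion performed by asimd_imm_const(). */
    static uint64_t expand_cmode14_op1(uint32_t imm)
    {
        uint64_t imm64 = 0;

        for (int n = 0; n < 8; n++) {
            if (imm & (1 << n)) {
                imm64 |= 0xffULL << (n * 8);
            }
        }
        return imm64;
    }

    int main(void)
    {
        /* 0xa5 sets bits 0, 2, 5 and 7, so bytes 0, 2, 5 and 7 become 0xff. */
        printf("%016" PRIx64 "\n", expand_cmode14_op1(0xa5)); /* ff00ff0000ff00ff */
        return 0;
    }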
7
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-id: 20200522145520.6778-10-peter.maydell@linaro.org
7
---
11
---
8
target/arm/translate-sve.c | 37 +++++++++++++++++++++++++++++++++++++
12
target/arm/neon-dp.decode | 22 ++++++
9
target/arm/sve.decode | 8 ++++++++
13
target/arm/translate-neon.inc.c | 118 ++++++++++++++++++++++++++++++++
10
2 files changed, 45 insertions(+)
14
target/arm/translate.c | 101 +--------------------------
11
15
3 files changed, 142 insertions(+), 99 deletions(-)
12
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
16
17
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
13
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
14
--- a/target/arm/translate-sve.c
19
--- a/target/arm/neon-dp.decode
15
+++ b/target/arm/translate-sve.c
20
+++ b/target/arm/neon-dp.decode
16
@@ -XXX,XX +XXX,XX @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
21
@@ -XXX,XX +XXX,XX @@ VCVT_SF_2sh 1111 001 0 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
17
return true;
22
VCVT_UF_2sh 1111 001 1 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
18
}
23
VCVT_FS_2sh 1111 001 0 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
19
24
VCVT_FU_2sh 1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
20
+/*
25
+
21
+ *** SVE Integer Wide Immediate - Unpredicated Group
26
+######################################################################
22
+ */
27
+# 1-reg-and-modified-immediate grouping:
23
+
28
+# 1111 001 i 1 D 000 imm:3 Vd:4 cmode:4 0 Q op 1 Vm:4
24
+static bool trans_FDUP(DisasContext *s, arg_FDUP *a, uint32_t insn)
29
+######################################################################
25
+{
30
+
26
+ if (a->esz == 0) {
31
+&1reg_imm vd q imm cmode op
32
+
33
+%asimd_imm_value 24:1 16:3 0:4
34
+
35
+@1reg_imm .... ... . . . ... ... .... .... . q:1 . . .... \
36
+ &1reg_imm imm=%asimd_imm_value vd=%vd_dp
37
+
38
+# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
39
+# not in a way we can conveniently represent in decodetree without
40
+# a lot of repetition:
41
+# VORR: op=0, (cmode & 1) && cmode < 12
42
+# VBIC: op=1, (cmode & 1) && cmode < 12
43
+# VMOV: everything else
44
+# So we have a single decode line and check the cmode/op in the
45
+# trans function.
46
+Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
47
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/target/arm/translate-neon.inc.c
50
+++ b/target/arm/translate-neon.inc.c
51
@@ -XXX,XX +XXX,XX @@ DO_FP_2SH(VCVT_SF, gen_helper_vfp_sltos)
52
DO_FP_2SH(VCVT_UF, gen_helper_vfp_ultos)
53
DO_FP_2SH(VCVT_FS, gen_helper_vfp_tosls_round_to_zero)
54
DO_FP_2SH(VCVT_FU, gen_helper_vfp_touls_round_to_zero)
55
+
56
+static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
57
+{
58
+ /*
59
+ * Expand the encoded constant.
60
+ * Note that cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
61
+ * We choose to not special-case this and will behave as if a
62
+ * valid constant encoding of 0 had been given.
63
+ * cmode = 15 op = 1 must UNDEF; we assume decode has handled that.
64
+ */
65
+ switch (cmode) {
66
+ case 0: case 1:
67
+ /* no-op */
68
+ break;
69
+ case 2: case 3:
70
+ imm <<= 8;
71
+ break;
72
+ case 4: case 5:
73
+ imm <<= 16;
74
+ break;
75
+ case 6: case 7:
76
+ imm <<= 24;
77
+ break;
78
+ case 8: case 9:
79
+ imm |= imm << 16;
80
+ break;
81
+ case 10: case 11:
82
+ imm = (imm << 8) | (imm << 24);
83
+ break;
84
+ case 12:
85
+ imm = (imm << 8) | 0xff;
86
+ break;
87
+ case 13:
88
+ imm = (imm << 16) | 0xffff;
89
+ break;
90
+ case 14:
91
+ if (op) {
92
+ /*
93
+ * This is the only case where the top and bottom 32 bits
94
+ * of the encoded constant differ.
95
+ */
96
+ uint64_t imm64 = 0;
97
+ int n;
98
+
99
+ for (n = 0; n < 8; n++) {
100
+ if (imm & (1 << n)) {
101
+ imm64 |= (0xffULL << (n * 8));
102
+ }
103
+ }
104
+ return imm64;
105
+ }
106
+ imm |= (imm << 8) | (imm << 16) | (imm << 24);
107
+ break;
108
+ case 15:
109
+ imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
110
+ | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
111
+ break;
112
+ }
113
+ if (op) {
114
+ imm = ~imm;
115
+ }
116
+ return dup_const(MO_32, imm);
117
+}
118
+
119
+static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
120
+ GVecGen2iFn *fn)
121
+{
122
+ uint64_t imm;
123
+ int reg_ofs, vec_size;
124
+
125
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
27
+ return false;
126
+ return false;
28
+ }
127
+ }
29
+ if (sve_access_check(s)) {
128
+
30
+ unsigned vsz = vec_full_reg_size(s);
129
+ /* UNDEF accesses to D16-D31 if they don't exist. */
31
+ int dofs = vec_full_reg_offset(s, a->rd);
130
+ if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
32
+ uint64_t imm;
131
+ return false;
33
+
132
+ }
34
+ /* Decode the VFP immediate. */
133
+
35
+ imm = vfp_expand_imm(a->esz, a->imm);
134
+ if (a->vd & a->q) {
36
+ imm = dup_const(a->esz, imm);
135
+ return false;
37
+
136
+ }
38
+ tcg_gen_gvec_dup64i(dofs, vsz, vsz, imm);
137
+
39
+ }
138
+ if (!vfp_access_check(s)) {
139
+ return true;
140
+ }
141
+
142
+ reg_ofs = neon_reg_offset(a->vd, 0);
143
+ vec_size = a->q ? 16 : 8;
144
+ imm = asimd_imm_const(a->imm, a->cmode, a->op);
145
+
146
+ fn(MO_64, reg_ofs, reg_ofs, imm, vec_size, vec_size);
40
+ return true;
147
+ return true;
41
+}
148
+}
42
+
149
+
43
+static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn)
150
+static void gen_VMOV_1r(unsigned vece, uint32_t dofs, uint32_t aofs,
44
+{
151
+ int64_t c, uint32_t oprsz, uint32_t maxsz)
45
+ if (a->esz == 0 && extract32(insn, 13, 1)) {
152
+{
46
+ return false;
153
+ tcg_gen_gvec_dup_imm(MO_64, dofs, oprsz, maxsz, c);
47
+ }
154
+}
48
+ if (sve_access_check(s)) {
155
+
49
+ unsigned vsz = vec_full_reg_size(s);
156
+static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
50
+ int dofs = vec_full_reg_offset(s, a->rd);
157
+{
51
+
158
+ /* Handle decode of cmode/op here between VORR/VBIC/VMOV */
52
+ tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm));
159
+ GVecGen2iFn *fn;
53
+ }
160
+
54
+ return true;
161
+ if ((a->cmode & 1) && a->cmode < 12) {
55
+}
162
+ /* for op=1, the imm will be inverted, so BIC becomes AND. */
56
+
163
+ fn = a->op ? tcg_gen_gvec_andi : tcg_gen_gvec_ori;
57
/*
164
+ } else {
58
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
165
+ /* There is one unallocated cmode/op combination in this space */
59
*/
166
+ if (a->cmode == 15 && a->op == 1) {
60
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
167
+ return false;
168
+ }
169
+ fn = gen_VMOV_1r;
170
+ }
171
+ return do_1reg_imm(s, a, fn);
172
+}
173
diff --git a/target/arm/translate.c b/target/arm/translate.c
61
index XXXXXXX..XXXXXXX 100644
174
index XXXXXXX..XXXXXXX 100644
62
--- a/target/arm/sve.decode
175
--- a/target/arm/translate.c
63
+++ b/target/arm/sve.decode
176
+++ b/target/arm/translate.c
64
@@ -XXX,XX +XXX,XX @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000
177
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
65
# SVE integer compare scalar count and limit
178
/* Three register same length: handled by decodetree */
66
WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4
179
return 1;
67
180
} else if (insn & (1 << 4)) {
68
+### SVE Integer Wide Immediate - Unpredicated Group
181
- if ((insn & 0x00380080) != 0) {
69
+
182
- /* Two registers and shift: handled by decodetree */
70
+# SVE broadcast floating-point immediate (unpredicated)
183
- return 1;
71
+FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5
184
- } else { /* (insn & 0x00380080) == 0 */
72
+
185
- int invert, reg_ofs, vec_size;
73
+# SVE broadcast integer immediate (unpredicated)
186
-
74
+DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=%sh8_i8s
187
- if (q && (rd & 1)) {
75
+
188
- return 1;
76
### SVE Memory - 32-bit Gather and Unsized Contiguous Group
189
- }
77
190
-
78
# SVE load predicate register
191
- op = (insn >> 8) & 0xf;
192
- /* One register and immediate. */
193
- imm = (u << 7) | ((insn >> 12) & 0x70) | (insn & 0xf);
194
- invert = (insn & (1 << 5)) != 0;
195
- /* Note that op = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
196
- * We choose to not special-case this and will behave as if a
197
- * valid constant encoding of 0 had been given.
198
- */
199
- switch (op) {
200
- case 0: case 1:
201
- /* no-op */
202
- break;
203
- case 2: case 3:
204
- imm <<= 8;
205
- break;
206
- case 4: case 5:
207
- imm <<= 16;
208
- break;
209
- case 6: case 7:
210
- imm <<= 24;
211
- break;
212
- case 8: case 9:
213
- imm |= imm << 16;
214
- break;
215
- case 10: case 11:
216
- imm = (imm << 8) | (imm << 24);
217
- break;
218
- case 12:
219
- imm = (imm << 8) | 0xff;
220
- break;
221
- case 13:
222
- imm = (imm << 16) | 0xffff;
223
- break;
224
- case 14:
225
- imm |= (imm << 8) | (imm << 16) | (imm << 24);
226
- if (invert) {
227
- imm = ~imm;
228
- }
229
- break;
230
- case 15:
231
- if (invert) {
232
- return 1;
233
- }
234
- imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
235
- | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
236
- break;
237
- }
238
- if (invert) {
239
- imm = ~imm;
240
- }
241
-
242
- reg_ofs = neon_reg_offset(rd, 0);
243
- vec_size = q ? 16 : 8;
244
-
245
- if (op & 1 && op < 12) {
246
- if (invert) {
247
- /* The immediate value has already been inverted,
248
- * so BIC becomes AND.
249
- */
250
- tcg_gen_gvec_andi(MO_32, reg_ofs, reg_ofs, imm,
251
- vec_size, vec_size);
252
- } else {
253
- tcg_gen_gvec_ori(MO_32, reg_ofs, reg_ofs, imm,
254
- vec_size, vec_size);
255
- }
256
- } else {
257
- /* VMOV, VMVN. */
258
- if (op == 14 && invert) {
259
- TCGv_i64 t64 = tcg_temp_new_i64();
260
-
261
- for (pass = 0; pass <= q; ++pass) {
262
- uint64_t val = 0;
263
- int n;
264
-
265
- for (n = 0; n < 8; n++) {
266
- if (imm & (1 << (n + pass * 8))) {
267
- val |= 0xffull << (n * 8);
268
- }
269
- }
270
- tcg_gen_movi_i64(t64, val);
271
- neon_store_reg64(t64, rd + pass);
272
- }
273
- tcg_temp_free_i64(t64);
274
- } else {
275
- tcg_gen_gvec_dup_imm(MO_32, reg_ofs, vec_size,
276
- vec_size, imm);
277
- }
278
- }
279
- }
280
+ /* Two registers and shift or reg and imm: handled by decodetree */
281
+ return 1;
282
} else { /* (insn & 0x00800010 == 0x00800000) */
283
if (size != 3) {
284
op = (insn >> 8) & 0xf;
79
--
2.17.1

--
2.20.1
Deleted patch
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
---
 include/exec/memory.h    |  3 ++-
 exec.c                   | 11 +++++++++--
 hw/alpha/typhoon.c       |  3 ++-
 hw/arm/smmuv3.c          |  2 +-
 hw/dma/rc4030.c          |  2 +-
 hw/i386/amd_iommu.c      |  2 +-
 hw/i386/intel_iommu.c    |  2 +-
 hw/ppc/spapr_iommu.c     |  3 ++-
 hw/s390x/s390-pci-bus.c  |  2 +-
 hw/sparc/sun4m_iommu.c   |  3 ++-
 hw/sparc64/sun4u_iommu.c |  2 +-
 memory.c                 |  2 +-
 12 files changed, 24 insertions(+), 13 deletions(-)
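Purely as an illustration of why this is behaviour-neutral (a toy sketch, not QEMU code: the typedefs below are simplified stand-ins for the real IOMMUTLBEntry and IOMMUAccessFlags types), a single-index translate hook simply accepts the new iommu_idx argument and never looks at it:

    #include <stdint.h>
    #include <stdio.h>

    /* Simplified stand-ins for QEMU's types, for illustration only. */
    typedef uint64_t hwaddr;
    typedef enum { IOMMU_NONE, IOMMU_RO, IOMMU_WO, IOMMU_RW } IOMMUAccessFlags;
    typedef struct { hwaddr iova; hwaddr translated_addr; IOMMUAccessFlags perm; } IOMMUTLBEntry;

    /* A single-index IOMMU's translate hook: the new iommu_idx argument
     * arrives but is ignored, so the translation result is unchanged. */
    static IOMMUTLBEntry toy_translate(hwaddr addr, IOMMUAccessFlags flag, int iommu_idx)
    {
        (void)iommu_idx;   /* only index 0 is supported */
        IOMMUTLBEntry ret = { .iova = addr, .translated_addr = addr, .perm = flag };
        return ret;
    }

    int main(void)
    {
        IOMMUTLBEntry e = toy_translate(0x1000, IOMMU_RW, 0);
        printf("0x%llx -> 0x%llx\n", (unsigned long long)e.iova,
               (unsigned long long)e.translated_addr);
        return 0;
    }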
diff --git a/include/exec/memory.h b/include/exec/memory.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -XXX,XX +XXX,XX @@ typedef struct IOMMUMemoryRegionClass {
* @iommu: the IOMMUMemoryRegion
* @hwaddr: address to be translated within the memory region
* @flag: requested access permissions
+ * @iommu_idx: IOMMU index for the translation
*/
IOMMUTLBEntry (*translate)(IOMMUMemoryRegion *iommu, hwaddr addr,
- IOMMUAccessFlags flag);
+ IOMMUAccessFlags flag, int iommu_idx);
/* Returns minimum supported page size in bytes.
* If this method is not provided then the minimum is assumed to
* be TARGET_PAGE_SIZE.
diff --git a/exec.c b/exec.c
index XXXXXXX..XXXXXXX 100644
--- a/exec.c
+++ b/exec.c
@@ -XXX,XX +XXX,XX @@ static MemoryRegionSection address_space_translate_iommu(IOMMUMemoryRegion *iomm
do {
hwaddr addr = *xlat;
IOMMUMemoryRegionClass *imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
- IOMMUTLBEntry iotlb = imrc->translate(iommu_mr, addr, is_write ?
- IOMMU_WO : IOMMU_RO);
+ int iommu_idx = 0;
+ IOMMUTLBEntry iotlb;
+
+ if (imrc->attrs_to_index) {
+ iommu_idx = imrc->attrs_to_index(iommu_mr, attrs);
+ }
+
+ iotlb = imrc->translate(iommu_mr, addr, is_write ?
+ IOMMU_WO : IOMMU_RO, iommu_idx);

if (!(iotlb.perm & (1 << is_write))) {
goto unassigned;
diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/alpha/typhoon.c
+++ b/hw/alpha/typhoon.c
@@ -XXX,XX +XXX,XX @@ static bool window_translate(TyphoonWindow *win, hwaddr addr,
Pchip and generate a machine check interrupt. */
static IOMMUTLBEntry typhoon_translate_iommu(IOMMUMemoryRegion *iommu,
hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag,
+ int iommu_idx)
{
TyphoonPchip *pchip = container_of(iommu, TyphoonPchip, iommu);
IOMMUTLBEntry ret;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -XXX,XX +XXX,XX @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg,
}

static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
SMMUv3State *s = sdev->smmu;
diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/dma/rc4030.c
+++ b/hw/dma/rc4030.c
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps jazzio_ops = {
};

static IOMMUTLBEntry rc4030_dma_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
rc4030State *s = container_of(iommu, rc4030State, dma_mr);
IOMMUTLBEntry ret = {
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -XXX,XX +XXX,XX @@ static inline bool amdvi_is_interrupt_addr(hwaddr addr)
}

static IOMMUTLBEntry amdvi_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
AMDVIAddressSpace *as = container_of(iommu, AMDVIAddressSpace, iommu);
AMDVIState *s = as->iommu_state;
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -XXX,XX +XXX,XX @@ static void vtd_mem_write(void *opaque, hwaddr addr,
}

static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
IntelIOMMUState *s = vtd_as->iommu_state;
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -XXX,XX +XXX,XX @@ static void spapr_tce_free_table(uint64_t *table, int fd, uint32_t nb_table)
/* Called from RCU critical section */
static IOMMUTLBEntry spapr_tce_translate_iommu(IOMMUMemoryRegion *iommu,
hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag,
+ int iommu_idx)
{
sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
uint64_t tce;
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -XXX,XX +XXX,XX @@ uint16_t s390_guest_io_table_walk(uint64_t g_iota, hwaddr addr,
}

static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
S390PCIIOMMU *iommu = container_of(mr, S390PCIIOMMU, iommu_mr);
S390IOTLBEntry *entry;
diff --git a/hw/sparc/sun4m_iommu.c b/hw/sparc/sun4m_iommu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sparc/sun4m_iommu.c
+++ b/hw/sparc/sun4m_iommu.c
@@ -XXX,XX +XXX,XX @@ static void iommu_bad_addr(IOMMUState *s, hwaddr addr,
/* Called from RCU critical section */
static IOMMUTLBEntry sun4m_translate_iommu(IOMMUMemoryRegion *iommu,
hwaddr addr,
- IOMMUAccessFlags flags)
+ IOMMUAccessFlags flags,
+ int iommu_idx)
{
IOMMUState *is = container_of(iommu, IOMMUState, iommu);
hwaddr page, pa;
diff --git a/hw/sparc64/sun4u_iommu.c b/hw/sparc64/sun4u_iommu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sparc64/sun4u_iommu.c
+++ b/hw/sparc64/sun4u_iommu.c
@@ -XXX,XX +XXX,XX @@
/* Called from RCU critical section */
static IOMMUTLBEntry sun4u_translate_iommu(IOMMUMemoryRegion *iommu,
hwaddr addr,
- IOMMUAccessFlags flag)
+ IOMMUAccessFlags flag, int iommu_idx)
{
IOMMUState *is = container_of(iommu, IOMMUState, iommu);
hwaddr baseaddr, offset;
diff --git a/memory.c b/memory.c
index XXXXXXX..XXXXXXX 100644
--- a/memory.c
+++ b/memory.c
@@ -XXX,XX +XXX,XX @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
granularity = memory_region_iommu_get_min_page_size(iommu_mr);

for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
- iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE);
+ iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
if (iotlb.perm != IOMMU_NONE) {
n->notify(n, &iotlb);
}
--
2.17.1
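
The index only becomes interesting once an IOMMU models more than one translation context per device, for example separate secure and non-secure mappings selected by the transaction attributes. A hedged sketch of how such a model might use it (the my_iommu_* names and the fixed per-context offsets are invented for illustration):

    enum { MY_IOMMU_IDX_NS = 0, MY_IOMMU_IDX_S = 1, MY_IOMMU_NUM_IDX = 2 };

    static int my_iommu_num_indexes(IOMMUMemoryRegion *iommu)
    {
        return MY_IOMMU_NUM_IDX;
    }

    static int my_iommu_attrs_to_index(IOMMUMemoryRegion *iommu, MemTxAttrs attrs)
    {
        /* Secure transactions get their own translation context */
        return attrs.secure ? MY_IOMMU_IDX_S : MY_IOMMU_IDX_NS;
    }

    static IOMMUTLBEntry my_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
                                            IOMMUAccessFlags flag, int iommu_idx)
    {
        /* Toy translation: each context maps the bus address at a different
         * fixed offset into system memory.
         */
        static const hwaddr base[MY_IOMMU_NUM_IDX] = { 0x0, 0x80000000 };
        IOMMUTLBEntry entry = {
            .target_as = &address_space_memory,
            .iova = addr & ~(hwaddr)0xfff,
            .translated_addr = (addr & ~(hwaddr)0xfff) + base[iommu_idx],
            .addr_mask = 0xfff,
            .perm = IOMMU_RW,
        };
        return entry;
    }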