[v1] target/arm: Use TCG vector ops for MVE

Peter Maydell posted 4 patches 4 years, 5 months ago

Test checkpatch failed

Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20210902150910.15748-1-peter.maydell@linaro.org

Maintainers: Peter Maydell <peter.maydell@linaro.org>

There is a newer version of this series

[PATCH 0/4] target/arm: Use TCG vector ops for MVE

Posted by Peter Maydell 4 years, 5 months ago

This patchset uses the TCG vector ops for some MVE
instructions. We can only do this when we know that none
of the MVE lanes are predicated, i.e. when none of tail
predication, VPT predication, or ECI partial insn
execution is happening.

Patch 1 adds a TB flag so we can track whether we can
safely assume that the insn operates on the entire vector,
and patches 2, 3, 4 use that to vectorize some simple
cases (bitwise logic ops, integer add, sub, mul, min, max,
abs and neg).
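
To make the idea concrete, here is a minimal standalone sketch of the
fast-path/slow-path split described above. It is not the code from the
patches: MvePredState, mve_no_pred() and the vadd32_* functions are
hypothetical names invented for illustration, and the byte-granular mask
is a simplified model of MVE predication. In the real series the
"no predication" condition is captured in a TB flag at translation time
rather than tested at run time as it is here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified predication state; not QEMU's real layout. */
typedef struct {
    bool tail_predicated; /* inside a tail-predicated loop */
    bool vpt_active;      /* inside a VPT block */
    bool eci_partial;     /* resuming a partially executed insn */
} MvePredState;

/* The condition a "not predicated" flag would record. */
static bool mve_no_pred(const MvePredState *s)
{
    return !s->tail_predicated && !s->vpt_active && !s->eci_partial;
}

/*
 * Slow path: bit i of 'mask' enables byte i of the 16-byte vector;
 * disabled bytes keep their old destination value.
 */
static void vadd32_predicated(uint32_t d[4], const uint32_t n[4],
                              const uint32_t m[4], uint16_t mask)
{
    for (int lane = 0; lane < 4; lane++) {
        uint32_t sum = n[lane] + m[lane];
        uint32_t bytemask = 0;

        for (int b = 0; b < 4; b++) {
            if (mask & (1u << (lane * 4 + b))) {
                bytemask |= 0xffu << (b * 8);
            }
        }
        d[lane] = (d[lane] & ~bytemask) | (sum & bytemask);
    }
}

/* Fast path: whole-vector add, standing in for a single vector op. */
static void vadd32_full(uint32_t d[4], const uint32_t n[4],
                        const uint32_t m[4])
{
    for (int lane = 0; lane < 4; lane++) {
        d[lane] = n[lane] + m[lane];
    }
}

/* Pick the fast path only when every lane is known to be active. */
static void vadd32(const MvePredState *s, uint16_t mask, uint32_t d[4],
                   const uint32_t n[4], const uint32_t m[4])
{
    if (mve_no_pred(s)) {
        /* In this model, no predication implies all bytes enabled. */
        assert(mask == 0xffff);
        vadd32_full(d, n, m);
    } else {
        vadd32_predicated(d, n, m, mask);
    }
}
```

With all three conditions false and a mask of 0xffff, vadd32() takes the
whole-vector path; with any of them set it falls back to the per-byte
merge, which is the behaviour a helper function would have to provide.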

This small initial patchset is intended as a check that
the general structure for handling this makes sense;
there are almost certainly other places we could improve
the codegen for the non-predicated case.

Richard: if you have time to scan through the MVE insns
and suggest which ones would benefit from a vectorized
version that would be very helpful...

thanks
-- PMM

Peter Maydell (4):
  target/arm: Add TB flag for "MVE insns not predicated"
  target/arm: Optimize MVE logic ops
  target/arm: Optimize MVE arithmetic ops
  target/arm: Optimize MVE VNEG, VABS

 target/arm/cpu.h              |   4 +-
 target/arm/translate.h        |   2 +
 target/arm/helper.c           |  23 ++++++++
 target/arm/translate-m-nocp.c |   8 +++
 target/arm/translate-mve.c    | 102 ++++++++++++++++++++++------------
 target/arm/translate-vfp.c    |   3 +
 target/arm/translate.c        |   4 ++
 7 files changed, 110 insertions(+), 36 deletions(-)

-- 
2.20.1

Re: [PATCH 0/4] target/arm: Use TCG vector ops for MVE

Posted by Richard Henderson 4 years, 5 months ago

On 9/2/21 5:09 PM, Peter Maydell wrote:
> Richard: if you have time to scan through the MVE insns
> and suggest which ones would benefit from a vectorized
> version that would be very helpful...

VDUP
VMOVL (into shifts or and)
VMVN (which seems to have gotten separated from the rest of the do_1op)
VSHL
VSHR
VSLI
VSRI

I think that's about all.
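
As a rough illustration of the "into shifts or and" remark for VMOVL,
the sketch below shows the 16-bit to 32-bit case on a single 32-bit
lane. The function names are hypothetical and this is plain C rather
than TCG code. It assumes that the B forms take the bottom half of each
lane and the T forms the top half, and that right-shifting a negative
signed integer is an arithmetic shift (true for GCC and Clang, but
implementation-defined in ISO C).

```c
#include <stdint.h>

/* Unsigned, bottom half: zero-extend by masking off the top 16 bits. */
static uint32_t vmovl_ub_h(uint32_t x)
{
    return x & 0xffffu;
}

/* Unsigned, top half: zero-extend with a logical shift right. */
static uint32_t vmovl_ut_h(uint32_t x)
{
    return x >> 16;
}

/*
 * Signed, bottom half: shift left to put the sign bit at bit 31,
 * then arithmetic shift right to replicate it.
 */
static uint32_t vmovl_sb_h(uint32_t x)
{
    return (uint32_t)((int32_t)(x << 16) >> 16);
}

/* Signed, top half: a single arithmetic shift right. */
static uint32_t vmovl_st_h(uint32_t x)
{
    return (uint32_t)((int32_t)x >> 16);
}
```

Each of these is a lane-wise AND or shift, which is the kind of
operation that maps naturally onto whole-vector ops when no lanes are
predicated.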


r~

Re: [PATCH 0/4] target/arm: Use TCG vector ops for MVE

Posted by Peter Maydell 4 years, 5 months ago

On Fri, 3 Sept 2021 at 16:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/2/21 5:09 PM, Peter Maydell wrote:
> > Richard: if you have time to scan through the MVE insns
> > and suggest which ones would benefit from a vectorized
> > version that would be very helpful...
>
> VDUP
> VMOVL (into shifts or and)
> VMVN (which seems to have gotten separated from the rest of the do_1op)
> VSHL
> VSHR
> VSLI
> VSRI
>
> I think that's about all.

I guess also VMOV (immediate) (vector) ?

Thanks for the list, I'll have a look at them.

-- PMM

Re: [PATCH 0/4] target/arm: Use TCG vector ops for MVE

Posted by Richard Henderson 4 years, 5 months ago

On 9/3/21 5:20 PM, Peter Maydell wrote:
> On Fri, 3 Sept 2021 at 16:14, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> On 9/2/21 5:09 PM, Peter Maydell wrote:
>>> Richard: if you have time to scan through the MVE insns
>>> and suggest which ones would benefit from a vectorized
>>> version that would be very helpful...
>>
>> VDUP
>> VMOVL (into shifts or and)
>> VMVN (which seems to have gotten separated from the rest of the do_1op)
>> VSHL
>> VSHR
>> VSLI
>> VSRI
>>
>> I think that's about all.
> 
> I guess also VMOV (immediate) (vector) ?

Oh right, yes.  I skipped VMOV because I remembered that the register
version is an alias for VORR.  Am I correct that there is no MVE insn
corresponding to the NEON VLD1 (single element to all lanes)?


r~