[v3] target/arm: partial vector cleanup

[PATCH v3 08/16] target/arm: Swap argument order for VSHL during decode

Posted by Richard Henderson 5 years, 9 months ago

Rather than perform the argument swap during code generation,
perform it during decode.  This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation.  Hopefully the decode comment addresses any confusion
that might arise in between.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       | 9 +++++++--
 target/arm/translate-neon.inc.c | 3 +--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ec3a92fe75..6b0b6566d6 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -65,8 +65,13 @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+# The shift operations are of the form Vd = Vm << Vn.
+# By reversing the names of the fields here, we can use standard expanders.
+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index aefeff498a..416302bcc7 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
                                 uint32_t oprsz, uint32_t maxsz)         \
     {                                                                   \
-        /* Note the operation is vshl vd,vm,vn */                       \
-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
                        oprsz, maxsz, &OPARRAY[vece]);                   \
     }                                                                   \
     DO_3SAME(INSN, gen_##INSN##_3s)
-- 
2.20.1

Re: [PATCH v3 08/16] target/arm: Swap argument order for VSHL during decode

Posted by Peter Maydell 5 years, 9 months ago

On Fri, 8 May 2020 at 16:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Rather than perform the argument swap during code generation,
> perform it during decode.  This means it doesn't have to be
> special cased later, and we can share code with aarch64 code
> generation.  Hopefully the decode comment addresses any confusion
> that might arise in between.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/neon-dp.decode       | 9 +++++++--
>  target/arm/translate-neon.inc.c | 3 +--
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
> index ec3a92fe75..6b0b6566d6 100644
> --- a/target/arm/neon-dp.decode
> +++ b/target/arm/neon-dp.decode
> @@ -65,8 +65,13 @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
>  VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
>  VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
>
> -VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
> -VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
> +# The shift operations are of the form Vd = Vm << Vn.
> +# By reversing the names of the fields here, we can use standard expanders.

My work-in-progress v2 of the neon decodetree stuff has a slightly more
expanded comment for @3same_rev:

# The _rev suffix indicates that Vn and Vm are reversed. This is
# the case for shifts. In the Arm ARM these insns are documented
# with the Vm and Vn fields in their usual places, but in the
# assembly the operands are listed "backwards", ie in the order
# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
# to consider Vm and Vm as being in different fields in the insn,
# which allows us to avoid special-casing shifts in the trans_
# function code (where we would otherwise need to manually swap
# the operands over to call Neon helper functions that are shared
# with AArch64 which does not have this odd reversed-operand situation).

> +@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
> +                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
> +
> +VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
> +VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
>
>  VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
>  VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
> diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
> index aefeff498a..416302bcc7 100644
> --- a/target/arm/translate-neon.inc.c
> +++ b/target/arm/translate-neon.inc.c
> @@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
>                                  uint32_t rn_ofs, uint32_t rm_ofs,       \
>                                  uint32_t oprsz, uint32_t maxsz)         \
>      {                                                                   \
> -        /* Note the operation is vshl vd,vm,vn */                       \
> -        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
> +        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
>                         oprsz, maxsz, &OPARRAY[vece]);                   \
>      }                                                                   \
>      DO_3SAME(INSN, gen_##INSN##_3s)
> --
> 2.20.1

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM