On Fri, 8 May 2020 at 16:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Rather than perform the argument swap during code generation,
> perform it during decode. This means it doesn't have to be
> special cased later, and we can share code with aarch64 code
> generation. Hopefully the decode comment addresses any confusion
> that might arise in between.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/arm/neon-dp.decode | 9 +++++++--
> target/arm/translate-neon.inc.c | 3 +--
> 2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
> index ec3a92fe75..6b0b6566d6 100644
> --- a/target/arm/neon-dp.decode
> +++ b/target/arm/neon-dp.decode
> @@ -65,8 +65,13 @@ VCGT_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
> VCGE_S_3s 1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
> VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
>
> -VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
> -VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
> +# The shift operations are of the form Vd = Vm << Vn.
> +# By reversing the names of the fields here, we can use standard expanders.
My work-in-progress v2 of the neon decodetree stuff has a slightly more
expanded comment for @3same_rev:
# The _rev suffix indicates that Vn and Vm are reversed. This is
# the case for shifts. In the Arm ARM these insns are documented
# with the Vm and Vn fields in their usual places, but in the
# assembly the operands are listed "backwards", ie in the order
# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
# to consider Vm and Vm as being in different fields in the insn,
# which allows us to avoid special-casing shifts in the trans_
# function code (where we would otherwise need to manually swap
# the operands over to call Neon helper functions that are shared
# with AArch64 which does not have this odd reversed-operand situation).
> +@3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \
> + &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
> +
> +VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
> +VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
>
> VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
> VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
> diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
> index aefeff498a..416302bcc7 100644
> --- a/target/arm/translate-neon.inc.c
> +++ b/target/arm/translate-neon.inc.c
> @@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
> uint32_t rn_ofs, uint32_t rm_ofs, \
> uint32_t oprsz, uint32_t maxsz) \
> { \
> - /* Note the operation is vshl vd,vm,vn */ \
> - tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, \
> + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
> oprsz, maxsz, &OPARRAY[vece]); \
> } \
> DO_3SAME(INSN, gen_##INSN##_3s)
> --
> 2.20.1
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM