On 11/10/2018 22:52, Richard Henderson wrote:
> For a sequence of loads or stores from a single register,
> little-endian operations can be promoted to an 8-byte op.
> This can reduce the number of operations by up to a factor of 8.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> target/arm/translate.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 12a744b3c3..09f2d648b7 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -5011,6 +5011,16 @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
>          if (size == 3 && (interleave | spacing) != 1) {
>              return 1;
>          }
> +        /* For our purposes, bytes are always little-endian. */
> +        if (size == 0) {
> +            endian = MO_LE;
> +        }
> +        /* Consecutive little-endian elements from a single register
> +         * can be promoted to a larger little-endian operation.
> +         */
> +        if (interleave == 1 && endian == MO_LE) {
> +            size = 3;
> +        }
>          tmp64 = tcg_temp_new_i64();
>          addr = tcg_temp_new_i32();
>          tmp2 = tcg_const_i32(1 << size);
>
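
For anyone following along, here is a small standalone sketch of why the
promotion should be safe. It is plain C, not QEMU code: load_le() and
load_by_elements() are hypothetical helpers written for this mail, and a
D register is modelled as a uint64_t with element 0 in the low bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one little-endian element of `bytes` bytes. */
static uint64_t load_le(const uint8_t *p, size_t bytes)
{
    uint64_t v = 0;
    for (size_t i = 0; i < bytes; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/*
 * Hypothetical helper: build a 64-bit register image one element at a
 * time, as the unpromoted path would. `size` is log2 of the element
 * width in bytes (0..3); element 0 lands in the least significant bits.
 */
static uint64_t load_by_elements(const uint8_t *p, unsigned size)
{
    size_t bytes = (size_t)1 << size;
    uint64_t reg = 0;
    for (size_t i = 0; i < 8 / bytes; i++) {
        reg |= load_le(p + i * bytes, bytes) << (8 * bytes * i);
    }
    return reg;
}
```

For any 8 bytes of memory, load_by_elements(mem, size) gives the same
value as load_le(mem, 8) for size 0 through 3, which is the equivalence
the new "size = 3" relies on. Under this model the same would not hold
for big-endian elements wider than a byte, hence the endian == MO_LE
check.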