[v1] target/arm: Scalable Vector Extension

[Qemu-devel] [PATCH v3-a 26/27] target/arm: Extend vec_reg_offset to larger sizes

Posted by Richard Henderson 7 years, 8 months ago

Rearrange the arithmetic so that we are agnostic about the total size
of the vector and the size of the element.  This will allow us to index
up to the 32nd byte and with 16-byte elements.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index dd9c09f89b..5a97ae2b59 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -67,18 +67,18 @@ static inline void assert_fp_access_checked(DisasContext *s)
 static inline int vec_reg_offset(DisasContext *s, int regno,
                                  int element, TCGMemOp size)
 {
-    int offs = 0;
+    int element_size = 1 << size;
+    int offs = element * element_size;
 #ifdef HOST_WORDS_BIGENDIAN
     /* This is complicated slightly because vfp.zregs[n].d[0] is
      * still the low half and vfp.zregs[n].d[1] the high half
      * of the 128 bit vector, even on big endian systems.
-     * Calculate the offset assuming a fully bigendian 128 bits,
-     * then XOR to account for the order of the two 64 bit halves.
+     * Calculate the offset assuming a fully little-endian 128 bits,
+     * then XOR to account for the order of the 64 bit units.
      */
-    offs += (16 - ((element + 1) * (1 << size)));
-    offs ^= 8;
-#else
-    offs += element * (1 << size);
+    if (element_size < 8) {
+        offs ^= 8 - element_size;
+    }
 #endif
     offs += offsetof(CPUARMState, vfp.zregs[regno]);
     assert_fp_access_checked(s);
-- 
2.17.0

Re: [Qemu-devel] [PATCH v3-a 26/27] target/arm: Extend vec_reg_offset to larger sizes

Posted by Peter Maydell 7 years, 8 months ago

On 16 May 2018 at 23:30, Richard Henderson <richard.henderson@linaro.org> wrote:
> Rearrange the arithmetic so that we are agnostic about the total size
> of the vector and the size of the element.  This will allow us to index
> up to the 32nd byte and with 16-byte elements.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.h | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
> index dd9c09f89b..5a97ae2b59 100644
> --- a/target/arm/translate-a64.h
> +++ b/target/arm/translate-a64.h
> @@ -67,18 +67,18 @@ static inline void assert_fp_access_checked(DisasContext *s)
>  static inline int vec_reg_offset(DisasContext *s, int regno,
>                                   int element, TCGMemOp size)
>  {
> -    int offs = 0;
> +    int element_size = 1 << size;
> +    int offs = element * element_size;
>  #ifdef HOST_WORDS_BIGENDIAN
>      /* This is complicated slightly because vfp.zregs[n].d[0] is
>       * still the low half and vfp.zregs[n].d[1] the high half
>       * of the 128 bit vector, even on big endian systems.
> -     * Calculate the offset assuming a fully bigendian 128 bits,
> -     * then XOR to account for the order of the two 64 bit halves.
> +     * Calculate the offset assuming a fully little-endian 128 bits,
> +     * then XOR to account for the order of the 64 bit units.
>       */
> -    offs += (16 - ((element + 1) * (1 << size)));
> -    offs ^= 8;
> -#else
> -    offs += element * (1 << size);
> +    if (element_size < 8) {
> +        offs ^= 8 - element_size;
> +    }
>  #endif
>      offs += offsetof(CPUARMState, vfp.zregs[regno]);
>      assert_fp_access_checked(s);

This looks right for element sizes up to 8, but I don't understand
how it handles 16 byte elements as the commit message says -- the
d[0] and d[1] are the wrong way round and don't form a single
16-byte big-endian value, so they must need special casing somewhere
else ?

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3-a 26/27] target/arm: Extend vec_reg_offset to larger sizes

Posted by Richard Henderson 7 years, 8 months ago

On 05/17/2018 08:57 AM, Peter Maydell wrote:
> This looks right for element sizes up to 8, but I don't understand
> how it handles 16 byte elements as the commit message says -- the
> d[0] and d[1] are the wrong way round and don't form a single
> 16-byte big-endian value, so they must need special casing somewhere
> else ?

You're right that it's not a proper int128 for host arithmetic.  Fortunately,
the only operation we have on 128-bit values, at present, is move.

How better might you word the commit message?

r~

Re: [Qemu-devel] [PATCH v3-a 26/27] target/arm: Extend vec_reg_offset to larger sizes

Posted by Peter Maydell 7 years, 8 months ago

On 17 May 2018 at 17:51, Richard Henderson <richard.henderson@linaro.org> wrote:
> On 05/17/2018 08:57 AM, Peter Maydell wrote:
>> This looks right for element sizes up to 8, but I don't understand
>> how it handles 16 byte elements as the commit message says -- the
>> d[0] and d[1] are the wrong way round and don't form a single
>> 16-byte big-endian value, so they must need special casing somewhere
>> else ?
>
> You're right that it's not a proper int128 for host arithmetic.  Fortunately,
> the only operation we have on 128-bit values, at present, is move.
>
> How better might you word the commit message?

You could add in the comment in the function something like:
"For 32 byte elements, we return an offset of zero. The two halves will
not form a host int128 if the host is bigendian, since they're in
the wrong order. However the only 32 byte operation we have is a move,
so we can ignore this for the moment. More complicated 32 byte operations
would have to special case loading and storing from the zregs array."

thanks
-- PMM