[PATCH RFC] target/arm: Implement SVE2 gather load insns

Stephen Long posted 1 patch 4 years ago
Posted by Stephen Long 4 years ago
Add decoding logic for SVE2 64-bit/32-bit gather non-temporal load
insns.

64-bit
* LDNT1SB
* LDNT1B (vector plus scalar)
* LDNT1SH
* LDNT1H (vector plus scalar)
* LDNT1SW
* LDNT1W (vector plus scalar)
* LDNT1D (vector plus scalar)

32-bit
* LDNT1SB
* LDNT1B (vector plus scalar)
* LDNT1SH
* LDNT1H (vector plus scalar)
* LDNT1W (vector plus scalar)

Signed-off-by: Stephen Long <steplong@quicinc.com>

I'm not sure I'm initializing xs correctly. This also goes for the
scatter store insns in the previous patch.
---
 target/arm/sve.decode      | 11 +++++++++++
 target/arm/translate-sve.c |  8 ++++++++
 2 files changed, 19 insertions(+)

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index ef5dd281a6..d7799746ab 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1388,6 +1388,17 @@ SQRDCMLAH_zzzz  01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5  ra=%reg_movprfx
 
 FMMLA           01100100 .. 1 ..... 111001 ..... .....  @rda_rn_rm
 
+### SVE2 Memory Gather Load Group
+
+# SVE2 64-bit gather non-temporal load
+#   (scalar plus unpacked 32-bit unscaled offsets)
+LDNT1_zprz      1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \
+                &rprr_gather_load xs=0 esz=3 scale=0 ff=0
+
+# SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets)
+LDNT1_zprz      1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \
+                &rprr_gather_load xs=0 esz=2 scale=0 ff=0
+
 ### SVE2 Memory Store Group
 
 # SVE2 64-bit scatter non-temporal store (vector plus scalar)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4873e25182..bdabb89e82 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5886,6 +5886,14 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
     return true;
 }
 
+static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a)
+{
+    if (!dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    return trans_LD1_zprz(s, a);
+}
+
 /* Indexed by [mte][be][xs][msz].  */
 static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][2][3] = {
     { /* MTE Inactive */
-- 
2.17.1


Re: [PATCH RFC] target/arm: Implement SVE2 gather load insns
Posted by Richard Henderson 4 years ago
On 4/22/20 8:23 AM, Stephen Long wrote:
> Add decoding logic for SVE2 64-bit/32-bit gather non-temporal load
> insns.
> 
> 64-bit
> * LDNT1SB
> * LDNT1B (vector plus scalar)
> * LDNT1SH
> * LDNT1H (vector plus scalar)
> * LDNT1SW
> * LDNT1W (vector plus scalar)
> * LDNT1D (vector plus scalar)
> 
> 32-bit
> * LDNT1SB
> * LDNT1B (vector plus scalar)
> * LDNT1SH
> * LDNT1H (vector plus scalar)
> * LDNT1W (vector plus scalar)
> 
> Signed-off-by: Stephen Long <steplong@quicinc.com>
> 
> I'm not sure I'm initializing xs correctly. This also goes for the
> scatter store insns in the previous patch.

You did.  xs=0 is 32-bit unsigned offset, xs=1 is 32-bit signed offset
(directly from the SVE encoding); I repurpose xs=2 as 64-bit offset.  There's a
comment in there next to the load/store helper array to that effect.
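
To make the xs convention above concrete, here is a minimal sketch of how a
raw offset element would be widened for each xs value. It is illustrative
only and is not the QEMU helper code: extend_offset() and gather_addr() are
hypothetical names, and the shift by scale is an assumption about how a
scaled form would be applied (the patterns in this patch all use scale=0).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from QEMU): widen one raw offset element
 * according to the xs convention described in the thread.
 *   xs=0: 32-bit unsigned offset
 *   xs=1: 32-bit signed offset
 *   xs=2: 64-bit offset (repurposed value, not an architectural encoding)
 */
static inline uint64_t extend_offset(uint64_t raw, int xs)
{
    assert(xs >= 0 && xs <= 2);
    switch (xs) {
    case 0:
        /* Zero-extend the low 32 bits. */
        return (uint32_t)raw;
    case 1:
        /* Sign-extend the low 32 bits. */
        return (uint64_t)(int64_t)(int32_t)(uint32_t)raw;
    default:
        /* Full 64-bit offset, used unchanged. */
        return raw;
    }
}

/* Hypothetical: form one element address from a base and an offset. */
static inline uint64_t gather_addr(uint64_t base, uint64_t raw,
                                   int xs, int scale)
{
    return base + (extend_offset(raw, xs) << scale);
}
```

With these definitions, a raw element of 0xffffffff acts as a large positive
offset under xs=0 and as -1 under xs=1, which is why the choice of xs matters
for the 32-bit forms.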

> ---
>  target/arm/sve.decode      | 11 +++++++++++
>  target/arm/translate-sve.c |  8 ++++++++
>  2 files changed, 19 insertions(+)

Applied to my SVE2 branch.  Thanks!


r~