[v1] target/arm: Implement SVE2 FLOGB

[PATCH RFC] target/arm: Implement SVE2 FLOGB

Posted by Stephen Long 5 years, 9 months ago

Signed-off-by: Stephen Long <steplong@quicinc.com>
---

Right now, there is no log2 function for half precision floats, so I'm
not sure how to proceed. Currently, I just added a TODO comment.

 target/arm/helper-sve.h    |  3 +++
 target/arm/sve.decode      |  4 ++++
 target/arm/sve_helper.c    |  3 +++
 target/arm/translate-sve.c | 17 +++++++++++++++++
 4 files changed, 27 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0a62eef94e..aaa5fc33f9 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2731,3 +2731,6 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 3cf824bac5..dcb095bb5d 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1568,3 +1568,7 @@ SM4E            01000101 00 10001 1 11100 0 ..... .....  @rdn_rm_e0
 # SVE2 crypto constructive binary operations
 SM4EKEY         01000101 00 1 ..... 11110 0 ..... .....  @rd_rn_rm_e0
 RAX1            01000101 00 1 ..... 11110 1 ..... .....  @rd_rn_rm_e0
+
+### SVE2 floating-point convert to integer
+
+FLOGB           01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5  &rpr_esz
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index aa94df302a..aba9c064fb 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4624,6 +4624,9 @@ DO_ZPZ_FP(sve_ucvt_dh, uint64_t,     , uint64_to_float16)
 DO_ZPZ_FP(sve_ucvt_ds, uint64_t,     , uint64_to_float32)
 DO_ZPZ_FP(sve_ucvt_dd, uint64_t,     , uint64_to_float64)
 
+DO_ZPZ_FP(flogb_s, float32, H1_4, float32_log2)
+DO_ZPZ_FP(flogb_d, float64,     , float64_log2)
+
 #undef DO_ZPZ_FP
 
 static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index a8e57ea5f4..9176b18bc9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -8253,3 +8253,20 @@ static bool trans_RAX1(DisasContext *s, arg_rrr_esz *a)
     }
     return true;
 }
+
+static bool trans_FLOGB(DisasContext *s, arg_rpr_esz *a)
+{
+    /* TODO: There is no support for log base 2 for half-precision floats */
+    static gen_helper_gvec_3_ptr * const fns[] = {
+        NULL,
+        gen_helper_flogb_s,
+        gen_helper_flogb_d,
+    };
+    if (a->esz == 0 || !dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        do_ppz_fp(s, a, fns[a->esz - 1]);
+    }
+    return true;
+}
-- 
2.17.1

Re: [PATCH RFC] target/arm: Implement SVE2 FLOGB

Posted by Richard Henderson 5 years, 9 months ago

On 4/30/20 10:20 AM, Stephen Long wrote:
> +DO_ZPZ_FP(flogb_s, float32, H1_4, float32_log2)
> +DO_ZPZ_FP(flogb_d, float64,     , float64_log2)

Please read the instruction description more carefully.  The result is not the
full log2 of the input:


> This instruction returns the signed integer base 2 logarithm of each floating-point input element |X| after normalization.  This is the unbiased exponent of X used in the representation of the floating-point value, such that, for positive X, X = significand × 2^exponent.

Please look at Library pseudocode for aarch64/functions/sve/FPLogB in the manual.

You can then use the helpers from softfloat.h like so:

    if (float32_is_normal(x)) {
        /* extract exponent and remove bias */
        return extract32(x, 23, 8) - 127;
    } else if (float32_is_infinity(x)) {
        return INT32_MAX;
    } else if (float32_is_any_nan(x) || float32_is_zero(x)) {
        return INT32_MIN;
    } else {
        /* denormal: count the leading zeros of the 23-bit fraction */
        int shift = clz32(extract32(x, 0, 23)) - 9;
        /* the top fraction bit set means 2**-127 */
        return -127 - shift;
    }
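
For illustration only, the same classification can be sketched on raw bit
patterns outside QEMU, which also shows that half precision needs no log2
function at all. The names logb_f32_bits, logb_f16_bits, clz32_nz and
logb_self_check are made up for this example and are not softfloat API.
The sketch assumes IEEE binary32/binary16 layouts and leaves out the
exception flags and flush-to-zero handling that the FPLogB pseudocode
describes.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a nonzero 32-bit value (stand-in for clz32). */
static int clz32_nz(uint32_t v)
{
    int n = 0;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Unbiased exponent of |x| for a binary32 value given as raw bits. */
static int32_t logb_f32_bits(uint32_t bits)
{
    uint32_t exp = (bits >> 23) & 0xffu;    /* biased exponent, sign dropped */
    uint32_t frac = bits & 0x7fffffu;       /* 23-bit fraction */

    if (exp == 0xffu) {
        return frac == 0 ? INT32_MAX : INT32_MIN;   /* infinity : NaN */
    }
    if (exp != 0) {
        return (int32_t)exp - 127;                  /* normal: remove bias */
    }
    if (frac == 0) {
        return INT32_MIN;                           /* zero */
    }
    /* Denormal: with the fraction left-aligned, each leading zero
     * lowers the exponent by one, starting from -127. */
    return -127 - clz32_nz(frac << 9);
}

/* Same idea for binary16: 5-bit exponent, 10-bit fraction, bias 15. */
static int16_t logb_f16_bits(uint16_t bits)
{
    uint32_t exp = (bits >> 10) & 0x1fu;
    uint32_t frac = bits & 0x3ffu;

    if (exp == 0x1fu) {
        return frac == 0 ? INT16_MAX : INT16_MIN;   /* infinity : NaN */
    }
    if (exp != 0) {
        return (int16_t)((int)exp - 15);            /* normal: remove bias */
    }
    if (frac == 0) {
        return INT16_MIN;                           /* zero */
    }
    /* Denormal: left-align the 10-bit fraction in 32 bits first. */
    return (int16_t)(-15 - clz32_nz(frac << 22));
}

/* A few spot checks on well-known bit patterns. */
static void logb_self_check(void)
{
    assert(logb_f32_bits(0x3f800000u) == 0);      /* 1.0f */
    assert(logb_f32_bits(0x41000000u) == 3);      /* 8.0f */
    assert(logb_f32_bits(0x00000001u) == -149);   /* smallest denormal */
    assert(logb_f16_bits(0x3c00u) == 0);          /* 1.0 in half */
    assert(logb_f16_bits(0x0001u) == -24);        /* smallest half denormal */
}
```

Working on the exponent and fraction fields directly keeps the three
element sizes structurally identical; only the field widths and the bias
change.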


r~