From: Alex Bennée
To: peter.maydell@linaro.org
Date: Thu, 10 May 2018 10:42:04 +0100
Message-Id: <20180510094206.15354-4-alex.bennee@linaro.org>
In-Reply-To: <20180510094206.15354-1-alex.bennee@linaro.org>
References: <20180510094206.15354-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v3 3/5] fpu/softfloat: support ARM Alternative half-precision
Cc: Alex Bennée, qemu-arm@nongnu.org, richard.henderson@linaro.org, qemu-devel@nongnu.org, Aurelien Jarno

For float16, ARM supports an alternative half-precision format which
sacrifices the ability to represent NaN/Inf in return for a higher
dynamic range. To support this I've added an additional FloatFmt
(float16_params_ahp).

The new FloatFmt flag (arm_althp) is then used to modify the behaviour
of canonicalize and round_canonical with respect to representation and
exception raising.

Finally, the float16_to_floatN and floatN_to_float16 conversion
routines select the new alternative FloatFmt when !ieee.

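For readers less familiar with the format, here is a minimal sketch (not
part of the patch) of how a 16-bit pattern decodes under the alternative
format. The name ahp16_to_double is hypothetical and exists only for this
illustration. The point it tries to show is that exponent field 31 is an
ordinary exponent, so 0x7c00 reads as 65536 rather than Inf and the
largest magnitude becomes 131008 instead of the IEEE 65504.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a 16-bit pattern as ARM Alternative Half Precision:
 * 1 sign bit, 5 exponent bits (bias 15), 10 fraction bits.
 * Illustrative helper only; the name is not from the patch.
 */
static double ahp16_to_double(uint16_t bits)
{
    int sign = bits >> 15;
    int exp = (bits >> 10) & 0x1f;
    uint64_t frac = bits & 0x3ff;
    double mag;

    if (exp == 0) {
        /* Zero or subnormal: frac * 2^-24, same as IEEE half. */
        mag = (double)frac / 16777216.0;
    } else {
        /* Normal for every exponent 1..31: (1024 + frac) * 2^(exp - 25). */
        mag = (double)((frac | 0x400) << exp) / 33554432.0;
    }
    return sign ? -mag : mag;
}

/* Spot checks for the values discussed above. */
static void ahp16_decode_examples(void)
{
    assert(ahp16_to_double(0x3c00) == 1.0);       /* same as IEEE half */
    assert(ahp16_to_double(0x7bff) == 65504.0);   /* IEEE half maximum */
    assert(ahp16_to_double(0x7c00) == 65536.0);   /* would be +Inf in IEEE */
    assert(ahp16_to_double(0x7fff) == 131008.0);  /* would be a NaN in IEEE */
}
```

Calling ahp16_decode_examples() from a test harness should pass silently.
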
Signed-off-by: Alex Bennée
Reviewed-by: Richard Henderson

---
v3
  - squash NaN to 0 if destination is AHP F16
---
 fpu/softfloat.c | 108 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 85 insertions(+), 23 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 042e5c901d..79ebc998d3 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -225,6 +225,8 @@ typedef struct {
  *   frac_lsb: least significant bit of fraction
  *   fram_lsbm1: the bit bellow the least significant bit (for rounding)
  *   round_mask/roundeven_mask: masks used for rounding
+ * The following optional modifiers are available:
+ *   arm_althp: handle ARM Alternative Half Precision
  */
 typedef struct {
     int exp_size;
@@ -236,6 +238,7 @@ typedef struct {
     uint64_t frac_lsbm1;
     uint64_t round_mask;
     uint64_t roundeven_mask;
+    bool arm_althp;
 } FloatFmt;
 
 /* Expand fields based on the size of exponent and fraction */
@@ -248,12 +251,17 @@ typedef struct {
     .frac_lsb       = 1ull << (DECOMPOSED_BINARY_POINT - F),         \
     .frac_lsbm1     = 1ull << ((DECOMPOSED_BINARY_POINT - F) - 1),   \
     .round_mask     = (1ull << (DECOMPOSED_BINARY_POINT - F)) - 1,   \
-    .roundeven_mask = (2ull << (DECOMPOSED_BINARY_POINT - F)) - 1
+    .roundeven_mask = (2ull << (DECOMPOSED_BINARY_POINT - F)) - 1,
 
 static const FloatFmt float16_params = {
     FLOAT_PARAMS(5, 10)
 };
 
+static const FloatFmt float16_params_ahp = {
+    FLOAT_PARAMS(5, 10)
+    .arm_althp = true
+};
+
 static const FloatFmt float32_params = {
     FLOAT_PARAMS(8, 23)
 };
@@ -317,7 +325,7 @@ static inline float64 float64_pack_raw(FloatParts p)
 static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
                                float_status *status)
 {
-    if (part.exp == parm->exp_max) {
+    if (part.exp == parm->exp_max && !parm->arm_althp) {
         if (part.frac == 0) {
             part.cls = float_class_inf;
         } else {
@@ -403,8 +411,9 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
 
         exp += parm->exp_bias;
         if (likely(exp > 0)) {
+            bool maybe_inexact = false;
             if (frac & round_mask) {
-                flags |= float_flag_inexact;
+                maybe_inexact = true;
                 frac += inc;
                 if (frac & DECOMPOSED_OVERFLOW_BIT) {
                     frac >>= 1;
@@ -413,14 +422,26 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
             }
             frac >>= frac_shift;
 
-            if (unlikely(exp >= exp_max)) {
-                flags |= float_flag_overflow | float_flag_inexact;
-                if (overflow_norm) {
-                    exp = exp_max - 1;
-                    frac = -1;
-                } else {
-                    p.cls = float_class_inf;
-                    goto do_inf;
+            if (parm->arm_althp) {
+                if (unlikely(exp >= exp_max + 1)) {
+                    flags |= float_flag_invalid;
+                    frac = -1;
+                    exp = exp_max;
+                } else if (maybe_inexact) {
+                    flags |= float_flag_inexact;
+                }
+            } else {
+                if (unlikely(exp >= exp_max)) {
+                    flags |= float_flag_overflow | float_flag_inexact;
+                    if (overflow_norm) {
+                        exp = exp_max - 1;
+                        frac = -1;
+                    } else {
+                        p.cls = float_class_inf;
+                        goto do_inf;
+                    }
+                } else if (maybe_inexact) {
+                    flags |= float_flag_inexact;
                 }
             }
         } else if (s->flush_to_zero) {
@@ -465,7 +486,13 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
     case float_class_inf:
     do_inf:
         exp = exp_max;
-        frac = 0;
+        if (parm->arm_althp) {
+            flags |= float_flag_invalid;
+            /* Alt HP returns result = sign:Ones(M-1) */
+            frac = -1;
+        } else {
+            frac = 0;
+        }
         break;
 
     case float_class_qnan:
@@ -483,12 +510,21 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
     return p;
 }
 
+/* Explicit FloatFmt version */
+static FloatParts float16a_unpack_canonical(const FloatFmt *params,
+                                            float16 f, float_status *s)
+{
+    return canonicalize(float16_unpack_raw(f), params, s);
+}
+
 static FloatParts float16_unpack_canonical(float16 f, float_status *s)
 {
-    return canonicalize(float16_unpack_raw(f), &float16_params, s);
+    return float16a_unpack_canonical(&float16_params, f, s);
 }
 
-static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
+
+static float16 float16a_round_pack_canonical(const FloatFmt *params,
+                                             FloatParts p, float_status *s)
 {
     switch (p.cls) {
     case float_class_dnan:
@@ -496,11 +532,16 @@ static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
     case float_class_msnan:
         return float16_maybe_silence_nan(float16_pack_raw(p), s);
     default:
-        p = round_canonical(p, s, &float16_params);
+        p = round_canonical(p, s, params);
         return float16_pack_raw(p);
     }
 }
 
+static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
+{
+    return float16a_round_pack_canonical(&float16_params, p, s);
+}
+
 static FloatParts float32_unpack_canonical(float32 f, float_status *s)
 {
     return canonicalize(float32_unpack_raw(f), &float32_params, s);
@@ -1206,6 +1247,17 @@ static FloatParts float_to_float(FloatParts a,
             s->float_exception_flags |= float_flag_invalid;
         }
 
+        if (dstf->arm_althp) {
+            /* There is no NaN in the destination format: raise Invalid
+             * and return a zero with the sign of the input NaN.
+             */
+            s->float_exception_flags |= float_flag_invalid;
+            a.cls = float_class_zero;
+            a.frac = 0;
+            a.exp = 0;
+            return a;
+        }
+
         if (s->default_nan_mode) {
             a.cls = float_class_dnan;
             return a;
@@ -1226,25 +1278,34 @@ static FloatParts float_to_float(FloatParts a,
     return a;
 }
 
+/*
+ * Currently non-ieee implies ARM Alternative Half Precision handling
+ * for float16 values. If more are needed we'll need to expand the API
+ * into softfloat.
+ */
+
 float32 float16_to_float32(float16 a, bool ieee, float_status *s)
 {
-    FloatParts p = float16_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float16_params, &float32_params, s);
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
+    FloatParts p = float16a_unpack_canonical(fmt16, a, s);
+    FloatParts pr = float_to_float(p, fmt16, &float32_params, s);
     return float32_round_pack_canonical(pr, s);
 }
 
 float64 float16_to_float64(float16 a, bool ieee, float_status *s)
 {
-    FloatParts p = float16_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float16_params, &float64_params, s);
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
+    FloatParts p = float16a_unpack_canonical(fmt16, a, s);
+    FloatParts pr = float_to_float(p, fmt16, &float64_params, s);
     return float64_round_pack_canonical(pr, s);
 }
 
 float16 float32_to_float16(float32 a, bool ieee, float_status *s)
 {
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
     FloatParts p = float32_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float32_params, &float16_params, s);
-    return float16_round_pack_canonical(pr, s);
+    FloatParts pr = float_to_float(p, &float32_params, fmt16, s);
+    return float16a_round_pack_canonical(fmt16, pr, s);
 }
 
 float64 float32_to_float64(float32 a, float_status *s)
@@ -1256,9 +1317,10 @@ float64 float32_to_float64(float32 a, float_status *s)
 
 float16 float64_to_float16(float64 a, bool ieee, float_status *s)
 {
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
     FloatParts p = float64_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float64_params, &float16_params, s);
-    return float16_round_pack_canonical(pr, s);
+    FloatParts pr = float_to_float(p, &float64_params, fmt16, s);
+    return float16a_round_pack_canonical(fmt16, pr, s);
 }
 
 float32 float64_to_float32(float64 a, float_status *s)
-- 
2.17.0
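
The special cases that round_canonical and float_to_float gain in the
patch above can be condensed into a standalone sketch. This is an
illustration under simplifying assumptions, not the softfloat code: it
only does round-to-nearest-even, ignores flush-to-zero and the underflow
flag, and works directly on IEEE binary32 bits. The function name
float32_to_ahp16_sketch and the AHP_* flag values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits for this sketch (not softfloat's values). */
enum { AHP_INVALID = 1, AHP_INEXACT = 2 };

/* Convert binary32 to AHP bits, round-to-nearest-even only. */
static uint16_t float32_to_ahp16_sketch(float x, int *flags)
{
    uint32_t f;
    memcpy(&f, &x, sizeof(f));

    uint16_t sign = (uint16_t)((f >> 16) & 0x8000);
    int exp = (int)((f >> 23) & 0xff);
    uint32_t mant = f & 0x7fffff;

    if (exp == 0xff) {
        /* NaN becomes a signed zero, Inf becomes sign:Ones; both Invalid. */
        *flags |= AHP_INVALID;
        return mant ? sign : (uint16_t)(sign | 0x7fff);
    }
    if (exp == 0) {
        exp = 1;             /* binary32 subnormal: no implicit bit */
    } else {
        mant |= 0x800000;    /* restore the implicit leading one */
    }

    int e = exp - 127 + 15;  /* exponent rebiased for half precision */
    int shift = 13 + (e < 1 ? 1 - e : 0);
    if (shift > 31) {
        shift = 31;          /* mant < 2^24, so the result is zero anyway */
    }

    uint32_t q = mant >> shift;
    uint32_t rem = mant & ((1u << shift) - 1);
    uint32_t half = 1u << (shift - 1);
    if (rem > half || (rem == half && (q & 1))) {
        q++;                 /* round to nearest, ties to even */
    }

    /* q carries the implicit bit, so a rounding carry bumps the exponent. */
    uint32_t bits = q + (e >= 1 ? (uint32_t)(e - 1) << 10 : 0);
    if (bits > 0x7fff) {
        /* Too large even for exponent 31: Invalid, saturate, no Inexact. */
        *flags |= AHP_INVALID;
        return (uint16_t)(sign | 0x7fff);
    }
    if (rem) {
        *flags |= AHP_INEXACT;
    }
    return (uint16_t)(sign | bits);
}

/* Spot checks: in-range values are exact, 2^17 saturates with Invalid. */
static void float32_to_ahp16_examples(void)
{
    int flags = 0;
    assert(float32_to_ahp16_sketch(65536.0f, &flags) == 0x7c00);
    assert(flags == 0);
    assert(float32_to_ahp16_sketch(131072.0f, &flags) == 0x7fff);
    assert(flags == AHP_INVALID);
}
```

The one deliberate difference from the IEEE path is visible in the
saturation branch: where IEEE conversion would raise Overflow and Inexact
and return Inf (or the largest finite value), the AHP path raises only
Invalid and returns the all-ones magnitude.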