From nobody Tue Feb 10 22:17:39 2026
From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
To: richard.henderson@linaro.org
Date: Fri, 13 Oct 2017 17:24:23 +0100
Message-Id: <20171013162438.32458-16-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20171013162438.32458-1-alex.bennee@linaro.org>
References: <20171013162438.32458-1-alex.bennee@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 2a00:1450:400c:c0c::22b
Subject: [Qemu-devel] [RFC PATCH 15/30] softfloat: half-precision
 add/sub/mul/div support
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org,
	=?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>

Rather than following the SoftFloat3 implementation, I've used the same
basic template as the rest of our softfloat code. One minor difference
is that the 32-bit intermediates end up with the binary point in the
same place as in the 32-bit (float32) version, so the change isn't
totally mechanical.

Signed-off-by: Alex Benn=C3=A9e <alex.bennee@linaro.org>
---
 fpu/softfloat.c         | 352 ++++++++++++++++++++++++++++++++++++++++++++=
++++
 include/fpu/softfloat.h |   6 +
 2 files changed, 358 insertions(+)
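
For reviewers, a minimal standalone sketch (not part of the patch) of
the scaling that float16_mul uses for its 32-bit intermediate. The
f16_* names are hypothetical stand-ins for the extractFloat16*()
helpers; only the binary16 bit layout and the <<4 / <<5 shifts are
taken from the code in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* IEEE binary16: 1 sign bit, 5 exponent bits, 10 fraction bits. */

/* Hypothetical stand-ins for extractFloat16Frac/Exp/Sign. */
static inline uint32_t f16_frac(uint16_t a) { return a & 0x3FF; }
static inline int f16_exp(uint16_t a) { return (a >> 10) & 0x1F; }
static inline int f16_sign(uint16_t a) { return a >> 15; }

/*
 * Raw significand product of two normal inputs, scaled the same way
 * as in float16_mul.  The implicit bits sit at bit 14 and bit 15, so
 * the product always lies in [2^29, 2^31) and fits in 32 bits.
 */
static uint32_t f16_mul_raw(uint16_t a, uint16_t b)
{
    uint32_t a_sig, b_sig;

    /* Only normal numbers: exponent is neither 0 nor all-ones. */
    assert(f16_exp(a) != 0 && f16_exp(a) != 0x1F);
    assert(f16_exp(b) != 0 && f16_exp(b) != 0x1F);

    a_sig = (f16_frac(a) | 0x0400) << 4;
    b_sig = (f16_frac(b) | 0x0400) << 5;
    return a_sig * b_sig;
}
```

With these helpers, 1.0 * 1.0 (0x3C00 * 0x3C00) gives the smallest
product, 0x20000000, and the largest normal value squared
(0x7BFF * 0x7BFF) gives 0x7FE00200, so the leading one is always at
bit 29 or bit 30 before any further shifting.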

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index cf7bf6d4f4..ff967f5525 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3532,6 +3532,358 @@ static void normalizeFloat16Subnormal(uint32_t aSig=
, int *zExpPtr,
     *zExpPtr =3D 1 - shiftCount;
 }
=20
+/*------------------------------------------------------------------------=
----
+| Returns the result of adding the absolute values of the half-precision
+| floating-point values `a' and `b'.  If `zSign' is 1, the sum is negated
+| before being returned.  `zSign' is ignored if the result is a NaN.
+| The addition is performed according to the IEC/IEEE Standard for Binary
+| Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+static float16 addFloat16Sigs(float16 a, float16 b, flag zSign,
+                              float_status *status)
+{
+    int aExp, bExp, zExp;
+    uint16_t aSig, bSig, zSig;
+    int expDiff;
+
+    aSig =3D extractFloat16Frac( a );
+    aExp =3D extractFloat16Exp( a );
+    bSig =3D extractFloat16Frac( b );
+    bExp =3D extractFloat16Exp( b );
+    expDiff =3D aExp - bExp;
+    aSig <<=3D 3;
+    bSig <<=3D 3;
+    if ( 0 < expDiff ) {
+        if ( aExp =3D=3D 0x1F ) {
+            if (aSig) {
+                return propagateFloat16NaN(a, b, status);
+            }
+            return a;
+        }
+        if ( bExp =3D=3D 0 ) {
+            --expDiff;
+        }
+        else {
+            bSig |=3D 0x20000000;
+        }
+        shift16RightJamming( bSig, expDiff, &bSig );
+        zExp =3D aExp;
+    }
+    else if ( expDiff < 0 ) {
+        if ( bExp =3D=3D 0x1F ) {
+            if (bSig) {
+                return propagateFloat16NaN(a, b, status);
+            }
+            return packFloat16( zSign, 0x1F, 0 );
+        }
+        if ( aExp =3D=3D 0 ) {
+            ++expDiff;
+        }
+        else {
+            aSig |=3D 0x0400;
+        }
+        shift16RightJamming( aSig, - expDiff, &aSig );
+        zExp =3D bExp;
+    }
+    else {
+        if ( aExp =3D=3D 0x1F ) {
+            if (aSig | bSig) {
+                return propagateFloat16NaN(a, b, status);
+            }
+            return a;
+        }
+        if ( aExp =3D=3D 0 ) {
+            if (status->flush_to_zero) {
+                if (aSig | bSig) {
+                    float_raise(float_flag_output_denormal, status);
+                }
+                return packFloat16(zSign, 0, 0);
+            }
+            return packFloat16( zSign, 0, ( aSig + bSig )>>3 );
+        }
+        zSig =3D 0x0400 + aSig + bSig;
+        zExp =3D aExp;
+        goto roundAndPack;
+    }
+    aSig |=3D 0x0400;
+    zSig =3D ( aSig + bSig )<<1;
+    --zExp;
+    if ( (int16_t) zSig < 0 ) {
+        zSig =3D aSig + bSig;
+        ++zExp;
+    }
+ roundAndPack:
+    return roundAndPackFloat16(zSign, zExp, zSig, true, status);
+
+}
+
+/*------------------------------------------------------------------------=
----
+| Returns the result of subtracting the absolute values of the half-
+| precision floating-point values `a' and `b'.  If `zSign' is 1, the
+| difference is negated before being returned.  `zSign' is ignored if the
+| result is a NaN.  The subtraction is performed according to the IEC/IEEE
+| Standard for Binary Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+static float16 subFloat16Sigs(float16 a, float16 b, flag zSign,
+                              float_status *status)
+{
+    int aExp, bExp, zExp;
+    uint16_t aSig, bSig, zSig;
+    int expDiff;
+
+    aSig =3D extractFloat16Frac( a );
+    aExp =3D extractFloat16Exp( a );
+    bSig =3D extractFloat16Frac( b );
+    bExp =3D extractFloat16Exp( b );
+    expDiff =3D aExp - bExp;
+    aSig <<=3D 7;
+    bSig <<=3D 7;
+    if ( 0 < expDiff ) goto aExpBigger;
+    if ( expDiff < 0 ) goto bExpBigger;
+    if ( aExp =3D=3D 0x1F ) {
+        if (aSig | bSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        float_raise(float_flag_invalid, status);
+        return float16_default_nan(status);
+    }
+    if ( aExp =3D=3D 0 ) {
+        aExp =3D 1;
+        bExp =3D 1;
+    }
+    if ( bSig < aSig ) goto aBigger;
+    if ( aSig < bSig ) goto bBigger;
+    return packFloat16(status->float_rounding_mode =3D=3D float_round_down=
, 0, 0);
+ bExpBigger:
+    if ( bExp =3D=3D 0x1F ) {
+        if (bSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        return packFloat16( zSign ^ 1, 0x1F, 0 );
+    }
+    if ( aExp =3D=3D 0 ) {
+        ++expDiff;
+    }
+    else {
+        aSig |=3D 0x40000000;
+    }
+    shift16RightJamming( aSig, - expDiff, &aSig );
+    bSig |=3D 0x40000000;
+ bBigger:
+    zSig =3D bSig - aSig;
+    zExp =3D bExp;
+    zSign ^=3D 1;
+    goto normalizeRoundAndPack;
+ aExpBigger:
+    if ( aExp =3D=3D 0x1F ) {
+        if (aSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        return a;
+    }
+    if ( bExp =3D=3D 0 ) {
+        --expDiff;
+    }
+    else {
+        bSig |=3D 0x40000000;
+    }
+    shift16RightJamming( bSig, expDiff, &bSig );
+    aSig |=3D 0x40000000;
+ aBigger:
+    zSig =3D aSig - bSig;
+    zExp =3D aExp;
+ normalizeRoundAndPack:
+    --zExp;
+    return normalizeRoundAndPackFloat16(zSign, zExp, zSig, status);
+
+}
+
+/*------------------------------------------------------------------------=
----
+| Returns the result of adding the half-precision floating-point values `a'
+| and `b'.  The operation is performed according to the IEC/IEEE Standard =
for
+| Binary Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+float16 float16_add(float16 a, float16 b, float_status *status)
+{
+    flag aSign, bSign;
+    a =3D float16_squash_input_denormal(a, status);
+    b =3D float16_squash_input_denormal(b, status);
+
+    aSign =3D extractFloat16Sign( a );
+    bSign =3D extractFloat16Sign( b );
+    if ( aSign =3D=3D bSign ) {
+        return addFloat16Sigs(a, b, aSign, status);
+    }
+    else {
+        return subFloat16Sigs(a, b, aSign, status);
+    }
+
+}
+
+/*------------------------------------------------------------------------=
----
+| Returns the result of subtracting the half-precision floating-point valu=
es
+| `a' and `b'.  The operation is performed according to the IEC/IEEE Stand=
ard
+| for Binary Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+float16 float16_sub(float16 a, float16 b, float_status *status)
+{
+    flag aSign, bSign;
+    a =3D float16_squash_input_denormal(a, status);
+    b =3D float16_squash_input_denormal(b, status);
+
+    aSign =3D extractFloat16Sign( a );
+    bSign =3D extractFloat16Sign( b );
+    if ( aSign =3D=3D bSign ) {
+        return subFloat16Sigs(a, b, aSign, status);
+    }
+    else {
+        return addFloat16Sigs(a, b, aSign, status);
+    }
+
+}
+
+/*------------------------------------------------------------------------=
----
+| Returns the result of multiplying the half-precision floating-point valu=
es
+| `a' and `b'.  The operation is performed according to the IEC/IEEE Stand=
ard
+| for Binary Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+float16 float16_mul(float16 a, float16 b, float_status *status)
+{
+    flag aSign, bSign, zSign;
+    int aExp, bExp, zExp;
+    uint32_t aSig, bSig;
+    uint32_t zSig32; /* passed directly to roundAndPackFloat16 */
+
+    a =3D float16_squash_input_denormal(a, status);
+    b =3D float16_squash_input_denormal(b, status);
+
+    aSig =3D extractFloat16Frac( a );
+    aExp =3D extractFloat16Exp( a );
+    aSign =3D extractFloat16Sign( a );
+    bSig =3D extractFloat16Frac( b );
+    bExp =3D extractFloat16Exp( b );
+    bSign =3D extractFloat16Sign( b );
+    zSign =3D aSign ^ bSign;
+    if ( aExp =3D=3D 0x1F ) {
+        if ( aSig || ( ( bExp =3D=3D 0x1F ) && bSig ) ) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        if ( ( bExp | bSig ) =3D=3D 0 ) {
+            float_raise(float_flag_invalid, status);
+            return float16_default_nan(status);
+        }
+        return packFloat16( zSign, 0x1F, 0 );
+    }
+    if ( bExp =3D=3D 0x1F ) {
+        if (bSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        if ( ( aExp | aSig ) =3D=3D 0 ) {
+            float_raise(float_flag_invalid, status);
+            return float16_default_nan(status);
+        }
+        return packFloat16( zSign, 0x1F, 0 );
+    }
+    if ( aExp =3D=3D 0 ) {
+        if ( aSig =3D=3D 0 ) return packFloat16( zSign, 0, 0 );
+        normalizeFloat16Subnormal( aSig, &aExp, &aSig );
+    }
+    if ( bExp =3D=3D 0 ) {
+        if ( bSig =3D=3D 0 ) return packFloat16( zSign, 0, 0 );
+        normalizeFloat16Subnormal( bSig, &bExp, &bSig );
+    }
+    zExp =3D aExp + bExp - 0xF;
+    /* Add implicit bit */
+    aSig =3D ( aSig | 0x0400 )<<4;
+    bSig =3D ( bSig | 0x0400 )<<5;
+    /* With the implicit bits set, (0x400 << 4) * (0x400 << 5) is
+     * 0x20000000, so shift the binary point from bit 30/29 to 23/22.
+     */
+    shift32RightJamming( ( (uint32_t) aSig ) * bSig, 7, &zSig32 );
+    /* At this point the significand is at the same point as
+     * float32_mul, so we can do the same test */
+    if ( 0 <=3D (int32_t) ( zSig32<<1 ) ) {
+        zSig32 <<=3D 1;
+        --zExp;
+    }
+    return roundAndPackFloat16(zSign, zExp, zSig32, true, status);
+}
+
+/*------------------------------------------------------------------------=
----
+| Returns the result of dividing the half-precision floating-point value `=
a'
+| by the corresponding value `b'.  The operation is performed according to=
 the
+| IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*-------------------------------------------------------------------------=
---*/
+
+float16 float16_div(float16 a, float16 b, float_status *status)
+{
+    flag aSign, bSign, zSign;
+    int aExp, bExp, zExp;
+    uint32_t aSig, bSig, zSig;
+    a =3D float16_squash_input_denormal(a, status);
+    b =3D float16_squash_input_denormal(b, status);
+
+    aSig =3D extractFloat16Frac( a );
+    aExp =3D extractFloat16Exp( a );
+    aSign =3D extractFloat16Sign( a );
+    bSig =3D extractFloat16Frac( b );
+    bExp =3D extractFloat16Exp( b );
+    bSign =3D extractFloat16Sign( b );
+    zSign =3D aSign ^ bSign;
+    if ( aExp =3D=3D 0x1F ) {
+        if (aSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        if ( bExp =3D=3D 0x1F ) {
+            if (bSig) {
+                return propagateFloat16NaN(a, b, status);
+            }
+            float_raise(float_flag_invalid, status);
+            return float16_default_nan(status);
+        }
+        return packFloat16( zSign, 0x1F, 0 );
+    }
+    if ( bExp =3D=3D 0x1F ) {
+        if (bSig) {
+            return propagateFloat16NaN(a, b, status);
+        }
+        return packFloat16( zSign, 0, 0 );
+    }
+    if ( bExp =3D=3D 0 ) {
+        if ( bSig =3D=3D 0 ) {
+            if ( ( aExp | aSig ) =3D=3D 0 ) {
+                float_raise(float_flag_invalid, status);
+                return float16_default_nan(status);
+            }
+            float_raise(float_flag_divbyzero, status);
+            return packFloat16( zSign, 0x1F, 0 );
+        }
+        normalizeFloat16Subnormal( bSig, &bExp, &bSig );
+    }
+    if ( aExp =3D=3D 0 ) {
+        if ( aSig =3D=3D 0 ) return packFloat16( zSign, 0, 0 );
+        normalizeFloat16Subnormal( aSig, &aExp, &aSig );
+    }
+    zExp =3D aExp - bExp + 0x7D;
+    aSig =3D ( aSig | 0x00800000 )<<7;
+    bSig =3D ( bSig | 0x00800000 )<<8;
+    if ( bSig <=3D ( aSig + aSig ) ) {
+        aSig >>=3D 1;
+        ++zExp;
+    }
+    zSig =3D ( ( (uint64_t) aSig )<<16 ) / bSig;
+    if ( ( zSig & 0x3F ) =3D=3D 0 ) {
+        zSig |=3D ( (uint64_t) bSig * zSig !=3D ( (uint64_t) aSig )<<16 );
+    }
+    return roundAndPackFloat16(zSign, zExp, zSig, true, status);
+
+}
+
 /* Half precision floats come in two formats: standard IEEE and "ARM" form=
at.
    The latter gains extra exponent range by omitting the NaN/Inf encodings=
.  */
=20
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index d89fdf7675..f1d79b6d03 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -345,6 +345,12 @@ float64 float16_to_float64(float16 a, flag ieee, float=
_status *status);
 /*------------------------------------------------------------------------=
----
 | Software half-precision operations.
 *-------------------------------------------------------------------------=
---*/
+
+float16 float16_add(float16, float16, float_status *status);
+float16 float16_sub(float16, float16, float_status *status);
+float16 float16_mul(float16, float16, float_status *status);
+float16 float16_div(float16, float16, float_status *status);
+
 int float16_is_quiet_nan(float16, float_status *status);
 int float16_is_signaling_nan(float16, float_status *status);
 float16 float16_maybe_silence_nan(float16, float_status *status);
--=20
2.14.1