From nobody Tue Feb 10 22:17:37 2026 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1507912936149116.63159263256762; Fri, 13 Oct 2017 09:42:16 -0700 (PDT) Received: from localhost ([::1]:51121 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e3327-000060-3X for importer@patchew.org; Fri, 13 Oct 2017 12:41:59 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44006) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e32sX-0000Yw-TO for qemu-devel@nongnu.org; Fri, 13 Oct 2017 12:32:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1e32sW-0002F9-0d for qemu-devel@nongnu.org; Fri, 13 Oct 2017 12:32:05 -0400 Received: from mail-wm0-x22c.google.com ([2a00:1450:400c:c09::22c]:48949) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1e32sV-0002Dq-Ob for qemu-devel@nongnu.org; Fri, 13 Oct 2017 12:32:03 -0400 Received: by mail-wm0-x22c.google.com with SMTP id i124so22416664wmf.3 for ; Fri, 13 Oct 2017 09:32:03 -0700 (PDT) Received: from zen.linaro.local ([81.128.185.34]) by smtp.gmail.com with ESMTPSA id 188sm1047400wmg.45.2017.10.13.09.31.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 13 Oct 2017 09:31:58 -0700 (PDT) Received: from zen.linaroharston (localhost [127.0.0.1]) by zen.linaro.local (Postfix) with ESMTP id 1DDF03E05FB; Fri, 13 Oct 2017 17:24:40 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=UO8/dqxLs3JexzFu1Zz6rU4lpXIYQJK7hAlRHClqK6U=; b=OuxIJYlIwrO3U8I9Q4HDKW9BB++F0N75FeNypQkZMNDJf5Ze0uFX4IMd/qbzPTdbeb 7/PdEaRH6qOJn/U+USxbaf2hBEIINOA6iViEVj8yoqPyTBzrCqycFlVwsvHtQK5cofBz Rky3Ho9XA2h/0B3zjQFViCTKVu9jAaeWzTt3k= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=UO8/dqxLs3JexzFu1Zz6rU4lpXIYQJK7hAlRHClqK6U=; b=ehYD2IYPlcs0MncAkJFDWs4WQFb5f0MfJAGUKvR4a+C/PA39+8SJuUslJood+MojcL WQNCjrkXZUNvGFhxPq+0vPIpGDIYqKR7mdkEPWTexoB8dsVs4LZiwkYBrwk/IS1H9M9v 0AEiB/HvSabJdkUYcumyJ64+Atd3jEI7VTDi5x2tNKNgLAHzEMVM5DXToSGiFk6CljVr Lzo77qIyHFValJZhhSAwRku+CDvRygzdIQY8TNS1kuzoDqLkG4ujEsbpo02Y60OnuHGV sABFC1kt5vivx/3Xy16VFSS1BdiRLJmpDrZsrNZLr2gC3vgGSbb6JYpL25EaBSfEBfVy zYmg== X-Gm-Message-State: AMCzsaXre0lBsX9D4CIJ1SechSjkP3+MGZdqT8RYxhOJsSqPGwBLLGtY i73fMHE8+qG65K0DIlrMLYPW+Q== X-Google-Smtp-Source: ABhQp+QcKjjYFf5xleVe7U2zXw0EyIQZMmL/YFbvQZCXLbBiAPI903BNTKZG7a5L6pQ/WFQs+bBSRA== X-Received: by 10.28.88.21 with SMTP id m21mr1802462wmb.111.1507912321094; Fri, 13 Oct 2017 09:32:01 -0700 (PDT) From: =?UTF-8?q?Alex=20Benn=C3=A9e?= To: richard.henderson@linaro.org Date: Fri, 13 Oct 2017 17:24:31 +0100 Message-Id: <20171013162438.32458-24-alex.bennee@linaro.org> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20171013162438.32458-1-alex.bennee@linaro.org> References: <20171013162438.32458-1-alex.bennee@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:400c:c09::22c Subject: [Qemu-devel] [RFC PATCH 23/30] softfloat: add float16_rem and float16_muladd (!CHECK) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org, =?UTF-8?q?Alex=20Benn=C3=A9e?= , qemu-devel@nongnu.org, Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Signed-off-by: Alex Benn=C3=A9e --- fpu/softfloat-specialize.h | 52 +++++++ fpu/softfloat.c | 327 +++++++++++++++++++++++++++++++++++++++++= ++++ include/fpu/softfloat.h | 2 + 3 files changed, 381 insertions(+) diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h index 2ccd4abe11..33c4be1757 100644 --- a/fpu/softfloat-specialize.h +++ b/fpu/softfloat-specialize.h @@ -728,6 +728,58 @@ static float16 propagateFloat16NaN(float16 a, float16 = b, float_status *status) } } =20 +/*------------------------------------------------------------------------= ---- +| Takes three half-precision floating-point values `a', `b' and `c', one of +| which is a NaN, and returns the appropriate NaN result. If any of `a', +| `b' or `c' is a signaling NaN, the invalid exception is raised. +| The input infzero indicates whether a*b was 0*inf or inf*0 (in which case +| obviously c is a NaN, and whether to propagate c or some other NaN is +| implementation defined). +*-------------------------------------------------------------------------= ---*/ + +static float16 propagateFloat16MulAddNaN(float16 a, float16 b, + float16 c, flag infzero, + float_status *status) +{ + flag aIsQuietNaN, aIsSignalingNaN, bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN; + int which; + + aIsQuietNaN =3D float16_is_quiet_nan(a, status); + aIsSignalingNaN =3D float16_is_signaling_nan(a, status); + bIsQuietNaN =3D float16_is_quiet_nan(b, status); + bIsSignalingNaN =3D float16_is_signaling_nan(b, status); + cIsQuietNaN =3D float16_is_quiet_nan(c, status); + cIsSignalingNaN =3D float16_is_signaling_nan(c, status); + + if (aIsSignalingNaN | bIsSignalingNaN | cIsSignalingNaN) { + float_raise(float_flag_invalid, status); + } + + which =3D pickNaNMulAdd(aIsQuietNaN, aIsSignalingNaN, + bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN, infzero, status); + + if (status->default_nan_mode) { + /* Note that this check is after pickNaNMulAdd so that function + * has an opportunity to set the Invalid flag. + */ + return float16_default_nan(status); + } + + switch (which) { + case 0: + return float16_maybe_silence_nan(a, status); + case 1: + return float16_maybe_silence_nan(b, status); + case 2: + return float16_maybe_silence_nan(c, status); + case 3: + default: + return float16_default_nan(status); + } +} + /*------------------------------------------------------------------------= ---- | Takes two single-precision floating-point values `a' and `b', one of whi= ch | is a NaN, and returns the appropriate NaN result. If either `a' or `b' = is a diff --git a/fpu/softfloat.c b/fpu/softfloat.c index fdb2999c41..f7473f97e3 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -3884,6 +3884,333 @@ float16 float16_div(float16 a, float16 b, float_sta= tus *status) =20 } =20 +/*------------------------------------------------------------------------= ---- +| Returns the remainder of the half-precision floating-point value `a' +| with respect to the corresponding value `b'. The operation is performed +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic. +*-------------------------------------------------------------------------= ---*/ + +float16 float16_rem(float16 a, float16 b, float_status *status) +{ + flag aSign, zSign; + int aExp, bExp, expDiff; + uint32_t aSig, bSig; + uint32_t q; + uint64_t aSig64, bSig64, q64; + uint32_t alternateASig; + int32_t sigMean; + a =3D float16_squash_input_denormal(a, status); + b =3D float16_squash_input_denormal(b, status); + + aSig =3D extractFloat32Frac( a ); + aExp =3D extractFloat32Exp( a ); + aSign =3D extractFloat32Sign( a ); + bSig =3D extractFloat32Frac( b ); + bExp =3D extractFloat32Exp( b ); + if ( aExp =3D=3D 0xFF ) { + if ( aSig || ( ( bExp =3D=3D 0xFF ) && bSig ) ) { + return propagateFloat32NaN(a, b, status); + } + float_raise(float_flag_invalid, status); + return float16_default_nan(status); + } + if ( bExp =3D=3D 0xFF ) { + if (bSig) { + return propagateFloat32NaN(a, b, status); + } + return a; + } + if ( bExp =3D=3D 0 ) { + if ( bSig =3D=3D 0 ) { + float_raise(float_flag_invalid, status); + return float16_default_nan(status); + } + normalizeFloat32Subnormal( bSig, &bExp, &bSig ); + } + if ( aExp =3D=3D 0 ) { + if ( aSig =3D=3D 0 ) return a; + normalizeFloat32Subnormal( aSig, &aExp, &aSig ); + } + expDiff =3D aExp - bExp; + aSig |=3D 0x00800000; + bSig |=3D 0x00800000; + if ( expDiff < 32 ) { + aSig <<=3D 8; + bSig <<=3D 8; + if ( expDiff < 0 ) { + if ( expDiff < -1 ) return a; + aSig >>=3D 1; + } + q =3D ( bSig <=3D aSig ); + if ( q ) aSig -=3D bSig; + if ( 0 < expDiff ) { + q =3D ( ( (uint64_t) aSig )<<32 ) / bSig; + q >>=3D 32 - expDiff; + bSig >>=3D 2; + aSig =3D ( ( aSig>>1 )<<( expDiff - 1 ) ) - bSig * q; + } + else { + aSig >>=3D 2; + bSig >>=3D 2; + } + } + else { + if ( bSig <=3D aSig ) aSig -=3D bSig; + aSig64 =3D ( (uint64_t) aSig )<<40; + bSig64 =3D ( (uint64_t) bSig )<<40; + expDiff -=3D 64; + while ( 0 < expDiff ) { + q64 =3D estimateDiv128To64( aSig64, 0, bSig64 ); + q64 =3D ( 2 < q64 ) ? q64 - 2 : 0; + aSig64 =3D - ( ( bSig * q64 )<<38 ); + expDiff -=3D 62; + } + expDiff +=3D 64; + q64 =3D estimateDiv128To64( aSig64, 0, bSig64 ); + q64 =3D ( 2 < q64 ) ? q64 - 2 : 0; + q =3D q64>>( 64 - expDiff ); + bSig <<=3D 6; + aSig =3D ( ( aSig64>>33 )<<( expDiff - 1 ) ) - bSig * q; + } + do { + alternateASig =3D aSig; + ++q; + aSig -=3D bSig; + } while ( 0 <=3D (int32_t) aSig ); + sigMean =3D aSig + alternateASig; + if ( ( sigMean < 0 ) || ( ( sigMean =3D=3D 0 ) && ( q & 1 ) ) ) { + aSig =3D alternateASig; + } + zSign =3D ( (int32_t) aSig < 0 ); + if ( zSign ) aSig =3D - aSig; + return normalizeRoundAndPackFloat32(aSign ^ zSign, bExp, aSig, status); +} + +/*------------------------------------------------------------------------= ---- +| Returns the result of multiplying the half-precision floating-point valu= es +| `a' and `b' then adding 'c', with no intermediate rounding step after the +| multiplication. The operation is performed according to the IEC/IEEE +| Standard for Binary Floating-Point Arithmetic 754-2008. +| The flags argument allows the caller to select negation of the +| addend, the intermediate product, or the final result. (The difference +| between this and having the caller do a separate negation is that negati= ng +| externally will flip the sign bit on NaNs.) +*-------------------------------------------------------------------------= ---*/ + +float16 float16_muladd(float16 a, float16 b, float16 c, int flags, + float_status *status) +{ + flag aSign, bSign, cSign, zSign; + int aExp, bExp, cExp, pExp, zExp, expDiff; + uint32_t aSig, bSig, cSig; + flag pInf, pZero, pSign; + uint64_t pSig64, cSig64, zSig64; + uint32_t pSig; + int shiftcount; + flag signflip, infzero; + + a =3D float16_squash_input_denormal(a, status); + b =3D float16_squash_input_denormal(b, status); + c =3D float16_squash_input_denormal(c, status); + aSig =3D extractFloat16Frac(a); + aExp =3D extractFloat16Exp(a); + aSign =3D extractFloat16Sign(a); + bSig =3D extractFloat16Frac(b); + bExp =3D extractFloat16Exp(b); + bSign =3D extractFloat16Sign(b); + cSig =3D extractFloat16Frac(c); + cExp =3D extractFloat16Exp(c); + cSign =3D extractFloat16Sign(c); + + infzero =3D ((aExp =3D=3D 0 && aSig =3D=3D 0 && bExp =3D=3D 0x1f && bS= ig =3D=3D 0) || + (aExp =3D=3D 0x1f && aSig =3D=3D 0 && bExp =3D=3D 0 && bSig= =3D=3D 0)); + + /* It is implementation-defined whether the cases of (0,inf,qnan) + * and (inf,0,qnan) raise InvalidOperation or not (and what QNaN + * they return if they do), so we have to hand this information + * off to the target-specific pick-a-NaN routine. + */ + if (((aExp =3D=3D 0xff) && aSig) || + ((bExp =3D=3D 0xff) && bSig) || + ((cExp =3D=3D 0xff) && cSig)) { + return propagateFloat16MulAddNaN(a, b, c, infzero, status); + } + + if (infzero) { + float_raise(float_flag_invalid, status); + return float16_default_nan(status); + } + + if (flags & float_muladd_negate_c) { + cSign ^=3D 1; + } + + signflip =3D (flags & float_muladd_negate_result) ? 1 : 0; + + /* Work out the sign and type of the product */ + pSign =3D aSign ^ bSign; + if (flags & float_muladd_negate_product) { + pSign ^=3D 1; + } + pInf =3D (aExp =3D=3D 0xff) || (bExp =3D=3D 0xff); + pZero =3D ((aExp | aSig) =3D=3D 0) || ((bExp | bSig) =3D=3D 0); + + if (cExp =3D=3D 0xff) { + if (pInf && (pSign ^ cSign)) { + /* addition of opposite-signed infinities =3D> InvalidOperatio= n */ + float_raise(float_flag_invalid, status); + return float16_default_nan(status); + } + /* Otherwise generate an infinity of the same sign */ + return packFloat16(cSign ^ signflip, 0xff, 0); + } + + if (pInf) { + return packFloat16(pSign ^ signflip, 0xff, 0); + } + + if (pZero) { + if (cExp =3D=3D 0) { + if (cSig =3D=3D 0) { + /* Adding two exact zeroes */ + if (pSign =3D=3D cSign) { + zSign =3D pSign; + } else if (status->float_rounding_mode =3D=3D float_round_= down) { + zSign =3D 1; + } else { + zSign =3D 0; + } + return packFloat16(zSign ^ signflip, 0, 0); + } + /* Exact zero plus a denorm */ + if (status->flush_to_zero) { + float_raise(float_flag_output_denormal, status); + return packFloat16(cSign ^ signflip, 0, 0); + } + } + /* Zero plus something non-zero : just return the something */ + if (flags & float_muladd_halve_result) { + if (cExp =3D=3D 0) { + normalizeFloat16Subnormal(cSig, &cExp, &cSig); + } + /* Subtract one to halve, and one again because roundAndPackFl= oat16 + * wants one less than the true exponent. + */ + cExp -=3D 2; + cSig =3D (cSig | 0x00800000) << 7; + return roundAndPackFloat16(cSign ^ signflip, cExp, cSig, true,= status); + } + return packFloat16(cSign ^ signflip, cExp, cSig); + } + + if (aExp =3D=3D 0) { + normalizeFloat16Subnormal(aSig, &aExp, &aSig); + } + if (bExp =3D=3D 0) { + normalizeFloat16Subnormal(bSig, &bExp, &bSig); + } + + /* Calculate the actual result a * b + c */ + + /* Multiply first; this is easy. */ + /* NB: we subtract 0x7e where float16_mul() subtracts 0x7f + * because we want the true exponent, not the "one-less-than" + * flavour that roundAndPackFloat16() takes. + */ + pExp =3D aExp + bExp - 0x7e; + aSig =3D (aSig | 0x00800000) << 7; + bSig =3D (bSig | 0x00800000) << 8; + pSig64 =3D (uint64_t)aSig * bSig; + if ((int64_t)(pSig64 << 1) >=3D 0) { + pSig64 <<=3D 1; + pExp--; + } + + zSign =3D pSign ^ signflip; + + /* Now pSig64 is the significand of the multiply, with the explicit bi= t in + * position 62. + */ + if (cExp =3D=3D 0) { + if (!cSig) { + /* Throw out the special case of c being an exact zero now */ + shift64RightJamming(pSig64, 32, &pSig64); + pSig =3D pSig64; + if (flags & float_muladd_halve_result) { + pExp--; + } + return roundAndPackFloat16(zSign, pExp - 1, + pSig, true, status); + } + normalizeFloat16Subnormal(cSig, &cExp, &cSig); + } + + cSig64 =3D (uint64_t)cSig << (62 - 23); + cSig64 |=3D LIT64(0x4000000000000000); + expDiff =3D pExp - cExp; + + if (pSign =3D=3D cSign) { + /* Addition */ + if (expDiff > 0) { + /* scale c to match p */ + shift64RightJamming(cSig64, expDiff, &cSig64); + zExp =3D pExp; + } else if (expDiff < 0) { + /* scale p to match c */ + shift64RightJamming(pSig64, -expDiff, &pSig64); + zExp =3D cExp; + } else { + /* no scaling needed */ + zExp =3D cExp; + } + /* Add significands and make sure explicit bit ends up in posn 62 = */ + zSig64 =3D pSig64 + cSig64; + if ((int64_t)zSig64 < 0) { + shift64RightJamming(zSig64, 1, &zSig64); + } else { + zExp--; + } + } else { + /* Subtraction */ + if (expDiff > 0) { + shift64RightJamming(cSig64, expDiff, &cSig64); + zSig64 =3D pSig64 - cSig64; + zExp =3D pExp; + } else if (expDiff < 0) { + shift64RightJamming(pSig64, -expDiff, &pSig64); + zSig64 =3D cSig64 - pSig64; + zExp =3D cExp; + zSign ^=3D 1; + } else { + zExp =3D pExp; + if (cSig64 < pSig64) { + zSig64 =3D pSig64 - cSig64; + } else if (pSig64 < cSig64) { + zSig64 =3D cSig64 - pSig64; + zSign ^=3D 1; + } else { + /* Exact zero */ + zSign =3D signflip; + if (status->float_rounding_mode =3D=3D float_round_down) { + zSign ^=3D 1; + } + return packFloat16(zSign, 0, 0); + } + } + --zExp; + /* Normalize to put the explicit bit back into bit 62. */ + shiftcount =3D countLeadingZeros64(zSig64) - 1; + zSig64 <<=3D shiftcount; + zExp -=3D shiftcount; + } + if (flags & float_muladd_halve_result) { + zExp--; + } + + shift64RightJamming(zSig64, 32, &zSig64); + return roundAndPackFloat16(zSign, zExp, zSig64, true, status); +} + /*------------------------------------------------------------------------= ---- | Returns 1 if the half-precision floating-point value `a' is equal to | the corresponding value `b', and 0 otherwise. The invalid exception is diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 76a8310780..a7435e2a5b 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -350,6 +350,8 @@ float16 float16_add(float16, float16, float_status *sta= tus); float16 float16_sub(float16, float16, float_status *status); float16 float16_mul(float16, float16, float_status *status); float16 float16_div(float16, float16, float_status *status); +float16 float16_rem(float16, float16, float_status *status); +float16 float16_muladd(float16, float16, float16, int, float_status *statu= s); int float16_eq(float16, float16, float_status *status); int float16_le(float16, float16, float_status *status); int float16_lt(float16, float16, float_status *status); --=20 2.14.1