From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:21 -0400
Message-Id: <20181013231933.28789-2-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 01/13] fp-test: pick TARGET_ARM to get its
 specialization
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Define TARGET_ARM when building fp-test so that softfloat uses the ARM
specialization. This gets rid of the muladd errors that were due to not
raising the invalid flag.

- Before:
Errors found in f64_mulAdd, rounding near_even, tininess before rounding:
+000.0000000000000  +7FF.0000000000000  +7FF.FFFFFFFFFFFFF
        =3D> +7FF.FFFFFFFFFFFFF .....  expected -7FF.FFFFFFFFFFFFF v....
[...]

- After:
In 6133248 tests, no errors found in f64_mulAdd, rounding near_even, tinine=
ss before rounding.
[...]

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 tests/fp/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/fp/Makefile b/tests/fp/Makefile
index d649a5a1db..49cdcd1bd2 100644
--- a/tests/fp/Makefile
+++ b/tests/fp/Makefile
@@ -29,6 +29,9 @@ QEMU_INCLUDES +=3D -I$(TF_SOURCE_DIR)
=20
 # work around TARGET_* poisoning
 QEMU_CFLAGS +=3D -DHW_POISON_H
+# define a target to match testfloat's implementation-defined choices, suc=
h as
+# whether to raise the invalid flag when dealing with NaNs in muladd.
+QEMU_CFLAGS +=3D -DTARGET_ARM
=20
 # capstone has a platform.h file that clashes with softfloat's
 QEMU_CFLAGS :=3D $(filter-out %capstone, $(QEMU_CFLAGS))
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:22 -0400
Message-Id: <20181013231933.28789-3-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 02/13] softfloat: add float{32, 64}_is_{de,
 }normal
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0

Add float{32,64}_is_normal and float{32,64}_is_denormal.
This paves the way for upcoming work.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Alex Benn=C3=A9e <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
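Note (not for the commit log): the bit trick in float32_is_normal() can
be checked in isolation. The sketch below is a minimal stand-alone
version that works on raw uint32_t bit patterns instead of QEMU's float32
type. The bits32_* names are made up for illustration and do not exist in
QEMU; the expressions are copied from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: float32_is_normal()'s expression on raw bits. */
static bool bits32_is_normal(uint32_t v)
{
    /*
     * Adding 0x00800000 bumps the 8-bit exponent field by one.  After
     * masking off the sign bit, exponent 0 (zero, denormal) becomes 1
     * and exponent 0xff (inf, NaN) wraps to 0, so both fall below
     * 0x01000000.  Normal exponents 1..0xfe become 2..0xff and pass.
     */
    return ((v + 0x00800000u) & 0x7fffffffu) >= 0x01000000u;
}

/* Hypothetical helper: float32_is_denormal()'s logic on raw bits. */
static bool bits32_is_denormal(uint32_t v)
{
    bool exp_is_zero = (v & 0x7f800000u) == 0;
    bool is_zero = (v & 0x7fffffffu) == 0;

    return exp_is_zero && !is_zero;
}

/* Exercise each IEEE 754 single-precision class once. */
void bits32_check(void)
{
    assert(bits32_is_normal(0x3f800000u));    /* 1.0 */
    assert(bits32_is_normal(0x80800000u));    /* -(smallest normal) */
    assert(!bits32_is_normal(0x00000000u));   /* +0 */
    assert(!bits32_is_normal(0x007fffffu));   /* largest denormal */
    assert(!bits32_is_normal(0x7f800000u));   /* +inf */
    assert(!bits32_is_normal(0xffc00000u));   /* NaN with sign bit set */
    assert(bits32_is_denormal(0x00000001u));  /* smallest denormal */
    assert(!bits32_is_denormal(0x80000000u)); /* -0 */
}
```

Calling bits32_check() from a test program is enough to confirm that the
single add-and-compare rejects zeros, denormals, infinities and NaNs of
either sign. The float64 variant in this patch follows the same pattern
with a 52-bit shift.
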
 include/fpu/softfloat.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 8fd9f9bbae..9eeccd88a5 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -464,6 +464,16 @@ static inline int float32_is_zero_or_denormal(float32 =
a)
     return (float32_val(a) & 0x7f800000) =3D=3D 0;
 }
=20
+static inline bool float32_is_normal(float32 a)
+{
+    return ((float32_val(a) + 0x00800000) & 0x7fffffff) >=3D 0x01000000;
+}
+
+static inline bool float32_is_denormal(float32 a)
+{
+    return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
+}
+
 static inline float32 float32_set_sign(float32 a, int sign)
 {
     return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -605,6 +615,16 @@ static inline int float64_is_zero_or_denormal(float64 =
a)
     return (float64_val(a) & 0x7ff0000000000000LL) =3D=3D 0;
 }
=20
+static inline bool float64_is_normal(float64 a)
+{
+    return ((float64_val(a) + (1ULL << 52)) & -1ULL >> 1) >=3D 1ULL << 53;
+}
+
+static inline bool float64_is_denormal(float64 a)
+{
+    return float64_is_zero_or_denormal(a) && !float64_is_zero(a);
+}
+
 static inline float64 float64_set_sign(float64 a, int sign)
 {
     return make_float64((float64_val(a) & 0x7fffffffffffffffULL)
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:23 -0400
Message-Id: <20181013231933.28789-4-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 03/13] target/tricore: use
 float32_is_denormal
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

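Replace the local f_is_denormal() with the equivalent
float32_is_denormal() helper from softfloat.h.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>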
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 target/tricore/fpu_helper.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c
index df162902d6..31df462e4a 100644
--- a/target/tricore/fpu_helper.c
+++ b/target/tricore/fpu_helper.c
@@ -44,11 +44,6 @@ static inline uint8_t f_get_excp_flags(CPUTriCoreState *=
env)
               | float_flag_inexact);
 }
=20
-static inline bool f_is_denormal(float32 arg)
-{
-    return float32_is_zero_or_denormal(arg) && !float32_is_zero(arg);
-}
-
 static inline float32 f_maddsub_nan_result(float32 arg1, float32 arg2,
                                            float32 arg3, float32 result,
                                            uint32_t muladd_negate_c)
@@ -260,8 +255,8 @@ uint32_t helper_fcmp(CPUTriCoreState *env, uint32_t r1,=
 uint32_t r2)
     set_flush_inputs_to_zero(0, &env->fp_status);
=20
     result =3D 1 << (float32_compare_quiet(arg1, arg2, &env->fp_status) + =
1);
-    result |=3D f_is_denormal(arg1) << 4;
-    result |=3D f_is_denormal(arg2) << 5;
+    result |=3D float32_is_denormal(arg1) << 4;
+    result |=3D float32_is_denormal(arg2) << 5;
=20
     flags =3D f_get_excp_flags(env);
     if (flags) {
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:24 -0400
Message-Id: <20181013231933.28789-5-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 04/13] softfloat: rename canonicalize to
 sf_canonicalize
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

glibc >=3D 2.25 defines canonicalize in commit eaf5ad0
(Add canonicalize, canonicalizef, canonicalizel., 2016-10-26).

Given that we'll be including <math.h> soon, prepare
for this by prefixing our canonicalize() with sf_ to avoid
clashing with the libc's canonicalize().

Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 46ae206172..0cbb08be32 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -336,8 +336,8 @@ static inline float64 float64_pack_raw(FloatParts p)
 #include "softfloat-specialize.h"
=20
 /* Canonicalize EXP and FRAC, setting CLS.  */
-static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
-                               float_status *status)
+static FloatParts sf_canonicalize(FloatParts part, const FloatFmt *parm,
+                                  float_status *status)
 {
     if (part.exp =3D=3D parm->exp_max && !parm->arm_althp) {
         if (part.frac =3D=3D 0) {
@@ -513,7 +513,7 @@ static FloatParts round_canonical(FloatParts p, float_s=
tatus *s,
 static FloatParts float16a_unpack_canonical(float16 f, float_status *s,
                                             const FloatFmt *params)
 {
-    return canonicalize(float16_unpack_raw(f), params, s);
+    return sf_canonicalize(float16_unpack_raw(f), params, s);
 }
=20
 static FloatParts float16_unpack_canonical(float16 f, float_status *s)
@@ -534,7 +534,7 @@ static float16 float16_round_pack_canonical(FloatParts =
p, float_status *s)
=20
 static FloatParts float32_unpack_canonical(float32 f, float_status *s)
 {
-    return canonicalize(float32_unpack_raw(f), &float32_params, s);
+    return sf_canonicalize(float32_unpack_raw(f), &float32_params, s);
 }
=20
 static float32 float32_round_pack_canonical(FloatParts p, float_status *s)
@@ -544,7 +544,7 @@ static float32 float32_round_pack_canonical(FloatParts =
p, float_status *s)
=20
 static FloatParts float64_unpack_canonical(float64 f, float_status *s)
 {
-    return canonicalize(float64_unpack_raw(f), &float64_params, s);
+    return sf_canonicalize(float64_unpack_raw(f), &float64_params, s);
 }
=20
 static float64 float64_round_pack_canonical(FloatParts p, float_status *s)
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:25 -0400
Message-Id: <20181013231933.28789-6-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 05/13] softfloat: add float{32,
 64}_is_zero_or_normal
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add float{32,64}_is_zero_or_normal, which are true for zeros and for
normal numbers. These will gain some users very soon.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
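Note (not for the commit log): a minimal stand-alone sketch of what the
float64 helper accepts and rejects, written against raw uint64_t bit
patterns instead of QEMU's float64 type. The bits64_* names are
hypothetical and exist only in this example; bits64_is_zero() assumes
that float64_is_zero() tests all bits except the sign.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed equivalent of float64_is_zero(): all non-sign bits clear. */
static bool bits64_is_zero(uint64_t v)
{
    return (v & 0x7fffffffffffffffULL) == 0;
}

/* Same arithmetic as float64_is_normal(), on a raw bit pattern. */
static bool bits64_is_normal(uint64_t v)
{
    return ((v + (1ULL << 52)) & (-1ULL >> 1)) >= (1ULL << 53);
}

/* Mirrors float64_is_zero_or_normal() from this patch. */
static bool bits64_is_zero_or_normal(uint64_t v)
{
    return bits64_is_normal(v) || bits64_is_zero(v);
}

/* One representative bit pattern per double-precision class. */
void bits64_check(void)
{
    assert(bits64_is_zero_or_normal(0x0000000000000000ULL));  /* +0 */
    assert(bits64_is_zero_or_normal(0x8000000000000000ULL));  /* -0 */
    assert(bits64_is_zero_or_normal(0x3ff0000000000000ULL));  /* 1.0 */
    assert(bits64_is_zero_or_normal(0x0010000000000000ULL));  /* min normal */
    assert(!bits64_is_zero_or_normal(0x0000000000000001ULL)); /* denormal */
    assert(!bits64_is_zero_or_normal(0x7ff0000000000000ULL)); /* +inf */
    assert(!bits64_is_zero_or_normal(0x7ff8000000000000ULL)); /* NaN */
}
```

Only denormals, infinities and NaNs are rejected, so a caller can use the
predicate to filter out every input that is neither zero nor normal
before taking a fast path.
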
 include/fpu/softfloat.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 9eeccd88a5..38a5e99cf3 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -474,6 +474,11 @@ static inline bool float32_is_denormal(float32 a)
     return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
 }
=20
+static inline bool float32_is_zero_or_normal(float32 a)
+{
+    return float32_is_normal(a) || float32_is_zero(a);
+}
+
 static inline float32 float32_set_sign(float32 a, int sign)
 {
     return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -625,6 +630,11 @@ static inline bool float64_is_denormal(float64 a)
     return float64_is_zero_or_denormal(a) && !float64_is_zero(a);
 }
=20
+static inline bool float64_is_zero_or_normal(float64 a)
+{
+    return float64_is_normal(a) || float64_is_zero(a);
+}
+
 static inline float64 float64_set_sign(float64 a, int sign)
 {
     return make_float64((float64_val(a) & 0x7fffffffffffffffULL)
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as
 permitted sender) client-ip=208.118.235.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
	spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted
 sender)  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by
 mx.zohomail.com
	with SMTPS id 15394734404017.813283724244002;
 Sat, 13 Oct 2018 16:30:40 -0700 (PDT)
Received: from localhost ([::1]:46591 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1gBTMl-0001pk-7N
	for importer@patchew.org; Sat, 13 Oct 2018 19:30:39 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57872)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCf-0002TD-Jj
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCc-0007x8-UB
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:13 -0400
Received: from out3-smtp.messagingengine.com ([66.111.4.27]:44591)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <cota@braap.org>) id 1gBTCc-0007wH-LJ
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:10 -0400
Received: from compute4.internal (compute4.nyi.internal [10.202.2.44])
	by mailout.nyi.internal (Postfix) with ESMTP id B8B0C21BEB;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
Received: from mailfrontend2 ([10.202.2.163])
	by compute4.internal (MEProxy); Sat, 13 Oct 2018 19:20:08 -0400
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id 44C45102ED;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=
	from:to:cc:subject:date:message-id:in-reply-to:references; s=
	mesmtp; bh=mHZ+WkgQo6dNLLpx91/bsHe5ftmVf4K1vN6Awmc8Eyo=; b=JmVVt
	Bu+AkPUO9uedZeYm8HQdPb3tro2Pfhl7Sj/Ml4gJSPi/9o6FTtGqZxm0Vjw+zAxk
	jyQjoXBMy17TAotw9mPlsISJyIy1Dson9gpgo7Pc9nUx7LyjxR7GsmZBF3b3vttu
	7e2KFqw1++ssx6WxPY3ekRXsQ7MXTLf5sQrp9s=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:date:from:in-reply-to:message-id
	:references:subject:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm1; bh=mHZ+WkgQo6dNLLpx91/bsHe5ftmVf
	4K1vN6Awmc8Eyo=; b=Kl6/IK7jgT06e2q3/6ohS/zUkMQx+PIONgSvxM0bly7VB
	tTTX4QEl6oaGM3mXyleuM7brpmPuElKPLOfnKGGikkWAPu+jvcIJz7OiLoYeNbUM
	ujFSiPj2REeMYj7UQdS4EJmLziXpGInPE9hfW31ARKbxDyj22EBSmik33NN07Anm
	ru3s90+1nQh+gNKBxrPJ7s1Dz0gIp9CpkDLSkrkX8bngF4GtppPVyChEXzOeO9qg
	0ybKzmwamKHg//OkVHNzxT4ApRDAD6h7WH1DFTGBsaA/CvCAgCkGZCvQ1yodoAZj
	2fLr/yjJbF0YaPv3kB3bL2o9TUEomImzrxwJVB4TA==
X-ME-Sender: <xms:qH3CWyUsMdkynfd9yasoUOOLne8oyfcgvkeds4ltgP7aLba8IPwUJg>
X-ME-Proxy: <xmx:qH3CWwUusmFPpa-J1mvzmP7sdYOUGKkJvCofU6H5pkHIKCLcHq-Ukg>
	<xmx:qH3CW7TnDfPADq33oTBjKDYDPNQpLBblKylGD4bCG_bhhsApu43CpA>
	<xmx:qH3CW420tY4wan84GNhAgkgcVkSPyXoXClSpblsl6Nau8hbRcsHScg>
	<xmx:qH3CW9jc0O73YEoFWlYUYFN8MRWlO9acj3Ozoo1ysLwZSPG3Pw3pZA>
	<xmx:qH3CWxVGoV1Jt9aB5fZr_qFuFEGfWBLr9E-uZdCRjb3czP0My6zHXg>
	<xmx:qH3CW_PSO4oyFP8jURXUSfwctHyOiXNflJ8MhtY4GS_ygBJI4e79Tg>
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:26 -0400
Message-Id: <20181013231933.28789-7-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 06/13] tests/fp: add fp-bench
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

These microbenchmarks will allow us to measure the performance impact of
FP emulation optimizations. Note that we can measure either the direct
impact on the softfloat functions (with "-t soft") or the impact on an
emulated workload (call with "-t host" and run under qemu user-mode).

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
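Note (below the cut, not part of the commit message): fp-bench reports
throughput as completed operations divided by elapsed nanoseconds, scaled
by 1e3 to give millions of operations per second. A minimal sketch of that
conversion follows; the helper name ops_to_mflops is hypothetical and does
not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): convert an operation count
 * and an elapsed time in nanoseconds into MFlops.
 * ops/ns is Gops/s, so multiplying by 1e3 yields Mops/s.
 */
static double ops_to_mflops(uint64_t n_ops, int64_t ns_elapsed)
{
    assert(ns_elapsed > 0);
    return (double)n_ops / (double)ns_elapsed * 1e3;
}
```

For instance, 50000 operations completed in 100000 ns correspond to 500
MFlops under this definition.
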
 tests/fp/fp-bench.c | 630 ++++++++++++++++++++++++++++++++++++++++++++
 tests/fp/.gitignore |   1 +
 tests/fp/Makefile   |   5 +-
 3 files changed, 635 insertions(+), 1 deletion(-)
 create mode 100644 tests/fp/fp-bench.c

diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c
new file mode 100644
index 0000000000..f5bc5edebf
--- /dev/null
+++ b/tests/fp/fp-bench.c
@@ -0,0 +1,630 @@
+/*
+ * fp-bench.c - A collection of simple floating point microbenchmarks.
+ *
+ * Copyright (C) 2018, Emilio G. Cota <cota@braap.org>
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#ifndef HW_POISON_H
+#error Must define HW_POISON_H to work around TARGET_* poisoning
+#endif
+
+#include "qemu/osdep.h"
+#include <math.h>
+#include <fenv.h>
+#include "qemu/timer.h"
+#include "fpu/softfloat.h"
+
+/* amortize the computation of random inputs */
+#define OPS_PER_ITER     50000
+
+#define MAX_OPERANDS 3
+
+#define SEED_A 0xdeadfacedeadface
+#define SEED_B 0xbadc0feebadc0fee
+#define SEED_C 0xbeefdeadbeefdead
+
+enum op {
+    OP_ADD,
+    OP_SUB,
+    OP_MUL,
+    OP_DIV,
+    OP_FMA,
+    OP_SQRT,
+    OP_CMP,
+    OP_MAX_NR,
+};
+
+static const char * const op_names[] =3D {
+    [OP_ADD] =3D "add",
+    [OP_SUB] =3D "sub",
+    [OP_MUL] =3D "mul",
+    [OP_DIV] =3D "div",
+    [OP_FMA] =3D "mulAdd",
+    [OP_SQRT] =3D "sqrt",
+    [OP_CMP] =3D "cmp",
+    [OP_MAX_NR] =3D NULL,
+};
+
+enum precision {
+    PREC_SINGLE,
+    PREC_DOUBLE,
+    PREC_FLOAT32,
+    PREC_FLOAT64,
+    PREC_MAX_NR,
+};
+
+enum rounding {
+    ROUND_EVEN,
+    ROUND_ZERO,
+    ROUND_DOWN,
+    ROUND_UP,
+    ROUND_TIEAWAY,
+    N_ROUND_MODES,
+};
+
+static const char * const round_names[] =3D {
+    [ROUND_EVEN] =3D "even",
+    [ROUND_ZERO] =3D "zero",
+    [ROUND_DOWN] =3D "down",
+    [ROUND_UP] =3D "up",
+    [ROUND_TIEAWAY] =3D "tieaway",
+};
+
+enum tester {
+    TESTER_SOFT,
+    TESTER_HOST,
+    TESTER_MAX_NR,
+};
+
+static const char * const tester_names[] =3D {
+    [TESTER_SOFT] =3D "soft",
+    [TESTER_HOST] =3D "host",
+    [TESTER_MAX_NR] =3D NULL,
+};
+
+union fp {
+    float f;
+    double d;
+    float32 f32;
+    float64 f64;
+    uint64_t u64;
+};
+
+struct op_state;
+
+typedef float (*float_func_t)(const struct op_state *s);
+typedef double (*double_func_t)(const struct op_state *s);
+
+union fp_func {
+    float_func_t float_func;
+    double_func_t double_func;
+};
+
+typedef void (*bench_func_t)(void);
+
+struct op_desc {
+    const char * const name;
+};
+
+#define DEFAULT_DURATION_SECS 1
+
+static uint64_t random_ops[MAX_OPERANDS] =3D {
+    SEED_A, SEED_B, SEED_C,
+};
+static float_status soft_status;
+static enum precision precision;
+static enum op operation;
+static enum tester tester;
+static uint64_t n_completed_ops;
+static unsigned int duration =3D DEFAULT_DURATION_SECS;
+static int64_t ns_elapsed;
+/* disable optimizations with volatile */
+static volatile union fp res;
+
+/*
+ * From: https://en.wikipedia.org/wiki/Xorshift
+ * This is faster than rand_r(), and gives us a wider range (RAND_MAX is o=
nly
+ * guaranteed to be >=3D INT_MAX).
+ */
+static uint64_t xorshift64star(uint64_t x)
+{
+    x ^=3D x >> 12; /* a */
+    x ^=3D x << 25; /* b */
+    x ^=3D x >> 27; /* c */
+    return x * UINT64_C(2685821657736338717);
+}
+
+static void update_random_ops(int n_ops, enum precision prec)
+{
+    int i;
+
+    for (i =3D 0; i < n_ops; i++) {
+        uint64_t r =3D random_ops[i];
+
+        if (prec =3D=3D PREC_SINGLE || prec =3D=3D PREC_FLOAT32) {
+            do {
+                r =3D xorshift64star(r);
+            } while (!float32_is_normal(r));
+        } else if (prec =3D=3D PREC_DOUBLE || prec =3D=3D PREC_FLOAT64) {
+            do {
+                r =3D xorshift64star(r);
+            } while (!float64_is_normal(r));
+        } else {
+            g_assert_not_reached();
+        }
+        random_ops[i] =3D r;
+    }
+}
+
+static void fill_random(union fp *ops, int n_ops, enum precision prec,
+                        bool no_neg)
+{
+    int i;
+
+    for (i =3D 0; i < n_ops; i++) {
+        switch (prec) {
+        case PREC_SINGLE:
+        case PREC_FLOAT32:
+            ops[i].f32 =3D make_float32(random_ops[i]);
+            if (no_neg && float32_is_neg(ops[i].f32)) {
+                ops[i].f32 =3D float32_chs(ops[i].f32);
+            }
+            /* raise the exponent to limit the frequency of denormal resul=
ts */
+            ops[i].f32 |=3D 0x40000000;
+            break;
+        case PREC_DOUBLE:
+        case PREC_FLOAT64:
+            ops[i].f64 =3D make_float64(random_ops[i]);
+            if (no_neg && float64_is_neg(ops[i].f64)) {
+                ops[i].f64 =3D float64_chs(ops[i].f64);
+            }
+            /* raise the exponent to limit the frequency of denormal resul=
ts */
+            ops[i].f64 |=3D LIT64(0x4000000000000000);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
+/*
+ * The main benchmark function. Instead of (ab)using macros, we rely
+ * on the compiler to unfold this at compile-time.
+ */
+static void bench(enum precision prec, enum op op, int n_ops, bool no_neg)
+{
+    int64_t tf =3D get_clock() + duration * 1000000000LL;
+
+    while (get_clock() < tf) {
+        union fp ops[MAX_OPERANDS];
+        int64_t t0;
+        int i;
+
+        update_random_ops(n_ops, prec);
+        switch (prec) {
+        case PREC_SINGLE:
+            fill_random(ops, n_ops, prec, no_neg);
+            t0 =3D get_clock();
+            for (i =3D 0; i < OPS_PER_ITER; i++) {
+                float a =3D ops[0].f;
+                float b =3D ops[1].f;
+                float c =3D ops[2].f;
+
+                switch (op) {
+                case OP_ADD:
+                    res.f =3D a + b;
+                    break;
+                case OP_SUB:
+                    res.f =3D a - b;
+                    break;
+                case OP_MUL:
+                    res.f =3D a * b;
+                    break;
+                case OP_DIV:
+                    res.f =3D a / b;
+                    break;
+                case OP_FMA:
+                    res.f =3D fmaf(a, b, c);
+                    break;
+                case OP_SQRT:
+                    res.f =3D sqrtf(a);
+                    break;
+                case OP_CMP:
+                    res.u64 =3D isgreater(a, b);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+            }
+            break;
+        case PREC_DOUBLE:
+            fill_random(ops, n_ops, prec, no_neg);
+            t0 =3D get_clock();
+            for (i =3D 0; i < OPS_PER_ITER; i++) {
+                double a =3D ops[0].d;
+                double b =3D ops[1].d;
+                double c =3D ops[2].d;
+
+                switch (op) {
+                case OP_ADD:
+                    res.d =3D a + b;
+                    break;
+                case OP_SUB:
+                    res.d =3D a - b;
+                    break;
+                case OP_MUL:
+                    res.d =3D a * b;
+                    break;
+                case OP_DIV:
+                    res.d =3D a / b;
+                    break;
+                case OP_FMA:
+                    res.d =3D fma(a, b, c);
+                    break;
+                case OP_SQRT:
+                    res.d =3D sqrt(a);
+                    break;
+                case OP_CMP:
+                    res.u64 =3D isgreater(a, b);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+            }
+            break;
+        case PREC_FLOAT32:
+            fill_random(ops, n_ops, prec, no_neg);
+            t0 =3D get_clock();
+            for (i =3D 0; i < OPS_PER_ITER; i++) {
+                float32 a =3D ops[0].f32;
+                float32 b =3D ops[1].f32;
+                float32 c =3D ops[2].f32;
+
+                switch (op) {
+                case OP_ADD:
+                    res.f32 =3D float32_add(a, b, &soft_status);
+                    break;
+                case OP_SUB:
+                    res.f32 =3D float32_sub(a, b, &soft_status);
+                    break;
+                case OP_MUL:
+                    res.f32 =3D float32_mul(a, b, &soft_status);
+                    break;
+                case OP_DIV:
+                    res.f32 =3D float32_div(a, b, &soft_status);
+                    break;
+                case OP_FMA:
+                    res.f32 =3D float32_muladd(a, b, c, 0, &soft_status);
+                    break;
+                case OP_SQRT:
+                    res.f32 =3D float32_sqrt(a, &soft_status);
+                    break;
+                case OP_CMP:
+                    res.u64 =3D float32_compare_quiet(a, b, &soft_status);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+            }
+            break;
+        case PREC_FLOAT64:
+            fill_random(ops, n_ops, prec, no_neg);
+            t0 =3D get_clock();
+            for (i =3D 0; i < OPS_PER_ITER; i++) {
+                float64 a =3D ops[0].f64;
+                float64 b =3D ops[1].f64;
+                float64 c =3D ops[2].f64;
+
+                switch (op) {
+                case OP_ADD:
+                    res.f64 =3D float64_add(a, b, &soft_status);
+                    break;
+                case OP_SUB:
+                    res.f64 =3D float64_sub(a, b, &soft_status);
+                    break;
+                case OP_MUL:
+                    res.f64 =3D float64_mul(a, b, &soft_status);
+                    break;
+                case OP_DIV:
+                    res.f64 =3D float64_div(a, b, &soft_status);
+                    break;
+                case OP_FMA:
+                    res.f64 =3D float64_muladd(a, b, c, 0, &soft_status);
+                    break;
+                case OP_SQRT:
+                    res.f64 =3D float64_sqrt(a, &soft_status);
+                    break;
+                case OP_CMP:
+                    res.u64 =3D float64_compare_quiet(a, b, &soft_status);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        ns_elapsed +=3D get_clock() - t0;
+        n_completed_ops +=3D OPS_PER_ITER;
+    }
+}
+
+#define GEN_BENCH(name, type, prec, op, n_ops)          \
+    static void __attribute__((flatten)) name(void)     \
+    {                                                   \
+        bench(prec, op, n_ops, false);                  \
+    }
+
+#define GEN_BENCH_NO_NEG(name, type, prec, op, n_ops)   \
+    static void __attribute__((flatten)) name(void)     \
+    {                                                   \
+        bench(prec, op, n_ops, true);                   \
+    }
+
+#define GEN_BENCH_ALL_TYPES(opname, op, n_ops)                          \
+    GEN_BENCH(bench_ ## opname ## _float, float, PREC_SINGLE, op, n_ops) \
+    GEN_BENCH(bench_ ## opname ## _double, double, PREC_DOUBLE, op, n_ops)=
 \
+    GEN_BENCH(bench_ ## opname ## _float32, float32, PREC_FLOAT32, op, n_o=
ps) \
+    GEN_BENCH(bench_ ## opname ## _float64, float64, PREC_FLOAT64, op, n_o=
ps)
+
+GEN_BENCH_ALL_TYPES(add, OP_ADD, 2)
+GEN_BENCH_ALL_TYPES(sub, OP_SUB, 2)
+GEN_BENCH_ALL_TYPES(mul, OP_MUL, 2)
+GEN_BENCH_ALL_TYPES(div, OP_DIV, 2)
+GEN_BENCH_ALL_TYPES(fma, OP_FMA, 3)
+GEN_BENCH_ALL_TYPES(cmp, OP_CMP, 2)
+#undef GEN_BENCH_ALL_TYPES
+
+#define GEN_BENCH_ALL_TYPES_NO_NEG(name, op, n)                         \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float, float, PREC_SINGLE, op, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _double, double, PREC_DOUBLE, op, n=
) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float32, float32, PREC_FLOAT32, op=
, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float64, float64, PREC_FLOAT64, op=
, n)
+
+GEN_BENCH_ALL_TYPES_NO_NEG(sqrt, OP_SQRT, 1)
+#undef GEN_BENCH_ALL_TYPES_NO_NEG
+
+#undef GEN_BENCH_NO_NEG
+#undef GEN_BENCH
+
+#define GEN_BENCH_FUNCS(opname, op)                             \
+    [op] =3D {                                                    \
+        [PREC_SINGLE]    =3D bench_ ## opname ## _float,          \
+        [PREC_DOUBLE]    =3D bench_ ## opname ## _double,         \
+        [PREC_FLOAT32]   =3D bench_ ## opname ## _float32,        \
+        [PREC_FLOAT64]   =3D bench_ ## opname ## _float64,        \
+    }
+
+static const bench_func_t bench_funcs[OP_MAX_NR][PREC_MAX_NR] =3D {
+    GEN_BENCH_FUNCS(add, OP_ADD),
+    GEN_BENCH_FUNCS(sub, OP_SUB),
+    GEN_BENCH_FUNCS(mul, OP_MUL),
+    GEN_BENCH_FUNCS(div, OP_DIV),
+    GEN_BENCH_FUNCS(fma, OP_FMA),
+    GEN_BENCH_FUNCS(sqrt, OP_SQRT),
+    GEN_BENCH_FUNCS(cmp, OP_CMP),
+};
+
+#undef GEN_BENCH_FUNCS
+
+static void run_bench(void)
+{
+    bench_func_t f;
+
+    f =3D bench_funcs[operation][precision];
+    g_assert(f);
+    f();
+}
+
+/* @arr must be NULL-terminated */
+static int find_name(const char * const *arr, const char *name)
+{
+    int i;
+
+    for (i =3D 0; arr[i] !=3D NULL; i++) {
+        if (strcmp(name, arr[i]) =3D=3D 0) {
+            return i;
+        }
+    }
+    return -1;
+}
+
+static void usage_complete(int argc, char *argv[])
+{
+    gchar *op_list =3D g_strjoinv(", ", (gchar **)op_names);
+    gchar *tester_list =3D g_strjoinv(", ", (gchar **)tester_names);
+
+    fprintf(stderr, "Usage: %s [options]\n", argv[0]);
+    fprintf(stderr, "options:\n");
+    fprintf(stderr, " -d =3D duration, in seconds. Default: %d\n",
+            DEFAULT_DURATION_SECS);
+    fprintf(stderr, " -h =3D show this help message.\n");
+    fprintf(stderr, " -o =3D floating point operation (%s). Default: %s\n",
+            op_list, op_names[0]);
+    fprintf(stderr, " -p =3D floating point precision (single, double). "
+            "Default: single\n");
+    fprintf(stderr, " -r =3D rounding mode (even, zero, down, up, tieaway)=
. "
+            "Default: even\n");
+    fprintf(stderr, " -t =3D tester (%s). Default: %s\n",
+            tester_list, tester_names[0]);
+    fprintf(stderr, " -z =3D flush inputs to zero (soft tester only). "
+            "Default: disabled\n");
+    fprintf(stderr, " -Z =3D flush output to zero (soft tester only). "
+            "Default: disabled\n");
+
+    g_free(tester_list);
+    g_free(op_list);
+}
+
+static int round_name_to_mode(const char *name)
+{
+    int i;
+
+    for (i =3D 0; i < N_ROUND_MODES; i++) {
+        if (!strcmp(round_names[i], name)) {
+            return i;
+        }
+    }
+    return -1;
+}
+
+static void QEMU_NORETURN die_host_rounding(enum rounding rounding)
+{
+    fprintf(stderr, "fatal: '%s' rounding not supported on this host\n",
+            round_names[rounding]);
+    exit(EXIT_FAILURE);
+}
+
+static void set_host_precision(enum rounding rounding)
+{
+    int rhost;
+
+    switch (rounding) {
+    case ROUND_EVEN:
+        rhost =3D FE_TONEAREST;
+        break;
+    case ROUND_ZERO:
+        rhost =3D FE_TOWARDZERO;
+        break;
+    case ROUND_DOWN:
+        rhost =3D FE_DOWNWARD;
+        break;
+    case ROUND_UP:
+        rhost =3D FE_UPWARD;
+        break;
+    case ROUND_TIEAWAY:
+        die_host_rounding(rounding);
+        return;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (fesetround(rhost)) {
+        die_host_rounding(rounding);
+    }
+}
+
+static void set_soft_precision(enum rounding rounding)
+{
+    signed char mode;
+
+    switch (rounding) {
+    case ROUND_EVEN:
+        mode =3D float_round_nearest_even;
+        break;
+    case ROUND_ZERO:
+        mode =3D float_round_to_zero;
+        break;
+    case ROUND_DOWN:
+        mode =3D float_round_down;
+        break;
+    case ROUND_UP:
+        mode =3D float_round_up;
+        break;
+    case ROUND_TIEAWAY:
+        mode =3D float_round_ties_away;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    soft_status.float_rounding_mode =3D mode;
+}
+
+static void parse_args(int argc, char *argv[])
+{
+    int c;
+    int val;
+    int rounding =3D ROUND_EVEN;
+
+    for (;;) {
+        c =3D getopt(argc, argv, "d:ho:p:r:t:zZ");
+        if (c < 0) {
+            break;
+        }
+        switch (c) {
+        case 'd':
+            duration =3D atoi(optarg);
+            break;
+        case 'h':
+            usage_complete(argc, argv);
+            exit(EXIT_SUCCESS);
+        case 'o':
+            val =3D find_name(op_names, optarg);
+            if (val < 0) {
+                fprintf(stderr, "Unsupported op '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            operation =3D val;
+            break;
+        case 'p':
+            if (!strcmp(optarg, "single")) {
+                precision =3D PREC_SINGLE;
+            } else if (!strcmp(optarg, "double")) {
+                precision =3D PREC_DOUBLE;
+            } else {
+                fprintf(stderr, "Unsupported precision '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            break;
+        case 'r':
+            rounding =3D round_name_to_mode(optarg);
+            if (rounding < 0) {
+                fprintf(stderr, "fatal: invalid rounding mode '%s'\n", opt=
arg);
+                exit(EXIT_FAILURE);
+            }
+            break;
+        case 't':
+            val =3D find_name(tester_names, optarg);
+            if (val < 0) {
+                fprintf(stderr, "Unsupported tester '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            tester =3D val;
+            break;
+        case 'z':
+            soft_status.flush_inputs_to_zero =3D 1;
+            break;
+        case 'Z':
+            soft_status.flush_to_zero =3D 1;
+            break;
+        }
+    }
+
+    /* set precision and rounding mode based on the tester */
+    switch (tester) {
+    case TESTER_HOST:
+        set_host_precision(rounding);
+        break;
+    case TESTER_SOFT:
+        set_soft_precision(rounding);
+        switch (precision) {
+        case PREC_SINGLE:
+            precision =3D PREC_FLOAT32;
+            break;
+        case PREC_DOUBLE:
+            precision =3D PREC_FLOAT64;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void pr_stats(void)
+{
+    printf("%.2f MFlops\n", (double)n_completed_ops / ns_elapsed * 1e3);
+}
+
+int main(int argc, char *argv[])
+{
+    parse_args(argc, argv);
+    run_bench();
+    pr_stats();
+    return 0;
+}
diff --git a/tests/fp/.gitignore b/tests/fp/.gitignore
index 8d45d18ac4..704fd42992 100644
--- a/tests/fp/.gitignore
+++ b/tests/fp/.gitignore
@@ -1 +1,2 @@
 fp-test
+fp-bench
diff --git a/tests/fp/Makefile b/tests/fp/Makefile
index 49cdcd1bd2..5019dcdca0 100644
--- a/tests/fp/Makefile
+++ b/tests/fp/Makefile
@@ -553,7 +553,7 @@ TF_OBJS_LIB +=3D $(TF_OBJS_WRITECASE)
 TF_OBJS_LIB +=3D testLoops_common.o
 TF_OBJS_LIB +=3D $(TF_OBJS_TEST)
=20
-BINARIES :=3D fp-test$(EXESUF)
+BINARIES :=3D fp-test$(EXESUF) fp-bench$(EXESUF)
=20
 # everything depends on config-host.h because platform.h includes it
 all: $(BUILD_DIR)/config-host.h
@@ -590,10 +590,13 @@ $(TF_OBJS_LIB) slowfloat.o: %.o: $(TF_SOURCE_DIR)/%.c
=20
 libtestfloat.a: $(TF_OBJS_LIB)
=20
+fp-bench$(EXESUF): fp-bench.o $(QEMU_SOFTFLOAT_OBJ) $(LIBQEMUUTIL)
+
 clean:
 	rm -f *.o *.d $(BINARIES)
 	rm -f *.gcno *.gcda *.gcov
 	rm -f fp-test$(EXESUF)
+	rm -f fp-bench$(EXESUF)
 	rm -f libsoftfloat.a
 	rm -f libtestfloat.a
=20
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as
 permitted sender) client-ip=208.118.235.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
	spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted
 sender)  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by
 mx.zohomail.com
	with SMTPS id 1539473317695500.90541083685196;
 Sat, 13 Oct 2018 16:28:37 -0700 (PDT)
Received: from localhost ([::1]:46578 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1gBTKf-0000Fn-0B
	for importer@patchew.org; Sat, 13 Oct 2018 19:28:29 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57890)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCf-0002TM-MQ
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCd-0007xT-6K
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:13 -0400
Received: from out3-smtp.messagingengine.com ([66.111.4.27]:54593)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <cota@braap.org>) id 1gBTCc-0007wM-UW
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:11 -0400
Received: from compute4.internal (compute4.nyi.internal [10.202.2.44])
	by mailout.nyi.internal (Postfix) with ESMTP id CEE5B21687;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
Received: from mailfrontend2 ([10.202.2.163])
	by compute4.internal (MEProxy); Sat, 13 Oct 2018 19:20:08 -0400
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id 75272102DE;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=
	from:to:cc:subject:date:message-id:in-reply-to:references; s=
	mesmtp; bh=LlsUGH0GcahW+UaUdedVEmqCjY1GIR0KW6PVdE8Ag8Q=; b=BDbN7
	F2J5iOigyInO6+uPKCYCgvR6uj0kxGFI219i3V7a2XpHKNmnPSHmjGh16T11BTzb
	hwBVi5wvSo9fH6THHRGcDQ/0xifeHW+SR6023+l9dtEdjOMqABShjBg0UgvO1lwx
	PEkjJGso4TKm1y68HYAaQHbTk9XOLW8mhtNjWE=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:date:from:in-reply-to:message-id
	:references:subject:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm1; bh=LlsUGH0GcahW+UaUdedVEmqCjY1GI
	R0KW6PVdE8Ag8Q=; b=NdLiD4sD4nATBVOEJJ8YMC6QApOwdiLXXBri6VsLtPvXf
	94mUbG+xHJxyFmkXFh7J9NXlT4kNDsWPOOY/w6RVk7Krx4ETS1VjcHpTREYEfmfi
	PGa1r++AWubXuZnBN5ixdnonKOrJfb5KWOQd2Rljz6LaAROyo/+AVP57lHNiLSX5
	sHywrYDsjU/8WIDFUIWKjlniHcbiGDdyxeoxYclU+TY4JkaIgjoq4CZmVV7bGeRg
	wpXPXrz4/ztw6vmmRhjRBLPIgM60dwN5w1BBT8lUrrJIOgLjGkQSdixxo/f/1odW
	QoCl5TjBjA7BnufLLg97wGodoKhGk1+ApH9fvAbyQ==
X-ME-Sender: <xms:qH3CW7c-mRXlKX_dnEidJETviujtVDxs01akOVIWq_p5Iss5vlHjmg>
X-ME-Proxy: <xmx:qH3CWzOyz0UzGY5NeP6DGjgzhRsNpbpDm4WAYafGZUIHsah-T1QzSg>
	<xmx:qH3CWxpsK2sAFD8oUxEqBpCaKcrbVcCuE8yPcG2IX84mb9g6LhBgFw>
	<xmx:qH3CW9Vw5s4ouny5po--vG4RPwXyZsw6JoN-os51XIg44fvCI9nmRg>
	<xmx:qH3CW1JIUpcBDG1ZxC2YGQrinSQ4zUdZr8SGyHcLptudbvhEAZPWaA>
	<xmx:qH3CWw-U0sandcrweyOjbyJg7inQ29IOysyL2CRWEfg5gFDDwYhpbA>
	<xmx:qH3CWxJRodgtpRV5PyUqKtxseF4B5ub4YZbV6NaN6_UBBykQPQWy6w>
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:27 -0400
Message-Id: <20181013231933.28789-8-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 07/13] fpu: introduce hardfloat
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch paves the way for leveraging the host FPU for a subset
of guest FP operations. For most guest workloads (e.g. FP flags
aren't ever cleared, inexact occurs often and rounding is set to the
default [to nearest]) this will yield sizable performance speedups.

The approach followed here avoids checking the FP exception flags register.
See the added comment for details.

This assumes that QEMU is running on an IEEE754-compliant FPU and
that the rounding is set to the default (to nearest). The
implementation-dependent specifics of the FPU should not matter; things
like tininess detection and snan representation are still dealt with in
soft-fp. However, this approach will break on most hosts if we compile
QEMU with flags such as -ffast-math. We control the flags so this should
be easy to enforce though.

This patch just adds common code. Some operations will be migrated
to hardfloat in subsequent patches to ease bisection.

Note: some architectures (at least PPC, there might be others) clear
the status flags passed to softfloat before most FP operations. This
precludes the use of hardfloat, so to avoid introducing a performance
regression for those targets, we add a flag to disable hardfloat.
In the long run though it would be good to fix the targets so that
at least the inexact flag passed to softfloat is indeed sticky.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
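Note (below the cut, not part of the commit message): the fast path
described above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code added by this patch:
hard_add, soft_add, is_zero_or_normal and soft_calls are hypothetical
names, soft_add is only a stand-in for the real softfloat slow path, and
the sketch assumes the host rounds to nearest.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for the softfloat slow path. A real one would
 * also compute the guest exception flags; this one only counts calls. */
static int soft_calls;

static float soft_add(float a, float b)
{
    soft_calls++;
    return a + b;
}

/* True if x is zero or a normal number (not denormal, inf or NaN). */
static bool is_zero_or_normal(float x)
{
    int c = fpclassify(x);

    return c == FP_ZERO || c == FP_NORMAL;
}

/*
 * Sketch of the fast path: use the host FPU only when both inputs are
 * zero or normal and the guest's inexact flag is already set, then
 * fall back when the result overflowed or may have underflowed.
 */
static float hard_add(float a, float b, bool inexact_already_set)
{
    float r;

    if (!inexact_already_set ||
        !is_zero_or_normal(a) || !is_zero_or_normal(b)) {
        return soft_add(a, b);
    }
    r = a + b;
    if (isinf(r)) {
        return soft_add(a, b);  /* overflow: slow path sets the flags */
    }
    if (fabsf(r) <= FLT_MIN && !(a == 0.0f && b == 0.0f)) {
        return soft_add(a, b);  /* tiny result not known to be exact */
    }
    return r;
}
```

The point of the structure is that no host exception flag register is
read: every case where a flag other than inexact could be raised is
routed to the slow path based on the operands and the result alone.
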
 fpu/softfloat.c | 341 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 341 insertions(+)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 0cbb08be32..81d06548b5 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -83,6 +83,7 @@ this code that are retained.
  * target-dependent and needs the TARGET_* macros.
  */
 #include "qemu/osdep.h"
+#include <math.h>
 #include "qemu/bitops.h"
 #include "fpu/softfloat.h"
=20
@@ -95,6 +96,346 @@ this code that are retained.
 *-------------------------------------------------------------------------=
---*/
 #include "fpu/softfloat-macros.h"
=20
+/*
+ * Hardfloat
+ *
+ * Fast emulation of guest FP instructions is challenging for two reasons.
+ * First, FP instruction semantics are similar but not identical, particul=
arly
+ * when handling NaNs. Second, emulating at reasonable speed the guest FP
+ * exception flags is not trivial: reading the host's flags register with a
+ * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp=
],
+ * and trapping on every FP exception is neither fast nor pleasant.
+ *
+ * We address these challenges by leveraging the host FPU for a subset of =
the
+ * operations. To do this we expand on the idea presented in this paper:
+ *
+ * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in=
 a
+ * binary translator." Software: Practice and Experience 46.12 (2016):1591=
-1615.
+ *
+ * The idea is thus to leverage the host FPU to (1) compute FP operations
+ * and (2) identify whether FP exceptions occurred while avoiding
+ * expensive exception flag register accesses.
+ *
+ * An important optimization shown in the paper is that given that excepti=
on
+ * flags are rarely cleared by the guest, we can avoid recomputing some fl=
ags.
+ * This is particularly useful for the inexact flag, which is very frequen=
tly
+ * raised in floating-point workloads.
+ *
+ * We optimize the code further by deferring to soft-fp whenever FP except=
ion
+ * detection might get hairy. Two examples: (1) when at least one operand =
is
+ * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 r=
esult
+ * and the result is < the minimum normal.
+ */
+#define GEN_TYPE_CONV(name, to_t, from_t)       \
+    static inline to_t name(from_t a)           \
+    {                                           \
+        to_t r =3D *(to_t *)&a;                   \
+        return r;                               \
+    }
+
+GEN_TYPE_CONV(float32_to_float, float, float32)
+GEN_TYPE_CONV(float64_to_double, double, float64)
+GEN_TYPE_CONV(float_to_float32, float32, float)
+GEN_TYPE_CONV(double_to_float64, float64, double)
+#undef GEN_TYPE_CONV
+
+#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t)                          \
+    static inline void name(soft_t *a, float_status *s)                 \
+    {                                                                   \
+        if (unlikely(soft_t ## _is_denormal(*a))) {                     \
+            *a =3D soft_t ## _set_sign(soft_t ## _zero,                   \
+                                     soft_t ## _is_neg(*a));            \
+            s->float_exception_flags |=3D float_flag_input_denormal;      \
+        }                                                               \
+    }
+
+GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32)
+GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64)
+#undef GEN_INPUT_FLUSH__NOCHECK
+
+#define GEN_INPUT_FLUSH1(name, soft_t)                  \
+    static inline void name(soft_t *a, float_status *s) \
+    {                                                   \
+        if (likely(!s->flush_inputs_to_zero)) {         \
+            return;                                     \
+        }                                               \
+        soft_t ## _input_flush__nocheck(a, s);          \
+    }
+
+GEN_INPUT_FLUSH1(float32_input_flush1, float32)
+GEN_INPUT_FLUSH1(float64_input_flush1, float64)
+#undef GEN_INPUT_FLUSH1
+
+#define GEN_INPUT_FLUSH2(name, soft_t)                                  \
+    static inline void name(soft_t *a, soft_t *b, float_status *s)      \
+    {                                                                   \
+        if (likely(!s->flush_inputs_to_zero)) {                         \
+            return;                                                     \
+        }                                                               \
+        soft_t ## _input_flush__nocheck(a, s);                          \
+        soft_t ## _input_flush__nocheck(b, s);                          \
+    }
+
+GEN_INPUT_FLUSH2(float32_input_flush2, float32)
+GEN_INPUT_FLUSH2(float64_input_flush2, float64)
+#undef GEN_INPUT_FLUSH2
+
+#define GEN_INPUT_FLUSH3(name, soft_t)                                  \
+    static inline void name(soft_t *a, soft_t *b, soft_t *c, float_status =
*s) \
+    {                                                                   \
+        if (likely(!s->flush_inputs_to_zero)) {                         \
+            return;                                                     \
+        }                                                               \
+        soft_t ## _input_flush__nocheck(a, s);                          \
+        soft_t ## _input_flush__nocheck(b, s);                          \
+        soft_t ## _input_flush__nocheck(c, s);                          \
+    }
+
+GEN_INPUT_FLUSH3(float32_input_flush3, float32)
+GEN_INPUT_FLUSH3(float64_input_flush3, float64)
+#undef GEN_INPUT_FLUSH3
+
+static inline bool can_use_fpu(const float_status *s)
+{
+    return likely(s->float_exception_flags & float_flag_inexact &&
+                  s->float_rounding_mode =3D=3D float_round_nearest_even);
+}
+
+/*
+ * Choose whether to use fpclassify or float32/64_* primitives in the gene=
rated
+ * hardfloat functions. Each combination of number of inputs and float size
+ * gets its own value.
+ */
+#if defined(__x86_64__)
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 1
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 1
+#else
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 0
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 0
+#endif
+
+/*
+ * QEMU_HARDFLOAT_USE_ISINF chooses whether to use isinf() over
+ * float{32,64}_is_infinity when !USE_FP.
+ * On x86_64/aarch64, using the former over the latter can yield a ~6% spe=
edup.
+ * On power64 however, using isinf() reduces fp-bench performance by up to=
 50%.
+ */
+#if defined(__x86_64__) || defined(__aarch64__)
+# define QEMU_HARDFLOAT_USE_ISINF   1
+#else
+# define QEMU_HARDFLOAT_USE_ISINF   0
+#endif
+
+/*
+ * Some targets clear the FP flags before most FP operations. This prevents
+ * the use of hardfloat, since hardfloat relies on the inexact flag being
+ * already set.
+ */
+#if defined(TARGET_PPC)
+# define QEMU_NO_HARDFLOAT 1
+# define QEMU_SOFTFLOAT_ATTR __attribute__((flatten))
+#else
+# define QEMU_NO_HARDFLOAT 0
+# define QEMU_SOFTFLOAT_ATTR __attribute__((flatten, noinline))
+#endif
+
+/*
+ * Hardfloat generation functions. Each operation can have two flavors:
+ * either using softfloat primitives (e.g. float32_is_zero_or_normal) for
+ * most condition checks, or native ones (e.g. fpclassify).
+ *
+ * The flavor is chosen by the callers. Instead of using macros, we rely o=
n the
+ * compiler to propagate constants and inline everything into the callers.
+ *
+ * We only generate functions for operations with two inputs, since only
+ * these are common enough to justify consolidating them into common code.
+ */
+typedef bool (*f32_check_func_t)(float32 a, float32 b, const float_status =
*s);
+typedef bool (*f64_check_func_t)(float64 a, float64 b, const float_status =
*s);
+typedef bool (*float_check_func_t)(float a, float b, const float_status *s=
);
+typedef bool (*double_check_func_t)(double a, double b, const float_status=
 *s);
+
+typedef float32 (*f32_op2_func_t)(float32 a, float32 b, float_status *s);
+typedef float64 (*f64_op2_func_t)(float64 a, float64 b, float_status *s);
+typedef float (*float_op2_func_t)(float a, float b);
+typedef double (*double_op2_func_t)(double a, double b);
+
+/* 2-input is-zero-or-normal */
+static inline bool
+f32_is_zon2(float32 a, float32 b, const struct float_status *s)
+{
+    return likely(float32_is_zero_or_normal(a) &&
+                  float32_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+float_is_zon2(float a, float b, const struct float_status *s)
+{
+    return likely((fpclassify(a) =3D=3D FP_NORMAL || fpclassify(a) =3D=3D =
FP_ZERO) &&
+                  (fpclassify(b) =3D=3D FP_NORMAL || fpclassify(b) =3D=3D =
FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+f64_is_zon2(float64 a, float64 b, const struct float_status *s)
+{
+    return likely(float64_is_zero_or_normal(a) &&
+                  float64_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+double_is_zon2(double a, double b, const struct float_status *s)
+{
+    return likely((fpclassify(a) =3D=3D FP_NORMAL || fpclassify(a) =3D=3D =
FP_ZERO) &&
+                  (fpclassify(b) =3D=3D FP_NORMAL || fpclassify(b) =3D=3D =
FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+/*
+ * Note: @fast and @post can be NULL.
+ * Note: @fast and @fast_op always use softfloat types.
+ */
+static inline float32
+f32_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+         f32_op2_func_t soft, f32_check_func_t pre, f32_check_func_t post,
+         f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast !=3D NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float ha =3D float32_to_float(a);
+            float hb =3D float32_to_float(b);
+            float hr =3D hard(ha, hb);
+            float32 r =3D float_to_float32(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float32_is_infinity(r))) {
+                s->float_exception_flags |=3D float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <=3D FLT_MIN &&
+                                (post =3D=3D NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float32
+float_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+           f32_op2_func_t soft, float_check_func_t pre, float_check_func_t=
 post,
+           f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    float ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    ha =3D float32_to_float(a);
+    hb =3D float32_to_float(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast !=3D NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float hr =3D hard(ha, hb);
+            float32 r =3D float_to_float32(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |=3D float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <=3D FLT_MIN &&
+                                (post =3D=3D NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+f64_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+         f64_op2_func_t soft, f64_check_func_t pre, f64_check_func_t post,
+         f64_check_func_t fast, f64_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast !=3D NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double ha =3D float64_to_double(a);
+            double hb =3D float64_to_double(b);
+            double hr =3D hard(ha, hb);
+            float64 r =3D double_to_float64(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float64_is_infinity(r))) {
+                s->float_exception_flags |=3D float_flag_overflow;
+            } else if (unlikely(fabs(hr) <=3D DBL_MIN &&
+                                (post =3D=3D NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+double_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+            f64_op2_func_t soft, double_check_func_t pre,
+            double_check_func_t post, f64_check_func_t fast,
+            f64_op2_func_t fast_op)
+{
+    double ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    ha =3D float64_to_double(a);
+    hb =3D float64_to_double(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast !=3D NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double hr =3D hard(ha, hb);
+            float64 r =3D double_to_float64(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |=3D float_flag_overflow;
+            } else if (unlikely(fabs(hr) <=3D DBL_MIN &&
+                                (post =3D=3D NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
 /*------------------------------------------------------------------------=
----
 | Returns the fraction bits of the half-precision floating-point value `a'.
 *-------------------------------------------------------------------------=
---*/
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as
 permitted sender) client-ip=208.118.235.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
	spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted
 sender)  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by
 mx.zohomail.com
	with SMTPS id 1539473317570554.9660503344707;
 Sat, 13 Oct 2018 16:28:37 -0700 (PDT)
Received: from localhost ([::1]:46579 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1gBTKd-0000Fs-Mn
	for importer@patchew.org; Sat, 13 Oct 2018 19:28:27 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57941)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCn-0002fO-Kt
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:24 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCd-0007xa-81
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:16 -0400
Received: from out3-smtp.messagingengine.com ([66.111.4.27]:41721)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <cota@braap.org>) id 1gBTCd-0007wO-0a
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:11 -0400
Received: from compute4.internal (compute4.nyi.internal [10.202.2.44])
	by mailout.nyi.internal (Postfix) with ESMTP id 265B821A29;
	Sat, 13 Oct 2018 19:20:09 -0400 (EDT)
Received: from mailfrontend2 ([10.202.2.163])
	by compute4.internal (MEProxy); Sat, 13 Oct 2018 19:20:09 -0400
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id A8697102A0;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=
	from:to:cc:subject:date:message-id:in-reply-to:references; s=
	mesmtp; bh=QQHyvpjd7NFW2KQnJQFoSPenvfKqrNScxjJXfUrF0kA=; b=em6/P
	VO0qHng82zosZY644H67THJzf54UFyebO7pmf9BAl7JTXGGjC9LtBUvDqjJgSRUQ
	oMEBCDQGcTvk6lis7YHQSl/zZYN6rmmS5vbecdwPmFaSt+Zf8MaFwXp6BPifUMvg
	Eb19KhwtqVZV2tYkm6f8S3P3jfD9D11xapqwWw=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:date:from:in-reply-to:message-id
	:references:subject:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm1; bh=QQHyvpjd7NFW2KQnJQFoSPenvfKqr
	NScxjJXfUrF0kA=; b=FBjSnmCVJ06BLq7q+D8t2sAhVCzThKCsKSjawY0USNubg
	z3ThjF2dan7OHJRJyRF6ttvospuzLsVujb3RBEHGl0RxRCwLqy29W5wVEhlkt6yD
	/aKC/l+gl3/KcKX2YJ8YMyPEFoXXecmAOdfe7ZaKKKl8aASPuEapDHJSXMU9v38/
	dN93dalSOkV1Wy6Y2hnDih9DucZ9Al+SbHh8f5wah3tFpQ5DIyAtWRuF1jLTHCW4
	ZjNhfaCNEZGyyz2ydK6vsfj6q1YTedPmH5jet6MlEq3qtzZmliTkltH/bs9IjmM1
	ctSwPngTMsflp0htcWw0mZkKewfLnQXrKbEAgJ8oA==
X-ME-Sender: <xms:qH3CW1YN8fpBGHRaAaMbayNs6EaT4d8E8n1Vylx6Cq3neeQ35K6hDg>
X-ME-Proxy: <xmx:qH3CW6V2FjmMewbTCP_B5qjb04Vc-jE5gret4kGYwtqD2xAOgiTk3g>
	<xmx:qH3CW9-CMUG1jnztdb4SU23R5MQObH4QTHCFQZKFfGhPqwcBaQu6wg>
	<xmx:qX3CW4s6LWyWfwk7VkGlZZBajdmwsyAwhRb3a8f0l0L5-ddgGWJTJw>
	<xmx:qX3CW3UMa1Eodkw34S0YwTFlUQzF4YqJocw4hCPweBGuGCxWyf_4FQ>
	<xmx:qX3CW4UCH6gTsSMarNIctZS58yM4cwxCH6IBbnrNxSZqKfm7scuJeA>
	<xmx:qX3CW4SsOqWr5WxCRTf2Q8SCs8kKfSkhYDFhQNwvPx3Q6S9-_ejlQw>
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:28 -0400
Message-Id: <20181013231933.28789-9-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 08/13] hardfloat: implement float32/64
 addition and subtraction
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results (single and double precision) for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
add-single: 135.07 MFlops
add-double: 131.60 MFlops
sub-single: 130.04 MFlops
sub-double: 133.01 MFlops
- after:
add-single: 443.04 MFlops
add-double: 301.95 MFlops
sub-single: 411.36 MFlops
sub-double: 293.15 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
add-single: 44.79 MFlops
add-double: 49.20 MFlops
sub-single: 44.55 MFlops
sub-double: 49.06 MFlops
- after:
add-single: 93.28 MFlops
add-double: 88.27 MFlops
sub-single: 91.47 MFlops
sub-double: 88.27 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
add-single: 72.59 MFlops
add-double: 72.27 MFlops
sub-single: 75.33 MFlops
sub-double: 70.54 MFlops
- after:
add-single: 112.95 MFlops
add-double: 201.11 MFlops
sub-single: 116.80 MFlops
sub-double: 188.72 MFlops

Note that the IBM and ARM machines benefit from having
QEMU_HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance
can suffer significantly ([1] and [0] denote the value of the flag):
- IBM Power8:
add-single: [1] 54.94 vs [0] 116.37 MFlops
add-double: [1] 58.92 vs [0] 201.44 MFlops
- Aarch64 A57:
add-single: [1] 80.72 vs [0] 93.24 MFlops
add-double: [1] 82.10 vs [0] 88.18 MFlops

On the Intel machine, having 2F64 set to 1 pays off, but it
doesn't for 2F32:
- Intel i7-6700K:
add-single: [1] 285.79 vs [0] 426.70 MFlops
add-double: [1] 302.15 vs [0] 278.82 MFlops

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 106 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 98 insertions(+), 8 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 81d06548b5..d5d1c555dc 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1077,8 +1077,8 @@ float16  __attribute__((flatten)) float16_add(float16=
 a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
=20
-float32 __attribute__((flatten)) float32_add(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_add(float32 a, float32 b, float_status *status)
 {
     FloatParts pa =3D float32_unpack_canonical(a, status);
     FloatParts pb =3D float32_unpack_canonical(b, status);
@@ -1087,8 +1087,8 @@ float32 __attribute__((flatten)) float32_add(float32 =
a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
=20
-float64 __attribute__((flatten)) float64_add(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_add(float64 a, float64 b, float_status *status)
 {
     FloatParts pa =3D float64_unpack_canonical(a, status);
     FloatParts pb =3D float64_unpack_canonical(b, status);
@@ -1107,8 +1107,8 @@ float16 __attribute__((flatten)) float16_sub(float16 =
a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
=20
-float32 __attribute__((flatten)) float32_sub(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_sub(float32 a, float32 b, float_status *status)
 {
     FloatParts pa =3D float32_unpack_canonical(a, status);
     FloatParts pb =3D float32_unpack_canonical(b, status);
@@ -1117,8 +1117,8 @@ float32 __attribute__((flatten)) float32_sub(float32 =
a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
=20
-float64 __attribute__((flatten)) float64_sub(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_sub(float64 a, float64 b, float_status *status)
 {
     FloatParts pa =3D float64_unpack_canonical(a, status);
     FloatParts pb =3D float64_unpack_canonical(b, status);
@@ -1127,6 +1127,96 @@ float64 __attribute__((flatten)) float64_sub(float64=
 a, float64 b,
     return float64_round_pack_canonical(pr, status);
 }
=20
+static float float_add(float a, float b)
+{
+    return a + b;
+}
+
+static float float_sub(float a, float b)
+{
+    return a - b;
+}
+
+static double double_add(double a, double b)
+{
+    return a + b;
+}
+
+static double double_sub(double a, double b)
+{
+    return a - b;
+}
+
+static bool f32_addsub_post(float32 a, float32 b, const struct float_statu=
s *s)
+{
+    return !(float32_is_zero(a) && float32_is_zero(b));
+}
+
+static bool
+float_addsub_post(float a, float b, const struct float_status *s)
+{
+    return !(fpclassify(a) =3D=3D FP_ZERO && fpclassify(b) =3D=3D FP_ZERO);
+}
+
+static bool f64_addsub_post(float64 a, float64 b, const struct float_statu=
s *s)
+{
+    return !(float64_is_zero(a) && float64_is_zero(b));
+}
+
+static bool
+double_addsub_post(double a, double b, const struct float_status *s)
+{
+    return !(fpclassify(a) =3D=3D FP_ZERO && fpclassify(b) =3D=3D FP_ZERO);
+}
+
+static float32 float32_addsub(float32 a, float32 b, float_status *s,
+                              float_op2_func_t hard, f32_op2_func_t soft)
+{
+    if (QEMU_HARDFLOAT_2F32_USE_FP) {
+        return float_gen2(a, b, s, hard, soft, float_is_zon2, float_addsub=
_post,
+                          NULL, NULL);
+    } else {
+        return f32_gen2(a, b, s, hard, soft, f32_is_zon2, f32_addsub_post,
+                        NULL, NULL);
+    }
+}
+
+static float64 float64_addsub(float64 a, float64 b, float_status *s,
+                              double_op2_func_t hard, f64_op2_func_t soft)
+{
+    if (QEMU_HARDFLOAT_2F64_USE_FP) {
+        return double_gen2(a, b, s, hard, soft, double_is_zon2,
+                           double_addsub_post, NULL, NULL);
+    } else {
+        return f64_gen2(a, b, s, hard, soft, f64_is_zon2, f64_addsub_post,
+                        NULL, NULL);
+    }
+}
+
+float32 __attribute__((flatten))
+float32_add(float32 a, float32 b, float_status *s)
+{
+    return float32_addsub(a, b, s, float_add, soft_float32_add);
+}
+
+float32 __attribute__((flatten))
+float32_sub(float32 a, float32 b, float_status *s)
+{
+    return float32_addsub(a, b, s, float_sub, soft_float32_sub);
+}
+
+float64 __attribute__((flatten))
+float64_add(float64 a, float64 b, float_status *s)
+{
+    return float64_addsub(a, b, s, double_add, soft_float64_add);
+}
+
+float64 __attribute__((flatten))
+float64_sub(float64 a, float64 b, float_status *s)
+{
+    return float64_addsub(a, b, s, double_sub, soft_float64_sub);
+}
+
 /*
  * Returns the result of multiplying the floating-point values `a' and
  * `b'. The operation is performed according to the IEC/IEEE Standard
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as
 permitted sender) client-ip=208.118.235.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
	spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted
 sender)  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by
 mx.zohomail.com
	with SMTPS id 1539473147529878.5077194498397;
 Sat, 13 Oct 2018 16:25:47 -0700 (PDT)
Received: from localhost ([::1]:46563 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1gBTI2-0006Bd-8z
	for importer@patchew.org; Sat, 13 Oct 2018 19:25:46 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57880)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCf-0002TK-LP
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:14 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCd-0007xG-3w
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:13 -0400
Received: from out3-smtp.messagingengine.com ([66.111.4.27]:60699)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <cota@braap.org>) id 1gBTCc-0007wR-Sv
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:10 -0400
Received: from compute4.internal (compute4.nyi.internal [10.202.2.44])
	by mailout.nyi.internal (Postfix) with ESMTP id 508DF21BF8;
	Sat, 13 Oct 2018 19:20:09 -0400 (EDT)
Received: from mailfrontend2 ([10.202.2.163])
	by compute4.internal (MEProxy); Sat, 13 Oct 2018 19:20:09 -0400
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id E28F7102D5;
	Sat, 13 Oct 2018 19:20:08 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=
	from:to:cc:subject:date:message-id:in-reply-to:references; s=
	mesmtp; bh=W3FYcD7FhPYXsWu0AHdQnoS5W7vwioe/7XuCuQj6MjE=; b=Wk06H
	G6T293quSER96+V/JhahkZoYcNVv7WyrN1rPGvHhynJzcBNB8vZXFvsRiHecqMxk
	2K4lCs5lGjBxZawqswYWIS4VkOD3PKOSb+1XJOvCgYI5N8ktdYcq8gBgKD10U3m7
	L/M3NdZZU+aUBT8IFXsslkIY39UTHYQNinJtlo=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:date:from:in-reply-to:message-id
	:references:subject:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm1; bh=W3FYcD7FhPYXsWu0AHdQnoS5W7vwi
	oe/7XuCuQj6MjE=; b=mWiNGoyYlkEKXuO/TCJ1Y5O3l8a9mPUUV/rx4jJPqpCYh
	TOAw6ODVSPIpQxR/v+hpo3UcYLfrA6eveasfNgbBKRaMb9hx8Dr+623GuFHZ8Oum
	B7m2gy+P80gtyflvkmTDJft4a4bWdDpBBCPIH5zm1QXBGlhmpSHSyXK7VjqsAQat
	nEbr+xDg5faCuDCHlF9o7lOx8TLhL8trYTKxEFOCLzxKxorgy2Iw8+hvKd5/0MxK
	y/FGaRipHIQJlWLe/amXahAqw1l+Wbv0bknDrAFfeFupovZSHen5NbB7p7CtftPJ
	5I8h5fNgNpHIBv7haG5NvXmbkGRUFEsa7l+P+9gmQ==
X-ME-Sender: <xms:qX3CW1wH6JWOBGrjChTMWW-ivdyXqfYzBM3f-2HlKM6dNfH5XrwoJw>
X-ME-Proxy: <xmx:qX3CW2jhYRPsOezGXp5jdMlhv5YVcLbHsdylMLVS3vRFe-SG04WnNg>
	<xmx:qX3CW-EutUzygoa7KNp__sROWmHmjdjuMmG8-YE_L5Qc_cMwIrOs1w>
	<xmx:qX3CWyXN5VVWpmREDqpxLpfgO3Yb9nQ2JeQNU9xJwAJd2wA5UP1C-w>
	<xmx:qX3CWyBzyM1f8TxOBCel_9eqozrkg5E5dHvLsYuYdzlLT2ll1MrSWQ>
	<xmx:qX3CWzLe4uIthhIc-KbN4dylDMi1J84463hp_NAJd-YeIEpV_KUnxg>
	<xmx:qX3CW0hVyllPq2Zmr25V-aSXJP30mPtblWsW6TmiYPKvngFa8IKWEQ>
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:29 -0400
Message-Id: <20181013231933.28789-10-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 09/13] hardfloat: implement float32/64
 multiplication
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
mul-single: 126.91 MFlops
mul-double: 118.28 MFlops
- after:
mul-single: 258.02 MFlops
mul-double: 197.96 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
mul-single: 37.42 MFlops
mul-double: 38.77 MFlops
- after:
mul-single: 73.41 MFlops
mul-double: 76.93 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
mul-single: 58.40 MFlops
mul-double: 59.33 MFlops
- after:
mul-single: 60.25 MFlops
mul-double: 94.79 MFlops

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index d5d1c555dc..78837fa9d8 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1276,8 +1276,8 @@ float16 __attribute__((flatten)) float16_mul(float16 a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_mul(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_mul(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1286,8 +1286,8 @@ float32 __attribute__((flatten)) float32_mul(float32 a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_mul(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_mul(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1296,6 +1296,64 @@ float64 __attribute__((flatten)) float64_mul(float64 a, float64 b,
     return float64_round_pack_canonical(pr, status);
 }
 
+static float float_mul(float a, float b)
+{
+    return a * b;
+}
+
+static double double_mul(double a, double b)
+{
+    return a * b;
+}
+
+static bool f32_mul_fast(float32 a, float32 b, const struct float_status *s)
+{
+    return float32_is_zero(a) || float32_is_zero(b);
+}
+
+static bool f64_mul_fast(float64 a, float64 b, const struct float_status *s)
+{
+    return float64_is_zero(a) || float64_is_zero(b);
+}
+
+static float32 f32_mul_fast_op(float32 a, float32 b, float_status *s)
+{
+    bool signbit = float32_is_neg(a) ^ float32_is_neg(b);
+
+    return float32_set_sign(float32_zero, signbit);
+}
+
+static float64 f64_mul_fast_op(float64 a, float64 b, float_status *s)
+{
+    bool signbit = float64_is_neg(a) ^ float64_is_neg(b);
+
+    return float64_set_sign(float64_zero, signbit);
+}
+
+float32 __attribute__((flatten))
+float32_mul(float32 a, float32 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F32_USE_FP) {
+        return float_gen2(a, b, s, float_mul, soft_float32_mul, float_is_zon2,
+                          NULL, f32_mul_fast, f32_mul_fast_op);
+    } else {
+        return f32_gen2(a, b, s, float_mul, soft_float32_mul, f32_is_zon2, NULL,
+                        f32_mul_fast, f32_mul_fast_op);
+    }
+}
+
+float64 __attribute__((flatten))
+float64_mul(float64 a, float64 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F64_USE_FP) {
+        return double_gen2(a, b, s, double_mul, soft_float64_mul,
+                           double_is_zon2, NULL, f64_mul_fast, f64_mul_fast_op);
+    } else {
+        return f64_gen2(a, b, s, double_mul, soft_float64_mul, f64_is_zon2,
+                        NULL, f64_mul_fast, f64_mul_fast_op);
+    }
+}
+
 /*
  * Returns the result of multiplying the floating-point values `a' and
  * `b' then adding 'c', with no intermediate rounding step after the
-- 
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:30 -0400
Message-Id: <20181013231933.28789-11-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v5 10/13] hardfloat: implement float32/64
 division
Cc: Alex Bennée <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
div-single: 34.84 MFlops
div-double: 34.04 MFlops
- after:
div-single: 275.23 MFlops
div-double: 216.38 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
div-single: 9.33 MFlops
div-double: 9.30 MFlops
- after:
div-single: 51.55 MFlops
div-double: 15.09 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
div-single: 25.65 MFlops
div-double: 24.91 MFlops
- after:
div-single: 96.83 MFlops
div-double: 31.01 MFlops

Here setting QEMU_HARDFLOAT_2F64_USE_FP to 1 pays off for x86_64:
[1] 215.97 vs [0] 62.15 MFlops

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 2 deletions(-)
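
The pre/post checks for division can be read as follows: the host FPU is
only used when a is zero or normal and b is normal, and a tiny result is
only trusted when a is zero, since 0 / b is exactly zero and cannot
underflow. A minimal standalone sketch under those assumptions. The names
div_pre, div_post and hard_div_ok are hypothetical; how the *_gen2
generators from earlier in this series combine pre and post is assumed
here rather than shown in this patch, and the can_use_fpu() check is
left out:

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Mirrors float_div_pre, minus the can_use_fpu() status check. */
static bool div_pre(float a, float b)
{
    int ca = fpclassify(a);

    return (ca == FP_NORMAL || ca == FP_ZERO) && fpclassify(b) == FP_NORMAL;
}

/* Mirrors float_div_post: true when the dividend is nonzero. */
static bool div_post(float a)
{
    return fpclassify(a) != FP_ZERO;
}

/*
 * Hypothetical driver: returns true and stores a / b in *out when the host
 * result can be used directly, false when a soft-float fallback is needed.
 * Assumption: a finite result with magnitude <= FLT_MIN is only accepted
 * when the dividend was zero. An infinite result is accepted; a real
 * implementation would also raise the overflow flag for it.
 */
static bool hard_div_ok(float a, float b, float *out)
{
    float r, mag;

    if (!div_pre(a, b)) {
        return false;
    }
    r = a / b;
    mag = r < 0 ? -r : r;
    if (!isinf(r) && mag <= FLT_MIN && div_post(a)) {
        return false;   /* possible underflow: let soft-float decide */
    }
    *out = r;
    return true;
}
```

The check is deliberately conservative: a result exactly equal to FLT_MIN
also takes the slow path in this sketch.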

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 78837fa9d8..8ef0571c6e 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1678,7 +1678,8 @@ float16 float16_div(float16 a, float16 b, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 float32_div(float32 a, float32 b, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_div(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1687,7 +1688,8 @@ float32 float32_div(float32 a, float32 b, float_status *status)
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 float64_div(float64 a, float64 b, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_div(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1696,6 +1698,88 @@ float64 float64_div(float64 a, float64 b, float_status *status)
     return float64_round_pack_canonical(pr, status);
 }
 
+static float float_div(float a, float b)
+{
+    return a / b;
+}
+
+static double double_div(double a, double b)
+{
+    return a / b;
+}
+
+static bool f32_div_pre(float32 a, float32 b, const struct float_status *s)
+{
+    return likely(float32_is_zero_or_normal(a) &&
+                  float32_is_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static bool f64_div_pre(float64 a, float64 b, const struct float_status *s)
+{
+    return likely(float64_is_zero_or_normal(a) &&
+                  float64_is_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static bool float_div_pre(float a, float b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  fpclassify(b) == FP_NORMAL &&
+                  can_use_fpu(s));
+}
+
+static bool double_div_pre(double a, double b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  fpclassify(b) == FP_NORMAL &&
+                  can_use_fpu(s));
+}
+
+static bool f32_div_post(float32 a, float32 b, const struct float_status *s)
+{
+    return !float32_is_zero(a);
+}
+
+static bool f64_div_post(float64 a, float64 b, const struct float_status *s)
+{
+    return !float64_is_zero(a);
+}
+
+static bool float_div_post(float a, float b, const struct float_status *s)
+{
+    return fpclassify(a) != FP_ZERO;
+}
+
+static bool double_div_post(double a, double b, const struct float_status *s)
+{
+    return fpclassify(a) != FP_ZERO;
+}
+
+float32 __attribute__((flatten))
+float32_div(float32 a, float32 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F32_USE_FP) {
+        return float_gen2(a, b, s, float_div, soft_float32_div, float_div_pre,
+                          float_div_post, NULL, NULL);
+    } else {
+        return f32_gen2(a, b, s, float_div, soft_float32_div, f32_div_pre,
+                        f32_div_post, NULL, NULL);
+    }
+}
+
+float64 __attribute__((flatten))
+float64_div(float64 a, float64 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F64_USE_FP) {
+        return double_gen2(a, b, s, double_div, soft_float64_div,
+                           double_div_pre, double_div_post, NULL, NULL);
+    } else {
+        return f64_gen2(a, b, s, double_div, soft_float64_div, f64_div_pre,
+                        f64_div_post, NULL, NULL);
+    }
+}
+
 /*
  * Float to Float conversions
  *
-- 
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:31 -0400
Message-Id: <20181013231933.28789-12-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v5 11/13] hardfloat: implement float32/64 fused
 multiply-add
Cc: Alex Bennée <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
fma-single: 74.73 MFlops
fma-double: 74.54 MFlops
- after:
fma-single: 203.37 MFlops
fma-double: 169.37 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
fma-single: 23.24 MFlops
fma-double: 23.70 MFlops
- after:
fma-single: 66.14 MFlops
fma-double: 63.10 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
fma-single: 37.26 MFlops
fma-double: 37.29 MFlops
- after:
fma-single: 48.90 MFlops
fma-double: 59.51 MFlops

Here setting QEMU_HARDFLOAT_3F64_USE_FP to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 165 insertions(+), 4 deletions(-)
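
When a or b is zero and all inputs are zero or normal, the fused
multiply-add reduces to adding a signed zero to the addend, so no
underflow or overflow check is needed on that branch. A minimal standalone
sketch of just that branch. The function fma_zero_product and its boolean
parameters are hypothetical stand-ins for the float_muladd_negate_* flags:

```c
#include <math.h>
#include <stdbool.h>

/*
 * Computes a * b + c for the case where a or b is zero and all inputs are
 * zero or normal. The product is then an exact signed zero, so the result
 * is simply (+/-0) + c using the host's addition.
 */
static float fma_zero_product(float a, float b, float c,
                              bool negate_product, bool negate_c,
                              bool negate_result)
{
    /* Sign of the zero product: XOR of operand signs, optionally flipped. */
    bool prod_neg = (signbit(a) != 0) ^ (signbit(b) != 0) ^ negate_product;
    float p = prod_neg ? -0.0f : 0.0f;
    float r;

    if (negate_c) {
        c = -c;
    }
    r = p + c;
    return negate_result ? -r : r;
}
```

Letting the host add the signed zero keeps the sign of an all-zero result
consistent with the host's current rounding mode instead of hard-coding it.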

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 8ef0571c6e..1c1a42bf46 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1568,8 +1568,9 @@ float16 __attribute__((flatten)) float16_muladd(float16 a, float16 b, float16 c,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c,
-                                                int flags, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_muladd(float32 a, float32 b, float32 c, int flags,
+                    float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1579,8 +1580,9 @@ float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c,
-                                                int flags, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_muladd(float64 a, float64 b, float64 c, int flags,
+                    float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1590,6 +1592,165 @@ float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c,
     return float64_round_pack_canonical(pr, status);
 }
 
+/*
+ * FMA generator for softfloat-based condition checks.
+ *
+ * When (a || b) == 0, there's no need to check for under/over flow,
+ * since we know the addend is (normal || 0) and the product is 0.
+ */
+#define GEN_FMA_SF(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \
+    static soft_t                                                       \
+    name(soft_t a, soft_t b, soft_t c, int flags, float_status *s)      \
+    {                                                                   \
+        if (QEMU_NO_HARDFLOAT) {                                        \
+            goto soft;                                                  \
+        }                                                               \
+        soft_t ## _input_flush3(&a, &b, &c, s);                         \
+        if (likely(soft_t ## _is_zero_or_normal(a) &&                   \
+                   soft_t ## _is_zero_or_normal(b) &&                   \
+                   soft_t ## _is_zero_or_normal(c) &&                   \
+                   !(flags & float_muladd_halve_result) &&              \
+                   can_use_fpu(s))) {                                   \
+            if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) {       \
+                soft_t p, r;                                            \
+                host_t hp, hc, hr;                                      \
+                bool prod_sign;                                         \
+                                                                        \
+                prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \
+                prod_sign ^= !!(flags & float_muladd_negate_product);   \
+                p = soft_t ## _set_sign(soft_t ## _zero, prod_sign);    \
+                                                                        \
+                if (flags & float_muladd_negate_c) {                    \
+                    c = soft_t ## _chs(c);                              \
+                }                                                       \
+                                                                        \
+                hp = soft_t ## _to_ ## host_t(p);                       \
+                hc = soft_t ## _to_ ## host_t(c);                       \
+                hr = hp + hc;                                           \
+                r = host_t ## _to_ ## soft_t(hr);                       \
+                return flags & float_muladd_negate_result ?             \
+                    soft_t ## _chs(r) : r;                              \
+            } else {                                                    \
+                host_t ha, hb, hc, hr;                                  \
+                soft_t r;                                               \
+                soft_t sa = flags & float_muladd_negate_product ?       \
+                    soft_t ## _chs(a) : a;                              \
+                soft_t sc = flags & float_muladd_negate_c ?             \
+                    soft_t ## _chs(c) : c;                              \
+                                                                        \
+                ha = soft_t ## _to_ ## host_t(sa);                      \
+                hb = soft_t ## _to_ ## host_t(b);                       \
+                hc = soft_t ## _to_ ## host_t(sc);                      \
+                hr = host_fma_f(ha, hb, hc);                            \
+                r = host_t ## _to_ ## soft_t(hr);                       \
+                                                                        \
+                if (unlikely(isinf(hr))) {                              \
+                    s->float_exception_flags |= float_flag_overflow;    \
+                } else if (unlikely(host_abs_f(hr) <= min_normal)) {    \
+                    goto soft;                                          \
+                }                                                       \
+                return flags & float_muladd_negate_result ?             \
+                    soft_t ## _chs(r) : r;                              \
+            }                                                           \
+        }                                                               \
+    soft:                                                               \
+        return soft_ ## soft_t ## _muladd(a, b, c, flags, s);           \
+    }
+
+/* FMA generator for native floating point condition checks */
+#define GEN_FMA_FP(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \
+    static soft_t \
+    name(soft_t a, soft_t b, soft_t c, int flags, float_status *s)      \
+    {                                                                   \
+        host_t ha, hb, hc;                                              \
+                                                                        \
+        if (QEMU_NO_HARDFLOAT) {                                        \
+            goto soft;                                                  \
+        }                                                               \
+        soft_t ## _input_flush3(&a, &b, &c, s);                         \
+        ha = soft_t ## _to_ ## host_t(a);                               \
+        hb = soft_t ## _to_ ## host_t(b);                               \
+        hc = soft_t ## _to_ ## host_t(c);                               \
+        if (likely((fpclassify(ha) == FP_NORMAL ||                      \
+                    fpclassify(ha) == FP_ZERO) &&                       \
+                   (fpclassify(hb) == FP_NORMAL ||                      \
+                    fpclassify(hb) == FP_ZERO) &&                       \
+                   (fpclassify(hc) == FP_NORMAL ||                      \
+                    fpclassify(hc) == FP_ZERO) &&                       \
+                   !(flags & float_muladd_halve_result) &&              \
+                   can_use_fpu(s))) {                                   \
+            if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) {       \
+                soft_t p, r;                                            \
+                host_t hp, hc, hr;                                      \
+                bool prod_sign;                                         \
+                                                                        \
+                prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \
+                prod_sign ^= !!(flags & float_muladd_negate_product);   \
+                p = soft_t ## _set_sign(soft_t ## _zero, prod_sign);    \
+                                                                        \
+                if (flags & float_muladd_negate_c) {                    \
+                    c = soft_t ## _chs(c);                              \
+                }                                                       \
+                                                                        \
+                hp = soft_t ## _to_ ## host_t(p);                       \
+                hc = soft_t ## _to_ ## host_t(c);                       \
+                hr = hp + hc;                                           \
+                r = host_t ## _to_ ## soft_t(hr);                       \
+                return flags & float_muladd_negate_result ?             \
+                    soft_t ## _chs(r) : r;                              \
+            } else {                                                    \
+                host_t hr;                                              \
+                                                                        \
+                if (flags & float_muladd_negate_product) {              \
+                    ha = -ha;                                           \
+                }                                                       \
+                if (flags & float_muladd_negate_c) {                    \
+                    hc = -hc;                                           \
+                }                                                       \
+                hr = host_fma_f(ha, hb, hc);                            \
+                if (unlikely(isinf(hr))) {                              \
+                    s->float_exception_flags |= float_flag_overflow;    \
+                } else if (unlikely(host_abs_f(hr) <= min_normal)) {    \
+                    goto soft;                                          \
+                }                                                       \
+                if (flags & float_muladd_negate_result) {               \
+                    hr = -hr;                                           \
+                }                                                       \
+                return host_t ## _to_ ## soft_t(hr);                    \
+            }                                                           \
+        }                                                               \
+    soft:                                                               \
+        return soft_ ## soft_t ## _muladd(a, b, c, flags, s);           \
+    }
+
+GEN_FMA_SF(f32_muladd, float32, float, fmaf, fabsf, FLT_MIN)
+GEN_FMA_SF(f64_muladd, float64, double, fma, fabs, DBL_MIN)
+#undef GEN_FMA_SF
+
+GEN_FMA_FP(float_muladd, float32, float, fmaf, fabsf, FLT_MIN)
+GEN_FMA_FP(double_muladd, float64, double, fma, fabs, DBL_MIN)
+#undef GEN_FMA_FP
+
+float32 __attribute__((flatten))
+float32_muladd(float32 a, float32 b, float32 c, int flags, float_status *s)
+{
+    if (QEMU_HARDFLOAT_3F32_USE_FP) {
+        return float_muladd(a, b, c, flags, s);
+    } else {
+        return f32_muladd(a, b, c, flags, s);
+    }
+}
+
+float64 __attribute__((flatten))
+float64_muladd(float64 a, float64 b, float64 c, int flags, float_status *s)
+{
+    if (QEMU_HARDFLOAT_3F64_USE_FP) {
+        return double_muladd(a, b, c, flags, s);
+    } else {
+        return f64_muladd(a, b, c, flags, s);
+    }
+}
+
 /*
  * Returns the result of dividing the floating-point value `a' by the
  * corresponding value `b'. The operation is performed according to
-- 
2.17.1


From nobody Mon Feb  9 04:53:32 2026
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:32 -0400
Message-Id: <20181013231933.28789-13-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 12/13] hardfloat: implement float32/64
 square root
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 43.27 MFlops
sqrt-double: 24.81 MFlops
- after:
sqrt-single: 297.94 MFlops
sqrt-double: 210.46 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
sqrt-single: 12.41 MFlops
sqrt-double: 6.22 MFlops
- after:
sqrt-single: 55.58 MFlops
sqrt-double: 40.63 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
sqrt-single: 17.01 MFlops
sqrt-double: 9.61 MFlops
- after:
sqrt-single: 104.17 MFlops
sqrt-double: 133.32 MFlops

None of the machines got faster from enabling USE_FP. For
instance, with USE_FP enabled on x86_64, sqrt is 23% slower
for single precision and 17% slower for double precision.
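
As a rough illustration (not part of the patch), the two variants
differ only in how the input is classified before the host FPU is
used: the default one inspects the raw float32 bits, while the USE_FP
one converts to the host type first and calls fpclassify()/signbit().
The helper names below are made up for this sketch, and the
can_use_fpu() and input-flush steps of the real macros are left out.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reinterpret raw IEEE-754 binary32 bits as a host float. */
static float bits_to_float(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * Integer-based guard (hypothetical stand-in for the non-USE_FP variant):
 * accept only non-negative inputs that are zero or normal.
 */
static bool sqrt_fast_path_ok_bits(uint32_t bits)
{
    uint32_t exp = (bits >> 23) & 0xff;
    bool is_neg = bits >> 31;
    bool is_zero = (bits & 0x7fffffffu) == 0;
    bool is_normal = exp != 0 && exp != 0xff;

    return (is_zero || is_normal) && !is_neg;
}

/*
 * Host-FP guard (hypothetical stand-in for the USE_FP variant):
 * same predicate, but evaluated on the host float.
 */
static bool sqrt_fast_path_ok_host(uint32_t bits)
{
    float f = bits_to_float(bits);
    int c = fpclassify(f);

    return (c == FP_NORMAL || c == FP_ZERO) && !signbit(f);
}
```

Both guards accept the same inputs; anything they reject (negative
values including -0, subnormals, infinities and NaNs) would be
handled by the softfloat path.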

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 71 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 1c1a42bf46..a738ca4a07 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3155,14 +3155,16 @@ float16 __attribute__((flatten)) float16_sqrt(float=
16 a, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
=20
-float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *sta=
tus)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_sqrt(float32 a, float_status *status)
 {
     FloatParts pa =3D float32_unpack_canonical(a, status);
     FloatParts pr =3D sqrt_float(pa, status, &float32_params);
     return float32_round_pack_canonical(pr, status);
 }
=20
-float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *sta=
tus)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_sqrt(float64 a, float_status *status)
 {
     FloatParts pa =3D float64_unpack_canonical(a, status);
     FloatParts pr =3D sqrt_float(pa, status, &float64_params);
@@ -3242,6 +3244,73 @@ float64 float64_silence_nan(float64 a, float_status =
*status)
     return float64_pack_raw(p);
 }
=20
+#define GEN_SQRT_SF(name, soft_t, host_t, host_sqrt_func)               \
+    static soft_t name(soft_t a, float_status *s)                       \
+    {                                                                   \
+        if (QEMU_NO_HARDFLOAT) {                                        \
+            goto soft;                                                  \
+        }                                                               \
+        soft_t ## _input_flush1(&a, s);                                 \
+        if (likely(soft_t ## _is_zero_or_normal(a) &&                   \
+                   !soft_t ## _is_neg(a) &&                             \
+                   can_use_fpu(s))) {                                   \
+            host_t ha =3D soft_t ## _to_ ## host_t(a);                    \
+            host_t hr =3D host_sqrt_func(ha);                             \
+                                                                        \
+            return host_t ## _to_ ## soft_t(hr);                        \
+        }                                                               \
+    soft:                                                               \
+        return soft_ ## soft_t ## _sqrt(a, s);                          \
+    }
+
+#define GEN_SQRT_FP(name, soft_t, host_t, host_sqrt_func)               \
+    static soft_t name(soft_t a, float_status *s)                       \
+    {                                                                   \
+        host_t ha;                                                      \
+                                                                        \
+        if (QEMU_NO_HARDFLOAT) {                                        \
+            goto soft;                                                  \
+        }                                                               \
+        soft_t ## _input_flush1(&a, s);                                 \
+        ha =3D soft_t ## _to_ ## host_t(a);                               \
+        if (likely((fpclassify(ha) =3D=3D FP_NORMAL ||                    =
  \
+                    fpclassify(ha) =3D=3D FP_ZERO) &&                     =
  \
+                   !signbit(ha) &&                                      \
+                   can_use_fpu(s))) {                                   \
+            host_t hr =3D host_sqrt_func(ha);                             \
+                                                                        \
+            return host_t ## _to_ ## soft_t(hr);                        \
+        }                                                               \
+    soft:                                                               \
+        return soft_ ## soft_t ## _sqrt(a, s);                          \
+    }
+
+GEN_SQRT_SF(f32_sqrt, float32, float, sqrtf)
+GEN_SQRT_SF(f64_sqrt, float64, double, sqrt)
+#undef GEN_SQRT_SF
+
+GEN_SQRT_FP(float_sqrt, float32, float, sqrtf)
+GEN_SQRT_FP(double_sqrt, float64, double, sqrt)
+#undef GEN_SQRT_FP
+
+float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *s)
+{
+    if (QEMU_HARDFLOAT_1F32_USE_FP) {
+        return float_sqrt(a, s);
+    } else {
+        return f32_sqrt(a, s);
+    }
+}
+
+float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *s)
+{
+    if (QEMU_HARDFLOAT_1F64_USE_FP) {
+        return double_sqrt(a, s);
+    } else {
+        return f64_sqrt(a, s);
+    }
+}
+
 /*------------------------------------------------------------------------=
----
 | Takes a 64-bit fixed-point value `absZ' with binary point between bits 6
 | and 7, and returns the properly rounded 32-bit integer corresponding to =
the
--=20
2.17.1


From nobody Mon Feb  9 04:53:32 2026
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as
 permitted sender) client-ip=208.118.235.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
	spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted
 sender)  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by
 mx.zohomail.com
	with SMTPS id 1539473317053350.6542499690188;
 Sat, 13 Oct 2018 16:28:37 -0700 (PDT)
Received: from localhost ([::1]:46577 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1gBTKd-0000FJ-Gm
	for importer@patchew.org; Sat, 13 Oct 2018 19:28:27 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58011)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCz-0002nw-Sx
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:37 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1gBTCw-00088O-65
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:33 -0400
Received: from out3-smtp.messagingengine.com ([66.111.4.27]:57117)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <cota@braap.org>) id 1gBTCv-0007wt-UT
	for qemu-devel@nongnu.org; Sat, 13 Oct 2018 19:20:30 -0400
Received: from compute4.internal (compute4.nyi.internal [10.202.2.44])
	by mailout.nyi.internal (Postfix) with ESMTP id 3FD2121C2E;
	Sat, 13 Oct 2018 19:20:10 -0400 (EDT)
Received: from mailfrontend2 ([10.202.2.163])
	by compute4.internal (MEProxy); Sat, 13 Oct 2018 19:20:10 -0400
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id C6220102DE;
	Sat, 13 Oct 2018 19:20:09 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=
	from:to:cc:subject:date:message-id:in-reply-to:references; s=
	mesmtp; bh=T3+O8eThT+Wa0/CtbNIjT71KjhtNL3WChHnS9AmXxmE=; b=lACrg
	mDjWXB116GtBYPfjAgBMYfXNjb2YpZvOFna0LOdZG8PJp2ze6b/4Asp26Z+l/uIC
	+SGgdoXIzJyxuGBPWLuZmWHq20at1lb7wuSPcad/KO1lVBYlEtyw/NwzudkNY7R8
	O9eW3QT9nfvFwI7T4hEuwCTncbLi4MHUwoHSP8=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:date:from:in-reply-to:message-id
	:references:subject:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm1; bh=T3+O8eThT+Wa0/CtbNIjT71KjhtNL
	3WChHnS9AmXxmE=; b=I0ynRJu3gjs6xaka/47wR4mCVw3EhCaNZHmYZgwER1wyc
	YxJQJ335Hw0HBoaZfuAmSC4BM6dk9jyhKZwkk1xEm4bMKbpt0d11a0xCXLGMSDCi
	jQy+Q8SL5TGrGZZReVqLF4yCUWdUIJILWUTWa55yRBjkuryKgBrov6Hw29B8VvSB
	XEX72utO++0ce7XFYUbopBRf+DyGTlwVk5LChIGG1WhWtqnlKG4N2dsvR2MnWViE
	tt1bxO/V6hyKS5Vc1vAuzJ9PZc/YPzDgQhaK3dREl9TCwEVJ3izrN3pfEe8pp2Sv
	y4CmofFc1mwMoRO9bR3n06hnoQzZG+k+SnfrUjIAg==
X-ME-Sender: <xms:qn3CW1eDkImrAmEYcPnv_3IsAimS6SpxqAFBAqenpmVMyTRldE1KkQ>
X-ME-Proxy: <xmx:qn3CW0G2IHZxBnBhJAXH0hpzJaltV7CaHQ4VTxcfRjvXatjGoVqGHw>
	<xmx:qn3CW1AvXT8-w-cRHJ1cIYXwhMwyKxTpRhGBjEqvjFddQjf7ZR4Hvg>
	<xmx:qn3CWz-bjfaYSSULbwB3NQK_OGRQx_pvjNs0YYH7ySD5wkUfuNCE5w>
	<xmx:qn3CW3f2q84Ez92Y6m4SQL41SbfzZkP3OurxGBCldgqL-zmCWlHogQ>
	<xmx:qn3CW2BKbj3nxB8orz92ixoj5Q45ypKIT3-EOPQqvGsIw8ytK1kcvA>
	<xmx:qn3CW_NrTAXM7hEEU7wkPMhRT0p7-mspcrReQeNCFY_pZegnXlOgxA>
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Date: Sat, 13 Oct 2018 19:19:33 -0400
Message-Id: <20181013231933.28789-14-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181013231933.28789-1-cota@braap.org>
References: <20181013231933.28789-1-cota@braap.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.27
Subject: [Qemu-devel] [PATCH v5 13/13] hardfloat: implement float32/64
 comparison
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
	Richard Henderson <richard.henderson@linaro.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
X-ZohoMail: RSF_0  Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
cmp-single: 113.01 MFlops
cmp-double: 115.54 MFlops
- after:
cmp-single: 527.83 MFlops
cmp-double: 457.21 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
cmp-single: 39.32 MFlops
cmp-double: 39.80 MFlops
- after:
cmp-single: 162.74 MFlops
cmp-double: 167.08 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
cmp-single: 60.81 MFlops
cmp-double: 62.76 MFlops
- after:
cmp-single: 235.39 MFlops
cmp-double: 283.44 MFlops

Using float{32,64}_is_any_nan is faster than using isnan on
all machines. On x86_64 the performance difference is just a
few percentage points, but on aarch64 we go from 117/119 to
164/169 MFlops for single/double precision, respectively.
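
A minimal sketch of that structure, outside of QEMU: NaNs are
detected on the raw bits, and only ordered operands reach the host
comparison. The names and the REL_* constants are stand-ins invented
for this example. The real code defers NaN operands to the softfloat
comparison (which also takes care of the exception flags) and
flushes denormal inputs first; both steps are omitted here.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Local stand-ins for the float_relation_* results. */
enum { REL_LESS = -1, REL_EQUAL = 0, REL_GREATER = 1, REL_UNORDERED = 2 };

/* Reinterpret raw IEEE-754 binary32 bits as a host float. */
static float f32_from_bits(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* NaN test on the raw bits: exponent all ones and a non-zero fraction. */
static bool f32_bits_is_any_nan(uint32_t bits)
{
    return (bits & 0x7fffffffu) > 0x7f800000u;
}

/* Compare two float32 values given as raw bits. */
static int compare_f32_bits(uint32_t a, uint32_t b)
{
    float ha, hb;

    if (f32_bits_is_any_nan(a) || f32_bits_is_any_nan(b)) {
        /* Simplification: the patch calls the softfloat path here. */
        return REL_UNORDERED;
    }
    ha = f32_from_bits(a);
    hb = f32_from_bits(b);
    if (isgreater(ha, hb)) {
        return REL_GREATER;
    }
    if (isless(ha, hb)) {
        return REL_LESS;
    }
    return REL_EQUAL;
}
```

Checking the bits keeps the NaN test in integer code, so the host FPU
is only touched once both operands are known to be ordered.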

Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]

1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

[ chart: qemu-aarch64 NBench score; higher is better
  Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean
  y-axis: score, 0 to 16 ]

[ chart: qemu-aarch64 SPEC06fp (test set) speedup over
  QEMU 4c2c1015905
  Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  error bars: 95% confidence interval
  x-axis: SPEC06fp benchmarks 410 through 482, plus geomean
  y-axis: speedup, 0 to 4.5 ]

2. Host: ARM Aarch64 A57 @ 2.4GHz

[ chart: qemu-aarch64 NBench score; higher is better
  Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz
  x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean
  y-axis: score, 0 to 5 ]

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 74 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 14 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index a738ca4a07..1758cc93e7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3014,28 +3014,74 @@ static int compare_floats(FloatParts a, FloatParts =
b, bool is_quiet,
     }
 }
=20
-#define COMPARE(sz)                                                     \
-int float ## sz ## _compare(float ## sz a, float ## sz b,               \
-                            float_status *s)                            \
+#define COMPARE(name, attr, sz)                                         \
+static int attr                                                         \
+name(float ## sz a, float ## sz b, bool is_quiet, float_status *s)      \
 {                                                                       \
     FloatParts pa =3D float ## sz ## _unpack_canonical(a, s);             \
     FloatParts pb =3D float ## sz ## _unpack_canonical(b, s);             \
-    return compare_floats(pa, pb, false, s);                            \
-}                                                                       \
-int float ## sz ## _compare_quiet(float ## sz a, float ## sz b,         \
-                                  float_status *s)                      \
-{                                                                       \
-    FloatParts pa =3D float ## sz ## _unpack_canonical(a, s);             \
-    FloatParts pb =3D float ## sz ## _unpack_canonical(b, s);             \
-    return compare_floats(pa, pb, true, s);                             \
+    return compare_floats(pa, pb, is_quiet, s);                         \
 }
=20
-COMPARE(16)
-COMPARE(32)
-COMPARE(64)
+COMPARE(soft_float16_compare, , 16)
+COMPARE(soft_float32_compare, QEMU_SOFTFLOAT_ATTR, 32)
+COMPARE(soft_float64_compare, QEMU_SOFTFLOAT_ATTR, 64)
=20
 #undef COMPARE
=20
+int __attribute__((flatten))
+float16_compare(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, false, s);
+}
+
+int __attribute__((flatten))
+float16_compare_quiet(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, true, s);
+}
+
+#define GEN_FPU_COMPARE(name, quiet_name, soft_t, host_t)               \
+    static int                                                          \
+    fpu_ ## name(soft_t a, soft_t b, bool is_quiet, float_status *s)    \
+    {                                                                   \
+        host_t ha, hb;                                                  \
+                                                                        \
+        if (QEMU_NO_HARDFLOAT) {                                        \
+            return soft_ ## name(a, b, is_quiet, s);                    \
+        }                                                               \
+        soft_t ## _input_flush2(&a, &b, s);                             \
+        ha =3D soft_t ## _to_ ## host_t(a);                               \
+        hb =3D soft_t ## _to_ ## host_t(b);                               \
+        if (unlikely(soft_t ## _is_any_nan(a) ||                        \
+                     soft_t ## _is_any_nan(b))) {                       \
+            return soft_ ## name(a, b, is_quiet, s);                    \
+        }                                                               \
+        if (isgreater(ha, hb)) {                                        \
+            return float_relation_greater;                              \
+        }                                                               \
+        if (isless(ha, hb)) {                                           \
+            return float_relation_less;                                 \
+        }                                                               \
+        return float_relation_equal;                                    \
+    }                                                                   \
+                                                                        \
+    int __attribute__((flatten))                                        \
+    name(soft_t a, soft_t b, float_status *s)                           \
+    {                                                                   \
+        return fpu_ ## name(a, b, false, s);                            \
+    }                                                                   \
+                                                                        \
+    int __attribute__((flatten))                                        \
+    quiet_name(soft_t a, soft_t b, float_status *s)                     \
+    {                                                                   \
+        return fpu_ ## name(a, b, true, s);                             \
+    }
+
+GEN_FPU_COMPARE(float32_compare, float32_compare_quiet, float32, float)
+GEN_FPU_COMPARE(float64_compare, float64_compare_quiet, float64, double)
+#undef GEN_FPU_COMPARE
+
 /* Multiply A by 2 raised to the power N.  */
 static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s)
 {
--=20
2.17.1