From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104080408117.58220079204091; Sat, 24 Nov 2018 16:01:20 -0800 (PST) Received: from localhost ([::1]:58163 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhrT-0007ls-5z for importer@patchew.org; Sat, 24 Nov 2018 19:01:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57464) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmV-0003G5-UD for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmQ-0005Dy-Oo for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:36555) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmO-0004eK-8G for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:05 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 7AFABCF8; Sat, 24 Nov 2018 18:56:01 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:01 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id AC021102E8; Sat, 24 Nov 2018 18:56:00 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= 
from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=NnipD8L3PIYUDBAnWOot9nii3h2PnhTj99Q2qKbThLg=; b=SlDxQ 3zXzAypZQN9JK6WqDWqhNjtWU1uYSiQVZq+u5va1CTAXa8d4uMTeaTwikFyOP4WB djoMSH1IaZJwTTHKr6Woxd4w5sjPSejBCRYPnazZTQpfDaC1E+DC6gg1gSZ3c1Nd WMr8IFi2Ztdd8j4+Ef7R5ePtMS3lx6QqYUH/hY= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=NnipD8L3PIYUDBAnWOot9nii3h2Pn hTj99Q2qKbThLg=; b=buQ9vEsG3Uz+X08rqqkHFgTAHj3uS+U9wgqOe2DdF5hWy FIq53H0PsHDurcNdLB9MpYHLh9lHsCXKrP1grJmyeQEViyqwa258BTsB9Z/uqesb dXu1plpK5sJQ8DuYRIndY0mVKFKJlhwwWaGEsqxG6vb8jlxnpITBB4h0YPn5yQZe crtdSouAhsPRi8ALORinzix4Y5TnU9K17eU03m0eYFduHs9lXmUjhdEK2oW0VTs5 09X5hbWio4GmXM/XU7Q6DKSFqNm6naIBvo/eYsNgcNLlbQoR9kwa7nFcq1j1O2wJ iorwqxgGNCOSkQB+A75GGvdgRqUr1yMcvPpzbQblQ== X-ME-Sender: X-ME-Proxy: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:41 -0500 Message-Id: <20181124235553.17371-2-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 01/13] fp-test: pick TARGET_ARM to get its specialization X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This gets rid of the muladd errors due to not raising the invalid flag. 
- Before: Errors found in f64_mulAdd, rounding near_even, tininess before rounding: +000.0000000000000 +7FF.0000000000000 +7FF.FFFFFFFFFFFFF =3D> +7FF.FFFFFFFFFFFFF ..... expected -7FF.FFFFFFFFFFFFF v.... [...] - After: In 6133248 tests, no errors found in f64_mulAdd, rounding near_even, tinine= ss before rounding. [...] Signed-off-by: Emilio G. Cota Reviewed-by: Alex Benn=C3=A9e Tested-by: Alex Benn=C3=A9e --- tests/fp/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/fp/Makefile b/tests/fp/Makefile index d649a5a1db..49cdcd1bd2 100644 --- a/tests/fp/Makefile +++ b/tests/fp/Makefile @@ -29,6 +29,9 @@ QEMU_INCLUDES +=3D -I$(TF_SOURCE_DIR) =20 # work around TARGET_* poisoning QEMU_CFLAGS +=3D -DHW_POISON_H +# define a target to match testfloat's implementation-defined choices, suc= h as +# whether to raise the invalid flag when dealing with NaNs in muladd. +QEMU_CFLAGS +=3D -DTARGET_ARM =20 # capstone has a platform.h file that clashes with softfloat's QEMU_CFLAGS :=3D $(filter-out %capstone, $(QEMU_CFLAGS)) --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543103908860428.03833985026085; Sat, 24 Nov 2018 15:58:28 -0800 (PST) Received: from localhost ([::1]:58148 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhoX-0004Zt-I6 for importer@patchew.org; Sat, 24 Nov 2018 18:58:17 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57463) by lists.gnu.org with esmtp (Exim 4.71) 
(envelope-from ) id 1gQhmV-0003G3-UI for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmQ-0005EY-Qh for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:36303) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmP-0004eq-KX for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:06 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id ABA8FCFB; Sat, 24 Nov 2018 18:56:01 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:01 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id E416E102ED; Sat, 24 Nov 2018 18:56:00 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=mesmtp; bh=V1hE+xbte4TYTSkApBMzSxt27E4A+9NlJM9SlmxlTvM=; b=ryVvszEDV/ZU 2FeYa7QSZBX56qzTXyUfvST8m/cWakHK4/AYRuu/1Q+MB8FzJFUVrvL1JCEA7gUt MyeSQ5aHaIy/JJJR9oHz3fNM5IDnpESRyA+nBT/MDwKeKz8ISf7KrxQCar43/Msp pyjIojfSg4jV9SpqPrbEEqpjouzALEo= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; bh=V1hE+xbte4TYTSkApBMzSxt27E4A+9NlJM9SlmxlT vM=; b=G3udT1sFTxAKj4Ih0RCUTS13NWjVLKyGADWT45l8w4U0DTjTaA+bbzqT0 Pr4n9f0KW2QTww3BNUypZKl++HHL6fqSGSjbV6qN7BtGUGLLG5tEUbiH8/HEBpFi tnulecbaE6XezHBWJgRnjg8a/7J0Wv1GcHG0Ig87la6lGkD3Mzea+ttHw2HN0Bl2 byPSDmVwx6+eG/kE/zMoheBZge3+wiCg7LrJ5NXGO0D6ceiM630vBW/PcZr3L0Zf 
V16PCUdfS+e8DAvA1TAg8RUrxlfk/MBIa5K0YD3qpeXhSlW1ORhVd9Z88HAngImc 6d55XMpfkEZ6olXqr+Dgkuuea5GhQ== X-ME-Sender: X-ME-Proxy: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:42 -0500 Message-Id: <20181124235553.17371-3-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 02/13] softfloat: add float{32, 64}_is_{de, }normal X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" This paves the way for upcoming work. Reviewed-by: Bastian Koppelmann Reviewed-by: Alex Benn=C3=A9e Signed-off-by: Emilio G. 
Cota --- include/fpu/softfloat.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 8fd9f9bbae..9eeccd88a5 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -464,6 +464,16 @@ static inline int float32_is_zero_or_denormal(float32 = a) return (float32_val(a) & 0x7f800000) =3D=3D 0; } =20 +static inline bool float32_is_normal(float32 a) +{ + return ((float32_val(a) + 0x00800000) & 0x7fffffff) >=3D 0x01000000; +} + +static inline bool float32_is_denormal(float32 a) +{ + return float32_is_zero_or_denormal(a) && !float32_is_zero(a); +} + static inline float32 float32_set_sign(float32 a, int sign) { return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31)); @@ -605,6 +615,16 @@ static inline int float64_is_zero_or_denormal(float64 = a) return (float64_val(a) & 0x7ff0000000000000LL) =3D=3D 0; } =20 +static inline bool float64_is_normal(float64 a) +{ + return ((float64_val(a) + (1ULL << 52)) & -1ULL >> 1) >=3D 1ULL << 53; +} + +static inline bool float64_is_denormal(float64 a) +{ + return float64_is_zero_or_denormal(a) && !float64_is_zero(a); +} + static inline float64 float64_set_sign(float64 a, int sign) { return make_float64((float64_val(a) & 0x7fffffffffffffffULL) --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104080535509.8590888636011; Sat, 24 Nov 2018 16:01:20 -0800 (PST) Received: from localhost ([::1]:58165 
helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhrT-0007mH-51 for importer@patchew.org; Sat, 24 Nov 2018 19:01:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57465) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmV-0003G6-UM for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmQ-0005Ey-Ra for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:40651) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmP-0004h0-Mo for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:06 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id D9CDFCFC; Sat, 24 Nov 2018 18:56:01 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:02 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 1FBBC102F0; Sat, 24 Nov 2018 18:56:01 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=CeX9Fblk41NlWeY9UAsd6a4LLLNQgAqmGPeAe4RFW2k=; b=lxp1g dY9LCKB9gv+ksw7swjqRw0iE1T8iie7adB0zaQydDH9zfBpN79uUc2HtqOf5jb8o Y4dVd+ktzribfYeJ65IbR7UbY3B5ae/s2lOPVl28P3xW8ThpJ7DyJQSgkdUtaXve ILaRuyr7Vd/nvfZF9p7Ayi/9v40nFKMscumKWo= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=CeX9Fblk41NlWeY9UAsd6a4LLLNQg AqmGPeAe4RFW2k=; b=DTJjBsq3IxHSIsQq+36/7u0lAwkeMUMhExFs8ra9qhYGE By+G+9iyHxCqeo+8FeBO3ZrDTDRu/vK2bbUgicmQR5Y96cH78ISVNqbCOHiI/sy+ 
rL2pDhH1euYTpmjODCraU9EgtwnksS36vE/X8YVlEr6xOvKf3NI+cMXxzDKUvOeQ ikKNDgD1spT3iNpqA3fb6z5M3KM3+5ZKf0iIBccWTTY6oNigdB+zrlMHOAm7B7pH NPbRyGPC/Vh0T0CB0CIgSzFc1zV6vYjnBtl08Q2rpq3nGd43UJpPUX26VM1zWyvE sx1zAN48BWito5KFD23bvMwBj/ycZUPFxcipbYBGg== X-ME-Sender: X-ME-Proxy: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:43 -0500 Message-Id: <20181124235553.17371-4-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 03/13] target/tricore: use float32_is_denormal X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Reviewed-by: Bastian Koppelmann Signed-off-by: Emilio G. 
Cota --- target/tricore/fpu_helper.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c index df162902d6..31df462e4a 100644 --- a/target/tricore/fpu_helper.c +++ b/target/tricore/fpu_helper.c @@ -44,11 +44,6 @@ static inline uint8_t f_get_excp_flags(CPUTriCoreState *= env) | float_flag_inexact); } =20 -static inline bool f_is_denormal(float32 arg) -{ - return float32_is_zero_or_denormal(arg) && !float32_is_zero(arg); -} - static inline float32 f_maddsub_nan_result(float32 arg1, float32 arg2, float32 arg3, float32 result, uint32_t muladd_negate_c) @@ -260,8 +255,8 @@ uint32_t helper_fcmp(CPUTriCoreState *env, uint32_t r1,= uint32_t r2) set_flush_inputs_to_zero(0, &env->fp_status); =20 result =3D 1 << (float32_compare_quiet(arg1, arg2, &env->fp_status) + = 1); - result |=3D f_is_denormal(arg1) << 4; - result |=3D f_is_denormal(arg2) << 5; + result |=3D float32_is_denormal(arg1) << 4; + result |=3D float32_is_denormal(arg2) << 5; =20 flags =3D f_get_excp_flags(env); if (flags) { --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104080128487.5279141561691; Sat, 24 Nov 2018 16:01:20 -0800 (PST) Received: from localhost ([::1]:58164 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhrS-0007mD-Uq for importer@patchew.org; Sat, 24 Nov 2018 19:01:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57462) by 
lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmV-0003G4-U6 for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmU-0005aP-Bx for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:45413) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmU-0005Fp-2X for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:10 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 5C731D01; Sat, 24 Nov 2018 18:56:02 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:02 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 51BC3102F1; Sat, 24 Nov 2018 18:56:01 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=xv6HgakEB7WLW0f/tlfjDvxKflDmETPVZ7+FXxmHkLg=; b=aUX8V mbzemSEcZmrRc2+A5FHk2Kb1zUxtTcHuoQNCCO68u9swIhRvzS2xHFna5LEFOHMs otWneSN2/0bt9O1MmJ3Qj+PndOLTXdrO/J74RUwqsIx6pdPJZXydtpOxWn8nLpvR D3rQ9i6+UJFhL+vD/OUz0s4r7Ung28+oHVyMT0= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=xv6HgakEB7WLW0f/tlfjDvxKflDmE TPVZ7+FXxmHkLg=; b=axIe36fqcVz2gQU9ivID+GLZ29VLEP+rRLnU1wjr8Iov6 NRBYyTFPtE47pA9ANr1OCF7+FDo433/CtHac9WIITLfCIgjTAMvqHKgHQc4PvANY GUD/jBsxSYlNfa7bRWF22ZG+K4aRGopssm+LbaVMAjnFWSARBdzvj3bjw1wc/M4+ t0FSi5OFAJKTwVhpf+nE5XfSjWASnuZQdSUGXTk076nm0GmrSGRSq7keX+5c+x38 y/xnkSM2Rbg8JheDVAdeDxecwrKkyf09tWJkFcQlcuyFetBH8b/URZbHdx8FD52z kDFHqfYjh7ee7qVAMCHFV4pCjTzVqNYFB+lwUNSPw== X-ME-Sender: X-ME-Proxy: From: 
"Emilio G. Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:44 -0500 Message-Id: <20181124235553.17371-5-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 04/13] softfloat: rename canonicalize to sf_canonicalize X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" glibc >=3D 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Reported-by: Bastian Koppelmann Tested-by: Bastian Koppelmann Signed-off-by: Emilio G. Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index e1eef954e6..ecdc00c633 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -336,8 +336,8 @@ static inline float64 float64_pack_raw(FloatParts p) #include "softfloat-specialize.h" =20 /* Canonicalize EXP and FRAC, setting CLS. 
*/ -static FloatParts canonicalize(FloatParts part, const FloatFmt *parm, - float_status *status) +static FloatParts sf_canonicalize(FloatParts part, const FloatFmt *parm, + float_status *status) { if (part.exp =3D=3D parm->exp_max && !parm->arm_althp) { if (part.frac =3D=3D 0) { @@ -513,7 +513,7 @@ static FloatParts round_canonical(FloatParts p, float_s= tatus *s, static FloatParts float16a_unpack_canonical(float16 f, float_status *s, const FloatFmt *params) { - return canonicalize(float16_unpack_raw(f), params, s); + return sf_canonicalize(float16_unpack_raw(f), params, s); } =20 static FloatParts float16_unpack_canonical(float16 f, float_status *s) @@ -534,7 +534,7 @@ static float16 float16_round_pack_canonical(FloatParts = p, float_status *s) =20 static FloatParts float32_unpack_canonical(float32 f, float_status *s) { - return canonicalize(float32_unpack_raw(f), &float32_params, s); + return sf_canonicalize(float32_unpack_raw(f), &float32_params, s); } =20 static float32 float32_round_pack_canonical(FloatParts p, float_status *s) @@ -544,7 +544,7 @@ static float32 float32_round_pack_canonical(FloatParts = p, float_status *s) =20 static FloatParts float64_unpack_canonical(float64 f, float_status *s) { - return canonicalize(float64_unpack_raw(f), &float64_params, s); + return sf_canonicalize(float64_unpack_raw(f), &float64_params, s); } =20 static float64 float64_round_pack_canonical(FloatParts p, float_status *s) --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) 
by mx.zohomail.com with SMTPS id 1543103909404335.190609320525; Sat, 24 Nov 2018 15:58:29 -0800 (PST) Received: from localhost ([::1]:58147 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhoY-0004XF-2e for importer@patchew.org; Sat, 24 Nov 2018 18:58:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57454) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmV-0003G0-SP for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmS-0005NP-4w for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:58305) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmQ-0004j3-O9 for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:08 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 42DD1CFF; Sat, 24 Nov 2018 18:56:02 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:02 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 843A1102F2; Sat, 24 Nov 2018 18:56:01 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=kIeLqTq4Rg/eYmqPOg5c92+v8rTPU4/b//I2wiwiWi4=; b=xbgVZ lEHFOYfjjkqg89qX8i0arb4cwKLZq2Z05QXIKqF67tFS2z1Y0VZ1CwnkO3XY94jZ 9ZSn+SRBozszZmIv56WNkynDIc2p6lb9grsFy3c6lrkACXNGt8uYbnXlaEvZ/mra v78/yIt/MQRkCu1/nXdjKOlSBs2a4s7VZVTf6c= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=kIeLqTq4Rg/eYmqPOg5c92+v8rTPU 4/b//I2wiwiWi4=; 
b=iLmgum38s/twWeXQzA15xjGBRTTpdf0PrZc6d6gV3nun6 Xp1ziyIRu9mAE9iQUnXbK4Rx0CgDs3dV7M/1/6xi6r4d9/Ka+aAMx3plWzRw6G6r Qe6QX1DG05OWzBR8xW03zo5s3Z9jMWHgroMICz1cOFZJ2sf1CbnsAY4Rdc0vnp5L CP7CQ/AADcWi9i9i2gZjjJ4zG6WOyF6fPIymFuwF6yC2EtDQp8Cy/Z0UiIM/BZ6U QIOS5bYi6J+EXc6O50QHdjYp/JBSFVutGdX/r9TKzmWSBNY/fqdoQJBVmZkcBzJk WeyWkWW39GPa1SR8Wg6ok4Qfi1e4jU9cAj/lXDMVQ== X-ME-Sender: X-ME-Proxy: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:45 -0500 Message-Id: <20181124235553.17371-6-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 05/13] softfloat: add float{32, 64}_is_zero_or_normal X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" These will gain some users very soon. Signed-off-by: Emilio G. 
Cota Reviewed-by: Alex Benn=C3=A9e --- include/fpu/softfloat.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 9eeccd88a5..38a5e99cf3 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -474,6 +474,11 @@ static inline bool float32_is_denormal(float32 a) return float32_is_zero_or_denormal(a) && !float32_is_zero(a); } =20 +static inline bool float32_is_zero_or_normal(float32 a) +{ + return float32_is_normal(a) || float32_is_zero(a); +} + static inline float32 float32_set_sign(float32 a, int sign) { return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31)); @@ -625,6 +630,11 @@ static inline bool float64_is_denormal(float64 a) return float64_is_zero_or_denormal(a) && !float64_is_zero(a); } =20 +static inline bool float64_is_zero_or_normal(float64 a) +{ + return float64_is_normal(a) || float64_is_zero(a); +} + static inline float64 float64_set_sign(float64 a, int sign) { return make_float64((float64_val(a) & 0x7fffffffffffffffULL) --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104289700312.54509578031525; Sat, 24 Nov 2018 16:04:49 -0800 (PST) Received: from localhost ([::1]:58181 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhuh-0002ar-3q for importer@patchew.org; Sat, 24 Nov 2018 19:04:39 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57470) by lists.gnu.org with esmtp (Exim 4.71) 
(envelope-from ) id 1gQhmW-0003G7-2c for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmU-0005Yx-3I for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:11 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:45809) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmS-0005Fl-3m for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:09 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id C1CB8D05; Sat, 24 Nov 2018 18:56:02 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:03 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id B6EA9102F4; Sat, 24 Nov 2018 18:56:01 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=mHZ+WkgQo6dNLLpx91/bsHe5ftmVf4K1vN6Awmc8Eyo=; b=hTVKr MGOT2D5XM19Nyeye/xFgqJKdc+wm8FOBUmoC5//OA4uRWewP3sVtJQ1FCzsOOh9z wlgpUuGFs6/Ctt/A/fp3HER04KrGUygTmpODaVzaDIlvjdrboNp4MSv1TPR9IBjE eg7h9wk2umBo768PEixMvNzOD1Fn8k0Aih54NI= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=mHZ+WkgQo6dNLLpx91/bsHe5ftmVf 4K1vN6Awmc8Eyo=; b=Xy0cBn8OVrt04jxI7+xp2VrKLcn3GS0a6M0sqTV7dZ/vG g+owPFpLLcOfPnMK2rf6e3a8EZ30CQIp11TUEmxgrwhNUi0qu6/2czbxzyAXtTRW PN938yXfd/SwbBYZ2CTSB7YbbovPFojYI/4TPQ7dI3PDLwX9fQ/8oWR7XW04W8Ue 0g/QGfURNMIS+kG9Ntkw1rczTguVyUNzF4IUZBrQexcByhqEr9pFbZ6x2Pnzt+TT 2oI81he6N7URnl6pgQJLfQTewIo4XMCXD6bl1OZvniJ1Yuv/LX/o/5vvzDV2/sgp efwgZZM0Qsc+bdMMeJw9t0JAzQiEU6jlTF/cmE5CA== X-ME-Sender: X-ME-Proxy: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:46 -0500 Message-Id: <20181124235553.17371-7-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 06/13] tests/fp: add fp-bench X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" These microbenchmarks will allow us to measure the performance impact of FP emulation optimizations. Note that we can measure both directly the impa= ct on the softfloat functions (with "-t soft"), or the impact on an emulated workload (call with "-t host" and run under qemu user-mode). Signed-off-by: Emilio G. Cota Reviewed-by: Alex Benn=C3=A9e --- tests/fp/fp-bench.c | 630 ++++++++++++++++++++++++++++++++++++++++++++ tests/fp/.gitignore | 1 + tests/fp/Makefile | 5 +- 3 files changed, 635 insertions(+), 1 deletion(-) create mode 100644 tests/fp/fp-bench.c diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c new file mode 100644 index 0000000000..f5bc5edebf --- /dev/null +++ b/tests/fp/fp-bench.c @@ -0,0 +1,630 @@ +/* + * fp-bench.c - A collection of simple floating point microbenchmarks. + * + * Copyright (C) 2018, Emilio G. Cota + * + * License: GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ +#ifndef HW_POISON_H +#error Must define HW_POISON_H to work around TARGET_* poisoning +#endif + +#include "qemu/osdep.h" +#include +#include +#include "qemu/timer.h" +#include "fpu/softfloat.h" + +/* amortize the computation of random inputs */ +#define OPS_PER_ITER 50000 + +#define MAX_OPERANDS 3 + +#define SEED_A 0xdeadfacedeadface +#define SEED_B 0xbadc0feebadc0fee +#define SEED_C 0xbeefdeadbeefdead + +enum op { + OP_ADD, + OP_SUB, + OP_MUL, + OP_DIV, + OP_FMA, + OP_SQRT, + OP_CMP, + OP_MAX_NR, +}; + +static const char * const op_names[] =3D { + [OP_ADD] =3D "add", + [OP_SUB] =3D "sub", + [OP_MUL] =3D "mul", + [OP_DIV] =3D "div", + [OP_FMA] =3D "mulAdd", + [OP_SQRT] =3D "sqrt", + [OP_CMP] =3D "cmp", + [OP_MAX_NR] =3D NULL, +}; + +enum precision { + PREC_SINGLE, + PREC_DOUBLE, + PREC_FLOAT32, + PREC_FLOAT64, + PREC_MAX_NR, +}; + +enum rounding { + ROUND_EVEN, + ROUND_ZERO, + ROUND_DOWN, + ROUND_UP, + ROUND_TIEAWAY, + N_ROUND_MODES, +}; + +static const char * const round_names[] =3D { + [ROUND_EVEN] =3D "even", + [ROUND_ZERO] =3D "zero", + [ROUND_DOWN] =3D "down", + [ROUND_UP] =3D "up", + [ROUND_TIEAWAY] =3D "tieaway", +}; + +enum tester { + TESTER_SOFT, + TESTER_HOST, + TESTER_MAX_NR, +}; + +static const char * const tester_names[] =3D { + [TESTER_SOFT] =3D "soft", + [TESTER_HOST] =3D "host", + [TESTER_MAX_NR] =3D NULL, +}; + +union fp { + float f; + double d; + float32 f32; + float64 f64; + uint64_t u64; +}; + +struct op_state; + +typedef float (*float_func_t)(const struct op_state *s); +typedef double (*double_func_t)(const struct op_state *s); + +union fp_func { + float_func_t float_func; + double_func_t double_func; +}; + +typedef void (*bench_func_t)(void); + +struct op_desc { + const char * const name; +}; + +#define DEFAULT_DURATION_SECS 1 + +static uint64_t random_ops[MAX_OPERANDS] =3D { + SEED_A, SEED_B, SEED_C, +}; +static float_status soft_status; +static enum precision precision; +static enum op operation; +static enum tester tester; +static 
uint64_t n_completed_ops; +static unsigned int duration =3D DEFAULT_DURATION_SECS; +static int64_t ns_elapsed; +/* disable optimizations with volatile */ +static volatile union fp res; + +/* + * From: https://en.wikipedia.org/wiki/Xorshift + * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only + * guaranteed to be >=3D 32767). + */ +static uint64_t xorshift64star(uint64_t x) +{ + x ^=3D x >> 12; /* a */ + x ^=3D x << 25; /* b */ + x ^=3D x >> 27; /* c */ + return x * UINT64_C(2685821657736338717); +} + +static void update_random_ops(int n_ops, enum precision prec) +{ + int i; + + for (i =3D 0; i < n_ops; i++) { + uint64_t r =3D random_ops[i]; + + if (prec =3D=3D PREC_SINGLE || prec =3D=3D PREC_FLOAT32) { + do { + r =3D xorshift64star(r); + } while (!float32_is_normal(r)); + } else if (prec =3D=3D PREC_DOUBLE || prec =3D=3D PREC_FLOAT64) { + do { + r =3D xorshift64star(r); + } while (!float64_is_normal(r)); + } else { + g_assert_not_reached(); + } + random_ops[i] =3D r; + } +} + +static void fill_random(union fp *ops, int n_ops, enum precision prec, + bool no_neg) +{ + int i; + + for (i =3D 0; i < n_ops; i++) { + switch (prec) { + case PREC_SINGLE: + case PREC_FLOAT32: + ops[i].f32 =3D make_float32(random_ops[i]); + if (no_neg && float32_is_neg(ops[i].f32)) { + ops[i].f32 =3D float32_chs(ops[i].f32); + } + /* raise the exponent to limit the frequency of denormal resul= ts */ + ops[i].f32 |=3D 0x40000000; + break; + case PREC_DOUBLE: + case PREC_FLOAT64: + ops[i].f64 =3D make_float64(random_ops[i]); + if (no_neg && float64_is_neg(ops[i].f64)) { + ops[i].f64 =3D float64_chs(ops[i].f64); + } + /* raise the exponent to limit the frequency of denormal resul= ts */ + ops[i].f64 |=3D LIT64(0x4000000000000000); + break; + default: + g_assert_not_reached(); + } + } +} + +/* + * The main benchmark function. Instead of (ab)using macros, we rely + * on the compiler to unfold this at compile-time.
+ */ +static void bench(enum precision prec, enum op op, int n_ops, bool no_neg) +{ + int64_t tf =3D get_clock() + duration * 1000000000LL; + + while (get_clock() < tf) { + union fp ops[MAX_OPERANDS]; + int64_t t0; + int i; + + update_random_ops(n_ops, prec); + switch (prec) { + case PREC_SINGLE: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float a =3D ops[0].f; + float b =3D ops[1].f; + float c =3D ops[2].f; + + switch (op) { + case OP_ADD: + res.f =3D a + b; + break; + case OP_SUB: + res.f =3D a - b; + break; + case OP_MUL: + res.f =3D a * b; + break; + case OP_DIV: + res.f =3D a / b; + break; + case OP_FMA: + res.f =3D fmaf(a, b, c); + break; + case OP_SQRT: + res.f =3D sqrtf(a); + break; + case OP_CMP: + res.u64 =3D isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_DOUBLE: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + double a =3D ops[0].d; + double b =3D ops[1].d; + double c =3D ops[2].d; + + switch (op) { + case OP_ADD: + res.d =3D a + b; + break; + case OP_SUB: + res.d =3D a - b; + break; + case OP_MUL: + res.d =3D a * b; + break; + case OP_DIV: + res.d =3D a / b; + break; + case OP_FMA: + res.d =3D fma(a, b, c); + break; + case OP_SQRT: + res.d =3D sqrt(a); + break; + case OP_CMP: + res.u64 =3D isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT32: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float32 a =3D ops[0].f32; + float32 b =3D ops[1].f32; + float32 c =3D ops[2].f32; + + switch (op) { + case OP_ADD: + res.f32 =3D float32_add(a, b, &soft_status); + break; + case OP_SUB: + res.f32 =3D float32_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f32 =3D float32_mul(a, b, &soft_status); + break; + case OP_DIV: + res.f32 =3D float32_div(a, b, &soft_status); + break; + case OP_FMA: + res.f32
=3D float32_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f32 =3D float32_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 =3D float32_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT64: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float64 a =3D ops[0].f64; + float64 b =3D ops[1].f64; + float64 c =3D ops[2].f64; + + switch (op) { + case OP_ADD: + res.f64 =3D float64_add(a, b, &soft_status); + break; + case OP_SUB: + res.f64 =3D float64_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f64 =3D float64_mul(a, b, &soft_status); + break; + case OP_DIV: + res.f64 =3D float64_div(a, b, &soft_status); + break; + case OP_FMA: + res.f64 =3D float64_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f64 =3D float64_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 =3D float64_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + default: + g_assert_not_reached(); + } + ns_elapsed +=3D get_clock() - t0; + n_completed_ops +=3D OPS_PER_ITER; + } +} + +#define GEN_BENCH(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, false); \ + } + +#define GEN_BENCH_NO_NEG(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, true); \ + } + +#define GEN_BENCH_ALL_TYPES(opname, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _float, float, PREC_SINGLE, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _double, double, PREC_DOUBLE, op, n_ops)= \ + GEN_BENCH(bench_ ## opname ## _float32, float32, PREC_FLOAT32, op, n_o= ps) \ + GEN_BENCH(bench_ ## opname ## _float64, float64, PREC_FLOAT64, op, n_o= ps) + +GEN_BENCH_ALL_TYPES(add, OP_ADD, 2) +GEN_BENCH_ALL_TYPES(sub, OP_SUB, 2) +GEN_BENCH_ALL_TYPES(mul, OP_MUL, 2) +GEN_BENCH_ALL_TYPES(div, OP_DIV, 2)
+GEN_BENCH_ALL_TYPES(fma, OP_FMA, 3) +GEN_BENCH_ALL_TYPES(cmp, OP_CMP, 2) +#undef GEN_BENCH_ALL_TYPES + +#define GEN_BENCH_ALL_TYPES_NO_NEG(name, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float, float, PREC_SINGLE, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _double, double, PREC_DOUBLE, op, n= ) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float32, float32, PREC_FLOAT32, op= , n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float64, float64, PREC_FLOAT64, op= , n) + +GEN_BENCH_ALL_TYPES_NO_NEG(sqrt, OP_SQRT, 1) +#undef GEN_BENCH_ALL_TYPES_NO_NEG + +#undef GEN_BENCH_NO_NEG +#undef GEN_BENCH + +#define GEN_BENCH_FUNCS(opname, op) \ + [op] =3D { \ + [PREC_SINGLE] =3D bench_ ## opname ## _float, \ + [PREC_DOUBLE] =3D bench_ ## opname ## _double, \ + [PREC_FLOAT32] =3D bench_ ## opname ## _float32, \ + [PREC_FLOAT64] =3D bench_ ## opname ## _float64, \ + } + +static const bench_func_t bench_funcs[OP_MAX_NR][PREC_MAX_NR] =3D { + GEN_BENCH_FUNCS(add, OP_ADD), + GEN_BENCH_FUNCS(sub, OP_SUB), + GEN_BENCH_FUNCS(mul, OP_MUL), + GEN_BENCH_FUNCS(div, OP_DIV), + GEN_BENCH_FUNCS(fma, OP_FMA), + GEN_BENCH_FUNCS(sqrt, OP_SQRT), + GEN_BENCH_FUNCS(cmp, OP_CMP), +}; + +#undef GEN_BENCH_FUNCS + +static void run_bench(void) +{ + bench_func_t f; + + f =3D bench_funcs[operation][precision]; + g_assert(f); + f(); +} + +/* @arr must be NULL-terminated */ +static int find_name(const char * const *arr, const char *name) +{ + int i; + + for (i =3D 0; arr[i] !=3D NULL; i++) { + if (strcmp(name, arr[i]) =3D=3D 0) { + return i; + } + } + return -1; +} + +static void usage_complete(int argc, char *argv[]) +{ + gchar *op_list =3D g_strjoinv(", ", (gchar **)op_names); + gchar *tester_list =3D g_strjoinv(", ", (gchar **)tester_names); + + fprintf(stderr, "Usage: %s [options]\n", argv[0]); + fprintf(stderr, "options:\n"); + fprintf(stderr, " -d =3D duration, in seconds. 
Default: %d\n", + DEFAULT_DURATION_SECS); + fprintf(stderr, " -h =3D show this help message.\n"); + fprintf(stderr, " -o =3D floating point operation (%s). Default: %s\n", + op_list, op_names[0]); + fprintf(stderr, " -p =3D floating point precision (single, double). " + "Default: single\n"); + fprintf(stderr, " -r =3D rounding mode (even, zero, down, up, tieaway)= . " + "Default: even\n"); + fprintf(stderr, " -t =3D tester (%s). Default: %s\n", + tester_list, tester_names[0]); + fprintf(stderr, " -z =3D flush inputs to zero (soft tester only). " + "Default: disabled\n"); + fprintf(stderr, " -Z =3D flush output to zero (soft tester only). " + "Default: disabled\n"); + + g_free(tester_list); + g_free(op_list); +} + +static int round_name_to_mode(const char *name) +{ + int i; + + for (i =3D 0; i < N_ROUND_MODES; i++) { + if (!strcmp(round_names[i], name)) { + return i; + } + } + return -1; +} + +static void QEMU_NORETURN die_host_rounding(enum rounding rounding) +{ + fprintf(stderr, "fatal: '%s' rounding not supported on this host\n", + round_names[rounding]); + exit(EXIT_FAILURE); +} + +static void set_host_precision(enum rounding rounding) +{ + int rhost; + + switch (rounding) { + case ROUND_EVEN: + rhost =3D FE_TONEAREST; + break; + case ROUND_ZERO: + rhost =3D FE_TOWARDZERO; + break; + case ROUND_DOWN: + rhost =3D FE_DOWNWARD; + break; + case ROUND_UP: + rhost =3D FE_UPWARD; + break; + case ROUND_TIEAWAY: + die_host_rounding(rounding); + return; + default: + g_assert_not_reached(); + } + + if (fesetround(rhost)) { + die_host_rounding(rounding); + } +} + +static void set_soft_precision(enum rounding rounding) +{ + signed char mode; + + switch (rounding) { + case ROUND_EVEN: + mode =3D float_round_nearest_even; + break; + case ROUND_ZERO: + mode =3D float_round_to_zero; + break; + case ROUND_DOWN: + mode =3D float_round_down; + break; + case ROUND_UP: + mode =3D float_round_up; + break; + case ROUND_TIEAWAY: + mode =3D float_round_ties_away; + break; + default: + 
g_assert_not_reached(); + } + soft_status.float_rounding_mode =3D mode; +} + +static void parse_args(int argc, char *argv[]) +{ + int c; + int val; + int rounding =3D ROUND_EVEN; + + for (;;) { + c =3D getopt(argc, argv, "d:ho:p:r:t:zZ"); + if (c < 0) { + break; + } + switch (c) { + case 'd': + duration =3D atoi(optarg); + break; + case 'h': + usage_complete(argc, argv); + exit(EXIT_SUCCESS); + case 'o': + val =3D find_name(op_names, optarg); + if (val < 0) { + fprintf(stderr, "Unsupported op '%s'\n", optarg); + exit(EXIT_FAILURE); + } + operation =3D val; + break; + case 'p': + if (!strcmp(optarg, "single")) { + precision =3D PREC_SINGLE; + } else if (!strcmp(optarg, "double")) { + precision =3D PREC_DOUBLE; + } else { + fprintf(stderr, "Unsupported precision '%s'\n", optarg); + exit(EXIT_FAILURE); + } + break; + case 'r': + rounding =3D round_name_to_mode(optarg); + if (rounding < 0) { + fprintf(stderr, "fatal: invalid rounding mode '%s'\n", opt= arg); + exit(EXIT_FAILURE); + } + break; + case 't': + val =3D find_name(tester_names, optarg); + if (val < 0) { + fprintf(stderr, "Unsupported tester '%s'\n", optarg); + exit(EXIT_FAILURE); + } + tester =3D val; + break; + case 'z': + soft_status.flush_inputs_to_zero =3D 1; + break; + case 'Z': + soft_status.flush_to_zero =3D 1; + break; + } + } + + /* set precision and rounding mode based on the tester */ + switch (tester) { + case TESTER_HOST: + set_host_precision(rounding); + break; + case TESTER_SOFT: + set_soft_precision(rounding); + switch (precision) { + case PREC_SINGLE: + precision =3D PREC_FLOAT32; + break; + case PREC_DOUBLE: + precision =3D PREC_FLOAT64; + break; + default: + g_assert_not_reached(); + } + break; + default: + g_assert_not_reached(); + } +} + +static void pr_stats(void) +{ + printf("%.2f MFlops\n", (double)n_completed_ops / ns_elapsed * 1e3); +} + +int main(int argc, char *argv[]) +{ + parse_args(argc, argv); + run_bench(); + pr_stats(); + return 0; +} diff --git a/tests/fp/.gitignore 
b/tests/fp/.gitignore index 8d45d18ac4..704fd42992 100644 --- a/tests/fp/.gitignore +++ b/tests/fp/.gitignore @@ -1 +1,2 @@ fp-test +fp-bench diff --git a/tests/fp/Makefile b/tests/fp/Makefile index 49cdcd1bd2..5019dcdca0 100644 --- a/tests/fp/Makefile +++ b/tests/fp/Makefile @@ -553,7 +553,7 @@ TF_OBJS_LIB +=3D $(TF_OBJS_WRITECASE) TF_OBJS_LIB +=3D testLoops_common.o TF_OBJS_LIB +=3D $(TF_OBJS_TEST) =20 -BINARIES :=3D fp-test$(EXESUF) +BINARIES :=3D fp-test$(EXESUF) fp-bench$(EXESUF) =20 # everything depends on config-host.h because platform.h includes it all: $(BUILD_DIR)/config-host.h @@ -590,10 +590,13 @@ $(TF_OBJS_LIB) slowfloat.o: %.o: $(TF_SOURCE_DIR)/%.c =20 libtestfloat.a: $(TF_OBJS_LIB) =20 +fp-bench$(EXESUF): fp-bench.o $(QEMU_SOFTFLOAT_OBJ) $(LIBQEMUUTIL) + clean: rm -f *.o *.d $(BINARIES) rm -f *.gcno *.gcda *.gcov rm -f fp-test$(EXESUF) + rm -f fp-bench$(EXESUF) rm -f libsoftfloat.a rm -f libtestfloat.a =20 --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104476385431.50920847388977; Sat, 24 Nov 2018 16:07:56 -0800 (PST) Received: from localhost ([::1]:58200 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhxq-0004aw-Uf for importer@patchew.org; Sat, 24 Nov 2018 19:07:54 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57609) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmj-0003R0-As for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:27 -0500 Received: from 
Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhme-0006eH-Cl for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:25 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:57461) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhme-0005Fn-4o for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:20 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id E994ED08; Sat, 24 Nov 2018 18:56:02 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:03 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id EAF40102F5; Sat, 24 Nov 2018 18:56:01 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=/xy6XW95ItzZ4xANi9dcT7DlqjcISm53ISMeDAabibE=; b=eO+dp BzVPcubWryORw1VQcILXL6XWw1jgt33fc0JV7nuFAQSHvUgkTfNLywDUAdvny9G5 OK9RzRXEHnjiu/NVvEKdWdI+W1mb0OzlQjho2KHYutqhkorDorKj0r6zqT34t18a WXdJU2D+CeI+BKVfcSkHqAknQQ3izanA8Nwn18= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=/xy6XW95ItzZ4xANi9dcT7DlqjcIS m53ISMeDAabibE=; b=I9Y2bpy8xNyzZ/x7DBcfsB3449FidKul1PGYpjvABv36p mAFT3oaHhvI/vdQUqvvUK2Bpz3FLaOwo4mJumJnt5rWvqN9b1sDsawUmDO/JdvVE vvGI2AnQbPX1BU0uuq7HRaRr+xOh2VI8OUPdn6Ft4/CvLqlqteE2hGipuNu/E2yT 4iPiLoNjE9SBa00hixHGlWdBpiyvMtnmaiEN1vp340zP3eOb/PqKRA+7qjFwpPUt 7lwtTsxPNepj2zQOUMZO3SZeldA8KK9jSbDmLj4j/mub/Mx4VYG0ZBfDFgzqHUQO /LAoo/lPlureAxM7KpeXsVzuacs3Yi0miDsJMK3Mw== X-ME-Sender: X-ME-Proxy: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:47 -0500 Message-Id: <20181124235553.17371-8-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 07/13] fpu: introduce hardfloat X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This patch paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags such as -ffast-math. We control the flags so this should be easy to enforce though. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations.
This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Signed-off-by: Emilio G. Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 315 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 315 insertions(+) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index ecdc00c633..306a12fa8d 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -83,6 +83,7 @@ this code that are retained. * target-dependent and needs the TARGET_* macros. */ #include "qemu/osdep.h" +#include <math.h> #include "qemu/bitops.h" #include "fpu/softfloat.h" =20 @@ -95,6 +96,320 @@ this code that are retained. *-------------------------------------------------------------------------= ---*/ #include "fpu/softfloat-macros.h" =20 +/* + * Hardfloat + * + * Fast emulation of guest FP instructions is challenging for two reasons. + * First, FP instruction semantics are similar but not identical, particul= arly + * when handling NaNs. Second, emulating at reasonable speed the guest FP + * exception flags is not trivial: reading the host's flags register with a + * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp= ], + * and trapping on every FP exception is neither fast nor pleasant to work with. + * + * We address these challenges by leveraging the host FPU for a subset of = the + * operations. To do this we expand on the idea presented in this paper: + * + * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in= a + * binary translator." Software: Practice and Experience 46.12 (2016):1591= -1615. + * + * The idea is thus to leverage the host FPU to (1) compute FP operations + * and (2) identify whether FP exceptions occurred while avoiding + * expensive exception flag register accesses.
+ * + * An important optimization shown in the paper is that given that excepti= on + * flags are rarely cleared by the guest, we can avoid recomputing some fl= ags. + * This is particularly useful for the inexact flag, which is very frequen= tly + * raised in floating-point workloads. + * + * We optimize the code further by deferring to soft-fp whenever FP except= ion + * detection might get hairy. Two examples: (1) when at least one operand = is + * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 r= esult + * and the result is < the minimum normal. + */ +#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t) \ + static inline void name(soft_t *a, float_status *s) \ + { \ + if (unlikely(soft_t ## _is_denormal(*a))) { \ + *a =3D soft_t ## _set_sign(soft_t ## _zero, \ + soft_t ## _is_neg(*a)); \ + s->float_exception_flags |=3D float_flag_input_denormal; \ + } \ + } + +GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32) +GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64) +#undef GEN_INPUT_FLUSH__NOCHECK + +#define GEN_INPUT_FLUSH1(name, soft_t) \ + static inline void name(soft_t *a, float_status *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## _input_flush__nocheck(a, s); \ + } + +GEN_INPUT_FLUSH1(float32_input_flush1, float32) +GEN_INPUT_FLUSH1(float64_input_flush1, float64) +#undef GEN_INPUT_FLUSH1 + +#define GEN_INPUT_FLUSH2(name, soft_t) \ + static inline void name(soft_t *a, soft_t *b, float_status *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## _input_flush__nocheck(a, s); \ + soft_t ## _input_flush__nocheck(b, s); \ + } + +GEN_INPUT_FLUSH2(float32_input_flush2, float32) +GEN_INPUT_FLUSH2(float64_input_flush2, float64) +#undef GEN_INPUT_FLUSH2 + +#define GEN_INPUT_FLUSH3(name, soft_t) \ + static inline void name(soft_t *a, soft_t *b, soft_t *c, float_status = *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## 
_input_flush__nocheck(a, s); \ + soft_t ## _input_flush__nocheck(b, s); \ + soft_t ## _input_flush__nocheck(c, s); \ + } + +GEN_INPUT_FLUSH3(float32_input_flush3, float32) +GEN_INPUT_FLUSH3(float64_input_flush3, float64) +#undef GEN_INPUT_FLUSH3 + +/* + * Choose whether to use fpclassify or float32/64_* primitives in the gene= rated + * hardfloat functions. Each combination of number of inputs and float size + * gets its own value. + */ +#if defined(__x86_64__) +# define QEMU_HARDFLOAT_1F32_USE_FP 0 +# define QEMU_HARDFLOAT_1F64_USE_FP 1 +# define QEMU_HARDFLOAT_2F32_USE_FP 0 +# define QEMU_HARDFLOAT_2F64_USE_FP 1 +# define QEMU_HARDFLOAT_3F32_USE_FP 0 +# define QEMU_HARDFLOAT_3F64_USE_FP 1 +#else +# define QEMU_HARDFLOAT_1F32_USE_FP 0 +# define QEMU_HARDFLOAT_1F64_USE_FP 0 +# define QEMU_HARDFLOAT_2F32_USE_FP 0 +# define QEMU_HARDFLOAT_2F64_USE_FP 0 +# define QEMU_HARDFLOAT_3F32_USE_FP 0 +# define QEMU_HARDFLOAT_3F64_USE_FP 0 +#endif + +/* + * QEMU_HARDFLOAT_USE_ISINF chooses whether to use isinf() over + * float{32,64}_is_infinity when !USE_FP. + * On x86_64/aarch64, using the former over the latter can yield a ~6% spe= edup. + * On power64 however, using isinf() reduces fp-bench performance by up to= 50%. + */ +#if defined(__x86_64__) || defined(__aarch64__) +# define QEMU_HARDFLOAT_USE_ISINF 1 +#else +# define QEMU_HARDFLOAT_USE_ISINF 0 +#endif + +/* + * Some targets clear the FP flags before most FP operations. This prevents + * the use of hardfloat, since hardfloat relies on the inexact flag being + * already set. 
+ */ +#if defined(TARGET_PPC) +# define QEMU_NO_HARDFLOAT 1 +# define QEMU_SOFTFLOAT_ATTR QEMU_FLATTEN +#else +# define QEMU_NO_HARDFLOAT 0 +# define QEMU_SOFTFLOAT_ATTR QEMU_FLATTEN __attribute__((noinline)) +#endif + +static inline bool can_use_fpu(const float_status *s) +{ + if (QEMU_NO_HARDFLOAT) { + return false; + } + return likely(s->float_exception_flags & float_flag_inexact && + s->float_rounding_mode =3D=3D float_round_nearest_even); +} + +/* + * Hardfloat generation functions. Each operation can have two flavors: + * either using softfloat primitives (e.g. float32_is_zero_or_normal) for + * most condition checks, or native ones (e.g. fpclassify). + * + * The flavor is chosen by the callers. Instead of using macros, we rely o= n the + * compiler to propagate constants and inline everything into the callers. + * + * We only generate functions for operations with two inputs, since only + * these are common enough to justify consolidating them into common code. + */ + +typedef union { + float32 s; + float h; +} union_float32; + +typedef union { + float64 s; + double h; +} union_float64; + +typedef bool (*f32_check_fn)(union_float32 a, union_float32 b); +typedef bool (*f64_check_fn)(union_float64 a, union_float64 b); + +typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, float_status *s); +typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, float_status *s); +typedef float (*hard_f32_op2_fn)(float a, float b); +typedef double (*hard_f64_op2_fn)(double a, double b); + +/* 2-input is-zero-or-normal */ +static inline bool f32_is_zon2(union_float32 a, union_float32 b) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + /* + * Not using a temp variable for consecutive fpclassify calls ends= up + * generating faster code. 
+ */ + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + (fpclassify(b.h) =3D=3D FP_NORMAL || fpclassify(b.h) =3D=3D= FP_ZERO); + } + return float32_is_zero_or_normal(a.s) && + float32_is_zero_or_normal(b.s); +} + +static inline bool f64_is_zon2(union_float64 a, union_float64 b) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + (fpclassify(b.h) =3D=3D FP_NORMAL || fpclassify(b.h) =3D=3D= FP_ZERO); + } + return float64_is_zero_or_normal(a.s) && + float64_is_zero_or_normal(b.s); +} + +/* 3-input is-zero-or-normal */ +static inline +bool f32_is_zon3(union_float32 a, union_float32 b, union_float32 c) +{ + if (QEMU_HARDFLOAT_3F32_USE_FP) { + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + (fpclassify(b.h) =3D=3D FP_NORMAL || fpclassify(b.h) =3D=3D= FP_ZERO) && + (fpclassify(c.h) =3D=3D FP_NORMAL || fpclassify(c.h) =3D=3D= FP_ZERO); + } + return float32_is_zero_or_normal(a.s) && + float32_is_zero_or_normal(b.s) && + float32_is_zero_or_normal(c.s); +} + +static inline +bool f64_is_zon3(union_float64 a, union_float64 b, union_float64 c) +{ + if (QEMU_HARDFLOAT_3F64_USE_FP) { + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + (fpclassify(b.h) =3D=3D FP_NORMAL || fpclassify(b.h) =3D=3D= FP_ZERO) && + (fpclassify(c.h) =3D=3D FP_NORMAL || fpclassify(c.h) =3D=3D= FP_ZERO); + } + return float64_is_zero_or_normal(a.s) && + float64_is_zero_or_normal(b.s) && + float64_is_zero_or_normal(c.s); +} + +static inline bool f32_is_inf(union_float32 a) +{ + if (QEMU_HARDFLOAT_USE_ISINF) { + return isinff(a.h); + } + return float32_is_infinity(a.s); +} + +static inline bool f64_is_inf(union_float64 a) +{ + if (QEMU_HARDFLOAT_USE_ISINF) { + return isinf(a.h); + } + return float64_is_infinity(a.s); +} + +/* Note: @fast_test and @post can be NULL */ +static inline float32 +float32_gen2(float32 xa, float32 xb, float_status *s, 
+ hard_f32_op2_fn hard, soft_f32_op2_fn soft, + f32_check_fn pre, f32_check_fn post, + f32_check_fn fast_test, soft_f32_op2_fn fast_op) +{ + union_float32 ua, ub, ur; + + ua.s =3D xa; + ub.s =3D xb; + + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + + float32_input_flush2(&ua.s, &ub.s, s); + if (unlikely(!pre(ua, ub))) { + goto soft; + } + if (fast_test && fast_test(ua, ub)) { + return fast_op(ua.s, ub.s, s); + } + + ur.h =3D hard(ua.h, ub.h); + if (unlikely(f32_is_inf(ur))) { + s->float_exception_flags |=3D float_flag_overflow; + } else if (unlikely(fabsf(ur.h) <=3D FLT_MIN)) { + if (post =3D=3D NULL || post(ua, ub)) { + goto soft; + } + } + return ur.s; + + soft: + return soft(ua.s, ub.s, s); +} + +static inline float64 +float64_gen2(float64 xa, float64 xb, float_status *s, + hard_f64_op2_fn hard, soft_f64_op2_fn soft, + f64_check_fn pre, f64_check_fn post, + f64_check_fn fast_test, soft_f64_op2_fn fast_op) +{ + union_float64 ua, ub, ur; + + ua.s =3D xa; + ub.s =3D xb; + + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + + float64_input_flush2(&ua.s, &ub.s, s); + if (unlikely(!pre(ua, ub))) { + goto soft; + } + if (fast_test && fast_test(ua, ub)) { + return fast_op(ua.s, ub.s, s); + } + + ur.h =3D hard(ua.h, ub.h); + if (unlikely(f64_is_inf(ur))) { + s->float_exception_flags |=3D float_flag_overflow; + } else if (unlikely(fabs(ur.h) <=3D DBL_MIN)) { + if (post =3D=3D NULL || post(ua, ub)) { + goto soft; + } + } + return ur.s; + + soft: + return soft(ua.s, ub.s, s); +} + /*------------------------------------------------------------------------= ---- | Returns the fraction bits of the half-precision floating-point value `a'. 
*-------------------------------------------------------------------------= ---*/ --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104407776207.55248766300667; Sat, 24 Nov 2018 16:06:47 -0800 (PST) Received: from localhost ([::1]:58197 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhwa-0003yy-TJ for importer@patchew.org; Sat, 24 Nov 2018 19:06:36 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57610) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmj-0003R1-Ay for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhme-0006e7-AO for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:25 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:41825) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhme-0005Fk-2M for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:20 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id D5D52D07; Sat, 24 Nov 2018 18:56:02 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:03 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 24348102F6; Sat, 24 Nov 2018 18:56:02 -0500 (EST) 
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Alex Bennée
Date: Sat, 24 Nov 2018 18:55:48 -0500
Message-Id: <20181124235553.17371-9-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 08/13] hardfloat: implement float32/64 addition and subtraction
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results (single and double precision) for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
add-single: 135.07 MFlops
add-double: 131.60 MFlops
sub-single: 130.04 MFlops
sub-double: 133.01 MFlops
- after:
add-single: 443.04 MFlops
add-double: 301.95 MFlops
sub-single: 411.36 MFlops
sub-double: 293.15 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
add-single: 44.79 MFlops
add-double: 49.20 MFlops
sub-single: 44.55 MFlops
sub-double: 49.06 MFlops
- after:
add-single: 93.28 MFlops
add-double: 88.27 MFlops
sub-single: 91.47 MFlops
sub-double: 88.27 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
add-single: 72.59 MFlops
add-double: 72.27 MFlops
sub-single: 75.33 MFlops
sub-double: 70.54 MFlops
- after:
add-single: 112.95 MFlops
add-double: 201.11 MFlops
sub-single: 116.80 MFlops
sub-double: 188.72 MFlops

Note that the IBM and ARM machines benefit from having
HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance
can suffer significantly:
- IBM Power8:
add-single: [1] 54.94 vs [0] 116.37 MFlops
add-double: [1] 58.92 vs [0] 201.44 MFlops
- Aarch64 A57:
add-single: [1] 80.72 vs [0] 93.24 MFlops
add-double: [1] 82.10 vs [0] 88.18 MFlops

On the Intel machine, having 2F64 set to 1 pays off, but it doesn't
for 2F32:
- Intel i7-6700K:
add-single: [1] 285.79 vs [0] 426.70 MFlops
add-double: [1] 302.15 vs [0] 278.82 MFlops

Signed-off-by: Emilio G.
Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 117 ++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 98 insertions(+), 19 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 306a12fa8d..cc500b1618 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1050,49 +1050,128 @@ float16 QEMU_FLATTEN float16_add(float16 a, float1= 6 b, float_status *status) return float16_round_pack_canonical(pr, status); } =20 -float32 QEMU_FLATTEN float32_add(float32 a, float32 b, float_status *statu= s) +float16 QEMU_FLATTEN float16_sub(float16 a, float16 b, float_status *statu= s) +{ + FloatParts pa =3D float16_unpack_canonical(a, status); + FloatParts pb =3D float16_unpack_canonical(b, status); + FloatParts pr =3D addsub_floats(pa, pb, true, status); + + return float16_round_pack_canonical(pr, status); +} + +static float32 QEMU_SOFTFLOAT_ATTR +soft_f32_addsub(float32 a, float32 b, bool subtract, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); - FloatParts pr =3D addsub_floats(pa, pb, false, status); + FloatParts pr =3D addsub_floats(pa, pb, subtract, status); =20 return float32_round_pack_canonical(pr, status); } =20 -float64 QEMU_FLATTEN float64_add(float64 a, float64 b, float_status *statu= s) +static inline float32 soft_f32_add(float32 a, float32 b, float_status *sta= tus) +{ + return soft_f32_addsub(a, b, false, status); +} + +static inline float32 soft_f32_sub(float32 a, float32 b, float_status *sta= tus) +{ + return soft_f32_addsub(a, b, true, status); +} + +static float64 QEMU_SOFTFLOAT_ATTR +soft_f64_addsub(float64 a, float64 b, bool subtract, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); - FloatParts pr =3D addsub_floats(pa, pb, false, status); + FloatParts pr =3D addsub_floats(pa, pb, subtract, status); =20 return float64_round_pack_canonical(pr, status); } =20 
-float16 QEMU_FLATTEN float16_sub(float16 a, float16 b, float_status *statu= s) +static inline float64 soft_f64_add(float64 a, float64 b, float_status *sta= tus) { - FloatParts pa =3D float16_unpack_canonical(a, status); - FloatParts pb =3D float16_unpack_canonical(b, status); - FloatParts pr =3D addsub_floats(pa, pb, true, status); + return soft_f64_addsub(a, b, false, status); +} =20 - return float16_round_pack_canonical(pr, status); +static inline float64 soft_f64_sub(float64 a, float64 b, float_status *sta= tus) +{ + return soft_f64_addsub(a, b, true, status); } =20 -float32 QEMU_FLATTEN float32_sub(float32 a, float32 b, float_status *statu= s) +static float hard_f32_add(float a, float b) { - FloatParts pa =3D float32_unpack_canonical(a, status); - FloatParts pb =3D float32_unpack_canonical(b, status); - FloatParts pr =3D addsub_floats(pa, pb, true, status); + return a + b; +} =20 - return float32_round_pack_canonical(pr, status); +static float hard_f32_sub(float a, float b) +{ + return a - b; } =20 -float64 QEMU_FLATTEN float64_sub(float64 a, float64 b, float_status *statu= s) +static double hard_f64_add(double a, double b) { - FloatParts pa =3D float64_unpack_canonical(a, status); - FloatParts pb =3D float64_unpack_canonical(b, status); - FloatParts pr =3D addsub_floats(pa, pb, true, status); + return a + b; +} =20 - return float64_round_pack_canonical(pr, status); +static double hard_f64_sub(double a, double b) +{ + return a - b; +} + +static bool f32_addsub_post(union_float32 a, union_float32 b) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return !(fpclassify(a.h) =3D=3D FP_ZERO && fpclassify(b.h) =3D=3D = FP_ZERO); + } + return !(float32_is_zero(a.s) && float32_is_zero(b.s)); +} + +static bool f64_addsub_post(union_float64 a, union_float64 b) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return !(fpclassify(a.h) =3D=3D FP_ZERO && fpclassify(b.h) =3D=3D = FP_ZERO); + } else { + return !(float64_is_zero(a.s) && float64_is_zero(b.s)); + } +} + +static float32 
float32_addsub(float32 a, float32 b, float_status *s, + hard_f32_op2_fn hard, soft_f32_op2_fn soft) +{ + return float32_gen2(a, b, s, hard, soft, + f32_is_zon2, f32_addsub_post, NULL, NULL); +} + +static float64 float64_addsub(float64 a, float64 b, float_status *s, + hard_f64_op2_fn hard, soft_f64_op2_fn soft) +{ + return float64_gen2(a, b, s, hard, soft, + f64_is_zon2, f64_addsub_post, NULL, NULL); +} + +float32 QEMU_FLATTEN +float32_add(float32 a, float32 b, float_status *s) +{ + return float32_addsub(a, b, s, hard_f32_add, soft_f32_add); +} + +float32 QEMU_FLATTEN +float32_sub(float32 a, float32 b, float_status *s) +{ + return float32_addsub(a, b, s, hard_f32_sub, soft_f32_sub); +} + +float64 QEMU_FLATTEN +float64_add(float64 a, float64 b, float_status *s) +{ + return float64_addsub(a, b, s, hard_f64_add, soft_f64_add); +} + +float64 QEMU_FLATTEN +float64_sub(float64 a, float64 b, float_status *s) +{ + return float64_addsub(a, b, s, hard_f64_sub, soft_f64_sub); } =20 /* --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104289598786.1743391145204; Sat, 24 Nov 2018 16:04:49 -0800 (PST) Received: from localhost ([::1]:58182 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhuf-0002b1-Qx for importer@patchew.org; Sat, 24 Nov 2018 19:04:37 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57457) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmV-0003G2-So for 
qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Alex Bennée
Date: Sat, 24 Nov 2018 18:55:49 -0500
Message-Id: <20181124235553.17371-10-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 09/13] hardfloat: implement float32/64 multiplication
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
mul-single: 126.91 MFlops
mul-double: 118.28 MFlops
- after:
mul-single: 258.02 MFlops
mul-double: 197.96 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
mul-single: 37.42 MFlops
mul-double: 38.77 MFlops
- after:
mul-single: 73.41 MFlops
mul-double: 76.93 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
mul-single: 58.40 MFlops
mul-double: 59.33 MFlops
- after:
mul-single: 60.25 MFlops
mul-double: 94.79 MFlops

Signed-off-by: Emilio G.
Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index cc500b1618..58e67d9b80 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1232,7 +1232,8 @@ float16 QEMU_FLATTEN float16_mul(float16 a, float16 b= , float_status *status) return float16_round_pack_canonical(pr, status); } =20 -float32 QEMU_FLATTEN float32_mul(float32 a, float32 b, float_status *statu= s) +static float32 QEMU_SOFTFLOAT_ATTR +soft_f32_mul(float32 a, float32 b, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1241,7 +1242,8 @@ float32 QEMU_FLATTEN float32_mul(float32 a, float32 b= , float_status *status) return float32_round_pack_canonical(pr, status); } =20 -float64 QEMU_FLATTEN float64_mul(float64 a, float64 b, float_status *statu= s) +static float64 QEMU_SOFTFLOAT_ATTR +soft_f64_mul(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1250,6 +1252,54 @@ float64 QEMU_FLATTEN float64_mul(float64 a, float64 = b, float_status *status) return float64_round_pack_canonical(pr, status); } =20 +static float hard_f32_mul(float a, float b) +{ + return a * b; +} + +static double hard_f64_mul(double a, double b) +{ + return a * b; +} + +static bool f32_mul_fast_test(union_float32 a, union_float32 b) +{ + return float32_is_zero(a.s) || float32_is_zero(b.s); +} + +static bool f64_mul_fast_test(union_float64 a, union_float64 b) +{ + return float64_is_zero(a.s) || float64_is_zero(b.s); +} + +static float32 f32_mul_fast_op(float32 a, float32 b, float_status *s) +{ + bool signbit =3D float32_is_neg(a) ^ float32_is_neg(b); + + return float32_set_sign(float32_zero, signbit); +} + +static float64 f64_mul_fast_op(float64 a, float64 b, float_status *s) +{ + bool 
signbit =3D float64_is_neg(a) ^ float64_is_neg(b); + + return float64_set_sign(float64_zero, signbit); +} + +float32 QEMU_FLATTEN +float32_mul(float32 a, float32 b, float_status *s) +{ + return float32_gen2(a, b, s, hard_f32_mul, soft_f32_mul, + f32_is_zon2, NULL, f32_mul_fast_test, f32_mul_fast= _op); +} + +float64 QEMU_FLATTEN +float64_mul(float64 a, float64 b, float_status *s) +{ + return float64_gen2(a, b, s, hard_f64_mul, soft_f64_mul, + f64_is_zon2, NULL, f64_mul_fast_test, f64_mul_fast= _op); +} + /* * Returns the result of multiplying the floating-point values `a' and * `b' then adding 'c', with no intermediate rounding step after the --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104080868564.8042217530095; Sat, 24 Nov 2018 16:01:20 -0800 (PST) Received: from localhost ([::1]:58167 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhrT-0007mw-PD for importer@patchew.org; Sat, 24 Nov 2018 19:01:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57527) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmX-0003GO-CJ for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmV-0005dx-Qt for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:40997) by eggs.gnu.org with esmtps 
(TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmU-0005GW-3j for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:10 -0500
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Alex Bennée
Date: Sat, 24 Nov 2018 18:55:50 -0500
Message-Id: <20181124235553.17371-11-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 10/13] hardfloat: implement float32/64 division
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
div-single: 34.84 MFlops
div-double: 34.04 MFlops
- after:
div-single: 275.23 MFlops
div-double: 216.38 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
div-single: 9.33 MFlops
div-double: 9.30 MFlops
- after:
div-single: 51.55 MFlops
div-double: 15.09 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
div-single: 25.65 MFlops
div-double: 24.91 MFlops
- after:
div-single: 96.83 MFlops
div-double: 31.01 MFlops

Here setting 2F64_USE_FP to 1 pays off for x86_64:
[1] 215.97 vs [0] 62.15 MFlops

Signed-off-by: Emilio G.
Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 62 insertions(+), 2 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 58e67d9b80..e35ebfaae7 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1624,7 +1624,8 @@ float16 float16_div(float16 a, float16 b, float_statu= s *status) return float16_round_pack_canonical(pr, status); } =20 -float32 float32_div(float32 a, float32 b, float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_f32_div(float32 a, float32 b, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1633,7 +1634,8 @@ float32 float32_div(float32 a, float32 b, float_statu= s *status) return float32_round_pack_canonical(pr, status); } =20 -float64 float64_div(float64 a, float64 b, float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_f64_div(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1642,6 +1644,64 @@ float64 float64_div(float64 a, float64 b, float_stat= us *status) return float64_round_pack_canonical(pr, status); } =20 +static float hard_f32_div(float a, float b) +{ + return a / b; +} + +static double hard_f64_div(double a, double b) +{ + return a / b; +} + +static bool f32_div_pre(union_float32 a, union_float32 b) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + fpclassify(b.h) =3D=3D FP_NORMAL; + } + return float32_is_zero_or_normal(a.s) && float32_is_normal(b.s); +} + +static bool f64_div_pre(union_float64 a, union_float64 b) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return (fpclassify(a.h) =3D=3D FP_NORMAL || fpclassify(a.h) =3D=3D= FP_ZERO) && + fpclassify(b.h) =3D=3D FP_NORMAL; + } + return float64_is_zero_or_normal(a.s) && float64_is_normal(b.s); +} + +static bool 
f32_div_post(union_float32 a, union_float32 b) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return fpclassify(a.h) !=3D FP_ZERO; + } + return !float32_is_zero(a.s); +} + +static bool f64_div_post(union_float64 a, union_float64 b) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return fpclassify(a.h) !=3D FP_ZERO; + } + return !float64_is_zero(a.s); +} + +float32 QEMU_FLATTEN +float32_div(float32 a, float32 b, float_status *s) +{ + return float32_gen2(a, b, s, hard_f32_div, soft_f32_div, + f32_div_pre, f32_div_post, NULL, NULL); +} + +float64 QEMU_FLATTEN +float64_div(float64 a, float64 b, float_status *s) +{ + return float64_gen2(a, b, s, hard_f64_div, soft_f64_div, + f64_div_pre, f64_div_post, NULL, NULL); +} + /* * Float to Float conversions * --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104290558865.3770739521954; Sat, 24 Nov 2018 16:04:50 -0800 (PST) Received: from localhost ([::1]:58183 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhuh-0002bz-FE for importer@patchew.org; Sat, 24 Nov 2018 19:04:39 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57523) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmX-0003GN-9Z for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhmV-0005de-Px for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:13 -0500 Received: from 
wout2-smtp.messagingengine.com ([64.147.123.25]:34511) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhmU-0005Gt-31 for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:10 -0500
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Alex Bennée
Date: Sat, 24 Nov 2018 18:55:51 -0500
Message-Id: <20181124235553.17371-12-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 11/13] hardfloat: implement float32/64 fused multiply-add
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
fma-single: 74.73 MFlops
fma-double: 74.54 MFlops
- after:
fma-single: 203.37 MFlops
fma-double: 169.37 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
fma-single: 23.24 MFlops
fma-double: 23.70 MFlops
- after:
fma-single: 66.14 MFlops
fma-double: 63.10 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
fma-single: 37.26 MFlops
fma-double: 37.29 MFlops
- after:
fma-single: 48.90 MFlops
fma-double: 59.51 MFlops

Here having 3F64 set to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops

Signed-off-by: Emilio G.
Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 128 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index e35ebfaae7..e03feafb6f 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1514,8 +1514,9 @@ float16 QEMU_FLATTEN float16_muladd(float16 a, float1= 6 b, float16 c, return float16_round_pack_canonical(pr, status); } =20 -float32 QEMU_FLATTEN float32_muladd(float32 a, float32 b, float32 c, - int flags, float_status *s= tatus) +static float32 QEMU_SOFTFLOAT_ATTR +soft_f32_muladd(float32 a, float32 b, float32 c, int flags, + float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1525,8 +1526,9 @@ float32 QEMU_FLATTEN float32_muladd(float32 a, float3= 2 b, float32 c, return float32_round_pack_canonical(pr, status); } =20 -float64 QEMU_FLATTEN float64_muladd(float64 a, float64 b, float64 c, - int flags, float_status *s= tatus) +static float64 QEMU_SOFTFLOAT_ATTR +soft_f64_muladd(float64 a, float64 b, float64 c, int flags, + float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1536,6 +1538,128 @@ float64 QEMU_FLATTEN float64_muladd(float64 a, floa= t64 b, float64 c, return float64_round_pack_canonical(pr, status); } =20 +float32 QEMU_FLATTEN +float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status= *s) +{ + union_float32 ua, ub, uc, ur; + + ua.s =3D xa; + ub.s =3D xb; + uc.s =3D xc; + + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + if (unlikely(flags & float_muladd_halve_result)) { + goto soft; + } + + float32_input_flush3(&ua.s, &ub.s, &uc.s, s); + if (unlikely(!f32_is_zon3(ua, ub, uc))) { + goto soft; + } + /* + * When (a || b) =3D=3D 0, there's no need to check for under/over flo= w, + * since we know the addend is (normal || 0) and the product is 0. 
+ */ + if (float32_is_zero(ua.s) || float32_is_zero(ub.s)) { + union_float32 up; + bool prod_sign; + + prod_sign =3D float32_is_neg(ua.s) ^ float32_is_neg(ub.s); + prod_sign ^=3D !!(flags & float_muladd_negate_product); + up.s =3D float32_set_sign(float32_zero, prod_sign); + + if (flags & float_muladd_negate_c) { + uc.h =3D -uc.h; + } + ur.h =3D up.h + uc.h; + } else { + if (flags & float_muladd_negate_product) { + ua.h =3D -ua.h; + } + if (flags & float_muladd_negate_c) { + uc.h =3D -uc.h; + } + + ur.h =3D fmaf(ua.h, ub.h, uc.h); + + if (unlikely(f32_is_inf(ur))) { + s->float_exception_flags |=3D float_flag_overflow; + } else if (unlikely(fabsf(ur.h) <=3D FLT_MIN)) { + goto soft; + } + } + if (flags & float_muladd_negate_result) { + return float32_chs(ur.s); + } + return ur.s; + + soft: + return soft_f32_muladd(ua.s, ub.s, uc.s, flags, s); +} + +float64 QEMU_FLATTEN +float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status= *s) +{ + union_float64 ua, ub, uc, ur; + + ua.s =3D xa; + ub.s =3D xb; + uc.s =3D xc; + + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + if (unlikely(flags & float_muladd_halve_result)) { + goto soft; + } + + float64_input_flush3(&ua.s, &ub.s, &uc.s, s); + if (unlikely(!f64_is_zon3(ua, ub, uc))) { + goto soft; + } + /* + * When (a || b) =3D=3D 0, there's no need to check for under/over flo= w, + * since we know the addend is (normal || 0) and the product is 0. 
+ */ + if (float64_is_zero(ua.s) || float64_is_zero(ub.s)) { + union_float64 up; + bool prod_sign; + + prod_sign =3D float64_is_neg(ua.s) ^ float64_is_neg(ub.s); + prod_sign ^=3D !!(flags & float_muladd_negate_product); + up.s =3D float64_set_sign(float64_zero, prod_sign); + + if (flags & float_muladd_negate_c) { + uc.h =3D -uc.h; + } + ur.h =3D up.h + uc.h; + } else { + if (flags & float_muladd_negate_product) { + ua.h =3D -ua.h; + } + if (flags & float_muladd_negate_c) { + uc.h =3D -uc.h; + } + + ur.h =3D fma(ua.h, ub.h, uc.h); + + if (unlikely(f64_is_inf(ur))) { + s->float_exception_flags |=3D float_flag_overflow; + } else if (unlikely(fabs(ur.h) <=3D FLT_MIN)) { + goto soft; + } + } + if (flags & float_muladd_negate_result) { + return float64_chs(ur.s); + } + return ur.s; + + soft: + return soft_f64_muladd(ua.s, ub.s, uc.s, flags, s); +} + /* * Returns the result of dividing the floating-point value `a' by the * corresponding value `b'. The operation is performed according to --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543103908898383.12114379240643; Sat, 24 Nov 2018 15:58:28 -0800 (PST) Received: from localhost ([::1]:58149 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhoZ-0004c7-Gw for importer@patchew.org; Sat, 24 Nov 2018 18:58:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57529) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmX-0003Gb-FT for 
qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:14 -0500
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Alex Bennée
Date: Sat, 24 Nov 2018 18:55:52 -0500
Message-Id: <20181124235553.17371-13-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 12/13] hardfloat: implement float32/64 square root
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 42.30 MFlops
sqrt-double: 22.97 MFlops
- after:
sqrt-single: 311.42 MFlops
sqrt-double: 311.08 MFlops

Here USE_FP makes a huge difference for f64's, with throughput
going from ~200 MFlops to ~300 MFlops.

Signed-off-by: Emilio G.
Cota Reviewed-by: Alex Benn=C3=A9e --- fpu/softfloat.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 58 insertions(+), 2 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index e03feafb6f..4c6ecd1883 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -3040,20 +3040,76 @@ float16 QEMU_FLATTEN float16_sqrt(float16 a, float_= status *status) return float16_round_pack_canonical(pr, status); } =20 -float32 QEMU_FLATTEN float32_sqrt(float32 a, float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_f32_sqrt(float32 a, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pr =3D sqrt_float(pa, status, &float32_params); return float32_round_pack_canonical(pr, status); } =20 -float64 QEMU_FLATTEN float64_sqrt(float64 a, float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_f64_sqrt(float64 a, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pr =3D sqrt_float(pa, status, &float64_params); return float64_round_pack_canonical(pr, status); } =20 +float32 QEMU_FLATTEN float32_sqrt(float32 xa, float_status *s) +{ + union_float32 ua, ur; + + ua.s =3D xa; + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + + float32_input_flush1(&ua.s, s); + if (QEMU_HARDFLOAT_1F32_USE_FP) { + if (unlikely(!(fpclassify(ua.h) =3D=3D FP_NORMAL || + fpclassify(ua.h) =3D=3D FP_ZERO) || + signbit(ua.h))) { + goto soft; + } + } else if (unlikely(!float32_is_zero_or_normal(ua.s) || + float32_is_neg(ua.s))) { + goto soft; + } + ur.h =3D sqrtf(ua.h); + return ur.s; + + soft: + return soft_f32_sqrt(ua.s, s); +} + +float64 QEMU_FLATTEN float64_sqrt(float64 xa, float_status *s) +{ + union_float64 ua, ur; + + ua.s =3D xa; + if (unlikely(!can_use_fpu(s))) { + goto soft; + } + + float64_input_flush1(&ua.s, s); + if (QEMU_HARDFLOAT_1F64_USE_FP) { + if (unlikely(!(fpclassify(ua.h) =3D=3D FP_NORMAL || + fpclassify(ua.h) =3D=3D FP_ZERO) || + signbit(ua.h))) { + goto soft; + } + } else 
if (unlikely(!float64_is_zero_or_normal(ua.s) || + float64_is_neg(ua.s))) { + goto soft; + } + ur.h =3D sqrt(ua.h); + return ur.s; + + soft: + return soft_f64_sqrt(ua.s, s); +} + /*------------------------------------------------------------------------= ---- | The pattern for a default generated NaN. *-------------------------------------------------------------------------= ---*/ --=20 2.17.1 From nobody Tue May 7 12:57:41 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1543104549499590.5041202800496; Sat, 24 Nov 2018 16:09:09 -0800 (PST) Received: from localhost ([::1]:58203 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhz2-00068P-Dj for importer@patchew.org; Sat, 24 Nov 2018 19:09:08 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57607) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQhmj-0003Qz-4t for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQhme-0006e2-A1 for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:25 -0500 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:47617) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQhme-0005OH-1n for qemu-devel@nongnu.org; Sat, 24 Nov 2018 18:56:20 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id C7C51D0F; Sat, 24 Nov 2018 18:56:03 
-0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sat, 24 Nov 2018 18:56:04 -0500 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 1D07C102F1; Sat, 24 Nov 2018 18:56:03 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= mesmtp; bh=e7ciGg3YiL1x7XCnTx7UfOknXcepmzGvg99XkPSqOPU=; b=R3CeB GuU3hrrCkWmmDS1Xmhe6R60ZaehM6l6acOQOLCaBzfOCsCDeTfx+YQ3lAH+XyhWD ssXndvrlJNUuLgeMhmU1viRDgheHfb0VGy0Zr+IYHkSugjg34UBhyAMDqvpc29U2 4I8dU5yu242XBEbouDy90o5IXz2Oz1ja8P645M= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=e7ciGg3YiL1x7XCnTx7UfOknXcepm zGvg99XkPSqOPU=; b=ZEnPOCXg2jQngh5ZXX14njeFTfLBA2xHKCoQ1X/lCiT7T 31Z4aspNfCUuKTJyxqO2yNueJjg/EmXLmk/NZZGzpss1U1vMxRwgC6intpnUP8d1 DoASRhpU4HG/AbsmuiaWF01WnuUNGjO34TE1NYuSs8XPnM/WXOklNB/lmU5DQZ/z dwLJJVkHdoAXRMZmSDIk+Yd+YOTK1nWeh89jzxixM8Tljhd8C+O0wfvDvZzuWckq zMsLeUzvHo0WazhF+mLnOrOzTWOY43o+6/QYljulE+DIVokxK/yPzpdcopUNf2CN fWkykfLwfBOWFELJu2ZrWdggcHxiuIhlRu1o1BXaw== X-ME-Sender: X-ME-Proxy: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Sat, 24 Nov 2018 18:55:53 -0500 Message-Id: <20181124235553.17371-14-cota@braap.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181124235553.17371-1-cota@braap.org> References: <20181124235553.17371-1-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 64.147.123.25 Subject: [Qemu-devel] [PATCH v6 13/13] hardfloat: implement float32/64 comparison X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard Henderson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. 
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

[ASCII bar chart, not recoverable from this copy. Title: qemu-aarch64 NBench score; higher is better. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz. y-axis: 0 to 16. x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean. Legend partly legible: before, ..., sqrt, cmp.]

[ASCII bar chart, not recoverable from this copy. Title: qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz. Error bars: 95% confidence interval. y-axis: 0 to 4.5. x-axis: SPEC06fp benchmarks (labels truncated, 410.bw... through 482.sphinx...) and geomean. Legend partly legible: ..., sqrt, cmp.]

2. Host: ARM Aarch64 A57 @ 2.4GHz

[ASCII bar chart, not recoverable from this copy. Title: qemu-aarch64 NBench score; higher is better. Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz. y-axis: 0 to 5. x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean. Legend partly legible: before, ..., sqrt, cmp.]

Signed-off-by: Emilio G. Cota
Reviewed-by: Alex Bennée
---
 fpu/softfloat.c | 109 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 95 insertions(+), 14 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 4c6ecd1883..b29a2b6714 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2899,28 +2899,109 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet,
     }
 }
 
-#define COMPARE(sz) \
-int float ## sz ## _compare(float ## sz a, float ## sz b, \
-                            float_status *s) \
+#define COMPARE(name, attr, sz) \
+static int attr \
+name(float ## sz a, float ## sz b, bool is_quiet, float_status *s) \
 { \
     FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
     FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, false, s); \
-} \
-int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \
-                                  float_status *s) \
-{ \
-    FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
-    FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, true, s); \
+    return compare_floats(pa, pb, is_quiet, s); \
 }
 
-COMPARE(16)
-COMPARE(32)
-COMPARE(64)
+COMPARE(soft_f16_compare, QEMU_FLATTEN, 16)
+COMPARE(soft_f32_compare, QEMU_SOFTFLOAT_ATTR, 32)
+COMPARE(soft_f64_compare, QEMU_SOFTFLOAT_ATTR, 64)
 
 #undef COMPARE
 
+int float16_compare(float16 a, float16 b, float_status *s)
+{
+    return soft_f16_compare(a, b, false, s);
+}
+
+int float16_compare_quiet(float16 a, float16 b, float_status *s)
+{
+    return soft_f16_compare(a, b, true, s);
+}
+
+static int QEMU_FLATTEN
+f32_compare(float32 xa, float32 xb, bool is_quiet, float_status *s)
+{
+    union_float32 ua, ub;
+
+    ua.s = xa;
+    ub.s = xb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+
+    float32_input_flush2(&ua.s, &ub.s, s);
+    if (isgreaterequal(ua.h, ub.h)) {
+        if (isgreater(ua.h, ub.h)) {
+            return float_relation_greater;
+        }
+        return float_relation_equal;
+    }
+    if (likely(isless(ua.h, ub.h))) {
+        return float_relation_less;
+    }
+    /* The only condition remaining is unordered.
+     * Fall through to set flags.
+     */
+ soft:
+    return soft_f32_compare(ua.s, ub.s, is_quiet, s);
+}
+
+int float32_compare(float32 a, float32 b, float_status *s)
+{
+    return f32_compare(a, b, false, s);
+}
+
+int float32_compare_quiet(float32 a, float32 b, float_status *s)
+{
+    return f32_compare(a, b, true, s);
+}
+
+static int QEMU_FLATTEN
+f64_compare(float64 xa, float64 xb, bool is_quiet, float_status *s)
+{
+    union_float64 ua, ub;
+
+    ua.s = xa;
+    ub.s = xb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+
+    float64_input_flush2(&ua.s, &ub.s, s);
+    if (isgreaterequal(ua.h, ub.h)) {
+        if (isgreater(ua.h, ub.h)) {
+            return float_relation_greater;
+        }
+        return float_relation_equal;
+    }
+    if (likely(isless(ua.h, ub.h))) {
+        return float_relation_less;
+    }
+    /* The only condition remaining is unordered.
+     * Fall through to set flags.
+     */
+ soft:
+    return soft_f64_compare(ua.s, ub.s, is_quiet, s);
+}
+
+int float64_compare(float64 a, float64 b, float_status *s)
+{
+    return f64_compare(a, b, false, s);
+}
+
+int float64_compare_quiet(float64 a, float64 b, float_status *s)
+{
+    return f64_compare(a, b, true, s);
+}
+
 /* Multiply A by 2 raised to the power N. */
 static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s)
 {
-- 
2.17.1
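
The comparison fast path in the patch above can be tried outside QEMU. What follows is a minimal sketch under stated assumptions, not the patch's code: `hard_compare`, `soft_compare`, `struct fp_status` and the `REL_*` constants are hypothetical stand-ins for `f32_compare`, `soft_f32_compare`, `float_status` and the `float_relation_*` values. The fallback is simplified: it raises the invalid flag only for the non-quiet variant and does not tell signaling NaNs from quiet ones, which the real `compare_floats` does. Input flushing and the `QEMU_NO_HARDFLOAT` check are left out. The sketch relies on the C99 `<math.h>` comparison macros, which do not raise the invalid exception when an operand is a quiet NaN.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's float_relation_* values. */
enum {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2
};

/* Hypothetical stand-in for float_status: only an "invalid" flag. */
struct fp_status {
    bool invalid;
};

/*
 * Simplified slow path (assumption: no signaling-NaN handling).
 * It is the only place that touches the status flags.
 */
static int soft_compare(float a, float b, bool is_quiet, struct fp_status *s)
{
    if (isnan(a) || isnan(b)) {
        if (!is_quiet) {
            s->invalid = true;
        }
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    return a > b ? REL_GREATER : REL_EQUAL;
}

/*
 * Fast path: the three ordered outcomes are decided with the host FPU
 * and never touch the flags; only the unordered case falls back.
 */
static int hard_compare(float a, float b, bool is_quiet, struct fp_status *s)
{
    if (isgreaterequal(a, b)) {
        return isgreater(a, b) ? REL_GREATER : REL_EQUAL;
    }
    if (isless(a, b)) {
        return REL_LESS;
    }
    /* Only "unordered" remains: let the slow path set the flags. */
    return soft_compare(a, b, is_quiet, s);
}
```

The ordering of the checks mirrors the patch: the common ordered results return early, and a NaN operand is the only input that reaches the slower, flag-setting code.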