From nobody Sun May 5 07:15:14 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno
Date: Mon, 11 Jun 2018 21:48:47 -0400
Message-Id: <1528768140-17894-2-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
References: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 01/14] tests: add fp-test, a floating point test suite

This will allow us to run correctness tests against our FP implementation.
The test can be run in two modes (called "testers"): host and soft. With
the former we check the results and FP flags on the host machine against
the model. With the latter we check QEMU's fpu primitives against the
model. Note that in soft mode we are not instantiating any particular CPU
(hence the HW_POISON_H hack to avoid macro poisoning); for that we need to
run the test in host mode under QEMU.

The input files are taken from IBM's FPGen test suite:
https://www.research.ibm.com/haifa/projects/verification/fpgen/

I see no license file in there so I am just downloading them with wget.
We might want to keep a copy on a qemu server though, in case IBM takes
those files down in the future.
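
To give a feel for the operand notation used in those input files, here is
a minimal sketch of how a single-precision operand such as "+1.000000P0"
could be decoded into its IEEE 754 bit pattern. It follows the
single-precision branch of ibm_fp_hex() in the appended code; the helper
name decode_b32_operand, the sample strings and the extra range check are
illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): decode a single-precision
 * operand into its IEEE 754 bit pattern.
 * Layout assumed from ibm_fp_hex(): sign, leading digit, '.', hexadecimal
 * significand, 'P', decimal unbiased exponent.
 * Returns 0 on success, 1 on malformed input.
 */
static int decode_b32_operand(const char *p, uint32_t *out)
{
    unsigned long significand;
    long exponent;
    size_t len;
    char *pos;
    int negative, denormal;

    assert(p != NULL && out != NULL);
    len = strlen(p);
    if (len <= 4) {
        return 1;
    }
    negative = p[0] == '-';
    denormal = p[1] == '0';     /* a leading zero marks a denormal */

    significand = strtoul(&p[3], &pos, 16);
    if (*pos != 'P') {
        return 1;
    }
    exponent = strtol(pos + 1, &pos, 10) + 127;     /* add the float bias */
    if (pos != p + len) {
        return 1;
    }
    if (denormal) {
        /* denormals are written with exponent -126; the stored field is 0 */
        if (exponent != 1) {
            return 1;
        }
        exponent = 0;
    }
    /* extra sanity check, an assumption of this sketch */
    if (significand > 0x7fffff || exponent < 0 || exponent > 254) {
        return 1;
    }
    *out = (negative ? UINT32_C(1) << 31 : 0) |
           ((uint32_t)exponent << 23) |
           (uint32_t)significand;
    return 0;
}
```

Under these assumptions "+1.000000P0" decodes to 0x3f800000 (1.0f) and
"-1.000000P1" to 0xc0000000 (-2.0f).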

The "IBM" syntax of those files (for now the only syntax supported in
fp-test) is documented here:
https://www.research.ibm.com/haifa/projects/verification/fpgen/papers/ieee-test-suite-v2.pdf

Note that the syntax document has some inaccuracies; the appended parsing
code works around some of those.

The exception flag (-e) is important: many of the optimizations included
in the following commits assume that the inexact flag is set, so "-e x" is
necessary in order to test those code paths.

The whitelist flag (-w) points to a file with test cases to be ignored.
I have put some whitelist files online, but we should have them on a
QEMU-related server.

Thus, a typical invocation of fp-test is as follows:

    $ cd qemu/build/tests/fp
    $ make -j && \
      ./fp-test -t soft ibm/*.fptest \
          -w whitelist.txt \
          -e x

If we want to test after-rounding tininess detection, then we need to
pass "-a -w whitelist-tininess-after.txt" in addition to the above.
(NB. we can pass "-w" as many times as we want.)

The patch immediately after this one fixes a mismatch against the model in
softfloat, but after that is applied the above should finish with a 0
return code, and print something like:

    All tests OK.
    Tests passed: 76572. Not handled: 51237, whitelisted: 2662

The tests pass in host mode on x86_64 and aarch64 machines, although note
that on x86_64 you need to pass -w whitelist-tininess-after.txt.

Running in host mode under QEMU reports flag mismatches (e.g. for
x86_64-linux-user), but that isn't too surprising given how little love the
i386 frontend gets. Host mode under aarch64-linux-user passes OK.

Flush-to-zero and flush-inputs-to-zero modes can be tested with the -z and
-Z flags. Note however that the IBM input files are only IEEE-compliant, so
for now I've tested these modes by diff'ing the reported errors against the
model files. We should look into generating files for these non-standard
modes to make testing them less painful.

Signed-off-by: Emilio G. Cota
---
 configure              |    2 +
 tests/fp/fp-test.c     | 1158 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/Makefile.include |    3 +
 tests/fp/.gitignore    |    3 +
 tests/fp/Makefile      |   34 ++
 5 files changed, 1200 insertions(+)
 create mode 100644 tests/fp/fp-test.c
 create mode 100644 tests/fp/.gitignore
 create mode 100644 tests/fp/Makefile

diff --git a/configure b/configure
index 14b1113..49694c2 100755
--- a/configure
+++ b/configure
@@ -7186,12 +7186,14 @@ fi
 
 # build tree in object directory in case the source is not in the current directory
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests tests/vm"
+DIRS="$DIRS tests/fp"
 DIRS="$DIRS docs docs/interop fsdev scsi"
 DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw"
 DIRS="$DIRS roms/seabios roms/vgabios"
 FILES="Makefile tests/tcg/Makefile qdict-test-data.txt"
 FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit"
 FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile"
+FILES="$FILES tests/fp/Makefile"
 FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps"
 FILES="$FILES pc-bios/spapr-rtas/Makefile"
 FILES="$FILES pc-bios/s390-ccw/Makefile"
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
new file mode 100644
index 0000000..6be9ce7
--- /dev/null
+++ b/tests/fp/fp-test.c
@@ -0,0 +1,1158 @@
+/*
+ * fp-test.c - Floating point test suite.
+ *
+ * Copyright (C) 2018, Emilio G. Cota
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#ifndef HW_POISON_H
+#error Must define HW_POISON_H to work around TARGET_* poisoning
+#endif
+
+#include "qemu/osdep.h"
+#include <fenv.h>
+#include <math.h>
+#include "fpu/softfloat.h"
+
+enum error {
+    ERROR_NONE,
+    ERROR_NOT_HANDLED,
+    ERROR_WHITELISTED,
+    ERROR_COMMENT,
+    ERROR_INPUT,
+    ERROR_RESULT,
+    ERROR_EXCEPTIONS,
+    ERROR_MAX,
+};
+
+enum input_fmt {
+    INPUT_FMT_IBM,
+};
+
+struct input {
+    const char * const name;
+    enum error (*test_line)(const char *line);
+};
+
+enum precision {
+    PREC_FLOAT,
+    PREC_DOUBLE,
+    PREC_QUAD,
+    PREC_FLOAT_TO_DOUBLE,
+};
+
+struct op_desc {
+    const char * const name;
+    int n_operands;
+};
+
+enum op {
+    OP_ADD,
+    OP_SUB,
+    OP_MUL,
+    OP_MULADD,
+    OP_DIV,
+    OP_SQRT,
+    OP_MINNUM,
+    OP_MAXNUM,
+    OP_MAXNUMMAG,
+    OP_ABS,
+    OP_IS_NAN,
+    OP_IS_INF,
+    OP_FLOAT_TO_DOUBLE,
+};
+
+static const struct op_desc ops[] = {
+    [OP_ADD] = { "+", 2 },
+    [OP_SUB] = { "-", 2 },
+    [OP_MUL] = { "*", 2 },
+    [OP_MULADD] = { "*+", 3 },
+    [OP_DIV] = { "/", 2 },
+    [OP_SQRT] = { "V", 1 },
+    [OP_MINNUM] = { "<C", 2 },
+    [OP_MAXNUM] = { ">C", 2 },
+    [OP_MAXNUMMAG] = { ">A", 2 },
+    [OP_ABS] = { "A", 1 },
+    [OP_IS_NAN] = { "?N", 1 },
+    [OP_IS_INF] = { "?i", 1 },
+    [OP_FLOAT_TO_DOUBLE] = { "cff", 1 },
+};
+
+/*
+ * We could enumerate all the types here. But really we only care about
+ * QNaN and SNaN since only those can vary across ISAs.
+ */
+enum op_type {
+    OP_TYPE_NUMBER,
+    OP_TYPE_QNAN,
+    OP_TYPE_SNAN,
+};
+
+struct operand {
+    uint64_t val;
+    enum op_type type;
+};
+
+struct test_op {
+    struct operand operands[3];
+    struct operand expected_result;
+    enum precision prec;
+    enum op op;
+    signed char round;
+    uint8_t trapped_exceptions;
+    uint8_t exceptions;
+    bool expected_result_is_valid;
+};
+
+typedef enum error (*tester_func_t)(struct test_op *);
+
+struct tester {
+    tester_func_t func;
+    const char *name;
+};
+
+struct whitelist {
+    char **lines;
+    size_t n;
+    GHashTable *ht;
+};
+
+static uint64_t test_stats[ERROR_MAX];
+static struct whitelist whitelist;
+static uint8_t default_exceptions;
+static bool die_on_error = true;
+static struct float_status soft_status = {
+    .float_detect_tininess = float_tininess_before_rounding,
+};
+
+static inline float u64_to_float(uint64_t v)
+{
+    uint32_t v32 = v;
+    uint32_t *v32p = &v32;
+
+    return *(float *)v32p;
+}
+
+static inline double u64_to_double(uint64_t v)
+{
+    uint64_t *vp = &v;
+
+    return *(double *)vp;
+}
+
+static inline uint64_t float_to_u64(float f)
+{
+    float *fp = &f;
+
+    return *(uint32_t *)fp;
+}
+
+static inline uint64_t double_to_u64(double d)
+{
+    double *dp = &d;
+
+    return *(uint64_t *)dp;
+}
+
+static inline bool is_err(enum error err)
+{
+    return err != ERROR_NONE &&
+        err != ERROR_NOT_HANDLED &&
+        err != ERROR_WHITELISTED &&
+        err != ERROR_COMMENT;
+}
+
+static int host_exceptions_translate(int host_flags)
+{
+    int flags = 0;
+
+    if (host_flags & FE_INEXACT) {
+        flags |= float_flag_inexact;
+    }
+    if (host_flags & FE_UNDERFLOW) {
+        flags |= float_flag_underflow;
+    }
+    if (host_flags & FE_OVERFLOW) {
+        flags |= float_flag_overflow;
+    }
+    if (host_flags & FE_DIVBYZERO) {
+        flags |= float_flag_divbyzero;
+    }
+    if (host_flags & FE_INVALID) {
+        flags |= float_flag_invalid;
+    }
+    return flags;
+}
+
+static inline uint8_t host_get_exceptions(void)
+{
+    return host_exceptions_translate(fetestexcept(FE_ALL_EXCEPT));
+}
+
+static void host_set_exceptions(uint8_t flags)
+{
+    int host_flags = 0;
+
+    if (flags & float_flag_inexact) {
+        host_flags |= FE_INEXACT;
+    }
+    if (flags & float_flag_underflow) {
+        host_flags |= FE_UNDERFLOW;
+    }
+    if (flags & float_flag_overflow) {
+        host_flags |= FE_OVERFLOW;
+    }
+    if (flags & float_flag_divbyzero) {
+        host_flags |= FE_DIVBYZERO;
+    }
+    if (flags & float_flag_invalid) {
+        host_flags |= FE_INVALID;
+    }
+    feraiseexcept(host_flags);
+}
+
+#define STANDARD_EXCEPTIONS \
+    (float_flag_inexact | float_flag_underflow | \
+     float_flag_overflow | float_flag_divbyzero | float_flag_invalid)
+#define FMT_EXCEPTIONS "%s%s%s%s%s%s"
+#define PR_EXCEPTIONS(x) \
+    ((x) & STANDARD_EXCEPTIONS ? "" : "none"), \
+    (((x) & float_flag_inexact) ? "x" : ""), \
+    (((x) & float_flag_underflow) ? "u" : ""), \
+    (((x) & float_flag_overflow) ? "o" : ""), \
+    (((x) & float_flag_divbyzero) ? "z" : ""), \
+    (((x) & float_flag_invalid) ? "i" : "")
+
+static enum error tester_check(const struct test_op *t, uint64_t res64,
+                               bool res_is_nan, uint8_t flags)
+{
+    enum error err = ERROR_NONE;
+
+    if (t->expected_result_is_valid) {
+        if (t->expected_result.type == OP_TYPE_QNAN ||
+            t->expected_result.type == OP_TYPE_SNAN) {
+            if (!res_is_nan) {
+                err = ERROR_RESULT;
+                goto out;
+            }
+        } else if (res64 != t->expected_result.val) {
+            err = ERROR_RESULT;
+            goto out;
+        }
+    }
+    if (t->exceptions && flags != (t->exceptions | default_exceptions)) {
+        err = ERROR_EXCEPTIONS;
+        goto out;
+    }
+
+ out:
+    if (is_err(err)) {
+        int i;
+
+        fprintf(stderr, "%s ", ops[t->op].name);
+        for (i = 0; i < ops[t->op].n_operands; i++) {
+            fprintf(stderr, "0x%" PRIx64 "%s", t->operands[i].val,
+                    i < ops[t->op].n_operands - 1 ? " " : "");
+        }
+        fprintf(stderr, ", expected: 0x%" PRIx64 ", returned: 0x%" PRIx64,
+                t->expected_result.val, res64);
+        if (err == ERROR_EXCEPTIONS) {
+            fprintf(stderr, ", expected exceptions: " FMT_EXCEPTIONS
+                    ", returned: " FMT_EXCEPTIONS,
+                    PR_EXCEPTIONS(t->exceptions), PR_EXCEPTIONS(flags));
+        }
+        fprintf(stderr, "\n");
+    }
+    return err;
+}
+
+static enum error host_tester(struct test_op *t)
+{
+    uint64_t res64;
+    bool result_is_nan;
+    uint8_t flags = 0;
+
+    feclearexcept(FE_ALL_EXCEPT);
+    if (default_exceptions) {
+        host_set_exceptions(default_exceptions);
+    }
+
+    if (t->prec == PREC_FLOAT) {
+        float a, b, c;
+        float *in[] = { &a, &b, &c };
+        float res;
+        int i;
+
+        g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in));
+        for (i = 0; i < ops[t->op].n_operands; i++) {
+            /* use the host's QNaN/SNaN patterns */
+            if (t->operands[i].type == OP_TYPE_QNAN) {
+                *in[i] = __builtin_nanf("");
+            } else if (t->operands[i].type == OP_TYPE_SNAN) {
+                *in[i] = __builtin_nansf("");
+            } else {
+                *in[i] = u64_to_float(t->operands[i].val);
+            }
+        }
+
+        if (t->expected_result.type == OP_TYPE_QNAN) {
+            t->expected_result.val = float_to_u64(__builtin_nanf(""));
+        } else if (t->expected_result.type == OP_TYPE_SNAN) {
+            t->expected_result.val = float_to_u64(__builtin_nansf(""));
+        }
+
+        switch (t->op) {
+        case OP_ADD:
+            res = a + b;
+            break;
+        case OP_SUB:
+            res = a - b;
+            break;
+        case OP_MUL:
+            res = a * b;
+            break;
+        case OP_MULADD:
+            res = fmaf(a, b, c);
+            break;
+        case OP_DIV:
+            res = a / b;
+            break;
+        case OP_SQRT:
+            res = sqrtf(a);
+            break;
+        case OP_ABS:
+            res = fabsf(a);
+            break;
+        case OP_IS_NAN:
+            res = !!isnan(a);
+            break;
+        case OP_IS_INF:
+            res = !!isinf(a);
+            break;
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        flags = host_get_exceptions();
+        res64 = float_to_u64(res);
+        result_is_nan = isnan(res);
+    } else if (t->prec == PREC_DOUBLE) {
+        double a, b, c;
+        double *in[] = { &a, &b, &c };
+        double res;
+        int i;
+
+        g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in));
+        for (i = 0; i < ops[t->op].n_operands; i++) {
+            /* use the host's QNaN/SNaN patterns */
+            if (t->operands[i].type == OP_TYPE_QNAN) {
+                *in[i] = __builtin_nan("");
+            } else if (t->operands[i].type == OP_TYPE_SNAN) {
+                *in[i] = __builtin_nans("");
+            } else {
+                *in[i] = u64_to_double(t->operands[i].val);
+            }
+        }
+
+        if (t->expected_result.type == OP_TYPE_QNAN) {
+            t->expected_result.val = double_to_u64(__builtin_nan(""));
+        } else if (t->expected_result.type == OP_TYPE_SNAN) {
+            t->expected_result.val = double_to_u64(__builtin_nans(""));
+        }
+
+        switch (t->op) {
+        case OP_ADD:
+            res = a + b;
+            break;
+        case OP_SUB:
+            res = a - b;
+            break;
+        case OP_MUL:
+            res = a * b;
+            break;
+        case OP_MULADD:
+            res = fma(a, b, c);
+            break;
+        case OP_DIV:
+            res = a / b;
+            break;
+        case OP_SQRT:
+            res = sqrt(a);
+            break;
+        case OP_ABS:
+            res = fabs(a);
+            break;
+        case OP_IS_NAN:
+            res = !!isnan(a);
+            break;
+        case OP_IS_INF:
+            res = !!isinf(a);
+            break;
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        flags = host_get_exceptions();
+        res64 = double_to_u64(res);
+        result_is_nan = isnan(res);
+    } else if (t->prec == PREC_FLOAT_TO_DOUBLE) {
+        float a;
+        double res;
+
+        if (t->operands[0].type == OP_TYPE_QNAN) {
+            a = __builtin_nanf("");
+        } else if (t->operands[0].type == OP_TYPE_SNAN) {
+            a = __builtin_nansf("");
+        } else {
+            a = u64_to_float(t->operands[0].val);
+        }
+
+        if (t->expected_result.type == OP_TYPE_QNAN) {
+            t->expected_result.val = double_to_u64(__builtin_nan(""));
+        } else if (t->expected_result.type == OP_TYPE_SNAN) {
+            t->expected_result.val = double_to_u64(__builtin_nans(""));
+        }
+
+        switch (t->op) {
+        case OP_FLOAT_TO_DOUBLE:
+            res = a;
+            break;
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        flags = host_get_exceptions();
+        res64 = double_to_u64(res);
+        result_is_nan = isnan(res);
+    } else {
+        return ERROR_NOT_HANDLED; /* XXX */
+    }
+    return tester_check(t, res64, result_is_nan, flags);
+}
+
+static enum error soft_tester(struct test_op *t)
+{
+    float_status *s = &soft_status;
+    uint64_t res64;
+    enum error err = ERROR_NONE;
+    bool result_is_nan;
+
+    s->float_rounding_mode = t->round;
+    s->float_exception_flags = default_exceptions;
+
+    if (t->prec == PREC_FLOAT) {
+        float32 a, b, c;
+        float32 *in[] = { &a, &b, &c };
+        float32 res;
+        int i;
+
+        g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in));
+        for (i = 0; i < ops[t->op].n_operands; i++) {
+            *in[i] = t->operands[i].val;
+        }
+
+        switch (t->op) {
+        case OP_ADD:
+            res = float32_add(a, b, s);
+            break;
+        case OP_SUB:
+            res = float32_sub(a, b, s);
+            break;
+        case OP_MUL:
+            res = float32_mul(a, b, s);
+            break;
+        case OP_MULADD:
+            res = float32_muladd(a, b, c, 0, s);
+            break;
+        case OP_DIV:
+            res = float32_div(a, b, s);
+            break;
+        case OP_SQRT:
+            res = float32_sqrt(a, s);
+            break;
+        case OP_MINNUM:
+            res = float32_minnum(a, b, s);
+            break;
+        case OP_MAXNUM:
+            res = float32_maxnum(a, b, s);
+            break;
+        case OP_MAXNUMMAG:
+            res = float32_maxnummag(a, b, s);
+            break;
+        case OP_IS_NAN:
+        {
+            float f = !!float32_is_any_nan(a);
+
+            res = float_to_u64(f);
+            break;
+        }
+        case OP_IS_INF:
+        {
+            float f = !!float32_is_infinity(a);
+
+            res = float_to_u64(f);
+            break;
+        }
+        case OP_ABS:
+            /* Fall-through: float32_abs does not handle NaN's */
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        res64 = res;
+        result_is_nan = isnan(*(float *)&res);
+    } else if (t->prec == PREC_DOUBLE) {
+        float64 a, b, c;
+        float64 *in[] = { &a, &b, &c };
+        int i;
+
+        g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in));
+        for (i = 0; i < ops[t->op].n_operands; i++) {
+            *in[i] = t->operands[i].val;
+        }
+
+        switch (t->op) {
+        case OP_ADD:
+            res64 = float64_add(a, b, s);
+            break;
+        case OP_SUB:
+            res64 = float64_sub(a, b, s);
+            break;
+        case OP_MUL:
+            res64 = float64_mul(a, b, s);
+            break;
+        case OP_MULADD:
+            res64 = float64_muladd(a, b, c, 0, s);
+            break;
+        case OP_DIV:
+            res64 = float64_div(a, b, s);
+            break;
+        case OP_SQRT:
+            res64 = float64_sqrt(a, s);
+            break;
+        case OP_MINNUM:
+            res64 = float64_minnum(a, b, s);
+            break;
+        case OP_MAXNUM:
+            res64 = float64_maxnum(a, b, s);
+            break;
+        case OP_MAXNUMMAG:
+            res64 = float64_maxnummag(a, b, s);
+            break;
+        case OP_IS_NAN:
+        {
+            double d = !!float64_is_any_nan(a);
+
+            res64 = double_to_u64(d);
+            break;
+        }
+        case OP_IS_INF:
+        {
+            double d = !!float64_is_infinity(a);
+
+            res64 = double_to_u64(d);
+            break;
+        }
+        case OP_ABS:
+            /* Fall-through: float64_abs does not handle NaN's */
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        result_is_nan = isnan(*(double *)&res64);
+    } else if (t->prec == PREC_FLOAT_TO_DOUBLE) {
+        float32 a = t->operands[0].val;
+
+        switch (t->op) {
+        case OP_FLOAT_TO_DOUBLE:
+            res64 = float32_to_float64(a, s);
+            break;
+        default:
+            return ERROR_NOT_HANDLED;
+        }
+        result_is_nan = isnan(*(double *)&res64);
+    } else {
+        return ERROR_NOT_HANDLED; /* XXX */
+    }
+    return tester_check(t, res64, result_is_nan, s->float_exception_flags);
+    return err;
+}
+
+static const struct tester valid_testers[] = {
+    [0] = {
+        .name = "soft",
+        .func = soft_tester,
+    },
+    [1] = {
+        .name = "host",
+        .func = host_tester,
+    },
+};
+static const struct tester *tester = &valid_testers[0];
+
+static int ibm_get_exceptions(const char *p, uint8_t *excp)
+{
+    while (*p) {
+        switch (*p) {
+        case 'x':
+            *excp |= float_flag_inexact;
+            break;
+        case 'u':
+            *excp |= float_flag_underflow;
+            break;
+        case 'o':
+            *excp |= float_flag_overflow;
+            break;
+        case 'z':
+            *excp |= float_flag_divbyzero;
+            break;
+        case 'i':
+            *excp |= float_flag_invalid;
+            break;
+        default:
+            return 1;
+        }
+        p++;
+    }
+    return 0;
+}
+
+static uint64_t fp_choose(enum precision prec, uint64_t f, uint64_t d)
+{
+    switch (prec) {
+    case PREC_FLOAT:
+        return f;
+    case PREC_DOUBLE:
+        return d;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static int
+ibm_fp_hex(const char *p, enum precision prec, struct operand *ret)
+{
+    int len;
+
+    ret->type = OP_TYPE_NUMBER;
+
+    /* QNaN */
+    if (unlikely(!strcmp("Q", p))) {
+        ret->val = fp_choose(prec, 0xffc00000, 0xfff8000000000000);
+        ret->type = OP_TYPE_QNAN;
+        return 0;
+    }
+    /* SNaN */
+    if (unlikely(!strcmp("S", p))) {
+        ret->val = fp_choose(prec, 0xffb00000, 0xfff7000000000000);
+        ret->type = OP_TYPE_SNAN;
+        return 0;
+    }
+    if (unlikely(!strcmp("+Zero", p))) {
+        ret->val = fp_choose(prec, 0x00000000, 0x0000000000000000);
+        return 0;
+    }
+    if (unlikely(!strcmp("-Zero", p))) {
+        ret->val = fp_choose(prec, 0x80000000, 0x8000000000000000);
+        return 0;
+    }
+    if (unlikely(!strcmp("+inf", p) || !strcmp("+Inf", p))) {
+        ret->val = fp_choose(prec, 0x7f800000, 0x7ff0000000000000);
+        return 0;
+    }
+    if (unlikely(!strcmp("-inf", p) || !strcmp("-Inf", p))) {
+        ret->val = fp_choose(prec, 0xff800000, 0xfff0000000000000);
+        return 0;
+    }
+
+    len = strlen(p);
+
+    if (strchr(p, 'P')) {
+        bool negative = p[0] == '-';
+        char *pos;
+        bool denormal;
+
+        if (len <= 4) {
+            return 1;
+        }
+        denormal = p[1] == '0';
+        if (prec == PREC_FLOAT) {
+            uint32_t exponent;
+            uint32_t significand;
+            uint32_t h;
+
+            significand = strtoul(&p[3], &pos, 16);
+            if (*pos != 'P') {
+                return 1;
+            }
+            pos++;
+            exponent = strtol(pos, &pos, 10) + 127;
+            if (pos != p + len) {
+                return 1;
+            }
+            /*
+             * When there's a leading zero, we have a denormal number. We'd
+             * expect the input (unbiased) exponent to be -127, but for some
+             * reason -126 is used. Correct that here.
+             */
+            if (denormal) {
+                if (exponent != 1) {
+                    return 1;
+                }
+                exponent = 0;
+            }
+            h = negative ? (1 << 31) : 0;
+            h |= exponent << 23;
+            h |= significand;
+            ret->val = h;
+            return 0;
+        } else if (prec == PREC_DOUBLE) {
+            uint64_t exponent;
+            uint64_t significand;
+            uint64_t h;
+
+            significand = strtoul(&p[3], &pos, 16);
+            if (*pos != 'P') {
+                return 1;
+            }
+            pos++;
+            exponent = strtol(pos, &pos, 10) + 1023;
+            if (pos != p + len) {
+                return 1;
+            }
+            if (denormal) {
+                return 1; /* XXX */
+            }
+            h = negative ? (1ULL << 63) : 0;
+            h |= exponent << 52;
+            h |= significand;
+            ret->val = h;
+            return 0;
+        } else { /* XXX */
+            return 1;
+        }
+    } else if (strchr(p, 'e')) {
+        char *pos;
+
+        if (prec == PREC_FLOAT) {
+            float f = strtof(p, &pos);
+
+            if (*pos) {
+                return 1;
+            }
+            ret->val = float_to_u64(f);
+            return 0;
+        }
+        if (prec == PREC_DOUBLE) {
+            double d = strtod(p, &pos);
+
+            if (*pos) {
+                return 1;
+            }
+            ret->val = double_to_u64(d);
+            return 0;
+        }
+        return 0;
+    } else if (!strcmp(p, "0x0")) {
+        if (prec == PREC_FLOAT) {
+            ret->val = float_to_u64(0.0);
+        } else if (prec == PREC_DOUBLE) {
+            ret->val = double_to_u64(0.0);
+        } else {
+            g_assert_not_reached();
+        }
+        return 0;
+    } else if (!strcmp(p, "0x1")) {
+        if (prec == PREC_FLOAT) {
+            ret->val = float_to_u64(1.0);
+        } else if (prec == PREC_DOUBLE) {
+            ret->val = double_to_u64(1.0);
+        } else {
+            g_assert_not_reached();
+        }
+        return 0;
+    }
+    return 1;
+}
+
+static int find_op(const char *name, enum op *op)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(ops); i++) {
+        if (strcmp(ops[i].name, name) == 0) {
+            *op = i;
+            return 0;
+        }
+    }
+    return 1;
+}
+
+/* Syntax of IBM FP test cases:
+ * https://www.research.ibm.com/haifa/projects/verification/fpgen/syntax.txt
+ */
+static enum error ibm_test_line(const char *line)
+{
+    struct test_op t;
+    /* at most nine fields; this should be more than enough for each field */
+    char s[9][64];
+    char *p;
+    int n, field;
+    int i;
+
+    /* data lines start with either b32 or d(64|128) */
+    if (unlikely(line[0] != 'b' && line[0] != 'd')) {
+        return ERROR_COMMENT;
+    }
+    n = sscanf(line, "%63s %63s %63s %63s %63s %63s %63s %63s %63s",
+               s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], s[8]);
+    if (unlikely(n < 5 || n > 9)) {
+        return ERROR_INPUT;
+    }
+
+    field = 0;
+    p = s[field];
+    if (unlikely(strlen(p) < 4)) {
+        return ERROR_INPUT;
+    }
+    if (strcmp("b32b64cff", p) == 0) {
+        t.prec = PREC_FLOAT_TO_DOUBLE;
+        if (find_op(&p[6], &t.op)) {
+            return ERROR_NOT_HANDLED;
+        }
+    } else {
+        if (strncmp("b32", p, 3) == 0) {
+            t.prec = PREC_FLOAT;
+        } else if (strncmp("d64", p, 3) == 0) {
+            t.prec = PREC_DOUBLE;
+        } else if (strncmp("d128", p, 4) == 0) {
+            return ERROR_NOT_HANDLED; /* XXX */
+        } else {
+            return ERROR_INPUT;
+        }
+        if (find_op(&p[3], &t.op)) {
+            return ERROR_NOT_HANDLED;
+        }
+    }
+
+    field = 1;
+    p = s[field];
+    if (!strncmp("=0", p, 2)) {
+        t.round = float_round_nearest_even;
+    } else {
+        return ERROR_NOT_HANDLED; /* XXX */
+    }
+
+    /* The trapped exceptions field is optional */
+    t.trapped_exceptions = 0;
+    field = 2;
+    p = s[field];
+    if (ibm_get_exceptions(p, &t.trapped_exceptions)) {
+        if (unlikely(n == 9)) {
+            return ERROR_INPUT;
+        }
+    } else {
+        field++;
+    }
+
+    for (i = 0; i < ops[t.op].n_operands; i++) {
+        enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ?
+            PREC_FLOAT : t.prec;
+
+        p = s[field++];
+        if (ibm_fp_hex(p, prec, &t.operands[i])) {
+            return ERROR_INPUT;
+        }
+    }
+
+    p = s[field++];
+    if (strcmp("->", p)) {
+        return ERROR_INPUT;
+    }
+
+    p = s[field++];
+    if (unlikely(strcmp("#", p) == 0)) {
+        t.expected_result_is_valid = false;
+    } else {
+        enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ?
+            PREC_DOUBLE : t.prec;
+
+        if (ibm_fp_hex(p, prec, &t.expected_result)) {
+            return ERROR_INPUT;
+        }
+        t.expected_result_is_valid = true;
+    }
+
+    /*
+     * A 0 here means "do not check the exceptions", i.e. it does NOT mean
+     * "there should be no exceptions raised".
+     */
+    t.exceptions = 0;
+    /* the expected exceptions field is optional */
+    if (field == n - 1) {
+        p = s[field++];
+        if (ibm_get_exceptions(p, &t.exceptions)) {
+            return ERROR_INPUT;
+        }
+    }
+
+    /*
+     * We ignore "trapped exceptions" because we're not testing the trapping
+     * mechanism of the host CPU.
+     * We test though that the exception bits are correctly set.
+     */
+    if (t.trapped_exceptions) {
+        return ERROR_NOT_HANDLED;
+    }
+    return tester->func(&t);
+}
+
+static const struct input valid_input_types[] = {
+    [INPUT_FMT_IBM] = {
+        .name = "ibm",
+        .test_line = ibm_test_line,
+    },
+};
+
+static const struct input *input_type = &valid_input_types[INPUT_FMT_IBM];
+
+static bool line_is_whitelisted(const char *line)
+{
+    if (whitelist.ht == NULL) {
+        return false;
+    }
+    return !!g_hash_table_lookup(whitelist.ht, line);
+}
+
+static void test_file(const char *filename)
+{
+    static char line[256];
+    unsigned int i;
+    FILE *fp;
+
+    fp = fopen(filename, "r");
+    if (fp == NULL) {
+        fprintf(stderr, "cannot open file '%s': %s\n",
+                filename, strerror(errno));
+        exit(EXIT_FAILURE);
+    }
+    i = 0;
+    while (fgets(line, sizeof(line), fp)) {
+        enum error err;
+
+        i++;
+        if (unlikely(line_is_whitelisted(line))) {
+            test_stats[ERROR_WHITELISTED]++;
+            continue;
+        }
+        err = input_type->test_line(line);
+        if (unlikely(is_err(err))) {
+            switch (err) {
+            case ERROR_INPUT:
+                fprintf(stderr, "error: malformed input @ %s:%d:\n",
+                        filename, i);
+                break;
+            case ERROR_RESULT:
+                fprintf(stderr, "error: result mismatch for input @ %s:%d:\n",
+                        filename, i);
+                break;
+            case ERROR_EXCEPTIONS:
+                fprintf(stderr, "error: flags mismatch for input @ %s:%d:\n",
+                        filename, i);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            fprintf(stderr, "%s", line);
+            if (die_on_error) {
+                exit(EXIT_FAILURE);
+            }
+        }
+        test_stats[err]++;
+    }
+    if (fclose(fp)) {
+        fprintf(stderr, "warning: cannot close file '%s': %s\n",
+                filename, strerror(errno));
+    }
+}
+
+static void set_input_fmt(const char *optarg)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(valid_input_types); i++) {
+        const struct input *type = &valid_input_types[i];
+
+        if (strcmp(optarg, type->name) == 0) {
+            input_type = type;
+            return;
+        }
+    }
+    fprintf(stderr, "Unknown input format '%s'", optarg);
+    exit(EXIT_FAILURE);
+}
+
+static void set_tester(const char *optarg)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(valid_testers); i++) {
+        const struct tester *t = &valid_testers[i];
+
+        if (strcmp(optarg, t->name) == 0) {
+            tester = t;
+            return;
+        }
+    }
+    fprintf(stderr, "Unknown tester '%s'", optarg);
+    exit(EXIT_FAILURE);
+}
+
+static void whitelist_add_line(const char *orig_line)
+{
+    char *line;
+    bool inserted;
+
+    if (whitelist.ht == NULL) {
+        whitelist.ht = g_hash_table_new(g_str_hash, g_str_equal);
+    }
+    line = g_hash_table_lookup(whitelist.ht, orig_line);
+    if (unlikely(line != NULL)) {
+        return;
+    }
+    whitelist.n++;
+    whitelist.lines = g_realloc_n(whitelist.lines, whitelist.n, sizeof(line));
+    line = strdup(orig_line);
+    whitelist.lines[whitelist.n - 1] = line;
+    /* if we pass key == val GLib will not reserve space for the value */
+    inserted = g_hash_table_insert(whitelist.ht, line, line);
+    g_assert(inserted);
+}
+
+static void set_whitelist(const char *filename)
+{
+    FILE *fp;
+    static char line[256];
+
+    fp = fopen(filename, "r");
+    if (fp == NULL) {
+        fprintf(stderr, "warning: cannot open white list file '%s': %s\n",
+                filename, strerror(errno));
+        return;
+    }
+    while (fgets(line, sizeof(line), fp)) {
+        if (isspace(line[0]) || line[0] == '#') {
+            continue;
+        }
+        whitelist_add_line(line);
+    }
+    if (fclose(fp)) {
+        fprintf(stderr, "warning: cannot close file '%s': %s\n",
+                filename, strerror(errno));
+    }
+}
+
+static void set_default_exceptions(const char *str)
+{
+    if (ibm_get_exceptions(str, &default_exceptions)) {
+        fprintf(stderr, "Invalid exception '%s'\n", str);
+        exit(EXIT_FAILURE);
+    }
+}
+
+static void usage_complete(int argc, char *argv[])
+{
+    fprintf(stderr, "Usage: %s [options] file1 [file2 ...]\n", argv[0]);
+    fprintf(stderr, "options:\n");
+    fprintf(stderr, " -a = Perform tininess detection after rounding "
+            "(soft tester only). Default: before\n");
+    fprintf(stderr, " -n = do not die on error. Default: dies on error\n");
+    fprintf(stderr, " -e = default exception flags (xiozu). Default: none\n");
+    fprintf(stderr, " -f = format of the input file(s). Default: %s\n",
+            valid_input_types[0].name);
+    fprintf(stderr, " -t = tester. Default: %s\n", valid_testers[0].name);
+    fprintf(stderr, " -w = path to file with test cases to be whitelisted\n");
+    fprintf(stderr, " -z = flush inputs to zero (soft tester only). "
+            "Default: disabled\n");
+    fprintf(stderr, " -Z = flush output to zero (soft tester only). "
+            "Default: disabled\n");
+}
+
+static void parse_opts(int argc, char *argv[])
+{
+    int c;
+
+    for (;;) {
+        c = getopt(argc, argv, "ae:f:hnt:w:zZ");
+        if (c < 0) {
+            return;
+        }
+        switch (c) {
+        case 'a':
+            soft_status.float_detect_tininess = float_tininess_after_rounding;
+            break;
+        case 'e':
+            set_default_exceptions(optarg);
+            break;
+        case 'f':
+            set_input_fmt(optarg);
+            break;
+        case 'h':
+            usage_complete(argc, argv);
+            exit(EXIT_SUCCESS);
+        case 'n':
+            die_on_error = false;
+            break;
+        case 't':
+            set_tester(optarg);
+            break;
+        case 'w':
+            set_whitelist(optarg);
+            break;
+        case 'z':
+            soft_status.flush_inputs_to_zero = 1;
+            break;
+        case 'Z':
+            soft_status.flush_to_zero = 1;
+            break;
+        }
+    }
+    g_assert_not_reached();
+}
+
+static uint64_t count_errors(void)
+{
+    uint64_t ret = 0;
+    int i;
+
+    for (i = ERROR_INPUT; i < ERROR_MAX; i++) {
+        ret += test_stats[i];
+    }
+    return ret;
+}
+
+int main(int argc, char *argv[])
+{
+    uint64_t n_errors;
+    int i;
+
+    if (argc == 1) {
+        usage_complete(argc, argv);
+        exit(EXIT_FAILURE);
+    }
+    parse_opts(argc, argv);
+    for (i = optind; i < argc; i++) {
+        test_file(argv[i]);
+    }
+
+    n_errors = count_errors();
+    if (n_errors) {
+        printf("Tests failed: %"PRIu64". Parsing: %"PRIu64
+               ", result:%"PRIu64", flags:%"PRIu64"\n",
+               n_errors, test_stats[ERROR_INPUT], test_stats[ERROR_RESULT],
+               test_stats[ERROR_EXCEPTIONS]);
+    } else {
+        printf("All tests OK.\n");
+    }
+    printf("Tests passed: %" PRIu64 ". Not handled: %" PRIu64
+           ", whitelisted: %"PRIu64 "\n",
+           test_stats[ERROR_NONE], test_stats[ERROR_NOT_HANDLED],
+           test_stats[ERROR_WHITELISTED]);
+    return !!n_errors;
+}
diff --git a/tests/Makefile.include b/tests/Makefile.include
index d098a10..13e547e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -648,6 +648,9 @@ tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
 tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y)
 tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
 
+tests/fp/%:
+	$(MAKE) -C $(dir $@) $(notdir $@)
+
 tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
 	hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
 	hw/core/bus.o \
diff --git a/tests/fp/.gitignore b/tests/fp/.gitignore
new file mode 100644
index 0000000..0a9fef4
--- /dev/null
+++ b/tests/fp/.gitignore
@@ -0,0 +1,3 @@
+ibm
+*.txt
+fp-test
diff --git a/tests/fp/Makefile b/tests/fp/Makefile
new file mode 100644
index 0000000..a208f4c
--- /dev/null
+++ b/tests/fp/Makefile
@@ -0,0 +1,34 @@
+BUILD_DIR=$(CURDIR)/../..
+
+include ../../config-host.mak
+include $(SRC_PATH)/rules.mak
+
+$(call set-vpath, $(SRC_PATH)/tests/fp $(SRC_PATH)/fpu)
+
+QEMU_INCLUDES += -I../..
+QEMU_INCLUDES +=3D -I$(SRC_PATH)/fpu +# work around TARGET_* poisoning +QEMU_CFLAGS +=3D -DHW_POISON_H + +IBMFP :=3D ibm-fptests.zip + +OBJS :=3D fp-test$(EXESUF) + +WHITELIST_FILES :=3D whitelist.txt whitelist-tininess-after.txt + +all: $(OBJS) ibm $(WHITELIST_FILES) + +ibm: + wget -nv -O $(IBMFP) http://www.haifa.il.ibm.com/projects/verification/fp= gen/download/test_suite.zip + mkdir -p $@ + unzip $(IBMFP) -d $@ + rm -rf $(IBMFP) + +# XXX: upload this to a qemu server, or just commit it. +$(WHITELIST_FILES): + wget -nv -O $@ http://www.cs.columbia.edu/~cota/qemu/fpbench-$@ + +fp-test$(EXESUF): fp-test.o softfloat.o + +clean: + rm -f *.o *.d $(OBJS) --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528769152110984.9711388106809; Mon, 11 Jun 2018 19:05:52 -0700 (PDT) Received: from localhost ([::1]:52308 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYgx-0008Vm-5z for importer@patchew.org; Mon, 11 Jun 2018 22:05:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40795) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046P-Pe for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQh-0003LM-S5 for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:36241) by eggs.gnu.org with esmtps 
(TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQh-0003It-Ha for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id A1C0421BFF; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:01 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 3BBB0E42DB; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=3eo3BF5jLNz3Dk lWxaBzonf9C4mPCeI4qVttCM+Iw1s=; b=psHv5foxY0wt+hkXUeiRckZPNzSBgq z8I7KKH8YSn3yaUhqML60w+zfmPzJOo3JklAS7xOT2oQN7YMqnTyC2J44xmpyuPq sOHdlYmXDvySVFiFD6uCAE+ItcP3+zs/47kJ5/iPeqHxgDq+5DlNwHOVMcPO99lC aLQ0QuLooQ3Bg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=3eo3BF5jLNz3DklWxaBzonf9C4mPCeI4qVttCM+Iw1s=; b=tvw7X4fU z4poeYwsIWKK18CVpAdak+llShMz39vs668DmnD5G4e1bllngOL50nV4CE5oC6Gv tBdG8JUhCfySLMoBC/FuMRV9FzTJmWMlZPRvG0hfS96aoept58vKfR2GhSS12vYt DRBoLdXTI1MoqIK+5GDKdgn7wFVAKcs2NkyDS01eaIVkLHC8ZVdTkn9UyTqYuuBS t6P8ukpQwazwyo32FApLKgoZIq0IzCZTYld+9uRnm4/8R+QEKsY2aZTsruZrOYgh JRuvVEzq+o6ZZTuhJaRyJLQy6bMUH5GIN8QDNXlUpXsm8uLRXt2xxavxr5GZ01b+ FKw9+pEgo8O5Rg== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:48 -0400 Message-Id: <1528768140-17894-3-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 02/14] fp-test: add muladd variants X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" These are a few muladd-related operations that the original IBM syntax does not specify; model files for these are in muladd.fptest. Signed-off-by: Emilio G. 
Cota --- tests/fp/fp-test.c | 24 ++++++++++++++++++++++++ tests/fp/muladd.fptest | 51 ++++++++++++++++++++++++++++++++++++++++++++++= ++++ 2 files changed, 75 insertions(+) create mode 100644 tests/fp/muladd.fptest diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c index 6be9ce7..bf6d0f3 100644 --- a/tests/fp/fp-test.c +++ b/tests/fp/fp-test.c @@ -52,6 +52,9 @@ enum op { OP_SUB, OP_MUL, OP_MULADD, + OP_MULADD_NEG_ADDEND, + OP_MULADD_NEG_PRODUCT, + OP_MULADD_NEG_RESULT, OP_DIV, OP_SQRT, OP_MINNUM, @@ -68,6 +71,9 @@ static const struct op_desc ops[] =3D { [OP_SUB] =3D { "-", 2 }, [OP_MUL] =3D { "*", 2 }, [OP_MULADD] =3D { "*+", 3 }, + [OP_MULADD_NEG_ADDEND] =3D { "*+nc", 3 }, + [OP_MULADD_NEG_PRODUCT] =3D { "*+np", 3 }, + [OP_MULADD_NEG_RESULT] =3D { "*+nr", 3 }, [OP_DIV] =3D { "/", 2 }, [OP_SQRT] =3D { "V", 1 }, [OP_MINNUM] =3D { " Q i +b32*+nc =3D0 -1.7FFFFFP127 -Inf +Inf -> Q i +b32*+nc =3D0 -1.6C9AE7P113 -Inf +Inf -> Q i +b32*+nc =3D0 -1.000000P-126 -Inf +Inf -> Q i +b32*+nc =3D0 -0.7FFFFFP-126 -Inf +Inf -> Q i +b32*+nc =3D0 -0.1B977AP-126 -Inf +Inf -> Q i +b32*+nc =3D0 -0.000001P-126 -Inf +Inf -> Q i +b32*+nc =3D0 -1.000000P0 -Inf +Inf -> Q i +b32*+nc =3D0 -Zero -Inf +Inf -> Q i +b32*+nc =3D0 +Zero -Inf +Inf -> Q i +b32*+nc =3D0 -Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127 +b32*+nc =3D0 +Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127 +b32*+nc =3D0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x +b32*+nc =3D0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> +1.7FFFFDP1 x +b32*+nc =3D0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> +1.75AD5BP0 x +b32*+nc =3D0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> +1.7FFFFFP-22 x + +# np =3D=3D negate product +b32*+np =3D0 +Inf -Inf -Inf -> Q i +b32*+np =3D0 +1.7FFFFFP127 -Inf -Inf -> Q i +b32*+np =3D0 +1.6C9AE7P113 -Inf -Inf -> Q i +b32*+np =3D0 +1.000000P-126 -Inf -Inf -> Q i +b32*+np =3D0 +0.7FFFFFP-126 -Inf -Inf -> Q i +b32*+np =3D0 +0.1B977AP-126 -Inf -Inf -> Q i +b32*+np =3D0 +0.000001P-126 
-Inf -Inf -> Q i +b32*+np =3D0 +1.000000P0 -Inf -Inf -> Q i +b32*+np =3D0 +Zero -Inf -Inf -> Q i +b32*+np =3D0 +Zero -Inf -Inf -> Q i +b32*+np =3D0 -Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127 +b32*+np =3D0 +Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127 +b32*+np =3D0 -1.3A6A89P-18 +1.24E7AEP9 -0.7FFFFFP-126 -> +1.7029E9P-9 x + +# nr =3D=3D negate result +b32*+nr =3D0 -Inf -Inf -Inf -> Q i +b32*+nr =3D0 -1.7FFFFFP127 -Inf -Inf -> Q i +b32*+nr =3D0 -1.6C9AE7P113 -Inf -Inf -> Q i +b32*+nr =3D0 -1.000000P-126 -Inf -Inf -> Q i +b32*+nr =3D0 -0.7FFFFFP-126 -Inf -Inf -> Q i +b32*+nr =3D0 -0.1B977AP-126 -Inf -Inf -> Q i +b32*+nr =3D0 -0.000001P-126 -Inf -Inf -> Q i +b32*+nr =3D0 -1.000000P0 -Inf -Inf -> Q i +b32*+nr =3D0 -Zero -Inf -Inf -> Q i +b32*+nr =3D0 -Zero -Inf -Inf -> Q i +b32*+nr =3D0 +Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127 +b32*+nr =3D0 -Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127 +b32*+nr =3D0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x +b32*+nr =3D0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> -1.7FFFFDP1 x +b32*+nr =3D0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> -1.75AD5BP0 x +b32*+nr =3D0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> -1.7FFFFFP-22 x +b32*+nr =3D0 +1.72E53AP-33 -1.7FFFFFP127 -1.5AA684P-2 -> +1.72E539P95 x --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528768747269266.8142365209269; Mon, 11 Jun 2018 18:59:07 -0700 (PDT) Received: from localhost ([::1]:52268 
helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYaQ-0003dC-B5 for importer@patchew.org; Mon, 11 Jun 2018 21:59:06 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40789) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00044D-9H for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQh-0003LY-Rs for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:35369) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQh-0003Ix-IS for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id C9EFF21CF0; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:01 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 8298C10266; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-sender :x-me-sender:x-sasl-enc; s=mesmtp; bh=KzUUdskvRawc5yhtZGF785MSLH EU2vZ9BnuIeu9zUhQ=; b=GBE+vvpiEBwK1ZVuN7CybdjA09CoOFMCpSy0j3XMCM jh9nJL+c/hYOJHa8PditbejReVSo43L3JVzMRq0qN3dNnzyWVgpfDvkaL2+K5aeg CLea3drCMCb5bqcZ8QhBV0AxQlod7ihg7JIDjelogbWAu9UDz9Z6bJ6PCrXreebc 4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; bh=KzUUds kvRawc5yhtZGF785MSLHEU2vZ9BnuIeu9zUhQ=; b=cgVhlTpO9PNp8/xvMSNOh8 
QFNxfWQQ9YMvvvsBbFPLKfQpSVUGsFl2djzXOgCQY6CONqMpgyM5FJmH5tHIz9lG lNJnUVTqgQxPaHzAPQMqWvbIK9EoZhgiB178WAqRtBURp4eNpWt+gRTulN1L3XY6 Q6myYaloM4XESaLMl5vvDzu6SMPtr3WxnVjPHXPQ+T+3fTAsrSsMQWN1LePkSqNc W2J3dlh1MqsjBHBnrpJEXfS71mqsO2S5kavXq7qHABfylEWkYfIxDE0h6Q+5CkiM IkhElFSd+7R0kRHME25i3YIqKF7eA/5m4NYEBkvjFTmFJGQxn/jYaZZ131feONjA == X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:49 -0400 Message-Id: <1528768140-17894-4-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 03/14] softfloat: add float{32, 64}_is_{de, }normal X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Bastian Koppelmann , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 This paves the way for upcoming work. Cc: Bastian Koppelmann Reviewed-by: Bastian Koppelmann Reviewed-by: Alex Benn=C3=A9e Signed-off-by: Emilio G. 
Cota --- include/fpu/softfloat.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 69f4dbc..1fbece5 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -412,6 +412,16 @@ static inline int float32_is_zero_or_denormal(float32 = a) return (float32_val(a) & 0x7f800000) =3D=3D 0; } =20 +static inline bool float32_is_normal(float32 a) +{ + return ((float32_val(a) + 0x00800000) & 0x7fffffff) >=3D 0x01000000; +} + +static inline bool float32_is_denormal(float32 a) +{ + return float32_is_zero_or_denormal(a) && !float32_is_zero(a); +} + static inline float32 float32_set_sign(float32 a, int sign) { return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31)); @@ -541,6 +551,16 @@ static inline int float64_is_zero_or_denormal(float64 = a) return (float64_val(a) & 0x7ff0000000000000LL) =3D=3D 0; } =20 +static inline bool float64_is_normal(float64 a) +{ + return ((float64_val(a) + (1ULL << 52)) & -1ULL >> 1) >=3D 1ULL << 53; +} + +static inline bool float64_is_denormal(float64 a) +{ + return float64_is_zero_or_denormal(a) && !float64_is_zero(a); +} + static inline float64 float64_set_sign(float64 a, int sign) { return make_float64((float64_val(a) & 0x7fffffffffffffffULL) --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528768745817623.9422877159499; Mon, 11 Jun 2018 18:59:05 -0700 (PDT) Received: from localhost ([::1]:52267 helo=lists.gnu.org) by 
lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYaP-0003c5-3L for importer@patchew.org; Mon, 11 Jun 2018 21:59:05 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40794) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046I-Pf for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQh-0003LN-Pw for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:56157) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQh-0003J9-JJ for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 055E521D03; Mon, 11 Jun 2018 21:49:02 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:02 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id B1160E425A; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=/l3+O3JYNy2eG6 5fY7wW0JnkOA5qQfCPZ8sVipaRobw=; b=Q/FY6wWSo/qfqLeP26xQxid6sEKIn6 e6YFcft5iop4vmyWW6QkeTeJtaXqVf5xz7DjLlrLAZh6I6kj50Gmy21L/tE6cgLu 6UwwCWeVD+OtcKc1tThgLbyPi2Dw6us+XzPVeJEFwwgHrtl9etgM9zXvkgAvRdXb yclyldMEb8Fq4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=/l3+O3JYNy2eG65fY7wW0JnkOA5qQfCPZ8sVipaRobw=; b=tH2Os8C5 isUEBE8StYpuTkcElWwL7YmxaQT/6b/SCqZpNgQACvIjj6cWwQ2bXRi3RxkqTWnt TWSn1Fsv7G3jD1k5TcYNaxAsrC3cMiKt3PDK4kukl5dPhEH4gh5O/u10TIBmNraQ 
UBPwnRmtIqOw9yX7KSIdhlm+I1gxXK58Mt6th9ZTp7kgfrLhowCbjhJzEDLHH3bT BSM5Ga4TcfZmr6RKC4wJiuG2KUqHBEPD330p3CU5Kjsqek/4Yb1PRKjtCdvuLAds WCmtEx7CC9ss8lVuIAl9tMLbd3YVshaZG/WZrAgYrKZKmdzRorGiGttt1bmU2biA XVB+z734pDgpOQ== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:50 -0400 Message-Id: <1528768140-17894-5-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 04/14] target/tricore: use float32_is_denormal X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Bastian Koppelmann , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Cc: Bastian Koppelmann Reviewed-by: Bastian Koppelmann Signed-off-by: Emilio G. 
Cota --- target/tricore/fpu_helper.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c index df16290..31df462 100644 --- a/target/tricore/fpu_helper.c +++ b/target/tricore/fpu_helper.c @@ -44,11 +44,6 @@ static inline uint8_t f_get_excp_flags(CPUTriCoreState *= env) | float_flag_inexact); } =20 -static inline bool f_is_denormal(float32 arg) -{ - return float32_is_zero_or_denormal(arg) && !float32_is_zero(arg); -} - static inline float32 f_maddsub_nan_result(float32 arg1, float32 arg2, float32 arg3, float32 result, uint32_t muladd_negate_c) @@ -260,8 +255,8 @@ uint32_t helper_fcmp(CPUTriCoreState *env, uint32_t r1,= uint32_t r2) set_flush_inputs_to_zero(0, &env->fp_status); =20 result =3D 1 << (float32_compare_quiet(arg1, arg2, &env->fp_status) + = 1); - result |=3D f_is_denormal(arg1) << 4; - result |=3D f_is_denormal(arg2) << 5; + result |=3D float32_is_denormal(arg1) << 4; + result |=3D float32_is_denormal(arg2) << 5; =20 flags =3D f_get_excp_flags(env); if (flags) { --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528769152545559.7545278742787; Mon, 11 Jun 2018 19:05:52 -0700 (PDT) Received: from localhost ([::1]:52309 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYgx-0008W2-Qx for importer@patchew.org; Mon, 11 Jun 2018 22:05:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40807) by lists.gnu.org with 
esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046k-S6 for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQi-0003MI-9N for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:47357) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQi-0003L0-0v for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 46DEA20F50; Mon, 11 Jun 2018 21:49:02 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:02 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id E18C51025D; Mon, 11 Jun 2018 21:49:01 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=biOJABZN96qNSY Gb3p/dnYw+/TGFNYWGnRfcfACbP8U=; b=Wr2GBFvelsPKdV3ASC4a4icOx5hZe9 85EJZ2v4SwFsBoExsNR7GUcDl8LdJLwN6N9oaYWncfq5GQFEtsh/praO+D7n8gWp Kg8Coj2hte6WLHid4ejpX/0buPk/jQjNLhIUkU4G3n66um5EDVbZD61QmM2N13w2 eEfh+XKNmkXys= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=biOJABZN96qNSYGb3p/dnYw+/TGFNYWGnRfcfACbP8U=; b=YzejLhNT vwZPIrCsKxG6XN/DBTq1ZsxVxTUm+o+bn080Gan+xHvLt8yBPDLjfybLR56AoqZQ wrjCJwU8JhIQ6hp4j6kdcP8HXoZ1sbelbgET/pMEiXdwgeS1JpUvIeOyky5RJU2Z Q6jCnkduw4XI8HqBglCfJiDttEiBRp7ZEnRLq4qgPyoPO3fLNPAH9nQQSu0sychx lfQiCNqYp9t7tHGWJIkc+j2QOg3By9CU2SDioaGNU0f08bHRhhg8c/4xBIQiFw/B 1KKoYImwUbGqb946njdc8P+6gJzpEEnabd2yII5v7hkZ4+WCcvfzBgDSs3T3TL6K RytDNnyL7AyVmQ== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: 
X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:51 -0400 Message-Id: <1528768140-17894-6-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 05/14] tests/fp: add fp-bench, a collection of simple floating point microbenchmarks X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This will allow us to measure the performance impact of FP emulation optimizations. Note that we can measure either the direct impact on the softfloat functions (with "-t soft") or the impact on an emulated workload (call with "-t host" and run under qemu user-mode). Signed-off-by: Emilio G. Cota --- tests/fp/fp-bench.c | 526 ++++++++++++++++++++++++++++++++++++++++++++++++= ++++ tests/fp/.gitignore | 1 + tests/fp/Makefile | 4 +- 3 files changed, 530 insertions(+), 1 deletion(-) create mode 100644 tests/fp/fp-bench.c diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c new file mode 100644 index 0000000..e4c6885 --- /dev/null +++ b/tests/fp/fp-bench.c @@ -0,0 +1,526 @@ +/* + * fp-bench.c - A collection of simple floating point microbenchmarks. + * + * Copyright (C) 2018, Emilio G. Cota + * + * License: GNU GPL, version 2 or later. 
+ * See the COPYING file in the top-level directory. + */ +#ifndef HW_POISON_H +#error Must define HW_POISON_H to work around TARGET_* poisoning +#endif + +#include "qemu/osdep.h" +#include +#include "qemu/timer.h" +#include "fpu/softfloat.h" + +/* amortize the computation of random inputs */ +#define OPS_PER_ITER 50000 + +#define MAX_OPERANDS 3 + +#define SEED_A 0xdeadfacedeadface +#define SEED_B 0xbadc0feebadc0fee +#define SEED_C 0xbeefdeadbeefdead + +enum op { + OP_ADD, + OP_SUB, + OP_MUL, + OP_DIV, + OP_FMA, + OP_SQRT, + OP_CMP, + OP_MAX_NR, +}; + +static const char * const op_names[] =3D { + [OP_ADD] =3D "add", + [OP_SUB] =3D "sub", + [OP_MUL] =3D "mul", + [OP_DIV] =3D "div", + [OP_FMA] =3D "fma", + [OP_SQRT] =3D "sqrt", + [OP_CMP] =3D "cmp", + [OP_MAX_NR] =3D NULL, +}; + +enum precision { + PREC_SINGLE, + PREC_DOUBLE, + PREC_FLOAT32, + PREC_FLOAT64, + PREC_MAX_NR, +}; + +enum tester { + TESTER_SOFT, + TESTER_HOST, + TESTER_MAX_NR, +}; + +static const char * const tester_names[] =3D { + [TESTER_SOFT] =3D "soft", + [TESTER_HOST] =3D "host", + [TESTER_MAX_NR] =3D NULL, +}; + +union fp { + float f; + double d; + float32 f32; + float64 f64; + uint64_t u64; +}; + +struct op_state; + +typedef float (*float_func_t)(const struct op_state *s); +typedef double (*double_func_t)(const struct op_state *s); + +union fp_func { + float_func_t float_func; + double_func_t double_func; +}; + +typedef void (*bench_func_t)(void); + +struct op_desc { + const char * const name; +}; + +#define DEFAULT_DURATION_SECS 1 + +static uint64_t random_ops[MAX_OPERANDS] =3D { + SEED_A, SEED_B, SEED_C, +}; +static float_status soft_status; +static enum precision precision; +static enum op operation; +static enum tester tester; +static uint64_t n_completed_ops; +static unsigned int duration =3D DEFAULT_DURATION_SECS; +static int64_t ns_elapsed; +/* disable optimizations with volatile */ +static volatile union fp res; + +/* + * From: https://en.wikipedia.org/wiki/Xorshift + * This is faster than 
rand_r(), and gives us a wider range (RAND_MAX is o= nly + * guaranteed to be >=3D 32767). + */ +static uint64_t xorshift64star(uint64_t x) +{ + x ^=3D x >> 12; /* a */ + x ^=3D x << 25; /* b */ + x ^=3D x >> 27; /* c */ + return x * UINT64_C(2685821657736338717); +} + +static void update_random_ops(int n_ops, enum precision prec) +{ + int i; + + for (i =3D 0; i < n_ops; i++) { + uint64_t r =3D random_ops[i]; + + if (prec =3D=3D PREC_SINGLE || prec =3D=3D PREC_FLOAT32) { + do { + r =3D xorshift64star(r); + } while (!float32_is_normal(r)); + } else if (prec =3D=3D PREC_DOUBLE || prec =3D=3D PREC_FLOAT64) { + do { + r =3D xorshift64star(r); + } while (!float64_is_normal(r)); + } else { + g_assert_not_reached(); + } + random_ops[i] =3D r; + } +} + +static void fill_random(union fp *ops, int n_ops, enum precision prec, + bool no_neg) +{ + int i; + + for (i =3D 0; i < n_ops; i++) { + switch (prec) { + case PREC_SINGLE: + case PREC_FLOAT32: + ops[i].f32 =3D make_float32(random_ops[i]); + if (no_neg && float32_is_neg(ops[i].f32)) { + ops[i].f32 =3D float32_chs(ops[i].f32); + } + /* raise the exponent to limit the frequency of denormal resul= ts */ + ops[i].f32 |=3D 0x40000000; + break; + case PREC_DOUBLE: + case PREC_FLOAT64: + ops[i].f64 =3D make_float64(random_ops[i]); + if (no_neg && float64_is_neg(ops[i].f64)) { + ops[i].f64 =3D float64_chs(ops[i].f64); + } + /* raise the exponent to limit the frequency of denormal resul= ts */ + ops[i].f64 |=3D LIT64(0x4000000000000000); + break; + default: + g_assert_not_reached(); + } + } +} + +/* + * The main benchmark function. Instead of (ab)using macros, we rely + * on the compiler to unfold this at compile-time. 
+ */ +static void bench(enum precision prec, enum op op, int n_ops, bool no_neg) +{ + int64_t tf =3D get_clock_realtime() + duration * 1000000000LL; + + while (get_clock_realtime() < tf) { + union fp ops[MAX_OPERANDS]; + int64_t t0; + int i; + + update_random_ops(n_ops, prec); + switch (prec) { + case PREC_SINGLE: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock_realtime(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float a =3D ops[0].f; + float b =3D ops[1].f; + float c =3D ops[2].f; + + switch (op) { + case OP_ADD: + res.f =3D a + b; + break; + case OP_SUB: + res.f =3D a - b; + break; + case OP_MUL: + res.f =3D a * b; + break; + case OP_DIV: + res.f =3D a / b; + break; + case OP_FMA: + res.f =3D fmaf(a, b, c); + break; + case OP_SQRT: + res.f =3D sqrtf(a); + break; + case OP_CMP: + res.u64 =3D isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_DOUBLE: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock_realtime(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + double a =3D ops[0].d; + double b =3D ops[1].d; + double c =3D ops[2].d; + + switch (op) { + case OP_ADD: + res.d =3D a + b; + break; + case OP_SUB: + res.d =3D a - b; + break; + case OP_MUL: + res.d =3D a * b; + break; + case OP_DIV: + res.d =3D a / b; + break; + case OP_FMA: + res.d =3D fma(a, b, c); + break; + case OP_SQRT: + res.d =3D sqrt(a); + break; + case OP_CMP: + res.u64 =3D isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT32: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock_realtime(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float32 a =3D ops[0].f32; + float32 b =3D ops[1].f32; + float32 c =3D ops[2].f32; + + switch (op) { + case OP_ADD: + res.f32 =3D float32_add(a, b, &soft_status); + break; + case OP_SUB: + res.f32 =3D float32_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f32 =3D float32_mul(a, b, &soft_status); + break; + case OP_DIV: + res.f32 =3D float32_div(a, b, 
&soft_status); + break; + case OP_FMA: + res.f32 =3D float32_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f32 =3D float32_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 =3D float32_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT64: + fill_random(ops, n_ops, prec, no_neg); + t0 =3D get_clock_realtime(); + for (i =3D 0; i < OPS_PER_ITER; i++) { + float64 a =3D ops[0].f64; + float64 b =3D ops[1].f64; + float64 c =3D ops[2].f64; + + switch (op) { + case OP_ADD: + res.f64 =3D float64_add(a, b, &soft_status); + break; + case OP_SUB: + res.f64 =3D float64_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f64 =3D float64_mul(a, b, &soft_status); + break; + case OP_DIV: + res.f64 =3D float64_div(a, b, &soft_status); + break; + case OP_FMA: + res.f64 =3D float64_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f64 =3D float64_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 =3D float64_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + default: + g_assert_not_reached(); + } + ns_elapsed +=3D get_clock_realtime() - t0; + n_completed_ops +=3D OPS_PER_ITER; + } +} + +#define GEN_BENCH(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, false); \ + } + +#define GEN_BENCH_NO_NEG(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, true); \ + } + +#define GEN_BENCH_ALL_TYPES(opname, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _float, float, PREC_SINGLE, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _double, double, PREC_DOUBLE, op, n_ops)= \ + GEN_BENCH(bench_ ## opname ## _float32, float32, PREC_FLOAT32, op, n_o= ps) \ + GEN_BENCH(bench_ ## opname ## _float64, float64, PREC_FLOAT64, op, n_o= ps) + +GEN_BENCH_ALL_TYPES(add, OP_ADD, 2) +GEN_BENCH_ALL_TYPES(sub, OP_SUB, 2) 
+GEN_BENCH_ALL_TYPES(mul, OP_MUL, 2)
+GEN_BENCH_ALL_TYPES(div, OP_DIV, 2)
+GEN_BENCH_ALL_TYPES(fma, OP_FMA, 3)
+GEN_BENCH_ALL_TYPES(cmp, OP_CMP, 2)
+#undef GEN_BENCH_ALL_TYPES
+
+#define GEN_BENCH_ALL_TYPES_NO_NEG(name, op, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float, float, PREC_SINGLE, op, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _double, double, PREC_DOUBLE, op, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float32, float32, PREC_FLOAT32, op, n) \
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float64, float64, PREC_FLOAT64, op, n)
+
+GEN_BENCH_ALL_TYPES_NO_NEG(sqrt, OP_SQRT, 1)
+#undef GEN_BENCH_ALL_TYPES_NO_NEG
+
+#undef GEN_BENCH_NO_NEG
+#undef GEN_BENCH
+
+#define GEN_BENCH_FUNCS(opname, op) \
+    [op] = { \
+        [PREC_SINGLE] = bench_ ## opname ## _float, \
+        [PREC_DOUBLE] = bench_ ## opname ## _double, \
+        [PREC_FLOAT32] = bench_ ## opname ## _float32, \
+        [PREC_FLOAT64] = bench_ ## opname ## _float64, \
+    }
+
+static const bench_func_t bench_funcs[OP_MAX_NR][PREC_MAX_NR] = {
+    GEN_BENCH_FUNCS(add, OP_ADD),
+    GEN_BENCH_FUNCS(sub, OP_SUB),
+    GEN_BENCH_FUNCS(mul, OP_MUL),
+    GEN_BENCH_FUNCS(div, OP_DIV),
+    GEN_BENCH_FUNCS(fma, OP_FMA),
+    GEN_BENCH_FUNCS(sqrt, OP_SQRT),
+    GEN_BENCH_FUNCS(cmp, OP_CMP),
+};
+
+#undef GEN_BENCH_FUNCS
+
+static void run_bench(void)
+{
+    bench_func_t f;
+
+    f = bench_funcs[operation][precision];
+    g_assert(f);
+    f();
+}
+
+/* @arr must be NULL-terminated */
+static int find_name(const char * const *arr, const char *name)
+{
+    int i;
+
+    for (i = 0; arr[i] != NULL; i++) {
+        if (strcmp(name, arr[i]) == 0) {
+            return i;
+        }
+    }
+    return -1;
+}
+
+static void usage_complete(int argc, char *argv[])
+{
+    gchar *op_list = g_strjoinv(", ", (gchar **)op_names);
+    gchar *tester_list = g_strjoinv(", ", (gchar **)tester_names);
+
+    fprintf(stderr, "Usage: %s [options]\n", argv[0]);
+    fprintf(stderr, "options:\n");
+    fprintf(stderr, " -d = duration, in seconds. Default: %d\n",
+            DEFAULT_DURATION_SECS);
+    fprintf(stderr, " -h = show this help message.\n");
+    fprintf(stderr, " -o = floating point operation (%s). Default: %s\n",
+            op_list, op_names[0]);
+    fprintf(stderr, " -p = floating point precision (single, double). "
+            "Default: single\n");
+    fprintf(stderr, " -t = tester (%s). Default: %s\n",
+            tester_list, tester_names[0]);
+    fprintf(stderr, " -z = flush inputs to zero (soft tester only). "
+            "Default: disabled\n");
+    fprintf(stderr, " -Z = flush output to zero (soft tester only). "
+            "Default: disabled\n");
+
+    g_free(tester_list);
+    g_free(op_list);
+}
+
+static void parse_args(int argc, char *argv[])
+{
+    int c;
+    int val;
+
+    for (;;) {
+        c = getopt(argc, argv, "d:ho:p:t:zZ");
+        if (c < 0) {
+            break;
+        }
+        switch (c) {
+        case 'd':
+            duration = atoi(optarg);
+            break;
+        case 'h':
+            usage_complete(argc, argv);
+            exit(EXIT_SUCCESS);
+        case 'o':
+            val = find_name(op_names, optarg);
+            if (val < 0) {
+                fprintf(stderr, "Unsupported op '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            operation = val;
+            break;
+        case 'p':
+            if (!strcmp(optarg, "single")) {
+                precision = PREC_SINGLE;
+            } else if (!strcmp(optarg, "double")) {
+                precision = PREC_DOUBLE;
+            } else {
+                fprintf(stderr, "Unsupported precision '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            break;
+        case 't':
+            val = find_name(tester_names, optarg);
+            if (val < 0) {
+                fprintf(stderr, "Unsupported tester '%s'\n", optarg);
+                exit(EXIT_FAILURE);
+            }
+            tester = val;
+            break;
+        case 'z':
+            soft_status.flush_inputs_to_zero = 1;
+            break;
+        case 'Z':
+            soft_status.flush_to_zero = 1;
+            break;
+        }
+    }
+
+    /* set precision based on the tester */
+    switch (tester) {
+    case TESTER_HOST:
+        break;
+    case TESTER_SOFT:
+        switch (precision) {
+        case PREC_SINGLE:
+            precision = PREC_FLOAT32;
+            break;
+        case PREC_DOUBLE:
+            precision = PREC_FLOAT64;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void pr_stats(void)
+{
+    printf("%.2f MFlops\n", (double)n_completed_ops / ns_elapsed * 1e3);
+}
+
+int main(int argc, char *argv[])
+{
+    parse_args(argc, argv);
+    run_bench();
+    pr_stats();
+    return 0;
+}
diff --git a/tests/fp/.gitignore b/tests/fp/.gitignore
index 0a9fef4..a4e59d7 100644
--- a/tests/fp/.gitignore
+++ b/tests/fp/.gitignore
@@ -1,3 +1,4 @@
 ibm
 *.txt
 fp-test
+fp-bench
diff --git a/tests/fp/Makefile b/tests/fp/Makefile
index a208f4c..7c88ab0 100644
--- a/tests/fp/Makefile
+++ b/tests/fp/Makefile
@@ -12,7 +12,7 @@ QEMU_CFLAGS += -DHW_POISON_H
 
 IBMFP := ibm-fptests.zip
 
-OBJS := fp-test$(EXESUF)
+OBJS := fp-test$(EXESUF) fp-bench$(EXESUF)
 
 WHITELIST_FILES := whitelist.txt whitelist-tininess-after.txt
 
@@ -30,5 +30,7 @@ $(WHITELIST_FILES):
 
 fp-test$(EXESUF): fp-test.o softfloat.o
 
+fp-bench$(EXESUF): fp-bench.o softfloat.o
+
 clean:
 	rm -f *.o *.d $(OBJS)
-- 
2.7.4

From nobody Sun May 5 07:15:14 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Mon, 11 Jun 2018 21:48:52 -0400
Message-Id: <1528768140-17894-7-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
References: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 06/14] softfloat: rename canonicalize to sf_canonicalize
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno

glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize,
canonicalizef, canonicalizel., 2016-10-26).

Given that we'll be including <math.h> soon, prepare for this by
prefixing our canonicalize() with sf_ to avoid clashing with the libc's
canonicalize().

Cc: Bastian Koppelmann
Reported-by: Bastian Koppelmann
Tested-by: Bastian Koppelmann
Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 8cd2400..2ab5a88 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -336,8 +336,8 @@ static inline float64 float64_pack_raw(FloatParts p)
 #include "softfloat-specialize.h"
 
 /* Canonicalize EXP and FRAC, setting CLS.  */
-static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
-                               float_status *status)
+static FloatParts sf_canonicalize(FloatParts part, const FloatFmt *parm,
+                                  float_status *status)
 {
     if (part.exp == parm->exp_max && !parm->arm_althp) {
         if (part.frac == 0) {
@@ -513,7 +513,7 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
 static FloatParts float16a_unpack_canonical(float16 f, float_status *s,
                                             const FloatFmt *params)
 {
-    return canonicalize(float16_unpack_raw(f), params, s);
+    return sf_canonicalize(float16_unpack_raw(f), params, s);
 }
 
 static FloatParts float16_unpack_canonical(float16 f, float_status *s)
@@ -534,7 +534,7 @@ static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
 
 static FloatParts float32_unpack_canonical(float32 f, float_status *s)
 {
-    return canonicalize(float32_unpack_raw(f), &float32_params, s);
+    return sf_canonicalize(float32_unpack_raw(f), &float32_params, s);
 }
 
 static float32 float32_round_pack_canonical(FloatParts p, float_status *s)
@@ -544,7 +544,7 @@ static float32 float32_round_pack_canonical(FloatParts p, float_status *s)
 
 static FloatParts float64_unpack_canonical(float64 f, float_status *s)
 {
-    return canonicalize(float64_unpack_raw(f), &float64_params, s);
+    return sf_canonicalize(float64_unpack_raw(f), &float64_params, s);
 }
 
 static float64 float64_round_pack_canonical(FloatParts p, float_status *s)
-- 
2.7.4

From nobody Sun May 5 07:15:14 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Mon, 11 Jun 2018 21:48:53 -0400
Message-Id: <1528768140-17894-8-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
References: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 07/14] softfloat: add float{32,64}_is_zero_or_normal
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

These will gain some users very soon.

Signed-off-by: Emilio G. Cota
---
 include/fpu/softfloat.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 1fbece5..08f63ae 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -422,6 +422,11 @@ static inline bool float32_is_denormal(float32 a)
     return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
 }
 
+static inline bool float32_is_zero_or_normal(float32 a)
+{
+    return float32_is_normal(a) || float32_is_zero(a);
+}
+
 static inline float32 float32_set_sign(float32 a, int sign)
 {
     return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -561,6 +566,11 @@ static inline bool float64_is_denormal(float64 a)
     return float64_is_zero_or_denormal(a) && !float64_is_zero(a);
 }
 
+static inline bool float64_is_zero_or_normal(float64 a)
+{
+    return float64_is_normal(a) || float64_is_zero(a);
+}
+
 static inline float64 float64_set_sign(float64 a, int sign)
 {
     return make_float64((float64_val(a) & 0x7fffffffffffffffULL)
-- 
2.7.4

From nobody Sun May 5 07:15:14 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Mon, 11 Jun 2018 21:48:54 -0400
Message-Id: <1528768140-17894-9-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
References: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 08/14] fpu: introduce hardfloat
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

The appended paves the way for leveraging the host FPU for a subset
of guest FP operations. For most guest workloads (e.g. FP flags
aren't ever cleared, inexact occurs often and rounding is set to the
default [to nearest]) this will yield sizable performance speedups.

The approach followed here avoids checking the FP exception flags register.
See the added comment for details.

This assumes that QEMU is running on an IEEE754-compliant FPU and
that the rounding is set to the default (to nearest). The
implementation-dependent specifics of the FPU should not matter; things
like tininess detection and snan representation are still dealt with in
soft-fp. However, this approach will break on most hosts if we compile
QEMU with flags such as -ffast-math. We control the flags so this should
be easy to enforce though.

This patch just adds common code. Some operations will be migrated
to hardfloat in subsequent patches to ease bisection.

Note: some architectures (at least PPC, there might be others) clear
the status flags passed to softfloat before most FP operations. This
precludes the use of hardfloat, so to avoid introducing a performance
regression for those targets, we add a flag to disable hardfloat.
In the long run though it would be good to fix the targets so that
at least the inexact flag passed to softfloat is indeed sticky.

Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 341 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 341 insertions(+)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2ab5a88..4d378d7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -83,6 +83,7 @@ this code that are retained.
  * target-dependent and needs the TARGET_* macros.
  */
 #include "qemu/osdep.h"
+#include <math.h>
 #include "qemu/bitops.h"
 #include "fpu/softfloat.h"
 
@@ -95,6 +96,346 @@ this code that are retained.
 *----------------------------------------------------------------------------*/
 #include "fpu/softfloat-macros.h"
 
+/*
+ * Hardfloat
+ *
+ * Fast emulation of guest FP instructions is challenging for two reasons.
+ * First, FP instruction semantics are similar but not identical, particularly
+ * when handling NaNs. Second, emulating at reasonable speed the guest FP
+ * exception flags is not trivial: reading the host's flags register with a
+ * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp],
+ * and trapping on every FP exception is not fast nor pleasant to work with.
+ *
+ * We address these challenges by leveraging the host FPU for a subset of the
+ * operations. To do this we expand on the idea presented in this paper:
+ *
+ * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in a
+ * binary translator." Software: Practice and Experience 46.12 (2016):1591-1615.
+ *
+ * The idea is thus to leverage the host FPU to (1) compute FP operations
+ * and (2) identify whether FP exceptions occurred while avoiding
+ * expensive exception flag register accesses.
+ *
+ * An important optimization shown in the paper is that given that exception
+ * flags are rarely cleared by the guest, we can avoid recomputing some flags.
+ * This is particularly useful for the inexact flag, which is very frequently
+ * raised in floating-point workloads.
+ *
+ * We optimize the code further by deferring to soft-fp whenever FP exception
+ * detection might get hairy. Two examples: (1) when at least one operand is
+ * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 result
+ * and the result is < the minimum normal.
+ */
+#define GEN_TYPE_CONV(name, to_t, from_t) \
+    static inline to_t name(from_t a) \
+    { \
+        to_t r = *(to_t *)&a; \
+        return r; \
+    }
+
+GEN_TYPE_CONV(float32_to_float, float, float32)
+GEN_TYPE_CONV(float64_to_double, double, float64)
+GEN_TYPE_CONV(float_to_float32, float32, float)
+GEN_TYPE_CONV(double_to_float64, float64, double)
+#undef GEN_TYPE_CONV
+
+#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t) \
+    static inline void name(soft_t *a, float_status *s) \
+    { \
+        if (unlikely(soft_t ## _is_denormal(*a))) { \
+            *a = soft_t ## _set_sign(soft_t ## _zero, \
+                                     soft_t ## _is_neg(*a)); \
+            s->float_exception_flags |= float_flag_input_denormal; \
+        } \
+    }
+
+GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32)
+GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64)
+#undef GEN_INPUT_FLUSH__NOCHECK
+
+#define GEN_INPUT_FLUSH1(name, soft_t) \
+    static inline void name(soft_t *a, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+    }
+
+GEN_INPUT_FLUSH1(float32_input_flush1, float32)
+GEN_INPUT_FLUSH1(float64_input_flush1, float64)
+#undef GEN_INPUT_FLUSH1
+
+#define GEN_INPUT_FLUSH2(name, soft_t) \
+    static inline void name(soft_t *a, soft_t *b, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+        soft_t ## _input_flush__nocheck(b, s); \
+    }
+
+GEN_INPUT_FLUSH2(float32_input_flush2, float32)
+GEN_INPUT_FLUSH2(float64_input_flush2, float64)
+#undef GEN_INPUT_FLUSH2
+
+#define GEN_INPUT_FLUSH3(name, soft_t) \
+    static inline void name(soft_t *a, soft_t *b, soft_t *c, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+        soft_t ## _input_flush__nocheck(b, s); \
+        soft_t ## _input_flush__nocheck(c, s); \
+    }
+
+GEN_INPUT_FLUSH3(float32_input_flush3, float32)
+GEN_INPUT_FLUSH3(float64_input_flush3, float64)
+#undef GEN_INPUT_FLUSH3
+
+static inline bool can_use_fpu(const float_status *s)
+{
+    return likely(s->float_exception_flags & float_flag_inexact &&
+                  s->float_rounding_mode == float_round_nearest_even);
+}
+
+/*
+ * Choose whether to use fpclassify or float32/64_* primitives in the generated
+ * hardfloat functions. Each combination of number of inputs and float size
+ * gets its own value.
+ */
+#if defined(__x86_64__)
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 1
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 1
+#else
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 0
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 0
+#endif
+
+/*
+ * QEMU_HARDFLOAT_USE_ISINF chooses whether to use isinf() over
+ * float{32,64}_is_infinity when !USE_FP.
+ * On x86_64/aarch64, using the former over the latter can yield a ~6% speedup.
+ * On power64 however, using isinf() reduces fp-bench performance by up to 50%.
+ */
+#if defined(__x86_64__) || defined(__aarch64__)
+# define QEMU_HARDFLOAT_USE_ISINF 1
+#else
+# define QEMU_HARDFLOAT_USE_ISINF 0
+#endif
+
+/*
+ * Some targets clear the FP flags before most FP operations. This prevents
+ * the use of hardfloat, since hardfloat relies on the inexact flag being
+ * already set.
+ */
+#if defined(TARGET_PPC)
+# define QEMU_NO_HARDFLOAT 1
+# define QEMU_SOFTFLOAT_ATTR __attribute__((flatten))
+#else
+# define QEMU_NO_HARDFLOAT 0
+# define QEMU_SOFTFLOAT_ATTR __attribute__((noinline))
+#endif
+
+/*
+ * Hardfloat generation functions. Each operation can have two flavors:
+ * either using softfloat primitives (e.g. float32_is_zero_or_normal) for
+ * most condition checks, or native ones (e.g. fpclassify).
+ *
+ * The flavor is chosen by the callers. Instead of using macros, we rely on the
+ * compiler to propagate constants and inline everything into the callers.
+ *
+ * We only generate functions for operations with two inputs, since only
+ * these are common enough to justify consolidating them into common code.
+ */
+typedef bool (*f32_check_func_t)(float32 a, float32 b, const float_status *s);
+typedef bool (*f64_check_func_t)(float64 a, float64 b, const float_status *s);
+typedef bool (*float_check_func_t)(float a, float b, const float_status *s);
+typedef bool (*double_check_func_t)(double a, double b, const float_status *s);
+
+typedef float32 (*f32_op2_func_t)(float32 a, float32 b, float_status *s);
+typedef float64 (*f64_op2_func_t)(float64 a, float64 b, float_status *s);
+typedef float (*float_op2_func_t)(float a, float b);
+typedef double (*double_op2_func_t)(double a, double b);
+
+/* 2-input is-zero-or-normal */
+static inline bool
+f32_is_zon2(float32 a, float32 b, const struct float_status *s)
+{
+    return likely(float32_is_zero_or_normal(a) &&
+                  float32_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+float_is_zon2(float a, float b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  (fpclassify(b) == FP_NORMAL || fpclassify(b) == FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+f64_is_zon2(float64 a, float64 b, const struct float_status *s)
+{
+    return likely(float64_is_zero_or_normal(a) &&
+                  float64_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+double_is_zon2(double a, double b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  (fpclassify(b) == FP_NORMAL || fpclassify(b) == FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+/*
+ * Note: @fast and @post can be NULL.
+ * Note: @fast and @fast_op always use softfloat types.
+ */
+static inline float32
+f32_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+         f32_op2_func_t soft, f32_check_func_t pre, f32_check_func_t post,
+         f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float ha = float32_to_float(a);
+            float hb = float32_to_float(b);
+            float hr = hard(ha, hb);
+            float32 r = float_to_float32(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float32_is_infinity(r))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <= FLT_MIN &&
+                                (post == NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float32
+float_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+           f32_op2_func_t soft, float_check_func_t pre, float_check_func_t post,
+           f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    float ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    ha = float32_to_float(a);
+    hb = float32_to_float(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float hr = hard(ha, hb);
+            float32 r = float_to_float32(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <= FLT_MIN &&
+                                (post == NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+f64_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+         f64_op2_func_t soft, f64_check_func_t pre, f64_check_func_t post,
+         f64_check_func_t fast, f64_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double ha = float64_to_double(a);
+            double hb = float64_to_double(b);
+            double hr = hard(ha, hb);
+            float64 r = double_to_float64(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float64_is_infinity(r))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabs(hr) <= DBL_MIN &&
+                                (post == NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+double_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+            f64_op2_func_t soft, double_check_func_t pre,
+            double_check_func_t post, f64_check_func_t fast,
+            f64_op2_func_t fast_op)
+{
+    double ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    ha = float64_to_double(a);
+    hb = float64_to_double(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double hr = hard(ha, hb);
+            float64 r = double_to_float64(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabs(hr) <= DBL_MIN &&
+                                (post == NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
 /*----------------------------------------------------------------------------
 | Returns the fraction bits of the half-precision floating-point value `a'.
*-------------------------------------------------------------------------= ---*/ --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528769023582487.4660048141433; Mon, 11 Jun 2018 19:03:43 -0700 (PDT) Received: from localhost ([::1]:52297 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYes-00071y-OE for importer@patchew.org; Mon, 11 Jun 2018 22:03:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40799) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046a-QO for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQi-0003Mc-9I for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:35109) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQh-0003L7-WA for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id F1FA4218C8; Mon, 11 Jun 2018 21:49:02 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:02 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id AC07610266; Mon, 11 Jun 2018 21:49:02 -0400 (EDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=EEG1LqOxLjYYhj bZYeFcGR6InS8BG9FtzoIv3RbS+FQ=; b=kEZKo8K5tDRmORVjSDlNKDWejR0CXh kEcJEFkKo1PmXtdGQy80cuxj8hXahDI3GPH+giagokWEJO2FM+XpQzypql9XY95g moWIHS5/xuLHpXBaWlqfRLWsKb0FCWa+vvLlOhMDDbZcqV63/+QvGKNJFfLL81oc lPBGccsuYKJ60= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=EEG1LqOxLjYYhjbZYeFcGR6InS8BG9FtzoIv3RbS+FQ=; b=ZycSr9kQ An8fUJzWaYyTQVziEgROo4HnMQ88sGKularna5H+m6H76MFhjHUyU2ij4BZLfNsA uSya602PuzHqp6pg4xubajGXmg6AH+QCuUjTOyWKMY6k/5Z+r02s0pJK+YzAtRnk bnLphis1EC1Y77t/oQtdhh50puAYPgjiSqwd5gpbXrY3QCCBB7G6EYN6VIfASL35 iabXr37/ds+WdxXLS+vUmNFkctSid1BD4zZ5lVGYoZFlGcg8np/dW09BPevDTqo2 wL1MGHcrv99RJpfIVG+3iTkoB9XiG/eTSBhe/E+cReS2cpx5XkPDZxa0GCGfF4zi VzmoKSHKdKjbeA== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:55 -0400 Message-Id: <1528768140-17894-10-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 09/14] hardfloat: support float32/64 addition and subtraction X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Performance results (single and double precision) for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: add-single: 135.07 MFlops add-double: 131.60 MFlops sub-single: 130.04 MFlops sub-double: 133.01 MFlops - after: add-single: 443.04 MFlops add-double: 301.95 MFlops sub-single: 411.36 MFlops sub-double: 293.15 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: add-single: 44.79 MFlops add-double: 49.20 MFlops sub-single: 44.55 MFlops sub-double: 49.06 MFlops - after: add-single: 93.28 MFlops add-double: 88.27 MFlops sub-single: 91.47 MFlops sub-double: 88.27 MFlops 3. IBM POWER8E @ 2.1 GHz - before: add-single: 72.59 MFlops add-double: 72.27 MFlops sub-single: 75.33 MFlops sub-double: 70.54 MFlops - after: add-single: 112.95 MFlops add-double: 201.11 MFlops sub-single: 116.80 MFlops sub-double: 188.72 MFlops Note that the IBM and ARM machines benefit from having HARDFLOAT_2F{32,64}_USE_FP set to 0. 
Otherwise their performance can suffer significantly: - IBM Power8: add-single: [1] 54.94 vs [0] 116.37 MFlops add-double: [1] 58.92 vs [0] 201.44 MFlops - Aarch64 A57: add-single: [1] 80.72 vs [0] 93.24 MFlops add-double: [1] 82.10 vs [0] 88.18 MFlops On the Intel machine, having 2F64 set to 1 pays off, but it doesn't for 2F32: - Intel i7-6700K: add-single: [1] 285.79 vs [0] 426.70 MFlops add-double: [1] 302.15 vs [0] 278.82 MFlops Signed-off-by: Emilio G. Cota --- fpu/softfloat.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++-= ---- 1 file changed, 98 insertions(+), 8 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 4d378d7..cdce6b2 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1077,8 +1077,8 @@ float16 __attribute__((flatten)) float16_add(float16= a, float16 b, return float16_round_pack_canonical(pr, status); } =20 -float32 __attribute__((flatten)) float32_add(float32 a, float32 b, - float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_add(float32 a, float32 b, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1087,8 +1087,8 @@ float32 __attribute__((flatten)) float32_add(float32 = a, float32 b, return float32_round_pack_canonical(pr, status); } =20 -float64 __attribute__((flatten)) float64_add(float64 a, float64 b, - float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_add(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1107,8 +1107,8 @@ float16 __attribute__((flatten)) float16_sub(float16 = a, float16 b, return float16_round_pack_canonical(pr, status); } =20 -float32 __attribute__((flatten)) float32_sub(float32 a, float32 b, - float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_sub(float32 a, float32 b, float_status *status) { FloatParts pa =3D 
float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1117,8 +1117,8 @@ float32 __attribute__((flatten)) float32_sub(float32 = a, float32 b, return float32_round_pack_canonical(pr, status); } =20 -float64 __attribute__((flatten)) float64_sub(float64 a, float64 b, - float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_sub(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1127,6 +1127,96 @@ float64 __attribute__((flatten)) float64_sub(float64= a, float64 b, return float64_round_pack_canonical(pr, status); } =20 +static float float_add(float a, float b) +{ + return a + b; +} + +static float float_sub(float a, float b) +{ + return a - b; +} + +static double double_add(double a, double b) +{ + return a + b; +} + +static double double_sub(double a, double b) +{ + return a - b; +} + +static bool f32_addsub_post(float32 a, float32 b, const struct float_statu= s *s) +{ + return !(float32_is_zero(a) && float32_is_zero(b)); +} + +static bool +float_addsub_post(float a, float b, const struct float_status *s) +{ + return !(fpclassify(a) =3D=3D FP_ZERO && fpclassify(b) =3D=3D FP_ZERO); +} + +static bool f64_addsub_post(float64 a, float64 b, const struct float_statu= s *s) +{ + return !(float64_is_zero(a) && float64_is_zero(b)); +} + +static bool +double_addsub_post(double a, double b, const struct float_status *s) +{ + return !(fpclassify(a) =3D=3D FP_ZERO && fpclassify(b) =3D=3D FP_ZERO); +} + +static float32 float32_addsub(float32 a, float32 b, float_status *s, + float_op2_func_t hard, f32_op2_func_t soft) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return float_gen2(a, b, s, hard, soft, float_is_zon2, float_addsub= _post, + NULL, NULL); + } else { + return f32_gen2(a, b, s, hard, soft, f32_is_zon2, f32_addsub_post, + NULL, NULL); + } +} + +static float64 float64_addsub(float64 a, float64 b, float_status *s, + 
double_op2_func_t hard, f64_op2_func_t soft) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return double_gen2(a, b, s, hard, soft, double_is_zon2, + double_addsub_post, NULL, NULL); + } else { + return f64_gen2(a, b, s, hard, soft, f64_is_zon2, f64_addsub_post, + NULL, NULL); + } +} + +float32 __attribute__((flatten)) +float32_add(float32 a, float32 b, float_status *s) +{ + return float32_addsub(a, b, s, float_add, soft_float32_add); +} + +float32 __attribute__((flatten)) +float32_sub(float32 a, float32 b, float_status *s) +{ + return float32_addsub(a, b, s, float_sub, soft_float32_sub); +} + +float64 __attribute__((flatten)) +float64_add(float64 a, float64 b, float_status *s) +{ + return float64_addsub(a, b, s, double_add, soft_float64_add); +} + +float64 __attribute__((flatten)) +float64_sub(float64 a, float64 b, float_status *s) +{ + return float64_addsub(a, b, s, double_sub, soft_float64_sub); +} + /* * Returns the result of multiplying the floating-point values `a' and * `b'. The operation is performed according to the IEC/IEEE Standard --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528768495653380.3777237991727; Mon, 11 Jun 2018 18:54:55 -0700 (PDT) Received: from localhost ([::1]:52240 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYWJ-0008Jd-I6 for importer@patchew.org; Mon, 11 Jun 2018 21:54:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40797) by lists.gnu.org with esmtp (Exim 
4.71) (envelope-from ) id 1fSYQm-00046W-Qb for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQi-0003Lo-1H for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:33655) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQh-0003L8-Ub for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 41A1121D43; Mon, 11 Jun 2018 21:49:03 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:03 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id E2016E42DB; Mon, 11 Jun 2018 21:49:02 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=QdPGQv08TqAN2B KXuhiZLtFVn3vU3SvZv7N88a7GSBw=; b=l+ZLdxBkQuSb6vDeMB4oQfQ0uZFxq4 runia1c7FjK4GlyKE0uNlL1+BBsoJ8Dp46qi6LwTNGKhETeguyv2urKOTn5/H/lk mqflI8kCVJYnn4RJaxq8zXUqCxIeRj2bnkbB3sx6tdI2gsNgyUQPu4ZPg3bb9u3L 9wWDBysVKQc+g= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=QdPGQv08TqAN2BKXuhiZLtFVn3vU3SvZv7N88a7GSBw=; b=ff93dLBP R++ZybTX5mnxT2gTHFQaTsKcHUEsRZof1Ji8RrR7SHqzlaz/ysu7hoHyQUzrd8ND PWMC+YvGB8Au1Et9U16tsrrwu0p2/ublrgIe4ja0Xt+OFsGK46yeseNEuoNUgyZK 3yfVNPNa2r8t0qlAK7OryOuTDa5JTzlWjIwQsOJfzON4FsDYLUmFWghj9ngpLZTu YfF7a80302DJ/Bzv6olUM0cfenxRtKm1MXRP9sU/y5j3yL5Bxx/3Dp++XFYWgMUL RQM8iGIiEfmPZqIGyO1jBxU4kvuVFHtooWzR1DHYqNVWidPFN/7sV0zGyCGeH4xN iUqkVk8uHpPyxg== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: 
X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:56 -0400 Message-Id: <1528768140-17894-11-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 10/14] hardfloat: support float32/64 multiplication X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: mul-single: 126.91 MFlops mul-double: 118.28 MFlops - after: mul-single: 258.02 MFlops mul-double: 197.96 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: mul-single: 37.42 MFlops mul-double: 38.77 MFlops - after: mul-single: 73.41 MFlops mul-double: 76.93 MFlops 3. IBM POWER8E @ 2.1 GHz - before: mul-single: 58.40 MFlops mul-double: 59.33 MFlops - after: mul-single: 60.25 MFlops mul-double: 94.79 MFlops Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++= ---- 1 file changed, 62 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index cdce6b2..4fcabf6 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1276,8 +1276,8 @@ float16 __attribute__((flatten)) float16_mul(float16 = a, float16 b, return float16_round_pack_canonical(pr, status); } =20 -float32 __attribute__((flatten)) float32_mul(float32 a, float32 b, - float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_mul(float32 a, float32 b, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1286,8 +1286,8 @@ float32 __attribute__((flatten)) float32_mul(float32 = a, float32 b, return float32_round_pack_canonical(pr, status); } =20 -float64 __attribute__((flatten)) float64_mul(float64 a, float64 b, - float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_mul(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1296,6 +1296,64 @@ float64 __attribute__((flatten)) float64_mul(float64= a, float64 b, return float64_round_pack_canonical(pr, status); } =20 +static float float_mul(float a, float b) +{ + return a * b; +} + +static double double_mul(double a, double b) +{ + return a * b; +} + +static bool f32_mul_fast(float32 a, float32 b, const struct float_status *= s) +{ + return float32_is_zero(a) || float32_is_zero(b); +} + +static bool f64_mul_fast(float64 a, float64 b, const struct float_status *= s) +{ + return float64_is_zero(a) || float64_is_zero(b); +} + +static float32 f32_mul_fast_op(float32 a, float32 b, float_status *s) +{ + bool signbit =3D float32_is_neg(a) ^ float32_is_neg(b); + + return float32_set_sign(float32_zero, signbit); +} + +static float64 f64_mul_fast_op(float64 a, float64 b, float_status *s) +{ + bool signbit =3D 
float64_is_neg(a) ^ float64_is_neg(b); + + return float64_set_sign(float64_zero, signbit); +} + +float32 __attribute__((flatten)) +float32_mul(float32 a, float32 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return float_gen2(a, b, s, float_mul, soft_float32_mul, float_is_z= on2, + NULL, f32_mul_fast, f32_mul_fast_op); + } else { + return f32_gen2(a, b, s, float_mul, soft_float32_mul, f32_is_zon2,= NULL, + f32_mul_fast, f32_mul_fast_op); + } +} + +float64 __attribute__((flatten)) +float64_mul(float64 a, float64 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return double_gen2(a, b, s, double_mul, soft_float64_mul, + double_is_zon2, NULL, f64_mul_fast, f64_mul_fas= t_op); + } else { + return f64_gen2(a, b, s, double_mul, soft_float64_mul, f64_is_zon2, + NULL, f64_mul_fast, f64_mul_fast_op); + } +} + /* * Returns the result of multiplying the floating-point values `a' and * `b' then adding 'c', with no intermediate rounding step after the --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1528768595889776.5716722083265; Mon, 11 Jun 2018 18:56:35 -0700 (PDT) Received: from localhost ([::1]:52253 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYXz-0001QT-4a for importer@patchew.org; Mon, 11 Jun 2018 21:56:35 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40800) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046e-Q8 for qemu-devel@nongnu.org; Mon, 
11 Jun 2018 21:49:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQi-0003MO-9j for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:45721) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQi-0003LE-0V for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 634B921D81; Mon, 11 Jun 2018 21:49:03 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:03 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 1E21E10266; Mon, 11 Jun 2018 21:49:03 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=3b6ajoTs1izHZZ lG5rWS1xP1WwDE6H5Aak5ItkBclgk=; b=el4+QtQXEZPPZve7dzu5BXhLazvyIM te79RlAgm8CVUaktugzA5i1h3Bo1QBpiKnwVOXM9+tAX8xlYanNYnVEndxIIRNDz 89V2nm/gbMCXHwFHzhkCaZQIfm6r0FShdqARJINGc7GI7qoBv6l0k/GtvmUddTsB hAi0fuuzAluJg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=3b6ajoTs1izHZZlG5rWS1xP1WwDE6H5Aak5ItkBclgk=; b=rVnrl2xd KeUhdXOhPnngVhL8qogf5lJpEaIyqcutNMYpKRTDqLzPi8WkfArAH/7jXRkXiVVD mkptFvDvAqy/ZdsoEeVoAh83ktjZGVXd4khFx1NUgajaGTb3LPnKHHTpNj6H03tk st/0/vRIZ7y3Y/+vf1fehnrsNs0c5HtTZMBG+4T4PwCIIdnHJq40tNemHylEcmks h8/iAx2XykKsNfryypTjKYnkWZaujJgo/NBvzUYS3J+yt0Bn15seGtni3bPVoxIZ mpIhO05praqygO8G8oWM7YkNb2ISD+E4Mdfq9e2898cdWuiQ/L70BOVcYTArwXb+ xo8cS3dwPyOMSw== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:57 -0400 Message-Id: <1528768140-17894-12-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 11/14] hardfloat: support float32/64 division X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: div-single: 34.84 MFlops div-double: 34.04 MFlops - after: div-single: 275.23 MFlops div-double: 216.38 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: div-single: 9.33 MFlops div-double: 9.30 MFlops - after: div-single: 51.55 MFlops div-double: 15.09 MFlops 3. IBM POWER8E @ 2.1 GHz - before: div-single: 25.65 MFlops div-double: 24.91 MFlops - after: div-single: 96.83 MFlops div-double: 31.01 MFlops Here setting 2F64_USE_FP to 1 pays off for x86_64: [1] 215.97 vs [0] 62.15 MFlops Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++= ++-- 1 file changed, 86 insertions(+), 2 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 4fcabf6..fa6c3b6 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1659,7 +1659,8 @@ float16 float16_div(float16 a, float16 b, float_statu= s *status) return float16_round_pack_canonical(pr, status); } =20 -float32 float32_div(float32 a, float32 b, float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_div(float32 a, float32 b, float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1668,7 +1669,8 @@ float32 float32_div(float32 a, float32 b, float_statu= s *status) return float32_round_pack_canonical(pr, status); } =20 -float64 float64_div(float64 a, float64 b, float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_div(float64 a, float64 b, float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1677,6 +1679,88 @@ float64 float64_div(float64 a, float64 b, float_stat= us *status) return float64_round_pack_canonical(pr, status); } =20 +static float float_div(float a, float b) +{ + return a / b; +} + +static double double_div(double a, double b) +{ + return a / b; +} + +static bool f32_div_pre(float32 a, float32 b, const struct float_status *s) +{ + return likely(float32_is_zero_or_normal(a) && + float32_is_normal(b) && + can_use_fpu(s)); +} + +static bool f64_div_pre(float64 a, float64 b, const struct float_status *s) +{ + return likely(float64_is_zero_or_normal(a) && + float64_is_normal(b) && + can_use_fpu(s)); +} + +static bool float_div_pre(float a, float b, const struct float_status *s) +{ + return likely((fpclassify(a) =3D=3D FP_NORMAL || fpclassify(a) =3D=3D = FP_ZERO) && + fpclassify(b) =3D=3D FP_NORMAL && + can_use_fpu(s)); +} + +static bool double_div_pre(double a, 
double b, const struct float_status *= s) +{ + return likely((fpclassify(a) =3D=3D FP_NORMAL || fpclassify(a) =3D=3D = FP_ZERO) && + fpclassify(b) =3D=3D FP_NORMAL && + can_use_fpu(s)); +} + +static bool f32_div_post(float32 a, float32 b, const struct float_status *= s) +{ + return !float32_is_zero(a); +} + +static bool f64_div_post(float64 a, float64 b, const struct float_status *= s) +{ + return !float64_is_zero(a); +} + +static bool float_div_post(float a, float b, const struct float_status *s) +{ + return fpclassify(a) !=3D FP_ZERO; +} + +static bool double_div_post(double a, double b, const struct float_status = *s) +{ + return fpclassify(a) !=3D FP_ZERO; +} + +float32 __attribute__((flatten)) +float32_div(float32 a, float32 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return float_gen2(a, b, s, float_div, soft_float32_div, float_div_= pre, + float_div_post, NULL, NULL); + } else { + return f32_gen2(a, b, s, float_div, soft_float32_div, f32_div_pre, + f32_div_post, NULL, NULL); + } +} + +float64 __attribute__((flatten)) +float64_div(float64 a, float64 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return double_gen2(a, b, s, double_div, soft_float64_div, + double_div_pre, double_div_post, NULL, NULL); + } else { + return f64_gen2(a, b, s, double_div, soft_float64_div, f64_div_pre, + f64_div_post, NULL, NULL); + } +} + /* * Float to Float conversions * --=20 2.7.4 From nobody Sun May 5 07:15:14 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 
1528769025278834.0562201463919; Mon, 11 Jun 2018 19:03:45 -0700 (PDT) Received: from localhost ([::1]:52298 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYet-000725-Lq for importer@patchew.org; Mon, 11 Jun 2018 22:03:43 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40804) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSYQm-00046j-RM for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSYQi-0003MT-8c for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:08 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:53419) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fSYQi-0003Lj-1V for qemu-devel@nongnu.org; Mon, 11 Jun 2018 21:49:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id AFAD921DC4; Mon, 11 Jun 2018 21:49:03 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Mon, 11 Jun 2018 21:49:03 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 566D3E425A; Mon, 11 Jun 2018 21:49:03 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=HJ22xWSScLS5SE Re0RksmrBr/awqpAQhjMmI0q/qQuw=; b=rDRkZ8+0SX0+nJImEamiCmHLWR9Dmj v6gUgxlphG/VV0R1OruENvfbWf2zjoWosOowjUNVHraR4SV+RkGCCQVmmZQCZsuD xcjh20rnMGpAn38RuKC5TovCRUFLll40Q0j5S28lkc86JVRzMHlYSOyfM64mNNQJ 5JLKHP9J3BCAA= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=HJ22xWSScLS5SERe0RksmrBr/awqpAQhjMmI0q/qQuw=; b=SP9s4JAG 
ZinYbFJDsNVtTtNhyV7QFXJQgyKx26XffZGtJJzhHt0cbJs4QSqPa6+vmeA4kQ7W ZasvsXWGlRnw6MxKXO/E8STjxOIEq5eo2uu3LoPP8kwnwYeJJ176Zk7IC0Qzarxv NQ/ggsuWB1FBfcvtJFVUhwNoRjjmEJcRctI1l/CClYou2tH+BNfNSSWtyq8Hh8ps ZufYN4dY1pypCHc45kGf1BHDwCXYNYagg2NHhQeJyHMgww0E1lXNXdXhuxZnAsHI 1/trUIEigWPmHXedex8oigC6nUDSEI92klPUsw4E8IeJr4X8VnSML/OTdBqB8xv1 FUsH0sGkf/vWlg== X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Proxy: X-ME-Sender: From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Mon, 11 Jun 2018 21:48:58 -0400 Message-Id: <1528768140-17894-13-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org> References: <1528768140-17894-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH v4 12/14] hardfloat: support float32/64 fused multiply-add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. 
IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having 3F64_USE_FP set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Signed-off-by: Emilio G. Cota --- fpu/softfloat.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++= ++-- 1 file changed, 165 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index fa6c3b6..63cf60c 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1568,8 +1568,9 @@ float16 __attribute__((flatten)) float16_muladd(float= 16 a, float16 b, float16 c, return float16_round_pack_canonical(pr, status); } =20 -float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, floa= t32 c, - int flags, float_status *s= tatus) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_muladd(float32 a, float32 b, float32 c, int flags, + float_status *status) { FloatParts pa =3D float32_unpack_canonical(a, status); FloatParts pb =3D float32_unpack_canonical(b, status); @@ -1579,8 +1580,9 @@ float32 __attribute__((flatten)) float32_muladd(float= 32 a, float32 b, float32 c, return float32_round_pack_canonical(pr, status); } =20 -float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, floa= t64 c, - int flags, float_status *s= tatus) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_muladd(float64 a, float64 b, float64 c, int flags, + float_status *status) { FloatParts pa =3D float64_unpack_canonical(a, status); FloatParts pb =3D float64_unpack_canonical(b, status); @@ -1591,6 +1593,165 @@ float64 __attribute__((flatten)) float64_muladd(flo= at64 a, float64 b, float64 c, } =20 /* + * FMA generator for softfloat-based condition checks. + * + * When a =3D=3D 0 || b =3D=3D 0, there's no need to check for underflow/overflow, + * since we know the addend is (normal || 0) and the product is 0. 
+ */ +#define GEN_FMA_SF(name, soft_t, host_t, host_fma_f, host_abs_f, min_norma= l) \ + static soft_t \ + name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \ + { \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush3(&a, &b, &c, s); \ + if (likely(soft_t ## _is_zero_or_normal(a) && \ + soft_t ## _is_zero_or_normal(b) && \ + soft_t ## _is_zero_or_normal(c) && \ + !(flags & float_muladd_halve_result) && \ + can_use_fpu(s))) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + soft_t p, r; \ + host_t hp, hc, hr; \ + bool prod_sign; \ + \ + prod_sign =3D soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b);= \ + prod_sign ^=3D !!(flags & float_muladd_negate_product); \ + p =3D soft_t ## _set_sign(soft_t ## _zero, prod_sign); \ + \ + if (flags & float_muladd_negate_c) { \ + c =3D soft_t ## _chs(c); \ + } \ + \ + hp =3D soft_t ## _to_ ## host_t(p); \ + hc =3D soft_t ## _to_ ## host_t(c); \ + hr =3D hp + hc; \ + r =3D host_t ## _to_ ## soft_t(hr); \ + return flags & float_muladd_negate_result ? \ + soft_t ## _chs(r) : r; \ + } else { \ + host_t ha, hb, hc, hr; \ + soft_t r; \ + soft_t sa =3D flags & float_muladd_negate_product ? \ + soft_t ## _chs(a) : a; \ + soft_t sc =3D flags & float_muladd_negate_c ? \ + soft_t ## _chs(c) : c; \ + \ + ha =3D soft_t ## _to_ ## host_t(sa); \ + hb =3D soft_t ## _to_ ## host_t(b); \ + hc =3D soft_t ## _to_ ## host_t(sc); \ + hr =3D host_fma_f(ha, hb, hc); \ + r =3D host_t ## _to_ ## soft_t(hr); \ + \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |=3D float_flag_overflow; \ + } else if (unlikely(host_abs_f(hr) <=3D min_normal)) { \ + goto soft; \ + } \ + return flags & float_muladd_negate_result ? 
\ + soft_t ## _chs(r) : r; \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \ + } + +/* FMA generator for native floating point condition checks */ +#define GEN_FMA_FP(name, soft_t, host_t, host_fma_f, host_abs_f, min_norma= l) \ + static soft_t \ + name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \ + { \ + host_t ha, hb, hc; \ + \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush3(&a, &b, &c, s); \ + ha =3D soft_t ## _to_ ## host_t(a); \ + hb =3D soft_t ## _to_ ## host_t(b); \ + hc =3D soft_t ## _to_ ## host_t(c); \ + if (likely((fpclassify(ha) =3D=3D FP_NORMAL || = \ + fpclassify(ha) =3D=3D FP_ZERO) && = \ + (fpclassify(hb) =3D=3D FP_NORMAL || = \ + fpclassify(hb) =3D=3D FP_ZERO) && = \ + (fpclassify(hc) =3D=3D FP_NORMAL || = \ + fpclassify(hc) =3D=3D FP_ZERO) && = \ + !(flags & float_muladd_halve_result) && \ + can_use_fpu(s))) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + soft_t p, r; \ + host_t hp, hc, hr; \ + bool prod_sign; \ + \ + prod_sign =3D soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b);= \ + prod_sign ^=3D !!(flags & float_muladd_negate_product); \ + p =3D soft_t ## _set_sign(soft_t ## _zero, prod_sign); \ + \ + if (flags & float_muladd_negate_c) { \ + c =3D soft_t ## _chs(c); \ + } \ + \ + hp =3D soft_t ## _to_ ## host_t(p); \ + hc =3D soft_t ## _to_ ## host_t(c); \ + hr =3D hp + hc; \ + r =3D host_t ## _to_ ## soft_t(hr); \ + return flags & float_muladd_negate_result ? 
\ + soft_t ## _chs(r) : r; \ + } else { \ + host_t hr; \ + \ + if (flags & float_muladd_negate_product) { \ + ha =3D -ha; \ + } \ + if (flags & float_muladd_negate_c) { \ + hc =3D -hc; \ + } \ + hr =3D host_fma_f(ha, hb, hc); \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |=3D float_flag_overflow; \ + } else if (unlikely(host_abs_f(hr) <=3D min_normal)) { \ + goto soft; \ + } \ + if (flags & float_muladd_negate_result) { \ + hr =3D -hr; \ + } \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \ + } + +GEN_FMA_SF(f32_muladd, float32, float, fmaf, fabsf, FLT_MIN) +GEN_FMA_SF(f64_muladd, float64, double, fma, fabs, DBL_MIN) +#undef GEN_FMA_SF + +GEN_FMA_FP(float_muladd, float32, float, fmaf, fabsf, FLT_MIN) +GEN_FMA_FP(double_muladd, float64, double, fma, fabs, DBL_MIN) +#undef GEN_FMA_FP + +float32 __attribute__((flatten)) +float32_muladd(float32 a, float32 b, float32 c, int flags, float_status *s) +{ + if (QEMU_HARDFLOAT_3F32_USE_FP) { + return float_muladd(a, b, c, flags, s); + } else { + return f32_muladd(a, b, c, flags, s); + } +} + +float64 __attribute__((flatten)) +float64_muladd(float64 a, float64 b, float64 c, int flags, float_status *s) +{ + if (QEMU_HARDFLOAT_3F64_USE_FP) { + return double_muladd(a, b, c, flags, s); + } else { + return f64_muladd(a, b, c, flags, s); + } +} + +/* * Returns the result of dividing the floating-point value `a' by the * corresponding value `b'. The operation is performed according to * the IEC/IEEE Standard for Binary Floating-Point Arithmetic. 
-- 
2.7.4

From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Mon, 11 Jun 2018 21:48:59 -0400
Message-Id: <1528768140-17894-14-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 13/14] hardfloat: support float32/64 square root
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 43.27 MFlops
sqrt-double: 24.81 MFlops
- after:
sqrt-single: 297.94 MFlops
sqrt-double: 210.46 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
sqrt-single: 12.41 MFlops
sqrt-double: 6.22 MFlops
- after:
sqrt-single: 55.58 MFlops
sqrt-double: 40.63 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
sqrt-single: 17.01 MFlops
sqrt-double: 9.61 MFlops
- after:
sqrt-single: 104.17 MFlops
sqrt-double: 133.32 MFlops

Here none of the machines got faster from enabling USE_FP. For instance, on x86_64 with USE_FP enabled, sqrt is 23% slower for single precision and 17% slower for double precision.

Signed-off-by: Emilio G.
Cota
---
 fpu/softfloat.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 71 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 63cf60c..f89e872 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2812,14 +2812,16 @@ float16 __attribute__((flatten)) float16_sqrt(float16 a, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_sqrt(float32 a, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float32_params);
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_sqrt(float64 a, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float64_params);
@@ -2899,6 +2901,73 @@ float64 float64_silence_nan(float64 a, float_status *status)
     return float64_pack_raw(p);
 }
 
+#define GEN_SQRT_SF(name, soft_t, host_t, host_sqrt_func) \
+    static soft_t name(soft_t a, float_status *s) \
+    { \
+        if (QEMU_NO_HARDFLOAT) { \
+            goto soft; \
+        } \
+        soft_t ## _input_flush1(&a, s); \
+        if (likely(soft_t ## _is_zero_or_normal(a) && \
+                   !soft_t ## _is_neg(a) && \
+                   can_use_fpu(s))) { \
+            host_t ha = soft_t ## _to_ ## host_t(a); \
+            host_t hr = host_sqrt_func(ha); \
+ \
+            return host_t ## _to_ ## soft_t(hr); \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _sqrt(a, s); \
+    }
+
+#define GEN_SQRT_FP(name, soft_t, host_t, host_sqrt_func) \
+    static soft_t name(soft_t a, float_status *s) \
+    { \
+        host_t ha; \
+ \
+        if (QEMU_NO_HARDFLOAT) { \
+            goto soft; \
+        } \
+        soft_t ## _input_flush1(&a, s); \
+        ha = soft_t ## _to_ ## host_t(a); \
+        if (likely((fpclassify(ha) == FP_NORMAL || \
+                    fpclassify(ha) == FP_ZERO) && \
+                   !signbit(ha) && \
+                   can_use_fpu(s))) { \
+            host_t hr = host_sqrt_func(ha); \
+ \
+            return host_t ## _to_ ## soft_t(hr); \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _sqrt(a, s); \
+    }
+
+GEN_SQRT_SF(f32_sqrt, float32, float, sqrtf)
+GEN_SQRT_SF(f64_sqrt, float64, double, sqrt)
+#undef GEN_SQRT_SF
+
+GEN_SQRT_FP(float_sqrt, float32, float, sqrtf)
+GEN_SQRT_FP(double_sqrt, float64, double, sqrt)
+#undef GEN_SQRT_FP
+
+float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *s)
+{
+    if (QEMU_HARDFLOAT_1F32_USE_FP) {
+        return float_sqrt(a, s);
+    } else {
+        return f32_sqrt(a, s);
+    }
+}
+
+float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *s)
+{
+    if (QEMU_HARDFLOAT_1F64_USE_FP) {
+        return double_sqrt(a, s);
+    } else {
+        return f64_sqrt(a, s);
+    }
+}
+
 /*----------------------------------------------------------------------------
 | Takes a 64-bit fixed-point value `absZ' with binary point between bits 6
 | and 7, and returns the properly rounded 32-bit integer corresponding to the
-- 
2.7.4

From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Mon, 11 Jun 2018 21:49:00 -0400
Message-Id: <1528768140-17894-15-git-send-email-cota@braap.org>
In-Reply-To: <1528768140-17894-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v4 14/14] hardfloat: support float32/64 comparison
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
cmp-single: 113.01 MFlops
cmp-double: 115.54 MFlops
- after:
cmp-single: 527.83 MFlops
cmp-double: 457.21 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
cmp-single: 39.32 MFlops
cmp-double: 39.80 MFlops
- after:
cmp-single: 162.74 MFlops
cmp-double: 167.08 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
cmp-single: 60.81 MFlops
cmp-double: 62.76 MFlops
- after:
cmp-single: 235.39 MFlops
cmp-double: 283.44 MFlops

Here using float{32,64}_is_any_nan is faster than using isnan
for all machines. On x86_64 the perf difference is just a few
percentage points, but on aarch64 we go from 117/119 to 164/169 MFlops
for single/double precision, respectively.

Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]

1.
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

[ASCII bar chart: qemu-aarch64 NBench score; higher is better.
 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz.
 Groups: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean.]

[ASCII bar chart: qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905.
 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz; error bars: 95% confidence interval.
 Groups: SPEC06fp benchmarks 410 through 482, plus geomean.]

2. Host: ARM Aarch64 A57 @ 2.4GHz

[ASCII bar chart: qemu-aarch64 NBench score; higher is better.
 Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz.
 Groups: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean.]

Note that by not inlining the soft-fp primitives we end up with a
smaller softfloat.o. In particular, see the difference for the
softfloat.o built for fp-bench:

- before this series:
   text    data     bss     dec     hex filename
 103235       0       0  103235   19343 softfloat.o
- after:
   text    data     bss     dec     hex filename
  93369       0       0   93369   16cb9 softfloat.o

Signed-off-by: Emilio G.
Cota
---
 fpu/softfloat.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 60 insertions(+), 14 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index f89e872..1cf74d1 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2671,28 +2671,74 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet,
     }
 }
 
-#define COMPARE(sz) \
-int float ## sz ## _compare(float ## sz a, float ## sz b, \
-                            float_status *s) \
+#define COMPARE(name, attr, sz) \
+static int attr \
+name(float ## sz a, float ## sz b, bool is_quiet, float_status *s) \
 { \
     FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
     FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, false, s); \
-} \
-int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \
-                                  float_status *s) \
-{ \
-    FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
-    FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, true, s); \
+    return compare_floats(pa, pb, is_quiet, s); \
 }
 
-COMPARE(16)
-COMPARE(32)
-COMPARE(64)
+COMPARE(soft_float16_compare, , 16)
+COMPARE(soft_float32_compare, QEMU_SOFTFLOAT_ATTR, 32)
+COMPARE(soft_float64_compare, QEMU_SOFTFLOAT_ATTR, 64)
 
 #undef COMPARE
 
+int __attribute__((flatten))
+float16_compare(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, false, s);
+}
+
+int __attribute__((flatten))
+float16_compare_quiet(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, true, s);
+}
+
+#define GEN_FPU_COMPARE(name, quiet_name, soft_t, host_t) \
+    static int \
+    fpu_ ## name(soft_t a, soft_t b, bool is_quiet, float_status *s) \
+    { \
+        host_t ha, hb; \
+ \
+        if (QEMU_NO_HARDFLOAT) { \
+            return soft_ ## name(a, b, is_quiet, s); \
+        } \
+        soft_t ## _input_flush2(&a, &b, s); \
+        ha = soft_t ## _to_ ## host_t(a); \
+        hb = soft_t ## _to_ ## host_t(b); \
+        if (unlikely(soft_t ## _is_any_nan(a) || \
+                     soft_t ## _is_any_nan(b))) { \
+            return soft_ ## name(a, b, is_quiet, s); \
+        } \
+        if (isgreater(ha, hb)) { \
+            return float_relation_greater; \
+        } \
+        if (isless(ha, hb)) { \
+            return float_relation_less; \
+        } \
+        return float_relation_equal; \
+    } \
+ \
+    int __attribute__((flatten)) \
+    name(soft_t a, soft_t b, float_status *s) \
+    { \
+        return fpu_ ## name(a, b, false, s); \
+    } \
+ \
+    int __attribute__((flatten)) \
+    quiet_name(soft_t a, soft_t b, float_status *s) \
+    { \
+        return fpu_ ## name(a, b, true, s); \
+    }
+
+GEN_FPU_COMPARE(float32_compare, float32_compare_quiet, float32, float)
+GEN_FPU_COMPARE(float64_compare, float64_compare_quiet, float64, double)
+#undef GEN_FPU_COMPARE
+
 /* Multiply A by 2 raised to the power N. */
 static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s)
 {
-- 
2.7.4
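
The comparison fast path in the last patch can be illustrated with a small standalone C sketch. It is a simplified approximation rather than the patch's code: the `REL_*` constants and the function names are hypothetical stand-ins, the input flushing and the `QEMU_NO_HARDFLOAT` switch are omitted, and where the patch hands NaN operands to the soft-float compare (which also deals with exception flags), the sketch simply reports "unordered". The NaN test works on the raw bit pattern, in the spirit of the `float{32,64}_is_any_nan` check that the commit message reports as faster than `isnan`.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the float_relation_* results. */
enum { REL_LESS = -1, REL_EQUAL = 0, REL_GREATER = 1, REL_UNORDERED = 2 };

/* NaN test on the raw bits: exponent all ones and a non-zero fraction. */
static bool f32_bits_is_any_nan(uint32_t bits)
{
    return (bits & 0x7fffffffu) > 0x7f800000u;
}

/* Compare two single-precision values given as IEEE 754 bit patterns. */
static int host_compare_f32(uint32_t a_bits, uint32_t b_bits)
{
    float a, b;

    if (f32_bits_is_any_nan(a_bits) || f32_bits_is_any_nan(b_bits)) {
        /* The patch calls the soft-float compare here instead. */
        return REL_UNORDERED;
    }
    /* Reinterpret the bits as host floats without aliasing problems. */
    memcpy(&a, &a_bits, sizeof(a));
    memcpy(&b, &b_bits, sizeof(b));
    if (isgreater(a, b)) {
        return REL_GREATER;
    }
    if (isless(a, b)) {
        return REL_LESS;
    }
    return REL_EQUAL;
}
```

Since NaNs are filtered out before the host comparison, the only remaining outcomes are greater, less and equal, which is why the final return needs no extra check; +0 and -0 compare as equal.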