From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno
Date: Wed, 21 Mar 2018 16:11:36 -0400
Message-Id: <1521663109-32262-2-git-send-email-cota@braap.org>
In-Reply-To: <1521663109-32262-1-git-send-email-cota@braap.org>
References: <1521663109-32262-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v1 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks

This will
allow us to measure the performance impact of FP emulation optimizations.

Signed-off-by: Emilio G. Cota
Reviewed-by: Alex Bennée
---
 tests/fp-bench.c       | 290 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/.gitignore       |   1 +
 tests/Makefile.include |   3 +-
 3 files changed, 293 insertions(+), 1 deletion(-)
 create mode 100644 tests/fp-bench.c

diff --git a/tests/fp-bench.c b/tests/fp-bench.c
new file mode 100644
index 0000000..a782093
--- /dev/null
+++ b/tests/fp-bench.c
@@ -0,0 +1,290 @@
+/*
+ * fp-bench.c - A collection of simple floating point microbenchmarks.
+ *
+ * Copyright (C) 2018, Emilio G. Cota
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "qemu/atomic.h"
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* amortize the computation of random inputs */
+#define OPS_PER_ITER     (1000ULL)
+
+#define SEED_A 0xdeadfacedeadface
+#define SEED_B 0xbadc0feebadc0fee
+#define SEED_C 0xbeefdeadbeefdead
+
+enum op {
+    OP_ADD,
+    OP_SUB,
+    OP_MUL,
+    OP_DIV,
+    OP_FMA,
+    OP_SQRT,
+};
+
+static const char * const op_names[] = {
+    [OP_ADD] = "add",
+    [OP_SUB] = "sub",
+    [OP_MUL] = "mul",
+    [OP_DIV] = "div",
+    [OP_FMA] = "fma",
+    [OP_SQRT] = "sqrt",
+};
+
+static uint64_t n_ops = 10000000;
+static enum op op;
+static const char *precision = "float";
+
+static const char commands_string[] =
+    " -n = number of floating point operations\n"
+    " -o = floating point operation (add, sub, mul, div, fma, sqrt). Default: add\n"
+    " -p = precision (float|single, double). Default: float";
+
+static void usage_complete(int argc, char *argv[])
+{
+    fprintf(stderr, "Usage: %s [options]\n", argv[0]);
+    fprintf(stderr, "options:\n%s\n", commands_string);
+    exit(-1);
+}
+
+static void set_op(const char *name)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(op_names); i++) {
+        if (strcmp(name, op_names[i]) == 0) {
+            op = i;
+            return;
+        }
+    }
+    fprintf(stderr, "Unsupported op '%s'\n", name);
+    exit(EXIT_FAILURE);
+}
+
+static inline int64_t get_clock_realtime(void)
+{
+    struct timeval tv;
+
+    gettimeofday(&tv, NULL);
+    return tv.tv_sec * 1000000000LL + (tv.tv_usec * 1000);
+}
+
+/*
+ * From: https://en.wikipedia.org/wiki/Xorshift
+ * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only
+ * guaranteed to be >= INT_MAX).
+ */
+static uint64_t xorshift64star(uint64_t x)
+{
+    x ^= x >> 12; /* a */
+    x ^= x << 25; /* b */
+    x ^= x >> 27; /* c */
+    return x * UINT64_C(2685821657736338717);
+}
+
+static inline bool u32_is_normal(uint32_t x)
+{
+    return ((x + 0x00800000) & 0x7fffffff) >= 0x01000000;
+}
+
+static inline bool u64_is_normal(uint64_t x)
+{
+    return ((x + (1ULL << 52)) & -1ULL >> 1) >= 1ULL << 53;
+}
+
+static inline float get_random_float(uint64_t *x)
+{
+    uint64_t r = *x;
+    uint32_t r32;
+
+    do {
+        r = xorshift64star(r);
+    } while (!u32_is_normal(r));
+    *x = r;
+    r32 = r;
+    return *(float *)&r32;
+}
+
+static inline double get_random_double(uint64_t *x)
+{
+    uint64_t r = *x;
+
+    do {
+        r = xorshift64star(r);
+    } while (!u64_is_normal(r));
+    *x = r;
+    return *(double *)&r;
+}
+
+/*
+ * Disable optimizations (e.g. "a OP b" outside of the inner loop) with
+ * volatile.
+ */
+#define GEN_BENCH_1OPF(NAME, FUNC, PRECISION)                         \
+    static void NAME(volatile PRECISION *res)                         \
+    {                                                                 \
+        uint64_t ra = SEED_A;                                         \
+        uint64_t i, j;                                                \
+                                                                      \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) {                   \
+            volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \
+                                                                      \
+            for (j = 0; j < OPS_PER_ITER; j++) {                      \
+                *res = FUNC(a);                                       \
+            }                                                         \
+        }                                                             \
+    }
+
+GEN_BENCH_1OPF(bench_float_sqrt, sqrtf, float)
+GEN_BENCH_1OPF(bench_double_sqrt, sqrt, double)
+#undef GEN_BENCH_1OPF
+
+#define GEN_BENCH_2OP(NAME, OP, PRECISION)                            \
+    static void NAME(volatile PRECISION *res)                         \
+    {                                                                 \
+        uint64_t ra = SEED_A;                                         \
+        uint64_t rb = SEED_B;                                         \
+        uint64_t i, j;                                                \
+                                                                      \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) {                   \
+            volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \
+            volatile PRECISION b = glue(get_random_, PRECISION)(&rb); \
+                                                                      \
+            for (j = 0; j < OPS_PER_ITER; j++) {                      \
+                *res = a OP b;                                        \
+            }                                                         \
+        }                                                             \
+    }
+
+GEN_BENCH_2OP(bench_float_add, +, float)
+GEN_BENCH_2OP(bench_float_sub, -, float)
+GEN_BENCH_2OP(bench_float_mul, *, float)
+GEN_BENCH_2OP(bench_float_div, /, float)
+
+GEN_BENCH_2OP(bench_double_add, +, double)
+GEN_BENCH_2OP(bench_double_sub, -, double)
+GEN_BENCH_2OP(bench_double_mul, *, double)
+GEN_BENCH_2OP(bench_double_div, /, double)
+
+#define GEN_BENCH_3OPF(NAME, FUNC, PRECISION)                         \
+    static void NAME(volatile PRECISION *res)                         \
+    {                                                                 \
+        uint64_t ra = SEED_A;                                         \
+        uint64_t rb = SEED_B;                                         \
+        uint64_t rc = SEED_C;                                         \
+        uint64_t i, j;                                                \
+                                                                      \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) {                   \
+            volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \
+            volatile PRECISION b = glue(get_random_, PRECISION)(&rb); \
+            volatile PRECISION c = glue(get_random_, PRECISION)(&rc); \
+                                                                      \
+            for (j = 0; j < OPS_PER_ITER; j++) {                      \
+                *res = FUNC(a, b, c);                                 \
+            }                                                         \
+        }                                                             \
+    }
+
+GEN_BENCH_3OPF(bench_float_fma, fmaf, float)
+GEN_BENCH_3OPF(bench_double_fma, fma, double)
+#undef GEN_BENCH_3OPF
+
+static void parse_args(int argc, char *argv[])
+{
+    int c;
+
+    for (;;) {
+        c = getopt(argc, argv, "n:ho:p:");
+        if (c < 0) {
+            break;
+        }
+        switch (c) {
+        case 'h':
+            usage_complete(argc, argv);
+            exit(0);
+        case 'n':
+            n_ops = atoll(optarg);
+            if (n_ops < OPS_PER_ITER) {
+                n_ops = OPS_PER_ITER;
+            }
+            n_ops -= n_ops % OPS_PER_ITER;
+            break;
+        case 'o':
+            set_op(optarg);
+            break;
+        case 'p':
+            precision = optarg;
+            if (strcmp(precision, "float") &&
+                strcmp(precision, "single") &&
+                strcmp(precision, "double")) {
+                fprintf(stderr, "Unsupported precision '%s'\n", precision);
+                exit(EXIT_FAILURE);
+            }
+            break;
+        }
+    }
+}
+
+#define CALL_BENCH(OP, PRECISION, RESP)                               \
+    do {                                                              \
+        switch (OP) {                                                 \
+        case OP_ADD:                                                  \
+            glue(glue(bench_, PRECISION), _add)(RESP);                \
+            break;                                                    \
+        case OP_SUB:                                                  \
+            glue(glue(bench_, PRECISION), _sub)(RESP);                \
+            break;                                                    \
+        case OP_MUL:                                                  \
+            glue(glue(bench_, PRECISION), _mul)(RESP);                \
+            break;                                                    \
+        case OP_DIV:                                                  \
+            glue(glue(bench_, PRECISION), _div)(RESP);                \
+            break;                                                    \
+        case OP_FMA:                                                  \
+            glue(glue(bench_, PRECISION), _fma)(RESP);                \
+            break;                                                    \
+        case OP_SQRT:                                                 \
+            glue(glue(bench_, PRECISION), _sqrt)(RESP);               \
+            break;                                                    \
+        default:                                                      \
+            g_assert_not_reached();                                   \
+        }                                                             \
+    } while (0)
+
+int main(int argc, char *argv[])
+{
+    int64_t t0, t1;
+    double resd;
+
+    parse_args(argc, argv);
+    if (!strcmp(precision, "float") || !strcmp(precision, "single")) {
+        float res;
+        t0 = get_clock_realtime();
+        CALL_BENCH(op, float, &res);
+        t1 = get_clock_realtime();
+        resd = res;
+    } else if (!strcmp(precision, "double")) {
+        t0 = get_clock_realtime();
+        CALL_BENCH(op, double, &resd);
+        t1 = get_clock_realtime();
+    } else {
+        g_assert_not_reached();
+    }
+    printf("%.2f MFlops\n", (double)n_ops / (t1 - t0) * 1e3);
+    if (resd) {
+        return 0;
+    }
+    return 0;
+}
diff --git a/tests/.gitignore b/tests/.gitignore
index 18e58b2..df69175 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -12,6 +12,7 @@ check-qobject
 check-qstring
 check-qom-interface
 check-qom-proplist
+fp-bench
 qht-bench
 rcutorture
 test-aio
diff --git a/tests/Makefile.include b/tests/Makefile.include
index ef9b88c..f6121ee 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -587,7 +587,7 @@ test-obj-y = tests/check-qnum.o tests/check-qstring.o tests/check-qdict.o \
 	tests/rcutorture.o tests/test-rcu-list.o \
 	tests/test-qdist.o tests/test-shift128.o \
 	tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \
-	tests/atomic_add-bench.o
+	tests/atomic_add-bench.o tests/fp-bench.o
 
 $(test-obj-y): QEMU_INCLUDES += -Itests
 QEMU_CFLAGS += -I$(SRC_PATH)/tests
@@ -639,6 +639,7 @@ tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests/qht-bench$(EXESUF) $(tes
 tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
 tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y)
 tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
+tests/fp-bench$(EXESUF): tests/fp-bench.o $(test-util-obj-y)
 
 tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
 	hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
-- 
2.7.4
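
Note on the input filter: get_random_float() keeps drawing values until u32_is_normal() accepts the bit pattern, so the benchmark never sees zeros, denormals, infinities or NaNs. The sketch below is not part of the patch. It copies the bit test from the patch and compares it with isnormal() from the C library, assuming float is IEEE 754 binary32. The helpers u32_to_float() and agrees_with_libm() are hypothetical names introduced only for this illustration.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Same bit test as in the patch. Adding 0x00800000 bumps the 8-bit
 * exponent field by one: an all-zero exponent (zero/denormal) ends up
 * below 0x01000000, and an all-ones exponent (inf/NaN) carries into the
 * sign bit, which the mask then drops, leaving only mantissa bits.
 */
static bool u32_is_normal(uint32_t x)
{
    return ((x + 0x00800000) & 0x7fffffff) >= 0x01000000;
}

/* Hypothetical helper: reinterpret 32 bits as a float via memcpy. */
static float u32_to_float(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Hypothetical helper: does the bit test agree with isnormal()? */
static bool agrees_with_libm(uint32_t bits)
{
    return u32_is_normal(bits) == (isnormal(u32_to_float(bits)) != 0);
}
```

Spot checks worth running are the boundary patterns: 0x007fffff (largest denormal) should be rejected, 0x00800000 (smallest normal) accepted, and 0x7f800000 (infinity) rejected. The memcpy() here only sidesteps aliasing questions in a standalone build; it is not a suggestion that the cast in the patch is wrong for QEMU's build flags.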