[PATCH 00/72] Convert floatx80 and float128 to FloatParts

Richard Henderson posted 72 patches 1 year, 3 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20210508014802.892561-1-richard.henderson@linaro.org
There is a newer version of this series
Maintainers: "Alex Bennée" <alex.bennee@linaro.org>, Jiaxun Yang <jiaxun.yang@flygoat.com>, Richard Henderson <richard.henderson@linaro.org>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>, Eduardo Habkost <ehabkost@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, Laurent Vivier <laurent@vivier.eu>, Aurelien Jarno <aurelien@aurel32.net>, Paolo Bonzini <pbonzini@redhat.com>
[PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Richard Henderson 1 year, 3 months ago
Reorg everything using QEMU_GENERIC and multiple inclusion to
reduce the amount of code duplication between the formats.

The use of QEMU_GENERIC means that we need to use pointers instead
of structures, which means that even the smaller float formats
need rearranging.
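
As a rough illustration of dispatching on pointer types, the sketch below uses C11 `_Generic` as a stand-in for QEMU_GENERIC, and a macro expanded once per width as a stand-in for including a `.c.inc` file several times. Everything in it is an assumption made for the example: the `Parts64`/`Parts128` layouts, the `DEFINE_NEGATE` macro and the `parts_*` names are hypothetical and are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified decomposed formats (not QEMU's real layouts). */
typedef struct { bool sign; int32_t exp; uint64_t frac; } Parts64;
typedef struct { bool sign; int32_t exp; uint64_t frac_hi, frac_lo; } Parts128;

/*
 * Stand-in for multiple inclusion: the same body is expanded once per
 * width, much as one .c.inc file could be included with different defines.
 */
#define DEFINE_NEGATE(N)                                \
    static inline void parts##N##_negate(Parts##N *p)   \
    {                                                   \
        p->sign = !p->sign;                             \
    }
DEFINE_NEGATE(64)
DEFINE_NEGATE(128)

/* Width-specific helpers that cannot share a body. */
static inline bool parts64_frac_is_zero(const Parts64 *p)
{
    return p->frac == 0;
}

static inline bool parts128_frac_is_zero(const Parts128 *p)
{
    return (p->frac_hi | p->frac_lo) == 0;
}

/* One generic name; the pointer type of P selects the implementation. */
#define parts_negate(P) \
    _Generic((P), Parts64 *: parts64_negate, \
                  Parts128 *: parts128_negate)(P)
#define parts_frac_is_zero(P) \
    _Generic((P), Parts64 *: parts64_frac_is_zero, \
                  Parts128 *: parts128_frac_is_zero)(P)

/* Small self-check showing that both widths go through the same names. */
static void generic_dispatch_demo(void)
{
    Parts64 a = { .sign = false, .exp = 0, .frac = 1 };
    Parts128 b = { .sign = true, .exp = 0, .frac_hi = 0, .frac_lo = 0 };

    parts_negate(&a);
    parts_negate(&b);
    assert(a.sign && !b.sign);
    assert(!parts_frac_is_zero(&a) && parts_frac_is_zero(&b));
}
```

Callers write `parts_negate(&p)` regardless of width, which is the kind of de-duplication the paragraph above describes.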

I've carried it through to completion within fpu/, so that we don't
have (much) of the legacy code remaining.  There is some floatx80
stuff in target/m68k and target/i386 that's still hanging around.


r~


Alex Bennée (1):
  tests/fp: add quad support to the benchmark utility

Richard Henderson (71):
  qemu/host-utils: Use __builtin_bitreverseN
  qemu/host-utils: Add wrappers for overflow builtins
  qemu/host-utils: Add wrappers for carry builtins
  accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
  softfloat: Move the binary point to the msb
  softfloat: Inline float_raise
  softfloat: Use float_raise in more places
  softfloat: Tidy a * b + inf return
  softfloat: Add float_cmask and constants
  softfloat: Use return_nan in float_to_float
  softfloat: fix return_nan vs default_nan_mode
  target/mips: Set set_default_nan_mode with set_snan_bit_is_one
  softfloat: Do not produce a default_nan from parts_silence_nan
  softfloat: Rename FloatParts to FloatParts64
  softfloat: Move type-specific pack/unpack routines
  softfloat: Use pointers with parts_default_nan
  softfloat: Use pointers with unpack_raw
  softfloat: Use pointers with ftype_unpack_raw
  softfloat: Use pointers with pack_raw
  softfloat: Use pointers with ftype_pack_raw
  softfloat: Use pointers with ftype_unpack_canonical
  softfloat: Use pointers with ftype_round_pack_canonical
  softfloat: Use pointers with parts_silence_nan
  softfloat: Rearrange FloatParts64
  softfloat: Convert float128_silence_nan to parts
  softfloat: Convert float128_default_nan to parts
  softfloat: Move return_nan to softfloat-parts.c.inc
  softfloat: Move pick_nan to softfloat-parts.c.inc
  softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
  softfloat: Move sf_canonicalize to softfloat-parts.c.inc
  softfloat: Move round_canonical to softfloat-parts.c.inc
  softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
  softfloat: Move addsub_floats to softfloat-parts.c.inc
  softfloat: Implement float128_add/sub via parts
  softfloat: Move mul_floats to softfloat-parts.c.inc
  softfloat: Move muladd_floats to softfloat-parts.c.inc
  softfloat: Use mulu64 for mul64To128
  softfloat: Use add192 in mul128To256
  softfloat: Tidy mul128By64To192
  softfloat: Introduce sh[lr]_double primitives
  softfloat: Move div_floats to softfloat-parts.c.inc
  softfloat: Split float_to_float
  softfloat: Convert float-to-float conversions with float128
  softfloat: Move round_to_int to softfloat-parts.c.inc
  softfloat: Move round_to_int_and_pack to softfloat-parts.c.inc
  softfloat: Move round_to_uint_and_pack to softfloat-parts.c.inc
  softfloat: Move int_to_float to softfloat-parts.c.inc
  softfloat: Move uint_to_float to softfloat-parts.c.inc
  softfloat: Move minmax_flags to softfloat-parts.c.inc
  softfloat: Move compare_floats to softfloat-parts.c.inc
  softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
  softfloat: Move sqrt_float to softfloat-parts.c.inc
  softfloat: Split out parts_uncanon_normal
  softfloat: Reduce FloatFmt
  softfloat: Introduce Floatx80RoundPrec
  softfloat: Adjust parts_uncanon_normal for floatx80
  tests/fp/fp-test: Reverse order of floatx80 precision tests
  softfloat: Convert floatx80_add/sub to FloatParts
  softfloat: Convert floatx80_mul to FloatParts
  softfloat: Convert floatx80_div to FloatParts
  softfloat: Convert floatx80_sqrt to FloatParts
  softfloat: Convert floatx80_round to FloatParts
  softfloat: Convert floatx80_round_to_int to FloatParts
  softfloat: Convert integer to floatx80 to FloatParts
  softfloat: Convert floatx80 float conversions to FloatParts
  softfloat: Convert floatx80 to integer to FloatParts
  softfloat: Convert floatx80_scalbn to FloatParts
  softfloat: Convert floatx80 compare to FloatParts
  softfloat: Convert float32_exp2 to FloatParts
  softfloat: Move floatN_log2 to softfloat-parts.c.inc
  softfloat: Convert modrem operations to FloatParts
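
Several entries in the list above add overflow and carry wrappers to host-utils and then use `uadd64_carry` and `usub64_borrow` in softfloat-macros.h. The following is only a sketch of what such carry-propagating helpers can look like, assuming a GCC or Clang compiler that provides `__builtin_add_overflow` and `__builtin_sub_overflow`. The `sketch_*` names and signatures are hypothetical and are not claimed to match the helpers in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return x + y + *carry and store the carry out back into *carry. */
static inline uint64_t sketch_uadd64_carry(uint64_t x, uint64_t y, bool *carry)
{
    uint64_t sum;
    bool c1 = __builtin_add_overflow(x, y, &sum);
    bool c2 = __builtin_add_overflow(sum, (uint64_t)*carry, &sum);

    *carry = c1 | c2;   /* at most one of the two steps can overflow */
    return sum;
}

/* Return x - y - *borrow and store the borrow out back into *borrow. */
static inline uint64_t sketch_usub64_borrow(uint64_t x, uint64_t y, bool *borrow)
{
    uint64_t diff;
    bool b1 = __builtin_sub_overflow(x, y, &diff);
    bool b2 = __builtin_sub_overflow(diff, (uint64_t)*borrow, &diff);

    *borrow = b1 | b2;
    return diff;
}

/* 128-bit addition built from two 64-bit limbs, low limb first. */
static inline void sketch_add128(uint64_t a_hi, uint64_t a_lo,
                                 uint64_t b_hi, uint64_t b_lo,
                                 uint64_t *r_hi, uint64_t *r_lo)
{
    bool c = false;

    *r_lo = sketch_uadd64_carry(a_lo, b_lo, &c);
    *r_hi = sketch_uadd64_carry(a_hi, b_hi, &c);
}

/* Small self-check: a carry out of the low limb reaches the high limb. */
static void carry_demo(void)
{
    uint64_t hi, lo;

    sketch_add128(0, UINT64_MAX, 0, 1, &hi, &lo);
    assert(hi == 1 && lo == 0);
}
```

Chaining the carry through a `bool *` keeps multi-limb arithmetic readable, which is presumably the appeal for the wider float formats.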

 include/fpu/softfloat-helpers.h  |    5 +-
 include/fpu/softfloat-macros.h   |  247 +-
 include/fpu/softfloat-types.h    |   10 +-
 include/fpu/softfloat.h          |   11 +-
 include/qemu/host-utils.h        |  291 ++
 target/mips/fpu_helper.h         |   10 +-
 accel/tcg/tcg-runtime-gvec.c     |   36 +-
 fpu/softfloat.c                  | 7760 ++++++++++--------------------
 linux-user/arm/nwfpe/fpa11.c     |   41 +-
 target/i386/tcg/fpu_helper.c     |   79 +-
 target/m68k/fpu_helper.c         |   50 +-
 target/m68k/softfloat.c          |   90 +-
 tests/fp/fp-bench.c              |   88 +-
 tests/fp/fp-test-log2.c          |  118 +
 tests/fp/fp-test.c               |   11 +-
 fpu/softfloat-parts-addsub.c.inc |   62 +
 fpu/softfloat-parts.c.inc        | 1480 ++++++
 fpu/softfloat-specialize.c.inc   |  424 +-
 tests/fp/wrap.c.inc              |   12 +
 tests/fp/meson.build             |   11 +
 20 files changed, 4886 insertions(+), 5950 deletions(-)
 create mode 100644 tests/fp/fp-test-log2.c
 create mode 100644 fpu/softfloat-parts-addsub.c.inc
 create mode 100644 fpu/softfloat-parts.c.inc

-- 
2.25.1


Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

I'm going to take a break from reviewing this now that I've been through
about two thirds of the patches. Overall I think the series is in great
shape, and while the performance changes are interesting, they are not
a blocker from my point of view. I'll happily take a small hit to
performance for a more unified (and correct!) code base. However, the
frontend maintainers for the affected targets may take a different view.

-- 
Alex Bennée

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

FWIW I could enable a few more tests although extF80_lt_quiet still has
some failures on equality tests:

./tests/fp/fp-test -l 1 -r all extF80_lt_quiet
>> Testing extF80_lt_quiet
46464 tests total.
Errors found in extF80_lt_quiet:
+0000.0000000000000000  +0000.0000000000000000  => 1 .....  expected 0 .....
+0000.0000000000000000  -0000.0000000000000000  => 1 .....  expected 0 .....
+0000.0000000000000001  +0000.0000000000000001  => 1 .....  expected 0 .....
+0000.7FFFFFFFFFFFFFFF  +0000.7FFFFFFFFFFFFFFF  => 1 .....  expected 0 .....
+0000.7FFFFFFFFFFFFFFE  +0000.7FFFFFFFFFFFFFFE  => 1 .....  expected 0 .....
+0001.8000000000000000  +0001.8000000000000000  => 1 .....  expected 0 .....
+0001.8000000000000001  +0001.8000000000000001  => 1 .....  expected 0 .....
+0001.FFFFFFFFFFFFFFFF  +0001.FFFFFFFFFFFFFFFF  => 1 .....  expected 0 .....
+0001.FFFFFFFFFFFFFFFE  +0001.FFFFFFFFFFFFFFFE  => 1 .....  expected 0 .....
+3FBF.8000000000000000  +3FBF.8000000000000000  => 1 .....  expected 0 .....
+3FBF.8000000000000001  +3FBF.8000000000000001  => 1 .....  expected 0 .....
+3FBF.FFFFFFFFFFFFFFFF  +3FBF.FFFFFFFFFFFFFFFF  => 1 .....  expected 0 .....
+3FBF.FFFFFFFFFFFFFFFE  +3FBF.FFFFFFFFFFFFFFFE  => 1 .....  expected 0 .....
+3FFD.8000000000000000  +3FFD.8000000000000000  => 1 .....  expected 0 .....
+3FFD.8000000000000001  +3FFD.8000000000000001  => 1 .....  expected 0 .....
+3FFD.FFFFFFFFFFFFFFFF  +3FFD.FFFFFFFFFFFFFFFF  => 1 .....  expected 0 .....
+3FFD.FFFFFFFFFFFFFFFE  +3FFD.FFFFFFFFFFFFFFFE  => 1 .....  expected 0 .....
+3FFE.8000000000000000  +3FFE.8000000000000000  => 1 .....  expected 0 .....
+3FFE.8000000000000001  +3FFE.8000000000000001  => 1 .....  expected 0 .....
+3FFE.FFFFFFFFFFFFFFFF  +3FFE.FFFFFFFFFFFFFFFF  => 1 .....  expected 0 .....
9618 tests performed; 20 errors found.

However the rest can be enabled:

tests/fp: enable more tests

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

1 file changed, 6 insertions(+), 6 deletions(-)
tests/fp/meson.build | 12 ++++++------

modified   tests/fp/meson.build
@@ -556,7 +556,9 @@ softfloat_conv_tests = {
                       'extF80_to_f64 extF80_to_f128 ' +
                       'f128_to_f16',
     'int-to-float': 'i32_to_f16 i64_to_f16 i32_to_f32 i64_to_f32 ' +
-                    'i32_to_f64 i64_to_f64 i32_to_f128 i64_to_f128',
+                    'i32_to_f64 i64_to_f64 ' +
+                    'i32_to_extF80 i64_to_extF80 ' +
+                    'i32_to_f128 i64_to_f128',
     'uint-to-float': 'ui32_to_f16 ui64_to_f16 ui32_to_f32 ui64_to_f32 ' +
                      'ui32_to_f64 ui64_to_f64 ui64_to_f128 ' +
                      'ui32_to_extF80 ui64_to_extF80',
@@ -581,7 +583,7 @@ softfloat_conv_tests = {
                      'extF80_to_ui64 extF80_to_ui64_r_minMag ' +
                      'f128_to_ui64 f128_to_ui64_r_minMag',
     'round-to-integer': 'f16_roundToInt f32_roundToInt ' +
-                        'f64_roundToInt f128_roundToInt'
+                        'f64_roundToInt extF80_roundToInt f128_roundToInt'
 }
 softfloat_tests = {
     'eq_signaling' : 'compare',
@@ -602,18 +604,16 @@ fptest_args = ['-s', '-l', '1']
 fptest_rounding_args = ['-r', 'all']
 
 # Conversion Routines:
-# FIXME: i32_to_extF80 (broken), i64_to_extF80 (broken)
-#        extF80_roundToInt (broken)
 foreach k, v : softfloat_conv_tests
   test('fp-test-' + k, fptest,
        args: fptest_args + fptest_rounding_args + v.split(),
        suite: ['softfloat', 'softfloat-conv'])
 endforeach
 
-# FIXME: extF80_{lt_quiet, rem} (broken),
+# FIXME: extF80_{lt_quiet} (broken),
 #        extF80_{mulAdd} (missing)
 foreach k, v : softfloat_tests
-  extF80_broken = ['lt_quiet', 'rem'].contains(k)
+  extF80_broken = ['lt_quiet'].contains(k)
   test('fp-test-' + k, fptest,
        args: fptest_args + fptest_rounding_args +
              ['f16_' + k, 'f32_' + k, 'f64_' + k, 'f128_' + k] +

-- 
Alex Bennée

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Richard Henderson 1 year, 3 months ago
On 5/10/21 8:36 AM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>> reduce the amount of code duplication between the formats.
>>
>> The use of QEMU_GENERIC means that we need to use pointers instead
>> of structures, which means that even the smaller float formats
>> need rearranging.
>>
>> I've carried it through to completion within fpu/, so that we don't
>> have (much) of the legacy code remaining.  There is some floatx80
>> stuff in target/m68k and target/i386 that's still hanging around.
> 
> FWIW I could enable a few more tests...

Ah, thanks for the reminder that these were disabled.
I'll add this to my patch set for v2.


> ...although extF80_lt_quiet still has some failures on equality tests:

This turns out to be a trivial typo in the tester itself:

diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
index cb1bb77e4c..9ff884c140 100644
--- a/tests/fp/wrap.c.inc
+++ b/tests/fp/wrap.c.inc
@@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
  WRAP_CMP80(qemu_extF80M_le, floatx80_le)
  WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
  WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
-WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
+WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
  #undef WRAP_CMP80

  #define WRAP_CMP128(name, func)
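
The failure list earlier in the thread contains only pairs that compare equal (including +0 against -0), which is exactly where "less than or equal" and "less than" disagree. The sketch below shows that with host `double` values and the C99 quiet comparison macros `isless` and `islessequal`. It is an illustration only: the `double` type stands in for floatx80, and the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Quiet comparisons on host doubles, standing in for floatx80_*_quiet. */
static inline bool lt_quiet(double a, double b)
{
    return isless(a, b);
}

static inline bool le_quiet(double a, double b)
{
    return islessequal(a, b);
}

/* True when wiring "lt" to the "le" routine would give a wrong answer. */
static inline bool miswired_lt_differs(double a, double b)
{
    return le_quiet(a, b) != lt_quiet(a, b);
}

static void compare_demo(void)
{
    assert(miswired_lt_differs(1.0, 1.0));    /* equal: le is 1, lt is 0 */
    assert(miswired_lt_differs(0.0, -0.0));   /* +0 and -0 compare equal */
    assert(!miswired_lt_differs(1.0, 2.0));   /* strictly less: both 1 */
    assert(!miswired_lt_differs(2.0, 1.0));   /* strictly greater: both 0 */
    assert(!miswired_lt_differs(NAN, 1.0));   /* unordered: both 0 */
}
```

For every input that is not an equal pair the two routines agree, which is why only a small share of the test vectors failed.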


r~

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> On 5/10/21 8:36 AM, Alex Bennée wrote:
>> Richard Henderson <richard.henderson@linaro.org> writes:
>> 
>>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>>> reduce the amount of code duplication between the formats.
>>>
>>> The use of QEMU_GENERIC means that we need to use pointers instead
>>> of structures, which means that even the smaller float formats
>>> need rearranging.
>>>
>>> I've carried it through to completion within fpu/, so that we don't
>>> have (much) of the legacy code remaining.  There is some floatx80
>>> stuff in target/m68k and target/i386 that's still hanging around.
>> FWIW I could enable a few more tests...
>
> Ah, thanks for the reminder that these were disabled.
> I'll add this to my patch set for v2.
>
>
>> ...although extF80_lt_quiet still has some failures on equality tests:
>
> This turns out to be a trivial typo in the tester itself:
>
> diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
> index cb1bb77e4c..9ff884c140 100644
> --- a/tests/fp/wrap.c.inc
> +++ b/tests/fp/wrap.c.inc
> @@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
>  WRAP_CMP80(qemu_extF80M_le, floatx80_le)
>  WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
>  WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
> -WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
> +WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
>  #undef WRAP_CMP80
>
>  #define WRAP_CMP128(name, func)

\o/

I did note we are missing mulAdd tests but they seem to be missing from
the underlying testfloat code as well. I guess we don't care that much
for the 80bit code? Is it even used by any architectures?

-- 
Alex Bennée

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Richard Henderson 1 year, 3 months ago
On 5/12/21 6:22 AM, Alex Bennée wrote:
> I did note we are missing mulAdd tests but they seem to be missing from
> the underlying testfloat code as well. I guess we don't care that much
> for the 80bit code? Is it even used by any architectures?

floatx80 muladd is not used by any architecture, so there is no point in
implementing it.


r~


Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

OK, and here are some quad benchmarks. There is a real change above the
noise. I think the biggest hit comes from the parts conversion, but we
do claw some of it back:

* Run Quad Benchmarks

#+name: run-quad-float-benchmarks
#+begin_src sh :results output table append
  commit=$(git describe)
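  # Note: the three invocations below are identical as written, so the
  # add, mul and muladd columns all come from the same fp-bench command.
  add=$(./tests/fp/fp-bench add -p quad)
  mul=$(./tests/fp/fp-bench add -p quad)
  muladd=$(./tests/fp/fp-bench add -p quad)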
  desc=$(git log --format="format:%s" HEAD^..)
  echo "$commit,$add,$mul,$muladd,$desc"
#+end_src

#+RESULTS: run-quad-float-benchmarks
| commit                                     | add          | mul          | muladd       | description                                                     |
|--------------------------------------------+--------------+--------------+--------------+-----------------------------------------------------------------|
| pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops |                                                                 |
| pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops |                                                                 |
| pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops |                                                                 |
| pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops |                                                                 |
| pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops |                                                                 |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return                              |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops |                                                                 |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return                              |
| pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants                        |
| pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float                     |
| pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float                     |
| pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
| pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
| pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
| pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64                    |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops |                                                                 |
| pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan                  |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64                               |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops |                                                                 |
| pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops |                                                                 |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts                |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops | 88.04 MFlops | softfloat: Convert float128_silence_nan to parts                |
| pull-target-arm-20210510-1-113-g6eb5e07c28 | 88.01 MFlops | 87.70 MFlops | 87.69 MFlops | softfloat: Convert float128_default_nan to parts                |
| pull-target-arm-20210510-1-113-g6eb5e07c28 | 87.88 MFlops | 87.76 MFlops | 87.20 MFlops | softfloat: Convert float128_default_nan to parts                |
| pull-target-arm-20210510-1-114-g7a4f7331e4 | 84.38 MFlops | 84.55 MFlops | 86.92 MFlops | softfloat: Move return_nan to softfloat-parts.c.inc             |
| pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.40 MFlops | 89.79 MFlops | 90.74 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
| pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.74 MFlops | 90.11 MFlops | 90.59 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
| pull-target-arm-20210510-1-116-g474eb5be10 | 87.84 MFlops | 87.04 MFlops | 88.25 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
| pull-target-arm-20210510-1-116-g474eb5be10 | 88.22 MFlops | 87.79 MFlops | 88.10 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
| pull-target-arm-20210510-1-117-g096a466c23 | 86.37 MFlops | 85.32 MFlops | 86.22 MFlops | softfloat: Move sf_canonicalize to softfloat-parts.c.inc        |
| pull-target-arm-20210510-1-118-g973977719f | 85.41 MFlops | 84.75 MFlops | 83.47 MFlops | softfloat: Move round_canonical to softfloat-parts.c.inc        |
| pull-target-arm-20210510-1-119-g89c1fd4763 | 85.29 MFlops | 86.27 MFlops | 85.33 MFlops | softfloat: Use uadd64_carry/usub64_borrow in softfloat-macros.h |
| pull-target-arm-20210510-1-120-gfa24239a88 | 86.61 MFlops | 86.24 MFlops | 86.60 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc          |
| pull-target-arm-20210510-1-120-gfa24239a88 | 86.86 MFlops | 86.43 MFlops | 86.38 MFlops |                                                                 |
| pull-target-arm-20210510-1-120-gfa24239a88 | 86.57 MFlops | 86.57 MFlops | 86.25 MFlops |                                                                 |
| pull-target-arm-20210510-1-121-g15cf4c773a | 74.07 MFlops | 73.24 MFlops | 73.53 MFlops |                                                                 |
| pull-target-arm-20210510-1-121-g15cf4c773a | 73.44 MFlops | 73.52 MFlops | 73.86 MFlops | softfloat: Implement float128_add/sub via parts                 |
| pull-target-arm-20210510-1-122-gc07a5fec10 | 73.38 MFlops | 73.01 MFlops | 72.94 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc             |
| pull-target-arm-20210510-1-122-gc07a5fec10 | 73.41 MFlops | 72.32 MFlops | 73.64 MFlops |                                                                 |
| pull-target-arm-20210510-1-122-gc07a5fec10 | 73.88 MFlops | 73.51 MFlops | 73.36 MFlops |                                                                 |
| pull-target-arm-20210510-1-123-g88e1635abf | 69.04 MFlops | 69.33 MFlops | 69.51 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc          |
| pull-target-arm-20210510-1-123-g88e1635abf | 69.84 MFlops | 70.15 MFlops | 69.46 MFlops |                                                                 |
| pull-target-arm-20210510-1-123-g88e1635abf | 69.66 MFlops | 69.67 MFlops | 69.05 MFlops |                                                                 |
| pull-target-arm-20210510-1-124-g449de2f64c | 69.26 MFlops | 67.87 MFlops | 68.56 MFlops | softfloat: Use mulu64 for mul64To128                            |
| pull-target-arm-20210510-1-125-gd87617480b | 68.43 MFlops | 69.05 MFlops | 69.27 MFlops | softfloat: Use add192 in mul128To256                            |
| pull-target-arm-20210510-1-126-gd030713711 | 69.08 MFlops | 68.88 MFlops | 69.32 MFlops | softfloat: Tidy mul128By64To192                                 |
| pull-target-arm-20210510-1-127-gcb1a047f33 | 75.77 MFlops | 75.26 MFlops | 75.56 MFlops | softfloat: Introduce sh[lr]_double primitives                   |
| pull-target-arm-20210510-1-140-g0bd7f06cb0 | 72.07 MFlops | 71.54 MFlops | 71.55 MFlops | softfloat: Split out parts_uncanon_normal                       |
| pull-target-arm-20210510-1-141-g4ccbd2edf1 | 72.45 MFlops | 71.55 MFlops | 71.79 MFlops | softfloat: Reduce FloatFmt                                      |
| pull-target-arm-20210510-1-159-g97028a57bb | 74.87 MFlops | 75.75 MFlops | 75.69 MFlops |                                                                 |
| pull-target-arm-20210510-1-159-g97028a57bb | 73.34 MFlops | 73.42 MFlops | 72.93 MFlops | tests/fp: enable more tests                                     |
| pull-target-arm-20210510-1-159-g97028a57bb | 75.61 MFlops | 75.72 MFlops | 75.30 MFlops |                                                                 |


-- 
Alex Bennée

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Richard Henderson 1 year, 3 months ago
On 5/12/21 2:23 PM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>> reduce the amount of code duplication between the formats.
>>
>> The use of QEMU_GENERIC means that we need to use pointers instead
>> of structures, which means that even the smaller float formats
>> need rearranging.
>>
>> I've carried it through to completion within fpu/, so that we don't
>> have (much) of the legacy code remaining.  There is some floatx80
>> stuff in target/m68k and target/i386 that's still hanging around.
> 
> OK and here are some quad benchmarks. There is actual change above the
> noise but I think the biggest hit comes from the parts conversion but we
> do claw some of it back:
> 
> * Run Quad Benchmarks
> 
> #+name: run-quad-float-benchmarks
> #+begin_src sh :results output table append
>    commit=$(git describe)
>    add=$(./tests/fp/fp-bench add -p quad)
>    mul=$(./tests/fp/fp-bench add -p quad)
>    muladd=$(./tests/fp/fp-bench add -p quad)
>    desc=$(git log --format="format:%s" HEAD^..)
>    echo "$commit,$add,$mul,$muladd,$desc"
> #+end_src
> 
> #+RESULTS: run-quad-float-benchmarks
> | pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops |                                                                 |
> | pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops |                                                                 |
> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops |                                                                 |
> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops |                                                                 |
> | pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops |                                                                 |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return                              |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops |                                                                 |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return                              |
> | pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants                        |
> | pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float                     |
> | pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float                     |
> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
> | pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
> | pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
> | pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64                    |
> | pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops |                                                                 |
> | pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan                  |
> | pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64                               |
> | pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops |                                                                 |
> | pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops |                                                                 |
> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts                |
> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops | 88.04 MFlops | softfloat: Convert float128_silence_nan to parts                |
> | pull-target-arm-20210510-1-113-g6eb5e07c28 | 88.01 MFlops | 87.70 MFlops | 87.69 MFlops | softfloat: Convert float128_default_nan to parts                |
> | pull-target-arm-20210510-1-113-g6eb5e07c28 | 87.88 MFlops | 87.76 MFlops | 87.20 MFlops | softfloat: Convert float128_default_nan to parts                |
> | pull-target-arm-20210510-1-114-g7a4f7331e4 | 84.38 MFlops | 84.55 MFlops | 86.92 MFlops | softfloat: Move return_nan to softfloat-parts.c.inc             |
> | pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.40 MFlops | 89.79 MFlops | 90.74 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
> | pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.74 MFlops | 90.11 MFlops | 90.59 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
> | pull-target-arm-20210510-1-116-g474eb5be10 | 87.84 MFlops | 87.04 MFlops | 88.25 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
> | pull-target-arm-20210510-1-116-g474eb5be10 | 88.22 MFlops | 87.79 MFlops | 88.10 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
> | pull-target-arm-20210510-1-117-g096a466c23 | 86.37 MFlops | 85.32 MFlops | 86.22 MFlops | softfloat: Move sf_canonicalize to softfloat-parts.c.inc        |
> | pull-target-arm-20210510-1-118-g973977719f | 85.41 MFlops | 84.75 MFlops | 83.47 MFlops | softfloat: Move round_canonical to softfloat-parts.c.inc        |
> | pull-target-arm-20210510-1-119-g89c1fd4763 | 85.29 MFlops | 86.27 MFlops | 85.33 MFlops | softfloat: Use uadd64_carry/usub64_borrow in softfloat-macros.h |
> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.61 MFlops | 86.24 MFlops | 86.60 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc          |
> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.86 MFlops | 86.43 MFlops | 86.38 MFlops |                                                                 |
> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.57 MFlops | 86.57 MFlops | 86.25 MFlops |                                                                 |
> | pull-target-arm-20210510-1-121-g15cf4c773a | 74.07 MFlops | 73.24 MFlops | 73.53 MFlops |                                                                 |

If I'm reading your output properly, there should have been no change in the 
benchmarking through to this point, because all we have done so far is modify 
float64 and below.

Thus there seems to be a *lot* of noise in your numbers.

> | pull-target-arm-20210510-1-121-g15cf4c773a | 73.44 MFlops | 73.52 MFlops | 73.86 MFlops | softfloat: Implement float128_add/sub via parts                 |

This is where I would have expected a hit to the first column, and no change to 
the other two.

> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.38 MFlops | 73.01 MFlops | 72.94 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc             |
> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.41 MFlops | 72.32 MFlops | 73.64 MFlops |                                                                 |
> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.88 MFlops | 73.51 MFlops | 73.36 MFlops |                                                                 |
> | pull-target-arm-20210510-1-123-g88e1635abf | 69.04 MFlops | 69.33 MFlops | 69.51 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc          |

Now add and mul columns are going down when the only change is to muladd?  Is 
this just more noise?

> | pull-target-arm-20210510-1-123-g88e1635abf | 69.84 MFlops | 70.15 MFlops | 69.46 MFlops |                                                                 |
> | pull-target-arm-20210510-1-123-g88e1635abf | 69.66 MFlops | 69.67 MFlops | 69.05 MFlops |                                                                 |
> | pull-target-arm-20210510-1-124-g449de2f64c | 69.26 MFlops | 67.87 MFlops | 68.56 MFlops | softfloat: Use mulu64 for mul64To128                            |
> | pull-target-arm-20210510-1-125-gd87617480b | 68.43 MFlops | 69.05 MFlops | 69.27 MFlops | softfloat: Use add192 in mul128To256                            |
> | pull-target-arm-20210510-1-126-gd030713711 | 69.08 MFlops | 68.88 MFlops | 69.32 MFlops | softfloat: Tidy mul128By64To192                                 |
> | pull-target-arm-20210510-1-127-gcb1a047f33 | 75.77 MFlops | 75.26 MFlops | 75.56 MFlops | softfloat: Introduce sh[lr]_double primitives                   |
> | pull-target-arm-20210510-1-140-g0bd7f06cb0 | 72.07 MFlops | 71.54 MFlops | 71.55 MFlops | softfloat: Split out parts_uncanon_normal                       |
> | pull-target-arm-20210510-1-141-g4ccbd2edf1 | 72.45 MFlops | 71.55 MFlops | 71.79 MFlops | softfloat: Reduce FloatFmt                                      |
> | pull-target-arm-20210510-1-159-g97028a57bb | 74.87 MFlops | 75.75 MFlops | 75.69 MFlops |                                                                 |
> | pull-target-arm-20210510-1-159-g97028a57bb | 73.34 MFlops | 73.42 MFlops | 72.93 MFlops | tests/fp: enable more tests                                     |
> | pull-target-arm-20210510-1-159-g97028a57bb | 75.61 MFlops | 75.72 MFlops | 75.30 MFlops |                                                                 |
> 
> 

r~

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> On 5/12/21 2:23 PM, Alex Bennée wrote:
>> Richard Henderson <richard.henderson@linaro.org> writes:
>> 
>>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>>> reduce the amount of code duplication between the formats.
>>>
>>> The use of QEMU_GENERIC means that we need to use pointers instead
>>> of structures, which means that even the smaller float formats
>>> need rearranging.
>>>
>>> I've carried it through to completion within fpu/, so that we don't
>>> have (much) of the legacy code remaining.  There is some floatx80
>>> stuff in target/m68k and target/i386 that's still hanging around.
>> OK and here are some quad benchmarks. There is actual change above the
>> noise but I think the biggest hit comes from the parts conversion but we
>> do claw some of it back:
>> * Run Quad Benchmarks
>> #+name: run-quad-float-benchmarks
>> #+begin_src sh :results output table append
>>    commit=$(git describe)
>>    add=$(./tests/fp/fp-bench add -p quad)
>>    mul=$(./tests/fp/fp-bench add -p quad)
>>    muladd=$(./tests/fp/fp-bench add -p quad)
>>    desc=$(git log --format="format:%s" HEAD^..)
>>    echo "$commit,$add,$mul,$muladd,$desc"
>> #+end_src
>> #+RESULTS: run-quad-float-benchmarks
>> | pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return                              |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return                              |
>> | pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants                        |
>> | pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float                     |
>> | pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float                     |
>> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
>> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode                   |
>> | pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
>> | pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one  |
>> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
>> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan  |
>> | pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64                    |
>> | pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan                  |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64                               |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts                |
>> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops | 88.04 MFlops | softfloat: Convert float128_silence_nan to parts                |
>> | pull-target-arm-20210510-1-113-g6eb5e07c28 | 88.01 MFlops | 87.70 MFlops | 87.69 MFlops | softfloat: Convert float128_default_nan to parts                |
>> | pull-target-arm-20210510-1-113-g6eb5e07c28 | 87.88 MFlops | 87.76 MFlops | 87.20 MFlops | softfloat: Convert float128_default_nan to parts                |
>> | pull-target-arm-20210510-1-114-g7a4f7331e4 | 84.38 MFlops | 84.55 MFlops | 86.92 MFlops | softfloat: Move return_nan to softfloat-parts.c.inc             |
>> | pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.40 MFlops | 89.79 MFlops | 90.74 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
>> | pull-target-arm-20210510-1-115-g08f1f1f3ed | 90.74 MFlops | 90.11 MFlops | 90.59 MFlops | softfloat: Move pick_nan to softfloat-parts.c.inc               |
>> | pull-target-arm-20210510-1-116-g474eb5be10 | 87.84 MFlops | 87.04 MFlops | 88.25 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
>> | pull-target-arm-20210510-1-116-g474eb5be10 | 88.22 MFlops | 87.79 MFlops | 88.10 MFlops | softfloat: Move pick_nan_muladd to softfloat-parts.c.inc        |
>> | pull-target-arm-20210510-1-117-g096a466c23 | 86.37 MFlops | 85.32 MFlops | 86.22 MFlops | softfloat: Move sf_canonicalize to softfloat-parts.c.inc        |
>> | pull-target-arm-20210510-1-118-g973977719f | 85.41 MFlops | 84.75 MFlops | 83.47 MFlops | softfloat: Move round_canonical to softfloat-parts.c.inc        |
>> | pull-target-arm-20210510-1-119-g89c1fd4763 | 85.29 MFlops | 86.27 MFlops | 85.33 MFlops | softfloat: Use uadd64_carry/usub64_borrow in softfloat-macros.h |
>> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.61 MFlops | 86.24 MFlops | 86.60 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc          |
>> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.86 MFlops | 86.43 MFlops | 86.38 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-120-gfa24239a88 | 86.57 MFlops | 86.57 MFlops | 86.25 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-121-g15cf4c773a | 74.07 MFlops | 73.24 MFlops | 73.53 MFlops |                                                                 |
>
> If I'm reading your output properly, there should have been no change
> in the benchmarking through to this point, because all we have done so
> far is modify float64 and below.
>
> Thus there seems to be a *lot* of noise in your numbers.

That's why some of the passes were run multiple times. Although the
results are sorted in commit order, I did 3 separate passes through the
code when I was gathering the numbers. However, you are right to say they
are indicative rather than gathered under strict benchmarking conditions.
I've just re-run on 3 commits:

  #+RESULTS: run-quad-float-benchmarks
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 80.97 MFlops | 84.22 MFlops | 81.71 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 82.51 MFlops | 79.53 MFlops | 84.27 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 84.36 MFlops | 85.77 MFlops | 81.84 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 85.28 MFlops | 84.05 MFlops | 87.40 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 85.27 MFlops | 87.66 MFlops | 85.53 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 85.23 MFlops | 87.64 MFlops | 87.05 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 86.19 MFlops | 85.73 MFlops | 85.61 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 85.04 MFlops | 85.36 MFlops | 86.28 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 86.00 MFlops | 86.65 MFlops | 88.30 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 86.94 MFlops | 88.36 MFlops | 86.94 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 86.62 MFlops | 88.44 MFlops | 87.74 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 87.42 MFlops | 87.89 MFlops | 87.85 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 86.87 MFlops | 88.31 MFlops | 87.30 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 88.00 MFlops | 88.12 MFlops | 88.63 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 88.16 MFlops | 87.84 MFlops | 86.42 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 88.35 MFlops | 86.78 MFlops | 88.01 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 87.76 MFlops | 88.01 MFlops | 88.02 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-95-gdb0bb2966f | 88.46 MFlops | 88.10 MFlops | 87.95 MFlops | softfloat: Tidy a * b + inf return          |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 83.80 MFlops | 88.45 MFlops | 87.70 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 88.30 MFlops | 88.55 MFlops | 88.42 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 87.96 MFlops | 86.73 MFlops | 88.58 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 87.23 MFlops | 87.42 MFlops | 88.14 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 87.41 MFlops | 87.75 MFlops | 88.26 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 88.28 MFlops | 88.39 MFlops | 87.51 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 86.68 MFlops | 88.43 MFlops | 88.23 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 86.87 MFlops | 88.04 MFlops | 87.93 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 86.60 MFlops | 88.19 MFlops | 87.46 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 88.58 MFlops | 88.18 MFlops | 88.27 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 86.85 MFlops | 88.73 MFlops | 88.37 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-96-gec2be8ad0c | 87.25 MFlops | 88.70 MFlops | 88.04 MFlops | softfloat: Add float_cmask and constants    |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 89.23 MFlops | 90.74 MFlops | 91.03 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 90.40 MFlops | 90.79 MFlops | 91.23 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 91.19 MFlops | 91.13 MFlops | 90.99 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 91.06 MFlops | 91.38 MFlops | 90.86 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 89.91 MFlops | 89.57 MFlops | 91.09 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 91.30 MFlops | 91.65 MFlops | 91.64 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 90.62 MFlops | 91.00 MFlops | 91.35 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 91.38 MFlops | 91.55 MFlops | 90.80 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 90.64 MFlops | 91.50 MFlops | 88.21 MFlops | softfloat: Use return_nan in float_to_float |
  | pull-target-arm-20210510-1-97-g2328f560a1 | 90.13 MFlops | 91.20 MFlops | 91.16 MFlops | softfloat: Use return_nan in float_to_float |

So while the first commit showed a nearly 10% variation over multiple
runs (cache effects?) I think the step between float_cmask and
return_nan is more real.

I wonder if it would be worth tweaking the fp-bench program to make the
numbers a bit more stable? For total execution time runs I use hyperfine,
which does at least try to measure the deviation between runs and trend
to a stable number, but these numbers are reported directly by the
fp-bench tool.
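
As a rough sketch of the kind of summary that would help here, assuming the
repeated readings have already been collected: the `summarise` helper below
is hypothetical (it is not part of fp-bench or hyperfine) and simply reports
the mean, sample standard deviation and min-to-max spread of its arguments.

```sh
#!/bin/sh
# Hypothetical helper, not part of fp-bench: summarise repeated readings
# (one per argument, e.g. "90.28 MFlops") as mean, sample standard
# deviation and min-to-max spread relative to the minimum.
summarise() {
    printf '%s\n' "$@" | awk '
        {
            v = $1 + 0              # numeric part; any "MFlops" suffix is ignored
            sum += v
            sumsq += v * v
            if (NR == 1 || v < min) min = v
            if (NR == 1 || v > max) max = v
        }
        END {
            mean = sum / NR
            var = (NR > 1) ? (sumsq - NR * mean * mean) / (NR - 1) : 0
            if (var < 0) var = 0    # guard against rounding error
            printf "n=%d mean=%.2f stddev=%.2f spread=%.1f%%\n",
                   NR, mean, sqrt(var), 100 * (max - min) / min
        }'
}

# First-column readings from the first three re-runs of ...-95-gdb0bb2966f.
summarise 80.97 82.51 84.36
```

For those three readings this prints `n=3 mean=82.61 stddev=1.70 spread=4.2%`.
Feeding it every run of a commit would give a per-commit deviation to compare
against the step between commits.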

>
>> | pull-target-arm-20210510-1-121-g15cf4c773a | 73.44 MFlops | 73.52 MFlops | 73.86 MFlops | softfloat: Implement float128_add/sub via parts                 |
>
> This is where I would have expected a hit to the first column, and no
> change to the other two.

The effect looks very real:

  #+RESULTS: run-quad-float-benchmarks
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.32 MFlops | 82.89 MFlops | 85.81 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.31 MFlops | 85.83 MFlops | 85.26 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.89 MFlops | 86.99 MFlops | 86.57 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 83.82 MFlops | 85.01 MFlops | 85.83 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.51 MFlops | 84.93 MFlops | 86.27 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.90 MFlops | 85.15 MFlops | 85.42 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 83.39 MFlops | 85.73 MFlops | 85.28 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.03 MFlops | 85.73 MFlops | 84.60 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.38 MFlops | 85.18 MFlops | 86.69 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 84.01 MFlops | 84.48 MFlops | 85.49 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.56 MFlops | 84.50 MFlops | 85.12 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.27 MFlops | 85.39 MFlops | 85.23 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.14 MFlops | 71.95 MFlops | 72.78 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.70 MFlops | 71.64 MFlops | 73.36 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.26 MFlops | 73.44 MFlops | 72.86 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.95 MFlops | 72.24 MFlops | 72.26 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.00 MFlops | 72.78 MFlops | 72.92 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.10 MFlops | 73.36 MFlops | 72.71 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.23 MFlops | 72.76 MFlops | 73.48 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 71.91 MFlops | 71.54 MFlops | 73.01 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.08 MFlops | 73.75 MFlops | 72.17 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.21 MFlops | 73.69 MFlops | 72.65 MFlops | softfloat: Implement float128_add/sub via parts        |

I wonder if it is an effect of greater code re-organisation affecting
cache lines or some other residency of the other functions?

>
>> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.38 MFlops | 73.01 MFlops | 72.94 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc             |
>> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.41 MFlops | 72.32 MFlops | 73.64 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.88 MFlops | 73.51 MFlops | 73.36 MFlops |                                                                 |
>> | pull-target-arm-20210510-1-123-g88e1635abf | 69.04 MFlops | 69.33 MFlops | 69.51 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc          |
>
> Now add and mul columns are going down when the only change is to
> muladd?  Is this just more noise?

Running again more times I think it is a real effect:

  #+RESULTS: run-quad-float-benchmarks
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.99 MFlops | 84.44 MFlops | 85.98 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 86.77 MFlops | 85.99 MFlops | 85.02 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 87.07 MFlops | 86.22 MFlops | 85.65 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 84.49 MFlops | 85.94 MFlops | 85.49 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.75 MFlops | 85.05 MFlops | 85.65 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.51 MFlops | 86.55 MFlops | 84.26 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-120-gfa24239a88 | 85.68 MFlops | 86.39 MFlops | 85.05 MFlops | softfloat: Move addsub_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 71.73 MFlops | 72.85 MFlops | 73.68 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.32 MFlops | 72.05 MFlops | 72.44 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.03 MFlops | 73.92 MFlops | 72.03 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 71.59 MFlops | 73.25 MFlops | 72.97 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.23 MFlops | 73.55 MFlops | 73.47 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.59 MFlops | 73.03 MFlops | 72.76 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.60 MFlops | 71.20 MFlops | 73.73 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.91 MFlops | 73.22 MFlops | 72.98 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.27 MFlops | 70.69 MFlops | 72.62 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.04 MFlops | 73.35 MFlops | 73.13 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.74 MFlops | 71.59 MFlops | 72.33 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.20 MFlops | 73.35 MFlops | 72.61 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.72 MFlops | 72.79 MFlops | 73.29 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.97 MFlops | 73.58 MFlops | 72.72 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.73 MFlops | 73.20 MFlops | 72.74 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.42 MFlops | 73.61 MFlops | 73.37 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.88 MFlops | 73.36 MFlops | 73.55 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.06 MFlops | 72.64 MFlops | 72.75 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 73.78 MFlops | 73.31 MFlops | 73.97 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.78 MFlops | 73.68 MFlops | 73.65 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.58 MFlops | 73.01 MFlops | 72.94 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-121-g15cf4c773a | 72.48 MFlops | 71.77 MFlops | 73.38 MFlops | softfloat: Implement float128_add/sub via parts        |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 72.23 MFlops | 73.22 MFlops | 73.23 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 72.84 MFlops | 72.80 MFlops | 73.42 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.24 MFlops | 72.20 MFlops | 73.68 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 71.59 MFlops | 72.49 MFlops | 73.79 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.66 MFlops | 72.63 MFlops | 73.41 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.40 MFlops | 73.51 MFlops | 73.38 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.47 MFlops | 73.80 MFlops | 73.07 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 73.41 MFlops | 73.78 MFlops | 73.10 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-122-gc07a5fec10 | 72.38 MFlops | 72.54 MFlops | 72.09 MFlops | softfloat: Move mul_floats to softfloat-parts.c.inc    |
  | pull-target-arm-20210510-1-123-g88e1635abf | 68.45 MFlops | 67.83 MFlops | 66.24 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.52 MFlops | 68.83 MFlops | 69.40 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.39 MFlops | 68.94 MFlops | 69.41 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 68.71 MFlops | 69.38 MFlops | 69.78 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.51 MFlops | 69.38 MFlops | 68.44 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.77 MFlops | 69.53 MFlops | 69.12 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.42 MFlops | 69.54 MFlops | 69.72 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 68.03 MFlops | 69.33 MFlops | 68.57 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |
  | pull-target-arm-20210510-1-123-g88e1635abf | 69.48 MFlops | 69.07 MFlops | 69.68 MFlops | softfloat: Move muladd_floats to softfloat-parts.c.inc |

-- 
Alex Bennée

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Richard Henderson 1 year, 3 months ago
On 5/13/21 8:33 AM, Alex Bennée wrote:
>> Now add and mul columns are going down when the only change is to
>> muladd?  Is this just more noise?
> 
> Running again more times I think it is a real effect:

I don't believe it.  If source code for a given function is not changing then 
the generated code should not change (much, especially with FLATTEN), and thus 
the runtime should not change (much).

Are you absolutely sure that you're measuring what you think you are measuring?

Is your compiler mis-behaving somehow and not inlining stuff?


r~

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by no-reply@patchew.org 1 year, 3 months ago
Patchew URL: https://patchew.org/QEMU/20210508014802.892561-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210508014802.892561-1-richard.henderson@linaro.org
Subject: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20210508014802.892561-1-richard.henderson@linaro.org -> patchew/20210508014802.892561-1-richard.henderson@linaro.org
Switched to a new branch 'test'
2dcaa72 softfloat: Convert modrem operations to FloatParts
2956f6c softfloat: Move floatN_log2 to softfloat-parts.c.inc
b786038 softfloat: Convert float32_exp2 to FloatParts
01ab7b4 softfloat: Convert floatx80 compare to FloatParts
d55b655 softfloat: Convert floatx80_scalbn to FloatParts
0b1869b softfloat: Convert floatx80 to integer to FloatParts
6287e68 softfloat: Convert floatx80 float conversions to FloatParts
5c28030 softfloat: Convert integer to floatx80 to FloatParts
e686e7d softfloat: Convert floatx80_round_to_int to FloatParts
9e7a606 softfloat: Convert floatx80_round to FloatParts
d19c872 softfloat: Convert floatx80_sqrt to FloatParts
06addff softfloat: Convert floatx80_div to FloatParts
6974166 softfloat: Convert floatx80_mul to FloatParts
0ab65d7 softfloat: Convert floatx80_add/sub to FloatParts
5ae98cd tests/fp/fp-test: Reverse order of floatx80 precision tests
dc5eada softfloat: Adjust parts_uncanon_normal for floatx80
0f4d3ce softfloat: Introduce Floatx80RoundPrec
aecc38b softfloat: Reduce FloatFmt
d8342e2 softfloat: Split out parts_uncanon_normal
e8a3234 softfloat: Move sqrt_float to softfloat-parts.c.inc
db066f3 softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
870c56c softfloat: Move compare_floats to softfloat-parts.c.inc
4895586 softfloat: Move minmax_flags to softfloat-parts.c.inc
c9e01de softfloat: Move uint_to_float to softfloat-parts.c.inc
0b58632 softfloat: Move int_to_float to softfloat-parts.c.inc
e408986 softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
1330f41 softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
2d631f0 softfloat: Move round_to_int to softfloat-parts.c.inc
45abdca softfloat: Convert float-to-float conversions with float128
0ddbdfd softfloat: Split float_to_float
296ad69 softfloat: Move div_floats to softfloat-parts.c.inc
d5ff786 softfloat: Introduce sh[lr]_double primitives
d227707 softfloat: Tidy mul128By64To192
5a9c124 softfloat: Use add192 in mul128To256
86d8505 softfloat: Use mulu64 for mul64To128
1186174 softfloat: Move muladd_floats to softfloat-parts.c.inc
5e4de79 softfloat: Move mul_floats to softfloat-parts.c.inc
1450eec softfloat: Implement float128_add/sub via parts
ed27dea softfloat: Move addsub_floats to softfloat-parts.c.inc
3fe4a87 softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
5687e8b softfloat: Move round_canonical to softfloat-parts.c.inc
ddf7181 softfloat: Move sf_canonicalize to softfloat-parts.c.inc
49b211c softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
d6fae5a softfloat: Move pick_nan to softfloat-parts.c.inc
ada8c0f softfloat: Move return_nan to softfloat-parts.c.inc
701d4e6 softfloat: Convert float128_default_nan to parts
735bd6f softfloat: Convert float128_silence_nan to parts
be031db softfloat: Rearrange FloatParts64
997cce0 softfloat: Use pointers with parts_silence_nan
759543e softfloat: Use pointers with ftype_round_pack_canonical
59155e0 softfloat: Use pointers with ftype_unpack_canonical
67f866d softfloat: Use pointers with ftype_pack_raw
ad2e600 softfloat: Use pointers with pack_raw
6725bec softfloat: Use pointers with ftype_unpack_raw
6fa54f0 softfloat: Use pointers with unpack_raw
36916c3 softfloat: Use pointers with parts_default_nan
e255c56 softfloat: Move type-specific pack/unpack routines
17cab05 softfloat: Rename FloatParts to FloatParts64
7973d6d softfloat: Do not produce a default_nan from parts_silence_nan
7267830 target/mips: Set set_default_nan_mode with set_snan_bit_is_one
8acc1a9 softfloat: fix return_nan vs default_nan_mode
5c700cd softfloat: Use return_nan in float_to_float
e6a00e2 softfloat: Add float_cmask and constants
689fe83 softfloat: Tidy a * b + inf return
75285a4 softfloat: Use float_raise in more places
a8ea262 softfloat: Inline float_raise
3be68a5 softfloat: Move the binary point to the msb
e2236e1 tests/fp: add quad support to the benchmark utility
be7703a accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
3731829 qemu/host-utils: Add wrappers for carry builtins
a1367bc qemu/host-utils: Add wrappers for overflow builtins
96efc5c qemu/host-utils: Use __builtin_bitreverseN

=== OUTPUT BEGIN ===
1/72 Checking commit 96efc5c8f4d0 (qemu/host-utils: Use __builtin_bitreverseN)
WARNING: architecture specific defines should be avoided
#24: FILE: include/qemu/host-utils.h:275:
+#if __has_builtin(__builtin_bitreverse8)

WARNING: architecture specific defines should be avoided
#42: FILE: include/qemu/host-utils.h:296:
+#if __has_builtin(__builtin_bitreverse16)

WARNING: architecture specific defines should be avoided
#60: FILE: include/qemu/host-utils.h:319:
+#if __has_builtin(__builtin_bitreverse32)

WARNING: architecture specific defines should be avoided
#78: FILE: include/qemu/host-utils.h:342:
+#if __has_builtin(__builtin_bitreverse64)

total: 0 errors, 4 warnings, 64 lines checked

Patch 1/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/72 Checking commit a1367bcd9ebb (qemu/host-utils: Add wrappers for overflow builtins)
WARNING: architecture specific defines should be avoided
#34: FILE: include/qemu/host-utils.h:369:
+#if __has_builtin(__builtin_add_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#52: FILE: include/qemu/host-utils.h:387:
+#if __has_builtin(__builtin_add_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#70: FILE: include/qemu/host-utils.h:405:
+#if __has_builtin(__builtin_add_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#88: FILE: include/qemu/host-utils.h:423:
+#if __has_builtin(__builtin_add_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#107: FILE: include/qemu/host-utils.h:442:
+#if __has_builtin(__builtin_sub_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#126: FILE: include/qemu/host-utils.h:461:
+#if __has_builtin(__builtin_sub_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#145: FILE: include/qemu/host-utils.h:480:
+#if __has_builtin(__builtin_sub_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#164: FILE: include/qemu/host-utils.h:499:
+#if __has_builtin(__builtin_sub_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#182: FILE: include/qemu/host-utils.h:517:
+#if __has_builtin(__builtin_mul_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#201: FILE: include/qemu/host-utils.h:536:
+#if __has_builtin(__builtin_mul_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#221: FILE: include/qemu/host-utils.h:556:
+#if __has_builtin(__builtin_mul_overflow) || __GNUC__ >= 5

WARNING: architecture specific defines should be avoided
#240: FILE: include/qemu/host-utils.h:575:
+#if __has_builtin(__builtin_mul_overflow) || __GNUC__ >= 5

total: 0 errors, 12 warnings, 231 lines checked

Patch 2/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/72 Checking commit 37318292524c (qemu/host-utils: Add wrappers for carry builtins)
WARNING: architecture specific defines should be avoided
#43: FILE: include/qemu/host-utils.h:595:
+#if __has_builtin(__builtin_addcll)

WARNING: architecture specific defines should be avoided
#68: FILE: include/qemu/host-utils.h:620:
+#if __has_builtin(__builtin_subcll)

total: 0 errors, 2 warnings, 62 lines checked

Patch 3/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/72 Checking commit be7703ad3d1d (accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c)
5/72 Checking commit e2236e18fc30 (tests/fp: add quad support to the benchmark utility)
WARNING: line over 80 characters
#182: FILE: tests/fp/fp-bench.c:458:
+    GEN_BENCH_NO_NEG(bench_ ## name ## _float128, float128, PREC_FLOAT128, op, n)

WARNING: line over 80 characters
#199: FILE: tests/fp/fp-bench.c:521:
+    fprintf(stderr, " -p = floating point precision (single, double, quad[soft only]). "

total: 0 errors, 2 warnings, 185 lines checked

Patch 5/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/72 Checking commit 3be68a54ecda (softfloat: Move the binary point to the msb)
7/72 Checking commit a8ea262fb400 (softfloat: Inline float_raise)
8/72 Checking commit 75285a4341ef (softfloat: Use float_raise in more places)
9/72 Checking commit 689fe83bcde1 (softfloat: Tidy a * b + inf return)
10/72 Checking commit e6a00e231540 (softfloat: Add float_cmask and constants)
11/72 Checking commit 5c700cd63df0 (softfloat: Use return_nan in float_to_float)
12/72 Checking commit 8acc1a959a15 (softfloat: fix return_nan vs default_nan_mode)
13/72 Checking commit 7267830dd66c (target/mips: Set set_default_nan_mode with set_snan_bit_is_one)
14/72 Checking commit 7973d6db759b (softfloat: Do not produce a default_nan from parts_silence_nan)
15/72 Checking commit 17cab0560556 (softfloat: Rename FloatParts to FloatParts64)
WARNING: line over 80 characters
#235: FILE: fpu/softfloat.c:928:
+static FloatParts64 pick_nan_muladd(FloatParts64 a, FloatParts64 b, FloatParts64 c,

WARNING: line over 80 characters
#390: FILE: fpu/softfloat.c:1347:
+static FloatParts64 muladd_floats(FloatParts64 a, FloatParts64 b, FloatParts64 c,

WARNING: line over 80 characters
#837: FILE: fpu/softfloat.c:3189:
+static FloatRelation compare_floats(FloatParts64 a, FloatParts64 b, bool is_quiet,

WARNING: Block comments use a leading /* on a separate line
#875: FILE: fpu/softfloat.c:3374:
+        /* The largest float type (even though not supported by FloatParts64)

WARNING: line over 80 characters
#926: FILE: fpu/softfloat.c:3425:
+static FloatParts64 sqrt_float(FloatParts64 a, float_status *s, const FloatFmt *p)

total: 0 errors, 5 warnings, 1002 lines checked

Patch 15/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
16/72 Checking commit e255c5631e4f (softfloat: Move type-specific pack/unpack routines)
17/72 Checking commit 36916c3edd6a (softfloat: Use pointers with parts_default_nan)
18/72 Checking commit 6fa54f0a9176 (softfloat: Use pointers with unpack_raw)
19/72 Checking commit 6725bec472d1 (softfloat: Use pointers with ftype_unpack_raw)
20/72 Checking commit ad2e6003ba80 (softfloat: Use pointers with pack_raw)
21/72 Checking commit 67f866dc3382 (softfloat: Use pointers with ftype_pack_raw)
22/72 Checking commit 59155e0623d2 (softfloat: Use pointers with ftype_unpack_canonical)
23/72 Checking commit 759543ead10f (softfloat: Use pointers with ftype_round_pack_canonical)
24/72 Checking commit 997cce09e9b5 (softfloat: Use pointers with parts_silence_nan)
25/72 Checking commit be031db06eb0 (softfloat: Rearrange FloatParts64)
26/72 Checking commit 735bd6fba22f (softfloat: Convert float128_silence_nan to parts)
27/72 Checking commit 701d4e63d467 (softfloat: Convert float128_default_nan to parts)
28/72 Checking commit ada8c0fd97c8 (softfloat: Move return_nan to softfloat-parts.c.inc)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#18: 
new file mode 100644

total: 0 errors, 1 warnings, 124 lines checked

Patch 28/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
29/72 Checking commit d6fae5a986f6 (softfloat: Move pick_nan to softfloat-parts.c.inc)
30/72 Checking commit 49b211c61f33 (softfloat: Move pick_nan_muladd to softfloat-parts.c.inc)
31/72 Checking commit ddf7181b7486 (softfloat: Move sf_canonicalize to softfloat-parts.c.inc)
32/72 Checking commit 5687e8bdf817 (softfloat: Move round_canonical to softfloat-parts.c.inc)
33/72 Checking commit 3fe4a87f6fd6 (softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h)
34/72 Checking commit ed27dea74977 (softfloat: Move addsub_floats to softfloat-parts.c.inc)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#19: 
new file mode 100644

ERROR: space required after that ',' (ctx:VxV)
#270: FILE: fpu/softfloat.c:957:
+#define partsN(NAME)   glue(glue(glue(parts,N),_),NAME)
                                            ^

ERROR: space required after that ',' (ctx:VxV)
#270: FILE: fpu/softfloat.c:957:
+#define partsN(NAME)   glue(glue(glue(parts,N),_),NAME)
                                               ^

ERROR: space required after that ',' (ctx:VxV)
#270: FILE: fpu/softfloat.c:957:
+#define partsN(NAME)   glue(glue(glue(parts,N),_),NAME)
                                                  ^

ERROR: space required after that ',' (ctx:VxV)
#271: FILE: fpu/softfloat.c:958:
+#define FloatPartsN    glue(FloatParts,N)
                                       ^

total: 4 errors, 1 warnings, 489 lines checked

Patch 34/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

35/72 Checking commit 1450eec1d1a9 (softfloat: Implement float128_add/sub via parts)
36/72 Checking commit 5e4de79a96ed (softfloat: Move mul_floats to softfloat-parts.c.inc)
ERROR: space required after that ',' (ctx:VxV)
#151: FILE: fpu/softfloat.c:1003:
+#define FloatPartsW    glue(FloatParts,W)
                                       ^

total: 1 errors, 0 warnings, 350 lines checked

Patch 36/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

37/72 Checking commit 11861744634d (softfloat: Move muladd_floats to softfloat-parts.c.inc)
38/72 Checking commit 86d850509935 (softfloat: Use mulu64 for mul64To128)
39/72 Checking commit 5a9c12452902 (softfloat: Use add192 in mul128To256)
ERROR: space prohibited after that open parenthesis '('
#61: FILE: include/fpu/softfloat-macros.h:527:
+    add192( 0, m1, m2,  0, n1, n2, &m0, &m1, &m2);

total: 1 errors, 0 warnings, 46 lines checked

Patch 39/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

40/72 Checking commit d22770700ad7 (softfloat: Tidy mul128By64To192)
41/72 Checking commit d5ff7869a46b (softfloat: Introduce sh[lr]_double primitives)
WARNING: architecture specific defines should be avoided
#203: FILE: include/fpu/softfloat-macros.h:98:
+#if defined(__x86_64__)

WARNING: architecture specific defines should be avoided
#221: FILE: include/fpu/softfloat-macros.h:116:
+#if defined(__x86_64__)

total: 0 errors, 2 warnings, 199 lines checked

Patch 41/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
42/72 Checking commit 296ad691084d (softfloat: Move div_floats to softfloat-parts.c.inc)
43/72 Checking commit 0ddbdfdd1a7c (softfloat: Split float_to_float)
44/72 Checking commit 45abdcae4504 (softfloat: Convert float-to-float conversions with float128)
45/72 Checking commit 2d631f07b19f (softfloat: Move round_to_int to softfloat-parts.c.inc)
46/72 Checking commit 1330f41e01bb (softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc)
47/72 Checking commit e408986129be (softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc)
48/72 Checking commit 0b586325bf5a (softfloat: Move int_to_float to softfloat-parts.c.inc)
49/72 Checking commit c9e01de1b349 (softfloat: Move uint_to_float to softfloat-parts.c.inc)
50/72 Checking commit 4895586cb8dc (softfloat: Move minmax_flags to softfloat-parts.c.inc)
51/72 Checking commit 870c56c4e127 (softfloat: Move compare_floats to softfloat-parts.c.inc)
52/72 Checking commit db066f333726 (softfloat: Move scalbn_decomposed to softfloat-parts.c.inc)
53/72 Checking commit e8a323470fbf (softfloat: Move sqrt_float to softfloat-parts.c.inc)
54/72 Checking commit d8342e21ffda (softfloat: Split out parts_uncanon_normal)
55/72 Checking commit aecc38becfb6 (softfloat: Reduce FloatFmt)
56/72 Checking commit 0f4d3ce1b908 (softfloat: Introduce Floatx80RoundPrec)
57/72 Checking commit dc5eada61c8f (softfloat: Adjust parts_uncanon_normal for floatx80)
58/72 Checking commit 5ae98cd1dd64 (tests/fp/fp-test: Reverse order of floatx80 precision tests)
59/72 Checking commit 0ab65d7f28d9 (softfloat: Convert floatx80_add/sub to FloatParts)
60/72 Checking commit 6974166c3993 (softfloat: Convert floatx80_mul to FloatParts)
61/72 Checking commit 06addffabfe9 (softfloat: Convert floatx80_div to FloatParts)
62/72 Checking commit d19c872604c3 (softfloat: Convert floatx80_sqrt to FloatParts)
63/72 Checking commit 9e7a60645c6d (softfloat: Convert floatx80_round to FloatParts)
64/72 Checking commit e686e7da3302 (softfloat: Convert floatx80_round_to_int to FloatParts)
65/72 Checking commit 5c28030e4ea5 (softfloat: Convert integer to floatx80 to FloatParts)
66/72 Checking commit 6287e685faf8 (softfloat: Convert floatx80 float conversions to FloatParts)
67/72 Checking commit 0b1869b8f8b3 (softfloat: Convert floatx80 to integer to FloatParts)
68/72 Checking commit d55b65538352 (softfloat: Convert floatx80_scalbn to FloatParts)
69/72 Checking commit 01ab7b4aed96 (softfloat: Convert floatx80 compare to FloatParts)
70/72 Checking commit b78603823353 (softfloat: Convert float32_exp2 to FloatParts)
71/72 Checking commit 2956f6c7c5d3 (softfloat: Move floatN_log2 to softfloat-parts.c.inc)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#325: 
new file mode 100644

ERROR: trailing whitespace
#376: FILE: tests/fp/fp-test-log2.c:47:
+   $

total: 1 errors, 1 warnings, 410 lines checked

Patch 71/72 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

72/72 Checking commit 2dcaa72f9b5e (softfloat: Convert modrem operations to FloatParts)
=== OUTPUT END ===

Test command exited with code: 1
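
Most of the warnings in the log above come from `#if __has_builtin(...)` guards in include/qemu/host-utils.h. As a rough illustration of that kind of guard, here is a sketch of an overflow-checked add with a portable fallback, and an add-with-carry built on top of it. The function names and signatures are assumptions made for this example, not the ones in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compilers without __has_builtin: treat every query as "not available". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper: stores x + y in *ret, returns true on wraparound. */
static inline bool sketch_uadd64_overflow(uint64_t x, uint64_t y,
                                          uint64_t *ret)
{
#if __has_builtin(__builtin_add_overflow) || __GNUC__ >= 5
    return __builtin_add_overflow(x, y, ret);
#else
    *ret = x + y;
    return *ret < x;   /* unsigned sum wrapped iff it is below an operand */
#endif
}

/*
 * Hypothetical helper: returns x + y + *pcarry and updates *pcarry with
 * the carry out, so multi-word additions can be chained word by word.
 */
static inline uint64_t sketch_uadd64_carry(uint64_t x, uint64_t y,
                                           bool *pcarry)
{
    uint64_t r;
    bool c;

    assert(pcarry != NULL);
    c = sketch_uadd64_overflow(x, *pcarry, &r);   /* add the carry in */
    c |= sketch_uadd64_overflow(r, y, &r);        /* then the other word */
    *pcarry = c;
    return r;
}
```

A 128-bit add would call `sketch_uadd64_carry` once for the low words with the carry initialised to false, then once for the high words with the same carry variable.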


The full log is available at
http://patchew.org/logs/20210508014802.892561-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Posted by Alex Bennée 1 year, 3 months ago
Richard Henderson <richard.henderson@linaro.org> writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
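
To make the quoted description concrete, here is a rough sketch of the general shape: one function body instantiated per width, plus a type-generic macro that picks the instance from the pointer type. It rests on several assumptions. C11 `_Generic` stands in for QEMU_GENERIC, a function-defining macro stands in for including a .c.inc file several times, and every `sketch_`/`Sketch` name is hypothetical rather than taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Two-level paste so that arguments are macro-expanded before ## applies. */
#define sketch_xglue(a, b) a ## b
#define sketch_glue(a, b)  sketch_xglue(a, b)

/* Hypothetical decomposed formats of two different widths. */
typedef struct {
    bool sign;
    int32_t exp;
    uint64_t frac;
} SketchParts64;

typedef struct {
    bool sign;
    int32_t exp;
    uint64_t frac_hi, frac_lo;
} SketchParts128;

/*
 * One body, instantiated once per width N.  Expands to
 * sketch_parts64_negate(SketchParts64 *) for N = 64, and so on.
 */
#define SKETCH_DEFINE_NEGATE(N)                                         \
    static inline void                                                  \
    sketch_glue(sketch_glue(sketch_parts, N), _negate)(                 \
        sketch_glue(SketchParts, N) *p)                                 \
    {                                                                   \
        assert(p != NULL);                                              \
        p->sign = !p->sign;                                             \
    }

SKETCH_DEFINE_NEGATE(64)
SKETCH_DEFINE_NEGATE(128)

/* Select the instance from the static type of the pointer argument. */
#define sketch_parts_negate(P)                                          \
    _Generic((P),                                                       \
             SketchParts64 *: sketch_parts64_negate,                    \
             SketchParts128 *: sketch_parts128_negate)(P)
```

A caller then writes `sketch_parts_negate(&a)` for either type and the compiler resolves the call at compile time. Selecting on a pointer type also means the operands are modified in place rather than returned by value.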

I ran some basic benchmarks, which show no issues (although I
suspect hardfloat is hiding any true cost of the softfloat itself):

#+name: run-float-benchmarks
#+begin_src shell :results output :async
  ./fp-bench add -p single
  ./fp-bench add -p double
  ./fp-bench mul -p single
  ./fp-bench mul -p double
  ./fp-bench muladd -p single
  ./fp-bench muladd -p double
#+end_src

#+RESULTS: run-float-benchmarks-after
: 374.77 MFlops
: 287.58 MFlops
: 371.55 MFlops
: 281.48 MFlops
: 370.76 MFlops
: 287.39 MFlops

#+RESULTS: run-float-benchmarks-before
: 362.40 MFlops
: 278.65 MFlops
: 360.68 MFlops
: 280.92 MFlops
: 360.75 MFlops
: 280.76 MFlops
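
To put the two RESULTS blocks side by side, the relative change per benchmark can be computed as below. This is only a sketch: the helper names are made up for this note, and the arrays simply copy the figures above in the order the benchmarks were run (add, mul, muladd; single then double for each).

```c
#include <assert.h>
#include <stddef.h>

/* Figures copied from the before/after RESULTS blocks, in MFlops. */
static const double sketch_before[] = {
    362.40, 278.65, 360.68, 280.92, 360.75, 280.76
};
static const double sketch_after[] = {
    374.77, 287.58, 371.55, 281.48, 370.76, 287.39
};

/* Percentage change from before to after; positive means faster. */
static inline double sketch_rel_change_pct(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Mean of the per-benchmark percentage changes. */
static inline double sketch_mean_change_pct(void)
{
    const size_t n = sizeof(sketch_before) / sizeof(sketch_before[0]);
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        sum += sketch_rel_change_pct(sketch_before[i], sketch_after[i]);
    }
    return sum / (double)n;
}
```

All six changes come out positive and below 4%, so whether they are meaningful depends on how much run-to-run noise the benchmark has.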

I guess what would be really telling is if an ext80 benchmark exhibited
any slowdown.

>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.
>
>
> r~
>
>
> Alex Bennée (1):
>   tests/fp: add quad support to the benchmark utility
>
> Richard Henderson (71):
>   qemu/host-utils: Use __builtin_bitreverseN
>   qemu/host-utils: Add wrappers for overflow builtins
>   qemu/host-utils: Add wrappers for carry builtins
>   accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
>   softfloat: Move the binary point to the msb
>   softfloat: Inline float_raise
>   softfloat: Use float_raise in more places
>   softfloat: Tidy a * b + inf return
>   softfloat: Add float_cmask and constants
>   softfloat: Use return_nan in float_to_float
>   softfloat: fix return_nan vs default_nan_mode
>   target/mips: Set set_default_nan_mode with set_snan_bit_is_one
>   softfloat: Do not produce a default_nan from parts_silence_nan
>   softfloat: Rename FloatParts to FloatParts64
>   softfloat: Move type-specific pack/unpack routines
>   softfloat: Use pointers with parts_default_nan
>   softfloat: Use pointers with unpack_raw
>   softfloat: Use pointers with ftype_unpack_raw
>   softfloat: Use pointers with pack_raw
>   softfloat: Use pointers with ftype_pack_raw
>   softfloat: Use pointers with ftype_unpack_canonical
>   softfloat: Use pointers with ftype_round_pack_canonical
>   softfloat: Use pointers with parts_silence_nan
>   softfloat: Rearrange FloatParts64
>   softfloat: Convert float128_silence_nan to parts
>   softfloat: Convert float128_default_nan to parts
>   softfloat: Move return_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
>   softfloat: Move sf_canonicalize to softfloat-parts.c.inc
>   softfloat: Move round_canonical to softfloat-parts.c.inc
>   softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
>   softfloat: Move addsub_floats to softfloat-parts.c.inc
>   softfloat: Implement float128_add/sub via parts
>   softfloat: Move mul_floats to softfloat-parts.c.inc
>   softfloat: Move muladd_floats to softfloat-parts.c.inc
>   softfloat: Use mulu64 for mul64To128
>   softfloat: Use add192 in mul128To256
>   softfloat: Tidy mul128By64To192
>   softfloat: Introduce sh[lr]_double primitives
>   softfloat: Move div_floats to softfloat-parts.c.inc
>   softfloat: Split float_to_float
>   softfloat: Convert float-to-float conversions with float128
>   softfloat: Move round_to_int to softfloat-parts.c.inc
>   softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
>   softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
>   softfloat: Move int_to_float to softfloat-parts.c.inc
>   softfloat: Move uint_to_float to softfloat-parts.c.inc
>   softfloat: Move minmax_flags to softfloat-parts.c.inc
>   softfloat: Move compare_floats to softfloat-parts.c.inc
>   softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
>   softfloat: Move sqrt_float to softfloat-parts.c.inc
>   softfloat: Split out parts_uncanon_normal
>   softfloat: Reduce FloatFmt
>   softfloat: Introduce Floatx80RoundPrec
>   softfloat: Adjust parts_uncanon_normal for floatx80
>   tests/fp/fp-test: Reverse order of floatx80 precision tests
>   softfloat: Convert floatx80_add/sub to FloatParts
>   softfloat: Convert floatx80_mul to FloatParts
>   softfloat: Convert floatx80_div to FloatParts
>   softfloat: Convert floatx80_sqrt to FloatParts
>   softfloat: Convert floatx80_round to FloatParts
>   softfloat: Convert floatx80_round_to_int to FloatParts
>   softfloat: Convert integer to floatx80 to FloatParts
>   softfloat: Convert floatx80 float conversions to FloatParts
>   softfloat: Convert floatx80 to integer to FloatParts
>   softfloat: Convert floatx80_scalbn to FloatParts
>   softfloat: Convert floatx80 compare to FloatParts
>   softfloat: Convert float32_exp2 to FloatParts
>   softfloat: Move floatN_log2 to softfloat-parts.c.inc
>   softfloat: Convert modrem operations to FloatParts
>
>  include/fpu/softfloat-helpers.h  |    5 +-
>  include/fpu/softfloat-macros.h   |  247 +-
>  include/fpu/softfloat-types.h    |   10 +-
>  include/fpu/softfloat.h          |   11 +-
>  include/qemu/host-utils.h        |  291 ++
>  target/mips/fpu_helper.h         |   10 +-
>  accel/tcg/tcg-runtime-gvec.c     |   36 +-
>  fpu/softfloat.c                  | 7760 ++++++++++--------------------
>  linux-user/arm/nwfpe/fpa11.c     |   41 +-
>  target/i386/tcg/fpu_helper.c     |   79 +-
>  target/m68k/fpu_helper.c         |   50 +-
>  target/m68k/softfloat.c          |   90 +-
>  tests/fp/fp-bench.c              |   88 +-
>  tests/fp/fp-test-log2.c          |  118 +
>  tests/fp/fp-test.c               |   11 +-
>  fpu/softfloat-parts-addsub.c.inc |   62 +
>  fpu/softfloat-parts.c.inc        | 1480 ++++++
>  fpu/softfloat-specialize.c.inc   |  424 +-
>  tests/fp/wrap.c.inc              |   12 +
>  tests/fp/meson.build             |   11 +
>  20 files changed, 4886 insertions(+), 5950 deletions(-)
>  create mode 100644 tests/fp/fp-test-log2.c
>  create mode 100644 fpu/softfloat-parts-addsub.c.inc
>  create mode 100644 fpu/softfloat-parts.c.inc


-- 
Alex Bennée