[PATCH v3 0/4] do_div() with constant divisor simplification

Nicolas Pitre posted 4 patches 1 year, 7 months ago
Posted by Nicolas Pitre 1 year, 7 months ago
While working on mul_u64_u64_div_u64() improvements I realized that there
is a better way to perform a 64x64->128-bit multiplication with overflow
handling. This is not as lean as v1 of the series but still much better
than the existing code IMHO.
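
For readers not familiar with the problem, here is a minimal sketch of a
64x64->128-bit multiplication built from 32-bit pieces, with the carry out
of the middle partial sums handled explicitly. This is the textbook
approach, not the code from this series; the struct u128 type and the
mul_64x64_128() name are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical type and helper, for illustration only. */
struct u128 {
	uint64_t hi;
	uint64_t lo;
};

/* Full 128-bit product of two 64-bit values using 32x32->64 multiplies. */
static struct u128 mul_64x64_128(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t ll = a_lo * b_lo;
	uint64_t lh = a_lo * b_hi;
	uint64_t hl = a_hi * b_lo;
	uint64_t hh = a_hi * b_hi;

	/* lh + (ll >> 32) always fits in 64 bits ... */
	uint64_t mid = lh + (ll >> 32);
	/* ... but adding the second cross product may wrap around. */
	uint64_t mid2 = mid + hl;
	uint64_t carry = mid2 < mid;

	struct u128 r;

	r.lo = (mid2 << 32) | (uint32_t)ll;
	/* A wrap in the middle sum is worth 2^32 in the high word. */
	r.hi = hh + (mid2 >> 32) + (carry << 32);
	return r;
}
```

The carry test is the "overflow handling" part: dropping it is only safe
when the operands are known to keep the middle sum below 2^64.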

Change from v2:

- Fix last minute edit screw-up (missing one function return type).

Link to v2: https://lore.kernel.org/lkml/20240707171919.1951895-1-nico@fluxnic.net/

Changes from v1:

- Formalize condition for when overflow handling can be skipped.
- Make this condition apply only if it can be determined at compile time
  (beware of the compiler not always inlining code).
- Keep the ARM assembly but apply the above changes to it as well.
- Force __always_inline when optimizing for performance.
- Augment test_div64.c with important edge cases.

Link to v1: https://lore.kernel.org/lkml/20240705022334.1378363-1-nico@fluxnic.net/
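
To illustrate the compile-time points in the list above, here is a rough
sketch of a division helper that takes a fast path only when the divisor
is known to the compiler, and is forced inline so that this knowledge
actually reaches the test. It is not the code from this series: the names
div_u64_by_u32() and mul_hi64() are hypothetical, the reciprocal used is
a simple floor(UINT64_MAX / d) with one correction step rather than the
series' own scheme, and it assumes the GCC/Clang extensions
__builtin_constant_p, __builtin_ctz and the always_inline attribute.

```c
#include <stdint.h>

/* Hypothetical helper: high 64 bits of a 64x64->128-bit product. */
static inline uint64_t mul_hi64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t mid = a_lo * b_hi + ((a_lo * b_lo) >> 32); /* cannot wrap */
	uint64_t mid2 = mid + a_hi * b_lo;                  /* may wrap */
	uint64_t carry = mid2 < mid;

	return a_hi * b_hi + (mid2 >> 32) + (carry << 32);
}

/*
 * Hypothetical helper: n / d, with fast paths that are only taken when
 * d is a compile-time constant. Without inlining, d is an ordinary
 * parameter and __builtin_constant_p(d) evaluates to 0.
 */
static inline __attribute__((always_inline))
uint64_t div_u64_by_u32(uint64_t n, uint32_t d)
{
	/* Constant power of two: a plain shift. */
	if (__builtin_constant_p(d) && d != 0 && (d & (d - 1)) == 0)
		return n >> __builtin_ctz(d);

	/* Other constant divisors: multiply by a precomputed reciprocal. */
	if (__builtin_constant_p(d) && d != 0) {
		uint64_t m = UINT64_MAX / d;	/* folded at compile time */
		uint64_t q = mul_hi64(n, m);	/* estimate: q or q - 1 */

		if (n - q * d >= d)		/* at most one correction */
			q++;
		return q;
	}

	/* Divisor unknown at compile time: ordinary division. */
	return n / d;
}
```

All three paths return the same quotient, so a build where the compiler
declines to treat d as constant (e.g. without optimization) is slower but
still correct.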

The diffstat is:

 arch/arm/include/asm/div64.h |  13 +++-
 include/asm-generic/div64.h  | 121 ++++++++++++-----------------------
 lib/math/test_div64.c        |  85 +++++++++++++++++++++++-
 3 files changed, 134 insertions(+), 85 deletions(-)
Re: [PATCH v3 0/4] do_div() with constant divisor simplification
Posted by Nicolas Pitre 1 year, 5 months ago
Ping.


On Sun, 7 Jul 2024, Nicolas Pitre wrote:

> While working on mul_u64_u64_div_u64() improvements I realized that there
> is a better way to perform a 64x64->128-bit multiplication with overflow
> handling. This is not as lean as v1 of the series but still much better
> than the existing code IMHO.
> 
> Change from v2:
> 
> - Fix last minute edit screw-up (missing one function return type).
> 
> Link to v2: https://lore.kernel.org/lkml/20240707171919.1951895-1-nico@fluxnic.net/
> 
> Changes from v1:
> 
> - Formalize condition for when overflow handling can be skipped.
> - Make this condition apply only if it can be determined at compile time
>   (beware of the compiler not always inlining code).
> - Keep the ARM assembly but apply the above changes to it as well.
> - Force __always_inline when optimizing for performance.
> - Augment test_div64.c with important edge cases.
> 
> Link to v1: https://lore.kernel.org/lkml/20240705022334.1378363-1-nico@fluxnic.net/
> 
> The diffstat is:
> 
>  arch/arm/include/asm/div64.h |  13 +++-
>  include/asm-generic/div64.h  | 121 ++++++++++++-----------------------
>  lib/math/test_div64.c        |  85 +++++++++++++++++++++++-
>  3 files changed, 134 insertions(+), 85 deletions(-)
> 
>