[RFC PATCH for-8.2 00/18] crypto: Provide clmul.h and host accel

Richard Henderson posted 18 patches 10 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230713211435.13505-1-richard.henderson@linaro.org
Maintainers: "Daniel P. Berrangé" <berrange@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, Daniel Henrique Barboza <danielhb413@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>, Nicholas Piggin <npiggin@gmail.com>, David Hildenbrand <david@redhat.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Thomas Huth <thuth@redhat.com>
There is a newer version of this series
[RFC PATCH for-8.2 00/18] crypto: Provide clmul.h and host accel
Posted by Richard Henderson 10 months ago
Inspired by Ard Biesheuvel's RFC patches [1] for accelerating
carry-less multiply under emulation.

This is less polished than the AES patch set:

(1) Should I split HAVE_CLMUL_ACCEL into per-width HAVE_CLMUL{N}_ACCEL?
    The "_generic" and "_accel" split is different from aes-round.h
    because of the difference in support for different widths, and it
    means that each host accel has more boilerplate.

(2) Should I bother trying to accelerate anything other than 64x64->128?
    That seems to be the one that GCM really wants anyway.  I'd keep all
    of the sizes implemented generically, since that centralizes the 3
    target implementations.

(3) The use of Int128 isn't fantastic -- better would be a vector type,
    though that has its own special problems for ppc64le (see the
    endianness hoops within aes-round.h).  Perhaps leave things in
    env memory, like I was mostly able to do with AES?

(4) No guest test case(s).
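
For readers following questions (1) and (2), here is a minimal sketch of
what a portable 64x64->128 carry-less multiply with a per-width fallback
could look like.  It is not taken from the series: the result struct
stands in for Int128, and clmul64_generic_sketch, clmul64_accel_sketch
and HAVE_CLMUL64_ACCEL_SKETCH are hypothetical names chosen only for
illustration.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type; the series itself uses QEMU's Int128. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} clmul128_sketch;

/*
 * Portable 64x64->128 carry-less multiply.
 * For every set bit i of m, XOR (n << i) into the 128-bit accumulator.
 * XOR replaces addition, so no carries propagate between bit positions.
 */
static clmul128_sketch clmul64_generic_sketch(uint64_t n, uint64_t m)
{
    clmul128_sketch r = { 0, 0 };

    for (int i = 0; i < 64; i++) {
        if ((m >> i) & 1) {
            r.lo ^= n << i;
            /* Bits of n shifted past bit 63 land in the high half. */
            if (i != 0) {
                r.hi ^= n >> (64 - i);
            }
        }
    }
    return r;
}

/*
 * Hypothetical per-width switch, along the lines of question (1).
 * A host header would define HAVE_CLMUL64_ACCEL_SKETCH and supply
 * clmul64_accel_sketch(); otherwise the generic routine is used.
 */
static inline clmul128_sketch clmul64_sketch(uint64_t n, uint64_t m)
{
#ifdef HAVE_CLMUL64_ACCEL_SKETCH
    return clmul64_accel_sketch(n, m);
#else
    return clmul64_generic_sketch(n, m);
#endif
}
```

With only the generic path compiled in, clmul64_sketch(3, 3) yields
lo == 5 and hi == 0, since (x + 1)^2 = x^2 + 1 over GF(2).  A per-width
macro like this keeps hosts that accelerate only the 64-bit case from
having to provide wrappers for the narrower widths.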


r~


[1] https://patchew.org/QEMU/20230601123332.3297404-1-ardb@kernel.org/

Richard Henderson (18):
  crypto: Add generic 8-bit carry-less multiply routines
  target/arm: Use clmul_8* routines
  target/s390x: Use clmul_8* routines
  target/ppc: Use clmul_8* routines
  crypto: Add generic 16-bit carry-less multiply routines
  target/arm: Use clmul_16* routines
  target/s390x: Use clmul_16* routines
  target/ppc: Use clmul_16* routines
  crypto: Add generic 32-bit carry-less multiply routines
  target/arm: Use clmul_32* routines
  target/s390x: Use clmul_32* routines
  target/ppc: Use clmul_32* routines
  crypto: Add generic 64-bit carry-less multiply routine
  target/arm: Use clmul_64
  target/s390x: Use clmul_64
  target/ppc: Use clmul_64
  host/include/i386: Implement clmul.h
  host/include/aarch64: Implement clmul.h
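
As a rough illustration of what a generic narrow-width routine might do,
the sketch below computes eight independent 8x8 carry-less products at
once, keeping the low 8 bits of each.  It assumes the byte lanes are
packed into a single 64-bit word, which the patch titles above do not
state; the function name is hypothetical and is not one of the clmul_8*
routines from the series.

```c
#include <stdint.h>

/*
 * Eight independent 8x8->8 (low half) carry-less products, one per
 * byte lane of a 64-bit word.  Hypothetical helper for illustration.
 */
static uint64_t clmul8_lanes_low_sketch(uint64_t n, uint64_t m)
{
    const uint64_t ones = 0x0101010101010101ull;
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        /* 0xff in every lane whose multiplier byte has bit i set. */
        uint64_t select = ((m >> i) & ones) * 0xff;
        /* Bits i..7 of every lane: drops bits shifted in from the lane below. */
        uint64_t keep = ((0xffu << i) & 0xff) * ones;

        r ^= (n << i) & keep & select;
    }
    return r;
}
```

For example, with 0x03 in every lane of both operands, every lane of the
result is 0x05.  Working on whole words like this lets one loop serve
all lanes, which is the kind of sharing across the three targets that
the generic routines are meant to centralize.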

 host/include/aarch64/host/cpuinfo.h      |   1 +
 host/include/aarch64/host/crypto/clmul.h | 230 +++++++++++++++++++++++
 host/include/generic/host/crypto/clmul.h |  28 +++
 host/include/i386/host/cpuinfo.h         |   1 +
 host/include/i386/host/crypto/clmul.h    | 187 ++++++++++++++++++
 host/include/x86_64/host/crypto/clmul.h  |   1 +
 include/crypto/clmul.h                   | 123 ++++++++++++
 target/arm/tcg/vec_internal.h            |  11 --
 crypto/clmul.c                           | 163 ++++++++++++++++
 target/arm/tcg/mve_helper.c              |  16 +-
 target/arm/tcg/vec_helper.c              | 112 ++---------
 target/ppc/int_helper.c                  |  63 +++----
 target/s390x/tcg/vec_int_helper.c        | 175 +++++++----------
 util/cpuinfo-aarch64.c                   |   4 +-
 util/cpuinfo-i386.c                      |   1 +
 crypto/meson.build                       |   9 +-
 16 files changed, 865 insertions(+), 260 deletions(-)
 create mode 100644 host/include/aarch64/host/crypto/clmul.h
 create mode 100644 host/include/generic/host/crypto/clmul.h
 create mode 100644 host/include/i386/host/crypto/clmul.h
 create mode 100644 host/include/x86_64/host/crypto/clmul.h
 create mode 100644 include/crypto/clmul.h
 create mode 100644 crypto/clmul.c

-- 
2.34.1
Re: [RFC PATCH for-8.2 00/18] crypto: Provide clmul.h and host accel
Posted by Ard Biesheuvel 9 months, 2 weeks ago
On Thu, 13 Jul 2023 at 23:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Inspired by Ard Biesheuvel's RFC patches [1] for accelerating
> carry-less multiply under emulation.
>
> This is less polished than the AES patch set:
>
> (1) Should I split HAVE_CLMUL_ACCEL into per-width HAVE_CLMUL{N}_ACCEL?
>     The "_generic" and "_accel" split is different from aes-round.h
>     because of the difference in support for different widths, and it
>     means that each host accel has more boilerplate.
>
> (2) Should I bother trying to accelerate anything other than 64x64->128?

That is the only compelling use case afaict.

>     That seems to be the one that GCM really wants anyway.  I'd keep all
>     of the sizes implemented generically, since that centralizes the 3
>     target implementations.
>
> (3) The use of Int128 isn't fantastic -- better would be a vector type,
>     though that has its own special problems for ppc64le (see the
>     endianness hoops within aes-round.h).  Perhaps leave things in
>     env memory, like I was mostly able to do with AES?
>
> (4) No guest test case(s).
>
>
> r~
>
>
> [1] https://patchew.org/QEMU/20230601123332.3297404-1-ardb@kernel.org/