From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, "Jason A. Donenfeld",
    Herbert Xu, linux-arm-kernel@lists.infradead.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers
Subject: [PATCH 01/19] lib/crypto: gf128hash: Rename polyval module to gf128hash
Date: Wed, 18 Mar 2026 23:17:02 -0700
Message-ID: <20260319061723.1140720-2-ebiggers@kernel.org>
In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org>
References: <20260319061723.1140720-1-ebiggers@kernel.org>

Currently, the standalone GHASH code is coupled with crypto_shash.  This
has resulted in unnecessary complexity and overhead, and it has left the
code unavailable to library code such as the AES-GCM library.  As was done
with POLYVAL, it needs to find a new home in lib/crypto/.

GHASH and POLYVAL are closely related and can each be implemented in terms
of the other, so optimized code for one can be reused for the other.
Moreover, since GHASH tends to be difficult to implement directly due to
its unnatural bit order, most modern GHASH implementations (including the
existing arm, arm64, powerpc, and x86 optimized GHASH code, and the new
generic GHASH code I'll be adding) actually reinterpret the GHASH
computation as an equivalent POLYVAL computation, pre- and post-processing
the inputs and outputs to map to/from POLYVAL.  Given this close
relationship, it makes sense to group the GHASH and POLYVAL code together
in the same module.
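[Editor's illustration, not part of the patch: the reinterpretation described
above can be sketched in pure Python.  A GHASH computation equals a POLYVAL
computation over the byte-reversed blocks, with the key mapped by a byte
reversal plus one multiplication by x, per RFC 8452 Appendix A.  This is a
bit-by-bit reference model with made-up function names, not the kernel code.]

```python
# GHASH <-> POLYVAL relationship (RFC 8452, Appendix A), modeled with
# Python big integers.  Both hashes work in GF(2^128), with mirrored
# bit orders and mirrored reduction polynomials.

GHASH_R = 0xE1 << 120          # x^128 + x^7 + x^2 + x + 1, bit-reflected
POLYVAL_P = (1 << 128) | (1 << 127) | (1 << 126) | (1 << 121) | 1

def ghash_mul(x, y):
    """GCM's bit-reflected multiply; ints come from big-endian bytes."""
    z, v = 0, x
    for i in range(127, -1, -1):
        if (y >> i) & 1:
            z ^= v
        v = (v >> 1) ^ GHASH_R if v & 1 else v >> 1
    return z

def polyval_mulx(a):
    """Multiply by x in POLYVAL's field (little-endian ints)."""
    a <<= 1
    return a ^ POLYVAL_P if a >> 128 else a

def polyval_dot(a, b):
    """POLYVAL's dot operation: a * b * x^-128."""
    p = 0
    for i in range(128):             # carryless multiply
        if (b >> i) & 1:
            p ^= a << i
    for i in range(254, 127, -1):    # reduce mod POLYVAL_P
        if (p >> i) & 1:
            p ^= POLYVAL_P << (i - 128)
    for _ in range(128):             # divide by x, 128 times
        if p & 1:
            p ^= POLYVAL_P
        p >>= 1
    return p

def ghash(h, msg):
    y, h = 0, int.from_bytes(h, 'big')
    for i in range(0, len(msg), 16):
        y = ghash_mul(y ^ int.from_bytes(msg[i:i+16], 'big'), h)
    return y.to_bytes(16, 'big')

def ghash_via_polyval(h, msg):
    # ByteReverse(X) read as a little-endian int equals X read as a
    # big-endian int, so the byte reversals reduce to byte-order swaps.
    hp = polyval_mulx(int.from_bytes(h, 'big'))
    s = 0
    for i in range(0, len(msg), 16):
        s = polyval_dot(s ^ int.from_bytes(msg[i:i+16], 'big'), hp)
    return s.to_bytes(16, 'big')

h, msg = bytes(range(1, 17)), bytes(range(32))
assert ghash(h, msg) == ghash_via_polyval(h, msg)
```

The mapped key picks up one extra factor of x because bit reversal turns the
x^-128 hidden in POLYVAL's dot operation into an x^-1 per multiplication.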
This gives us a wide range of options for implementing them, reusing code
between the two and properly utilizing whatever instructions each
architecture provides.

Thus, GHASH support will be added to the library module that is currently
called "polyval".  Rename it to an appropriate name: "gf128hash".  Rename
files, options, functions, etc. where appropriate to reflect the upcoming
sharing with GHASH.  (Note: polyval_kunit is not renamed, as ghash_kunit
will be added alongside it instead.)

Signed-off-by: Eric Biggers
Acked-by: Ard Biesheuvel
---
 crypto/Kconfig                              |  2 +-
 crypto/hctr2.c                              |  2 +-
 include/crypto/{polyval.h => gf128hash.h}   | 16 ++++++-------
 lib/crypto/Kconfig                          | 24 +++++++++----------
 lib/crypto/Makefile                         | 20 ++++++++--------
 lib/crypto/arm64/{polyval.h => gf128hash.h} |  4 ++--
 lib/crypto/{polyval.c => gf128hash.c}       | 26 ++++++++++-----------
 lib/crypto/tests/Kconfig                    |  4 ++--
 lib/crypto/tests/polyval_kunit.c            |  2 +-
 lib/crypto/x86/{polyval.h => gf128hash.h}   |  4 ++--
 10 files changed, 52 insertions(+), 52 deletions(-)
 rename include/crypto/{polyval.h => gf128hash.h} (94%)
 rename lib/crypto/arm64/{polyval.h => gf128hash.h} (95%)
 rename lib/crypto/{polyval.c => gf128hash.c} (94%)
 rename lib/crypto/x86/{polyval.h => gf128hash.h} (95%)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index b8608ef6823b..5627b3691561 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -684,11 +684,11 @@ config CRYPTO_ECB
	  ECB (Electronic Codebook) mode (NIST SP800-38A)
 
 config CRYPTO_HCTR2
	tristate "HCTR2"
	select CRYPTO_XCTR
-	select CRYPTO_LIB_POLYVAL
+	select CRYPTO_LIB_GF128HASH
	select CRYPTO_MANAGER
	help
	  HCTR2 length-preserving encryption mode
 
	  A mode for storage encryption that is efficient on processors with
diff --git a/crypto/hctr2.c b/crypto/hctr2.c
index f4cd6c29b4d3..ad5edf9366ac 100644
--- a/crypto/hctr2.c
+++ b/crypto/hctr2.c
@@ -14,13 +14,13 @@
 *
 * For more details, see the paper: "Length-preserving encryption with HCTR2"
 *
(https://eprint.iacr.org/2021/1441.pdf) */
 
+#include <crypto/gf128hash.h>
 #include
 #include
-#include <crypto/polyval.h>
 #include
 #include
 
 #define BLOCKCIPHER_BLOCK_SIZE	16
 
diff --git a/include/crypto/polyval.h b/include/crypto/gf128hash.h
similarity index 94%
rename from include/crypto/polyval.h
rename to include/crypto/gf128hash.h
index b28b8ef11353..5ffa86f5c13f 100644
--- a/include/crypto/polyval.h
+++ b/include/crypto/gf128hash.h
@@ -1,14 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /*
- * POLYVAL library API
+ * GF(2^128) polynomial hashing: GHASH and POLYVAL
 *
 * Copyright 2025 Google LLC
 */
 
-#ifndef _CRYPTO_POLYVAL_H
-#define _CRYPTO_POLYVAL_H
+#ifndef _CRYPTO_GF128HASH_H
+#define _CRYPTO_GF128HASH_H
 
 #include
 #include
 
 #define POLYVAL_BLOCK_SIZE	16
@@ -42,24 +42,24 @@ struct polyval_elem {
 *
 * By H^i we mean H^(i-1) * H * x^-128, with base case H^1 = H.  I.e. the
 * exponentiation repeats the POLYVAL dot operation, with its "extra" x^-128.
 */
 struct polyval_key {
-#ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH
+#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
 #ifdef CONFIG_ARM64
	/** @h_powers: Powers of the hash key H^8 through H^1 */
	struct polyval_elem h_powers[8];
 #elif defined(CONFIG_X86)
	/** @h_powers: Powers of the hash key H^8 through H^1 */
	struct polyval_elem h_powers[8];
 #else
 #error "Unhandled arch"
 #endif
-#else /* CONFIG_CRYPTO_LIB_POLYVAL_ARCH */
+#else /* CONFIG_CRYPTO_LIB_GF128HASH_ARCH */
	/** @h: The hash key H */
	struct polyval_elem h;
-#endif /* !CONFIG_CRYPTO_LIB_POLYVAL_ARCH */
+#endif /* !CONFIG_CRYPTO_LIB_GF128HASH_ARCH */
 };
 
 /**
 * struct polyval_ctx - Context for computing a POLYVAL value
 * @key: Pointer to the prepared POLYVAL key.  The user of the API is
@@ -82,11 +82,11 @@ struct polyval_ctx {
 * copy, or it may involve precomputing powers of the key, depending on the
 * platform's POLYVAL implementation.
 *
 * Context: Any context.
 */
-#ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH
+#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
 void polyval_preparekey(struct polyval_key *key,
			const u8 raw_key[POLYVAL_BLOCK_SIZE]);
 
 #else
 static inline void polyval_preparekey(struct polyval_key *key,
@@ -185,6 +185,6 @@ static inline void polyval(const struct polyval_key *key,
	polyval_init(&ctx, key);
	polyval_update(&ctx, data, len);
	polyval_final(&ctx, out);
 }
 
-#endif /* _CRYPTO_POLYVAL_H */
+#endif /* _CRYPTO_GF128HASH_H */
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 4910fe20e42a..98cedd95c2a5 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -108,10 +108,22 @@ config CRYPTO_LIB_CURVE25519_GENERIC
	default y if !CRYPTO_LIB_CURVE25519_ARCH || ARM || X86_64
 
 config CRYPTO_LIB_DES
	tristate
 
+config CRYPTO_LIB_GF128HASH
+	tristate
+	help
+	  The GHASH and POLYVAL library functions.  Select this if your module
+	  uses any of the functions from <crypto/gf128hash.h>.
+
+config CRYPTO_LIB_GF128HASH_ARCH
+	bool
+	depends on CRYPTO_LIB_GF128HASH && !UML
+	default y if ARM64
+	default y if X86_64
+
 config CRYPTO_LIB_MD5
	tristate
	help
	  The MD5 and HMAC-MD5 library functions.  Select this if your module
	  uses any of the functions from <crypto/md5.h>.
@@ -176,22 +188,10 @@ config CRYPTO_LIB_POLY1305_RSIZE
	default 2 if MIPS || RISCV
	default 11 if X86_64
	default 9 if ARM || ARM64
	default 1
 
-config CRYPTO_LIB_POLYVAL
-	tristate
-	help
-	  The POLYVAL library functions.  Select this if your module uses any of
-	  the functions from <crypto/polyval.h>.
-
-config CRYPTO_LIB_POLYVAL_ARCH
-	bool
-	depends on CRYPTO_LIB_POLYVAL && !UML
-	default y if ARM64
-	default y if X86_64
-
 config CRYPTO_LIB_CHACHA20POLY1305
	tristate
	select CRYPTO_LIB_CHACHA
	select CRYPTO_LIB_POLY1305
	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index a961615c8c7f..fc30622123d2 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -152,10 +152,20 @@ endif
 obj-$(CONFIG_CRYPTO_LIB_DES) += libdes.o
 libdes-y := des.o
 
 ################################################################################
 
+obj-$(CONFIG_CRYPTO_LIB_GF128HASH) += libgf128hash.o
+libgf128hash-y := gf128hash.o
+ifeq ($(CONFIG_CRYPTO_LIB_GF128HASH_ARCH),y)
+CFLAGS_gf128hash.o += -I$(src)/$(SRCARCH)
+libgf128hash-$(CONFIG_ARM64) += arm64/polyval-ce-core.o
+libgf128hash-$(CONFIG_X86) += x86/polyval-pclmul-avx.o
+endif
+
+################################################################################
+
 obj-$(CONFIG_CRYPTO_LIB_MD5) += libmd5.o
 libmd5-y := md5.o
 ifeq ($(CONFIG_CRYPTO_LIB_MD5_ARCH),y)
 CFLAGS_md5.o += -I$(src)/$(SRCARCH)
 libmd5-$(CONFIG_PPC) += powerpc/md5-asm.o
@@ -249,20 +259,10 @@ clean-files += arm/poly1305-core.S \
	       riscv/poly1305-core.S \
	       x86/poly1305-x86_64-cryptogams.S
 
 ################################################################################
 
-obj-$(CONFIG_CRYPTO_LIB_POLYVAL) += libpolyval.o
-libpolyval-y := polyval.o
-ifeq ($(CONFIG_CRYPTO_LIB_POLYVAL_ARCH),y)
-CFLAGS_polyval.o += -I$(src)/$(SRCARCH)
-libpolyval-$(CONFIG_ARM64) += arm64/polyval-ce-core.o
-libpolyval-$(CONFIG_X86) += x86/polyval-pclmul-avx.o
-endif
-
-################################################################################
-
 obj-$(CONFIG_CRYPTO_LIB_SHA1) += libsha1.o
 libsha1-y := sha1.o
 ifeq ($(CONFIG_CRYPTO_LIB_SHA1_ARCH),y)
 CFLAGS_sha1.o += -I$(src)/$(SRCARCH)
 ifeq ($(CONFIG_ARM),y)
diff --git a/lib/crypto/arm64/polyval.h b/lib/crypto/arm64/gf128hash.h
similarity index 95%
rename from lib/crypto/arm64/polyval.h
rename to lib/crypto/arm64/gf128hash.h
index a39763395e9b..c1012007adcf 100644
--- a/lib/crypto/arm64/polyval.h
+++ b/lib/crypto/arm64/gf128hash.h
@@ -70,11 +70,11 @@ static void polyval_blocks_arch(struct polyval_elem *acc,
		polyval_blocks_generic(acc, &key->h_powers[NUM_H_POWERS - 1],
				       data, nblocks);
	}
 }
 
-#define polyval_mod_init_arch polyval_mod_init_arch
-static void polyval_mod_init_arch(void)
+#define gf128hash_mod_init_arch gf128hash_mod_init_arch
+static void gf128hash_mod_init_arch(void)
 {
	if (cpu_have_named_feature(PMULL))
		static_branch_enable(&have_pmull);
 }
diff --git a/lib/crypto/polyval.c b/lib/crypto/gf128hash.c
similarity index 94%
rename from lib/crypto/polyval.c
rename to lib/crypto/gf128hash.c
index 5796275f574a..8bb848bf26b7 100644
--- a/lib/crypto/polyval.c
+++ b/lib/crypto/gf128hash.c
@@ -1,13 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * POLYVAL library functions
+ * GF(2^128) polynomial hashing: GHASH and POLYVAL
 *
 * Copyright 2025 Google LLC
 */
 
-#include <crypto/polyval.h>
+#include <crypto/gf128hash.h>
 #include
 #include
 #include
 #include
 
@@ -216,12 +216,12 @@ polyval_blocks_generic(struct polyval_elem *acc, const struct polyval_elem *key,
		data += POLYVAL_BLOCK_SIZE;
	} while (--nblocks);
 }
 
 /* Include the arch-optimized implementation of POLYVAL, if one is available. */
-#ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH
-#include "polyval.h" /* $(SRCARCH)/polyval.h */
+#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
+#include "gf128hash.h" /* $(SRCARCH)/gf128hash.h */
 void polyval_preparekey(struct polyval_key *key,
			const u8 raw_key[POLYVAL_BLOCK_SIZE])
 {
	polyval_preparekey_arch(key, raw_key);
 }
@@ -236,21 +236,21 @@ EXPORT_SYMBOL_GPL(polyval_preparekey);
 * code is needed to pass the appropriate key argument.
 */
 
 static void polyval_mul(struct polyval_ctx *ctx)
 {
-#ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH
+#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
	polyval_mul_arch(&ctx->acc, ctx->key);
 #else
	polyval_mul_generic(&ctx->acc, &ctx->key->h);
 #endif
 }
 
 static void polyval_blocks(struct polyval_ctx *ctx, const u8 *data,
			   size_t nblocks)
 {
-#ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH
+#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
	polyval_blocks_arch(&ctx->acc, ctx->key, data, nblocks);
 #else
	polyval_blocks_generic(&ctx->acc, &ctx->key->h, data, nblocks);
 #endif
 }
@@ -287,21 +287,21 @@ void polyval_final(struct polyval_ctx *ctx, u8 out[POLYVAL_BLOCK_SIZE])
	memcpy(out, &ctx->acc, POLYVAL_BLOCK_SIZE);
	memzero_explicit(ctx, sizeof(*ctx));
 }
 EXPORT_SYMBOL_GPL(polyval_final);
 
-#ifdef polyval_mod_init_arch
-static int __init polyval_mod_init(void)
+#ifdef gf128hash_mod_init_arch
+static int __init gf128hash_mod_init(void)
 {
-	polyval_mod_init_arch();
+	gf128hash_mod_init_arch();
	return 0;
 }
-subsys_initcall(polyval_mod_init);
+subsys_initcall(gf128hash_mod_init);
 
-static void __exit polyval_mod_exit(void)
+static void __exit gf128hash_mod_exit(void)
 {
 }
-module_exit(polyval_mod_exit);
+module_exit(gf128hash_mod_exit);
 #endif
 
-MODULE_DESCRIPTION("POLYVAL almost-XOR-universal hash function");
+MODULE_DESCRIPTION("GF(2^128) polynomial hashing: GHASH and POLYVAL");
 MODULE_LICENSE("GPL");
diff --git a/lib/crypto/tests/Kconfig b/lib/crypto/tests/Kconfig
index 42e1770e1883..aa627b6b9855 100644
--- a/lib/crypto/tests/Kconfig
+++ b/lib/crypto/tests/Kconfig
@@ -67,11 +67,11 @@ config CRYPTO_LIB_POLY1305_KUNIT_TEST
	help
	  KUnit tests for the Poly1305 library functions.
 
 config CRYPTO_LIB_POLYVAL_KUNIT_TEST
	tristate "KUnit tests for POLYVAL" if !KUNIT_ALL_TESTS
-	depends on KUNIT && CRYPTO_LIB_POLYVAL
+	depends on KUNIT && CRYPTO_LIB_GF128HASH
	default KUNIT_ALL_TESTS
	select CRYPTO_LIB_BENCHMARK_VISIBLE
	help
	  KUnit tests for the POLYVAL library functions.
 
@@ -120,15 +120,15 @@ config CRYPTO_LIB_ENABLE_ALL_FOR_KUNIT
	tristate "Enable all crypto library code for KUnit tests"
	depends on KUNIT
	select CRYPTO_LIB_AES_CBC_MACS
	select CRYPTO_LIB_BLAKE2B
	select CRYPTO_LIB_CURVE25519
+	select CRYPTO_LIB_GF128HASH
	select CRYPTO_LIB_MD5
	select CRYPTO_LIB_MLDSA
	select CRYPTO_LIB_NH
	select CRYPTO_LIB_POLY1305
-	select CRYPTO_LIB_POLYVAL
	select CRYPTO_LIB_SHA1
	select CRYPTO_LIB_SHA256
	select CRYPTO_LIB_SHA512
	select CRYPTO_LIB_SHA3
	help
diff --git a/lib/crypto/tests/polyval_kunit.c b/lib/crypto/tests/polyval_kunit.c
index f47f41a39a41..d1f53a690ab8 100644
--- a/lib/crypto/tests/polyval_kunit.c
+++ b/lib/crypto/tests/polyval_kunit.c
@@ -1,10 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
 * Copyright 2025 Google LLC
 */
-#include <crypto/polyval.h>
+#include <crypto/gf128hash.h>
 #include "polyval-testvecs.h"
 
 /*
 * A fixed key used when presenting POLYVAL as an unkeyed hash function in order
 * to reuse hash-test-template.h.  At the beginning of the test suite, this is
diff --git a/lib/crypto/x86/polyval.h b/lib/crypto/x86/gf128hash.h
similarity index 95%
rename from lib/crypto/x86/polyval.h
rename to lib/crypto/x86/gf128hash.h
index ef8797521420..fe506cf6431b 100644
--- a/lib/crypto/x86/polyval.h
+++ b/lib/crypto/x86/gf128hash.h
@@ -72,12 +72,12 @@ static void polyval_blocks_arch(struct polyval_elem *acc,
		polyval_blocks_generic(acc, &key->h_powers[NUM_H_POWERS - 1],
				       data, nblocks);
	}
 }
 
-#define polyval_mod_init_arch polyval_mod_init_arch
-static void polyval_mod_init_arch(void)
+#define gf128hash_mod_init_arch gf128hash_mod_init_arch
+static void gf128hash_mod_init_arch(void)
 {
	if (boot_cpu_has(X86_FEATURE_PCLMULQDQ) && boot_cpu_has(X86_FEATURE_AVX))
		static_branch_enable(&have_pclmul_avx);
 }
-- 
2.53.0

From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, "Jason A. Donenfeld",
    Herbert Xu, linux-arm-kernel@lists.infradead.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers
Subject: [PATCH 02/19] lib/crypto: gf128hash: Support GF128HASH_ARCH without all POLYVAL functions
Date: Wed, 18 Mar 2026 23:17:03 -0700
Message-ID: <20260319061723.1140720-3-ebiggers@kernel.org>
In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org>
References: <20260319061723.1140720-1-ebiggers@kernel.org>

Currently, some architectures (arm64 and x86) have optimized code for both
GHASH and POLYVAL.  Others (arm, powerpc, riscv, and s390) have optimized
code only for GHASH.  While POLYVAL support could be implemented on these
other architectures, until then we need to support the case where
arch-optimized functions are present only for GHASH.

Therefore, update the support for arch-optimized POLYVAL functions to allow
architectures to opt into supporting these functions individually.  The new
meaning of CONFIG_CRYPTO_LIB_GF128HASH_ARCH is that some level of GHASH
and/or POLYVAL acceleration is provided.

Also provide an implementation of polyval_mul() based on
polyval_blocks_arch(), for when polyval_mul_arch() isn't implemented.
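[Editor's illustration, not part of the patch: the fallback described in the
last paragraph works because hashing a single all-zero block degenerates to
one multiplication by the key.  The per-block step is acc = dot(acc ^ block,
H), and XOR with zero is a no-op.  A toy Python model using GF(2^8) as a
stand-in field, with made-up names, shows the idea.]

```python
def gmul(a, b):
    # Multiplication in GF(2^8) with the AES polynomial (0x11b), as a
    # small stand-in for the GF(2^128) POLYVAL "dot" operation.
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return p

def blocks(acc, h, data):
    # Horner evaluation, one "block" (here: one byte) at a time.
    for x in data:
        acc = gmul(acc ^ x, h)
    return acc

# Processing a single zero block is exactly one multiply of the
# accumulator by the key -- which is how polyval_mul() can be built
# on top of a blocks() routine.
acc, h = 0x57, 0x83
assert blocks(acc, h, [0]) == gmul(acc, h)
```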
Signed-off-by: Eric Biggers
Acked-by: Ard Biesheuvel
---
 include/crypto/gf128hash.h   | 22 +++-------------------
 lib/crypto/arm64/gf128hash.h |  3 +++
 lib/crypto/gf128hash.c       | 16 ++++++++++++----
 lib/crypto/x86/gf128hash.h   |  3 +++
 4 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h
index 5ffa86f5c13f..1052041e3499 100644
--- a/include/crypto/gf128hash.h
+++ b/include/crypto/gf128hash.h
@@ -42,24 +42,18 @@ struct polyval_elem {
 *
 * By H^i we mean H^(i-1) * H * x^-128, with base case H^1 = H.  I.e. the
 * exponentiation repeats the POLYVAL dot operation, with its "extra" x^-128.
 */
 struct polyval_key {
-#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
-#ifdef CONFIG_ARM64
-	/** @h_powers: Powers of the hash key H^8 through H^1 */
-	struct polyval_elem h_powers[8];
-#elif defined(CONFIG_X86)
+#if defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && \
+	(defined(CONFIG_ARM64) || defined(CONFIG_X86))
	/** @h_powers: Powers of the hash key H^8 through H^1 */
	struct polyval_elem h_powers[8];
 #else
-#error "Unhandled arch"
-#endif
-#else /* CONFIG_CRYPTO_LIB_GF128HASH_ARCH */
	/** @h: The hash key H */
	struct polyval_elem h;
-#endif /* !CONFIG_CRYPTO_LIB_GF128HASH_ARCH */
+#endif
 };
 
 /**
 * struct polyval_ctx - Context for computing a POLYVAL value
 * @key: Pointer to the prepared POLYVAL key.  The user of the API is
@@ -82,23 +76,13 @@ struct polyval_ctx {
 * copy, or it may involve precomputing powers of the key, depending on the
 * platform's POLYVAL implementation.
 *
 * Context: Any context.
 */
-#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
 void polyval_preparekey(struct polyval_key *key,
			const u8 raw_key[POLYVAL_BLOCK_SIZE]);
 
-#else
-static inline void polyval_preparekey(struct polyval_key *key,
-				      const u8 raw_key[POLYVAL_BLOCK_SIZE])
-{
-	/* Just a simple copy, so inline it. */
-	memcpy(key->h.bytes, raw_key, POLYVAL_BLOCK_SIZE);
-}
-#endif
-
 /**
 * polyval_init() - Initialize a POLYVAL context for a new message
 * @ctx: The context to initialize
 * @key: The key to use.  Note that a pointer to the key is saved in the
 *	 context, so the key must live at least as long as the context.
diff --git a/lib/crypto/arm64/gf128hash.h b/lib/crypto/arm64/gf128hash.h
index c1012007adcf..796c36804dda 100644
--- a/lib/crypto/arm64/gf128hash.h
+++ b/lib/crypto/arm64/gf128hash.h
@@ -15,10 +15,11 @@ asmlinkage void polyval_mul_pmull(struct polyval_elem *a,
				  const struct polyval_elem *b);
 asmlinkage void polyval_blocks_pmull(struct polyval_elem *acc,
				     const struct polyval_key *key,
				     const u8 *data, size_t nblocks);
 
+#define polyval_preparekey_arch polyval_preparekey_arch
 static void polyval_preparekey_arch(struct polyval_key *key,
				    const u8 raw_key[POLYVAL_BLOCK_SIZE])
 {
	static_assert(ARRAY_SIZE(key->h_powers) == NUM_H_POWERS);
	memcpy(&key->h_powers[NUM_H_POWERS - 1], raw_key, POLYVAL_BLOCK_SIZE);
@@ -38,10 +39,11 @@ static void polyval_preparekey_arch(struct polyval_key *key,
					    &key->h_powers[NUM_H_POWERS - 1]);
		}
	}
 }
 
+#define polyval_mul_arch polyval_mul_arch
 static void polyval_mul_arch(struct polyval_elem *acc,
			     const struct polyval_key *key)
 {
	if (static_branch_likely(&have_pmull) && may_use_simd()) {
		scoped_ksimd()
@@ -49,10 +51,11 @@ static void polyval_mul_arch(struct polyval_elem *acc,
	} else {
		polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]);
	}
 }
 
+#define polyval_blocks_arch polyval_blocks_arch
 static void polyval_blocks_arch(struct polyval_elem *acc,
				const struct polyval_key *key,
				const u8 *data, size_t nblocks)
 {
	if (static_branch_likely(&have_pmull) && may_use_simd()) {
diff --git a/lib/crypto/gf128hash.c b/lib/crypto/gf128hash.c
index 8bb848bf26b7..05f44a9193f7 100644
--- a/lib/crypto/gf128hash.c
+++ b/lib/crypto/gf128hash.c
@@ -215,20 +215,24 @@ polyval_blocks_generic(struct polyval_elem *acc, const struct polyval_elem *key,
		polyval_mul_generic(acc, key);
		data += POLYVAL_BLOCK_SIZE;
	} while (--nblocks);
 }
 
-/* Include the arch-optimized implementation of POLYVAL, if one is available. */
 #ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
 #include "gf128hash.h" /* $(SRCARCH)/gf128hash.h */
+#endif
+
 void polyval_preparekey(struct polyval_key *key,
			const u8 raw_key[POLYVAL_BLOCK_SIZE])
 {
+#ifdef polyval_preparekey_arch
	polyval_preparekey_arch(key, raw_key);
+#else
+	memcpy(key->h.bytes, raw_key, POLYVAL_BLOCK_SIZE);
+#endif
 }
 EXPORT_SYMBOL_GPL(polyval_preparekey);
-#endif /* Else, polyval_preparekey() is an inline function. */
 
 /*
 * polyval_mul_generic() and polyval_blocks_generic() take the key as a
 * polyval_elem rather than a polyval_key, so that arch-optimized
 * implementations with a different key format can use it as a fallback (if they
@@ -236,21 +240,25 @@ EXPORT_SYMBOL_GPL(polyval_preparekey);
 * code is needed to pass the appropriate key argument.
 */
 
 static void polyval_mul(struct polyval_ctx *ctx)
 {
-#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
+#ifdef polyval_mul_arch
	polyval_mul_arch(&ctx->acc, ctx->key);
+#elif defined(polyval_blocks_arch)
+	static const u8 zeroes[POLYVAL_BLOCK_SIZE];
+
+	polyval_blocks_arch(&ctx->acc, ctx->key, zeroes, 1);
 #else
	polyval_mul_generic(&ctx->acc, &ctx->key->h);
 #endif
 }
 
 static void polyval_blocks(struct polyval_ctx *ctx, const u8 *data,
			   size_t nblocks)
 {
-#ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH
+#ifdef polyval_blocks_arch
	polyval_blocks_arch(&ctx->acc, ctx->key, data, nblocks);
 #else
	polyval_blocks_generic(&ctx->acc, &ctx->key->h, data, nblocks);
 #endif
 }
diff --git a/lib/crypto/x86/gf128hash.h b/lib/crypto/x86/gf128hash.h
index fe506cf6431b..adf6147ea677 100644
--- a/lib/crypto/x86/gf128hash.h
+++ b/lib/crypto/x86/gf128hash.h
@@ -15,10 +15,11 @@ asmlinkage void polyval_mul_pclmul_avx(struct polyval_elem *a,
				       const struct polyval_elem *b);
 asmlinkage void polyval_blocks_pclmul_avx(struct polyval_elem *acc,
					  const struct polyval_key
					  *key, const u8 *data,
					  size_t nblocks);
 
+#define polyval_preparekey_arch polyval_preparekey_arch
 static void polyval_preparekey_arch(struct polyval_key *key,
				    const u8 raw_key[POLYVAL_BLOCK_SIZE])
 {
	static_assert(ARRAY_SIZE(key->h_powers) == NUM_H_POWERS);
	memcpy(&key->h_powers[NUM_H_POWERS - 1], raw_key, POLYVAL_BLOCK_SIZE);
@@ -38,10 +39,11 @@ static void polyval_preparekey_arch(struct polyval_key *key,
					    &key->h_powers[NUM_H_POWERS - 1]);
		}
	}
 }
 
+#define polyval_mul_arch polyval_mul_arch
 static void polyval_mul_arch(struct polyval_elem *acc,
			     const struct polyval_key *key)
 {
	if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) {
		kernel_fpu_begin();
@@ -50,10 +52,11 @@ static void polyval_mul_arch(struct polyval_elem *acc,
	} else {
		polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]);
	}
 }
 
+#define polyval_blocks_arch polyval_blocks_arch
 static void polyval_blocks_arch(struct polyval_elem *acc,
				const struct polyval_key *key,
				const u8 *data, size_t nblocks)
 {
	if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) {
-- 
2.53.0

From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, "Jason A. Donenfeld",
    Herbert Xu, linux-arm-kernel@lists.infradead.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers
Subject: [PATCH 03/19] lib/crypto: gf128hash: Add GHASH support
Date: Wed, 18 Mar 2026 23:17:04 -0700
Message-ID: <20260319061723.1140720-4-ebiggers@kernel.org>
In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org>
References: <20260319061723.1140720-1-ebiggers@kernel.org>

Add GHASH support to the gf128hash module.  This will replace the GHASH
support in the crypto_shash API.  It will be used by the "gcm" template and
by the AES-GCM library (when an arch-optimized implementation of the full
AES-GCM is unavailable).

This consists of a simple API that mirrors the existing POLYVAL API, a
generic implementation of that API based on the existing efficient and
side-channel-resistant polyval_mul_generic(), and the framework for
architecture-optimized implementations of the GHASH functions.

The GHASH accumulator is stored in POLYVAL format rather than GHASH format,
since this is what most modern GHASH implementations actually need.  The
few implementations that expect the accumulator in GHASH format will just
convert the accumulator to/from GHASH format temporarily.  (Supporting
architecture-specific accumulator formats would be possible, but doesn't
seem worth the complexity.)  However, architecture-specific formats of
struct ghash_key will be supported, since a variety of formats will be
needed there anyway.  The default format is just the key in POLYVAL format.
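[Editor's illustration, not part of the patch: the init/update/final flow
with the ctx->partial bookkeeping described above can be modeled in Python.
This sketch uses a made-up toy compression function, not real GHASH; it
demonstrates only the buffering contract: updates need not be block-aligned,
and a final partial block is zero-padded.]

```python
BLOCK_SIZE = 16

class ToyHashCtx:
    """Models the init/update/final pattern with partial-block buffering.
    The compression step is a toy stand-in, not real GHASH."""

    def __init__(self, key):
        assert len(key) == BLOCK_SIZE
        self.key = key
        self.acc = bytes(BLOCK_SIZE)
        self.buf = b""          # plays the role of the ctx->partial count

    def _compress(self, block):
        # toy stand-in for "acc = dot(acc ^ block, H)"
        self.acc = bytes(((a ^ b) + k) & 0xff
                         for a, b, k in zip(self.acc, block, self.key))

    def update(self, data):
        self.buf += data
        while len(self.buf) >= BLOCK_SIZE:
            self._compress(self.buf[:BLOCK_SIZE])
            self.buf = self.buf[BLOCK_SIZE:]

    def final(self):
        if self.buf:            # zero-pad the final partial block
            self._compress(self.buf.ljust(BLOCK_SIZE, b"\0"))
        return self.acc

def one_shot(key, data):
    ctx = ToyHashCtx(key)
    ctx.update(data)
    return ctx.final()

# Updates split at arbitrary boundaries give the same digest as one shot.
key, msg = bytes(range(16)), bytes(range(37))
ctx = ToyHashCtx(key)
ctx.update(msg[:5]); ctx.update(msg[5:20]); ctx.update(msg[20:])
assert ctx.final() == one_shot(key, msg)
```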
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- include/crypto/gf128hash.h | 95 ++++++++++++++++++++++++ lib/crypto/gf128hash.c | 145 +++++++++++++++++++++++++++++++++---- 2 files changed, 227 insertions(+), 13 deletions(-) diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h index 1052041e3499..5090fbaa87f8 100644 --- a/include/crypto/gf128hash.h +++ b/include/crypto/gf128hash.h @@ -9,10 +9,12 @@ #define _CRYPTO_GF128HASH_H =20 #include #include =20 +#define GHASH_BLOCK_SIZE 16 +#define GHASH_DIGEST_SIZE 16 #define POLYVAL_BLOCK_SIZE 16 #define POLYVAL_DIGEST_SIZE 16 =20 /** * struct polyval_elem - An element of the POLYVAL finite field @@ -31,10 +33,20 @@ struct polyval_elem { __le64 hi; }; }; }; =20 +/** + * struct ghash_key - Prepared key for GHASH + * + * Use ghash_preparekey() to initialize this. + */ +struct ghash_key { + /** @h: The hash key H, in POLYVAL format */ + struct polyval_elem h; +}; + /** * struct polyval_key - Prepared key for POLYVAL * * This may contain just the raw key H, or it may contain precomputed key * powers, depending on the platform's POLYVAL implementation. Use @@ -52,10 +64,24 @@ struct polyval_key { /** @h: The hash key H */ struct polyval_elem h; #endif }; =20 +/** + * struct ghash_ctx - Context for computing a GHASH value + * @key: Pointer to the prepared GHASH key. The user of the API is + * responsible for ensuring that the key lives as long as the context. + * @acc: The accumulator. It is stored in POLYVAL format rather than GHASH + * format, since most implementations want it in POLYVAL format. + * @partial: Number of data bytes processed so far modulo GHASH_BLOCK_SIZE + */ +struct ghash_ctx { + const struct ghash_key *key; + struct polyval_elem acc; + size_t partial; +}; + /** * struct polyval_ctx - Context for computing a POLYVAL value * @key: Pointer to the prepared POLYVAL key. The user of the API is * responsible for ensuring that the key lives as long as the context. 
* @acc: The accumulator @@ -65,10 +91,22 @@ struct polyval_ctx { const struct polyval_key *key; struct polyval_elem acc; size_t partial; }; =20 +/** + * ghash_preparekey() - Prepare a GHASH key + * @key: (output) The key structure to initialize + * @raw_key: The raw hash key + * + * Initialize a GHASH key structure from a raw key. + * + * Context: Any context. + */ +void ghash_preparekey(struct ghash_key *key, + const u8 raw_key[GHASH_BLOCK_SIZE]); + /** * polyval_preparekey() - Prepare a POLYVAL key * @key: (output) The key structure to initialize * @raw_key: The raw hash key * @@ -79,10 +117,22 @@ struct polyval_ctx { * Context: Any context. */ void polyval_preparekey(struct polyval_key *key, const u8 raw_key[POLYVAL_BLOCK_SIZE]); =20 +/** + * ghash_init() - Initialize a GHASH context for a new message + * @ctx: The context to initialize + * @key: The key to use. Note that a pointer to the key is saved in the + * context, so the key must live at least as long as the context. + */ +static inline void ghash_init(struct ghash_ctx *ctx, + const struct ghash_key *key) +{ + *ctx =3D (struct ghash_ctx){ .key =3D key }; +} + /** * polyval_init() - Initialize a POLYVAL context for a new message * @ctx: The context to initialize * @key: The key to use. Note that a pointer to the key is saved in the * context, so the key must live at least as long as the context. @@ -123,10 +173,22 @@ static inline void polyval_export_blkaligned(const st= ruct polyval_ctx *ctx, struct polyval_elem *acc) { *acc =3D ctx->acc; } =20 +/** + * ghash_update() - Update a GHASH context with message data + * @ctx: The context to update; must have been initialized + * @data: The message data + * @len: The data length in bytes. Doesn't need to be block-aligned. + * + * This can be called any number of times. + * + * Context: Any context. 
+ */ +void ghash_update(struct ghash_ctx *ctx, const u8 *data, size_t len); + /** * polyval_update() - Update a POLYVAL context with message data * @ctx: The context to update; must have been initialized * @data: The message data * @len: The data length in bytes. Doesn't need to be block-aligned. @@ -135,10 +197,24 @@ static inline void polyval_export_blkaligned(const st= ruct polyval_ctx *ctx, * * Context: Any context. */ void polyval_update(struct polyval_ctx *ctx, const u8 *data, size_t len); =20 +/** + * ghash_final() - Finish computing a GHASH value + * @ctx: The context to finalize + * @out: The output value + * + * If the total data length isn't a multiple of GHASH_BLOCK_SIZE, then the + * final block is automatically zero-padded. + * + * After finishing, this zeroizes @ctx. So the caller does not need to do= it. + * + * Context: Any context. + */ +void ghash_final(struct ghash_ctx *ctx, u8 out[GHASH_BLOCK_SIZE]); + /** * polyval_final() - Finish computing a POLYVAL value * @ctx: The context to finalize * @out: The output value * @@ -149,10 +225,29 @@ void polyval_update(struct polyval_ctx *ctx, const u8= *data, size_t len); * * Context: Any context. */ void polyval_final(struct polyval_ctx *ctx, u8 out[POLYVAL_BLOCK_SIZE]); =20 +/** + * ghash() - Compute a GHASH value + * @key: The prepared key + * @data: The message data + * @len: The data length in bytes. Doesn't need to be block-aligned. + * @out: The output value + * + * Context: Any context. + */ +static inline void ghash(const struct ghash_key *key, const u8 *data, + size_t len, u8 out[GHASH_BLOCK_SIZE]) +{ + struct ghash_ctx ctx; + + ghash_init(&ctx, key); + ghash_update(&ctx, data, len); + ghash_final(&ctx, out); +} + /** * polyval() - Compute a POLYVAL value * @key: The prepared key * @data: The message data * @len: The data length in bytes. Doesn't need to be block-aligned. 
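The contract the kernel-doc above describes — ghash_update() accepts arbitrary, non-block-aligned chunking, ghash_final() zero-pads a trailing partial block, and any split of the data must match the one-shot ghash() helper — can be illustrated with a toy streaming hash in Python. The block-mixing step here is a deliberately fake stand-in (XOR then rotate), not the real GHASH multiplication; only the buffering pattern is the point:

```python
class ToyHash:
    """Toy incremental hash showing the update()/final() buffering contract."""
    BLOCK = 16

    def __init__(self):
        self.acc = bytes(self.BLOCK)
        self.buf = b''

    def _mix(self, block):
        # Stand-in for the real per-block multiplication: XOR the block
        # into the accumulator, then rotate the 128-bit value left by 1.
        x = int.from_bytes(bytes(a ^ b for a, b in zip(self.acc, block)), 'big')
        x = ((x << 1) | (x >> 127)) & ((1 << 128) - 1)
        self.acc = x.to_bytes(self.BLOCK, 'big')

    def update(self, data):
        # Buffer partial blocks; process only full blocks.
        self.buf += data
        while len(self.buf) >= self.BLOCK:
            self._mix(self.buf[:self.BLOCK])
            self.buf = self.buf[self.BLOCK:]
        return self

    def final(self):
        # Zero-pad and absorb any trailing partial block.
        if self.buf:
            self._mix(self.buf + bytes(self.BLOCK - len(self.buf)))
        return self.acc

def oneshot(data):
    """One-shot helper, mirroring the inline ghash() wrapper above."""
    return ToyHash().update(data).final()
```

Whatever chunk boundaries the caller picks, the result equals the one-shot computation — which is exactly the property the KUnit hash test template exercises against ghash_update()/ghash_final().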
diff --git a/lib/crypto/gf128hash.c b/lib/crypto/gf128hash.c index 05f44a9193f7..2650603d8ba8 100644 --- a/lib/crypto/gf128hash.c +++ b/lib/crypto/gf128hash.c @@ -10,27 +10,34 @@ #include #include #include =20 /* - * POLYVAL is an almost-XOR-universal hash function. Similar to GHASH, PO= LYVAL - * interprets the message as the coefficients of a polynomial in GF(2^128)= and - * evaluates that polynomial at a secret point. POLYVAL has a simple - * mathematical relationship with GHASH, but it uses a better field conven= tion - * which makes it easier and faster to implement. + * GHASH and POLYVAL are almost-XOR-universal hash functions. They interp= ret + * the message as the coefficients of a polynomial in the finite field GF(= 2^128) + * and evaluate that polynomial at a secret point. * - * POLYVAL is not a cryptographic hash function, and it should be used onl= y by - * algorithms that are specifically designed to use it. + * Neither GHASH nor POLYVAL is a cryptographic hash function. They shoul= d be + * used only by algorithms that are specifically designed to use them. * - * POLYVAL is specified by "AES-GCM-SIV: Nonce Misuse-Resistant Authentica= ted - * Encryption" (https://datatracker.ietf.org/doc/html/rfc8452) + * GHASH is the older variant, defined as part of GCM in NIST SP 800-38D + * (https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-= 38d.pdf). + * GHASH is hard to implement directly, due to its backwards mapping betwe= en + * bits and polynomial coefficients. GHASH implementations typically pre = and + * post-process the inputs and outputs (mainly by byte-swapping) to conver= t the + * GHASH computation into an equivalent computation over a different, + * easier-to-use representation of GF(2^128). * - * POLYVAL is also used by HCTR2. See "Length-preserving encryption with = HCTR2" - * (https://eprint.iacr.org/2021/1441.pdf). 
+ * POLYVAL is a newer GF(2^128) polynomial hash, originally defined as par= t of + * AES-GCM-SIV (https://datatracker.ietf.org/doc/html/rfc8452) and also us= ed by + * HCTR2 (https://eprint.iacr.org/2021/1441.pdf). It uses that easier-to-= use + * field representation directly, eliminating the data conversion steps. * - * This file provides a library API for POLYVAL. This API can delegate to - * either a generic implementation or an architecture-optimized implementa= tion. + * This file provides library APIs for GHASH and POLYVAL. These APIs can + * delegate to either a generic implementation or an architecture-optimized + * implementation. Due to the mathematical relationship between GHASH and + * POLYVAL, in some cases code for one is reused with the other. * * For the generic implementation, we don't use the traditional table appr= oach * to GF(2^128) multiplication. That approach is not constant-time and re= quires * a lot of memory. Instead, we use a different approach which emulates * carryless multiplication using standard multiplications by spreading th= e data @@ -203,10 +210,23 @@ polyval_mul_generic(struct polyval_elem *a, const str= uct polyval_elem *b) /* Return (c2, c3). This implicitly multiplies by x^-128. 
*/ a->lo =3D cpu_to_le64(c2); a->hi =3D cpu_to_le64(c3); } =20 +static void __maybe_unused ghash_blocks_generic(struct polyval_elem *acc, + const struct polyval_elem *key, + const u8 *data, size_t nblocks) +{ + do { + acc->lo ^=3D + cpu_to_le64(get_unaligned_be64((__be64 *)(data + 8))); + acc->hi ^=3D cpu_to_le64(get_unaligned_be64((__be64 *)data)); + polyval_mul_generic(acc, key); + data +=3D GHASH_BLOCK_SIZE; + } while (--nblocks); +} + static void __maybe_unused polyval_blocks_generic(struct polyval_elem *acc, const struct polyval_elem= *key, const u8 *data, size_t nblocks) { do { @@ -215,14 +235,112 @@ polyval_blocks_generic(struct polyval_elem *acc, con= st struct polyval_elem *key, polyval_mul_generic(acc, key); data +=3D POLYVAL_BLOCK_SIZE; } while (--nblocks); } =20 +/* Convert the key from GHASH format to POLYVAL format. */ +static void __maybe_unused ghash_key_to_polyval(const u8 in[GHASH_BLOCK_SI= ZE], + struct polyval_elem *out) +{ + u64 hi =3D get_unaligned_be64(&in[0]); + u64 lo =3D get_unaligned_be64(&in[8]); + u64 mask =3D (s64)hi >> 63; + + hi =3D (hi << 1) ^ (lo >> 63) ^ (mask & ((u64)0xc2 << 56)); + lo =3D (lo << 1) ^ (mask & 1); + out->lo =3D cpu_to_le64(lo); + out->hi =3D cpu_to_le64(hi); +} + +/* Convert the accumulator from POLYVAL format to GHASH format. */ +static void polyval_acc_to_ghash(const struct polyval_elem *in, + u8 out[GHASH_BLOCK_SIZE]) +{ + put_unaligned_be64(le64_to_cpu(in->hi), &out[0]); + put_unaligned_be64(le64_to_cpu(in->lo), &out[8]); +} + +/* Convert the accumulator from GHASH format to POLYVAL format. 
*/ +static void __maybe_unused ghash_acc_to_polyval(const u8 in[GHASH_BLOCK_SI= ZE], + struct polyval_elem *out) +{ + out->lo =3D cpu_to_le64(get_unaligned_be64(&in[8])); + out->hi =3D cpu_to_le64(get_unaligned_be64(&in[0])); +} + #ifdef CONFIG_CRYPTO_LIB_GF128HASH_ARCH #include "gf128hash.h" /* $(SRCARCH)/gf128hash.h */ #endif =20 +void ghash_preparekey(struct ghash_key *key, const u8 raw_key[GHASH_BLOCK_= SIZE]) +{ +#ifdef ghash_preparekey_arch + ghash_preparekey_arch(key, raw_key); +#else + ghash_key_to_polyval(raw_key, &key->h); +#endif +} +EXPORT_SYMBOL_GPL(ghash_preparekey); + +static void ghash_mul(struct ghash_ctx *ctx) +{ +#ifdef ghash_mul_arch + ghash_mul_arch(&ctx->acc, ctx->key); +#elif defined(ghash_blocks_arch) + static const u8 zeroes[GHASH_BLOCK_SIZE]; + + ghash_blocks_arch(&ctx->acc, ctx->key, zeroes, 1); +#else + polyval_mul_generic(&ctx->acc, &ctx->key->h); +#endif +} + +/* nblocks is always >=3D 1. */ +static void ghash_blocks(struct ghash_ctx *ctx, const u8 *data, size_t nbl= ocks) +{ +#ifdef ghash_blocks_arch + ghash_blocks_arch(&ctx->acc, ctx->key, data, nblocks); +#else + ghash_blocks_generic(&ctx->acc, &ctx->key->h, data, nblocks); +#endif +} + +void ghash_update(struct ghash_ctx *ctx, const u8 *data, size_t len) +{ + if (unlikely(ctx->partial)) { + size_t n =3D min(len, GHASH_BLOCK_SIZE - ctx->partial); + + len -=3D n; + while (n--) + ctx->acc.bytes[GHASH_BLOCK_SIZE - 1 - ctx->partial++] ^=3D + *data++; + if (ctx->partial < GHASH_BLOCK_SIZE) + return; + ghash_mul(ctx); + } + if (len >=3D GHASH_BLOCK_SIZE) { + size_t nblocks =3D len / GHASH_BLOCK_SIZE; + + ghash_blocks(ctx, data, nblocks); + data +=3D len & ~(GHASH_BLOCK_SIZE - 1); + len &=3D GHASH_BLOCK_SIZE - 1; + } + for (size_t i =3D 0; i < len; i++) + ctx->acc.bytes[GHASH_BLOCK_SIZE - 1 - i] ^=3D data[i]; + ctx->partial =3D len; +} +EXPORT_SYMBOL_GPL(ghash_update); + +void ghash_final(struct ghash_ctx *ctx, u8 out[GHASH_BLOCK_SIZE]) +{ + if (unlikely(ctx->partial)) + ghash_mul(ctx); + 
polyval_acc_to_ghash(&ctx->acc, out); + memzero_explicit(ctx, sizeof(*ctx)); +} +EXPORT_SYMBOL_GPL(ghash_final); + void polyval_preparekey(struct polyval_key *key, const u8 raw_key[POLYVAL_BLOCK_SIZE]) { #ifdef polyval_preparekey_arch polyval_preparekey_arch(key, raw_key); @@ -251,10 +369,11 @@ static void polyval_mul(struct polyval_ctx *ctx) #else polyval_mul_generic(&ctx->acc, &ctx->key->h); #endif } =20 +/* nblocks is always >=3D 1. */ static void polyval_blocks(struct polyval_ctx *ctx, const u8 *data, size_t nblocks) { #ifdef polyval_blocks_arch polyval_blocks_arch(&ctx->acc, ctx->key, data, nblocks); --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 04/19] lib/crypto: tests: Add KUnit tests for GHASH Date: Wed, 18 Mar 2026 23:17:05 -0700 Message-ID: <20260319061723.1140720-5-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a KUnit test suite for the GHASH library functions. It closely mirrors the POLYVAL test suite. 
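The generated test vectors below compress many per-length cases into a single expected value: hash a message of each length, concatenate the digests, and hash that concatenation (the "hash of hashes" arrays emitted by gen-hash-testvecs.py). The pattern, sketched here with SHA-256 standing in for GHASH purely for illustration:

```python
import hashlib

def hash_of_hashes(lengths, data_byte=0xff):
    """Hash a message of each length, then hash the concatenated digests.

    One stored expected value then covers every per-length case, which is
    how the generated *-testvecs.h consolidated vectors are built.
    """
    digests = b''
    for n in lengths:
        digests += hashlib.sha256(bytes([data_byte]) * n).digest()
    return hashlib.sha256(digests).hexdigest()
```

A mismatch in any single per-length digest changes the final consolidated value, so one comparison detects a failure at any length.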
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- lib/crypto/.kunitconfig | 1 + lib/crypto/tests/Kconfig | 8 ++ lib/crypto/tests/Makefile | 1 + lib/crypto/tests/ghash-testvecs.h | 186 ++++++++++++++++++++++++++ lib/crypto/tests/ghash_kunit.c | 194 ++++++++++++++++++++++++++++ scripts/crypto/gen-hash-testvecs.py | 63 ++++++++- 6 files changed, 452 insertions(+), 1 deletion(-) create mode 100644 lib/crypto/tests/ghash-testvecs.h create mode 100644 lib/crypto/tests/ghash_kunit.c diff --git a/lib/crypto/.kunitconfig b/lib/crypto/.kunitconfig index 63a592731d1d..391836511c8b 100644 --- a/lib/crypto/.kunitconfig +++ b/lib/crypto/.kunitconfig @@ -4,10 +4,11 @@ CONFIG_CRYPTO_LIB_ENABLE_ALL_FOR_KUNIT=3Dy =20 CONFIG_CRYPTO_LIB_AES_CBC_MACS_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_BLAKE2B_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_BLAKE2S_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_CURVE25519_KUNIT_TEST=3Dy +CONFIG_CRYPTO_LIB_GHASH_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_MD5_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_MLDSA_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_NH_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_POLY1305_KUNIT_TEST=3Dy CONFIG_CRYPTO_LIB_POLYVAL_KUNIT_TEST=3Dy diff --git a/lib/crypto/tests/Kconfig b/lib/crypto/tests/Kconfig index aa627b6b9855..279ff1a339be 100644 --- a/lib/crypto/tests/Kconfig +++ b/lib/crypto/tests/Kconfig @@ -33,10 +33,18 @@ config CRYPTO_LIB_CURVE25519_KUNIT_TEST default KUNIT_ALL_TESTS select CRYPTO_LIB_BENCHMARK_VISIBLE help KUnit tests for the Curve25519 Diffie-Hellman function. =20 +config CRYPTO_LIB_GHASH_KUNIT_TEST + tristate "KUnit tests for GHASH" if !KUNIT_ALL_TESTS + depends on KUNIT && CRYPTO_LIB_GF128HASH + default KUNIT_ALL_TESTS + select CRYPTO_LIB_BENCHMARK_VISIBLE + help + KUnit tests for GHASH library functions. 
+ config CRYPTO_LIB_MD5_KUNIT_TEST tristate "KUnit tests for MD5" if !KUNIT_ALL_TESTS depends on KUNIT && CRYPTO_LIB_MD5 default KUNIT_ALL_TESTS select CRYPTO_LIB_BENCHMARK_VISIBLE diff --git a/lib/crypto/tests/Makefile b/lib/crypto/tests/Makefile index f864e0ffbee4..751ae507fdd0 100644 --- a/lib/crypto/tests/Makefile +++ b/lib/crypto/tests/Makefile @@ -2,10 +2,11 @@ =20 obj-$(CONFIG_CRYPTO_LIB_AES_CBC_MACS_KUNIT_TEST) +=3D aes_cbc_macs_kunit.o obj-$(CONFIG_CRYPTO_LIB_BLAKE2B_KUNIT_TEST) +=3D blake2b_kunit.o obj-$(CONFIG_CRYPTO_LIB_BLAKE2S_KUNIT_TEST) +=3D blake2s_kunit.o obj-$(CONFIG_CRYPTO_LIB_CURVE25519_KUNIT_TEST) +=3D curve25519_kunit.o +obj-$(CONFIG_CRYPTO_LIB_GHASH_KUNIT_TEST) +=3D ghash_kunit.o obj-$(CONFIG_CRYPTO_LIB_MD5_KUNIT_TEST) +=3D md5_kunit.o obj-$(CONFIG_CRYPTO_LIB_MLDSA_KUNIT_TEST) +=3D mldsa_kunit.o obj-$(CONFIG_CRYPTO_LIB_NH_KUNIT_TEST) +=3D nh_kunit.o obj-$(CONFIG_CRYPTO_LIB_POLY1305_KUNIT_TEST) +=3D poly1305_kunit.o obj-$(CONFIG_CRYPTO_LIB_POLYVAL_KUNIT_TEST) +=3D polyval_kunit.o diff --git a/lib/crypto/tests/ghash-testvecs.h b/lib/crypto/tests/ghash-tes= tvecs.h new file mode 100644 index 000000000000..759eb4072336 --- /dev/null +++ b/lib/crypto/tests/ghash-testvecs.h @@ -0,0 +1,186 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* This file was generated by: ./scripts/crypto/gen-hash-testvecs.py ghash= */ + +static const struct { + size_t data_len; + u8 digest[GHASH_DIGEST_SIZE]; +} hash_testvecs[] =3D { + { + .data_len =3D 0, + .digest =3D { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + }, + }, + { + .data_len =3D 1, + .digest =3D { + 0x13, 0x91, 0xa1, 0x11, 0x08, 0xc3, 0x7e, 0xeb, + 0x21, 0x42, 0x4a, 0xd6, 0x45, 0x0f, 0x41, 0xa7, + }, + }, + { + .data_len =3D 2, + .digest =3D { + 0xde, 0x00, 0x63, 0x3f, 0x71, 0x0f, 0xc6, 0x29, + 0x53, 0x2e, 0x49, 0xd9, 0xc2, 0xb7, 0x73, 0xce, + }, + }, + { + .data_len =3D 3, + .digest =3D { + 0xcf, 0xc7, 0xa8, 0x20, 0x24, 0xe9, 0x7a, 0x6c, + 
0x2c, 0x2a, 0x34, 0x70, 0x26, 0xba, 0xd5, 0x9a, + }, + }, + { + .data_len =3D 16, + .digest =3D { + 0xaa, 0xe0, 0xdc, 0x7f, 0xcf, 0x8b, 0xe6, 0x0c, + 0x2e, 0x93, 0x89, 0x7d, 0x68, 0x4e, 0xc2, 0x63, + }, + }, + { + .data_len =3D 32, + .digest =3D { + 0x4b, 0x8b, 0x93, 0x5c, 0x79, 0xad, 0x85, 0x08, + 0xd3, 0x8a, 0xcd, 0xdd, 0x4c, 0x6e, 0x0e, 0x6f, + }, + }, + { + .data_len =3D 48, + .digest =3D { + 0xfa, 0xa0, 0x25, 0xdd, 0x61, 0x9a, 0x52, 0x9a, + 0xea, 0xee, 0xc6, 0x62, 0xb2, 0xba, 0x11, 0x49, + }, + }, + { + .data_len =3D 49, + .digest =3D { + 0x23, 0xf1, 0x05, 0xeb, 0x30, 0x40, 0xb9, 0x1d, + 0xe6, 0x35, 0x51, 0x4e, 0x0f, 0xc0, 0x1b, 0x9e, + }, + }, + { + .data_len =3D 63, + .digest =3D { + 0x8d, 0xcf, 0xa0, 0xc8, 0x83, 0x21, 0x06, 0x81, + 0xc6, 0x36, 0xd5, 0x62, 0xbf, 0xa0, 0xcd, 0x9c, + }, + }, + { + .data_len =3D 64, + .digest =3D { + 0xe7, 0xca, 0xbe, 0xe7, 0x66, 0xc8, 0x85, 0xad, + 0xbc, 0xaf, 0x58, 0x21, 0xd7, 0x67, 0x82, 0x15, + }, + }, + { + .data_len =3D 65, + .digest =3D { + 0x9f, 0x48, 0x10, 0xd9, 0xa2, 0x6b, 0x9d, 0xe0, + 0xb1, 0x87, 0xe1, 0x39, 0xc3, 0xd7, 0xee, 0x09, + }, + }, + { + .data_len =3D 127, + .digest =3D { + 0xa4, 0x36, 0xb7, 0x82, 0xd2, 0x67, 0x7e, 0xaf, + 0x5d, 0xfd, 0x67, 0x9c, 0x1d, 0x9f, 0xe4, 0xf7, + }, + }, + { + .data_len =3D 128, + .digest =3D { + 0x57, 0xe7, 0x1d, 0x78, 0xf0, 0x8e, 0xc7, 0x0c, + 0x15, 0xee, 0x18, 0xc4, 0xd1, 0x75, 0x90, 0xaa, + }, + }, + { + .data_len =3D 129, + .digest =3D { + 0x9b, 0xad, 0x81, 0xa9, 0x22, 0xb2, 0x19, 0x53, + 0x60, 0x30, 0xe7, 0xa0, 0x4f, 0xd6, 0x72, 0x42, + }, + }, + { + .data_len =3D 256, + .digest =3D { + 0xf7, 0x33, 0x42, 0xbf, 0x58, 0xde, 0x88, 0x0f, + 0x8d, 0x3d, 0xa6, 0x11, 0x14, 0xc3, 0xf1, 0xdc, + }, + }, + { + .data_len =3D 511, + .digest =3D { + 0x59, 0xdc, 0xa9, 0xc0, 0x4e, 0xd6, 0x97, 0xb3, + 0x60, 0xaf, 0xa8, 0xa0, 0xea, 0x54, 0x8e, 0xc3, + }, + }, + { + .data_len =3D 513, + .digest =3D { + 0xa2, 0x23, 0x37, 0xcc, 0x97, 0xec, 0xea, 0xbe, + 0xd6, 0xc7, 0x13, 0xf7, 0x93, 0x73, 0xc0, 
0x64, + }, + }, + { + .data_len =3D 1000, + .digest =3D { + 0x46, 0x8b, 0x43, 0x77, 0x9b, 0xc2, 0xfc, 0xa4, + 0x68, 0x6a, 0x6c, 0x07, 0xa4, 0x6f, 0x47, 0x65, + }, + }, + { + .data_len =3D 3333, + .digest =3D { + 0x69, 0x7f, 0x19, 0xc3, 0xb9, 0xa4, 0xff, 0x40, + 0xe3, 0x03, 0x71, 0xa3, 0x88, 0x8a, 0xf1, 0xbd, + }, + }, + { + .data_len =3D 4096, + .digest =3D { + 0x4d, 0x65, 0xe6, 0x9c, 0xeb, 0x6a, 0x46, 0x8d, + 0xe9, 0x32, 0x96, 0x72, 0xb3, 0x0d, 0x08, 0xa9, + }, + }, + { + .data_len =3D 4128, + .digest =3D { + 0xfc, 0xa1, 0x74, 0x46, 0x21, 0x64, 0xa7, 0x64, + 0xbe, 0x47, 0x03, 0x1e, 0x05, 0xf7, 0xd8, 0x37, + }, + }, + { + .data_len =3D 4160, + .digest =3D { + 0x70, 0x5b, 0xe9, 0x17, 0xab, 0xd5, 0xa2, 0xee, + 0xcb, 0x39, 0xa4, 0x81, 0x2f, 0x41, 0x70, 0xae, + }, + }, + { + .data_len =3D 4224, + .digest =3D { + 0x07, 0xbd, 0xb6, 0x52, 0xe2, 0x75, 0x2c, 0x33, + 0x6d, 0x1b, 0x63, 0x56, 0x58, 0xda, 0x98, 0x55, + }, + }, + { + .data_len =3D 16384, + .digest =3D { + 0x9c, 0xb5, 0xf4, 0x14, 0xe8, 0xa8, 0x4a, 0xde, + 0xee, 0x7b, 0xbb, 0xd6, 0x21, 0x6d, 0x6a, 0x69, + }, + }, +}; + +static const u8 hash_testvec_consolidated[GHASH_DIGEST_SIZE] =3D { + 0x08, 0xef, 0xf5, 0x27, 0xb1, 0xca, 0xd4, 0x1d, + 0xad, 0x38, 0x69, 0x88, 0x6b, 0x16, 0xdf, 0xa8, +}; + +static const u8 ghash_allones_hashofhashes[GHASH_DIGEST_SIZE] =3D { + 0xef, 0x85, 0x58, 0xf8, 0x54, 0x9c, 0x5e, 0x54, + 0xd9, 0xbe, 0x04, 0x1f, 0xff, 0xff, 0xff, 0xff, +}; diff --git a/lib/crypto/tests/ghash_kunit.c b/lib/crypto/tests/ghash_kunit.c new file mode 100644 index 000000000000..68b3837a3607 --- /dev/null +++ b/lib/crypto/tests/ghash_kunit.c @@ -0,0 +1,194 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright 2026 Google LLC + */ +#include +#include "ghash-testvecs.h" + +/* + * A fixed key used when presenting GHASH as an unkeyed hash function in o= rder + * to reuse hash-test-template.h. 
At the beginning of the test suite, thi= s is + * initialized to a key prepared from bytes generated from a fixed seed. + */ +static struct ghash_key test_key; + +static void ghash_init_withtestkey(struct ghash_ctx *ctx) +{ + ghash_init(ctx, &test_key); +} + +static void ghash_withtestkey(const u8 *data, size_t len, + u8 out[GHASH_BLOCK_SIZE]) +{ + ghash(&test_key, data, len, out); +} + +/* Generate the HASH_KUNIT_CASES using hash-test-template.h. */ +#define HASH ghash_withtestkey +#define HASH_CTX ghash_ctx +#define HASH_SIZE GHASH_BLOCK_SIZE +#define HASH_INIT ghash_init_withtestkey +#define HASH_UPDATE ghash_update +#define HASH_FINAL ghash_final +#include "hash-test-template.h" + +/* + * Test a key and messages containing all one bits. This is useful to det= ect + * overflow bugs in implementations that emulate carryless multiplication = using + * a series of standard multiplications with the bits spread out. + */ +static void test_ghash_allones_key_and_message(struct kunit *test) +{ + struct ghash_key key; + struct ghash_ctx hashofhashes_ctx; + u8 hash[GHASH_BLOCK_SIZE]; + + static_assert(TEST_BUF_LEN >=3D 4096); + memset(test_buf, 0xff, 4096); + + ghash_preparekey(&key, test_buf); + ghash_init(&hashofhashes_ctx, &key); + for (size_t len =3D 0; len <=3D 4096; len +=3D 16) { + ghash(&key, test_buf, len, hash); + ghash_update(&hashofhashes_ctx, hash, sizeof(hash)); + } + ghash_final(&hashofhashes_ctx, hash); + KUNIT_ASSERT_MEMEQ(test, hash, ghash_allones_hashofhashes, + sizeof(hash)); +} + +#define MAX_LEN_FOR_KEY_CHECK 1024 + +/* + * Given two prepared keys which should be identical (but may differ in + * alignment and/or whether they are followed by a guard page or not), ver= ify + * that they produce consistent results on various data lengths. 
+ */ +static void check_key_consistency(struct kunit *test, + const struct ghash_key *key1, + const struct ghash_key *key2) +{ + u8 *data =3D test_buf; + u8 hash1[GHASH_BLOCK_SIZE]; + u8 hash2[GHASH_BLOCK_SIZE]; + + rand_bytes(data, MAX_LEN_FOR_KEY_CHECK); + KUNIT_ASSERT_MEMEQ(test, key1, key2, sizeof(*key1)); + + for (int i =3D 0; i < 100; i++) { + size_t len =3D rand_length(MAX_LEN_FOR_KEY_CHECK); + + ghash(key1, data, len, hash1); + ghash(key2, data, len, hash2); + KUNIT_ASSERT_MEMEQ(test, hash1, hash2, sizeof(hash1)); + } +} + +/* Test that no buffer overreads occur on either raw_key or ghash_key. */ +static void test_ghash_with_guarded_key(struct kunit *test) +{ + u8 raw_key[GHASH_BLOCK_SIZE]; + u8 *guarded_raw_key =3D &test_buf[TEST_BUF_LEN - sizeof(raw_key)]; + struct ghash_key key1, key2; + struct ghash_key *guarded_key =3D + (struct ghash_key *)&test_buf[TEST_BUF_LEN - sizeof(key1)]; + + /* Prepare with regular buffers. */ + rand_bytes(raw_key, sizeof(raw_key)); + ghash_preparekey(&key1, raw_key); + + /* Prepare with guarded raw_key, then check that it works. */ + memcpy(guarded_raw_key, raw_key, sizeof(raw_key)); + ghash_preparekey(&key2, guarded_raw_key); + check_key_consistency(test, &key1, &key2); + + /* Prepare guarded ghash_key, then check that it works. */ + ghash_preparekey(guarded_key, raw_key); + check_key_consistency(test, &key1, guarded_key); +} + +/* + * Test that ghash_key only needs to be aligned to + * __alignof__(struct ghash_key), i.e. 8 bytes. The assembly code may pre= fer + * 16-byte or higher alignment, but it mustn't require it. 
+ */ +static void test_ghash_with_minimally_aligned_key(struct kunit *test) +{ + u8 raw_key[GHASH_BLOCK_SIZE]; + struct ghash_key key; + struct ghash_key *minaligned_key =3D + (struct ghash_key *)&test_buf[MAX_LEN_FOR_KEY_CHECK + + __alignof__(struct ghash_key)]; + + KUNIT_ASSERT_TRUE(test, IS_ALIGNED((uintptr_t)minaligned_key, + __alignof__(struct ghash_key))); + KUNIT_ASSERT_TRUE(test, !IS_ALIGNED((uintptr_t)minaligned_key, + 2 * __alignof__(struct ghash_key))); + + rand_bytes(raw_key, sizeof(raw_key)); + ghash_preparekey(&key, raw_key); + ghash_preparekey(minaligned_key, raw_key); + check_key_consistency(test, &key, minaligned_key); +} + +struct ghash_irq_test_state { + struct ghash_key expected_key; + u8 raw_key[GHASH_BLOCK_SIZE]; +}; + +static bool ghash_irq_test_func(void *state_) +{ + struct ghash_irq_test_state *state =3D state_; + struct ghash_key key; + + ghash_preparekey(&key, state->raw_key); + return memcmp(&key, &state->expected_key, sizeof(key)) =3D=3D 0; +} + +/* + * Test that ghash_preparekey() produces the same output regardless of whe= ther + * FPU or vector registers are usable when it is called. 
+ */ +static void test_ghash_preparekey_in_irqs(struct kunit *test) +{ + struct ghash_irq_test_state state; + + rand_bytes(state.raw_key, sizeof(state.raw_key)); + ghash_preparekey(&state.expected_key, state.raw_key); + kunit_run_irq_test(test, ghash_irq_test_func, 200000, &state); +} + +static int ghash_suite_init(struct kunit_suite *suite) +{ + u8 raw_key[GHASH_BLOCK_SIZE]; + + rand_bytes_seeded_from_len(raw_key, sizeof(raw_key)); + ghash_preparekey(&test_key, raw_key); + return hash_suite_init(suite); +} + +static void ghash_suite_exit(struct kunit_suite *suite) +{ + hash_suite_exit(suite); +} + +static struct kunit_case ghash_test_cases[] =3D { + HASH_KUNIT_CASES, + KUNIT_CASE(test_ghash_allones_key_and_message), + KUNIT_CASE(test_ghash_with_guarded_key), + KUNIT_CASE(test_ghash_with_minimally_aligned_key), + KUNIT_CASE(test_ghash_preparekey_in_irqs), + KUNIT_CASE(benchmark_hash), + {}, +}; + +static struct kunit_suite ghash_test_suite =3D { + .name =3D "ghash", + .test_cases =3D ghash_test_cases, + .suite_init =3D ghash_suite_init, + .suite_exit =3D ghash_suite_exit, +}; +kunit_test_suite(ghash_test_suite); + +MODULE_DESCRIPTION("KUnit tests and benchmark for GHASH"); +MODULE_LICENSE("GPL"); diff --git a/scripts/crypto/gen-hash-testvecs.py b/scripts/crypto/gen-hash-= testvecs.py index 34b7c48f3456..e69ce213fb33 100755 --- a/scripts/crypto/gen-hash-testvecs.py +++ b/scripts/crypto/gen-hash-testvecs.py @@ -66,10 +66,56 @@ class Poly1305: # nondestructive, i.e. not changing any field of self. def digest(self): m =3D (self.h + self.s) % 2**128 return m.to_bytes(16, byteorder=3D'little') =20 +GHASH_POLY =3D sum((1 << i) for i in [128, 7, 2, 1, 0]) +GHASH_BLOCK_SIZE =3D 16 + +# A straightforward, unoptimized implementation of GHASH. 
+class Ghash: + + @staticmethod + def reflect_bits_in_bytes(v): + res =3D 0 + for offs in range(0, 128, 8): + for bit in range(8): + if (v & (1 << (offs + bit))) !=3D 0: + res ^=3D 1 << (offs + 7 - bit) + return res + + @staticmethod + def bytes_to_poly(data): + return Ghash.reflect_bits_in_bytes(int.from_bytes(data, byteorder= =3D'little')) + + @staticmethod + def poly_to_bytes(poly): + return Ghash.reflect_bits_in_bytes(poly).to_bytes(16, byteorder=3D= 'little') + + def __init__(self, key): + assert len(key) =3D=3D 16 + self.h =3D Ghash.bytes_to_poly(key) + self.acc =3D 0 + + # Note: this supports partial blocks only at the end. + def update(self, data): + for i in range(0, len(data), 16): + # acc +=3D block + self.acc ^=3D Ghash.bytes_to_poly(data[i:i+16]) + # acc =3D (acc * h) mod GHASH_POLY + product =3D 0 + for j in range(127, -1, -1): + if (self.h & (1 << j)) !=3D 0: + product ^=3D self.acc << j + if (product & (1 << (128 + j))) !=3D 0: + product ^=3D GHASH_POLY << j + self.acc =3D product + return self + + def digest(self): + return Ghash.poly_to_bytes(self.acc) + POLYVAL_POLY =3D sum((1 << i) for i in [128, 127, 126, 121, 0]) POLYVAL_BLOCK_SIZE =3D 16 =20 # A straightforward, unoptimized implementation of POLYVAL. # Reference: https://datatracker.ietf.org/doc/html/rfc8452 @@ -101,10 +147,12 @@ def hash_init(alg): # The keyed hash functions are assigned a fixed random key here, to pr= esent # them as unkeyed hash functions. This allows all the test cases for # unkeyed hash functions to work on them. 
if alg =3D=3D 'aes-cmac': return AesCmac(rand_bytes(AES_256_KEY_SIZE)) + if alg =3D=3D 'ghash': + return Ghash(rand_bytes(GHASH_BLOCK_SIZE)) if alg =3D=3D 'poly1305': return Poly1305(rand_bytes(POLY1305_KEY_SIZE)) if alg =3D=3D 'polyval': return Polyval(rand_bytes(POLYVAL_BLOCK_SIZE)) return hashlib.new(alg) @@ -255,10 +303,19 @@ def gen_additional_poly1305_testvecs(): data +=3D ctx.digest() print_static_u8_array_definition( 'poly1305_allones_macofmacs[POLY1305_DIGEST_SIZE]', Poly1305(key).update(data).digest()) =20 +def gen_additional_ghash_testvecs(): + key =3D b'\xff' * GHASH_BLOCK_SIZE + hashes =3D b'' + for data_len in range(0, 4097, 16): + hashes +=3D Ghash(key).update(b'\xff' * data_len).digest() + print_static_u8_array_definition( + 'ghash_allones_hashofhashes[GHASH_DIGEST_SIZE]', + Ghash(key).update(hashes).digest()) + def gen_additional_polyval_testvecs(): key =3D b'\xff' * POLYVAL_BLOCK_SIZE hashes =3D b'' for data_len in range(0, 4097, 16): hashes +=3D Polyval(key).update(b'\xff' * data_len).digest() @@ -266,11 +323,12 @@ def gen_additional_polyval_testvecs(): 'polyval_allones_hashofhashes[POLYVAL_DIGEST_SIZE]', Polyval(key).update(hashes).digest()) =20 if len(sys.argv) !=3D 2: sys.stderr.write('Usage: gen-hash-testvecs.py ALGORITHM\n') - sys.stderr.write('ALGORITHM may be any supported by Python hashlib; or= poly1305, polyval, or sha3.\n') + sys.stderr.write('ALGORITHM may be any supported by Python hashlib;\n') + sys.stderr.write(' or aes-cmac, ghash, nh, poly1305, polyval, or sha3= .\n') sys.stderr.write('Example: gen-hash-testvecs.py sha512\n') sys.exit(1) =20 alg =3D sys.argv[1] print('/* SPDX-License-Identifier: GPL-2.0-or-later */') @@ -278,10 +336,13 @@ print(f'/* This file was generated by: {sys.argv[0]} = {" ".join(sys.argv[1:])} */ if alg =3D=3D 'aes-cmac': gen_unkeyed_testvecs(alg) elif alg.startswith('blake2'): gen_unkeyed_testvecs(alg) gen_additional_blake2_testvecs(alg) +elif alg =3D=3D 'ghash': + gen_unkeyed_testvecs(alg) + 
gen_additional_ghash_testvecs() elif alg =3D=3D 'nh': gen_nh_testvecs() elif alg =3D=3D 'poly1305': gen_unkeyed_testvecs(alg) gen_additional_poly1305_testvecs() --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A31F2FFF99; Thu, 19 Mar 2026 06:19:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901157; cv=none; b=XTsnT7XlRhLYEi/8A59UTBFzIpwZmDzUKyzEukJX1ATu+F1JGqWMQQ5IAEV9eUzYC1+KFqspnmLphAXjCUYCYcpj+x1/bIkWSK2NwQziqs7GStJpH4qjDPupZb6zZIybhFL6tDDtlpvtGs6+zJyXzO1IfddJ1bpm+He+GSEzuF0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901157; c=relaxed/simple; bh=kMhXMfUw7a3oxO5oBok+3qNaUgBLXU92TQEyaKkUMGU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Eq+U5XoAiTX8XD7WIE5c/VAb3oqzlc6dnANV8IatoFhdy64cFMkw54ZDUc/4fjaEKV0RNbjEIPG4s1EbuC27VMJ7hJuBOcMnphFczp9wHHNL9QRW+ELZh53aENEv+FD5Hbr497JT2QfFBqGIKqjoqbAHZVfSCAPQsGKIyGAWRjw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rlqFKpbd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rlqFKpbd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B40D8C19425; Thu, 19 Mar 2026 06:19:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773901157; bh=kMhXMfUw7a3oxO5oBok+3qNaUgBLXU92TQEyaKkUMGU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rlqFKpbdFAY4Pcu1e0H3TJvAM5+Z2cLF5fmtHa514Z/v0t5GLEqlRilbtxuv5DjQ6 
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 05/19] crypto: arm/ghash - Make the "ghash" crypto_shash NEON-only Date: Wed, 18 Mar 2026 23:17:06 -0700 Message-ID: <20260319061723.1140720-6-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" arch/arm/crypto/ghash-ce-glue.c originally provided only a "ghash" crypto_shash algorithm using PMULL if available, else NEON. Significantly later, it was updated to also provide a full AES-GCM implementation using PMULL. This made the PMULL support in the "ghash" crypto_shash largely obsolete. Indeed, the arm64 equivalent of this file unconditionally uses only ASIMD in its "ghash" crypto_shash. Given that inconsistency, and since the NEON-only code is more easily separable into the GHASH library than the PMULL-based code is, let's align with arm64 and support only NEON for the pure GHASH.
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm/crypto/ghash-ce-glue.c | 32 ++++++-------------------------- 1 file changed, 6 insertions(+), 26 deletions(-) diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glu= e.c index 454adcc62cc6..d7d787de7dd3 100644 --- a/arch/arm/crypto/ghash-ce-glue.c +++ b/arch/arm/crypto/ghash-ce-glue.c @@ -34,11 +34,11 @@ MODULE_ALIAS_CRYPTO("rfc4106(gcm(aes))"); =20 #define RFC4106_NONCE_SIZE 4 =20 struct ghash_key { be128 k; - u64 h[][2]; + u64 h[1][2]; }; =20 struct gcm_key { u64 h[4][2]; u32 rk[AES_MAX_KEYLENGTH_U32]; @@ -49,16 +49,14 @@ struct gcm_key { struct arm_ghash_desc_ctx { u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)]; }; =20 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *s= rc, - u64 const h[][2], const char *head); + u64 const h[4][2], const char *head); =20 asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *sr= c, - u64 const h[][2], const char *head); - -static __ro_after_init DEFINE_STATIC_KEY_FALSE(use_p64); + u64 const h[1][2], const char *head); =20 static int ghash_init(struct shash_desc *desc) { struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); =20 @@ -68,14 +66,11 @@ static int ghash_init(struct shash_desc *desc) =20 static void ghash_do_update(int blocks, u64 dg[], const char *src, struct ghash_key *key, const char *head) { kernel_neon_begin(); - if (static_branch_likely(&use_p64)) - pmull_ghash_update_p64(blocks, dg, src, key->h, head); - else - pmull_ghash_update_p8(blocks, dg, src, key->h, head); + pmull_ghash_update_p8(blocks, dg, src, key->h, head); kernel_neon_end(); } =20 static int ghash_update(struct shash_desc *desc, const u8 *src, unsigned int len) @@ -145,23 +140,10 @@ static int ghash_setkey(struct crypto_shash *tfm, return -EINVAL; =20 /* needed for the fallback */ memcpy(&key->k, inkey, GHASH_BLOCK_SIZE); ghash_reflect(key->h[0], &key->k); - - if (static_branch_likely(&use_p64)) { - be128 h =3D key->k; - - 
gf128mul_lle(&h, &key->k); - ghash_reflect(key->h[1], &h); - - gf128mul_lle(&h, &key->k); - ghash_reflect(key->h[2], &h); - - gf128mul_lle(&h, &key->k); - ghash_reflect(key->h[3], &h); - } return 0; } =20 static struct shash_alg ghash_alg =3D { .digestsize =3D GHASH_DIGEST_SIZE, @@ -173,15 +155,15 @@ static struct shash_alg ghash_alg =3D { .import =3D ghash_import, .descsize =3D sizeof(struct arm_ghash_desc_ctx), .statesize =3D sizeof(struct ghash_desc_ctx), =20 .base.cra_name =3D "ghash", - .base.cra_driver_name =3D "ghash-ce", + .base.cra_driver_name =3D "ghash-neon", .base.cra_priority =3D 300, .base.cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, .base.cra_blocksize =3D GHASH_BLOCK_SIZE, - .base.cra_ctxsize =3D sizeof(struct ghash_key) + sizeof(u64[2]), + .base.cra_ctxsize =3D sizeof(struct ghash_key), .base.cra_module =3D THIS_MODULE, }; =20 void pmull_gcm_encrypt(int blocks, u64 dg[], const char *src, struct gcm_key const *k, char *dst, @@ -569,12 +551,10 @@ static int __init ghash_ce_mod_init(void) if (elf_hwcap2 & HWCAP2_PMULL) { err =3D crypto_register_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); if (err) return err; - ghash_alg.base.cra_ctxsize +=3D 3 * sizeof(u64[2]); - static_branch_enable(&use_p64); } =20 err =3D crypto_register_shash(&ghash_alg); if (err) goto err_aead; --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 984FF34D923; Thu, 19 Mar 2026 06:19:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901157; cv=none; 
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 06/19] crypto: arm/ghash - Move NEON GHASH assembly into its own file Date: Wed, 18 Mar 2026 23:17:07 -0700 Message-ID: <20260319061723.1140720-7-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" arch/arm/crypto/ghash-ce-core.S implements pmull_ghash_update_p8(), which is used only by a crypto_shash implementation of GHASH. It also implements other functions, including pmull_ghash_update_p64() and others, which are used only by a crypto_aead implementation of AES-GCM. While some code is shared between pmull_ghash_update_p8() and pmull_ghash_update_p64(), it's not very much. Since pmull_ghash_update_p8() will also need to be migrated into lib/crypto/ to achieve parity in the standalone GHASH support, let's move it into a separate file ghash-neon-core.S. 
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm/crypto/Makefile | 2 +- arch/arm/crypto/ghash-ce-core.S | 171 ++---------------------- arch/arm/crypto/ghash-neon-core.S | 207 ++++++++++++++++++++++++++++++ 3 files changed, 222 insertions(+), 158 deletions(-) create mode 100644 arch/arm/crypto/ghash-neon-core.S diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile index e73099e120b3..cedce94d5ee5 100644 --- a/arch/arm/crypto/Makefile +++ b/arch/arm/crypto/Makefile @@ -8,6 +8,6 @@ obj-$(CONFIG_CRYPTO_AES_ARM_BS) +=3D aes-arm-bs.o obj-$(CONFIG_CRYPTO_AES_ARM_CE) +=3D aes-arm-ce.o obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) +=3D ghash-arm-ce.o =20 aes-arm-bs-y :=3D aes-neonbs-core.o aes-neonbs-glue.o aes-arm-ce-y :=3D aes-ce-core.o aes-ce-glue.o -ghash-arm-ce-y :=3D ghash-ce-core.o ghash-ce-glue.o +ghash-arm-ce-y :=3D ghash-ce-core.o ghash-ce-glue.o ghash-neon-core.o diff --git a/arch/arm/crypto/ghash-ce-core.S b/arch/arm/crypto/ghash-ce-cor= e.S index 858c0d66798b..a449525d61f8 100644 --- a/arch/arm/crypto/ghash-ce-core.S +++ b/arch/arm/crypto/ghash-ce-core.S @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * Accelerated GHASH implementation with NEON/ARMv8 vmull.p8/64 instructio= ns. + * Accelerated AES-GCM implementation with ARMv8 Crypto Extensions. * * Copyright (C) 2015 - 2017 Linaro Ltd. * Copyright (C) 2023 Google LLC. 
*/ =20 @@ -27,43 +27,14 @@ XL_H .req d5 XM_L .req d6 XM_H .req d7 XH_L .req d8 =20 - t0l .req d10 - t0h .req d11 - t1l .req d12 - t1h .req d13 - t2l .req d14 - t2h .req d15 - t3l .req d16 - t3h .req d17 - t4l .req d18 - t4h .req d19 - - t0q .req q5 - t1q .req q6 - t2q .req q7 - t3q .req q8 - t4q .req q9 XH2 .req q9 =20 - s1l .req d20 - s1h .req d21 - s2l .req d22 - s2h .req d23 - s3l .req d24 - s3h .req d25 - s4l .req d26 - s4h .req d27 - MASK .req d28 - SHASH2_p8 .req d28 =20 - k16 .req d29 - k32 .req d30 - k48 .req d31 SHASH2_p64 .req d31 =20 HH .req q10 HH3 .req q11 HH4 .req q12 @@ -91,76 +62,10 @@ T3_L .req d16 T3_H .req d17 =20 .text =20 - .macro __pmull_p64, rd, rn, rm, b1, b2, b3, b4 - vmull.p64 \rd, \rn, \rm - .endm - - /* - * This implementation of 64x64 -> 128 bit polynomial multiplication - * using vmull.p8 instructions (8x8 -> 16) is taken from the paper - * "Fast Software Polynomial Multiplication on ARM Processors Using - * the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and - * Ricardo Dahab (https://hal.inria.fr/hal-01506572) - * - * It has been slightly tweaked for in-order performance, and to allow - * 'rq' to overlap with 'ad' or 'bd'. 
- */ - .macro __pmull_p8, rq, ad, bd, b1=3Dt4l, b2=3Dt3l, b3=3Dt4l, b4=3Dt3l - vext.8 t0l, \ad, \ad, #1 @ A1 - .ifc \b1, t4l - vext.8 t4l, \bd, \bd, #1 @ B1 - .endif - vmull.p8 t0q, t0l, \bd @ F =3D A1*B - vext.8 t1l, \ad, \ad, #2 @ A2 - vmull.p8 t4q, \ad, \b1 @ E =3D A*B1 - .ifc \b2, t3l - vext.8 t3l, \bd, \bd, #2 @ B2 - .endif - vmull.p8 t1q, t1l, \bd @ H =3D A2*B - vext.8 t2l, \ad, \ad, #3 @ A3 - vmull.p8 t3q, \ad, \b2 @ G =3D A*B2 - veor t0q, t0q, t4q @ L =3D E + F - .ifc \b3, t4l - vext.8 t4l, \bd, \bd, #3 @ B3 - .endif - vmull.p8 t2q, t2l, \bd @ J =3D A3*B - veor t0l, t0l, t0h @ t0 =3D (L) (P0 + P1) << 8 - veor t1q, t1q, t3q @ M =3D G + H - .ifc \b4, t3l - vext.8 t3l, \bd, \bd, #4 @ B4 - .endif - vmull.p8 t4q, \ad, \b3 @ I =3D A*B3 - veor t1l, t1l, t1h @ t1 =3D (M) (P2 + P3) << 16 - vmull.p8 t3q, \ad, \b4 @ K =3D A*B4 - vand t0h, t0h, k48 - vand t1h, t1h, k32 - veor t2q, t2q, t4q @ N =3D I + J - veor t0l, t0l, t0h - veor t1l, t1l, t1h - veor t2l, t2l, t2h @ t2 =3D (N) (P4 + P5) << 24 - vand t2h, t2h, k16 - veor t3l, t3l, t3h @ t3 =3D (K) (P6 + P7) << 32 - vmov.i64 t3h, #0 - vext.8 t0q, t0q, t0q, #15 - veor t2l, t2l, t2h - vext.8 t1q, t1q, t1q, #14 - vmull.p8 \rq, \ad, \bd @ D =3D A*B - vext.8 t2q, t2q, t2q, #13 - vext.8 t3q, t3q, t3q, #12 - veor t0q, t0q, t1q - veor t2q, t2q, t3q - veor \rq, \rq, t0q - veor \rq, \rq, t2q - .endm - - // - // PMULL (64x64->128) based reduction for CPUs that can do - // it in a single instruction. 
- // .macro __pmull_reduce_p64 vmull.p64 T1, XL_L, MASK =20 veor XH_L, XH_L, XM_H vext.8 T1, T1, T1, #8 @@ -168,34 +73,11 @@ veor T1, T1, XL =20 vmull.p64 XL, T1_H, MASK .endm =20 - // - // Alternative reduction for CPUs that lack support for the - // 64x64->128 PMULL instruction - // - .macro __pmull_reduce_p8 - veor XL_H, XL_H, XM_L - veor XH_L, XH_L, XM_H - - vshl.i64 T1, XL, #57 - vshl.i64 T2, XL, #62 - veor T1, T1, T2 - vshl.i64 T2, XL, #63 - veor T1, T1, T2 - veor XL_H, XL_H, T1_L - veor XH_L, XH_L, T1_H - - vshr.u64 T1, XL, #1 - veor XH, XH, XL - veor XL, XL, T1 - vshr.u64 T1, T1, #6 - vshr.u64 XL, XL, #1 - .endm - - .macro ghash_update, pn, enc, aggregate=3D1, head=3D1 + .macro ghash_update, enc, aggregate=3D1, head=3D1 vld1.64 {XL}, [r1] =20 .if \head /* do the head block first, if supplied */ ldr ip, [sp] @@ -204,12 +86,11 @@ vld1.64 {T1}, [ip] teq r0, #0 b 3f .endif =20 -0: .ifc \pn, p64 - .if \aggregate +0: .if \aggregate tst r0, #3 // skip until #blocks is a bne 2f // round multiple of 4 =20 vld1.8 {XL2-XM2}, [r2]! 1: vld1.8 {T2-T3}, [r2]! @@ -286,11 +167,10 @@ veor T1, T1, XH veor XL, XL, T1 =20 b 1b .endif - .endif =20 2: vld1.8 {T1}, [r2]! 
=20 .ifnb \enc \enc\()_1x T1 @@ -306,29 +186,29 @@ =20 vext.8 IN1, T1, T1, #8 veor T1_L, T1_L, XL_H veor XL, XL, IN1 =20 - __pmull_\pn XH, XL_H, SHASH_H, s1h, s2h, s3h, s4h @ a1 * b1 + vmull.p64 XH, XL_H, SHASH_H @ a1 * b1 veor T1, T1, XL - __pmull_\pn XL, XL_L, SHASH_L, s1l, s2l, s3l, s4l @ a0 * b0 - __pmull_\pn XM, T1_L, SHASH2_\pn @ (a1+a0)(b1+b0) + vmull.p64 XL, XL_L, SHASH_L @ a0 * b0 + vmull.p64 XM, T1_L, SHASH2_p64 @ (a1+a0)(b1+b0) =20 4: veor T1, XL, XH veor XM, XM, T1 =20 - __pmull_reduce_\pn + __pmull_reduce_p64 =20 veor T1, T1, XH veor XL, XL, T1 =20 bne 0b .endm =20 /* - * void pmull_ghash_update(int blocks, u64 dg[], const char *src, - * struct ghash_key const *k, const char *head) + * void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, + * u64 const h[4][2], const char *head) */ ENTRY(pmull_ghash_update_p64) vld1.64 {SHASH}, [r3]! vld1.64 {HH}, [r3]! vld1.64 {HH3-HH4}, [r3] @@ -339,39 +219,16 @@ ENTRY(pmull_ghash_update_p64) veor HH34_H, HH4_L, HH4_H =20 vmov.i8 MASK, #0xe1 vshl.u64 MASK, MASK, #57 =20 - ghash_update p64 + ghash_update vst1.64 {XL}, [r1] =20 bx lr ENDPROC(pmull_ghash_update_p64) =20 -ENTRY(pmull_ghash_update_p8) - vld1.64 {SHASH}, [r3] - veor SHASH2_p8, SHASH_L, SHASH_H - - vext.8 s1l, SHASH_L, SHASH_L, #1 - vext.8 s2l, SHASH_L, SHASH_L, #2 - vext.8 s3l, SHASH_L, SHASH_L, #3 - vext.8 s4l, SHASH_L, SHASH_L, #4 - vext.8 s1h, SHASH_H, SHASH_H, #1 - vext.8 s2h, SHASH_H, SHASH_H, #2 - vext.8 s3h, SHASH_H, SHASH_H, #3 - vext.8 s4h, SHASH_H, SHASH_H, #4 - - vmov.i64 k16, #0xffff - vmov.i64 k32, #0xffffffff - vmov.i64 k48, #0xffffffffffff - - ghash_update p8 - vst1.64 {XL}, [r1] - - bx lr -ENDPROC(pmull_ghash_update_p8) - e0 .req q9 e1 .req q10 e2 .req q11 e3 .req q12 e0l .req d18 @@ -534,11 +391,11 @@ ENTRY(pmull_gcm_encrypt) ldrd r4, r5, [sp, #24] ldrd r6, r7, [sp, #32] =20 vld1.64 {SHASH}, [r3] =20 - ghash_update p64, enc, head=3D0 + ghash_update enc, head=3D0 vst1.64 {XL}, [r1] =20 pop {r4-r8, pc} ENDPROC(pmull_gcm_encrypt) 
=20 @@ -552,11 +409,11 @@ ENTRY(pmull_gcm_decrypt) ldrd r4, r5, [sp, #24] ldrd r6, r7, [sp, #32] =20 vld1.64 {SHASH}, [r3] =20 - ghash_update p64, dec, head=3D0 + ghash_update dec, head=3D0 vst1.64 {XL}, [r1] =20 pop {r4-r8, pc} ENDPROC(pmull_gcm_decrypt) =20 @@ -601,11 +458,11 @@ ENTRY(pmull_gcm_enc_final) vmov.i8 MASK, #0xe1 veor SHASH2_p64, SHASH_L, SHASH_H vshl.u64 MASK, MASK, #57 mov r0, #1 bne 3f // process head block first - ghash_update p64, aggregate=3D0, head=3D0 + ghash_update aggregate=3D0, head=3D0 =20 vrev64.8 XL, XL vext.8 XL, XL, XL, #8 veor XL, XL, e1 =20 @@ -658,11 +515,11 @@ ENTRY(pmull_gcm_dec_final) vmov.i8 MASK, #0xe1 veor SHASH2_p64, SHASH_L, SHASH_H vshl.u64 MASK, MASK, #57 mov r0, #1 bne 3f // process head block first - ghash_update p64, aggregate=3D0, head=3D0 + ghash_update aggregate=3D0, head=3D0 =20 vrev64.8 XL, XL vext.8 XL, XL, XL, #8 veor XL, XL, e1 =20 diff --git a/arch/arm/crypto/ghash-neon-core.S b/arch/arm/crypto/ghash-neon= -core.S new file mode 100644 index 000000000000..bdf6fb6d063c --- /dev/null +++ b/arch/arm/crypto/ghash-neon-core.S @@ -0,0 +1,207 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Accelerated GHASH implementation with NEON vmull.p8 instructions. + * + * Copyright (C) 2015 - 2017 Linaro Ltd. + * Copyright (C) 2023 Google LLC. 
+ */ + +#include +#include + + .fpu neon + + SHASH .req q0 + T1 .req q1 + XL .req q2 + XM .req q3 + XH .req q4 + IN1 .req q4 + + SHASH_L .req d0 + SHASH_H .req d1 + T1_L .req d2 + T1_H .req d3 + XL_L .req d4 + XL_H .req d5 + XM_L .req d6 + XM_H .req d7 + XH_L .req d8 + + t0l .req d10 + t0h .req d11 + t1l .req d12 + t1h .req d13 + t2l .req d14 + t2h .req d15 + t3l .req d16 + t3h .req d17 + t4l .req d18 + t4h .req d19 + + t0q .req q5 + t1q .req q6 + t2q .req q7 + t3q .req q8 + t4q .req q9 + + s1l .req d20 + s1h .req d21 + s2l .req d22 + s2h .req d23 + s3l .req d24 + s3h .req d25 + s4l .req d26 + s4h .req d27 + + SHASH2_p8 .req d28 + + k16 .req d29 + k32 .req d30 + k48 .req d31 + + T2 .req q7 + + .text + + /* + * This implementation of 64x64 -> 128 bit polynomial multiplication + * using vmull.p8 instructions (8x8 -> 16) is taken from the paper + * "Fast Software Polynomial Multiplication on ARM Processors Using + * the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and + * Ricardo Dahab (https://hal.inria.fr/hal-01506572) + * + * It has been slightly tweaked for in-order performance, and to allow + * 'rq' to overlap with 'ad' or 'bd'. 
+ */ + .macro __pmull_p8, rq, ad, bd, b1=3Dt4l, b2=3Dt3l, b3=3Dt4l, b4=3Dt3l + vext.8 t0l, \ad, \ad, #1 @ A1 + .ifc \b1, t4l + vext.8 t4l, \bd, \bd, #1 @ B1 + .endif + vmull.p8 t0q, t0l, \bd @ F =3D A1*B + vext.8 t1l, \ad, \ad, #2 @ A2 + vmull.p8 t4q, \ad, \b1 @ E =3D A*B1 + .ifc \b2, t3l + vext.8 t3l, \bd, \bd, #2 @ B2 + .endif + vmull.p8 t1q, t1l, \bd @ H =3D A2*B + vext.8 t2l, \ad, \ad, #3 @ A3 + vmull.p8 t3q, \ad, \b2 @ G =3D A*B2 + veor t0q, t0q, t4q @ L =3D E + F + .ifc \b3, t4l + vext.8 t4l, \bd, \bd, #3 @ B3 + .endif + vmull.p8 t2q, t2l, \bd @ J =3D A3*B + veor t0l, t0l, t0h @ t0 =3D (L) (P0 + P1) << 8 + veor t1q, t1q, t3q @ M =3D G + H + .ifc \b4, t3l + vext.8 t3l, \bd, \bd, #4 @ B4 + .endif + vmull.p8 t4q, \ad, \b3 @ I =3D A*B3 + veor t1l, t1l, t1h @ t1 =3D (M) (P2 + P3) << 16 + vmull.p8 t3q, \ad, \b4 @ K =3D A*B4 + vand t0h, t0h, k48 + vand t1h, t1h, k32 + veor t2q, t2q, t4q @ N =3D I + J + veor t0l, t0l, t0h + veor t1l, t1l, t1h + veor t2l, t2l, t2h @ t2 =3D (N) (P4 + P5) << 24 + vand t2h, t2h, k16 + veor t3l, t3l, t3h @ t3 =3D (K) (P6 + P7) << 32 + vmov.i64 t3h, #0 + vext.8 t0q, t0q, t0q, #15 + veor t2l, t2l, t2h + vext.8 t1q, t1q, t1q, #14 + vmull.p8 \rq, \ad, \bd @ D =3D A*B + vext.8 t2q, t2q, t2q, #13 + vext.8 t3q, t3q, t3q, #12 + veor t0q, t0q, t1q + veor t2q, t2q, t3q + veor \rq, \rq, t0q + veor \rq, \rq, t2q + .endm + + .macro __pmull_reduce_p8 + veor XL_H, XL_H, XM_L + veor XH_L, XH_L, XM_H + + vshl.i64 T1, XL, #57 + vshl.i64 T2, XL, #62 + veor T1, T1, T2 + vshl.i64 T2, XL, #63 + veor T1, T1, T2 + veor XL_H, XL_H, T1_L + veor XH_L, XH_L, T1_H + + vshr.u64 T1, XL, #1 + veor XH, XH, XL + veor XL, XL, T1 + vshr.u64 T1, T1, #6 + vshr.u64 XL, XL, #1 + .endm + + .macro ghash_update + vld1.64 {XL}, [r1] + + /* do the head block first, if supplied */ + ldr ip, [sp] + teq ip, #0 + beq 0f + vld1.64 {T1}, [ip] + teq r0, #0 + b 3f + +0: + vld1.8 {T1}, [r2]! 
+ subs r0, r0, #1 + +3: /* multiply XL by SHASH in GF(2^128) */ + vrev64.8 T1, T1 + + vext.8 IN1, T1, T1, #8 + veor T1_L, T1_L, XL_H + veor XL, XL, IN1 + + __pmull_p8 XH, XL_H, SHASH_H, s1h, s2h, s3h, s4h @ a1 * b1 + veor T1, T1, XL + __pmull_p8 XL, XL_L, SHASH_L, s1l, s2l, s3l, s4l @ a0 * b0 + __pmull_p8 XM, T1_L, SHASH2_p8 @ (a1+a0)(b1+b0) + + veor T1, XL, XH + veor XM, XM, T1 + + __pmull_reduce_p8 + + veor T1, T1, XH + veor XL, XL, T1 + + bne 0b + .endm + + /* + * void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, + * u64 const h[1][2], const char *head) + */ +ENTRY(pmull_ghash_update_p8) + vld1.64 {SHASH}, [r3] + veor SHASH2_p8, SHASH_L, SHASH_H + + vext.8 s1l, SHASH_L, SHASH_L, #1 + vext.8 s2l, SHASH_L, SHASH_L, #2 + vext.8 s3l, SHASH_L, SHASH_L, #3 + vext.8 s4l, SHASH_L, SHASH_L, #4 + vext.8 s1h, SHASH_H, SHASH_H, #1 + vext.8 s2h, SHASH_H, SHASH_H, #2 + vext.8 s3h, SHASH_H, SHASH_H, #3 + vext.8 s4h, SHASH_H, SHASH_H, #4 + + vmov.i64 k16, #0xffff + vmov.i64 k32, #0xffffffff + vmov.i64 k48, #0xffffffffffff + + ghash_update + vst1.64 {XL}, [r1] + + bx lr +ENDPROC(pmull_ghash_update_p8) --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0D74634F487; Thu, 19 Mar 2026 06:19:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901158; cv=none; b=L41DYXdnjnG5A3EA8dCTf0G3Gx0tInu6Uh3X2RXfvlv/8HMjmfFEw92Dfspt3WAcSyPU0CitiY15on6fILom8Tb/1vBQf7JITo+67031KXNqIc89T7J8EdNKUFE5CaFQChZIAQl+IvfjR+ypD4s1KtnRG7gm+gDedFA0YRIkmzA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901158; c=relaxed/simple; bh=SUlBh6ude964ptwoct2eRUqHN2yhqaxUURXeunMfh1w=; 
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 07/19] lib/crypto: arm/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:08 -0700 Message-ID: <20260319061723.1140720-8-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Remove the "ghash-neon" crypto_shash algorithm. Move the corresponding assembly code into lib/crypto/, and wire it up to the GHASH library. This makes the GHASH library optimized on arm (though only with NEON, not PMULL; for now the goal is just parity with crypto_shash). It greatly reduces the amount of arm-specific glue code that is needed, and it fixes the issue where this optimization was disabled by default. To integrate the assembly code correctly with the library, make the following tweaks: - Change the type of 'blocks' from int to size_t. - Change the types of 'dg' and 'k' to polyval_elem. Note that this simply reflects the format that the code was already using, at least on little-endian CPUs. For big-endian CPUs, add byte-swaps. - Remove the 'head' argument, which is no longer needed.
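[Editor's illustration, not part of the patch: the routine being migrated accelerates repeated multiplications in GF(2^128) using GHASH's bit-reflected element encoding, the "unnatural bit order" the cover text mentions and the reason optimized code prefers to recast GHASH as POLYVAL. A minimal pure-Python sketch of that multiplication, following the shift-and-reduce algorithm from NIST SP 800-38D; the function name is made up:]

```python
# Reduction constant: x^128 + x^7 + x^2 + x + 1, in GHASH's reflected encoding.
GHASH_POLY = 0xE1000000000000000000000000000000

def ghash_mul(x: int, y: int) -> int:
    """Multiply two GF(2^128) elements in GHASH's bit-reflected convention.

    Elements are 128-bit integers in which bit 127 (the integer MSB)
    holds the coefficient of x^0, so "multiply by x" is a right shift.
    """
    z, v = 0, x
    for i in range(127, -1, -1):      # scan y's bits, x^0 coefficient first
        if (y >> i) & 1:
            z ^= v
        # v *= x; fold any bit shifted off the bottom back in via the
        # field polynomial
        v = (v >> 1) ^ GHASH_POLY if v & 1 else v >> 1
    return z
```

[One consequence of this encoding is that the multiplicative identity is the byte string 80 00 ... 00 rather than a natural-order 1; POLYVAL uses the opposite, little-endian-friendly convention, which is why implementations such as the ones in this series map between the two with byte-swaps and pre/post-processing.]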
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm/crypto/Kconfig | 13 +- arch/arm/crypto/Makefile | 2 +- arch/arm/crypto/ghash-ce-glue.c | 144 +----------------- lib/crypto/Kconfig | 1 + lib/crypto/Makefile | 1 + lib/crypto/arm/gf128hash.h | 43 ++++++ .../crypto/arm}/ghash-neon-core.S | 24 +-- 7 files changed, 66 insertions(+), 162 deletions(-) create mode 100644 lib/crypto/arm/gf128hash.h rename {arch/arm/crypto =3D> lib/crypto/arm}/ghash-neon-core.S (92%) diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig index b9c28c818b7c..f884b8b2fd93 100644 --- a/arch/arm/crypto/Kconfig +++ b/arch/arm/crypto/Kconfig @@ -1,30 +1,21 @@ # SPDX-License-Identifier: GPL-2.0 =20 menu "Accelerated Cryptographic Algorithms for CPU (arm)" =20 config CRYPTO_GHASH_ARM_CE - tristate "Hash functions: GHASH (PMULL/NEON/ARMv8 Crypto Extensions)" + tristate "AEAD cipher: AES in GCM mode (ARMv8 Crypto Extensions)" depends on KERNEL_MODE_NEON select CRYPTO_AEAD - select CRYPTO_HASH - select CRYPTO_CRYPTD select CRYPTO_LIB_AES select CRYPTO_LIB_GF128MUL help - GCM GHASH function (NIST SP800-38D) + AEAD cipher: AES-GCM =20 Architecture: arm using - - PMULL (Polynomial Multiply Long) instructions - - NEON (Advanced SIMD) extensions - ARMv8 Crypto Extensions =20 - Use an implementation of GHASH (used by the GCM AEAD chaining mode) - that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64) - that is part of the ARMv8 Crypto Extensions, or a slower variant that - uses the vmull.p8 instruction that is part of the basic NEON ISA. 
- config CRYPTO_AES_ARM_BS tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (bit-sliced NEON)" depends on KERNEL_MODE_NEON select CRYPTO_SKCIPHER select CRYPTO_LIB_AES diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile index cedce94d5ee5..e73099e120b3 100644 --- a/arch/arm/crypto/Makefile +++ b/arch/arm/crypto/Makefile @@ -8,6 +8,6 @@ obj-$(CONFIG_CRYPTO_AES_ARM_BS) +=3D aes-arm-bs.o obj-$(CONFIG_CRYPTO_AES_ARM_CE) +=3D aes-arm-ce.o obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) +=3D ghash-arm-ce.o =20 aes-arm-bs-y :=3D aes-neonbs-core.o aes-neonbs-glue.o aes-arm-ce-y :=3D aes-ce-core.o aes-ce-glue.o -ghash-arm-ce-y :=3D ghash-ce-core.o ghash-ce-glue.o ghash-neon-core.o +ghash-arm-ce-y :=3D ghash-ce-core.o ghash-ce-glue.o diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glu= e.c index d7d787de7dd3..9aa0ada5b627 100644 --- a/arch/arm/crypto/ghash-ce-glue.c +++ b/arch/arm/crypto/ghash-ce-glue.c @@ -1,8 +1,8 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Accelerated GHASH implementation with ARMv8 vmull.p64 instructions. + * AES-GCM using ARMv8 Crypto Extensions * * Copyright (C) 2015 - 2018 Linaro Ltd. * Copyright (C) 2023 Google LLC. 
*/ =20 @@ -12,116 +12,38 @@ #include #include #include #include #include -#include #include #include #include #include #include #include #include #include #include =20 -MODULE_DESCRIPTION("GHASH hash function using ARMv8 Crypto Extensions"); +MODULE_DESCRIPTION("AES-GCM using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL"); -MODULE_ALIAS_CRYPTO("ghash"); MODULE_ALIAS_CRYPTO("gcm(aes)"); MODULE_ALIAS_CRYPTO("rfc4106(gcm(aes))"); =20 #define RFC4106_NONCE_SIZE 4 =20 -struct ghash_key { - be128 k; - u64 h[1][2]; -}; - struct gcm_key { u64 h[4][2]; u32 rk[AES_MAX_KEYLENGTH_U32]; int rounds; u8 nonce[]; // for RFC4106 nonce }; =20 -struct arm_ghash_desc_ctx { - u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)]; -}; - asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *s= rc, u64 const h[4][2], const char *head); =20 -asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *sr= c, - u64 const h[1][2], const char *head); - -static int ghash_init(struct shash_desc *desc) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - - *ctx =3D (struct arm_ghash_desc_ctx){}; - return 0; -} - -static void ghash_do_update(int blocks, u64 dg[], const char *src, - struct ghash_key *key, const char *head) -{ - kernel_neon_begin(); - pmull_ghash_update_p8(blocks, dg, src, key->h, head); - kernel_neon_end(); -} - -static int ghash_update(struct shash_desc *desc, const u8 *src, - unsigned int len) -{ - struct ghash_key *key =3D crypto_shash_ctx(desc->tfm); - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - int blocks; - - blocks =3D len / GHASH_BLOCK_SIZE; - ghash_do_update(blocks, ctx->digest, src, key, NULL); - return len - blocks * GHASH_BLOCK_SIZE; -} - -static int ghash_export(struct shash_desc *desc, void *out) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - u8 *dst =3D out; - - put_unaligned_be64(ctx->digest[1], dst); - put_unaligned_be64(ctx->digest[0], dst + 8); - return 0; -} - 
-static int ghash_import(struct shash_desc *desc, const void *in) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - const u8 *src =3D in; - - ctx->digest[1] =3D get_unaligned_be64(src); - ctx->digest[0] =3D get_unaligned_be64(src + 8); - return 0; -} - -static int ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *dst) -{ - struct ghash_key *key =3D crypto_shash_ctx(desc->tfm); - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - - if (len) { - u8 buf[GHASH_BLOCK_SIZE] =3D {}; - - memcpy(buf, src, len); - ghash_do_update(1, ctx->digest, buf, key, NULL); - memzero_explicit(buf, sizeof(buf)); - } - return ghash_export(desc, dst); -} - static void ghash_reflect(u64 h[], const be128 *k) { u64 carry =3D be64_to_cpu(k->a) >> 63; =20 h[0] =3D (be64_to_cpu(k->b) << 1) | carry; @@ -129,44 +51,10 @@ static void ghash_reflect(u64 h[], const be128 *k) =20 if (carry) h[1] ^=3D 0xc200000000000000UL; } =20 -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *inkey, unsigned int keylen) -{ - struct ghash_key *key =3D crypto_shash_ctx(tfm); - - if (keylen !=3D GHASH_BLOCK_SIZE) - return -EINVAL; - - /* needed for the fallback */ - memcpy(&key->k, inkey, GHASH_BLOCK_SIZE); - ghash_reflect(key->h[0], &key->k); - return 0; -} - -static struct shash_alg ghash_alg =3D { - .digestsize =3D GHASH_DIGEST_SIZE, - .init =3D ghash_init, - .update =3D ghash_update, - .finup =3D ghash_finup, - .setkey =3D ghash_setkey, - .export =3D ghash_export, - .import =3D ghash_import, - .descsize =3D sizeof(struct arm_ghash_desc_ctx), - .statesize =3D sizeof(struct ghash_desc_ctx), - - .base.cra_name =3D "ghash", - .base.cra_driver_name =3D "ghash-neon", - .base.cra_priority =3D 300, - .base.cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, - .base.cra_blocksize =3D GHASH_BLOCK_SIZE, - .base.cra_ctxsize =3D sizeof(struct ghash_key), - .base.cra_module =3D THIS_MODULE, -}; - void pmull_gcm_encrypt(int blocks, u64 dg[], const char *src, struct gcm_key const 
*k, char *dst, const char *iv, int rounds, u32 counter); =20 void pmull_gcm_enc_final(int blocks, u64 dg[], char *tag, @@ -541,40 +429,18 @@ static struct aead_alg gcm_aes_algs[] =3D {{ .base.cra_module =3D THIS_MODULE, }}; =20 static int __init ghash_ce_mod_init(void) { - int err; - - if (!(elf_hwcap & HWCAP_NEON)) + if (!(elf_hwcap & HWCAP_NEON) || !(elf_hwcap2 & HWCAP2_PMULL)) return -ENODEV; =20 - if (elf_hwcap2 & HWCAP2_PMULL) { - err =3D crypto_register_aeads(gcm_aes_algs, - ARRAY_SIZE(gcm_aes_algs)); - if (err) - return err; - } - - err =3D crypto_register_shash(&ghash_alg); - if (err) - goto err_aead; - - return 0; - -err_aead: - if (elf_hwcap2 & HWCAP2_PMULL) - crypto_unregister_aeads(gcm_aes_algs, - ARRAY_SIZE(gcm_aes_algs)); - return err; + return crypto_register_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); } =20 static void __exit ghash_ce_mod_exit(void) { - crypto_unregister_shash(&ghash_alg); - if (elf_hwcap2 & HWCAP2_PMULL) - crypto_unregister_aeads(gcm_aes_algs, - ARRAY_SIZE(gcm_aes_algs)); + crypto_unregister_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); } =20 module_init(ghash_ce_mod_init); module_exit(ghash_ce_mod_exit); diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig index 98cedd95c2a5..4f1a79883a56 100644 --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -117,10 +117,11 @@ config CRYPTO_LIB_GF128HASH uses any of the functions from . 
=20 config CRYPTO_LIB_GF128HASH_ARCH bool depends on CRYPTO_LIB_GF128HASH && !UML + default y if ARM && KERNEL_MODE_NEON default y if ARM64 default y if X86_64 =20 config CRYPTO_LIB_MD5 tristate diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index fc30622123d2..8a06dd6a43ea 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -156,10 +156,11 @@ libdes-y :=3D des.o =20 obj-$(CONFIG_CRYPTO_LIB_GF128HASH) +=3D libgf128hash.o libgf128hash-y :=3D gf128hash.o ifeq ($(CONFIG_CRYPTO_LIB_GF128HASH_ARCH),y) CFLAGS_gf128hash.o +=3D -I$(src)/$(SRCARCH) +libgf128hash-$(CONFIG_ARM) +=3D arm/ghash-neon-core.o libgf128hash-$(CONFIG_ARM64) +=3D arm64/polyval-ce-core.o libgf128hash-$(CONFIG_X86) +=3D x86/polyval-pclmul-avx.o endif =20 ##########################################################################= ###### diff --git a/lib/crypto/arm/gf128hash.h b/lib/crypto/arm/gf128hash.h new file mode 100644 index 000000000000..cb929bed29d5 --- /dev/null +++ b/lib/crypto/arm/gf128hash.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * GHASH, arm optimized + * + * Copyright 2026 Google LLC + */ + +#include +#include +#include + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); + +void pmull_ghash_update_p8(size_t blocks, struct polyval_elem *dg, + const u8 *src, const struct polyval_elem *k); + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if (static_branch_likely(&have_neon) && may_use_simd()) { + do { + /* Allow rescheduling every 4 KiB. 
*/ + size_t n =3D + min_t(size_t, nblocks, 4096 / GHASH_BLOCK_SIZE); + + scoped_ksimd() + pmull_ghash_update_p8(n, acc, data, &key->h); + data +=3D n * GHASH_BLOCK_SIZE; + nblocks -=3D n; + } while (nblocks); + } else { + ghash_blocks_generic(acc, &key->h, data, nblocks); + } +} + +#define gf128hash_mod_init_arch gf128hash_mod_init_arch +static void gf128hash_mod_init_arch(void) +{ + if (elf_hwcap & HWCAP_NEON) + static_branch_enable(&have_neon); +} diff --git a/arch/arm/crypto/ghash-neon-core.S b/lib/crypto/arm/ghash-neon-= core.S similarity index 92% rename from arch/arm/crypto/ghash-neon-core.S rename to lib/crypto/arm/ghash-neon-core.S index bdf6fb6d063c..bf423fb06a75 100644 --- a/arch/arm/crypto/ghash-neon-core.S +++ b/lib/crypto/arm/ghash-neon-core.S @@ -139,26 +139,25 @@ veor XL, XL, T1 vshr.u64 T1, T1, #6 vshr.u64 XL, XL, #1 .endm =20 + .macro vrev64_if_be a +#ifdef CONFIG_CPU_BIG_ENDIAN + vrev64.8 \a, \a +#endif + .endm + .macro ghash_update vld1.64 {XL}, [r1] - - /* do the head block first, if supplied */ - ldr ip, [sp] - teq ip, #0 - beq 0f - vld1.64 {T1}, [ip] - teq r0, #0 - b 3f + vrev64_if_be XL =20 0: vld1.8 {T1}, [r2]! 
subs r0, r0, #1 =20 -3: /* multiply XL by SHASH in GF(2^128) */ + /* multiply XL by SHASH in GF(2^128) */ vrev64.8 T1, T1 =20 vext.8 IN1, T1, T1, #8 veor T1_L, T1_L, XL_H veor XL, XL, IN1 @@ -178,15 +177,17 @@ =20 bne 0b .endm =20 /* - * void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, - * u64 const h[1][2], const char *head) + * void pmull_ghash_update_p8(size_t blocks, struct polyval_elem *dg, + * const u8 *src, + * const struct polyval_elem *k) */ ENTRY(pmull_ghash_update_p8) vld1.64 {SHASH}, [r3] + vrev64_if_be SHASH veor SHASH2_p8, SHASH_L, SHASH_H =20 vext.8 s1l, SHASH_L, SHASH_L, #1 vext.8 s2l, SHASH_L, SHASH_L, #2 vext.8 s3l, SHASH_L, SHASH_L, #3 @@ -199,9 +200,10 @@ ENTRY(pmull_ghash_update_p8) vmov.i64 k16, #0xffff vmov.i64 k32, #0xffffffff vmov.i64 k48, #0xffffffffffff =20 ghash_update + vrev64_if_be XL vst1.64 {XL}, [r1] =20 bx lr ENDPROC(pmull_ghash_update_p8) --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 08/19] crypto: arm64/ghash - Move NEON GHASH assembly into its own file Date: Wed, 18 Mar 2026 23:17:09 -0700 Message-ID: <20260319061723.1140720-9-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" arch/arm64/crypto/ghash-ce-core.S implements pmull_ghash_update_p8(), which is used only by a crypto_shash implementation of GHASH. It also implements other functions, including pmull_ghash_update_p64() and others, which are used only by a crypto_aead implementation of AES-GCM. While some code is shared between pmull_ghash_update_p8() and pmull_ghash_update_p64(), it's not very much. Since pmull_ghash_update_p8() will also need to be migrated into lib/crypto/ to achieve parity in the standalone GHASH support, let's move it into a separate file ghash-neon-core.S.
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm64/crypto/Makefile | 2 +- arch/arm64/crypto/ghash-ce-core.S | 207 ++----------------------- arch/arm64/crypto/ghash-neon-core.S | 226 ++++++++++++++++++++++++++++ 3 files changed, 239 insertions(+), 196 deletions(-) create mode 100644 arch/arm64/crypto/ghash-neon-core.S diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 8a8e3e551ed3..b7ba43ce8584 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -25,11 +25,11 @@ sm4-ce-gcm-y :=3D sm4-ce-gcm-glue.o sm4-ce-gcm-core.o =20 obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) +=3D sm4-neon.o sm4-neon-y :=3D sm4-neon-glue.o sm4-neon-core.o =20 obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) +=3D ghash-ce.o -ghash-ce-y :=3D ghash-ce-glue.o ghash-ce-core.o +ghash-ce-y :=3D ghash-ce-glue.o ghash-ce-core.o ghash-neon-core.o =20 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_CCM) +=3D aes-ce-ccm.o aes-ce-ccm-y :=3D aes-ce-ccm-glue.o aes-ce-ccm-core.o =20 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_BLK) +=3D aes-ce-blk.o diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce= -core.S index 23ee9a5eaf27..4344fe213d14 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * Accelerated GHASH implementation with ARMv8 PMULL instructions. + * Accelerated AES-GCM implementation with ARMv8 Crypto Extensions. * * Copyright (C) 2014 - 2018 Linaro Ltd. 
*/ =20 #include @@ -17,35 +17,10 @@ XM .req v5 XL .req v6 XH .req v7 IN1 .req v7 =20 - k00_16 .req v8 - k32_48 .req v9 - - t3 .req v10 - t4 .req v11 - t5 .req v12 - t6 .req v13 - t7 .req v14 - t8 .req v15 - t9 .req v16 - - perm1 .req v17 - perm2 .req v18 - perm3 .req v19 - - sh1 .req v20 - sh2 .req v21 - sh3 .req v22 - sh4 .req v23 - - ss1 .req v24 - ss2 .req v25 - ss3 .req v26 - ss4 .req v27 - XL2 .req v8 XM2 .req v9 XH2 .req v10 XL3 .req v11 XM3 .req v12 @@ -58,94 +33,10 @@ HH34 .req v19 =20 .text .arch armv8-a+crypto =20 - .macro __pmull_p64, rd, rn, rm - pmull \rd\().1q, \rn\().1d, \rm\().1d - .endm - - .macro __pmull2_p64, rd, rn, rm - pmull2 \rd\().1q, \rn\().2d, \rm\().2d - .endm - - .macro __pmull_p8, rq, ad, bd - ext t3.8b, \ad\().8b, \ad\().8b, #1 // A1 - ext t5.8b, \ad\().8b, \ad\().8b, #2 // A2 - ext t7.8b, \ad\().8b, \ad\().8b, #3 // A3 - - __pmull_p8_\bd \rq, \ad - .endm - - .macro __pmull2_p8, rq, ad, bd - tbl t3.16b, {\ad\().16b}, perm1.16b // A1 - tbl t5.16b, {\ad\().16b}, perm2.16b // A2 - tbl t7.16b, {\ad\().16b}, perm3.16b // A3 - - __pmull2_p8_\bd \rq, \ad - .endm - - .macro __pmull_p8_SHASH, rq, ad - __pmull_p8_tail \rq, \ad\().8b, SHASH.8b, 8b,, sh1, sh2, sh3, sh4 - .endm - - .macro __pmull_p8_SHASH2, rq, ad - __pmull_p8_tail \rq, \ad\().8b, SHASH2.8b, 8b,, ss1, ss2, ss3, ss4 - .endm - - .macro __pmull2_p8_SHASH, rq, ad - __pmull_p8_tail \rq, \ad\().16b, SHASH.16b, 16b, 2, sh1, sh2, sh3, sh4 - .endm - - .macro __pmull_p8_tail, rq, ad, bd, nb, t, b1, b2, b3, b4 - pmull\t t3.8h, t3.\nb, \bd // F =3D A1*B - pmull\t t4.8h, \ad, \b1\().\nb // E =3D A*B1 - pmull\t t5.8h, t5.\nb, \bd // H =3D A2*B - pmull\t t6.8h, \ad, \b2\().\nb // G =3D A*B2 - pmull\t t7.8h, t7.\nb, \bd // J =3D A3*B - pmull\t t8.8h, \ad, \b3\().\nb // I =3D A*B3 - pmull\t t9.8h, \ad, \b4\().\nb // K =3D A*B4 - pmull\t \rq\().8h, \ad, \bd // D =3D A*B - - eor t3.16b, t3.16b, t4.16b // L =3D E + F - eor t5.16b, t5.16b, t6.16b // M =3D G + H - eor t7.16b, t7.16b, t8.16b // N =3D I + 
J - - uzp1 t4.2d, t3.2d, t5.2d - uzp2 t3.2d, t3.2d, t5.2d - uzp1 t6.2d, t7.2d, t9.2d - uzp2 t7.2d, t7.2d, t9.2d - - // t3 =3D (L) (P0 + P1) << 8 - // t5 =3D (M) (P2 + P3) << 16 - eor t4.16b, t4.16b, t3.16b - and t3.16b, t3.16b, k32_48.16b - - // t7 =3D (N) (P4 + P5) << 24 - // t9 =3D (K) (P6 + P7) << 32 - eor t6.16b, t6.16b, t7.16b - and t7.16b, t7.16b, k00_16.16b - - eor t4.16b, t4.16b, t3.16b - eor t6.16b, t6.16b, t7.16b - - zip2 t5.2d, t4.2d, t3.2d - zip1 t3.2d, t4.2d, t3.2d - zip2 t9.2d, t6.2d, t7.2d - zip1 t7.2d, t6.2d, t7.2d - - ext t3.16b, t3.16b, t3.16b, #15 - ext t5.16b, t5.16b, t5.16b, #14 - ext t7.16b, t7.16b, t7.16b, #13 - ext t9.16b, t9.16b, t9.16b, #12 - - eor t3.16b, t3.16b, t5.16b - eor t7.16b, t7.16b, t9.16b - eor \rq\().16b, \rq\().16b, t3.16b - eor \rq\().16b, \rq\().16b, t7.16b - .endm - .macro __pmull_pre_p64 add x8, x3, #16 ld1 {HH.2d-HH4.2d}, [x8] =20 trn1 SHASH2.2d, SHASH.2d, HH.2d @@ -158,47 +49,10 @@ =20 movi MASK.16b, #0xe1 shl MASK.2d, MASK.2d, #57 .endm =20 - .macro __pmull_pre_p8 - ext SHASH2.16b, SHASH.16b, SHASH.16b, #8 - eor SHASH2.16b, SHASH2.16b, SHASH.16b - - // k00_16 :=3D 0x0000000000000000_000000000000ffff - // k32_48 :=3D 0x00000000ffffffff_0000ffffffffffff - movi k32_48.2d, #0xffffffff - mov k32_48.h[2], k32_48.h[0] - ushr k00_16.2d, k32_48.2d, #32 - - // prepare the permutation vectors - mov_q x5, 0x080f0e0d0c0b0a09 - movi T1.8b, #8 - dup perm1.2d, x5 - eor perm1.16b, perm1.16b, T1.16b - ushr perm2.2d, perm1.2d, #8 - ushr perm3.2d, perm1.2d, #16 - ushr T1.2d, perm1.2d, #24 - sli perm2.2d, perm1.2d, #56 - sli perm3.2d, perm1.2d, #48 - sli T1.2d, perm1.2d, #40 - - // precompute loop invariants - tbl sh1.16b, {SHASH.16b}, perm1.16b - tbl sh2.16b, {SHASH.16b}, perm2.16b - tbl sh3.16b, {SHASH.16b}, perm3.16b - tbl sh4.16b, {SHASH.16b}, T1.16b - ext ss1.8b, SHASH2.8b, SHASH2.8b, #1 - ext ss2.8b, SHASH2.8b, SHASH2.8b, #2 - ext ss3.8b, SHASH2.8b, SHASH2.8b, #3 - ext ss4.8b, SHASH2.8b, SHASH2.8b, #4 - .endm - - // - // PMULL 
(64x64->128) based reduction for CPUs that can do - // it in a single instruction. - // .macro __pmull_reduce_p64 pmull T2.1q, XL.1d, MASK.1d eor XM.16b, XM.16b, T1.16b =20 mov XH.d[0], XM.d[1] @@ -207,51 +61,27 @@ eor XL.16b, XM.16b, T2.16b ext T2.16b, XL.16b, XL.16b, #8 pmull XL.1q, XL.1d, MASK.1d .endm =20 - // - // Alternative reduction for CPUs that lack support for the - // 64x64->128 PMULL instruction - // - .macro __pmull_reduce_p8 - eor XM.16b, XM.16b, T1.16b - - mov XL.d[1], XM.d[0] - mov XH.d[0], XM.d[1] - - shl T1.2d, XL.2d, #57 - shl T2.2d, XL.2d, #62 - eor T2.16b, T2.16b, T1.16b - shl T1.2d, XL.2d, #63 - eor T2.16b, T2.16b, T1.16b - ext T1.16b, XL.16b, XH.16b, #8 - eor T2.16b, T2.16b, T1.16b - - mov XL.d[1], T2.d[0] - mov XH.d[0], T2.d[1] - - ushr T2.2d, XL.2d, #1 - eor XH.16b, XH.16b, XL.16b - eor XL.16b, XL.16b, T2.16b - ushr T2.2d, T2.2d, #6 - ushr XL.2d, XL.2d, #1 - .endm - - .macro __pmull_ghash, pn + /* + * void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, + * u64 const h[][2], const char *head) + */ +SYM_TYPED_FUNC_START(pmull_ghash_update_p64) ld1 {SHASH.2d}, [x3] ld1 {XL.2d}, [x1] =20 - __pmull_pre_\pn + __pmull_pre_p64 =20 /* do the head block first, if supplied */ cbz x4, 0f ld1 {T1.2d}, [x4] mov x4, xzr b 3f =20 -0: .ifc \pn, p64 +0: tbnz w0, #0, 2f // skip until #blocks is a tbnz w0, #1, 2f // round multiple of 4 =20 1: ld1 {XM3.16b-TT4.16b}, [x2], #64 =20 @@ -312,11 +142,10 @@ eor T2.16b, T2.16b, XH.16b eor XL.16b, XL.16b, T2.16b =20 cbz w0, 5f b 1b - .endif =20 2: ld1 {T1.2d}, [x2], #16 sub w0, w0, #1 =20 3: /* multiply XL by SHASH in GF(2^128) */ @@ -325,42 +154,30 @@ CPU_LE( rev64 T1.16b, T1.16b ) ext T2.16b, XL.16b, XL.16b, #8 ext IN1.16b, T1.16b, T1.16b, #8 eor T1.16b, T1.16b, T2.16b eor XL.16b, XL.16b, IN1.16b =20 - __pmull2_\pn XH, XL, SHASH // a1 * b1 + pmull2 XH.1q, XL.2d, SHASH.2d // a1 * b1 eor T1.16b, T1.16b, XL.16b - __pmull_\pn XL, XL, SHASH // a0 * b0 - __pmull_\pn XM, T1, SHASH2 // (a1 + a0)(b1 + b0) + 
pmull XL.1q, XL.1d, SHASH.1d // a0 * b0 + pmull XM.1q, T1.1d, SHASH2.1d // (a1 + a0)(b1 + b0) =20 4: eor T2.16b, XL.16b, XH.16b ext T1.16b, XL.16b, XH.16b, #8 eor XM.16b, XM.16b, T2.16b =20 - __pmull_reduce_\pn + __pmull_reduce_p64 =20 eor T2.16b, T2.16b, XH.16b eor XL.16b, XL.16b, T2.16b =20 cbnz w0, 0b =20 5: st1 {XL.2d}, [x1] ret - .endm - - /* - * void pmull_ghash_update(int blocks, u64 dg[], const char *src, - * struct ghash_key const *k, const char *head) - */ -SYM_TYPED_FUNC_START(pmull_ghash_update_p64) - __pmull_ghash p64 SYM_FUNC_END(pmull_ghash_update_p64) =20 -SYM_TYPED_FUNC_START(pmull_ghash_update_p8) - __pmull_ghash p8 -SYM_FUNC_END(pmull_ghash_update_p8) - KS0 .req v8 KS1 .req v9 KS2 .req v10 KS3 .req v11 =20 diff --git a/arch/arm64/crypto/ghash-neon-core.S b/arch/arm64/crypto/ghash-= neon-core.S new file mode 100644 index 000000000000..6157135ad566 --- /dev/null +++ b/arch/arm64/crypto/ghash-neon-core.S @@ -0,0 +1,226 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Accelerated GHASH implementation with ARMv8 ASIMD instructions. + * + * Copyright (C) 2014 - 2018 Linaro Ltd. 
+ */ + +#include +#include +#include + + SHASH .req v0 + SHASH2 .req v1 + T1 .req v2 + T2 .req v3 + XM .req v5 + XL .req v6 + XH .req v7 + IN1 .req v7 + + k00_16 .req v8 + k32_48 .req v9 + + t3 .req v10 + t4 .req v11 + t5 .req v12 + t6 .req v13 + t7 .req v14 + t8 .req v15 + t9 .req v16 + + perm1 .req v17 + perm2 .req v18 + perm3 .req v19 + + sh1 .req v20 + sh2 .req v21 + sh3 .req v22 + sh4 .req v23 + + ss1 .req v24 + ss2 .req v25 + ss3 .req v26 + ss4 .req v27 + + .text + + .macro __pmull_p8, rq, ad, bd + ext t3.8b, \ad\().8b, \ad\().8b, #1 // A1 + ext t5.8b, \ad\().8b, \ad\().8b, #2 // A2 + ext t7.8b, \ad\().8b, \ad\().8b, #3 // A3 + + __pmull_p8_\bd \rq, \ad + .endm + + .macro __pmull2_p8, rq, ad, bd + tbl t3.16b, {\ad\().16b}, perm1.16b // A1 + tbl t5.16b, {\ad\().16b}, perm2.16b // A2 + tbl t7.16b, {\ad\().16b}, perm3.16b // A3 + + __pmull2_p8_\bd \rq, \ad + .endm + + .macro __pmull_p8_SHASH, rq, ad + __pmull_p8_tail \rq, \ad\().8b, SHASH.8b, 8b,, sh1, sh2, sh3, sh4 + .endm + + .macro __pmull_p8_SHASH2, rq, ad + __pmull_p8_tail \rq, \ad\().8b, SHASH2.8b, 8b,, ss1, ss2, ss3, ss4 + .endm + + .macro __pmull2_p8_SHASH, rq, ad + __pmull_p8_tail \rq, \ad\().16b, SHASH.16b, 16b, 2, sh1, sh2, sh3, sh4 + .endm + + .macro __pmull_p8_tail, rq, ad, bd, nb, t, b1, b2, b3, b4 + pmull\t t3.8h, t3.\nb, \bd // F =3D A1*B + pmull\t t4.8h, \ad, \b1\().\nb // E =3D A*B1 + pmull\t t5.8h, t5.\nb, \bd // H =3D A2*B + pmull\t t6.8h, \ad, \b2\().\nb // G =3D A*B2 + pmull\t t7.8h, t7.\nb, \bd // J =3D A3*B + pmull\t t8.8h, \ad, \b3\().\nb // I =3D A*B3 + pmull\t t9.8h, \ad, \b4\().\nb // K =3D A*B4 + pmull\t \rq\().8h, \ad, \bd // D =3D A*B + + eor t3.16b, t3.16b, t4.16b // L =3D E + F + eor t5.16b, t5.16b, t6.16b // M =3D G + H + eor t7.16b, t7.16b, t8.16b // N =3D I + J + + uzp1 t4.2d, t3.2d, t5.2d + uzp2 t3.2d, t3.2d, t5.2d + uzp1 t6.2d, t7.2d, t9.2d + uzp2 t7.2d, t7.2d, t9.2d + + // t3 =3D (L) (P0 + P1) << 8 + // t5 =3D (M) (P2 + P3) << 16 + eor t4.16b, t4.16b, t3.16b + and t3.16b, 
t3.16b, k32_48.16b + + // t7 =3D (N) (P4 + P5) << 24 + // t9 =3D (K) (P6 + P7) << 32 + eor t6.16b, t6.16b, t7.16b + and t7.16b, t7.16b, k00_16.16b + + eor t4.16b, t4.16b, t3.16b + eor t6.16b, t6.16b, t7.16b + + zip2 t5.2d, t4.2d, t3.2d + zip1 t3.2d, t4.2d, t3.2d + zip2 t9.2d, t6.2d, t7.2d + zip1 t7.2d, t6.2d, t7.2d + + ext t3.16b, t3.16b, t3.16b, #15 + ext t5.16b, t5.16b, t5.16b, #14 + ext t7.16b, t7.16b, t7.16b, #13 + ext t9.16b, t9.16b, t9.16b, #12 + + eor t3.16b, t3.16b, t5.16b + eor t7.16b, t7.16b, t9.16b + eor \rq\().16b, \rq\().16b, t3.16b + eor \rq\().16b, \rq\().16b, t7.16b + .endm + + .macro __pmull_pre_p8 + ext SHASH2.16b, SHASH.16b, SHASH.16b, #8 + eor SHASH2.16b, SHASH2.16b, SHASH.16b + + // k00_16 :=3D 0x0000000000000000_000000000000ffff + // k32_48 :=3D 0x00000000ffffffff_0000ffffffffffff + movi k32_48.2d, #0xffffffff + mov k32_48.h[2], k32_48.h[0] + ushr k00_16.2d, k32_48.2d, #32 + + // prepare the permutation vectors + mov_q x5, 0x080f0e0d0c0b0a09 + movi T1.8b, #8 + dup perm1.2d, x5 + eor perm1.16b, perm1.16b, T1.16b + ushr perm2.2d, perm1.2d, #8 + ushr perm3.2d, perm1.2d, #16 + ushr T1.2d, perm1.2d, #24 + sli perm2.2d, perm1.2d, #56 + sli perm3.2d, perm1.2d, #48 + sli T1.2d, perm1.2d, #40 + + // precompute loop invariants + tbl sh1.16b, {SHASH.16b}, perm1.16b + tbl sh2.16b, {SHASH.16b}, perm2.16b + tbl sh3.16b, {SHASH.16b}, perm3.16b + tbl sh4.16b, {SHASH.16b}, T1.16b + ext ss1.8b, SHASH2.8b, SHASH2.8b, #1 + ext ss2.8b, SHASH2.8b, SHASH2.8b, #2 + ext ss3.8b, SHASH2.8b, SHASH2.8b, #3 + ext ss4.8b, SHASH2.8b, SHASH2.8b, #4 + .endm + + .macro __pmull_reduce_p8 + eor XM.16b, XM.16b, T1.16b + + mov XL.d[1], XM.d[0] + mov XH.d[0], XM.d[1] + + shl T1.2d, XL.2d, #57 + shl T2.2d, XL.2d, #62 + eor T2.16b, T2.16b, T1.16b + shl T1.2d, XL.2d, #63 + eor T2.16b, T2.16b, T1.16b + ext T1.16b, XL.16b, XH.16b, #8 + eor T2.16b, T2.16b, T1.16b + + mov XL.d[1], T2.d[0] + mov XH.d[0], T2.d[1] + + ushr T2.2d, XL.2d, #1 + eor XH.16b, XH.16b, XL.16b + eor XL.16b, XL.16b, 
T2.16b + ushr T2.2d, T2.2d, #6 + ushr XL.2d, XL.2d, #1 + .endm + + /* + * void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, + * u64 const h[][2], const char *head) + */ +SYM_TYPED_FUNC_START(pmull_ghash_update_p8) + ld1 {SHASH.2d}, [x3] + ld1 {XL.2d}, [x1] + + __pmull_pre_p8 + + /* do the head block first, if supplied */ + cbz x4, 0f + ld1 {T1.2d}, [x4] + mov x4, xzr + b 3f + +0: ld1 {T1.2d}, [x2], #16 + sub w0, w0, #1 + +3: /* multiply XL by SHASH in GF(2^128) */ +CPU_LE( rev64 T1.16b, T1.16b ) + + ext T2.16b, XL.16b, XL.16b, #8 + ext IN1.16b, T1.16b, T1.16b, #8 + eor T1.16b, T1.16b, T2.16b + eor XL.16b, XL.16b, IN1.16b + + __pmull2_p8 XH, XL, SHASH // a1 * b1 + eor T1.16b, T1.16b, XL.16b + __pmull_p8 XL, XL, SHASH // a0 * b0 + __pmull_p8 XM, T1, SHASH2 // (a1 + a0)(b1 + b0) + + eor T2.16b, XL.16b, XH.16b + ext T1.16b, XL.16b, XH.16b, #8 + eor XM.16b, XM.16b, T2.16b + + __pmull_reduce_p8 + + eor T2.16b, T2.16b, XH.16b + eor XL.16b, XL.16b, T2.16b + + cbnz w0, 0b + + st1 {XL.2d}, [x1] + ret +SYM_FUNC_END(pmull_ghash_update_p8) --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 09/19] lib/crypto: arm64/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:10 -0700 Message-ID: <20260319061723.1140720-10-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Remove the "ghash-neon" crypto_shash algorithm. Move the corresponding assembly code into lib/crypto/, and wire it up to the GHASH library. This makes the GHASH library be optimized on arm64 (though only with NEON, not PMULL; for now the goal is just parity with crypto_shash). It greatly reduces the amount of arm64-specific glue code that is needed, and it fixes the issue where this optimization was disabled by default. To integrate the assembly code correctly with the library, make the following tweaks: - Change the type of 'blocks' from int to size_t - Change the types of 'dg' and 'k' to polyval_elem. Note that this simply reflects the format that the code was already using. - Remove the 'head' argument, which is no longer needed. - Remove the CFI stubs, as indirect calls are no longer used.
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 5 +- arch/arm64/crypto/Makefile | 2 +- arch/arm64/crypto/ghash-ce-core.S | 3 +- arch/arm64/crypto/ghash-ce-glue.c | 146 ++---------------- lib/crypto/Makefile | 3 +- lib/crypto/arm64/gf128hash.h | 68 +++++++- .../crypto/arm64}/ghash-neon-core.S | 20 +-- 7 files changed, 86 insertions(+), 161 deletions(-) rename {arch/arm64/crypto =3D> lib/crypto/arm64}/ghash-neon-core.S (93%) diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 82794afaffc9..1a0c553fbfd7 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -1,18 +1,17 @@ # SPDX-License-Identifier: GPL-2.0 =20 menu "Accelerated Cryptographic Algorithms for CPU (arm64)" =20 config CRYPTO_GHASH_ARM64_CE - tristate "Hash functions: GHASH (ARMv8 Crypto Extensions)" + tristate "AEAD cipher: AES in GCM mode (ARMv8 Crypto Extensions)" depends on KERNEL_MODE_NEON - select CRYPTO_HASH select CRYPTO_LIB_AES select CRYPTO_LIB_GF128MUL select CRYPTO_AEAD help - GCM GHASH function (NIST SP800-38D) + AEAD cipher: AES-GCM =20 Architecture: arm64 using: - ARMv8 Crypto Extensions =20 config CRYPTO_SM3_NEON diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index b7ba43ce8584..8a8e3e551ed3 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -25,11 +25,11 @@ sm4-ce-gcm-y :=3D sm4-ce-gcm-glue.o sm4-ce-gcm-core.o =20 obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) +=3D sm4-neon.o sm4-neon-y :=3D sm4-neon-glue.o sm4-neon-core.o =20 obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) +=3D ghash-ce.o -ghash-ce-y :=3D ghash-ce-glue.o ghash-ce-core.o ghash-neon-core.o +ghash-ce-y :=3D ghash-ce-glue.o ghash-ce-core.o =20 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_CCM) +=3D aes-ce-ccm.o aes-ce-ccm-y :=3D aes-ce-ccm-glue.o aes-ce-ccm-core.o =20 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_BLK) +=3D aes-ce-blk.o diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce= -core.S index 
4344fe213d14..a01f136f4fb2 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -4,11 +4,10 @@ * * Copyright (C) 2014 - 2018 Linaro Ltd. */ =20 #include -#include #include =20 SHASH .req v0 SHASH2 .req v1 T1 .req v2 @@ -65,11 +64,11 @@ =20 /* * void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, * u64 const h[][2], const char *head) */ -SYM_TYPED_FUNC_START(pmull_ghash_update_p64) +SYM_FUNC_START(pmull_ghash_update_p64) ld1 {SHASH.2d}, [x3] ld1 {XL.2d}, [x1] =20 __pmull_pre_p64 =20 diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce= -glue.c index 63bb9e062251..42fb46bdc124 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -1,19 +1,18 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Accelerated GHASH implementation with ARMv8 PMULL instructions. + * AES-GCM using ARMv8 Crypto Extensions * * Copyright (C) 2014 - 2018 Linaro Ltd. */ =20 #include #include #include #include #include #include -#include #include #include #include #include #include @@ -21,14 +20,15 @@ #include #include =20 #include =20 -MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions"); +MODULE_DESCRIPTION("AES-GCM using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); -MODULE_ALIAS_CRYPTO("ghash"); +MODULE_ALIAS_CRYPTO("gcm(aes)"); +MODULE_ALIAS_CRYPTO("rfc4106(gcm(aes))"); =20 #define RFC4106_NONCE_SIZE 4 =20 struct ghash_key { be128 k; @@ -46,100 +46,23 @@ struct gcm_aes_ctx { }; =20 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *s= rc, u64 const h[][2], const char *head); =20 -asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *sr= c, - u64 const h[][2], const char *head); - asmlinkage void pmull_gcm_encrypt(int bytes, u8 dst[], const u8 src[], u64 const h[][2], u64 dg[], u8 ctr[], u32 const rk[], int rounds, u8 tag[]); asmlinkage int pmull_gcm_decrypt(int bytes, u8 dst[], const u8 
src[], u64 const h[][2], u64 dg[], u8 ctr[], u32 const rk[], int rounds, const u8 l[], const u8 tag[], u64 authsize); =20 -static int ghash_init(struct shash_desc *desc) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - - *ctx =3D (struct arm_ghash_desc_ctx){}; - return 0; -} - -static __always_inline -void ghash_do_simd_update(int blocks, u64 dg[], const char *src, - struct ghash_key *key, const char *head, - void (*simd_update)(int blocks, u64 dg[], - const char *src, - u64 const h[][2], - const char *head)) +static void ghash_do_simd_update(int blocks, u64 dg[], const char *src, + struct ghash_key *key, const char *head) { scoped_ksimd() - simd_update(blocks, dg, src, key->h, head); -} - -/* avoid hogging the CPU for too long */ -#define MAX_BLOCKS (SZ_64K / GHASH_BLOCK_SIZE) - -static int ghash_update(struct shash_desc *desc, const u8 *src, - unsigned int len) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - struct ghash_key *key =3D crypto_shash_ctx(desc->tfm); - int blocks; - - blocks =3D len / GHASH_BLOCK_SIZE; - len -=3D blocks * GHASH_BLOCK_SIZE; - - do { - int chunk =3D min(blocks, MAX_BLOCKS); - - ghash_do_simd_update(chunk, ctx->digest, src, key, NULL, - pmull_ghash_update_p8); - blocks -=3D chunk; - src +=3D chunk * GHASH_BLOCK_SIZE; - } while (unlikely(blocks > 0)); - return len; -} - -static int ghash_export(struct shash_desc *desc, void *out) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - u8 *dst =3D out; - - put_unaligned_be64(ctx->digest[1], dst); - put_unaligned_be64(ctx->digest[0], dst + 8); - return 0; -} - -static int ghash_import(struct shash_desc *desc, const void *in) -{ - struct arm_ghash_desc_ctx *ctx =3D shash_desc_ctx(desc); - const u8 *src =3D in; - - ctx->digest[1] =3D get_unaligned_be64(src); - ctx->digest[0] =3D get_unaligned_be64(src + 8); - return 0; -} - -static int ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *dst) -{ - struct arm_ghash_desc_ctx *ctx =3D 
shash_desc_ctx(desc); - struct ghash_key *key =3D crypto_shash_ctx(desc->tfm); - - if (len) { - u8 buf[GHASH_BLOCK_SIZE] =3D {}; - - memcpy(buf, src, len); - ghash_do_simd_update(1, ctx->digest, buf, key, NULL, - pmull_ghash_update_p8); - memzero_explicit(buf, sizeof(buf)); - } - return ghash_export(desc, dst); + pmull_ghash_update_p64(blocks, dg, src, key->h, head); } =20 static void ghash_reflect(u64 h[], const be128 *k) { u64 carry =3D be64_to_cpu(k->a) & BIT(63) ? 1 : 0; @@ -149,45 +72,10 @@ static void ghash_reflect(u64 h[], const be128 *k) =20 if (carry) h[1] ^=3D 0xc200000000000000UL; } =20 -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *inkey, unsigned int keylen) -{ - struct ghash_key *key =3D crypto_shash_ctx(tfm); - - if (keylen !=3D GHASH_BLOCK_SIZE) - return -EINVAL; - - /* needed for the fallback */ - memcpy(&key->k, inkey, GHASH_BLOCK_SIZE); - - ghash_reflect(key->h[0], &key->k); - return 0; -} - -static struct shash_alg ghash_alg =3D { - .base.cra_name =3D "ghash", - .base.cra_driver_name =3D "ghash-neon", - .base.cra_priority =3D 150, - .base.cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, - .base.cra_blocksize =3D GHASH_BLOCK_SIZE, - .base.cra_ctxsize =3D sizeof(struct ghash_key) + sizeof(u64[2]), - .base.cra_module =3D THIS_MODULE, - - .digestsize =3D GHASH_DIGEST_SIZE, - .init =3D ghash_init, - .update =3D ghash_update, - .finup =3D ghash_finup, - .setkey =3D ghash_setkey, - .export =3D ghash_export, - .import =3D ghash_import, - .descsize =3D sizeof(struct arm_ghash_desc_ctx), - .statesize =3D sizeof(struct ghash_desc_ctx), -}; - static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey, unsigned int keylen) { struct gcm_aes_ctx *ctx =3D crypto_aead_ctx(tfm); u8 key[GHASH_BLOCK_SIZE]; @@ -238,13 +126,11 @@ static void gcm_update_mac(u64 dg[], const u8 *src, i= nt count, u8 buf[], =20 if (count >=3D GHASH_BLOCK_SIZE || *buf_count =3D=3D GHASH_BLOCK_SIZE) { int blocks =3D count / GHASH_BLOCK_SIZE; =20 
ghash_do_simd_update(blocks, dg, src, &ctx->ghash_key, - *buf_count ? buf : NULL, - pmull_ghash_update_p64); - + *buf_count ? buf : NULL); src +=3D blocks * GHASH_BLOCK_SIZE; count %=3D GHASH_BLOCK_SIZE; *buf_count =3D 0; } =20 @@ -273,12 +159,11 @@ static void gcm_calculate_auth_mac(struct aead_reques= t *req, u64 dg[], u32 len) len -=3D n; } while (len); =20 if (buf_count) { memset(&buf[buf_count], 0, GHASH_BLOCK_SIZE - buf_count); - ghash_do_simd_update(1, dg, buf, &ctx->ghash_key, NULL, - pmull_ghash_update_p64); + ghash_do_simd_update(1, dg, buf, &ctx->ghash_key, NULL); } } =20 static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen) { @@ -503,26 +388,19 @@ static struct aead_alg gcm_aes_algs[] =3D {{ .base.cra_module =3D THIS_MODULE, }}; =20 static int __init ghash_ce_mod_init(void) { - if (!cpu_have_named_feature(ASIMD)) + if (!cpu_have_named_feature(ASIMD) || !cpu_have_named_feature(PMULL)) return -ENODEV; =20 - if (cpu_have_named_feature(PMULL)) - return crypto_register_aeads(gcm_aes_algs, - ARRAY_SIZE(gcm_aes_algs)); - - return crypto_register_shash(&ghash_alg); + return crypto_register_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); } =20 static void __exit ghash_ce_mod_exit(void) { - if (cpu_have_named_feature(PMULL)) - crypto_unregister_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); - else - crypto_unregister_shash(&ghash_alg); + crypto_unregister_aeads(gcm_aes_algs, ARRAY_SIZE(gcm_aes_algs)); } =20 static const struct cpu_feature __maybe_unused ghash_cpu_feature[] =3D { { cpu_feature(PMULL) }, { } }; diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index 8a06dd6a43ea..4ce0bac8fd93 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -157,11 +157,12 @@ libdes-y :=3D des.o obj-$(CONFIG_CRYPTO_LIB_GF128HASH) +=3D libgf128hash.o libgf128hash-y :=3D gf128hash.o ifeq ($(CONFIG_CRYPTO_LIB_GF128HASH_ARCH),y) CFLAGS_gf128hash.o +=3D -I$(src)/$(SRCARCH) libgf128hash-$(CONFIG_ARM) +=3D arm/ghash-neon-core.o 
-libgf128hash-$(CONFIG_ARM64) +=3D arm64/polyval-ce-core.o +libgf128hash-$(CONFIG_ARM64) +=3D arm64/ghash-neon-core.o \ + arm64/polyval-ce-core.o libgf128hash-$(CONFIG_X86) +=3D x86/polyval-pclmul-avx.o endif =20 ##########################################################################= ###### =20 diff --git a/lib/crypto/arm64/gf128hash.h b/lib/crypto/arm64/gf128hash.h index 796c36804dda..d5ef1b1b77e1 100644 --- a/lib/crypto/arm64/gf128hash.h +++ b/lib/crypto/arm64/gf128hash.h @@ -1,23 +1,27 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * POLYVAL library functions, arm64 optimized + * GHASH and POLYVAL, arm64 optimized * * Copyright 2025 Google LLC */ #include #include =20 #define NUM_H_POWERS 8 =20 +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_asimd); static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_pmull); =20 asmlinkage void polyval_mul_pmull(struct polyval_elem *a, const struct polyval_elem *b); asmlinkage void polyval_blocks_pmull(struct polyval_elem *acc, const struct polyval_key *key, const u8 *data, size_t nblocks); +asmlinkage void pmull_ghash_update_p8(size_t blocks, struct polyval_elem *= dg, + const u8 *src, + const struct polyval_elem *k); =20 #define polyval_preparekey_arch polyval_preparekey_arch static void polyval_preparekey_arch(struct polyval_key *key, const u8 raw_key[POLYVAL_BLOCK_SIZE]) { @@ -39,19 +43,66 @@ static void polyval_preparekey_arch(struct polyval_key = *key, &key->h_powers[NUM_H_POWERS - 1]); } } } =20 +static void polyval_mul_arm64(struct polyval_elem *a, + const struct polyval_elem *b) +{ + if (static_branch_likely(&have_asimd) && may_use_simd()) { + static const u8 zeroes[GHASH_BLOCK_SIZE]; + + scoped_ksimd() { + if (static_branch_likely(&have_pmull)) { + polyval_mul_pmull(a, b); + } else { + /* + * Note that this is indeed equivalent to a + * POLYVAL multiplication, since it takes the + * accumulator and key in POLYVAL format, and + * byte-swapping a block of zeroes is a no-op. 
+ */ + pmull_ghash_update_p8(1, a, zeroes, b); + } + } + } else { + polyval_mul_generic(a, b); + } +} + +#define ghash_mul_arch ghash_mul_arch +static void ghash_mul_arch(struct polyval_elem *acc, + const struct ghash_key *key) +{ + polyval_mul_arm64(acc, &key->h); +} + #define polyval_mul_arch polyval_mul_arch static void polyval_mul_arch(struct polyval_elem *acc, const struct polyval_key *key) { - if (static_branch_likely(&have_pmull) && may_use_simd()) { - scoped_ksimd() - polyval_mul_pmull(acc, &key->h_powers[NUM_H_POWERS - 1]); + polyval_mul_arm64(acc, &key->h_powers[NUM_H_POWERS - 1]); +} + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if (static_branch_likely(&have_asimd) && may_use_simd()) { + do { + /* Allow rescheduling every 4 KiB. */ + size_t n =3D + min_t(size_t, nblocks, 4096 / GHASH_BLOCK_SIZE); + + scoped_ksimd() + pmull_ghash_update_p8(n, acc, data, &key->h); + data +=3D n * GHASH_BLOCK_SIZE; + nblocks -=3D n; + } while (nblocks); } else { - polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]); + ghash_blocks_generic(acc, &key->h, data, nblocks); } } =20 #define polyval_blocks_arch polyval_blocks_arch static void polyval_blocks_arch(struct polyval_elem *acc, @@ -76,8 +127,11 @@ static void polyval_blocks_arch(struct polyval_elem *ac= c, } =20 #define gf128hash_mod_init_arch gf128hash_mod_init_arch static void gf128hash_mod_init_arch(void) { - if (cpu_have_named_feature(PMULL)) - static_branch_enable(&have_pmull); + if (cpu_have_named_feature(ASIMD)) { + static_branch_enable(&have_asimd); + if (cpu_have_named_feature(PMULL)) + static_branch_enable(&have_pmull); + } } diff --git a/arch/arm64/crypto/ghash-neon-core.S b/lib/crypto/arm64/ghash-n= eon-core.S similarity index 93% rename from arch/arm64/crypto/ghash-neon-core.S rename to lib/crypto/arm64/ghash-neon-core.S index 6157135ad566..eadd6da47247 100644 --- 
a/arch/arm64/crypto/ghash-neon-core.S +++ b/lib/crypto/arm64/ghash-n= eon-core.S @@ -4,11 +4,10 @@ * * Copyright (C) 2014 - 2018 Linaro Ltd. */ =20 #include -#include #include =20 SHASH .req v0 SHASH2 .req v1 T1 .req v2 @@ -177,29 +176,24 @@ ushr T2.2d, T2.2d, #6 ushr XL.2d, XL.2d, #1 .endm =20 /* - * void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, - * u64 const h[][2], const char *head) + * void pmull_ghash_update_p8(size_t blocks, struct polyval_elem *dg, + * const u8 *src, + * const struct polyval_elem *k) */ -SYM_TYPED_FUNC_START(pmull_ghash_update_p8) +SYM_FUNC_START(pmull_ghash_update_p8) ld1 {SHASH.2d}, [x3] ld1 {XL.2d}, [x1] =20 __pmull_pre_p8 =20 - /* do the head block first, if supplied */ - cbz x4, 0f - ld1 {T1.2d}, [x4] - mov x4, xzr - b 3f - 0: ld1 {T1.2d}, [x2], #16 - sub w0, w0, #1 + sub x0, x0, #1 =20 -3: /* multiply XL by SHASH in GF(2^128) */ + /* multiply XL by SHASH in GF(2^128) */ CPU_LE( rev64 T1.16b, T1.16b ) =20 ext T2.16b, XL.16b, XL.16b, #8 ext IN1.16b, T1.16b, T1.16b, #8 eor T1.16b, T1.16b, T2.16b @@ -217,10 +211,10 @@ CPU_LE( rev64 T1.16b, T1.16b ) __pmull_reduce_p8 =20 eor T2.16b, T2.16b, XH.16b eor XL.16b, XL.16b, T2.16b =20 - cbnz w0, 0b + cbnz x0, 0b =20 st1 {XL.2d}, [x1] ret SYM_FUNC_END(pmull_ghash_update_p8) --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 10/19] crypto: arm64/aes-gcm - Rename struct ghash_key and make fixed-sized Date: Wed, 18 Mar 2026 23:17:11 -0700 Message-ID: <20260319061723.1140720-11-ebiggers@kernel.org> In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Rename the 'struct ghash_key' in arch/arm64/crypto/ghash-ce-glue.c to prevent a naming conflict with the library 'struct ghash_key'. In addition, declare the 'h' field with an explicit size, now that there's no longer any reason for it to be a flexible array. Update the comments in the assembly file to match the C code. Note that some of these were out-of-date.
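The flexible-array change above is worth spelling out: `sizeof()` on a struct whose last member is a flexible array excludes that member, which is why the old glue code had to add `4 * sizeof(u64[2])` to `cra_ctxsize` by hand, and why the fixed-size `h[4][2]` lets it use plain `sizeof(struct gcm_aes_ctx)`. A minimal standalone sketch of the size difference (plain C with a stubbed-out `be128`, not the kernel definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's be128: two 64-bit halves. */
struct be128_stub { uint64_t a, b; };

/* Old layout: flexible array member, excluded from sizeof(). */
struct ghash_key_flex {
	struct be128_stub k;
	uint64_t h[][2];
};

/* New layout: fixed-size array, fully covered by sizeof(). */
struct arm_ghash_key_fixed {
	struct be128_stub k;
	uint64_t h[4][2];
};
```

With the flexible array, allocating the context required `sizeof(struct ghash_key_flex) + 4 * sizeof(uint64_t[2])` added on manually; with the fixed array, `sizeof(struct arm_ghash_key_fixed)` alone covers everything (on common ABIs there is no tail padding here, since every member is 8-byte aligned).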
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/arm64/crypto/ghash-ce-core.S | 15 ++++++++------- arch/arm64/crypto/ghash-ce-glue.c | 20 +++++++++----------- 2 files changed, 17 insertions(+), 18 deletions(-) diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce= -core.S index a01f136f4fb2..33772d8fe6b5 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -62,11 +62,11 @@ pmull XL.1q, XL.1d, MASK.1d .endm =20 /* * void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, - * u64 const h[][2], const char *head) + * u64 const h[4][2], const char *head) */ SYM_FUNC_START(pmull_ghash_update_p64) ld1 {SHASH.2d}, [x3] ld1 {XL.2d}, [x1] =20 @@ -411,22 +411,23 @@ CPU_LE( rev w8, w8 ) .endif b 3b .endm =20 /* - * void pmull_gcm_encrypt(int blocks, u8 dst[], const u8 src[], - * struct ghash_key const *k, u64 dg[], u8 ctr[], - * int rounds, u8 tag) + * void pmull_gcm_encrypt(int bytes, u8 dst[], const u8 src[], + * u64 const h[4][2], u64 dg[], u8 ctr[], + * u32 const rk[], int rounds, u8 tag[]) */ SYM_FUNC_START(pmull_gcm_encrypt) pmull_gcm_do_crypt 1 SYM_FUNC_END(pmull_gcm_encrypt) =20 /* - * void pmull_gcm_decrypt(int blocks, u8 dst[], const u8 src[], - * struct ghash_key const *k, u64 dg[], u8 ctr[], - * int rounds, u8 tag) + * int pmull_gcm_decrypt(int bytes, u8 dst[], const u8 src[], + * u64 const h[4][2], u64 dg[], u8 ctr[], + * u32 const rk[], int rounds, const u8 l[], + * const u8 tag[], u64 authsize) */ SYM_FUNC_START(pmull_gcm_decrypt) pmull_gcm_do_crypt 0 SYM_FUNC_END(pmull_gcm_decrypt) =20 diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce= -glue.c index 42fb46bdc124..c74066d430fa 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -28,38 +28,38 @@ MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("gcm(aes)"); MODULE_ALIAS_CRYPTO("rfc4106(gcm(aes))"); =20 #define RFC4106_NONCE_SIZE 4 =20 -struct ghash_key { +struct 
arm_ghash_key { be128 k; - u64 h[][2]; + u64 h[4][2]; }; =20 struct arm_ghash_desc_ctx { u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)]; }; =20 struct gcm_aes_ctx { struct aes_enckey aes_key; u8 nonce[RFC4106_NONCE_SIZE]; - struct ghash_key ghash_key; + struct arm_ghash_key ghash_key; }; =20 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *s= rc, - u64 const h[][2], const char *head); + u64 const h[4][2], const char *head); =20 asmlinkage void pmull_gcm_encrypt(int bytes, u8 dst[], const u8 src[], - u64 const h[][2], u64 dg[], u8 ctr[], + u64 const h[4][2], u64 dg[], u8 ctr[], u32 const rk[], int rounds, u8 tag[]); asmlinkage int pmull_gcm_decrypt(int bytes, u8 dst[], const u8 src[], - u64 const h[][2], u64 dg[], u8 ctr[], + u64 const h[4][2], u64 dg[], u8 ctr[], u32 const rk[], int rounds, const u8 l[], const u8 tag[], u64 authsize); =20 static void ghash_do_simd_update(int blocks, u64 dg[], const char *src, - struct ghash_key *key, const char *head) + struct arm_ghash_key *key, const char *head) { scoped_ksimd() pmull_ghash_update_p64(blocks, dg, src, key->h, head); } =20 @@ -365,12 +365,11 @@ static struct aead_alg gcm_aes_algs[] =3D {{ =20 .base.cra_name =3D "gcm(aes)", .base.cra_driver_name =3D "gcm-aes-ce", .base.cra_priority =3D 300, .base.cra_blocksize =3D 1, - .base.cra_ctxsize =3D sizeof(struct gcm_aes_ctx) + - 4 * sizeof(u64[2]), + .base.cra_ctxsize =3D sizeof(struct gcm_aes_ctx), .base.cra_module =3D THIS_MODULE, }, { .ivsize =3D GCM_RFC4106_IV_SIZE, .chunksize =3D AES_BLOCK_SIZE, .maxauthsize =3D AES_BLOCK_SIZE, @@ -381,12 +380,11 @@ static struct aead_alg gcm_aes_algs[] =3D {{ =20 .base.cra_name =3D "rfc4106(gcm(aes))", .base.cra_driver_name =3D "rfc4106-gcm-aes-ce", .base.cra_priority =3D 300, .base.cra_blocksize =3D 1, - .base.cra_ctxsize =3D sizeof(struct gcm_aes_ctx) + - 4 * sizeof(u64[2]), + .base.cra_ctxsize =3D sizeof(struct gcm_aes_ctx), .base.cra_module =3D THIS_MODULE, }}; =20 static int __init ghash_ce_mod_init(void) { 
--=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 11/19] lib/crypto: powerpc/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:12 -0700 Message-ID: <20260319061723.1140720-12-ebiggers@kernel.org> In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Remove the "p8_ghash" crypto_shash algorithm. Move the corresponding assembly code into lib/crypto/, and wire it up to the GHASH library. This makes the GHASH library be optimized for POWER8. It also greatly reduces the amount of powerpc-specific glue code that is needed, and it fixes the issue where this optimized GHASH code was disabled by default. Note that previously the C code defined the POWER8 GHASH key format as "u128 htable[16]", despite the assembly code only using four entries. Fix the C code to use the correct key format. To fulfill the library API contract, also make the key preparation work in all contexts. Note that the POWER8 assembly code takes the accumulator in GHASH format, but it actually byte-reflects it to get it into POLYVAL format. The library already works with POLYVAL natively. For now, just wire up this existing code by converting it to/from GHASH format in C code. This should be cleaned up to eliminate the unnecessary conversion later.
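The GHASH-format vs. POLYVAL-format accumulator conversion mentioned above amounts to reflecting the 16 bytes of the block, i.e. reversing their order (the *key* conversion additionally involves a multiply by x, which is not shown here). A hedged sketch of just the accumulator mapping — the in-tree helpers `polyval_acc_to_ghash()`/`ghash_acc_to_polyval()` are assumed to do essentially this reflection on the serialized block:

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/*
 * Illustrative only: map a 16-byte accumulator between GHASH's
 * reflected byte order and POLYVAL's natural byte order by reversing
 * the bytes of the block.  The same function goes both directions,
 * since byte reversal is an involution.
 */
static void reflect_block(uint8_t out[BLOCK_SIZE],
			  const uint8_t in[BLOCK_SIZE])
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		out[i] = in[BLOCK_SIZE - 1 - i];
}
```

Because the mapping is its own inverse, converting to GHASH format before calling the POWER8 code and back afterwards round-trips exactly, which is what makes the temporary to/from conversion in this patch safe (if redundant).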
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- MAINTAINERS | 4 +- arch/powerpc/crypto/Kconfig | 5 +- arch/powerpc/crypto/Makefile | 8 +- arch/powerpc/crypto/aesp8-ppc.h | 1 - arch/powerpc/crypto/ghash.c | 160 ------------------ arch/powerpc/crypto/vmx.c | 10 +- include/crypto/gf128hash.h | 4 + lib/crypto/Kconfig | 1 + lib/crypto/Makefile | 25 ++- lib/crypto/powerpc/.gitignore | 1 + lib/crypto/powerpc/gf128hash.h | 109 ++++++++++++ .../crypto/powerpc}/ghashp8-ppc.pl | 1 + 12 files changed, 143 insertions(+), 186 deletions(-) delete mode 100644 arch/powerpc/crypto/ghash.c create mode 100644 lib/crypto/powerpc/gf128hash.h rename {arch/powerpc/crypto =3D> lib/crypto/powerpc}/ghashp8-ppc.pl (98%) diff --git a/MAINTAINERS b/MAINTAINERS index 77fdfcb55f06..f088f4085653 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12265,14 +12265,14 @@ F: arch/powerpc/crypto/Makefile F: arch/powerpc/crypto/aes.c F: arch/powerpc/crypto/aes_cbc.c F: arch/powerpc/crypto/aes_ctr.c F: arch/powerpc/crypto/aes_xts.c F: arch/powerpc/crypto/aesp8-ppc.* -F: arch/powerpc/crypto/ghash.c -F: arch/powerpc/crypto/ghashp8-ppc.pl F: arch/powerpc/crypto/ppc-xlate.pl F: arch/powerpc/crypto/vmx.c +F: lib/crypto/powerpc/gf128hash.h +F: lib/crypto/powerpc/ghashp8-ppc.pl =20 IBM ServeRAID RAID DRIVER S: Orphan F: drivers/scsi/ips.* =20 diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig index 2d056f1fc90f..b247f7ed973e 100644 --- a/arch/powerpc/crypto/Kconfig +++ b/arch/powerpc/crypto/Kconfig @@ -52,14 +52,13 @@ config CRYPTO_DEV_VMX_ENCRYPT tristate "Encryption acceleration support on P8 CPU" depends on CRYPTO_DEV_VMX select CRYPTO_AES select CRYPTO_CBC select CRYPTO_CTR - select CRYPTO_GHASH select CRYPTO_XTS default m help Support for VMX cryptographic acceleration instructions on Power8 CPU. - This module supports acceleration for AES and GHASH in hardware. If you - choose 'M' here, this module will be called vmx-crypto. + This module supports acceleration for AES in hardware. 
If you choose + 'M' here, this module will be called vmx-crypto. =20 endmenu diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile index 3ac0886282a2..a1fe102a90ae 100644 --- a/arch/powerpc/crypto/Makefile +++ b/arch/powerpc/crypto/Makefile @@ -9,11 +9,11 @@ obj-$(CONFIG_CRYPTO_AES_PPC_SPE) +=3D aes-ppc-spe.o obj-$(CONFIG_CRYPTO_AES_GCM_P10) +=3D aes-gcm-p10-crypto.o obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) +=3D vmx-crypto.o =20 aes-ppc-spe-y :=3D aes-spe-glue.o aes-gcm-p10-crypto-y :=3D aes-gcm-p10-glue.o aes-gcm-p10.o ghashp10-ppc.o = aesp10-ppc.o -vmx-crypto-objs :=3D vmx.o ghashp8-ppc.o aes_cbc.o aes_ctr.o aes_xts.o gha= sh.o +vmx-crypto-objs :=3D vmx.o aes_cbc.o aes_ctr.o aes_xts.o =20 ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y) override flavour :=3D linux-ppc64le else ifdef CONFIG_PPC64_ELF_ABI_V2 @@ -24,16 +24,12 @@ endif endif =20 quiet_cmd_perl =3D PERL $@ cmd_perl =3D $(PERL) $< $(flavour) > $@ =20 -targets +=3D aesp10-ppc.S ghashp10-ppc.S ghashp8-ppc.S +targets +=3D aesp10-ppc.S ghashp10-ppc.S =20 $(obj)/aesp10-ppc.S $(obj)/ghashp10-ppc.S: $(obj)/%.S: $(src)/%.pl FORCE $(call if_changed,perl) =20 -$(obj)/ghashp8-ppc.S: $(obj)/%.S: $(src)/%.pl FORCE - $(call if_changed,perl) - OBJECT_FILES_NON_STANDARD_aesp10-ppc.o :=3D y OBJECT_FILES_NON_STANDARD_ghashp10-ppc.o :=3D y -OBJECT_FILES_NON_STANDARD_ghashp8-ppc.o :=3D y diff --git a/arch/powerpc/crypto/aesp8-ppc.h b/arch/powerpc/crypto/aesp8-pp= c.h index 6862c605cc33..c68f5b6965fa 100644 --- a/arch/powerpc/crypto/aesp8-ppc.h +++ b/arch/powerpc/crypto/aesp8-ppc.h @@ -1,8 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #include #include =20 -extern struct shash_alg p8_ghash_alg; extern struct skcipher_alg p8_aes_cbc_alg; extern struct skcipher_alg p8_aes_ctr_alg; extern struct skcipher_alg p8_aes_xts_alg; diff --git a/arch/powerpc/crypto/ghash.c b/arch/powerpc/crypto/ghash.c deleted file mode 100644 index 7308735bdb33..000000000000 --- a/arch/powerpc/crypto/ghash.c +++ /dev/null @@ -1,160 +0,0 @@ -// 
SPDX-License-Identifier: GPL-2.0 -/* - * GHASH routines supporting VMX instructions on the Power 8 - * - * Copyright (C) 2015, 2019 International Business Machines Inc. - * - * Author: Marcelo Henrique Cerri - * - * Extended by Daniel Axtens to replace the fallback - * mechanism. The new approach is based on arm64 code, which is: - * Copyright (C) 2014 - 2018 Linaro Ltd. - */ - -#include "aesp8-ppc.h" -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -void gcm_init_p8(u128 htable[16], const u64 Xi[2]); -void gcm_gmult_p8(u64 Xi[2], const u128 htable[16]); -void gcm_ghash_p8(u64 Xi[2], const u128 htable[16], - const u8 *in, size_t len); - -struct p8_ghash_ctx { - /* key used by vector asm */ - u128 htable[16]; - /* key used by software fallback */ - be128 key; -}; - -struct p8_ghash_desc_ctx { - u64 shash[2]; -}; - -static int p8_ghash_init(struct shash_desc *desc) -{ - struct p8_ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - - memset(dctx->shash, 0, GHASH_DIGEST_SIZE); - return 0; -} - -static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key, - unsigned int keylen) -{ - struct p8_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash_tfm(tfm)); - - if (keylen !=3D GHASH_BLOCK_SIZE) - return -EINVAL; - - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_init_p8(ctx->htable, (const u64 *) key); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - - memcpy(&ctx->key, key, GHASH_BLOCK_SIZE); - - return 0; -} - -static inline void __ghash_block(struct p8_ghash_ctx *ctx, - struct p8_ghash_desc_ctx *dctx, - const u8 *src) -{ - if (crypto_simd_usable()) { - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_ghash_p8(dctx->shash, ctx->htable, src, GHASH_BLOCK_SIZE); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - } else { - crypto_xor((u8 *)dctx->shash, src, GHASH_BLOCK_SIZE); - gf128mul_lle((be128 *)dctx->shash, &ctx->key); - } -} - 
-static inline int __ghash_blocks(struct p8_ghash_ctx *ctx, - struct p8_ghash_desc_ctx *dctx, - const u8 *src, unsigned int srclen) -{ - int remain =3D srclen - round_down(srclen, GHASH_BLOCK_SIZE); - - srclen -=3D remain; - if (crypto_simd_usable()) { - preempt_disable(); - pagefault_disable(); - enable_kernel_vsx(); - gcm_ghash_p8(dctx->shash, ctx->htable, - src, srclen); - disable_kernel_vsx(); - pagefault_enable(); - preempt_enable(); - } else { - do { - crypto_xor((u8 *)dctx->shash, src, GHASH_BLOCK_SIZE); - gf128mul_lle((be128 *)dctx->shash, &ctx->key); - srclen -=3D GHASH_BLOCK_SIZE; - src +=3D GHASH_BLOCK_SIZE; - } while (srclen); - } - - return remain; -} - -static int p8_ghash_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) -{ - struct p8_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash_tfm(desc->tfm)); - struct p8_ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - - return __ghash_blocks(ctx, dctx, src, srclen); -} - -static int p8_ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *out) -{ - struct p8_ghash_ctx *ctx =3D crypto_tfm_ctx(crypto_shash_tfm(desc->tfm)); - struct p8_ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - - if (len) { - u8 buf[GHASH_BLOCK_SIZE] =3D {}; - - memcpy(buf, src, len); - __ghash_block(ctx, dctx, buf); - memzero_explicit(buf, sizeof(buf)); - } - memcpy(out, dctx->shash, GHASH_DIGEST_SIZE); - return 0; -} - -struct shash_alg p8_ghash_alg =3D { - .digestsize =3D GHASH_DIGEST_SIZE, - .init =3D p8_ghash_init, - .update =3D p8_ghash_update, - .finup =3D p8_ghash_finup, - .setkey =3D p8_ghash_setkey, - .descsize =3D sizeof(struct p8_ghash_desc_ctx), - .base =3D { - .cra_name =3D "ghash", - .cra_driver_name =3D "p8_ghash", - .cra_priority =3D 1000, - .cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, - .cra_blocksize =3D GHASH_BLOCK_SIZE, - .cra_ctxsize =3D sizeof(struct p8_ghash_ctx), - .cra_module =3D THIS_MODULE, - }, -}; diff --git a/arch/powerpc/crypto/vmx.c b/arch/powerpc/crypto/vmx.c index 
7d2beb774f99..08da5311dfdf 100644 --- a/arch/powerpc/crypto/vmx.c +++ b/arch/powerpc/crypto/vmx.c @@ -12,26 +12,21 @@ #include #include #include #include #include -#include #include =20 #include "aesp8-ppc.h" =20 static int __init p8_init(void) { int ret; =20 - ret =3D crypto_register_shash(&p8_ghash_alg); - if (ret) - goto err; - ret =3D crypto_register_skcipher(&p8_aes_cbc_alg); if (ret) - goto err_unregister_ghash; + goto err; =20 ret =3D crypto_register_skcipher(&p8_aes_ctr_alg); if (ret) goto err_unregister_aes_cbc; =20 @@ -43,22 +38,19 @@ static int __init p8_init(void) =20 err_unregister_aes_ctr: crypto_unregister_skcipher(&p8_aes_ctr_alg); err_unregister_aes_cbc: crypto_unregister_skcipher(&p8_aes_cbc_alg); -err_unregister_ghash: - crypto_unregister_shash(&p8_ghash_alg); err: return ret; } =20 static void __exit p8_exit(void) { crypto_unregister_skcipher(&p8_aes_xts_alg); crypto_unregister_skcipher(&p8_aes_ctr_alg); crypto_unregister_skcipher(&p8_aes_cbc_alg); - crypto_unregister_shash(&p8_ghash_alg); } =20 module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init); module_exit(p8_exit); =20 diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h index 5090fbaa87f8..650652dd6003 100644 --- a/include/crypto/gf128hash.h +++ b/include/crypto/gf128hash.h @@ -39,10 +39,14 @@ struct polyval_elem { * struct ghash_key - Prepared key for GHASH * * Use ghash_preparekey() to initialize this. 
*/ struct ghash_key { +#if defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && defined(CONFIG_PPC64) + /** @htable: GHASH key format used by the POWER8 assembly code */ + u64 htable[4][2]; +#endif /** @h: The hash key H, in POLYVAL format */ struct polyval_elem h; }; =20 /** diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig index 4f1a79883a56..f54add7d9070 100644 --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -119,10 +119,11 @@ config CRYPTO_LIB_GF128HASH config CRYPTO_LIB_GF128HASH_ARCH bool depends on CRYPTO_LIB_GF128HASH && !UML default y if ARM && KERNEL_MODE_NEON default y if ARM64 + default y if PPC64 && VSX default y if X86_64 =20 config CRYPTO_LIB_MD5 tristate help diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index 4ce0bac8fd93..8a9084188778 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -6,10 +6,14 @@ quiet_cmd_perlasm =3D PERLASM $@ cmd_perlasm =3D $(PERL) $(<) > $(@) =20 quiet_cmd_perlasm_with_args =3D PERLASM $@ cmd_perlasm_with_args =3D $(PERL) $(<) void $(@) =20 +ppc64-perlasm-flavour-y :=3D linux-ppc64 +ppc64-perlasm-flavour-$(CONFIG_PPC64_ELF_ABI_V2) :=3D linux-ppc64-elfv2 +ppc64-perlasm-flavour-$(CONFIG_CPU_LITTLE_ENDIAN) :=3D linux-ppc64le + obj-$(CONFIG_KUNIT) +=3D tests/ =20 obj-$(CONFIG_CRYPTO_HASH_INFO) +=3D hash_info.o =20 obj-$(CONFIG_CRYPTO_LIB_UTILS) +=3D libcryptoutils.o @@ -34,15 +38,12 @@ libaes-y +=3D powerpc/aes-spe-core.o \ powerpc/aes-spe-keys.o \ powerpc/aes-spe-modes.o \ powerpc/aes-tab-4k.o else libaes-y +=3D powerpc/aesp8-ppc.o -aes-perlasm-flavour-y :=3D linux-ppc64 -aes-perlasm-flavour-$(CONFIG_PPC64_ELF_ABI_V2) :=3D linux-ppc64-elfv2 -aes-perlasm-flavour-$(CONFIG_CPU_LITTLE_ENDIAN) :=3D linux-ppc64le quiet_cmd_perlasm_aes =3D PERLASM $@ - cmd_perlasm_aes =3D $(PERL) $< $(aes-perlasm-flavour-y) $@ + cmd_perlasm_aes =3D $(PERL) $< $(ppc64-perlasm-flavour-y) $@ # Use if_changed instead of cmd, in case the flavour changed. 
$(obj)/powerpc/aesp8-ppc.S: $(src)/powerpc/aesp8-ppc.pl FORCE $(call if_changed,perlasm_aes) targets +=3D powerpc/aesp8-ppc.S OBJECT_FILES_NON_STANDARD_powerpc/aesp8-ppc.o :=3D y @@ -159,13 +160,27 @@ libgf128hash-y :=3D gf128hash.o ifeq ($(CONFIG_CRYPTO_LIB_GF128HASH_ARCH),y) CFLAGS_gf128hash.o +=3D -I$(src)/$(SRCARCH) libgf128hash-$(CONFIG_ARM) +=3D arm/ghash-neon-core.o libgf128hash-$(CONFIG_ARM64) +=3D arm64/ghash-neon-core.o \ arm64/polyval-ce-core.o -libgf128hash-$(CONFIG_X86) +=3D x86/polyval-pclmul-avx.o + +ifeq ($(CONFIG_PPC),y) +libgf128hash-y +=3D powerpc/ghashp8-ppc.o +quiet_cmd_perlasm_ghash =3D PERLASM $@ + cmd_perlasm_ghash =3D $(PERL) $< $(ppc64-perlasm-flavour-y) $@ +$(obj)/powerpc/ghashp8-ppc.S: $(src)/powerpc/ghashp8-ppc.pl FORCE + $(call if_changed,perlasm_ghash) +targets +=3D powerpc/ghashp8-ppc.S +OBJECT_FILES_NON_STANDARD_powerpc/ghashp8-ppc.o :=3D y endif =20 +libgf128hash-$(CONFIG_X86) +=3D x86/polyval-pclmul-avx.o +endif # CONFIG_CRYPTO_LIB_GF128HASH_ARCH + +# clean-files must be defined unconditionally +clean-files +=3D powerpc/ghashp8-ppc.S + ##########################################################################= ###### =20 obj-$(CONFIG_CRYPTO_LIB_MD5) +=3D libmd5.o libmd5-y :=3D md5.o ifeq ($(CONFIG_CRYPTO_LIB_MD5_ARCH),y) diff --git a/lib/crypto/powerpc/.gitignore b/lib/crypto/powerpc/.gitignore index 598ca7aff6b1..7aa71d83f739 100644 --- a/lib/crypto/powerpc/.gitignore +++ b/lib/crypto/powerpc/.gitignore @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only aesp8-ppc.S +ghashp8-ppc.S diff --git a/lib/crypto/powerpc/gf128hash.h b/lib/crypto/powerpc/gf128hash.h new file mode 100644 index 000000000000..629cd325d0c7 --- /dev/null +++ b/lib/crypto/powerpc/gf128hash.h @@ -0,0 +1,109 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * GHASH routines supporting VMX instructions on the Power 8 + * + * Copyright (C) 2015, 2019 International Business Machines Inc. + * Copyright (C) 2014 - 2018 Linaro Ltd. 
+ * Copyright 2026 Google LLC + */ + +#include +#include +#include +#include +#include +#include + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_vec_crypto); + +void gcm_init_p8(u64 htable[4][2], const u8 h[16]); +void gcm_gmult_p8(u8 Xi[16], const u64 htable[4][2]); +void gcm_ghash_p8(u8 Xi[16], const u64 htable[4][2], const u8 *in, size_t = len); + +#define ghash_preparekey_arch ghash_preparekey_arch +static void ghash_preparekey_arch(struct ghash_key *key, + const u8 raw_key[GHASH_BLOCK_SIZE]) +{ + ghash_key_to_polyval(raw_key, &key->h); + + if (static_branch_likely(&have_vec_crypto) && likely(may_use_simd())) { + preempt_disable(); + pagefault_disable(); + enable_kernel_vsx(); + gcm_init_p8(key->htable, raw_key); + disable_kernel_vsx(); + pagefault_enable(); + preempt_enable(); + } else { + /* This reproduces gcm_init_p8() on both LE and BE systems. */ + key->htable[0][0] =3D 0; + key->htable[0][1] =3D 0xc200000000000000; + + key->htable[1][0] =3D 0; + key->htable[1][1] =3D le64_to_cpu(key->h.lo); + + key->htable[2][0] =3D le64_to_cpu(key->h.lo); + key->htable[2][1] =3D le64_to_cpu(key->h.hi); + + key->htable[3][0] =3D le64_to_cpu(key->h.hi); + key->htable[3][1] =3D 0; + } +} + +#define ghash_mul_arch ghash_mul_arch +static void ghash_mul_arch(struct polyval_elem *acc, + const struct ghash_key *key) +{ + if (static_branch_likely(&have_vec_crypto) && likely(may_use_simd())) { + u8 ghash_acc[GHASH_BLOCK_SIZE]; + + polyval_acc_to_ghash(acc, ghash_acc); + + preempt_disable(); + pagefault_disable(); + enable_kernel_vsx(); + gcm_gmult_p8(ghash_acc, key->htable); + disable_kernel_vsx(); + pagefault_enable(); + preempt_enable(); + + ghash_acc_to_polyval(ghash_acc, acc); + memzero_explicit(ghash_acc, sizeof(ghash_acc)); + } else { + polyval_mul_generic(acc, &key->h); + } +} + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if 
(static_branch_likely(&have_vec_crypto) && likely(may_use_simd())) { + u8 ghash_acc[GHASH_BLOCK_SIZE]; + + polyval_acc_to_ghash(acc, ghash_acc); + + preempt_disable(); + pagefault_disable(); + enable_kernel_vsx(); + gcm_ghash_p8(ghash_acc, key->htable, data, + nblocks * GHASH_BLOCK_SIZE); + disable_kernel_vsx(); + pagefault_enable(); + preempt_enable(); + + ghash_acc_to_polyval(ghash_acc, acc); + memzero_explicit(ghash_acc, sizeof(ghash_acc)); + } else { + ghash_blocks_generic(acc, &key->h, data, nblocks); + } +} + +#define gf128hash_mod_init_arch gf128hash_mod_init_arch +static void gf128hash_mod_init_arch(void) +{ + if (cpu_has_feature(CPU_FTR_ARCH_207S) && + (cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_VEC_CRYPTO)) + static_branch_enable(&have_vec_crypto); +} diff --git a/arch/powerpc/crypto/ghashp8-ppc.pl b/lib/crypto/powerpc/ghashp8-ppc.pl similarity index 98% rename from arch/powerpc/crypto/ghashp8-ppc.pl rename to lib/crypto/powerpc/ghashp8-ppc.pl index 041e633c214f..7c38eedc02cc 100644 --- a/arch/powerpc/crypto/ghashp8-ppc.pl +++ b/lib/crypto/powerpc/ghashp8-ppc.pl @@ -45,10 +45,11 @@ if ($flavour =~ /64/) { } else { die "nonsense $flavour"; } $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; ( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or ( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or +( $xlate="${dir}../../../arch/powerpc/crypto/ppc-xlate.pl" and -f $xlate) or die "can't locate ppc-xlate.pl"; open STDOUT,"| $^X $xlate $flavour $output" || die "can't call $xlate: $!"; my ($Xip,$Htbl,$inp,$len)=map("r$_",(3..6)); # argument block -- 2.53.0 From nobody Mon Apr 6 15:03:04 2026 From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 12/19] lib/crypto: riscv/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:13 -0700 Message-ID: <20260319061723.1140720-13-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Remove the "ghash-riscv64-zvkg" crypto_shash algorithm. Move the corresponding assembly code into lib/crypto/, modify it to take the length in blocks instead of bytes, and wire it up to the GHASH library. This makes the GHASH library be optimized with the RISC-V Vector Cryptography Extension. It also greatly reduces the amount of riscv-specific glue code that is needed, and it fixes the issue where this optimized GHASH code was disabled by default. Note that this RISC-V code has multiple opportunities for improvement, such as adding more parallelism, providing an optimized multiplication function, and directly supporting POLYVAL. But for now, this commit simply tweaks ghash_zvkg() slightly to make it compatible with the library, then wires it up to ghash_blocks_arch(). ghash_preparekey_arch() is also implemented to store the copy of the raw key needed by the vghsh.vv instruction.
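[Editorial illustration, not part of the patch.] The glue code above converts the library's POLYVAL-format accumulator to GHASH format before calling ghash_zvkg() and converts it back afterwards. A minimal Python sketch of those two conversions, assuming (per the ByteReverse() relationship RFC 8452 draws between GHASH and POLYVAL state) that they are plain 16-byte reversals; the authoritative definitions are the kernel's own polyval_acc_to_ghash() and ghash_acc_to_polyval() helpers:

```python
GHASH_BLOCK_SIZE = 16

def polyval_acc_to_ghash(acc: bytes) -> bytes:
    """POLYVAL state is little-endian; GHASH state is the byte-reversed
    view of the same field element (assumption: a plain reversal)."""
    assert len(acc) == GHASH_BLOCK_SIZE
    return acc[::-1]

def ghash_acc_to_polyval(acc: bytes) -> bytes:
    """Inverse mapping; byte reversal is an involution."""
    assert len(acc) == GHASH_BLOCK_SIZE
    return acc[::-1]

# The arch glue converts, runs the hardware GHASH primitive on the
# converted state, then converts back; the round trip is lossless:
state = bytes(range(16))
assert ghash_acc_to_polyval(polyval_acc_to_ghash(state)) == state
```

Keys need an additional multiplication by x on top of the byte reversal, which is why ghash_preparekey_arch() stores both a POLYVAL-format copy and the raw GHASH-format key.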
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/riscv/crypto/Kconfig | 11 -- arch/riscv/crypto/Makefile | 3 - arch/riscv/crypto/ghash-riscv64-glue.c | 146 ------------------ include/crypto/gf128hash.h | 3 + lib/crypto/Kconfig | 2 + lib/crypto/Makefile | 1 + lib/crypto/riscv/gf128hash.h | 57 +++++++ .../crypto/riscv}/ghash-riscv64-zvkg.S | 13 +- 8 files changed, 69 insertions(+), 167 deletions(-) delete mode 100644 arch/riscv/crypto/ghash-riscv64-glue.c create mode 100644 lib/crypto/riscv/gf128hash.h rename {arch/riscv/crypto => lib/crypto/riscv}/ghash-riscv64-zvkg.S (91%) diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 22d4eaab15f3..c208f54afbcd 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -15,21 +15,10 @@ config CRYPTO_AES_RISCV64 - Zvkned vector crypto extension - Zvbb vector extension (XTS) - Zvkb vector crypto extension (CTR) - Zvkg vector crypto extension (XTS) -config CRYPTO_GHASH_RISCV64 - tristate "Hash functions: GHASH" - depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ - RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS - select CRYPTO_GCM - help - GCM GHASH function (NIST SP 800-38D) - - Architecture: riscv64 using: - - Zvkg vector crypto extension - config CRYPTO_SM3_RISCV64 tristate "Hash functions: SM3 (ShangMi 3)" depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS select CRYPTO_HASH diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 183495a95cc0..5c9ee1b876fa 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -2,13 +2,10 @@ obj-$(CONFIG_CRYPTO_AES_RISCV64) += aes-riscv64.o aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o \ aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o -obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o -ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o - obj-$(CONFIG_CRYPTO_SM3_RISCV64) += sm3-riscv64.o sm3-riscv64-y :=
sm3-riscv64-glue.o sm3-riscv64-zvksh-zvkb.o obj-$(CONFIG_CRYPTO_SM4_RISCV64) += sm4-riscv64.o sm4-riscv64-y := sm4-riscv64-glue.o sm4-riscv64-zvksed-zvkb.o diff --git a/arch/riscv/crypto/ghash-riscv64-glue.c b/arch/riscv/crypto/ghash-riscv64-glue.c deleted file mode 100644 index d86073d25387..000000000000 --- a/arch/riscv/crypto/ghash-riscv64-glue.c +++ /dev/null @@ -1,146 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * GHASH using the RISC-V vector crypto extensions - * - * Copyright (C) 2023 VRULL GmbH - * Author: Heiko Stuebner - * - * Copyright (C) 2023 SiFive, Inc. - * Author: Jerry Shih - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -asmlinkage void ghash_zvkg(be128 *accumulator, const be128 *key, const u8 *data, - size_t len); - -struct riscv64_ghash_tfm_ctx { - be128 key; -}; - -struct riscv64_ghash_desc_ctx { - be128 accumulator; -}; - -static int riscv64_ghash_setkey(struct crypto_shash *tfm, const u8 *key, - unsigned int keylen) -{ - struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(tfm); - - if (keylen != GHASH_BLOCK_SIZE) - return -EINVAL; - - memcpy(&tctx->key, key, GHASH_BLOCK_SIZE); - - return 0; -} - -static int riscv64_ghash_init(struct shash_desc *desc) -{ - struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - *dctx = (struct riscv64_ghash_desc_ctx){}; - - return 0; -} - -static inline void -riscv64_ghash_blocks(const struct riscv64_ghash_tfm_ctx *tctx, - struct riscv64_ghash_desc_ctx *dctx, - const u8 *src, size_t srclen) -{ - /* The srclen is nonzero and a multiple of 16.
*/ - if (crypto_simd_usable()) { - kernel_vector_begin(); - ghash_zvkg(&dctx->accumulator, &tctx->key, src, srclen); - kernel_vector_end(); - } else { - do { - crypto_xor((u8 *)&dctx->accumulator, src, - GHASH_BLOCK_SIZE); - gf128mul_lle(&dctx->accumulator, &tctx->key); - src += GHASH_BLOCK_SIZE; - srclen -= GHASH_BLOCK_SIZE; - } while (srclen); - } -} - -static int riscv64_ghash_update(struct shash_desc *desc, const u8 *src, - unsigned int srclen) -{ - const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); - struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - riscv64_ghash_blocks(tctx, dctx, src, - round_down(srclen, GHASH_BLOCK_SIZE)); - return srclen - round_down(srclen, GHASH_BLOCK_SIZE); -} - -static int riscv64_ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *out) -{ - const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); - struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - if (len) { - u8 buf[GHASH_BLOCK_SIZE] = {}; - - memcpy(buf, src, len); - riscv64_ghash_blocks(tctx, dctx, buf, GHASH_BLOCK_SIZE); - memzero_explicit(buf, sizeof(buf)); - } - - memcpy(out, &dctx->accumulator, GHASH_DIGEST_SIZE); - return 0; -} - -static struct shash_alg riscv64_ghash_alg = { - .init = riscv64_ghash_init, - .update = riscv64_ghash_update, - .finup = riscv64_ghash_finup, - .setkey = riscv64_ghash_setkey, - .descsize = sizeof(struct riscv64_ghash_desc_ctx), - .digestsize = GHASH_DIGEST_SIZE, - .base = { - .cra_blocksize = GHASH_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct riscv64_ghash_tfm_ctx), - .cra_priority = 300, - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, - .cra_name = "ghash", - .cra_driver_name = "ghash-riscv64-zvkg", - .cra_module = THIS_MODULE, - }, -}; - -static int __init riscv64_ghash_mod_init(void) -{ - if (riscv_isa_extension_available(NULL, ZVKG) && - riscv_vector_vlen() >= 128) - return crypto_register_shash(&riscv64_ghash_alg);
- return -ENODEV; -} - -static void __exit riscv64_ghash_mod_exit(void) -{ - crypto_unregister_shash(&riscv64_ghash_alg); -} - -module_init(riscv64_ghash_mod_init); -module_exit(riscv64_ghash_mod_exit); - -MODULE_DESCRIPTION("GHASH (RISC-V accelerated)"); -MODULE_AUTHOR("Heiko Stuebner "); -MODULE_LICENSE("GPL"); -MODULE_ALIAS_CRYPTO("ghash"); diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h index 650652dd6003..b798438cce23 100644 --- a/include/crypto/gf128hash.h +++ b/include/crypto/gf128hash.h @@ -42,10 +42,13 @@ struct polyval_elem { */ struct ghash_key { #if defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && defined(CONFIG_PPC64) /** @htable: GHASH key format used by the POWER8 assembly code */ u64 htable[4][2]; +#elif defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && defined(CONFIG_RISCV) + /** @h_raw: The hash key H, in GHASH format */ + u8 h_raw[GHASH_BLOCK_SIZE]; #endif /** @h: The hash key H, in POLYVAL format */ struct polyval_elem h; }; diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig index f54add7d9070..027802e0de33 100644 --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -120,10 +120,12 @@ config CRYPTO_LIB_GF128HASH_ARCH bool depends on CRYPTO_LIB_GF128HASH && !UML default y if ARM && KERNEL_MODE_NEON default y if ARM64 default y if PPC64 && VSX + default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ + RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS default y if X86_64 config CRYPTO_LIB_MD5 tristate help diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index 8a9084188778..8950509833af 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -171,10 +171,11 @@ $(obj)/powerpc/ghashp8-ppc.S: $(src)/powerpc/ghashp8-ppc.pl FORCE $(call if_changed,perlasm_ghash) targets += powerpc/ghashp8-ppc.S OBJECT_FILES_NON_STANDARD_powerpc/ghashp8-ppc.o := y endif +libgf128hash-$(CONFIG_RISCV) += riscv/ghash-riscv64-zvkg.o libgf128hash-$(CONFIG_X86) += x86/polyval-pclmul-avx.o endif # CONFIG_CRYPTO_LIB_GF128HASH_ARCH
# clean-files must be defined unconditionally clean-files += powerpc/ghashp8-ppc.S diff --git a/lib/crypto/riscv/gf128hash.h b/lib/crypto/riscv/gf128hash.h new file mode 100644 index 000000000000..4301a0384f60 --- /dev/null +++ b/lib/crypto/riscv/gf128hash.h @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * GHASH, RISC-V optimized + * + * Copyright (C) 2023 VRULL GmbH + * Copyright (C) 2023 SiFive, Inc. + * Copyright 2026 Google LLC + */ + +#include +#include + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_zvkg); + +asmlinkage void ghash_zvkg(u8 accumulator[GHASH_BLOCK_SIZE], + const u8 key[GHASH_BLOCK_SIZE], + const u8 *data, size_t nblocks); + +#define ghash_preparekey_arch ghash_preparekey_arch +static void ghash_preparekey_arch(struct ghash_key *key, + const u8 raw_key[GHASH_BLOCK_SIZE]) +{ + /* Save key in POLYVAL format for fallback */ + ghash_key_to_polyval(raw_key, &key->h); + + /* Save key in GHASH format for zvkg */ + memcpy(key->h_raw, raw_key, GHASH_BLOCK_SIZE); +} + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if (static_branch_likely(&have_zvkg) && likely(may_use_simd())) { + u8 ghash_acc[GHASH_BLOCK_SIZE]; + + polyval_acc_to_ghash(acc, ghash_acc); + + kernel_vector_begin(); + ghash_zvkg(ghash_acc, key->h_raw, data, nblocks); + kernel_vector_end(); + + ghash_acc_to_polyval(ghash_acc, acc); + memzero_explicit(ghash_acc, sizeof(ghash_acc)); + } else { + ghash_blocks_generic(acc, &key->h, data, nblocks); + } +} + +#define gf128hash_mod_init_arch gf128hash_mod_init_arch +static void gf128hash_mod_init_arch(void) +{ + if (riscv_isa_extension_available(NULL, ZVKG) && + riscv_vector_vlen() >= 128) + static_branch_enable(&have_zvkg); +} diff --git a/arch/riscv/crypto/ghash-riscv64-zvkg.S b/lib/crypto/riscv/ghash-riscv64-zvkg.S similarity index 91% rename from arch/riscv/crypto/ghash-riscv64-zvkg.S
rename to lib/crypto/riscv/ghash-riscv64-zvkg.S index f2b43fb4d434..2839ff1a990c 100644 --- a/arch/riscv/crypto/ghash-riscv64-zvkg.S +++ b/lib/crypto/riscv/ghash-riscv64-zvkg.S @@ -48,25 +48,24 @@ .option arch, +zvkg #define ACCUMULATOR a0 #define KEY a1 #define DATA a2 -#define LEN a3 +#define NBLOCKS a3 -// void ghash_zvkg(be128 *accumulator, const be128 *key, const u8 *data, -// size_t len); -// -// |len| must be nonzero and a multiple of 16 (GHASH_BLOCK_SIZE). +// void ghash_zvkg(u8 accumulator[GHASH_BLOCK_SIZE], +// const u8 key[GHASH_BLOCK_SIZE], +// const u8 *data, size_t nblocks); SYM_FUNC_START(ghash_zvkg) vsetivli zero, 4, e32, m1, ta, ma vle32.v v1, (ACCUMULATOR) vle32.v v2, (KEY) .Lnext_block: vle32.v v3, (DATA) vghsh.vv v1, v2, v3 addi DATA, DATA, 16 - addi LEN, LEN, -16 - bnez LEN, .Lnext_block + addi NBLOCKS, NBLOCKS, -1 + bnez NBLOCKS, .Lnext_block vse32.v v1, (ACCUMULATOR) ret SYM_FUNC_END(ghash_zvkg) -- 2.53.0 From nobody Mon Apr 6 15:03:04 2026 From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 13/19] lib/crypto: s390/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:14 -0700 Message-ID: <20260319061723.1140720-14-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Remove the "ghash-s390" crypto_shash algorithm, and replace it with an implementation of ghash_blocks_arch() for the GHASH library. This makes the GHASH library be optimized with CPACF. It also greatly reduces the amount of s390-specific glue code that is needed, and it fixes the issue where this GHASH optimization was disabled by default.
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/s390/configs/debug_defconfig | 1 - arch/s390/configs/defconfig | 1 - arch/s390/crypto/Kconfig | 10 --- arch/s390/crypto/Makefile | 1 - arch/s390/crypto/ghash_s390.c | 144 ------------------------ include/crypto/gf128hash.h | 3 +- lib/crypto/Kconfig | 1 + lib/crypto/s390/gf128hash.h | 54 +++++++++++ 8 files changed, 57 insertions(+), 158 deletions(-) delete mode 100644 arch/s390/crypto/ghash_s390.c create mode 100644 lib/crypto/s390/gf128hash.h diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig index 98fd0a2f51c6..aa862d4fcc68 100644 --- a/arch/s390/configs/debug_defconfig +++ b/arch/s390/configs/debug_defconfig @@ -807,11 +807,10 @@ CONFIG_CRYPTO_LZ4HC=m CONFIG_CRYPTO_ZSTD=m CONFIG_CRYPTO_USER_API_HASH=m CONFIG_CRYPTO_USER_API_SKCIPHER=m CONFIG_CRYPTO_USER_API_RNG=m CONFIG_CRYPTO_USER_API_AEAD=m -CONFIG_CRYPTO_GHASH_S390=m CONFIG_CRYPTO_AES_S390=m CONFIG_CRYPTO_DES_S390=m CONFIG_CRYPTO_HMAC_S390=m CONFIG_ZCRYPT=m CONFIG_PKEY=m diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig index 0f4cedcab3ce..74f943307c46 100644 --- a/arch/s390/configs/defconfig +++ b/arch/s390/configs/defconfig @@ -792,11 +792,10 @@ CONFIG_CRYPTO_ZSTD=m CONFIG_CRYPTO_JITTERENTROPY_OSR=1 CONFIG_CRYPTO_USER_API_HASH=m CONFIG_CRYPTO_USER_API_SKCIPHER=m CONFIG_CRYPTO_USER_API_RNG=m CONFIG_CRYPTO_USER_API_AEAD=m -CONFIG_CRYPTO_GHASH_S390=m CONFIG_CRYPTO_AES_S390=m CONFIG_CRYPTO_DES_S390=m CONFIG_CRYPTO_HMAC_S390=m CONFIG_ZCRYPT=m CONFIG_PKEY=m diff --git a/arch/s390/crypto/Kconfig b/arch/s390/crypto/Kconfig index 79a2d0034258..ee83052dbc15 100644 --- a/arch/s390/crypto/Kconfig +++ b/arch/s390/crypto/Kconfig @@ -1,19 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 menu "Accelerated Cryptographic Algorithms for CPU (s390)" -config CRYPTO_GHASH_S390 - tristate "Hash functions: GHASH" - select CRYPTO_HASH - help - GCM GHASH hash
function (NIST SP800-38D) - - Architecture: s390 - - It is available as of z196. - config CRYPTO_AES_S390 tristate "Ciphers: AES, modes: ECB, CBC, CTR, XTS, GCM" select CRYPTO_SKCIPHER help AEAD cipher: AES with GCM diff --git a/arch/s390/crypto/Makefile b/arch/s390/crypto/Makefile index 387a229e1038..4449c1b19ef5 100644 --- a/arch/s390/crypto/Makefile +++ b/arch/s390/crypto/Makefile @@ -5,9 +5,8 @@ obj-$(CONFIG_CRYPTO_DES_S390) += des_s390.o obj-$(CONFIG_CRYPTO_AES_S390) += aes_s390.o obj-$(CONFIG_CRYPTO_PAES_S390) += paes_s390.o obj-$(CONFIG_S390_PRNG) += prng.o -obj-$(CONFIG_CRYPTO_GHASH_S390) += ghash_s390.o obj-$(CONFIG_CRYPTO_HMAC_S390) += hmac_s390.o obj-$(CONFIG_CRYPTO_PHMAC_S390) += phmac_s390.o obj-y += arch_random.o diff --git a/arch/s390/crypto/ghash_s390.c b/arch/s390/crypto/ghash_s390.c deleted file mode 100644 index dcbcee37cb63..000000000000 --- a/arch/s390/crypto/ghash_s390.c +++ /dev/null @@ -1,144 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Cryptographic API. - * - * s390 implementation of the GHASH algorithm for GCM (Galois/Counter Mode). - * - * Copyright IBM Corp.
2011 - * Author(s): Gerald Schaefer - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -struct s390_ghash_ctx { - u8 key[GHASH_BLOCK_SIZE]; -}; - -struct s390_ghash_desc_ctx { - u8 icv[GHASH_BLOCK_SIZE]; - u8 key[GHASH_BLOCK_SIZE]; -}; - -static int ghash_init(struct shash_desc *desc) -{ - struct s390_ghash_ctx *ctx = crypto_shash_ctx(desc->tfm); - struct s390_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - memset(dctx, 0, sizeof(*dctx)); - memcpy(dctx->key, ctx->key, GHASH_BLOCK_SIZE); - - return 0; -} - -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *key, unsigned int keylen) -{ - struct s390_ghash_ctx *ctx = crypto_shash_ctx(tfm); - - if (keylen != GHASH_BLOCK_SIZE) - return -EINVAL; - - memcpy(ctx->key, key, GHASH_BLOCK_SIZE); - - return 0; -} - -static int ghash_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) -{ - struct s390_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - unsigned int n; - - n = srclen & ~(GHASH_BLOCK_SIZE - 1); - cpacf_kimd(CPACF_KIMD_GHASH, dctx, src, n); - return srclen - n; -} - -static void ghash_flush(struct s390_ghash_desc_ctx *dctx, const u8 *src, - unsigned int len) -{ - if (len) { - u8 buf[GHASH_BLOCK_SIZE] = {}; - - memcpy(buf, src, len); - cpacf_kimd(CPACF_KIMD_GHASH, dctx, buf, GHASH_BLOCK_SIZE); - memzero_explicit(buf, sizeof(buf)); - } -} - -static int ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *dst) -{ - struct s390_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - ghash_flush(dctx, src, len); - memcpy(dst, dctx->icv, GHASH_BLOCK_SIZE); - return 0; -} - -static int ghash_export(struct shash_desc *desc, void *out) -{ - struct s390_ghash_desc_ctx *dctx = shash_desc_ctx(desc); - - memcpy(out, dctx->icv, GHASH_DIGEST_SIZE); - return 0; -} - -static int ghash_import(struct shash_desc *desc, const void *in) -{ - struct s390_ghash_ctx *ctx = crypto_shash_ctx(desc->tfm); - struct s390_ghash_desc_ctx *dctx =
shash_desc_ctx(desc); - - memcpy(dctx->icv, in, GHASH_DIGEST_SIZE); - memcpy(dctx->key, ctx->key, GHASH_BLOCK_SIZE); - return 0; -} - -static struct shash_alg ghash_alg = { - .digestsize = GHASH_DIGEST_SIZE, - .init = ghash_init, - .update = ghash_update, - .finup = ghash_finup, - .setkey = ghash_setkey, - .export = ghash_export, - .import = ghash_import, - .statesize = sizeof(struct ghash_desc_ctx), - .descsize = sizeof(struct s390_ghash_desc_ctx), - .base = { - .cra_name = "ghash", - .cra_driver_name = "ghash-s390", - .cra_priority = 300, - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, - .cra_blocksize = GHASH_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct s390_ghash_ctx), - .cra_module = THIS_MODULE, - }, -}; - -static int __init ghash_mod_init(void) -{ - if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_GHASH)) - return -ENODEV; - - return crypto_register_shash(&ghash_alg); -} - -static void __exit ghash_mod_exit(void) -{ - crypto_unregister_shash(&ghash_alg); -} - -module_cpu_feature_match(S390_CPU_FEATURE_MSA, ghash_mod_init); -module_exit(ghash_mod_exit); - -MODULE_ALIAS_CRYPTO("ghash"); - -MODULE_LICENSE("GPL"); -MODULE_DESCRIPTION("GHASH hash function, s390 implementation"); diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h index b798438cce23..0bc649d01e12 100644 --- a/include/crypto/gf128hash.h +++ b/include/crypto/gf128hash.h @@ -42,11 +42,12 @@ struct polyval_elem { */ struct ghash_key { #if defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && defined(CONFIG_PPC64) /** @htable: GHASH key format used by the POWER8 assembly code */ u64 htable[4][2]; -#elif defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && defined(CONFIG_RISCV) +#elif defined(CONFIG_CRYPTO_LIB_GF128HASH_ARCH) && \ + (defined(CONFIG_RISCV) || defined(CONFIG_S390)) /** @h_raw: The hash key H, in GHASH format */ u8 h_raw[GHASH_BLOCK_SIZE]; #endif /** @h: The hash key H, in POLYVAL format */ struct polyval_elem h; diff --git a/lib/crypto/Kconfig
b/lib/crypto/Kconfig index 027802e0de33..a39e7707e9ee 100644 --- a/lib/crypto/Kconfig +++ b/lib/crypto/Kconfig @@ -122,10 +122,11 @@ config CRYPTO_LIB_GF128HASH_ARCH default y if ARM && KERNEL_MODE_NEON default y if ARM64 default y if PPC64 && VSX default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \ RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS + default y if S390 default y if X86_64 config CRYPTO_LIB_MD5 tristate help diff --git a/lib/crypto/s390/gf128hash.h b/lib/crypto/s390/gf128hash.h new file mode 100644 index 000000000000..1e46ce4bca40 --- /dev/null +++ b/lib/crypto/s390/gf128hash.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * GHASH optimized using the CP Assist for Cryptographic Functions (CPACF) + * + * Copyright 2026 Google LLC + */ +#include +#include + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_cpacf_ghash); + +#define ghash_preparekey_arch ghash_preparekey_arch +static void ghash_preparekey_arch(struct ghash_key *key, + const u8 raw_key[GHASH_BLOCK_SIZE]) +{ + /* Save key in POLYVAL format for fallback */ + ghash_key_to_polyval(raw_key, &key->h); + + /* Save key in GHASH format for CPACF_KIMD_GHASH */ + memcpy(key->h_raw, raw_key, GHASH_BLOCK_SIZE); +} + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if (static_branch_likely(&have_cpacf_ghash)) { + /* + * CPACF_KIMD_GHASH requires the accumulator and key in a single + * buffer, each using the GHASH convention.
+ */ + u8 ctx[2][GHASH_BLOCK_SIZE] __aligned(8); + + polyval_acc_to_ghash(acc, ctx[0]); + memcpy(ctx[1], key->h_raw, GHASH_BLOCK_SIZE); + + cpacf_kimd(CPACF_KIMD_GHASH, ctx, data, + nblocks * GHASH_BLOCK_SIZE); + + ghash_acc_to_polyval(ctx[0], acc); + memzero_explicit(ctx, sizeof(ctx)); + } else { + ghash_blocks_generic(acc, &key->h, data, nblocks); + } +} + +#define gf128hash_mod_init_arch gf128hash_mod_init_arch +static void gf128hash_mod_init_arch(void) +{ + if (cpu_have_feature(S390_CPU_FEATURE_MSA) && + cpacf_query_func(CPACF_KIMD, CPACF_KIMD_GHASH)) + static_branch_enable(&have_cpacf_ghash); +} -- 2.53.0 From nobody Mon Apr 6 15:03:04 2026 From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 14/19] lib/crypto: x86/ghash: Migrate optimized code into library Date: Wed, 18 Mar 2026 23:17:15 -0700 Message-ID: <20260319061723.1140720-15-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Remove the "ghash-pclmulqdqni" crypto_shash algorithm. Move the corresponding assembly code into lib/crypto/, and wire it up to the GHASH library. This makes the GHASH library be optimized with x86's carryless multiplication instructions. It also greatly reduces the amount of x86-specific glue code that is needed, and it fixes the issue where this GHASH optimization was disabled by default.
Rename and adjust the prototypes of the assembly functions to make them fit better with the library. Remove the byte-swaps (pshufb instructions) that are no longer necessary because the library keeps the accumulator in POLYVAL format rather than GHASH format. Rename clmul_ghash_mul() to polyval_mul_pclmul() to reflect that it really does a POLYVAL style multiplication. Wire it up to both ghash_mul_arch() and polyval_mul_arch(). Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- arch/x86/crypto/Kconfig | 10 -- arch/x86/crypto/Makefile | 3 - arch/x86/crypto/ghash-clmulni-intel_glue.c | 163 ------------------ lib/crypto/Makefile | 3 +- lib/crypto/x86/gf128hash.h | 65 ++++++- .../crypto/x86/ghash-pclmul.S | 98 +++++------ 6 files changed, 104 insertions(+), 238 deletions(-) delete mode 100644 arch/x86/crypto/ghash-clmulni-intel_glue.c rename arch/x86/crypto/ghash-clmulni-intel_asm.S => lib/crypto/x86/ghash-pclmul.S (54%) diff --git a/arch/x86/crypto/Kconfig b/arch/x86/crypto/Kconfig index 7fb2319a0916..905e8a23cec3 100644 --- a/arch/x86/crypto/Kconfig +++ b/arch/x86/crypto/Kconfig @@ -342,16 +342,6 @@ config CRYPTO_SM3_AVX_X86_64 Architecture: x86_64 using: - AVX (Advanced Vector Extensions) If unsure, say N.
=20 -config CRYPTO_GHASH_CLMUL_NI_INTEL - tristate "Hash functions: GHASH (CLMUL-NI)" - depends on 64BIT - select CRYPTO_CRYPTD - help - GCM GHASH hash function (NIST SP800-38D) - - Architecture: x86_64 using: - - CLMUL-NI (carry-less multiplication new instructions) - endmenu diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile index b21ad0978c52..d562f4341da6 100644 --- a/arch/x86/crypto/Makefile +++ b/arch/x86/crypto/Makefile @@ -48,13 +48,10 @@ aesni-intel-$(CONFIG_64BIT) +=3D aes-ctr-avx-x86_64.o \ aes-gcm-aesni-x86_64.o \ aes-gcm-vaes-avx2.o \ aes-gcm-vaes-avx512.o \ aes-xts-avx-x86_64.o =20 -obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) +=3D ghash-clmulni-intel.o -ghash-clmulni-intel-y :=3D ghash-clmulni-intel_asm.o ghash-clmulni-intel_g= lue.o - obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) +=3D sm3-avx-x86_64.o sm3-avx-x86_64-y :=3D sm3-avx-asm_64.o sm3_avx_glue.o =20 obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) +=3D sm4-aesni-avx-x86_64.o sm4-aesni-avx-x86_64-y :=3D sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/g= hash-clmulni-intel_glue.c deleted file mode 100644 index aea5d4d06be7..000000000000 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ /dev/null @@ -1,163 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * Accelerated GHASH implementation with Intel PCLMULQDQ-NI - * instructions. This file contains glue code. - * - * Copyright (c) 2009 Intel Corp. 
- * Author: Huang Ying - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -asmlinkage void clmul_ghash_mul(char *dst, const le128 *shash); - -asmlinkage int clmul_ghash_update(char *dst, const char *src, - unsigned int srclen, const le128 *shash); - -struct x86_ghash_ctx { - le128 shash; -}; - -static int ghash_init(struct shash_desc *desc) -{ - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - - memset(dctx, 0, sizeof(*dctx)); - - return 0; -} - -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *key, unsigned int keylen) -{ - struct x86_ghash_ctx *ctx =3D crypto_shash_ctx(tfm); - u64 a, b; - - if (keylen !=3D GHASH_BLOCK_SIZE) - return -EINVAL; - - /* - * GHASH maps bits to polynomial coefficients backwards, which makes it - * hard to implement. But it can be shown that the GHASH multiplication - * - * D * K (mod x^128 + x^7 + x^2 + x + 1) - * - * (where D is a data block and K is the key) is equivalent to: - * - * bitreflect(D) * bitreflect(K) * x^(-127) - * (mod x^128 + x^127 + x^126 + x^121 + 1) - * - * So, the code below precomputes: - * - * bitreflect(K) * x^(-127) (mod x^128 + x^127 + x^126 + x^121 + 1) - * - * ... but in Montgomery form (so that Montgomery multiplication can be - * used), i.e. with an extra x^128 factor, which means actually: - * - * bitreflect(K) * x (mod x^128 + x^127 + x^126 + x^121 + 1) - * - * The within-a-byte part of bitreflect() cancels out GHASH's built-in - * reflection, and thus bitreflect() is actually a byteswap. 
- */ - a =3D get_unaligned_be64(key); - b =3D get_unaligned_be64(key + 8); - ctx->shash.a =3D cpu_to_le64((a << 1) | (b >> 63)); - ctx->shash.b =3D cpu_to_le64((b << 1) | (a >> 63)); - if (a >> 63) - ctx->shash.a ^=3D cpu_to_le64((u64)0xc2 << 56); - return 0; -} - -static int ghash_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) -{ - struct x86_ghash_ctx *ctx =3D crypto_shash_ctx(desc->tfm); - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - u8 *dst =3D dctx->buffer; - int remain; - - kernel_fpu_begin(); - remain =3D clmul_ghash_update(dst, src, srclen, &ctx->shash); - kernel_fpu_end(); - return remain; -} - -static void ghash_flush(struct x86_ghash_ctx *ctx, struct ghash_desc_ctx *= dctx, - const u8 *src, unsigned int len) -{ - u8 *dst =3D dctx->buffer; - - kernel_fpu_begin(); - if (len) { - crypto_xor(dst, src, len); - clmul_ghash_mul(dst, &ctx->shash); - } - kernel_fpu_end(); -} - -static int ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *dst) -{ - struct x86_ghash_ctx *ctx =3D crypto_shash_ctx(desc->tfm); - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - u8 *buf =3D dctx->buffer; - - ghash_flush(ctx, dctx, src, len); - memcpy(dst, buf, GHASH_BLOCK_SIZE); - - return 0; -} - -static struct shash_alg ghash_alg =3D { - .digestsize =3D GHASH_DIGEST_SIZE, - .init =3D ghash_init, - .update =3D ghash_update, - .finup =3D ghash_finup, - .setkey =3D ghash_setkey, - .descsize =3D sizeof(struct ghash_desc_ctx), - .base =3D { - .cra_name =3D "ghash", - .cra_driver_name =3D "ghash-pclmulqdqni", - .cra_priority =3D 400, - .cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, - .cra_blocksize =3D GHASH_BLOCK_SIZE, - .cra_ctxsize =3D sizeof(struct x86_ghash_ctx), - .cra_module =3D THIS_MODULE, - }, -}; - -static const struct x86_cpu_id pcmul_cpu_id[] =3D { - X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), /* Pickle-Mickle-Duck */ - {} -}; -MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); - -static int __init 
ghash_pclmulqdqni_mod_init(void) -{ - if (!x86_match_cpu(pcmul_cpu_id)) - return -ENODEV; - - return crypto_register_shash(&ghash_alg); -} - -static void __exit ghash_pclmulqdqni_mod_exit(void) -{ - crypto_unregister_shash(&ghash_alg); -} - -module_init(ghash_pclmulqdqni_mod_init); -module_exit(ghash_pclmulqdqni_mod_exit); - -MODULE_LICENSE("GPL"); -MODULE_DESCRIPTION("GHASH hash function, accelerated by PCLMULQDQ-NI"); -MODULE_ALIAS_CRYPTO("ghash"); diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile index 8950509833af..19c67f70fb38 100644 --- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -172,11 +172,12 @@ $(obj)/powerpc/ghashp8-ppc.S: $(src)/powerpc/ghashp8-= ppc.pl FORCE targets +=3D powerpc/ghashp8-ppc.S OBJECT_FILES_NON_STANDARD_powerpc/ghashp8-ppc.o :=3D y endif =20 libgf128hash-$(CONFIG_RISCV) +=3D riscv/ghash-riscv64-zvkg.o -libgf128hash-$(CONFIG_X86) +=3D x86/polyval-pclmul-avx.o +libgf128hash-$(CONFIG_X86) +=3D x86/ghash-pclmul.o \ + x86/polyval-pclmul-avx.o endif # CONFIG_CRYPTO_LIB_GF128HASH_ARCH =20 # clean-files must be defined unconditionally clean-files +=3D powerpc/ghashp8-ppc.S =20 diff --git a/lib/crypto/x86/gf128hash.h b/lib/crypto/x86/gf128hash.h index adf6147ea677..6b79b06caab0 100644 --- a/lib/crypto/x86/gf128hash.h +++ b/lib/crypto/x86/gf128hash.h @@ -1,20 +1,27 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * POLYVAL library functions, x86_64 optimized + * GHASH and POLYVAL, x86_64 optimized * * Copyright 2025 Google LLC */ #include #include =20 #define NUM_H_POWERS 8 =20 +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_pclmul); static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_pclmul_avx); =20 +asmlinkage void polyval_mul_pclmul(struct polyval_elem *a, + const struct polyval_elem *b); asmlinkage void polyval_mul_pclmul_avx(struct polyval_elem *a, const struct polyval_elem *b); + +asmlinkage void ghash_blocks_pclmul(struct polyval_elem *acc, + const struct polyval_elem *key, + const u8 *data, size_t nblocks); 
asmlinkage void polyval_blocks_pclmul_avx(struct polyval_elem *acc, const struct polyval_key *key, const u8 *data, size_t nblocks); =20 #define polyval_preparekey_arch polyval_preparekey_arch @@ -39,20 +46,58 @@ static void polyval_preparekey_arch(struct polyval_key = *key, &key->h_powers[NUM_H_POWERS - 1]); } } } =20 +static void polyval_mul_x86(struct polyval_elem *a, + const struct polyval_elem *b) +{ + if (static_branch_likely(&have_pclmul) && irq_fpu_usable()) { + kernel_fpu_begin(); + if (static_branch_likely(&have_pclmul_avx)) + polyval_mul_pclmul_avx(a, b); + else + polyval_mul_pclmul(a, b); + kernel_fpu_end(); + } else { + polyval_mul_generic(a, b); + } +} + +#define ghash_mul_arch ghash_mul_arch +static void ghash_mul_arch(struct polyval_elem *acc, + const struct ghash_key *key) +{ + polyval_mul_x86(acc, &key->h); +} + #define polyval_mul_arch polyval_mul_arch static void polyval_mul_arch(struct polyval_elem *acc, const struct polyval_key *key) { - if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) { - kernel_fpu_begin(); - polyval_mul_pclmul_avx(acc, &key->h_powers[NUM_H_POWERS - 1]); - kernel_fpu_end(); + polyval_mul_x86(acc, &key->h_powers[NUM_H_POWERS - 1]); +} + +#define ghash_blocks_arch ghash_blocks_arch +static void ghash_blocks_arch(struct polyval_elem *acc, + const struct ghash_key *key, + const u8 *data, size_t nblocks) +{ + if (static_branch_likely(&have_pclmul) && irq_fpu_usable()) { + do { + /* Allow rescheduling every 4 KiB. 
*/ + size_t n =3D min_t(size_t, nblocks, + 4096 / GHASH_BLOCK_SIZE); + + kernel_fpu_begin(); + ghash_blocks_pclmul(acc, &key->h, data, n); + kernel_fpu_end(); + data +=3D n * GHASH_BLOCK_SIZE; + nblocks -=3D n; + } while (nblocks); } else { - polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]); + ghash_blocks_generic(acc, &key->h, data, nblocks); } } =20 #define polyval_blocks_arch polyval_blocks_arch static void polyval_blocks_arch(struct polyval_elem *acc, @@ -78,9 +123,11 @@ static void polyval_blocks_arch(struct polyval_elem *ac= c, } =20 #define gf128hash_mod_init_arch gf128hash_mod_init_arch static void gf128hash_mod_init_arch(void) { - if (boot_cpu_has(X86_FEATURE_PCLMULQDQ) && - boot_cpu_has(X86_FEATURE_AVX)) - static_branch_enable(&have_pclmul_avx); + if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) { + static_branch_enable(&have_pclmul); + if (boot_cpu_has(X86_FEATURE_AVX)) + static_branch_enable(&have_pclmul_avx); + } } diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/lib/crypto/x86/gha= sh-pclmul.S similarity index 54% rename from arch/x86/crypto/ghash-clmulni-intel_asm.S rename to lib/crypto/x86/ghash-pclmul.S index c4fbaa82ed7a..6ffb5aea6063 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_asm.S +++ b/lib/crypto/x86/ghash-pclmul.S @@ -19,12 +19,12 @@ .section .rodata.cst16.bswap_mask, "aM", @progbits, 16 .align 16 .Lbswap_mask: .octa 0x000102030405060708090a0b0c0d0e0f =20 -#define DATA %xmm0 -#define SHASH %xmm1 +#define ACC %xmm0 +#define KEY %xmm1 #define T1 %xmm2 #define T2 %xmm3 #define T3 %xmm4 #define BSWAP %xmm5 #define IN1 %xmm6 @@ -32,102 +32,96 @@ .text =20 /* * __clmul_gf128mul_ble: internal ABI * input: - * DATA: operand1 - * SHASH: operand2, hash_key << 1 mod poly + * ACC: operand1 + * KEY: operand2, hash_key << 1 mod poly * output: - * DATA: operand1 * operand2 mod poly + * ACC: operand1 * operand2 mod poly * changed: * T1 * T2 * T3 */ SYM_FUNC_START_LOCAL(__clmul_gf128mul_ble) - movaps DATA, T1 - pshufd $0b01001110, DATA, T2 - 
pshufd $0b01001110, SHASH, T3 - pxor DATA, T2 - pxor SHASH, T3 + movaps ACC, T1 + pshufd $0b01001110, ACC, T2 + pshufd $0b01001110, KEY, T3 + pxor ACC, T2 + pxor KEY, T3 =20 - pclmulqdq $0x00, SHASH, DATA # DATA =3D a0 * b0 - pclmulqdq $0x11, SHASH, T1 # T1 =3D a1 * b1 + pclmulqdq $0x00, KEY, ACC # ACC =3D a0 * b0 + pclmulqdq $0x11, KEY, T1 # T1 =3D a1 * b1 pclmulqdq $0x00, T3, T2 # T2 =3D (a1 + a0) * (b1 + b0) - pxor DATA, T2 + pxor ACC, T2 pxor T1, T2 # T2 =3D a0 * b1 + a1 * b0 =20 movaps T2, T3 pslldq $8, T3 psrldq $8, T2 - pxor T3, DATA - pxor T2, T1 # is result of + pxor T3, ACC + pxor T2, T1 # is result of # carry-less multiplication =20 # first phase of the reduction - movaps DATA, T3 + movaps ACC, T3 psllq $1, T3 - pxor DATA, T3 + pxor ACC, T3 psllq $5, T3 - pxor DATA, T3 + pxor ACC, T3 psllq $57, T3 movaps T3, T2 pslldq $8, T2 psrldq $8, T3 - pxor T2, DATA + pxor T2, ACC pxor T3, T1 =20 # second phase of the reduction - movaps DATA, T2 + movaps ACC, T2 psrlq $5, T2 - pxor DATA, T2 + pxor ACC, T2 psrlq $1, T2 - pxor DATA, T2 + pxor ACC, T2 psrlq $1, T2 pxor T2, T1 - pxor T1, DATA + pxor T1, ACC RET SYM_FUNC_END(__clmul_gf128mul_ble) =20 -/* void clmul_ghash_mul(char *dst, const le128 *shash) */ -SYM_FUNC_START(clmul_ghash_mul) +/* + * void polyval_mul_pclmul(struct polyval_elem *a, + * const struct polyval_elem *b) + */ +SYM_FUNC_START(polyval_mul_pclmul) FRAME_BEGIN - movups (%rdi), DATA - movups (%rsi), SHASH - movaps .Lbswap_mask(%rip), BSWAP - pshufb BSWAP, DATA + movups (%rdi), ACC + movups (%rsi), KEY call __clmul_gf128mul_ble - pshufb BSWAP, DATA - movups DATA, (%rdi) + movups ACC, (%rdi) FRAME_END RET -SYM_FUNC_END(clmul_ghash_mul) +SYM_FUNC_END(polyval_mul_pclmul) =20 /* - * int clmul_ghash_update(char *dst, const char *src, unsigned int srclen, - * const le128 *shash); + * void ghash_blocks_pclmul(struct polyval_elem *acc, + * const struct polyval_elem *key, + * const u8 *data, size_t nblocks) */ -SYM_FUNC_START(clmul_ghash_update) 
+SYM_FUNC_START(ghash_blocks_pclmul)
 	FRAME_BEGIN
-	cmp $16, %rdx
-	jb .Lupdate_just_ret	# check length
 	movaps .Lbswap_mask(%rip), BSWAP
-	movups (%rdi), DATA
-	movups (%rcx), SHASH
-	pshufb BSWAP, DATA
+	movups (%rdi), ACC
+	movups (%rsi), KEY
 .align 4
-.Lupdate_loop:
-	movups (%rsi), IN1
+.Lnext_block:
+	movups (%rdx), IN1
 	pshufb BSWAP, IN1
-	pxor IN1, DATA
+	pxor IN1, ACC
 	call __clmul_gf128mul_ble
-	sub $16, %rdx
-	add $16, %rsi
-	cmp $16, %rdx
-	jge .Lupdate_loop
-	pshufb BSWAP, DATA
-	movups DATA, (%rdi)
-.Lupdate_just_ret:
-	mov %rdx, %rax
+	add $16, %rdx
+	dec %rcx
+	jnz .Lnext_block
+	movups ACC, (%rdi)
 	FRAME_END
 	RET
-SYM_FUNC_END(clmul_ghash_update)
+SYM_FUNC_END(ghash_blocks_pclmul)
-- 
2.53.0

From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers
To: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel, "Jason A. Donenfeld",
 Herbert Xu, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers
Subject: [PATCH 15/19] crypto: gcm - Use GHASH library instead of crypto_ahash
Date: Wed, 18 Mar 2026 23:17:16 -0700
Message-ID: <20260319061723.1140720-16-ebiggers@kernel.org>
In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org>
References: <20260319061723.1140720-1-ebiggers@kernel.org>

Make the "gcm" template access GHASH using the library API instead of
crypto_ahash.  This is much simpler and more efficient, especially given
that all GHASH implementations are synchronous and CPU-based anyway.

Note that this allows "ghash" to be removed from the crypto_ahash (and
crypto_shash) API, which a later commit will do.
This mirrors the similar cleanup that was done with POLYVAL. Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- crypto/Kconfig | 2 +- crypto/gcm.c | 413 +++++---------------------- crypto/testmgr.c | 10 +- drivers/crypto/starfive/jh7110-aes.c | 2 +- 4 files changed, 85 insertions(+), 342 deletions(-) diff --git a/crypto/Kconfig b/crypto/Kconfig index 5627b3691561..13ccf5ac2f1a 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -792,11 +792,11 @@ config CRYPTO_CCM =20 config CRYPTO_GCM tristate "GCM (Galois/Counter Mode) and GMAC (GCM MAC)" select CRYPTO_CTR select CRYPTO_AEAD - select CRYPTO_GHASH + select CRYPTO_LIB_GF128HASH select CRYPTO_MANAGER help GCM (Galois/Counter Mode) authenticated encryption mode and GMAC (GCM Message Authentication Code) (NIST SP800-38D) =20 diff --git a/crypto/gcm.c b/crypto/gcm.c index e1e878d37410..5f16b237b3c5 100644 --- a/crypto/gcm.c +++ b/crypto/gcm.c @@ -3,31 +3,28 @@ * GCM: Galois/Counter Mode. * * Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen */ =20 -#include #include #include -#include #include #include -#include +#include #include #include #include #include #include =20 struct gcm_instance_ctx { struct crypto_skcipher_spawn ctr; - struct crypto_ahash_spawn ghash; }; =20 struct crypto_gcm_ctx { struct crypto_skcipher *ctr; - struct crypto_ahash *ghash; + struct ghash_key ghash; }; =20 struct crypto_rfc4106_ctx { struct crypto_aead *child; u8 nonce[4]; @@ -50,35 +47,19 @@ struct crypto_rfc4543_ctx { =20 struct crypto_rfc4543_req_ctx { struct aead_request subreq; }; =20 -struct crypto_gcm_ghash_ctx { - unsigned int cryptlen; - struct scatterlist *src; - int (*complete)(struct aead_request *req, u32 flags); -}; - struct crypto_gcm_req_priv_ctx { u8 iv[16]; u8 auth_tag[16]; u8 iauth_tag[16]; struct scatterlist src[3]; struct scatterlist dst[3]; - struct scatterlist sg; - struct crypto_gcm_ghash_ctx ghash_ctx; - union { - struct ahash_request ahreq; - struct skcipher_request skreq; - } u; + struct 
skcipher_request skreq; /* Must be last */ }; =20 -static struct { - u8 buf[16]; - struct scatterlist sg; -} *gcm_zeroes; - static inline struct crypto_gcm_req_priv_ctx *crypto_gcm_reqctx( struct aead_request *req) { unsigned long align =3D crypto_aead_alignmask(crypto_aead_reqtfm(req)); =20 @@ -87,14 +68,13 @@ static inline struct crypto_gcm_req_priv_ctx *crypto_gc= m_reqctx( =20 static int crypto_gcm_setkey(struct crypto_aead *aead, const u8 *key, unsigned int keylen) { struct crypto_gcm_ctx *ctx =3D crypto_aead_ctx(aead); - struct crypto_ahash *ghash =3D ctx->ghash; struct crypto_skcipher *ctr =3D ctx->ctr; struct { - be128 hash; + u8 h[GHASH_BLOCK_SIZE]; u8 iv[16]; =20 struct crypto_wait wait; =20 struct scatterlist sg[1]; @@ -113,29 +93,26 @@ static int crypto_gcm_setkey(struct crypto_aead *aead,= const u8 *key, GFP_KERNEL); if (!data) return -ENOMEM; =20 crypto_init_wait(&data->wait); - sg_init_one(data->sg, &data->hash, sizeof(data->hash)); + sg_init_one(data->sg, data->h, sizeof(data->h)); skcipher_request_set_tfm(&data->req, ctr); skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG, crypto_req_done, &data->wait); skcipher_request_set_crypt(&data->req, data->sg, data->sg, - sizeof(data->hash), data->iv); + sizeof(data->h), data->iv); =20 err =3D crypto_wait_req(crypto_skcipher_encrypt(&data->req), &data->wait); =20 if (err) goto out; =20 - crypto_ahash_clear_flags(ghash, CRYPTO_TFM_REQ_MASK); - crypto_ahash_set_flags(ghash, crypto_aead_get_flags(aead) & - CRYPTO_TFM_REQ_MASK); - err =3D crypto_ahash_setkey(ghash, (u8 *)&data->hash, sizeof(be128)); + ghash_preparekey(&ctx->ghash, data->h); out: kfree_sensitive(data); return err; } =20 @@ -174,288 +151,106 @@ static void crypto_gcm_init_crypt(struct aead_reque= st *req, unsigned int cryptlen) { struct crypto_aead *aead =3D crypto_aead_reqtfm(req); struct crypto_gcm_ctx *ctx =3D crypto_aead_ctx(aead); struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - 
struct skcipher_request *skreq =3D &pctx->u.skreq; + struct skcipher_request *skreq =3D &pctx->skreq; struct scatterlist *dst; =20 dst =3D req->src =3D=3D req->dst ? pctx->src : pctx->dst; =20 skcipher_request_set_tfm(skreq, ctx->ctr); skcipher_request_set_crypt(skreq, pctx->src, dst, cryptlen + sizeof(pctx->auth_tag), pctx->iv); } =20 -static inline unsigned int gcm_remain(unsigned int len) -{ - len &=3D 0xfU; - return len ? 16 - len : 0; -} - -static void gcm_hash_len_done(void *data, int err); - -static int gcm_hash_update(struct aead_request *req, - crypto_completion_t compl, - struct scatterlist *src, - unsigned int len, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct ahash_request *ahreq =3D &pctx->u.ahreq; - - ahash_request_set_callback(ahreq, flags, compl, req); - ahash_request_set_crypt(ahreq, src, NULL, len); - - return crypto_ahash_update(ahreq); -} - -static int gcm_hash_remain(struct aead_request *req, - unsigned int remain, - crypto_completion_t compl, u32 flags) -{ - return gcm_hash_update(req, compl, &gcm_zeroes->sg, remain, flags); -} - -static int gcm_hash_len(struct aead_request *req, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct ahash_request *ahreq =3D &pctx->u.ahreq; - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; - be128 lengths; - - lengths.a =3D cpu_to_be64(req->assoclen * 8); - lengths.b =3D cpu_to_be64(gctx->cryptlen * 8); - memcpy(pctx->iauth_tag, &lengths, 16); - sg_init_one(&pctx->sg, pctx->iauth_tag, 16); - ahash_request_set_callback(ahreq, flags, gcm_hash_len_done, req); - ahash_request_set_crypt(ahreq, &pctx->sg, - pctx->iauth_tag, sizeof(lengths)); - - return crypto_ahash_finup(ahreq); -} - -static int gcm_hash_len_continue(struct aead_request *req, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; - - return gctx->complete(req, flags); -} - -static 
void gcm_hash_len_done(void *data, int err) -{ - struct aead_request *req =3D data; - - if (err) - goto out; - - err =3D gcm_hash_len_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - return; - -out: - aead_request_complete(req, err); -} - -static int gcm_hash_crypt_remain_continue(struct aead_request *req, u32 fl= ags) -{ - return gcm_hash_len(req, flags) ?: - gcm_hash_len_continue(req, flags); -} - -static void gcm_hash_crypt_remain_done(void *data, int err) -{ - struct aead_request *req =3D data; - - if (err) - goto out; - - err =3D gcm_hash_crypt_remain_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - return; - -out: - aead_request_complete(req, err); -} - -static int gcm_hash_crypt_continue(struct aead_request *req, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; - unsigned int remain; - - remain =3D gcm_remain(gctx->cryptlen); - if (remain) - return gcm_hash_remain(req, remain, - gcm_hash_crypt_remain_done, flags) ?: - gcm_hash_crypt_remain_continue(req, flags); - - return gcm_hash_crypt_remain_continue(req, flags); -} - -static void gcm_hash_crypt_done(void *data, int err) -{ - struct aead_request *req =3D data; - - if (err) - goto out; - - err =3D gcm_hash_crypt_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - return; - -out: - aead_request_complete(req, err); -} - -static int gcm_hash_assoc_remain_continue(struct aead_request *req, u32 fl= ags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; - - if (gctx->cryptlen) - return gcm_hash_update(req, gcm_hash_crypt_done, - gctx->src, gctx->cryptlen, flags) ?: - gcm_hash_crypt_continue(req, flags); - - return gcm_hash_crypt_remain_continue(req, flags); -} - -static void gcm_hash_assoc_remain_done(void *data, int err) -{ - struct aead_request *req =3D data; - - if (err) - goto out; - - err =3D gcm_hash_assoc_remain_continue(req, 0); - if 
(err =3D=3D -EINPROGRESS) - return; - -out: - aead_request_complete(req, err); -} - -static int gcm_hash_assoc_continue(struct aead_request *req, u32 flags) +static void ghash_update_sg_and_pad(struct ghash_ctx *ghash, + struct scatterlist *sg, unsigned int len) { - unsigned int remain; + static const u8 zeroes[GHASH_BLOCK_SIZE]; =20 - remain =3D gcm_remain(req->assoclen); - if (remain) - return gcm_hash_remain(req, remain, - gcm_hash_assoc_remain_done, flags) ?: - gcm_hash_assoc_remain_continue(req, flags); + if (len) { + unsigned int pad_len =3D -len % GHASH_BLOCK_SIZE; + struct scatter_walk walk; =20 - return gcm_hash_assoc_remain_continue(req, flags); -} + scatterwalk_start(&walk, sg); + do { + unsigned int n =3D scatterwalk_next(&walk, len); =20 -static void gcm_hash_assoc_done(void *data, int err) -{ - struct aead_request *req =3D data; + ghash_update(ghash, walk.addr, n); + scatterwalk_done_src(&walk, n); + len -=3D n; + } while (len); =20 - if (err) - goto out; - - err =3D gcm_hash_assoc_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - return; - -out: - aead_request_complete(req, err); -} - -static int gcm_hash_init_continue(struct aead_request *req, u32 flags) -{ - if (req->assoclen) - return gcm_hash_update(req, gcm_hash_assoc_done, - req->src, req->assoclen, flags) ?: - gcm_hash_assoc_continue(req, flags); - - return gcm_hash_assoc_remain_continue(req, flags); + if (pad_len) + ghash_update(ghash, zeroes, pad_len); + } } =20 -static void gcm_hash_init_done(void *data, int err) +static void gcm_hash(struct aead_request *req, struct scatterlist *ctext, + unsigned int datalen, u8 out[GHASH_BLOCK_SIZE]) { - struct aead_request *req =3D data; - - if (err) - goto out; + const struct crypto_gcm_ctx *ctx =3D + crypto_aead_ctx(crypto_aead_reqtfm(req)); + __be64 lengths[2] =3D { + cpu_to_be64(8 * (u64)req->assoclen), + cpu_to_be64(8 * (u64)datalen), + }; + struct ghash_ctx ghash; =20 - err =3D gcm_hash_init_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - 
return; + ghash_init(&ghash, &ctx->ghash); =20 -out: - aead_request_complete(req, err); -} + /* Associated data, then zero-padding to the next 16-byte boundary */ + ghash_update_sg_and_pad(&ghash, req->src, req->assoclen); =20 -static int gcm_hash(struct aead_request *req, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct ahash_request *ahreq =3D &pctx->u.ahreq; - struct crypto_gcm_ctx *ctx =3D crypto_aead_ctx(crypto_aead_reqtfm(req)); + /* Ciphertext, then zero-padding to the next 16-byte boundary */ + ghash_update_sg_and_pad(&ghash, ctext, datalen); =20 - ahash_request_set_tfm(ahreq, ctx->ghash); + /* Lengths block */ + ghash_update(&ghash, (const u8 *)lengths, sizeof(lengths)); =20 - ahash_request_set_callback(ahreq, flags, gcm_hash_init_done, req); - return crypto_ahash_init(ahreq) ?: - gcm_hash_init_continue(req, flags); + ghash_final(&ghash, out); } =20 -static int gcm_enc_copy_hash(struct aead_request *req, u32 flags) +static int gcm_add_auth_tag(struct aead_request *req) { - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); struct crypto_aead *aead =3D crypto_aead_reqtfm(req); - u8 *auth_tag =3D pctx->auth_tag; - - crypto_xor(auth_tag, pctx->iauth_tag, 16); - scatterwalk_map_and_copy(auth_tag, req->dst, - req->assoclen + req->cryptlen, - crypto_aead_authsize(aead), 1); - return 0; -} - -static int gcm_encrypt_continue(struct aead_request *req, u32 flags) -{ struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; =20 - gctx->src =3D sg_next(req->src =3D=3D req->dst ? pctx->src : pctx->dst); - gctx->cryptlen =3D req->cryptlen; - gctx->complete =3D gcm_enc_copy_hash; - - return gcm_hash(req, flags); + gcm_hash(req, sg_next(req->src =3D=3D req->dst ? 
pctx->src : pctx->dst), + req->cryptlen, pctx->iauth_tag); + crypto_xor(pctx->auth_tag, pctx->iauth_tag, 16); + memcpy_to_sglist(req->dst, req->assoclen + req->cryptlen, + pctx->auth_tag, crypto_aead_authsize(aead)); + return 0; } =20 static void gcm_encrypt_done(void *data, int err) { struct aead_request *req =3D data; =20 if (err) goto out; =20 - err =3D gcm_encrypt_continue(req, 0); - if (err =3D=3D -EINPROGRESS) - return; + err =3D gcm_add_auth_tag(req); =20 out: aead_request_complete(req, err); } =20 static int crypto_gcm_encrypt(struct aead_request *req) { struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct skcipher_request *skreq =3D &pctx->u.skreq; + struct skcipher_request *skreq =3D &pctx->skreq; u32 flags =3D aead_request_flags(req); =20 crypto_gcm_init_common(req); crypto_gcm_init_crypt(req, req->cryptlen); skcipher_request_set_callback(skreq, flags, gcm_encrypt_done, req); =20 - return crypto_skcipher_encrypt(skreq) ?: - gcm_encrypt_continue(req, flags); + return crypto_skcipher_encrypt(skreq) ?: gcm_add_auth_tag(req); } =20 static int crypto_gcm_verify(struct aead_request *req) { struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); @@ -479,106 +274,71 @@ static void gcm_decrypt_done(void *data, int err) err =3D crypto_gcm_verify(req); =20 aead_request_complete(req, err); } =20 -static int gcm_dec_hash_continue(struct aead_request *req, u32 flags) -{ - struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct skcipher_request *skreq =3D &pctx->u.skreq; - struct crypto_gcm_ghash_ctx *gctx =3D &pctx->ghash_ctx; - - crypto_gcm_init_crypt(req, gctx->cryptlen); - skcipher_request_set_callback(skreq, flags, gcm_decrypt_done, req); - return crypto_skcipher_decrypt(skreq) ?: crypto_gcm_verify(req); -} - static int crypto_gcm_decrypt(struct aead_request *req) { struct crypto_aead *aead =3D crypto_aead_reqtfm(req); struct crypto_gcm_req_priv_ctx *pctx =3D crypto_gcm_reqctx(req); - struct crypto_gcm_ghash_ctx 
*gctx =3D &pctx->ghash_ctx; - unsigned int authsize =3D crypto_aead_authsize(aead); - unsigned int cryptlen =3D req->cryptlen; - u32 flags =3D aead_request_flags(req); - - cryptlen -=3D authsize; + struct skcipher_request *skreq =3D &pctx->skreq; + unsigned int datalen =3D req->cryptlen - crypto_aead_authsize(aead); =20 crypto_gcm_init_common(req); =20 - gctx->src =3D sg_next(pctx->src); - gctx->cryptlen =3D cryptlen; - gctx->complete =3D gcm_dec_hash_continue; + gcm_hash(req, sg_next(pctx->src), datalen, pctx->iauth_tag); =20 - return gcm_hash(req, flags); + crypto_gcm_init_crypt(req, datalen); + skcipher_request_set_callback(skreq, aead_request_flags(req), + gcm_decrypt_done, req); + return crypto_skcipher_decrypt(skreq) ?: crypto_gcm_verify(req); } =20 static int crypto_gcm_init_tfm(struct crypto_aead *tfm) { struct aead_instance *inst =3D aead_alg_instance(tfm); struct gcm_instance_ctx *ictx =3D aead_instance_ctx(inst); struct crypto_gcm_ctx *ctx =3D crypto_aead_ctx(tfm); struct crypto_skcipher *ctr; - struct crypto_ahash *ghash; unsigned long align; - int err; - - ghash =3D crypto_spawn_ahash(&ictx->ghash); - if (IS_ERR(ghash)) - return PTR_ERR(ghash); =20 ctr =3D crypto_spawn_skcipher(&ictx->ctr); - err =3D PTR_ERR(ctr); if (IS_ERR(ctr)) - goto err_free_hash; + return PTR_ERR(ctr); =20 ctx->ctr =3D ctr; - ctx->ghash =3D ghash; =20 align =3D crypto_aead_alignmask(tfm); align &=3D ~(crypto_tfm_ctx_alignment() - 1); crypto_aead_set_reqsize(tfm, - align + offsetof(struct crypto_gcm_req_priv_ctx, u) + - max(sizeof(struct skcipher_request) + - crypto_skcipher_reqsize(ctr), - sizeof(struct ahash_request) + - crypto_ahash_reqsize(ghash))); - + align + sizeof(struct crypto_gcm_req_priv_ctx) + + crypto_skcipher_reqsize(ctr)); return 0; - -err_free_hash: - crypto_free_ahash(ghash); - return err; } =20 static void crypto_gcm_exit_tfm(struct crypto_aead *tfm) { struct crypto_gcm_ctx *ctx =3D crypto_aead_ctx(tfm); =20 - crypto_free_ahash(ctx->ghash); 
crypto_free_skcipher(ctx->ctr); } =20 static void crypto_gcm_free(struct aead_instance *inst) { struct gcm_instance_ctx *ctx =3D aead_instance_ctx(inst); =20 crypto_drop_skcipher(&ctx->ctr); - crypto_drop_ahash(&ctx->ghash); kfree(inst); } =20 static int crypto_gcm_create_common(struct crypto_template *tmpl, - struct rtattr **tb, - const char *ctr_name, - const char *ghash_name) + struct rtattr **tb, const char *ctr_name) { struct skcipher_alg_common *ctr; u32 mask; struct aead_instance *inst; struct gcm_instance_ctx *ctx; - struct hash_alg_common *ghash; int err; =20 err =3D crypto_check_attr_type(tb, CRYPTO_ALG_TYPE_AEAD, &mask); if (err) return err; @@ -586,21 +346,10 @@ static int crypto_gcm_create_common(struct crypto_tem= plate *tmpl, inst =3D kzalloc(sizeof(*inst) + sizeof(*ctx), GFP_KERNEL); if (!inst) return -ENOMEM; ctx =3D aead_instance_ctx(inst); =20 - err =3D crypto_grab_ahash(&ctx->ghash, aead_crypto_instance(inst), - ghash_name, 0, mask); - if (err) - goto err_free_inst; - ghash =3D crypto_spawn_ahash_alg(&ctx->ghash); - - err =3D -EINVAL; - if (strcmp(ghash->base.cra_name, "ghash") !=3D 0 || - ghash->digestsize !=3D 16) - goto err_free_inst; - err =3D crypto_grab_skcipher(&ctx->ctr, aead_crypto_instance(inst), ctr_name, 0, mask); if (err) goto err_free_inst; ctr =3D crypto_spawn_skcipher_alg_common(&ctx->ctr); @@ -615,17 +364,15 @@ static int crypto_gcm_create_common(struct crypto_tem= plate *tmpl, if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME, "gcm(%s", ctr->base.cra_name + 4) >=3D CRYPTO_MAX_ALG_NAME) goto err_free_inst; =20 if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME, - "gcm_base(%s,%s)", ctr->base.cra_driver_name, - ghash->base.cra_driver_name) >=3D - CRYPTO_MAX_ALG_NAME) + "gcm_base(%s,ghash-lib)", + ctr->base.cra_driver_name) >=3D CRYPTO_MAX_ALG_NAME) goto err_free_inst; =20 - inst->alg.base.cra_priority =3D (ghash->base.cra_priority + - ctr->base.cra_priority) / 2; + inst->alg.base.cra_priority =3D 
ctr->base.cra_priority; inst->alg.base.cra_blocksize =3D 1; inst->alg.base.cra_alignmask =3D ctr->base.cra_alignmask; inst->alg.base.cra_ctxsize =3D sizeof(struct crypto_gcm_ctx); inst->alg.ivsize =3D GCM_AES_IV_SIZE; inst->alg.chunksize =3D ctr->chunksize; @@ -658,11 +405,11 @@ static int crypto_gcm_create(struct crypto_template *= tmpl, struct rtattr **tb) =20 if (snprintf(ctr_name, CRYPTO_MAX_ALG_NAME, "ctr(%s)", cipher_name) >=3D CRYPTO_MAX_ALG_NAME) return -ENAMETOOLONG; =20 - return crypto_gcm_create_common(tmpl, tb, ctr_name, "ghash"); + return crypto_gcm_create_common(tmpl, tb, ctr_name); } =20 static int crypto_gcm_base_create(struct crypto_template *tmpl, struct rtattr **tb) { @@ -675,11 +422,20 @@ static int crypto_gcm_base_create(struct crypto_templ= ate *tmpl, =20 ghash_name =3D crypto_attr_alg_name(tb[2]); if (IS_ERR(ghash_name)) return PTR_ERR(ghash_name); =20 - return crypto_gcm_create_common(tmpl, tb, ctr_name, ghash_name); + /* + * Originally this parameter allowed requesting a specific + * implementation of GHASH. This is no longer supported. Now the best + * implementation of GHASH is just always used. 
+ */ + if (strcmp(ghash_name, "ghash") !=3D 0 && + strcmp(ghash_name, "ghash-lib") !=3D 0) + return -EINVAL; + + return crypto_gcm_create_common(tmpl, tb, ctr_name); } =20 static int crypto_rfc4106_setkey(struct crypto_aead *parent, const u8 *key, unsigned int keylen) { @@ -1094,29 +850,16 @@ static struct crypto_template crypto_gcm_tmpls[] =3D= { }, }; =20 static int __init crypto_gcm_module_init(void) { - int err; - - gcm_zeroes =3D kzalloc_obj(*gcm_zeroes); - if (!gcm_zeroes) - return -ENOMEM; - - sg_init_one(&gcm_zeroes->sg, gcm_zeroes->buf, sizeof(gcm_zeroes->buf)); - - err =3D crypto_register_templates(crypto_gcm_tmpls, - ARRAY_SIZE(crypto_gcm_tmpls)); - if (err) - kfree(gcm_zeroes); - - return err; + return crypto_register_templates(crypto_gcm_tmpls, + ARRAY_SIZE(crypto_gcm_tmpls)); } =20 static void __exit crypto_gcm_module_exit(void) { - kfree(gcm_zeroes); crypto_unregister_templates(crypto_gcm_tmpls, ARRAY_SIZE(crypto_gcm_tmpls)); } =20 module_init(crypto_gcm_module_init); diff --git a/crypto/testmgr.c b/crypto/testmgr.c index fec950f1628b..0b0ad358e091 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -4963,26 +4963,26 @@ static const struct alg_test_desc alg_test_descs[] = =3D { .kpp =3D __VECS(ffdhe8192_dh_tv_template) } }, { #endif /* CONFIG_CRYPTO_DH_RFC7919_GROUPS */ .alg =3D "gcm(aes)", - .generic_driver =3D "gcm_base(ctr(aes-lib),ghash-generic)", + .generic_driver =3D "gcm_base(ctr(aes-lib),ghash-lib)", .test =3D alg_test_aead, .fips_allowed =3D 1, .suite =3D { .aead =3D __VECS(aes_gcm_tv_template) } }, { .alg =3D "gcm(aria)", - .generic_driver =3D "gcm_base(ctr(aria-generic),ghash-generic)", + .generic_driver =3D "gcm_base(ctr(aria-generic),ghash-lib)", .test =3D alg_test_aead, .suite =3D { .aead =3D __VECS(aria_gcm_tv_template) } }, { .alg =3D "gcm(sm4)", - .generic_driver =3D "gcm_base(ctr(sm4-generic),ghash-generic)", + .generic_driver =3D "gcm_base(ctr(sm4-generic),ghash-lib)", .test =3D alg_test_aead, .suite =3D { .aead =3D 
__VECS(sm4_gcm_tv_template) } }, { @@ -5312,11 +5312,11 @@ static const struct alg_test_desc alg_test_descs[] = =3D { .suite =3D { .cipher =3D __VECS(sm4_ctr_rfc3686_tv_template) } }, { .alg =3D "rfc4106(gcm(aes))", - .generic_driver =3D "rfc4106(gcm_base(ctr(aes-lib),ghash-generic))", + .generic_driver =3D "rfc4106(gcm_base(ctr(aes-lib),ghash-lib))", .test =3D alg_test_aead, .fips_allowed =3D 1, .suite =3D { .aead =3D { ____VECS(aes_gcm_rfc4106_tv_template), @@ -5336,11 +5336,11 @@ static const struct alg_test_desc alg_test_descs[] = =3D { .aad_iv =3D 1, } } }, { .alg =3D "rfc4543(gcm(aes))", - .generic_driver =3D "rfc4543(gcm_base(ctr(aes-lib),ghash-generic))", + .generic_driver =3D "rfc4543(gcm_base(ctr(aes-lib),ghash-lib))", .test =3D alg_test_aead, .suite =3D { .aead =3D { ____VECS(aes_gcm_rfc4543_tv_template), .einval_allowed =3D 1, diff --git a/drivers/crypto/starfive/jh7110-aes.c b/drivers/crypto/starfive= /jh7110-aes.c index 2e2d97d17e6c..a0713aa21250 100644 --- a/drivers/crypto/starfive/jh7110-aes.c +++ b/drivers/crypto/starfive/jh7110-aes.c @@ -1006,11 +1006,11 @@ static int starfive_aes_ccm_init_tfm(struct crypto_= aead *tfm) return starfive_aes_aead_init_tfm(tfm, "ccm_base(ctr(aes-lib),cbcmac-aes-= lib)"); } =20 static int starfive_aes_gcm_init_tfm(struct crypto_aead *tfm) { - return starfive_aes_aead_init_tfm(tfm, "gcm_base(ctr(aes-lib),ghash-gener= ic)"); + return starfive_aes_aead_init_tfm(tfm, "gcm_base(ctr(aes-lib),ghash-lib)"= ); } =20 static struct skcipher_engine_alg skcipher_algs[] =3D { { .base.init =3D starfive_aes_ecb_init_tfm, --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E5393603F6; Thu, 19 Mar 2026 06:19:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A .
Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 16/19] crypto: ghash - Remove ghash from crypto_shash API Date: Wed, 18 Mar 2026 23:17:17 -0700 Message-ID: <20260319061723.1140720-17-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Now that there are no users of the "ghash" crypto_shash algorithm, remove it. GHASH remains supported via the library API. Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- crypto/Kconfig | 7 -- crypto/Makefile | 1 - crypto/ghash-generic.c | 162 ----------------------------------------- crypto/tcrypt.c | 9 --- crypto/testmgr.c | 6 -- crypto/testmgr.h | 109 --------------------------- 6 files changed, 294 deletions(-) delete mode 100644 crypto/ghash-generic.c diff --git a/crypto/Kconfig b/crypto/Kconfig index 13ccf5ac2f1a..efb482ea192d 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -886,17 +886,10 @@ config CRYPTO_CMAC select CRYPTO_MANAGER help CMAC (Cipher-based Message Authentication Code) authentication mode (NIST SP800-38B and IETF RFC4493) =20 -config CRYPTO_GHASH - tristate "GHASH" - select CRYPTO_HASH - select CRYPTO_LIB_GF128MUL - help - GCM GHASH function (NIST SP800-38D) - config CRYPTO_HMAC tristate "HMAC (Keyed-Hash MAC)" select CRYPTO_HASH select CRYPTO_MANAGER help diff --git a/crypto/Makefile b/crypto/Makefile index 04e269117589..17f4fca9b9e5 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -169,11 +169,10 @@ CFLAGS_jitterentropy.o =3D -O0 KASAN_SANITIZE_jitterentropy.o =3D n UBSAN_SANITIZE_jitterentropy.o =3D n 
jitterentropy_rng-y :=3D jitterentropy.o jitterentropy-kcapi.o obj-$(CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE) +=3D jitterentropy-testin= g.o obj-$(CONFIG_CRYPTO_BENCHMARK) +=3D tcrypt.o -obj-$(CONFIG_CRYPTO_GHASH) +=3D ghash-generic.o obj-$(CONFIG_CRYPTO_USER_API) +=3D af_alg.o obj-$(CONFIG_CRYPTO_USER_API_HASH) +=3D algif_hash.o obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) +=3D algif_skcipher.o obj-$(CONFIG_CRYPTO_USER_API_RNG) +=3D algif_rng.o obj-$(CONFIG_CRYPTO_USER_API_AEAD) +=3D algif_aead.o diff --git a/crypto/ghash-generic.c b/crypto/ghash-generic.c deleted file mode 100644 index e5803c249c12..000000000000 --- a/crypto/ghash-generic.c +++ /dev/null @@ -1,162 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * GHASH: hash function for GCM (Galois/Counter Mode). - * - * Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen - * Copyright (c) 2009 Intel Corp. - * Author: Huang Ying - */ - -/* - * GHASH is a keyed hash function used in GCM authentication tag generatio= n. - * - * The original GCM paper [1] presents GHASH as a function GHASH(H, A, C) = which - * takes a 16-byte hash key H, additional authenticated data A, and a ciph= ertext - * C. It formats A and C into a single byte string X, interprets X as a - * polynomial over GF(2^128), and evaluates this polynomial at the point H. - * - * However, the NIST standard for GCM [2] presents GHASH as GHASH(H, X) wh= ere X - * is the already-formatted byte string containing both A and C. - * - * "ghash" in the Linux crypto API uses the 'X' (pre-formatted) convention, - * since the API supports only a single data stream per hash. Thus, the - * formatting of 'A' and 'C' is done in the "gcm" template, not in "ghash". - * - * The reason "ghash" is separate from "gcm" is to allow "gcm" to use an - * accelerated "ghash" when a standalone accelerated "gcm(aes)" is unavail= able. 
- * It is generally inappropriate to use "ghash" for other purposes, since = it is - * an "=CE=B5-almost-XOR-universal hash function", not a cryptographic has= h function. - * It can only be used securely in crypto modes specially designed to use = it. - * - * [1] The Galois/Counter Mode of Operation (GCM) - * (http://citeseerx.ist.psu.edu/viewdoc/download?doi=3D10.1.1.694.695= &rep=3Drep1&type=3Dpdf) - * [2] Recommendation for Block Cipher Modes of Operation: Galois/Counter = Mode (GCM) and GMAC - * (https://csrc.nist.gov/publications/detail/sp/800-38d/final) - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -static int ghash_init(struct shash_desc *desc) -{ - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - - memset(dctx, 0, sizeof(*dctx)); - - return 0; -} - -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *key, unsigned int keylen) -{ - struct ghash_ctx *ctx =3D crypto_shash_ctx(tfm); - be128 k; - - if (keylen !=3D GHASH_BLOCK_SIZE) - return -EINVAL; - - if (ctx->gf128) - gf128mul_free_4k(ctx->gf128); - - BUILD_BUG_ON(sizeof(k) !=3D GHASH_BLOCK_SIZE); - memcpy(&k, key, GHASH_BLOCK_SIZE); /* avoid violating alignment rules */ - ctx->gf128 =3D gf128mul_init_4k_lle(&k); - memzero_explicit(&k, GHASH_BLOCK_SIZE); - - if (!ctx->gf128) - return -ENOMEM; - - return 0; -} - -static int ghash_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) -{ - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - struct ghash_ctx *ctx =3D crypto_shash_ctx(desc->tfm); - u8 *dst =3D dctx->buffer; - - do { - crypto_xor(dst, src, GHASH_BLOCK_SIZE); - gf128mul_4k_lle((be128 *)dst, ctx->gf128); - src +=3D GHASH_BLOCK_SIZE; - srclen -=3D GHASH_BLOCK_SIZE; - } while (srclen >=3D GHASH_BLOCK_SIZE); - - return srclen; -} - -static void ghash_flush(struct shash_desc *desc, const u8 *src, - unsigned int len) -{ - struct ghash_ctx *ctx =3D crypto_shash_ctx(desc->tfm); - struct ghash_desc_ctx *dctx =3D 
shash_desc_ctx(desc); - u8 *dst =3D dctx->buffer; - - if (len) { - crypto_xor(dst, src, len); - gf128mul_4k_lle((be128 *)dst, ctx->gf128); - } -} - -static int ghash_finup(struct shash_desc *desc, const u8 *src, - unsigned int len, u8 *dst) -{ - struct ghash_desc_ctx *dctx =3D shash_desc_ctx(desc); - u8 *buf =3D dctx->buffer; - - ghash_flush(desc, src, len); - memcpy(dst, buf, GHASH_BLOCK_SIZE); - - return 0; -} - -static void ghash_exit_tfm(struct crypto_tfm *tfm) -{ - struct ghash_ctx *ctx =3D crypto_tfm_ctx(tfm); - if (ctx->gf128) - gf128mul_free_4k(ctx->gf128); -} - -static struct shash_alg ghash_alg =3D { - .digestsize =3D GHASH_DIGEST_SIZE, - .init =3D ghash_init, - .update =3D ghash_update, - .finup =3D ghash_finup, - .setkey =3D ghash_setkey, - .descsize =3D sizeof(struct ghash_desc_ctx), - .base =3D { - .cra_name =3D "ghash", - .cra_driver_name =3D "ghash-generic", - .cra_priority =3D 100, - .cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY, - .cra_blocksize =3D GHASH_BLOCK_SIZE, - .cra_ctxsize =3D sizeof(struct ghash_ctx), - .cra_module =3D THIS_MODULE, - .cra_exit =3D ghash_exit_tfm, - }, -}; - -static int __init ghash_mod_init(void) -{ - return crypto_register_shash(&ghash_alg); -} - -static void __exit ghash_mod_exit(void) -{ - crypto_unregister_shash(&ghash_alg); -} - -module_init(ghash_mod_init); -module_exit(ghash_mod_exit); - -MODULE_LICENSE("GPL"); -MODULE_DESCRIPTION("GHASH hash function"); -MODULE_ALIAS_CRYPTO("ghash"); -MODULE_ALIAS_CRYPTO("ghash-generic"); diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c index aded37546137..1773f5f71351 100644 --- a/crypto/tcrypt.c +++ b/crypto/tcrypt.c @@ -1648,14 +1648,10 @@ static int do_test(const char *alg, u32 type, u32 m= ask, int m, u32 num_mb) =20 case 45: ret =3D min(ret, tcrypt_test("rfc4309(ccm(aes))")); break; =20 - case 46: - ret =3D min(ret, tcrypt_test("ghash")); - break; - case 48: ret =3D min(ret, tcrypt_test("sha3-224")); break; =20 case 49: @@ -2249,15 +2245,10 @@ static int do_test(const char *alg, 
u32 type, u32 m= ask, int m, u32 num_mb) fallthrough; case 317: test_hash_speed("blake2b-512", sec, generic_hash_speed_template); if (mode > 300 && mode < 400) break; fallthrough; - case 318: - klen =3D 16; - test_hash_speed("ghash", sec, generic_hash_speed_template); - if (mode > 300 && mode < 400) break; - fallthrough; case 319: test_hash_speed("crc32c", sec, generic_hash_speed_template); if (mode > 300 && mode < 400) break; fallthrough; case 322: diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 0b0ad358e091..dd01f86dd6fe 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -4983,16 +4983,10 @@ static const struct alg_test_desc alg_test_descs[] = =3D { .generic_driver =3D "gcm_base(ctr(sm4-generic),ghash-lib)", .test =3D alg_test_aead, .suite =3D { .aead =3D __VECS(sm4_gcm_tv_template) } - }, { - .alg =3D "ghash", - .test =3D alg_test_hash, - .suite =3D { - .hash =3D __VECS(ghash_tv_template) - } }, { .alg =3D "hctr2(aes)", .generic_driver =3D "hctr2_base(xctr(aes-lib),polyval-lib)", .test =3D alg_test_skcipher, .suite =3D { diff --git a/crypto/testmgr.h b/crypto/testmgr.h index 1c69c11c0cdb..a3274abacfde 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -6181,119 +6181,10 @@ static const struct hash_testvec wp256_tv_template= [] =3D { "\x8A\x7A\x5A\x52\xDE\xEE\x65\x62" "\x07\xC5\x62\xF9\x88\xE9\x5C\x69", }, }; =20 -static const struct hash_testvec ghash_tv_template[] =3D -{ - { - .key =3D "\xdf\xa6\xbf\x4d\xed\x81\xdb\x03" - "\xff\xca\xff\x95\xf8\x30\xf0\x61", - .ksize =3D 16, - .plaintext =3D "\x95\x2b\x2a\x56\xa5\x60\x04a\xc0" - "\xb3\x2b\x66\x56\xa0\x5b\x40\xb6", - .psize =3D 16, - .digest =3D "\xda\x53\xeb\x0a\xd2\xc5\x5b\xb6" - "\x4f\xc4\x80\x2c\xc3\xfe\xda\x60", - }, { - .key =3D "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b" - "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b", - .ksize =3D 16, - .plaintext =3D "what do ya want for nothing?", - .psize =3D 28, - .digest =3D "\x3e\x1f\x5c\x4d\x65\xf0\xef\xce" - "\x0d\x61\x06\x27\x66\x51\xd5\xe2", - }, { - .key =3D 
"\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa" - "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa", - .ksize =3D 16, - .plaintext =3D "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd" - "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd" - "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd" - "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd", - .psize =3D 50, - .digest =3D "\xfb\x49\x8a\x36\xe1\x96\xe1\x96" - "\xe1\x96\xe1\x96\xe1\x96\xe1\x96", - }, { - .key =3D "\xda\x53\xeb\x0a\xd2\xc5\x5b\xb6" - "\x4f\xc4\x80\x2c\xc3\xfe\xda\x60", - .ksize =3D 16, - .plaintext =3D "\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd" - "\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd" - "\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd" - "\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd\xcd", - .psize =3D 50, - .digest =3D "\x2b\x5c\x0c\x7f\x52\xd1\x60\xc2" - "\x49\xed\x6e\x32\x7a\xa9\xbe\x08", - }, { - .key =3D "\x95\x2b\x2a\x56\xa5\x60\x04a\xc0" - "\xb3\x2b\x66\x56\xa0\x5b\x40\xb6", - .ksize =3D 16, - .plaintext =3D "Test With Truncation", - .psize =3D 20, - .digest =3D "\xf8\x94\x87\x2a\x4b\x63\x99\x28" - "\x23\xf7\x93\xf7\x19\xf5\x96\xd9", - }, { - .key =3D "\x0a\x1b\x2c\x3d\x4e\x5f\x64\x71" - "\x82\x93\xa4\xb5\xc6\xd7\xe8\xf9", - .ksize =3D 16, - .plaintext =3D "\x56\x6f\x72\x20\x6c\x61\x75\x74" - "\x65\x72\x20\x4c\x61\x75\x73\x63" - "\x68\x65\x6e\x20\x75\x6e\x64\x20" - "\x53\x74\x61\x75\x6e\x65\x6e\x20" - "\x73\x65\x69\x20\x73\x74\x69\x6c" - "\x6c\x2c\x0a\x64\x75\x20\x6d\x65" - "\x69\x6e\x20\x74\x69\x65\x66\x74" - "\x69\x65\x66\x65\x73\x20\x4c\x65" - "\x62\x65\x6e\x3b\x0a\x64\x61\x73" - "\x73\x20\x64\x75\x20\x77\x65\x69" - "\xc3\x9f\x74\x20\x77\x61\x73\x20" - "\x64\x65\x72\x20\x57\x69\x6e\x64" - "\x20\x64\x69\x72\x20\x77\x69\x6c" - "\x6c\x2c\x0a\x65\x68\x20\x6e\x6f" - "\x63\x68\x20\x64\x69\x65\x20\x42" - "\x69\x72\x6b\x65\x6e\x20\x62\x65" - "\x62\x65\x6e\x2e\x0a\x0a\x55\x6e" - "\x64\x20\x77\x65\x6e\x6e\x20\x64" - "\x69\x72\x20\x65\x69\x6e\x6d\x61" - "\x6c\x20\x64\x61\x73\x20\x53\x63" - 
"\x68\x77\x65\x69\x67\x65\x6e\x20" - "\x73\x70\x72\x61\x63\x68\x2c\x0a" - "\x6c\x61\x73\x73\x20\x64\x65\x69" - "\x6e\x65\x20\x53\x69\x6e\x6e\x65" - "\x20\x62\x65\x73\x69\x65\x67\x65" - "\x6e\x2e\x0a\x4a\x65\x64\x65\x6d" - "\x20\x48\x61\x75\x63\x68\x65\x20" - "\x67\x69\x62\x74\x20\x64\x69\x63" - "\x68\x2c\x20\x67\x69\x62\x20\x6e" - "\x61\x63\x68\x2c\x0a\x65\x72\x20" - "\x77\x69\x72\x64\x20\x64\x69\x63" - "\x68\x20\x6c\x69\x65\x62\x65\x6e" - "\x20\x75\x6e\x64\x20\x77\x69\x65" - "\x67\x65\x6e\x2e\x0a\x0a\x55\x6e" - "\x64\x20\x64\x61\x6e\x6e\x20\x6d" - "\x65\x69\x6e\x65\x20\x53\x65\x65" - "\x6c\x65\x20\x73\x65\x69\x74\x20" - "\x77\x65\x69\x74\x2c\x20\x73\x65" - "\x69\x20\x77\x65\x69\x74\x2c\x0a" - "\x64\x61\x73\x73\x20\x64\x69\x72" - "\x20\x64\x61\x73\x20\x4c\x65\x62" - "\x65\x6e\x20\x67\x65\x6c\x69\x6e" - "\x67\x65\x2c\x0a\x62\x72\x65\x69" - "\x74\x65\x20\x64\x69\x63\x68\x20" - "\x77\x69\x65\x20\x65\x69\x6e\x20" - "\x46\x65\x69\x65\x72\x6b\x6c\x65" - "\x69\x64\x0a\xc3\xbc\x62\x65\x72" - "\x20\x64\x69\x65\x20\x73\x69\x6e" - "\x6e\x65\x6e\x64\x65\x6e\x20\x44" - "\x69\x6e\x67\x65\x2e\x2e\x2e\x0a", - .psize =3D 400, - .digest =3D "\xad\xb1\xc1\xe9\x56\x70\x31\x1d" - "\xbb\x5b\xdf\x5e\x70\x72\x1a\x57", - }, -}; - /* * HMAC-MD5 test vectors from RFC2202 * (These need to be fixed to not use strlen). 
*/ static const struct hash_testvec hmac_md5_tv_template[] =3D --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 17/19] lib/crypto: gf128mul: Remove unused 4k_lle functions Date: Wed, 18 Mar 2026 23:17:18 -0700 Message-ID: <20260319061723.1140720-18-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Remove the 4k_lle multiplication functions and the associated gf128mul_table_le data table. Their only user was the generic implementation of GHASH, which has now been changed to use a different implementation based on standard integer multiplication.
Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- include/crypto/gf128mul.h | 17 ++------- lib/crypto/gf128mul.c | 73 +-------------------------------------- 2 files changed, 4 insertions(+), 86 deletions(-) diff --git a/include/crypto/gf128mul.h b/include/crypto/gf128mul.h index b0853f7cada0..6ed2a8351902 100644 --- a/include/crypto/gf128mul.h +++ b/include/crypto/gf128mul.h @@ -213,29 +213,18 @@ static inline void gf128mul_x_ble(le128 *r, const le1= 28 *x) =20 r->a =3D cpu_to_le64((a << 1) | (b >> 63)); r->b =3D cpu_to_le64((b << 1) ^ _tt); } =20 -/* 4k table optimization */ - -struct gf128mul_4k { - be128 t[256]; -}; - -struct gf128mul_4k *gf128mul_init_4k_lle(const be128 *g); -void gf128mul_4k_lle(be128 *a, const struct gf128mul_4k *t); void gf128mul_x8_ble(le128 *r, const le128 *x); -static inline void gf128mul_free_4k(struct gf128mul_4k *t) -{ - kfree_sensitive(t); -} - =20 /* 64k table optimization, implemented for bbe */ =20 struct gf128mul_64k { - struct gf128mul_4k *t[16]; + struct { + be128 t[256]; + } *t[16]; }; =20 /* First initialize with the constant factor with which you * want to multiply and then call gf128mul_64k_bbe with the other * factor in the first argument, and the table in the second. diff --git a/lib/crypto/gf128mul.c b/lib/crypto/gf128mul.c index e5a727b15f07..7ebf07ce1168 100644 --- a/lib/crypto/gf128mul.c +++ b/lib/crypto/gf128mul.c @@ -125,31 +125,13 @@ (i & 0x20 ? 0x3840 : 0) ^ (i & 0x10 ? 0x1c20 : 0) ^ \ (i & 0x08 ? 0x0e10 : 0) ^ (i & 0x04 ? 0x0708 : 0) ^ \ (i & 0x02 ? 0x0384 : 0) ^ (i & 0x01 ? 0x01c2 : 0) \ ) =20 -static const u16 gf128mul_table_le[256] =3D gf128mul_dat(xda_le); static const u16 gf128mul_table_be[256] =3D gf128mul_dat(xda_be); =20 -/* - * The following functions multiply a field element by x^8 in - * the polynomial field representation. They use 64-bit word operations - * to gain speed but compensate for machine endianness and hence work - * correctly on both styles of machine. 
- */ - -static void gf128mul_x8_lle(be128 *x) -{ - u64 a =3D be64_to_cpu(x->a); - u64 b =3D be64_to_cpu(x->b); - u64 _tt =3D gf128mul_table_le[b & 0xff]; - - x->b =3D cpu_to_be64((b >> 8) | (a << 56)); - x->a =3D cpu_to_be64((a >> 8) ^ (_tt << 48)); -} - -/* time invariant version of gf128mul_x8_lle */ +/* A table-less implementation of multiplying by x^8 */ static void gf128mul_x8_lle_ti(be128 *x) { u64 a =3D be64_to_cpu(x->a); u64 b =3D be64_to_cpu(x->b); u64 _tt =3D xda_le(b & 0xff); /* avoid table lookup */ @@ -303,60 +285,7 @@ void gf128mul_64k_bbe(be128 *a, const struct gf128mul_= 64k *t) be128_xor(r, r, &t->t[i]->t[ap[15 - i]]); *a =3D *r; } EXPORT_SYMBOL(gf128mul_64k_bbe); =20 -/* This version uses 4k bytes of table space. - A 16 byte buffer has to be multiplied by a 16 byte key - value in GF(2^128). If we consider a GF(2^128) value in a - single byte, we can construct a table of the 256 16 byte - values that result from the 256 values of this byte. - This requires 4096 bytes. If we take the highest byte in - the buffer and use this table to get the result, we then - have to multiply by x^120 to get the final value. For the - next highest byte the result has to be multiplied by x^112 - and so on. But we can do this by accumulating the result - in an accumulator starting with the result for the top - byte. We repeatedly multiply the accumulator value by - x^8 and then add in (i.e. xor) the 16 bytes of the next - lower byte in the buffer, stopping when we reach the - lowest byte. This requires a 4096 byte table. 
-*/ -struct gf128mul_4k *gf128mul_init_4k_lle(const be128 *g) -{ - struct gf128mul_4k *t; - int j, k; - - t =3D kzalloc_obj(*t); - if (!t) - goto out; - - t->t[128] =3D *g; - for (j =3D 64; j > 0; j >>=3D 1) - gf128mul_x_lle(&t->t[j], &t->t[j+j]); - - for (j =3D 2; j < 256; j +=3D j) - for (k =3D 1; k < j; ++k) - be128_xor(&t->t[j + k], &t->t[j], &t->t[k]); - -out: - return t; -} -EXPORT_SYMBOL(gf128mul_init_4k_lle); - -void gf128mul_4k_lle(be128 *a, const struct gf128mul_4k *t) -{ - u8 *ap =3D (u8 *)a; - be128 r[1]; - int i =3D 15; - - *r =3D t->t[ap[15]]; - while (i--) { - gf128mul_x8_lle(r); - be128_xor(r, r, &t->t[ap[i]]); - } - *a =3D *r; -} -EXPORT_SYMBOL(gf128mul_4k_lle); - MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Functions for multiplying elements of GF(2^128)"); --=20 2.53.0 From nobody Mon Apr 6 15:03:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7849B34D391; Thu, 19 Mar 2026 06:19:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901163; cv=none; b=NEORRmaUssRQPPfwtbbP4w2hgPkYBWSB1HBsfhnnpOkrwuXeqqW5x30dN6Qenv4JjWjPXQwTHaAEQRctPde1NPt7FJjuALwo265tnNQL3EQVFKHnN2jJjxYJUfuvH9p6zwJKU0uB8WwTtiCpR+JDL7C3ytKtNKfenIUq7k10ApA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773901163; c=relaxed/simple; bh=30YR8O9RGszwoyHMoelh/TUtAbWKLQ7inDpOxAFZk7k=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SRVKAJR6OU78BWoEbWnUmvesWH/UNdy3LSenMMBZvGHyPXcBytKFxDQpvc/Zq9OaGqxqHt9SNKvMl6YgqccFj5vh/HlZWrdMG9r3cnUoexIML+ffGHL1c3HjvUZc45pCtUSCnzWoUOdckAtxlE/LPjIdr01xJQp2e2VRw0+SZLs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Ard Biesheuvel , "Jason A . Donenfeld" , Herbert Xu , linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, x86@kernel.org, Eric Biggers Subject: [PATCH 18/19] lib/crypto: gf128hash: Remove unused content from ghash.h Date: Wed, 18 Mar 2026 23:17:19 -0700 Message-ID: <20260319061723.1140720-19-ebiggers@kernel.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org> References: <20260319061723.1140720-1-ebiggers@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that the structures in <crypto/ghash.h> are no longer used, remove them. Since this leaves <crypto/ghash.h> as just containing constants, include it from <crypto/gf128hash.h> to deduplicate these definitions.
Signed-off-by: Eric Biggers
Acked-by: Ard Biesheuvel
---
 include/crypto/gf128hash.h |  3 +--
 include/crypto/ghash.h     | 12 ------------
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/include/crypto/gf128hash.h b/include/crypto/gf128hash.h
index 0bc649d01e12..41c557d55965 100644
--- a/include/crypto/gf128hash.h
+++ b/include/crypto/gf128hash.h
@@ -6,15 +6,14 @@
  */
 
 #ifndef _CRYPTO_GF128HASH_H
 #define _CRYPTO_GF128HASH_H
 
+#include <crypto/ghash.h>
 #include
 #include
 
-#define GHASH_BLOCK_SIZE	16
-#define GHASH_DIGEST_SIZE	16
 #define POLYVAL_BLOCK_SIZE	16
 #define POLYVAL_DIGEST_SIZE	16
 
 /**
  * struct polyval_elem - An element of the POLYVAL finite field
diff --git a/include/crypto/ghash.h b/include/crypto/ghash.h
index 043d938e9a2c..d187e5af9925 100644
--- a/include/crypto/ghash.h
+++ b/include/crypto/ghash.h
@@ -4,21 +4,9 @@
  */
 
 #ifndef __CRYPTO_GHASH_H__
 #define __CRYPTO_GHASH_H__
 
-#include
-
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
 
-struct gf128mul_4k;
-
-struct ghash_ctx {
-	struct gf128mul_4k *gf128;
-};
-
-struct ghash_desc_ctx {
-	u8 buffer[GHASH_BLOCK_SIZE];
-};
-
 #endif
-- 
2.53.0

From nobody Mon Apr 6 15:03:04 2026
From: Eric Biggers
To: linux-crypto@vger.kernel.org
Subject: [PATCH 19/19] lib/crypto: aesgcm: Use GHASH library API
Date: Wed, 18 Mar 2026 23:17:20 -0700
Message-ID: <20260319061723.1140720-20-ebiggers@kernel.org>
In-Reply-To: <20260319061723.1140720-1-ebiggers@kernel.org>
References: <20260319061723.1140720-1-ebiggers@kernel.org>

Make the AES-GCM library use the GHASH library instead of directly
calling gf128mul_lle().  This allows the architecture-optimized GHASH
implementations to be used, or the improved generic implementation if no
architecture-optimized implementation is usable.

Note: this means that <crypto/gcm.h> no longer needs to include
<crypto/gf128mul.h>.  Remove that inclusion, and include
<crypto/gf128mul.h> explicitly from arch/x86/crypto/aesni-intel_glue.c,
which previously was relying on the transitive inclusion.
Signed-off-by: Eric Biggers
Acked-by: Ard Biesheuvel
---
 arch/x86/crypto/aesni-intel_glue.c |  1 +
 include/crypto/gcm.h               |  4 +--
 lib/crypto/Kconfig                 |  2 +-
 lib/crypto/aesgcm.c                | 55 +++++++++++++++---------------
 4 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index e6c38d1d8a92..f522fff9231e 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -23,10 +23,11 @@
 #include
 #include
 #include
 #include
 #include
+#include <crypto/gf128mul.h>
 #include
 #include
 #include
 #include
 #include
diff --git a/include/crypto/gcm.h b/include/crypto/gcm.h
index b524e47bd4d0..1d5f39ff1dc4 100644
--- a/include/crypto/gcm.h
+++ b/include/crypto/gcm.h
@@ -2,11 +2,11 @@
 #define _CRYPTO_GCM_H
 
 #include
 
 #include
-#include <crypto/gf128mul.h>
+#include <crypto/gf128hash.h>
 
 #define GCM_AES_IV_SIZE		12
 #define GCM_RFC4106_IV_SIZE	8
 #define GCM_RFC4543_IV_SIZE	8
 
@@ -63,11 +63,11 @@ static inline int crypto_ipsec_check_assoclen(unsigned int assoclen)
 
 	return 0;
 }
 
 struct aesgcm_ctx {
-	be128 ghash_key;
+	struct ghash_key ghash_key;
 	struct aes_enckey aes_key;
 	unsigned int authsize;
 };
 
 int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index a39e7707e9ee..32fafe245f47 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -39,11 +39,11 @@ config CRYPTO_LIB_AES_CBC_MACS
	  .
 
 config CRYPTO_LIB_AESGCM
	tristate
	select CRYPTO_LIB_AES
-	select CRYPTO_LIB_GF128MUL
+	select CRYPTO_LIB_GF128HASH
	select CRYPTO_LIB_UTILS
 
 config CRYPTO_LIB_ARC4
	tristate
 
diff --git a/lib/crypto/aesgcm.c b/lib/crypto/aesgcm.c
index 02f5b5f32c76..8c7e74d2d147 100644
--- a/lib/crypto/aesgcm.c
+++ b/lib/crypto/aesgcm.c
@@ -3,13 +3,12 @@
  * Minimal library implementation of GCM
  *
  * Copyright 2022 Google LLC
  */
 
-#include
 #include
-#include <crypto/gf128mul.h>
+#include <crypto/gf128hash.h>
 #include
 #include
 #include
 
 static void aesgcm_encrypt_block(const struct aes_enckey *key, void *dst,
@@ -43,37 +42,26 @@ static void aesgcm_encrypt_block(const struct aes_enckey *key, void *dst,
  * that are not permitted by the GCM specification.
  */
 int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
		     unsigned int keysize, unsigned int authsize)
 {
-	u8 kin[AES_BLOCK_SIZE] = {};
+	u8 h[AES_BLOCK_SIZE] = {};
	int ret;
 
	ret = crypto_gcm_check_authsize(authsize) ?:
	      aes_prepareenckey(&ctx->aes_key, key, keysize);
	if (ret)
		return ret;
 
	ctx->authsize = authsize;
-	aesgcm_encrypt_block(&ctx->aes_key, &ctx->ghash_key, kin);
-
+	aesgcm_encrypt_block(&ctx->aes_key, h, h);
+	ghash_preparekey(&ctx->ghash_key, h);
+	memzero_explicit(h, sizeof(h));
	return 0;
 }
 EXPORT_SYMBOL(aesgcm_expandkey);
 
-static void aesgcm_ghash(be128 *ghash, const be128 *key, const void *src,
-			 int len)
-{
-	while (len > 0) {
-		crypto_xor((u8 *)ghash, src, min(len, GHASH_BLOCK_SIZE));
-		gf128mul_lle(ghash, key);
-
-		src += GHASH_BLOCK_SIZE;
-		len -= GHASH_BLOCK_SIZE;
-	}
-}
-
 /**
  * aesgcm_mac - Generates the authentication tag using AES-GCM algorithm.
  * @ctx: The data structure that will hold the AES-GCM key schedule
  * @src: The input source data.
  * @src_len: Length of the source data.
@@ -86,24 +74,37 @@ static void aesgcm_ghash(be128 *ghash, const be128 *key, const void *src,
  * and an output buffer for the authentication tag.
  */
 static void aesgcm_mac(const struct aesgcm_ctx *ctx, const u8 *src, int src_len,
		       const u8 *assoc, int assoc_len, __be32 *ctr, u8 *authtag)
 {
-	be128 tail = { cpu_to_be64(assoc_len * 8), cpu_to_be64(src_len * 8) };
-	u8 buf[AES_BLOCK_SIZE];
-	be128 ghash = {};
+	static const u8 zeroes[GHASH_BLOCK_SIZE];
+	__be64 tail[2] = {
+		cpu_to_be64((u64)assoc_len * 8),
+		cpu_to_be64((u64)src_len * 8),
+	};
+	struct ghash_ctx ghash;
+	u8 ghash_out[AES_BLOCK_SIZE];
+	u8 enc_ctr[AES_BLOCK_SIZE];
+
+	ghash_init(&ghash, &ctx->ghash_key);
+
+	ghash_update(&ghash, assoc, assoc_len);
+	ghash_update(&ghash, zeroes, -assoc_len & (GHASH_BLOCK_SIZE - 1));
 
-	aesgcm_ghash(&ghash, &ctx->ghash_key, assoc, assoc_len);
-	aesgcm_ghash(&ghash, &ctx->ghash_key, src, src_len);
-	aesgcm_ghash(&ghash, &ctx->ghash_key, &tail, sizeof(tail));
+	ghash_update(&ghash, src, src_len);
+	ghash_update(&ghash, zeroes, -src_len & (GHASH_BLOCK_SIZE - 1));
+
+	ghash_update(&ghash, (const u8 *)&tail, sizeof(tail));
+
+	ghash_final(&ghash, ghash_out);
 
	ctr[3] = cpu_to_be32(1);
-	aesgcm_encrypt_block(&ctx->aes_key, buf, ctr);
-	crypto_xor_cpy(authtag, buf, (u8 *)&ghash, ctx->authsize);
+	aesgcm_encrypt_block(&ctx->aes_key, enc_ctr, ctr);
+	crypto_xor_cpy(authtag, ghash_out, enc_ctr, ctx->authsize);
 
-	memzero_explicit(&ghash, sizeof(ghash));
-	memzero_explicit(buf, sizeof(buf));
+	memzero_explicit(ghash_out, sizeof(ghash_out));
+	memzero_explicit(enc_ctr, sizeof(enc_ctr));
 }
 
 static void aesgcm_crypt(const struct aesgcm_ctx *ctx, u8 *dst, const u8 *src,
			 int len, __be32 *ctr)
 {
-- 
2.53.0