From nobody Mon Feb  9 21:38:30 2026
Date: Mon, 28 Apr 2025 13:17:11 +0800
Message-Id: <59670d6539eac83227db86dea67166ec7b86c1ca.1745816372.git.herbert@gondor.apana.org.au>
From: Herbert Xu
Subject: [v3 PATCH 04/13] crypto: arm64/sha256 - implement library instead of shash
To: Linux Crypto Mailing List
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    sparclinux@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org,
    Ard Biesheuvel, "Jason A. Donenfeld", Linus Torvalds
Donenfeld " , Linus Torvalds Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Eric Biggers Instead of providing crypto_shash algorithms for the arch-optimized SHA-256 code, instead implement the SHA-256 library. This is much simpler, it makes the SHA-256 library functions be arch-optimized, and it fixes the longstanding issue where the arch-optimized SHA-256 was disabled by default. SHA-256 still remains available through crypto_shash, but individual architectures no longer need to handle it. Remove support for SHA-256 finalization from the ARMv8 CE assembly code, since the library does not yet support architecture-specific overrides of the finalization. (Support for that has been omitted for now, for simplicity and because usually it isn't performance-critical.) To match sha256_blocks_arch(), change the type of the nblocks parameter of the assembly functions from int or 'unsigned int' to size_t. Update the ARMv8 CE assembly function accordingly. The scalar and NEON assembly functions actually already treated it as size_t. While renaming the assembly files, also fix the naming quirks where "sha2" meant sha256, and "sha512" meant both sha256 and sha512. Signed-off-by: Eric Biggers Reviewed-by: Ard Biesheuvel Signed-off-by: Herbert Xu --- arch/arm64/configs/defconfig | 1 - arch/arm64/crypto/Kconfig | 19 --- arch/arm64/crypto/Makefile | 13 +- arch/arm64/crypto/sha2-ce-glue.c | 138 ---------------- arch/arm64/crypto/sha256-glue.c | 156 ------------------ arch/arm64/crypto/sha512-glue.c | 6 +- arch/arm64/lib/crypto/.gitignore | 1 + arch/arm64/lib/crypto/Kconfig | 6 + arch/arm64/lib/crypto/Makefile | 9 +- .../crypto/sha2-armv8.pl} | 2 +- .../sha2-ce-core.S =3D> lib/crypto/sha256-ce.S} | 36 +--- arch/arm64/lib/crypto/sha256.c | 75 +++++++++ 12 files changed, 103 insertions(+), 359 deletions(-) delete mode 100644 arch/arm64/crypto/sha2-ce-glue.c delete mode 100644 arch/arm64/crypto/sha256-glue.c rename arch/arm64/{crypto/sha512-armv8.pl =3D> lib/crypto/sha2-armv8.pl} (= 99%) rename arch/arm64/{crypto/sha2-ce-core.S =3D> lib/crypto/sha256-ce.S} (80%) create mode 100644 arch/arm64/lib/crypto/sha256.c diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index 5bb8f09422a2..b0d4c7d173ea 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -1737,7 +1737,6 @@ CONFIG_CRYPTO_USER_API_RNG=3Dm CONFIG_CRYPTO_CHACHA20_NEON=3Dm CONFIG_CRYPTO_GHASH_ARM64_CE=3Dy CONFIG_CRYPTO_SHA1_ARM64_CE=3Dy -CONFIG_CRYPTO_SHA2_ARM64_CE=3Dy CONFIG_CRYPTO_SHA512_ARM64_CE=3Dm CONFIG_CRYPTO_SHA3_ARM64=3Dm CONFIG_CRYPTO_SM3_ARM64_CE=3Dm diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 55a7d87a6769..c44b0f202a1f 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -36,25 +36,6 @@ config CRYPTO_SHA1_ARM64_CE Architecture: arm64 using: - ARMv8 Crypto Extensions =20 -config CRYPTO_SHA256_ARM64 - tristate "Hash functions: SHA-224 and SHA-256" - select CRYPTO_HASH - help - SHA-224 and SHA-256 secure hash algorithms (FIPS 180) - - Architecture: arm64 - -config CRYPTO_SHA2_ARM64_CE - tristate "Hash functions: SHA-224 and SHA-256 (ARMv8 Crypto Extensions)" - depends on KERNEL_MODE_NEON - select CRYPTO_HASH - select CRYPTO_SHA256_ARM64 - help - SHA-224 and SHA-256 secure hash algorithms (FIPS 180) - - Architecture: arm64 using: - - ARMv8 Crypto Extensions - config CRYPTO_SHA512_ARM64 
tristate "Hash functions: SHA-384 and SHA-512" select CRYPTO_HASH diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 089ae3ddde81..c231c980c514 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -8,9 +8,6 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM64_CE) +=3D sha1-ce.o sha1-ce-y :=3D sha1-ce-glue.o sha1-ce-core.o =20 -obj-$(CONFIG_CRYPTO_SHA2_ARM64_CE) +=3D sha2-ce.o -sha2-ce-y :=3D sha2-ce-glue.o sha2-ce-core.o - obj-$(CONFIG_CRYPTO_SHA512_ARM64_CE) +=3D sha512-ce.o sha512-ce-y :=3D sha512-ce-glue.o sha512-ce-core.o =20 @@ -56,9 +53,6 @@ aes-ce-blk-y :=3D aes-glue-ce.o aes-ce.o obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) +=3D aes-neon-blk.o aes-neon-blk-y :=3D aes-glue-neon.o aes-neon.o =20 -obj-$(CONFIG_CRYPTO_SHA256_ARM64) +=3D sha256-arm64.o -sha256-arm64-y :=3D sha256-glue.o sha256-core.o - obj-$(CONFIG_CRYPTO_SHA512_ARM64) +=3D sha512-arm64.o sha512-arm64-y :=3D sha512-glue.o sha512-core.o =20 @@ -74,10 +68,7 @@ aes-neon-bs-y :=3D aes-neonbs-core.o aes-neonbs-glue.o quiet_cmd_perlasm =3D PERLASM $@ cmd_perlasm =3D $(PERL) $(<) void $(@) =20 -$(obj)/%-core.S: $(src)/%-armv8.pl +$(obj)/sha512-core.S: $(src)/../lib/crypto/sha2-armv8.pl $(call cmd,perlasm) =20 -$(obj)/sha256-core.S: $(src)/sha512-armv8.pl - $(call cmd,perlasm) - -clean-files +=3D sha256-core.S sha512-core.S +clean-files +=3D sha512-core.S diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-g= lue.c deleted file mode 100644 index 912c215101eb..000000000000 --- a/arch/arm64/crypto/sha2-ce-glue.c +++ /dev/null @@ -1,138 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * sha2-ce-glue.c - SHA-224/SHA-256 using ARMv8 Crypto Extensions - * - * Copyright (C) 2014 - 2017 Linaro Ltd - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensi= ons"); -MODULE_AUTHOR("Ard Biesheuvel "); -MODULE_LICENSE("GPL v2"); -MODULE_ALIAS_CRYPTO("sha224"); -MODULE_ALIAS_CRYPTO("sha256"); - -struct sha256_ce_state { - struct crypto_sha256_state sst; - u32 finalize; -}; - -extern const u32 sha256_ce_offsetof_count; -extern const u32 sha256_ce_offsetof_finalize; - -asmlinkage int __sha256_ce_transform(struct sha256_ce_state *sst, u8 const= *src, - int blocks); - -static void sha256_ce_transform(struct crypto_sha256_state *sst, u8 const = *src, - int blocks) -{ - while (blocks) { - int rem; - - kernel_neon_begin(); - rem =3D __sha256_ce_transform(container_of(sst, - struct sha256_ce_state, - sst), src, blocks); - kernel_neon_end(); - src +=3D (blocks - rem) * SHA256_BLOCK_SIZE; - blocks =3D rem; - } -} - -const u32 sha256_ce_offsetof_count =3D offsetof(struct sha256_ce_state, - sst.count); -const u32 sha256_ce_offsetof_finalize =3D offsetof(struct sha256_ce_state, - finalize); - -static int sha256_ce_update(struct shash_desc *desc, const u8 *data, - unsigned int len) -{ - struct sha256_ce_state *sctx =3D shash_desc_ctx(desc); - - sctx->finalize =3D 0; - return sha256_base_do_update_blocks(desc, data, len, - sha256_ce_transform); -} - -static int sha256_ce_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) -{ - struct sha256_ce_state *sctx =3D shash_desc_ctx(desc); - bool finalize =3D !(len % SHA256_BLOCK_SIZE) && len; - - /* - * Allow the asm code to perform the finalization if there is no - * partial data and the input is a round multiple of the block size. 
diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
deleted file mode 100644
index 912c215101eb..000000000000
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ /dev/null
@@ -1,138 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * sha2-ce-glue.c - SHA-224/SHA-256 using ARMv8 Crypto Extensions
- *
- * Copyright (C) 2014 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
- */
-
-#include <asm/neon.h>
-#include <asm/simd.h>
-#include <crypto/internal/hash.h>
-#include <crypto/internal/simd.h>
-#include <crypto/sha2.h>
-#include <crypto/sha256_base.h>
-#include <linux/cpufeature.h>
-#include <linux/module.h>
-
-MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("sha224");
-MODULE_ALIAS_CRYPTO("sha256");
-
-struct sha256_ce_state {
-	struct crypto_sha256_state sst;
-	u32 finalize;
-};
-
-extern const u32 sha256_ce_offsetof_count;
-extern const u32 sha256_ce_offsetof_finalize;
-
-asmlinkage int __sha256_ce_transform(struct sha256_ce_state *sst, u8 const *src,
-				     int blocks);
-
-static void sha256_ce_transform(struct crypto_sha256_state *sst, u8 const *src,
-				int blocks)
-{
-	while (blocks) {
-		int rem;
-
-		kernel_neon_begin();
-		rem = __sha256_ce_transform(container_of(sst,
-							 struct sha256_ce_state,
-							 sst), src, blocks);
-		kernel_neon_end();
-		src += (blocks - rem) * SHA256_BLOCK_SIZE;
-		blocks = rem;
-	}
-}
-
-const u32 sha256_ce_offsetof_count = offsetof(struct sha256_ce_state,
-					      sst.count);
-const u32 sha256_ce_offsetof_finalize = offsetof(struct sha256_ce_state,
-						 finalize);
-
-static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
-			    unsigned int len)
-{
-	struct sha256_ce_state *sctx = shash_desc_ctx(desc);
-
-	sctx->finalize = 0;
-	return sha256_base_do_update_blocks(desc, data, len,
-					    sha256_ce_transform);
-}
-
-static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
-			   unsigned int len, u8 *out)
-{
-	struct sha256_ce_state *sctx = shash_desc_ctx(desc);
-	bool finalize = !(len % SHA256_BLOCK_SIZE) && len;
-
-	/*
-	 * Allow the asm code to perform the finalization if there is no
-	 * partial data and the input is a round multiple of the block size.
-	 */
-	sctx->finalize = finalize;
-
-	if (finalize)
-		sha256_base_do_update_blocks(desc, data, len,
-					     sha256_ce_transform);
-	else
-		sha256_base_do_finup(desc, data, len, sha256_ce_transform);
-	return sha256_base_finish(desc, out);
-}
-
-static int sha256_ce_digest(struct shash_desc *desc, const u8 *data,
-			    unsigned int len, u8 *out)
-{
-	sha256_base_init(desc);
-	return sha256_ce_finup(desc, data, len, out);
-}
-
-static struct shash_alg algs[] = { {
-	.init			= sha224_base_init,
-	.update			= sha256_ce_update,
-	.finup			= sha256_ce_finup,
-	.descsize		= sizeof(struct sha256_ce_state),
-	.statesize		= sizeof(struct crypto_sha256_state),
-	.digestsize		= SHA224_DIGEST_SIZE,
-	.base			= {
-		.cra_name		= "sha224",
-		.cra_driver_name	= "sha224-ce",
-		.cra_priority		= 200,
-		.cra_flags		= CRYPTO_AHASH_ALG_BLOCK_ONLY |
-					  CRYPTO_AHASH_ALG_FINUP_MAX,
-		.cra_blocksize		= SHA256_BLOCK_SIZE,
-		.cra_module		= THIS_MODULE,
-	}
-}, {
-	.init			= sha256_base_init,
-	.update			= sha256_ce_update,
-	.finup			= sha256_ce_finup,
-	.digest			= sha256_ce_digest,
-	.descsize		= sizeof(struct sha256_ce_state),
-	.statesize		= sizeof(struct crypto_sha256_state),
-	.digestsize		= SHA256_DIGEST_SIZE,
-	.base			= {
-		.cra_name		= "sha256",
-		.cra_driver_name	= "sha256-ce",
-		.cra_priority		= 200,
-		.cra_flags		= CRYPTO_AHASH_ALG_BLOCK_ONLY |
-					  CRYPTO_AHASH_ALG_FINUP_MAX,
-		.cra_blocksize		= SHA256_BLOCK_SIZE,
-		.cra_module		= THIS_MODULE,
-	}
-} };
-
-static int __init sha2_ce_mod_init(void)
-{
-	return crypto_register_shashes(algs, ARRAY_SIZE(algs));
-}
-
-static void __exit sha2_ce_mod_fini(void)
-{
-	crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
-}
-
-module_cpu_feature_match(SHA2, sha2_ce_mod_init);
-module_exit(sha2_ce_mod_fini);
diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
deleted file mode 100644
index d63ea82e1374..000000000000
--- a/arch/arm64/crypto/sha256-glue.c
+++ /dev/null
@@ -1,156 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Linux/arm64 port of the OpenSSL SHA256 implementation for AArch64
- *
- * Copyright (c) 2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
- */
-
-#include <asm/hwcap.h>
-#include <asm/neon.h>
-#include <asm/simd.h>
-#include <crypto/internal/hash.h>
-#include <crypto/sha2.h>
-#include <crypto/sha256_base.h>
-#include <linux/module.h>
-
-MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash for arm64");
-MODULE_AUTHOR("Andy Polyakov <appro@openssl.org>");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("sha224");
-MODULE_ALIAS_CRYPTO("sha256");
-
-asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
-					unsigned int num_blks);
-EXPORT_SYMBOL(sha256_block_data_order);
-
-static void sha256_arm64_transform(struct crypto_sha256_state *sst,
-				   u8 const *src, int blocks)
-{
-	sha256_block_data_order(sst->state, src, blocks);
-}
-
-asmlinkage void sha256_block_neon(u32 *digest, const void *data,
-				  unsigned int num_blks);
-
-static void sha256_neon_transform(struct crypto_sha256_state *sst,
-				  u8 const *src, int blocks)
-{
-	kernel_neon_begin();
-	sha256_block_neon(sst->state, src, blocks);
-	kernel_neon_end();
-}
-
-static int crypto_sha256_arm64_update(struct shash_desc *desc, const u8 *data,
-				      unsigned int len)
-{
-	return sha256_base_do_update_blocks(desc, data, len,
-					    sha256_arm64_transform);
-}
-
-static int crypto_sha256_arm64_finup(struct shash_desc *desc, const u8 *data,
-				     unsigned int len, u8 *out)
-{
-	sha256_base_do_finup(desc, data, len, sha256_arm64_transform);
-	return sha256_base_finish(desc, out);
-}
-
-static struct shash_alg algs[] = { {
-	.digestsize		= SHA256_DIGEST_SIZE,
-	.init			= sha256_base_init,
-	.update			= crypto_sha256_arm64_update,
-	.finup			= crypto_sha256_arm64_finup,
-	.descsize		= sizeof(struct crypto_sha256_state),
-	.base.cra_name		= "sha256",
-	.base.cra_driver_name	= "sha256-arm64",
-	.base.cra_priority	= 125,
-	.base.cra_flags		= CRYPTO_AHASH_ALG_BLOCK_ONLY |
-				  CRYPTO_AHASH_ALG_FINUP_MAX,
-	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
-	.base.cra_module	= THIS_MODULE,
-}, {
-	.digestsize		= SHA224_DIGEST_SIZE,
-	.init			= sha224_base_init,
-	.update			= crypto_sha256_arm64_update,
-	.finup			= crypto_sha256_arm64_finup,
-	.descsize		= sizeof(struct crypto_sha256_state),
-	.base.cra_name		= "sha224",
-	.base.cra_driver_name	= "sha224-arm64",
-	.base.cra_priority	= 125,
-	.base.cra_flags		= CRYPTO_AHASH_ALG_BLOCK_ONLY |
-				  CRYPTO_AHASH_ALG_FINUP_MAX,
-	.base.cra_blocksize	= SHA224_BLOCK_SIZE,
-	.base.cra_module	= THIS_MODULE,
-} };
-
-static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
-			      unsigned int len)
-{
-	return sha256_base_do_update_blocks(desc, data, len,
-					    sha256_neon_transform);
-}
-
-static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
-			     unsigned int len, u8 *out)
-{
-	if (len >= SHA256_BLOCK_SIZE) {
-		int remain = sha256_update_neon(desc, data, len);
-
-		data += len - remain;
-		len = remain;
-	}
-	sha256_base_do_finup(desc, data, len, sha256_neon_transform);
-	return sha256_base_finish(desc, out);
-}
-
-static struct shash_alg neon_algs[] = { {
-	.digestsize		= SHA256_DIGEST_SIZE,
-	.init			= sha256_base_init,
-	.update			= sha256_update_neon,
-	.finup			= sha256_finup_neon,
-	.descsize		= sizeof(struct crypto_sha256_state),
-	.base.cra_name		= "sha256",
-	.base.cra_driver_name	= "sha256-arm64-neon",
-	.base.cra_priority	= 150,
-	.base.cra_flags		= CRYPTO_AHASH_ALG_BLOCK_ONLY |
-				  CRYPTO_AHASH_ALG_FINUP_MAX,
-	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
-	.base.cra_module	= THIS_MODULE,
-}, {
-	.digestsize		= SHA224_DIGEST_SIZE,
-	.init			= sha224_base_init,
-	.update			= sha256_update_neon,
-	.finup			= sha256_finup_neon,
-	.descsize		= sizeof(struct crypto_sha256_state),
"sha224", - .base.cra_driver_name =3D "sha224-arm64-neon", - .base.cra_priority =3D 150, - .base.cra_flags =3D CRYPTO_AHASH_ALG_BLOCK_ONLY | - CRYPTO_AHASH_ALG_FINUP_MAX, - .base.cra_blocksize =3D SHA224_BLOCK_SIZE, - .base.cra_module =3D THIS_MODULE, -} }; - -static int __init sha256_mod_init(void) -{ - int ret =3D crypto_register_shashes(algs, ARRAY_SIZE(algs)); - if (ret) - return ret; - - if (cpu_have_named_feature(ASIMD)) { - ret =3D crypto_register_shashes(neon_algs, ARRAY_SIZE(neon_algs)); - if (ret) - crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); - } - return ret; -} - -static void __exit sha256_mod_fini(void) -{ - if (cpu_have_named_feature(ASIMD)) - crypto_unregister_shashes(neon_algs, ARRAY_SIZE(neon_algs)); - crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); -} - -module_init(sha256_mod_init); -module_exit(sha256_mod_fini); diff --git a/arch/arm64/crypto/sha512-glue.c b/arch/arm64/crypto/sha512-glu= e.c index ab2e1c13dfad..15aa9d8b7b2c 100644 --- a/arch/arm64/crypto/sha512-glue.c +++ b/arch/arm64/crypto/sha512-glue.c @@ -18,13 +18,13 @@ MODULE_LICENSE("GPL v2"); MODULE_ALIAS_CRYPTO("sha384"); MODULE_ALIAS_CRYPTO("sha512"); =20 -asmlinkage void sha512_block_data_order(u64 *digest, const void *data, - unsigned int num_blks); +asmlinkage void sha512_blocks_arch(u64 *digest, const void *data, + unsigned int num_blks); =20 static void sha512_arm64_transform(struct sha512_state *sst, u8 const *src, int blocks) { - sha512_block_data_order(sst->state, src, blocks); + sha512_blocks_arch(sst->state, src, blocks); } =20 static int sha512_update(struct shash_desc *desc, const u8 *data, diff --git a/arch/arm64/lib/crypto/.gitignore b/arch/arm64/lib/crypto/.giti= gnore index 0d47d4f21c6d..12d74d8b03d0 100644 --- a/arch/arm64/lib/crypto/.gitignore +++ b/arch/arm64/lib/crypto/.gitignore @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only poly1305-core.S +sha256-core.S diff --git a/arch/arm64/lib/crypto/Kconfig b/arch/arm64/lib/crypto/Kconfig index 0b903ef524d8..129a7685cb4c 100644 --- a/arch/arm64/lib/crypto/Kconfig +++ b/arch/arm64/lib/crypto/Kconfig @@ -12,3 +12,9 @@ config CRYPTO_POLY1305_NEON depends on KERNEL_MODE_NEON default CRYPTO_LIB_POLY1305 select CRYPTO_ARCH_HAVE_LIB_POLY1305 + +config CRYPTO_SHA256_ARM64 + tristate + default CRYPTO_LIB_SHA256 + select CRYPTO_ARCH_HAVE_LIB_SHA256 + select CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD diff --git a/arch/arm64/lib/crypto/Makefile b/arch/arm64/lib/crypto/Makefile index 6207088397a7..946c09903711 100644 --- a/arch/arm64/lib/crypto/Makefile +++ b/arch/arm64/lib/crypto/Makefile @@ -8,10 +8,17 @@ poly1305-neon-y :=3D poly1305-core.o poly1305-glue.o AFLAGS_poly1305-core.o +=3D -Dpoly1305_init=3Dpoly1305_block_init_arch AFLAGS_poly1305-core.o +=3D -Dpoly1305_emit=3Dpoly1305_emit_arch =20 +obj-$(CONFIG_CRYPTO_SHA256_ARM64) +=3D sha256-arm64.o +sha256-arm64-y :=3D sha256.o sha256-core.o +sha256-arm64-$(CONFIG_KERNEL_MODE_NEON) +=3D sha256-ce.o + quiet_cmd_perlasm =3D PERLASM $@ cmd_perlasm =3D $(PERL) $(<) void $(@) =20 $(obj)/%-core.S: $(src)/%-armv8.pl $(call cmd,perlasm) =20 -clean-files +=3D poly1305-core.S +$(obj)/sha256-core.S: $(src)/sha2-armv8.pl + $(call cmd,perlasm) + +clean-files +=3D poly1305-core.S sha256-core.S diff --git a/arch/arm64/crypto/sha512-armv8.pl b/arch/arm64/lib/crypto/sha2= -armv8.pl similarity index 99% rename from arch/arm64/crypto/sha512-armv8.pl rename to arch/arm64/lib/crypto/sha2-armv8.pl index 35ec9ae99fe1..4aebd20c498b 100644 --- a/arch/arm64/crypto/sha512-armv8.pl +++ b/arch/arm64/lib/crypto/sha2-armv8.pl @@ 
diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/lib/crypto/sha256-ce.S
similarity index 80%
rename from arch/arm64/crypto/sha2-ce-core.S
rename to arch/arm64/lib/crypto/sha256-ce.S
index fce84d88ddb2..a8461d6dad63 100644
--- a/arch/arm64/crypto/sha2-ce-core.S
+++ b/arch/arm64/lib/crypto/sha256-ce.S
@@ -71,8 +71,8 @@
 	.word		0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
 
 	/*
-	 * int __sha256_ce_transform(struct sha256_ce_state *sst, u8 const *src,
-	 *			     int blocks)
+	 * size_t __sha256_ce_transform(u32 state[SHA256_STATE_WORDS],
+	 *				const u8 *data, size_t nblocks);
 	 */
 	.text
 SYM_FUNC_START(__sha256_ce_transform)
@@ -86,20 +86,16 @@ SYM_FUNC_START(__sha256_ce_transform)
 	/* load state */
 	ld1		{dgav.4s, dgbv.4s}, [x0]
 
-	/* load sha256_ce_state::finalize */
-	ldr_l		w4, sha256_ce_offsetof_finalize, x4
-	ldr		w4, [x0, x4]
-
 	/* load input */
 0:	ld1		{v16.4s-v19.4s}, [x1], #64
-	sub		w2, w2, #1
+	sub		x2, x2, #1
 
 CPU_LE(	rev32		v16.16b, v16.16b	)
 CPU_LE(	rev32		v17.16b, v17.16b	)
 CPU_LE(	rev32		v18.16b, v18.16b	)
 CPU_LE(	rev32		v19.16b, v19.16b	)
 
-1:	add		t0.4s, v16.4s, v0.4s
+	add		t0.4s, v16.4s, v0.4s
 	mov		dg0v.16b, dgav.16b
 	mov		dg1v.16b, dgbv.16b
 
@@ -128,30 +124,12 @@ CPU_LE(	rev32		v19.16b, v19.16b	)
 	add		dgbv.4s, dgbv.4s, dg1v.4s
 
 	/* handled all input blocks? */
-	cbz		w2, 2f
-	cond_yield	3f, x5, x6
+	cbz		x2, 1f
+	cond_yield	1f, x5, x6
 	b		0b
 
-	/*
-	 * Final block: add padding and total bit count.
-	 * Skip if the input size was not a round multiple of the block size,
-	 * the padding is handled by the C code in that case.
-	 */
-2:	cbz		x4, 3f
-	ldr_l		w4, sha256_ce_offsetof_count, x4
-	ldr		x4, [x0, x4]
-	movi		v17.2d, #0
-	mov		x8, #0x80000000
-	movi		v18.2d, #0
-	ror		x7, x4, #29		// ror(lsl(x4, 3), 32)
-	fmov		d16, x8
-	mov		x4, #0
-	mov		v19.d[0], xzr
-	mov		v19.d[1], x7
-	b		1b
-
 	/* store new state */
-3:	st1		{dgav.4s, dgbv.4s}, [x0]
-	mov		w0, w2
+1:	st1		{dgav.4s, dgbv.4s}, [x0]
+	mov		x0, x2
 	ret
 SYM_FUNC_END(__sha256_ce_transform)
diff --git a/arch/arm64/lib/crypto/sha256.c b/arch/arm64/lib/crypto/sha256.c
new file mode 100644
index 000000000000..fdceb2d0899c
--- /dev/null
+++ b/arch/arm64/lib/crypto/sha256.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * SHA-256 optimized for ARM64
+ *
+ * Copyright 2025 Google LLC
+ */
+#include <asm/neon.h>
+#include <crypto/internal/sha2.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+asmlinkage void sha256_blocks_arch(u32 state[SHA256_STATE_WORDS],
+				   const u8 *data, size_t nblocks);
+EXPORT_SYMBOL_GPL(sha256_blocks_arch);
+asmlinkage void sha256_block_neon(u32 state[SHA256_STATE_WORDS],
+				  const u8 *data, size_t nblocks);
+asmlinkage size_t __sha256_ce_transform(u32 state[SHA256_STATE_WORDS],
+					const u8 *data, size_t nblocks);
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon);
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_ce);
+
+void sha256_blocks_simd(u32 state[SHA256_STATE_WORDS],
+			const u8 *data, size_t nblocks)
+{
+	if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
+	    static_branch_likely(&have_neon)) {
+		if (static_branch_likely(&have_ce)) {
+			do {
+				size_t rem;
+
+				kernel_neon_begin();
+				rem = __sha256_ce_transform(state,
+							    data, nblocks);
+				kernel_neon_end();
+				data += (nblocks - rem) * SHA256_BLOCK_SIZE;
+				nblocks = rem;
+			} while (nblocks);
+		} else {
+			kernel_neon_begin();
+			sha256_block_neon(state, data, nblocks);
+			kernel_neon_end();
+		}
+	} else {
+		sha256_blocks_arch(state, data, nblocks);
+	}
+}
+EXPORT_SYMBOL_GPL(sha256_blocks_simd);
+
+bool sha256_is_arch_optimized(void)
+{
+	/* We always can use at least the ARM64 scalar implementation. */
+	return true;
+}
+EXPORT_SYMBOL_GPL(sha256_is_arch_optimized);
+
+static int __init sha256_arm64_mod_init(void)
+{
+	if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
+	    cpu_have_named_feature(ASIMD)) {
+		static_branch_enable(&have_neon);
+		if (cpu_have_named_feature(SHA2))
+			static_branch_enable(&have_ce);
+	}
+	return 0;
+}
+arch_initcall(sha256_arm64_mod_init);
+
+static void __exit sha256_arm64_mod_exit(void)
+{
+}
+module_exit(sha256_arm64_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SHA-256 optimized for ARM64");
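
[Likewise not part of the change: the finalization dropped from the CE
assembly above is just the fixed FIPS 180 padding, which the library's
generic C code supplies before its final call into the blocks function.
Roughly, as a sketch rather than the library's exact code, using
sha256_blocks_arch() as added by this series; sha256_finup_pad() is a
made-up name:]

	#include <crypto/internal/sha2.h>
	#include <linux/string.h>
	#include <linux/unaligned.h>

	static void sha256_finup_pad(u32 state[SHA256_STATE_WORDS],
				     const u8 *partial, size_t len,
				     u64 total_bytes)
	{
		u8 block[SHA256_BLOCK_SIZE] = {};

		memcpy(block, partial, len);	/* len < SHA256_BLOCK_SIZE */
		block[len] = 0x80;		/* the mandatory 1 bit */
		if (len >= SHA256_BLOCK_SIZE - 8) {
			/* no room left for the length field */
			sha256_blocks_arch(state, block, 1);
			memset(block, 0, sizeof(block));
		}
		/* message length in bits, big endian, in the last 8 bytes */
		put_unaligned_be64(total_bytes * 8,
				   &block[SHA256_BLOCK_SIZE - 8]);
		sha256_blocks_arch(state, block, 1);
	}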
-- 
2.39.5