Hi Eric, Herbert,
Here's a set of patches that does the following:
(1) Renames s390 and arm64 sha3_* functions to avoid name collisions.
(2) Copies the core of SHA3 support from crypto/ to lib/crypto/.
(3) Simplifies the internal code to maintain the buffer in little endian
form, thereby simplifying the update and extraction code which don't
then need to worry about this. Instead, the state buffer is
byteswapped before and after.
(4) Moves the Iota transform into the function with the rest of the
transforms.
(5) Adds SHAKE128 and SHAKE256 support (needed for ML-DSA).
(6) Adds a kunit test for SHA3 in lib/crypto/tests/.
(7) Adds proper API documentation for SHA3.
(8) Makes crypto/sha3_generic.c use lib/crypto/sha3. This necessitates a
slight enlargement of the context buffers which might affect optimised
assembly/hardware drivers.
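To illustrate item (3), here is a minimal userspace sketch of keeping the state image in little-endian form and converting only around the permutation. It is not the code from the patch: the helper names (load_le64(), store_le64(), absorb_bytes(), permute_state()) are made up for illustration, and demo_permute() is a placeholder that is not Keccak-f[1600].

```c
#include <stddef.h>
#include <stdint.h>

#define SHA3_STATE_BYTES 200	/* 25 lanes of 64 bits */

/* Portable little-endian load of one 64-bit lane. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Portable little-endian store of one 64-bit lane. */
static void store_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++, v >>= 8)
		p[i] = (uint8_t)v;
}

/*
 * Absorb: the state image is always little endian, so input bytes are
 * XORed in at a plain byte offset with no per-lane swapping.  The caller
 * must keep offset + len within SHA3_STATE_BYTES.
 */
static void absorb_bytes(uint8_t state[SHA3_STATE_BYTES], size_t offset,
			 const uint8_t *data, size_t len)
{
	for (size_t i = 0; i < len; i++)
		state[offset + i] ^= data[i];
}

/* PLACEHOLDER standing in for Keccak-f[1600]; not the real permutation. */
static void demo_permute(uint64_t lanes[25])
{
	for (int i = 0; i < 25; i++)
		lanes[i] += 1;
}

/* Convert to native lanes only around the permutation, then convert back. */
static void permute_state(uint8_t state[SHA3_STATE_BYTES])
{
	uint64_t lanes[25];

	for (int i = 0; i < 25; i++)
		lanes[i] = load_le64(state + 8 * i);
	demo_permute(lanes);
	for (int i = 0; i < 25; i++)
		store_le64(state + 8 * i, lanes[i]);
}
```

On a little-endian CPU the two conversion loops amount to plain copies, which is why the absorb and extraction paths can stay byte-oriented.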
Note that only the generic code is moved across; the asm-optimised stuff is
not touched as I'm not familiar with that.
I have done what Eric required and made a separate wrapper struct and set
of wrapper functions for each algorithm, though I think this is excessively
bureaucratic as this multiplies the API load by 7 (and maybe 9 in the
future[*]).
[*] The Kyber algorithm also uses CSHAKE variants in the SHA3 family - and
NIST mentions some other variants too.
This does, however, cause a problem for what I need to do as the ML-DSA
prehash is dynamically selectable by certificate OID, so I have to add
SHAKE128/256 support to the crypto shash API too - though hopefully it will
only require an output of 16 or 32 bytes respectively for the prehash case
and won't require multiple squeezing.
This is based on Eric's libcrypto-next branch.
The patches can also be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=keys-pqc
David
Changes
=======
ver #3)
- Renamed conflicting arm64 functions.
- Made a separate wrapper API for each algorithm in the family.
- Removed sha3_init(), sha3_reinit() and sha3_final().
- Removed sha3_ctx::digest_size.
- Renamed sha3_ctx::partial to sha3_ctx::absorb_offset.
- Refer to the output of SHAKE* as "output" not "digest".
- Moved the Iota transform into the one-round function.
- Made sha3_update() warn if called after sha3_squeeze().
- Simplified the module-load test to not do update after squeeze.
- Added Return: and Context: kdoc statements and expanded the kdoc
headers.
- Added an API description document.
- Overhauled the kunit tests.
- Only have one kunit test.
- Only call the general hash tester on one algo.
- Add separate simple cursory checks for the other algos.
- Add resqueezing tests.
- Add some NIST example tests.
- Changed crypto/sha3_generic to use this
- Added SHAKE128/256 to crypto/sha3_generic and crypto/testmgr
- Folded struct sha3_state into struct sha3_ctx.
ver #2)
- Simplify the endianness handling.
- Rename sha3_final() to sha3_squeeze() and don't clear the context at the
end as it's permitted to continue calling sha3_final() to extract
continuations of the digest (needed by ML-DSA).
- Don't reapply the end marker to the hash state in continuation
sha3_squeeze() unless sha3_update() gets called again (needed by
ML-DSA).
- Give sha3_squeeze() the amount of digest to produce as a parameter
rather than using ctx->digest_size and don't return the amount digested.
- Reimplement sha3_final() as a wrapper around sha3_squeeze() that
extracts ctx->digest_size amount of digest and then zeroes out the
context. The latter is necessary to avoid upsetting
hash-test-template.h.
- Provide a sha3_reinit() function to clear the state, but to leave the
parameters that indicate the hash properties unaffected, allowing for
reuse.
- Provide a sha3_set_digestsize() function to change the size of the
digest to be extracted by sha3_final(). sha3_squeeze() takes a
parameter for this instead.
- Don't pass the digest size as a parameter to shake128/256_init() but
rather default to 128/256 bits as per the function name.
- Provide a sha3_clear() function to zero out the context.
David Howells (8):
s390/sha3: Rename conflicting functions
arm64/sha3: Rename conflicting functions
lib/crypto: Add SHA3-224, SHA3-256, SHA3-384, SHA3-512, SHAKE128,
SHAKE256
lib/crypto: Move the SHA3 Iota transform into the single round
function
lib/crypto: Add SHA3 kunit tests
crypto/sha3: Use lib/crypto/sha3
crypto/sha3: Add SHAKE128/256 support
crypto: SHAKE tests
Documentation/crypto/index.rst | 1 +
Documentation/crypto/sha3.rst | 241 +++++++++++++
arch/arm64/crypto/sha3-ce-glue.c | 47 +--
arch/s390/crypto/sha3_256_s390.c | 26 +-
arch/s390/crypto/sha3_512_s390.c | 26 +-
crypto/sha3_generic.c | 233 +++---------
crypto/testmgr.c | 14 +
crypto/testmgr.h | 59 ++++
include/crypto/sha3.h | 467 +++++++++++++++++++++++-
lib/crypto/Kconfig | 7 +
lib/crypto/Makefile | 6 +
lib/crypto/sha3.c | 529 ++++++++++++++++++++++++++++
lib/crypto/tests/Kconfig | 12 +
lib/crypto/tests/Makefile | 1 +
lib/crypto/tests/sha3_kunit.c | 338 ++++++++++++++++++
lib/crypto/tests/sha3_testvecs.h | 231 ++++++++++++
scripts/crypto/gen-hash-testvecs.py | 8 +-
17 files changed, 2012 insertions(+), 234 deletions(-)
create mode 100644 Documentation/crypto/sha3.rst
create mode 100644 lib/crypto/sha3.c
create mode 100644 lib/crypto/tests/sha3_kunit.c
create mode 100644 lib/crypto/tests/sha3_testvecs.h
Hi David,
On Fri, Sep 26, 2025 at 03:19:43PM +0100, David Howells wrote:
> I have done what Eric required and made a separate wrapper struct and set
> of wrapper functions for each algorithm, though I think this is excessively
> bureaucratic as this multiplies the API load by 7 (and maybe 9 in the
> future[*]).
I don't think I "required" that it be implemented in exactly this way.
Sorry if I wasn't clear. Let me quote what I wrote:
First, this patch's proposed API is error-prone due to the weak
typing that allows mixing steps of different algorithms together.
For example, users could initialize a sha3_ctx with sha3_256_init()
and then squeeze an arbitrary amount from it, incorrectly treating
it as a XOF. It would be worth considering separating the APIs for
the different algorithms that are part of SHA-3, similar to what I
did with SHA-224 and SHA-256. (They would of course still share
code internally, just like SHA-2.)
So I asked that to prevent usage errors such as treating a digest as a
XOF, you consider separating the APIs. There is more than one way to do
that, and I was hoping that you'd consider different ways. One way is
separate functions and types for all six SHA-3 algorithms.
However, if that is not scaling well, then we could instead just
separate the SHA-3 algorithms into two groups, the digests and the XOFs:
void sha3_224_init(struct sha3_ctx *ctx);
void sha3_256_init(struct sha3_ctx *ctx);
void sha3_384_init(struct sha3_ctx *ctx);
void sha3_512_init(struct sha3_ctx *ctx);
void sha3_update(struct sha3_ctx *ctx, const u8 *data, size_t data_len);
void sha3_final(struct sha3_ctx *ctx, u8 *out);
void shake128_init(struct shake_ctx *ctx);
void shake256_init(struct shake_ctx *ctx);
void shake_update(struct shake_ctx *ctx, const u8 *data, size_t data_len);
void shake_squeeze(struct shake_ctx *ctx, u8 *out, size_t out_len);
void shake_clear(struct shake_ctx *ctx);
(With "sha3_ctx" being used for the digests specifically, the internal
context struct would then have to have a third name, like "__sha3_ctx".)
The *_init() functions would store the correct information in the
context so that the other functions would know what to do. This would
be similar to how blake2s_init() saves the 'outlen' for blake2s_final().
That would be sufficient to prevent misuse errors where steps of
different algorithms are mixed together, right?
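As a rough userspace sketch of the two-type split described above (an illustration only, not a proposed implementation): both public types wrap one shared internal context, and each *_init() records the rate and digest size. The name sha3_core_ctx stands in for the "__sha3_ctx" idea, and core_update() and core_extract() are placeholders that do not run Keccak.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shared internals (stand-in for the "__sha3_ctx" idea). */
struct sha3_core_ctx {
	uint8_t state[200];
	size_t block_size;	/* rate in bytes */
	size_t absorb_offset;
	size_t digest_size;	/* recorded by *_init(); 0 for XOFs */
};

struct sha3_ctx  { struct sha3_core_ctx core; };	/* digests only */
struct shake_ctx { struct sha3_core_ctx core; };	/* XOFs only */

static void core_init(struct sha3_core_ctx *c, size_t block_size,
		      size_t digest_size)
{
	memset(c, 0, sizeof(*c));
	c->block_size = block_size;
	c->digest_size = digest_size;
}

/* PLACEHOLDER absorb: XORs bytes in, wrapping at the rate; no permutation. */
static void core_update(struct sha3_core_ctx *c, const uint8_t *data,
			size_t len)
{
	for (size_t i = 0; i < len; i++) {
		c->state[c->absorb_offset] ^= data[i];
		c->absorb_offset = (c->absorb_offset + 1) % c->block_size;
	}
}

/* PLACEHOLDER extraction: copies state bytes, wrapping at the rate. */
static void core_extract(const struct sha3_core_ctx *c, uint8_t *out,
			 size_t len)
{
	for (size_t i = 0; i < len; i++)
		out[i] = c->state[i % c->block_size];
}

static void sha3_256_init(struct sha3_ctx *ctx) { core_init(&ctx->core, 136, 32); }
static void sha3_512_init(struct sha3_ctx *ctx) { core_init(&ctx->core, 72, 64); }
static void shake128_init(struct shake_ctx *ctx) { core_init(&ctx->core, 168, 0); }

static void sha3_update(struct sha3_ctx *ctx, const uint8_t *data, size_t len)
{
	core_update(&ctx->core, data, len);
}

/* Output length comes from the init call, similar to blake2s_final(). */
static void sha3_final(struct sha3_ctx *ctx, uint8_t *out)
{
	core_extract(&ctx->core, out, ctx->core.digest_size);
	memset(ctx, 0, sizeof(*ctx));
}

static void shake_update(struct shake_ctx *ctx, const uint8_t *data, size_t len)
{
	core_update(&ctx->core, data, len);
}

static void shake_squeeze(struct shake_ctx *ctx, uint8_t *out, size_t out_len)
{
	core_extract(&ctx->core, out, out_len);
}
```

With this shape, passing a struct sha3_ctx * to shake_squeeze() is an incompatible-pointer-type diagnostic at compile time, while sha3_final() needs no length argument because the init function already chose it.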
Keep in mind that for SHA-2 we have to have completely different code
and underlying state for the 32-bit hashes (SHA-224 and SHA-256) and
64-bit hashes (SHA-384 and SHA-512) anyway. We also traditionally
haven't kept any information in the SHA-2 context about which SHA-2
algorithm is being executed. So that led us more down the road of the
separate functions and types for each SHA-2 algorithm. With SHA-3,
where e.g. the 224, 256, 384, and 512-bit digests all use the same
underlying state, a slightly more unified API might be appropriate.
All I'm really requesting is that we don't create footguns, like the
following that the API in the v2 patch permitted:
1. sha3_init() + sha3_update()
[infinite loop]
2. sha3_256_init() + sha3_update() + sha3_squeeze()
[not valid, treats SHA3-256 as a XOF]
3. sha3_update() + sha3_squeeze() + sha3_update() + sha3_squeeze()
[not valid, as discussed]
(1) is prevented just by not having the internal function sha3_init() as
a public function.
Splitting the context into two types, one for the digests and one for
the XOFs, is sufficient to prevent (2), as long as there's still one
init function per algorithm. We don't necessarily need six types.
(3) isn't preventable via the type system, but it's detectable by a
run-time check, which you've done by adding a WARN_ON_ONCE() to
sha3_update().
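For reference, the kind of run-time check meant in (3) could look roughly like the following userspace sketch. The demo_* names are hypothetical, demo_warn_on_once() only imitates the kernel's WARN_ON_ONCE() for a single call site, and ignoring the data after warning is an assumption of this sketch rather than something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct demo_shake_ctx {
	bool squeezing;		/* set by the first squeeze */
	size_t absorbed;	/* bytes accepted so far (bookkeeping only) */
};

static unsigned int demo_warnings;	/* how many warnings were printed */

/* Userspace stand-in for WARN_ON_ONCE(): print only on the first hit. */
static bool demo_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		demo_warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns false if the update was rejected because squeezing had begun. */
static bool demo_shake_update(struct demo_shake_ctx *ctx, size_t len)
{
	if (demo_warn_on_once(ctx->squeezing, "update after squeeze"))
		return false;	/* assumption: the data is ignored */
	ctx->absorbed += len;
	return true;
}

/* The first squeeze flips the context into output-only mode. */
static void demo_shake_squeeze(struct demo_shake_ctx *ctx)
{
	ctx->squeezing = true;
}
```

The check costs one flag in the context and one branch per update, and repeated misuse produces a single warning rather than a flood.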
So, I think we'd be in a good position with just the digests and XOFs
separated out into different functions + types.
> This does, however, cause a problem for what I need to do as the ML-DSA
> prehash is dynamically selectable by certificate OID, so I have to add
> SHAKE128/256 support to the crypto shash API too - though hopefully it will
> only require an output of 16 or 32 bytes respectively for the prehash case
> and won't require multiple squeezing.
When there's only a small number of supported algorithms, just doing the
dispatch in the calling code tends to be simpler than using
crypto_shash. For example, see the recent conversion of fs/verity/ to
use the SHA-2 library API instead of crypto_shash.
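A minimal sketch of what "dispatch in the calling code" can mean, assuming a small fixed set of prehash algorithms. Everything named demo_* is hypothetical, the three hash functions are placeholders that only fill the buffer with a tag byte, and the 16- and 32-byte SHAKE output lengths are simply the figures quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_hash_algo { DEMO_SHA3_256, DEMO_SHAKE128, DEMO_SHAKE256 };

/* PLACEHOLDERS standing in for library one-shot calls; they do not hash. */
static void demo_sha3_256(const uint8_t *data, size_t len, uint8_t *out)
{
	(void)data; (void)len;
	memset(out, 0x01, 32);
}

static void demo_shake128(const uint8_t *data, size_t len,
			  uint8_t *out, size_t out_len)
{
	(void)data; (void)len;
	memset(out, 0x02, out_len);
}

static void demo_shake256(const uint8_t *data, size_t len,
			  uint8_t *out, size_t out_len)
{
	(void)data; (void)len;
	memset(out, 0x03, out_len);
}

/*
 * Pick the hash with a plain switch in the caller.  Returns the number of
 * bytes written, or 0 if the algorithm is unknown or 'out' is too small.
 */
static size_t demo_prehash(enum demo_hash_algo algo, const uint8_t *data,
			   size_t len, uint8_t *out, size_t out_size)
{
	switch (algo) {
	case DEMO_SHA3_256:
		if (out_size < 32)
			return 0;
		demo_sha3_256(data, len, out);
		return 32;
	case DEMO_SHAKE128:
		if (out_size < 16)
			return 0;
		demo_shake128(data, len, out, 16);
		return 16;
	case DEMO_SHAKE256:
		if (out_size < 32)
			return 0;
		demo_shake256(data, len, out, 32);
		return 32;
	}
	return 0;
}
```

The caller maps whatever identifies the algorithm (for example a parsed OID) to the enum once, and no transform allocation or algorithm-name lookup is involved.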
- Eric
Hi David,
On Fri, Sep 26, 2025 at 12:59:58PM -0700, Eric Biggers wrote:
> Hi David,
>
> On Fri, Sep 26, 2025 at 03:19:43PM +0100, David Howells wrote:
> > I have done what Eric required and made a separate wrapper struct and set
> > of wrapper functions for each algorithm, though I think this is excessively
> > bureaucratic as this multiplies the API load by 7 (and maybe 9 in the
> > future[*]).
>
> I don't think I "required" that it be implemented in exactly this way.
> Sorry if I wasn't clear. Let me quote what I wrote:
Have you had a chance to read this reply? In the v4 patchset I don't see
any evidence that you read this reply. And you didn't respond to it
either.
- Eric
Eric Biggers <ebiggers@kernel.org> wrote:
> Have you had a chance to read this reply?
I have.
You held up your implementation of sha256 and sha224 as an example of how it
perhaps should be implemented:
    It would be worth considering separating the APIs for the different
    algorithms that are part of SHA-3, similar to what I did with SHA-224
    and SHA-256.
so I have followed that. That defines a type for each, so I'll leave it at
that.
> All I'm really requesting is that we don't create footguns, like the
> following that the API in the v2 patch permitted:
The way you did a separate type for each removed one more footgun - and
arguably a more important one - as the *type* enforces[1] the output buffer
size and the sha3_*_final() function has the same sized-array as the
convenience wrappers.
It also eliminates the need to store the digest size as this is only needed at
the final step for the digest variant algorithms.
David
[1] Inasmuch as this is effective in C.
On Thu, Oct 02, 2025 at 02:14:44PM +0100, David Howells wrote:
> Eric Biggers <ebiggers@kernel.org> wrote:
>
> > Have you had a chance to read this reply?
>
> I have.
>
> You held up your implementation of sha256 and sha224 as an example of how it
> perhaps should be implemented:
>
> It would be worth considering separating the APIs for the different
> algorithms that are part of SHA-3, similar to what I did with SHA-224
> and SHA-256.
>
> so I have followed that. That defines a type for each, so I'll leave it at
> that.
In v3, you were pretty clear that you don't like the six-types solution:
I have done what Eric required and made a separate wrapper struct
and set of wrapper functions for each algorithm, though I think this
is excessively bureaucratic as this multiplies the API load by 7
(and maybe 9 in the future[*]).
[*] The Kyber algorithm also uses CSHAKE variants in the SHA3 family - and
NIST mentions some other variants too.
This does, however, cause a problem for what I need to do as the
ML-DSA prehash is dynamically selectable by certificate OID, so I
have to add SHAKE128/256 support to the crypto shash API too -
though hopefully it will only require an output of 16 or 32 bytes
respectively for the prehash case and won't require multiple
squeezing.
So, I listened. And I realized that we can address my concern about the
digest vs. XOF confusion using just two types.
And I explained I didn't intend to require that we fully split the API
into all six variants, and apologized for not being clear.
Remember, we haven't done a SHA-3 library API before. We're both
learning as we go...
If you've now changed your mind and strongly prefer six types, I can
tolerate that too. But I want to make sure you actually want that, and
aren't choosing it just to try to prove a point or something.
> > All I'm really requesting is that we don't create footguns, like the
> > following that the API in the v2 patch permitted:
>
> The way you did a separate type for each removed one more footgun - and
> arguably a more important one - as the *type* enforces[1] the output buffer
> size and the sha3_*_final() function has the same sized-array as the
> convenience wrappers.
>
> It also eliminates the need to store the digest size as this is only needed at
> the final step for the digest variant algorithms.
>
> David
>
> [1] Inasmuch as this is effective in C.
Well, that "Inasmuch as this is effective in C" disclaimer is really
important, because it means it doesn't actually work properly.
Effectively, array bounds in function parameters are for humans, not the
compiler.
Which is still useful, but probably not to the extent that we should
have all those extra functions and types, vs. just documenting that
sha3_final() produces output of length matching the init function that
was called. (As I mentioned, the BLAKE2s API does something similar.)
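The point about array bounds in function parameters can be checked with a small standalone C11 example. The demo_* functions are hypothetical and only fill the buffer with a marker byte; the sketch shows that the declared bound is not part of the function's type, because the parameter is adjusted to a plain pointer.

```c
#include <stdint.h>
#include <string.h>

/* The type a "final" function has once its array parameter is adjusted. */
typedef void (*demo_final_fn)(uint8_t *out);

/* Declared with different bounds, but both really take a uint8_t pointer. */
static void demo_final_32(uint8_t out[32])
{
	memset(out, 0x32, 32);
}

static void demo_final_64(uint8_t out[64])
{
	memset(out, 0x64, 64);
}

/*
 * Returns 1 if both functions have exactly the pointer-taking type, which
 * means the [32] and [64] bounds did not survive into the type system.
 */
static int demo_bounds_are_erased(void)
{
	return _Generic(&demo_final_32, demo_final_fn: 1, default: 0) &&
	       _Generic(&demo_final_64, demo_final_fn: 1, default: 0);
}
```

Both functions can be assigned to the same demo_final_fn pointer without a cast. Some compilers do warn when they can see an obviously undersized buffer at a call site, but that is a diagnostic heuristic rather than a type-level guarantee.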
- Eric
Eric Biggers <ebiggers@kernel.org> wrote:
> If you've now changed your mind and strongly prefer six types, I can
> tolerate that too.
I'll stick with it for the moment. It does have the aforementioned minor
advantage of having the output buffer sizes encoded in the final functions.
Hopefully, there won't be so many places that actually #include it. Further,
this is something that can probably be changed relatively easily later.
Since the merge window was still open and much flux happening upstream, I
decided to press ahead with stripping down the ML-DSA stuff and leave
reissuing the patches till after -rc1, so that I could be more sure of what I
actually needed for that.
I have ML-DSA working as far as being able to load keys and check signatures
in the kernel - but hit a minor bump of openssl not apparently being able to
actually generate CMS signatures for it:-/. It seems the standard is not
settled quite yet...
I have them rebased and will repost them hopefully today with the ML-DSA
patches, such as they are, attached for reference.
David
David Howells <dhowells@redhat.com> wrote:
> I have ML-DSA working as far as being able to load keys and check signatures
> in the kernel - but hit a minor bump of openssl not apparently being able to
> actually generate CMS signatures for it:-/. It seems the standard is not
> settled quite yet...
Actually, openssl CMS can generate ML-DSA signatures, but only if CMS_NOATTR
is not specified.
David
On Thursday, 16 October 2025, 16:34:11 Central European Summer Time, David Howells wrote:
Hi David,
> David Howells <dhowells@redhat.com> wrote:
> > I have ML-DSA working as far as being able to load keys and check
> > signatures in the kernel - but hit a minor bump of openssl not apparently
> > being able to actually generate CMS signatures for it:-/. It seems the
> > standard is not settled quite yet...
>
> Actually, openssl CMS can generate ML-DSA signatures, but only if CMS_NOATTR
> is not specified.
If you want to test it, perhaps the lc_x509_generator or lc_pkcs7_generator
from leancrypto could be used [1].
[1] https://leancrypto.org/leancrypto/x509_support/index.html
>
> David
Ciao
Stephan
Hi David,
On Thu, Oct 02, 2025 at 09:27:05AM -0700, Eric Biggers wrote:
> On Thu, Oct 02, 2025 at 02:14:44PM +0100, David Howells wrote:
> > Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > > Have you had a chance to read this reply?
> >
> > I have.
> >
> > You held up your implementation of sha256 and sha224 as an example of how it
> > perhaps should be implemented:
> >
> > It would be worth considering separating the APIs for the different
> > algorithms that are part of SHA-3, similar to what I did with SHA-224
> > and SHA-256.
> >
> > so I have followed that. That defines a type for each, so I'll leave it at
> > that.
>
> In v3, you were pretty clear that you don't like the six-types solution:
Let me know if you have a new version of this patchset planned. Otherwise
I'll work on cleaning it up and finishing the remaining parts, like
incorporating the arch-optimized SHA-3 code.
- Eric