Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
handling") made ghash_finup() pass the wrong buffer to
ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
outputs when the message length isn't divisible by 16 bytes. Fix this.
(I didn't notice this earlier because this code is reached only on CPUs
that support NEON but not PMULL. I haven't yet found a way to get
qemu-system-aarch64 to emulate that configuration.)
Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.com/
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
If it's okay, I'd like to just take this via libcrypto-fixes.
arch/arm64/crypto/ghash-ce-glue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 7951557a285a..ef249d06c92c 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -131,11 +131,11 @@ static int ghash_finup(struct shash_desc *desc, const u8 *src,
 
 	if (len) {
 		u8 buf[GHASH_BLOCK_SIZE] = {};
 
 		memcpy(buf, src, len);
-		ghash_do_simd_update(1, ctx->digest, src, key, NULL,
+		ghash_do_simd_update(1, ctx->digest, buf, key, NULL,
 				     pmull_ghash_update_p8);
 		memzero_explicit(buf, sizeof(buf));
 	}
 	return ghash_export(desc, dst);
 }
base-commit: 7a3984bbd69055898add0fe22445f99435f33450
--
2.52.0
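[Editorial illustration, not part of the patch.] The bug class fixed above can be sketched in plain userspace C. The names toy_block_update() and toy_finup_partial() are hypothetical stand-ins, and the XOR "hash" is only a placeholder for the real GF(2^128) multiplication; the point is simply that a block function consuming a full 16 bytes has to be given the zero-padded copy, since the source buffer holds only len valid bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16 /* GHASH block size in bytes */

/*
 * Hypothetical stand-in for a block function that always consumes exactly
 * BLOCK_SIZE bytes. It only XORs the block into the digest; real GHASH
 * would also multiply by the hash key in GF(2^128).
 */
static void toy_block_update(uint8_t digest[BLOCK_SIZE], const uint8_t *block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		digest[i] ^= block[i];
}

/* Absorb a final partial block of len bytes, where 0 < len < BLOCK_SIZE. */
static void toy_finup_partial(uint8_t digest[BLOCK_SIZE], const uint8_t *src,
			      size_t len)
{
	uint8_t buf[BLOCK_SIZE] = { 0 };

	assert(len > 0 && len < BLOCK_SIZE);
	memcpy(buf, src, len);
	/*
	 * Pass buf, not src: only the first len bytes of src belong to the
	 * message, so handing src to the block function would absorb
	 * whatever bytes happen to follow it.
	 */
	toy_block_update(digest, buf);
	memset(buf, 0, sizeof(buf)); /* kernel code uses memzero_explicit() */
}
```

With this sketch, hashing a 3-byte message that is followed in memory by unrelated bytes leaves the digest depending only on those 3 bytes, which is the behaviour the one-line fix restores.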
On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
> Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> handling") made ghash_finup() pass the wrong buffer to
> ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> outputs when the message length isn't divisible by 16 bytes. Fix this.
I was hoping to not have to do a 'git bisect', but this is much better
:-D I can confirm that this patch fixes the error I was seeing, so
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
> (I didn't notice this earlier because this code is reached only on CPUs
> that support NEON but not PMULL. I haven't yet found a way to get
> qemu-system-aarch64 to emulate that configuration.)
https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can
emulate various Raspberry Pi models. I've only tested it with RPi 3B+
(bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models
would have this problem? Dunno if QEMU emulates that though.
Thanks for the quick fix!
Cheers,
Diederik
> Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
> Cc: stable@vger.kernel.org
> Reported-by: Diederik de Haas <diederik@cknow-tech.com>
> Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.com/
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> If it's okay, I'd like to just take this via libcrypto-fixes.
>
> arch/arm64/crypto/ghash-ce-glue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
> index 7951557a285a..ef249d06c92c 100644
> --- a/arch/arm64/crypto/ghash-ce-glue.c
> +++ b/arch/arm64/crypto/ghash-ce-glue.c
> @@ -131,11 +131,11 @@ static int ghash_finup(struct shash_desc *desc, const u8 *src,
>
>  	if (len) {
>  		u8 buf[GHASH_BLOCK_SIZE] = {};
>
>  		memcpy(buf, src, len);
> -		ghash_do_simd_update(1, ctx->digest, src, key, NULL,
> +		ghash_do_simd_update(1, ctx->digest, buf, key, NULL,
>  				     pmull_ghash_update_p8);
>  		memzero_explicit(buf, sizeof(buf));
>  	}
>  	return ghash_export(desc, dst);
>  }
>
> base-commit: 7a3984bbd69055898add0fe22445f99435f33450
On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
>
> On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
> > Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> > handling") made ghash_finup() pass the wrong buffer to
> > ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> > outputs when the message length isn't divisible by 16 bytes. Fix this.
>
> I was hoping to not have to do a 'git bisect', but this is much better
> :-D I can confirm that this patch fixes the error I was seeing, so
>
> Tested-by: Diederik de Haas <diederik@cknow-tech.com>
>
> > (I didn't notice this earlier because this code is reached only on CPUs
> > that support NEON but not PMULL. I haven't yet found a way to get
> > qemu-system-aarch64 to emulate that configuration.)
>
> https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can
> emulate various Raspberry Pi models. I've only tested it with RPi 3B+
> (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models
> would have this problem? Dunno if QEMU emulates that though.
>
All 64-bit RPi models except the RPi5 are affected by this, as those
do not implement the crypto extensions. So I would expect QEMU to do
the same.
It would be nice, though, if we could emulate this on the mach-virt
machine model too. It should be fairly trivial to do, so if there is
demand for this I can look into it.
On Wed, Dec 10, 2025 at 06:31:44PM +0900, Ard Biesheuvel wrote:
> On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
> >
> > On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
> > > Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> > > handling") made ghash_finup() pass the wrong buffer to
> > > ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> > > outputs when the message length isn't divisible by 16 bytes. Fix this.
> >
> > I was hoping to not have to do a 'git bisect', but this is much better
> > :-D I can confirm that this patch fixes the error I was seeing, so
> >
> > Tested-by: Diederik de Haas <diederik@cknow-tech.com>
> >
> > > (I didn't notice this earlier because this code is reached only on CPUs
> > > that support NEON but not PMULL. I haven't yet found a way to get
> > > qemu-system-aarch64 to emulate that configuration.)
> >
> > https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can
> > emulate various Raspberry Pi models. I've only tested it with RPi 3B+
> > (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models
> > would have this problem? Dunno if QEMU emulates that though.
> >
>
> All 64-bit RPi models except the RPi5 are affected by this, as those
> do not implement the crypto extensions. So I would expect QEMU to do
> the same.
>
> It would be nice, though, if we could emulate this on the mach-virt
> machine model too. It should be fairly trivial to do, so if there is
> demand for this I can look into it.
I'm definitely interested in it. I'm already testing multiple "-cpu"
options, and it's easy to add more.
With qemu-system-aarch64 I'm currently only using "-M virt", since the
other machine models I've tried don't boot with arm64 defconfig,
including "-M raspi3b" and "-M raspi4b".
There may be some tricks I'm missing. Regardless, expanding the
selection of available CPUs for "-M virt" would be helpful. Either by
adding "real" CPUs that have "interesting" combinations of features, or
by just allowing turning features off like
"-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can
already be turned off in that way, but not the ones relevant to us.)
- Eric
On Fri, 12 Dec 2025 at 06:40, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Dec 10, 2025 at 06:31:44PM +0900, Ard Biesheuvel wrote:
> > On Wed, 10 Dec 2025 at 18:22, Diederik de Haas <diederik@cknow-tech.com> wrote:
> > >
> > > On Tue Dec 9, 2025 at 11:34 PM CET, Eric Biggers wrote:
> > > > Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> > > > handling") made ghash_finup() pass the wrong buffer to
> > > > ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> > > > outputs when the message length isn't divisible by 16 bytes. Fix this.
> > >
> > > I was hoping to not have to do a 'git bisect', but this is much better
> > > :-D I can confirm that this patch fixes the error I was seeing, so
> > >
> > > Tested-by: Diederik de Haas <diederik@cknow-tech.com>
> > >
> > > > (I didn't notice this earlier because this code is reached only on CPUs
> > > > that support NEON but not PMULL. I haven't yet found a way to get
> > > > qemu-system-aarch64 to emulate that configuration.)
> > >
> > > https://www.qemu.org/docs/master/system/arm/raspi.html indicates it can
> > > emulate various Raspberry Pi models. I've only tested it with RPi 3B+
> > > (bc of its wifi+bt chip), but I wouldn't be surprised if all RPi models
> > > would have this problem? Dunno if QEMU emulates that though.
> > >
> >
> > All 64-bit RPi models except the RPi5 are affected by this, as those
> > do not implement the crypto extensions. So I would expect QEMU to do
> > the same.
> >
> > It would be nice, though, if we could emulate this on the mach-virt
> > machine model too. It should be fairly trivial to do, so if there is
> > demand for this I can look into it.
>
> I'm definitely interested in it. I'm already testing multiple "-cpu"
> options, and it's easy to add more.
>
> With qemu-system-aarch64 I'm currently only using "-M virt", since the
> other machine models I've tried don't boot with arm64 defconfig,
> including "-M raspi3b" and "-M raspi4b".
>
> There may be some tricks I'm missing. Regardless, expanding the
> selection of available CPUs for "-M virt" would be helpful. Either by
> adding "real" CPUs that have "interesting" combinations of features, or
> by just allowing turning features off like
> "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can
> already be turned off in that way, but not the ones relevant to us.)
>
There are some architectural rules around which combinations of crypto
extensions are permitted:
- PMULL implies AES, and there is no way for the ID registers to
describe a CPU that has PMULL but not AES
- SHA256 implies SHA1 (but the ID register fields are independent)
- SHA3 and SHA512 both imply SHA256+SHA1
- SVE versions are not allowed to be implemented unless the plain NEON
version is implemented as well
- FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
So it would be much easier, also in terms of future maintenance, to
have a simple 'crypto=off' setting that applies to all emulated CPU
models, given that disabling all crypto on any given compliant CPU
will never result in something that the architecture does not permit.
Would that work for you?
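[Editorial illustration, not part of the original message.] The rules listed above can be written down as a small validity check. This is a minimal sketch under stated assumptions: struct crypto_feats and both functions are hypothetical and do not correspond to QEMU's or the kernel's data structures, only one SVE variant is modelled, and the version-dependent meaning of FEAT_Crypto is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature flags, for illustration only. */
struct crypto_feats {
	bool aes, pmull, sha1, sha256, sha512, sha3;
	bool sve_aes; /* SVE variant of the AES instructions */
};

/* Returns true if the combination respects the rules listed above. */
static bool crypto_feats_valid(const struct crypto_feats *f)
{
	if (f->pmull && !f->aes)
		return false; /* PMULL implies AES */
	if (f->sha256 && !f->sha1)
		return false; /* SHA256 implies SHA1 */
	if ((f->sha3 || f->sha512) && !(f->sha256 && f->sha1))
		return false; /* SHA3 and SHA512 imply SHA256+SHA1 */
	if (f->sve_aes && !f->aes)
		return false; /* SVE variant needs the plain NEON variant */
	return true;
}

/* Model of a 'crypto=off' switch: clear everything at once. */
static void crypto_feats_disable_all(struct crypto_feats *f)
{
	*f = (struct crypto_feats){ 0 };
	/* Every rule has the form "X implies Y", so all-off is always valid. */
	assert(crypto_feats_valid(f));
}
```

Turning off individual flags can break an implication (for example clearing aes while pmull stays set), whereas clearing all of them cannot, which is the argument for a single switch.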
On Mon, Dec 15, 2025 at 04:54:34PM +0900, Ard Biesheuvel wrote:
> > > All 64-bit RPi models except the RPi5 are affected by this, as those
> > > do not implement the crypto extensions. So I would expect QEMU to do
> > > the same.
> > >
> > > It would be nice, though, if we could emulate this on the mach-virt
> > > machine model too. It should be fairly trivial to do, so if there is
> > > demand for this I can look into it.
> >
> > I'm definitely interested in it. I'm already testing multiple "-cpu"
> > options, and it's easy to add more.
> >
> > With qemu-system-aarch64 I'm currently only using "-M virt", since the
> > other machine models I've tried don't boot with arm64 defconfig,
> > including "-M raspi3b" and "-M raspi4b".
> >
> > There may be some tricks I'm missing. Regardless, expanding the
> > selection of available CPUs for "-M virt" would be helpful. Either by
> > adding "real" CPUs that have "interesting" combinations of features, or
> > by just allowing turning features off like
> > "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can
> > already be turned off in that way, but not the ones relevant to us.)
> >
>
> There are some architectural rules around which combinations of crypto
> extensions are permitted:
> - PMULL implies AES, and there is no way for the ID registers to
>   describe a CPU that has PMULL but not AES
> - SHA256 implies SHA1 (but the ID register fields are independent)
> - SHA3 and SHA512 both imply SHA256+SHA1
> - SVE versions are not allowed to be implemented unless the plain NEON
>   version is implemented as well
> - FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
>
> So it would be much easier, also in terms of future maintenance, to
> have a simple 'crypto=off' setting that applies to all emulated CPU
> models, given that disabling all crypto on any given compliant CPU
> will never result in something that the architecture does not permit.
>
> Would that work for you?

I thought it had been established that the "crypto" grouping of features
(as implemented by gcc and clang) doesn't reflect the actual hardware
feature fields and is misleading because additional crypto extensions
continue to be added.

I'm not sure that applies here, but just something to consider.

There's certainly no need to support emulating combinations of features
that no hardware actually implements. So yes, if that means "crypto" is
the right choice, that sounds fine.

- Eric
On Mon, 15 Dec 2025 at 21:16, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Mon, Dec 15, 2025 at 04:54:34PM +0900, Ard Biesheuvel wrote:
> > > > All 64-bit RPi models except the RPi5 are affected by this, as those
> > > > do not implement the crypto extensions. So I would expect QEMU to do
> > > > the same.
> > > >
> > > > It would be nice, though, if we could emulate this on the mach-virt
> > > > machine model too. It should be fairly trivial to do, so if there is
> > > > demand for this I can look into it.
> > >
> > > I'm definitely interested in it. I'm already testing multiple "-cpu"
> > > options, and it's easy to add more.
> > >
> > > With qemu-system-aarch64 I'm currently only using "-M virt", since the
> > > other machine models I've tried don't boot with arm64 defconfig,
> > > including "-M raspi3b" and "-M raspi4b".
> > >
> > > There may be some tricks I'm missing. Regardless, expanding the
> > > selection of available CPUs for "-M virt" would be helpful. Either by
> > > adding "real" CPUs that have "interesting" combinations of features, or
> > > by just allowing turning features off like
> > > "-cpu max,aes=off,pmull=off,sha256=off". (Certain features like sve can
> > > already be turned off in that way, but not the ones relevant to us.)
> > >
> >
> > There are some architectural rules around which combinations of crypto
> > extensions are permitted:
> > - PMULL implies AES, and there is no way for the ID registers to
> >   describe a CPU that has PMULL but not AES
> > - SHA256 implies SHA1 (but the ID register fields are independent)
> > - SHA3 and SHA512 both imply SHA256+SHA1
> > - SVE versions are not allowed to be implemented unless the plain NEON
> >   version is implemented as well
> > - FEAT_Crypto has different meanings for v8.0, v8.2 and v9.x
> >
> > So it would be much easier, also in terms of future maintenance, to
> > have a simple 'crypto=off' setting that applies to all emulated CPU
> > models, given that disabling all crypto on any given compliant CPU
> > will never result in something that the architecture does not permit.
> >
> > Would that work for you?
>
> I thought it had been established that the "crypto" grouping of features
> (as implemented by gcc and clang) doesn't reflect the actual hardware
> feature fields and is misleading because additional crypto extensions
> continue to be added.
>
> I'm not sure that applies here, but just something to consider.
>

You are right, this is why 'crypto=on' can never mean anything other
than 'do not disable the crypto extensions that this particular CPU
type provides'

But that does not mean 'crypto=off' is equally problematic.

> There's certainly no need to support emulating combinations of features
> that no hardware actually implements. So yes, if that means "crypto" is
> the right choice, that sounds fine.
>

OK, I'll have a stab at that and cc you on the patches.
On Tue, Dec 09, 2025 at 02:34:17PM -0800, Eric Biggers wrote:
> Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> handling") made ghash_finup() pass the wrong buffer to
> ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> outputs when the message length isn't divisible by 16 bytes. Fix this.
>
> (I didn't notice this earlier because this code is reached only on CPUs
> that support NEON but not PMULL. I haven't yet found a way to get
> qemu-system-aarch64 to emulate that configuration.)
>
> Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
> Cc: stable@vger.kernel.org
> Reported-by: Diederik de Haas <diederik@cknow-tech.com>
> Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.com/
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> If it's okay, I'd like to just take this via libcrypto-fixes.
>
> arch/arm64/crypto/ghash-ce-glue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Applied to https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/log/?h=libcrypto-fixes
(As always, additional reviews/acks still appreciated!)
- Eric
On Tue, Dec 09, 2025 at 02:34:17PM -0800, Eric Biggers wrote:
> Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block
> handling") made ghash_finup() pass the wrong buffer to
> ghash_do_simd_update(). As a result, ghash-neon now produces incorrect
> outputs when the message length isn't divisible by 16 bytes. Fix this.
>
> (I didn't notice this earlier because this code is reached only on CPUs
> that support NEON but not PMULL. I haven't yet found a way to get
> qemu-system-aarch64 to emulate that configuration.)
>
> Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling")
> Cc: stable@vger.kernel.org
> Reported-by: Diederik de Haas <diederik@cknow-tech.com>
> Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.com/
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> If it's okay, I'd like to just take this via libcrypto-fixes.
>
> arch/arm64/crypto/ghash-ce-glue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks for catching this!
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt