[PATCH v2 00/10] crypto: sun8i-ce - implement request batching

Ovidiu Panait posted 10 patches 3 months, 2 weeks ago
[PATCH v2 00/10] crypto: sun8i-ce - implement request batching
Posted by Ovidiu Panait 3 months, 2 weeks ago
The Allwinner crypto engine can process multiple requests at a time,
if they are chained together using the task descriptor's 'next' field.
Having multiple requests processed in one go can reduce the number
of interrupts generated and also improve throughput.
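
To illustrate the chaining idea, here is a minimal userspace sketch. It is
not the driver code: struct demo_task and both helpers are hypothetical
names, and a plain C pointer stands in for the descriptor's 'next' field
(on real hardware that field would presumably carry a bus address that the
engine follows by itself). The point is only that N linked descriptors can
be handed over in one submission and completed together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor; NOT the real sun8i-ce layout. */
struct demo_task {
	uint32_t id;            /* request identifier, illustration only */
	uint32_t len;           /* payload length in bytes */
	struct demo_task *next; /* stand-in for the hardware 'next' field */
};

/* Link n descriptors into one chain; returns the head, or NULL if n == 0. */
static struct demo_task *demo_chain_tasks(struct demo_task *tasks, size_t n)
{
	size_t i;

	if (n == 0)
		return NULL;

	for (i = 0; i + 1 < n; i++)
		tasks[i].next = &tasks[i + 1];
	tasks[n - 1].next = NULL; /* last descriptor terminates the chain */

	return &tasks[0];
}

/*
 * Stand-in for the engine: walk the whole chain in one pass, as if it had
 * been submitted once. Returns the number of descriptors visited and
 * stores the summed payload length in *total_len.
 */
static size_t demo_process_chain(const struct demo_task *head,
				 uint64_t *total_len)
{
	size_t count = 0;

	assert(total_len != NULL);
	*total_len = 0;

	for (; head; head = head->next) {
		*total_len += head->len;
		count++;
	}

	return count;
}
```

In this sketch, chaining three descriptors and calling demo_process_chain()
once on the head visits all three, which models one submission and one
completion in place of three.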

When compared to the existing non-batching implementation, the tcrypt
multibuffer benchmark shows an increase in throughput of ~85% for 16 byte
AES blocks (when testing with 8 data streams on the OrangePi Zero2 board).

Patches 1-9 perform refactoring work on the existing do_one_request()
callbacks, to make them more modular and easier to integrate with the
request batching workflow.

Patch 10 implements the actual request batching.

Changes in v2:
   - fixed [-Wformat-truncation=] warning reported by kernel test robot


Ovidiu Panait (10):
  crypto: sun8i-ce - remove channel timeout field
  crypto: sun8i-ce - remove boilerplate in sun8i_ce_hash_digest()
  crypto: sun8i-ce - move bounce_iv and backup_iv to request context
  crypto: sun8i-ce - save hash buffers and dma info to request context
  crypto: sun8i-ce - factor out prepare/unprepare code from ahash
    do_one_request
  crypto: sun8i-ce - fold sun8i_ce_cipher_run() into
    sun8i_ce_cipher_do_one()
  crypto: sun8i-ce - pass task descriptor to cipher prepare/unprepare
  crypto: sun8i-ce - factor out public versions of finalize request
  crypto: sun8i-ce - add a new function for dumping task descriptors
  crypto: sun8i-ce - implement request batching

 .../allwinner/sun8i-ce/sun8i-ce-cipher.c      |  90 +++++------
 .../crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 152 ++++++++++++++----
 .../crypto/allwinner/sun8i-ce/sun8i-ce-hash.c | 140 +++++++++-------
 .../crypto/allwinner/sun8i-ce/sun8i-ce-prng.c |   1 -
 .../crypto/allwinner/sun8i-ce/sun8i-ce-trng.c |   1 -
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h  |  84 +++++++++-
 6 files changed, 327 insertions(+), 141 deletions(-)

-- 
2.49.0
Re: [PATCH v2 00/10] crypto: sun8i-ce - implement request batching
Posted by Herbert Xu 3 months ago
On Thu, Jun 26, 2025 at 12:58:03PM +0300, Ovidiu Panait wrote:
> The Allwinner crypto engine can process multiple requests at a time,
> if they are chained together using the task descriptor's 'next' field.
> Having multiple requests processed in one go can reduce the number
> of interrupts generated and also improve throughput.

I think we should phase out the batching code in crypto_engine
as it doesn't really work that well.

Instead of doing batching based on backlog, we should be letting
the user push this.  For example, IPsec can hook into GSO and get
64K of data each time.  Similarly for block encryption, unit sizes
can be much greater than 4K.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 00/10] crypto: sun8i-ce - implement request batching
Posted by Corentin Labbe 3 months, 1 week ago
On Thu, Jun 26, 2025 at 12:58:03PM +0300, Ovidiu Panait wrote:
> The Allwinner crypto engine can process multiple requests at a time,
> if they are chained together using the task descriptor's 'next' field.
> Having multiple requests processed in one go can reduce the number
> of interrupts generated and also improve throughput.
> 
> When compared to the existing non-batching implementation, the tcrypt
> multibuffer benchmark shows an increase in throughput of ~85% for 16 byte
> AES blocks (when testing with 8 data streams on the OrangePi Zero2 board).
> 
> Patches 1-9 perform refactoring work on the existing do_one_request()
> callbacks, to make them more modular and easier to integrate with the
> request batching workflow.
> 
> Patch 10 implements the actual request batching.
> 
> Changes in v2:
>    - fixed [-Wformat-truncation=] warning reported by kernel test robot
> 

Hello

Thanks for your patches, I am starting to review and test them.

@Herbert, please give me some time for this.

Regards