crypto: benchmark - add the crypto benchmark

[RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Yang Shen 3 years, 6 months ago

Add crypto benchmark - a tool to help users quickly measure the
performance of an algorithm registered in crypto.

The tool uses a single API to unify the processing of different
algorithms. Each algorithm can perform its private operations in the
callbacks. Users see one unified set of configuration parameters rather
than a separate set of parameters for each algorithm.

This tool lets users test the performance of algorithms in specific
scenarios. At present, the following parameters are available for user
configuration: block size, block number, thread number, NUMA binding,
and the number of requests per tfm. These parameters can help users
simulate approximate business scenarios.

For the RFC version, the compression benchmark test is supported.
I did some verification on Kunpeng920.

The first test case is for the zlib-deflate software algorithm.
The CPU frequency is 2.6 GHz. I want to show the influence of these
parameters.

The configuration is as follows:
run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 1024,
loop 1, numamask 0x0, optype 0, reqnum 1, threadnum 1, time 1.
The result is:
Crypto benchmark result:
        throughput      pps             time
        150 MB/s        150 kPP/s       1000 ms
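
For anyone scripting comparisons between runs, the "run set:" line can be
turned into a dictionary. This is only a sketch: the function name is
hypothetical, it is not part of the patch set, and it assumes the line
keeps the "key value, key value, ..." layout shown above.

```python
# Hypothetical parser for the "run set:" line; not part of the patch set.
# Assumption: parameters are comma-separated "key value" pairs ending with '.'.
def parse_run_set(text):
    """Turn 'run set: key value, key value, ...' into a dict of strings."""
    body = text.split("run set:", 1)[1]
    params = {}
    for item in body.replace("\n", " ").rstrip(". ").split(","):
        key, value = item.split()
        params[key] = value
    return params

line = ("run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 1024,\n"
        "loop 1, numamask 0x0, optype 0, reqnum 1, threadnum 1, time 1.")
params = parse_run_set(line)
# Values are kept as strings; convert them as needed (numamask is hexadecimal).
print(params["algorithm"], int(params["inputsize"]), int(params["numamask"], 16))
```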

Then change the block size:
run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 8192,
loop 1, numamask 0x0, optype 0, reqnum 1, threadnum 1, time 1.
Crypto benchmark result:
        throughput      pps             time
        473 MB/s        59 kPP/s        1005 ms

run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 65536,
loop 1, numamask 0x0, optype 0, reqnum 1, threadnum 1, time 1.
Crypto benchmark result:
        throughput      pps             time
        421 MB/s        6 kPP/s         1038 ms

With this test, users can see that both the throughput and the pps are
influenced by block size on this server. The throughput has a peak
value, while the pps falls as the block size increases.
Because this is a software algorithm, the result scales linearly with
the thread number while it is less than the CPU number, and the other
parameters have little influence on performance.
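
As a rough cross-check of the three software results, throughput and pps
are tied together by the block size: throughput is approximately pps
times inputsize. The sketch below is not part of the patch set. The
function name is hypothetical, and it assumes binary units (1 MB = 2^20
bytes, 1 k = 1024 packets), which is a guess about how the tool prints
its figures.

```python
# Hypothetical helper, not part of the patch set.
# Assumption: MB means 2**20 bytes and kPP means 1024 packets.
def expected_kpps(throughput_mb_s, inputsize_bytes):
    """Estimate packets per second (in kPP/s) from throughput and block size."""
    bytes_per_second = throughput_mb_s * 2**20
    packets_per_second = bytes_per_second / inputsize_bytes
    return packets_per_second / 1024

# (inputsize in bytes, reported MB/s, reported kPP/s) from the three runs
runs = [(1024, 150, 150), (8192, 473, 59), (65536, 421, 6)]

for inputsize, mb_s, reported in runs:
    estimate = expected_kpps(mb_s, inputsize)
    print(f"inputsize {inputsize:>5}: estimated {estimate:.1f} kPP/s, "
          f"reported {reported} kPP/s")
```

Under that assumption each estimate lands within one unit of the
reported pps, which suggests the printed values are rounded or
truncated.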

The second test case is for the zlib-deflate hardware algorithm. The
parameters tested above have the same effect on hardware. Here I test
the parameter 'reqnum'. The software algorithm is registered as a
synchronous implementation, so 'reqnum' has no effect on software
performance.

run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 8192,
loop 1, numamask 0x0, optype 0, reqnum 1, threadnum 1, time 1.
Crypto benchmark result:
        throughput      pps             time
        367 MB/s        46 kPP/s        941 ms

run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 8192,
loop 1, numamask 0x0, optype 0, reqnum 10, threadnum 1, time 1.
Crypto benchmark result:
        throughput      pps             time
        3507 MB/s       438 kPP/s       1003 ms

run set: algorithm zlib-deflate, algtype CRYPTO_COMPRESS, inputsize 8192,
loop 1, numamask 0x0, optype 0, reqnum 100, threadnum 1, time 1.
Crypto benchmark result:
        throughput      pps             time
        6318 MB/s       790 kPP/s       1093 ms

This shows that for asynchronous algorithms, the number of requests per
tfm also influences the throughput and pps, up to a peak value.
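
To make the scaling concrete, the relative gain from raising 'reqnum'
can be computed from the three hardware results. This is only
illustrative arithmetic on the reported numbers; the function name is
hypothetical and not part of the patch set.

```python
# Reported hardware results from the three runs: reqnum -> throughput in MB/s
results = {1: 367, 10: 3507, 100: 6318}

def speedup(results, baseline_reqnum=1):
    """Return throughput relative to the baseline reqnum for each run."""
    base = results[baseline_reqnum]
    return {reqnum: mb_s / base for reqnum, mb_s in results.items()}

for reqnum, factor in speedup(results).items():
    print(f"reqnum {reqnum:>3}: {factor:.1f}x the reqnum=1 throughput")
```

Going from 1 to 10 requests gives close to a tenfold gain, while going
from 10 to 100 gives less than a twofold gain, which is consistent with
the throughput approaching a peak.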

So with this tool, we can quickly verify different platforms and get
reference data for configuring business scenarios.

Yang Shen (6):
  moduleparams: Add hexulong type parameter
  crypto: benchmark - add a crypto benchmark tool
  crypto: benchmark - support compression/decompression
  crypto: benchmark - add help information
  crypto: benchmark - add API documentation
  MAINTAINERS: add crypto benchmark MAINTAINER

 Documentation/crypto/benchmark.rst | 104 +++++
 MAINTAINERS                        |   7 +
 crypto/Kconfig                     |   2 +
 crypto/Makefile                    |   5 +
 crypto/benchmark/Kconfig           |  11 +
 crypto/benchmark/Makefile          |   3 +
 crypto/benchmark/benchmark.c       | 599 +++++++++++++++++++++++++++++
 crypto/benchmark/benchmark.h       |  76 ++++
 crypto/benchmark/bm_comp.c         | 435 +++++++++++++++++++++
 crypto/benchmark/bm_comp.h         |  19 +
 include/linux/moduleparam.h        |   7 +-
 kernel/params.c                    |   1 +
 12 files changed, 1268 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/crypto/benchmark.rst
 create mode 100644 crypto/benchmark/Kconfig
 create mode 100644 crypto/benchmark/Makefile
 create mode 100644 crypto/benchmark/benchmark.c
 create mode 100644 crypto/benchmark/benchmark.h
 create mode 100644 crypto/benchmark/bm_comp.c
 create mode 100644 crypto/benchmark/bm_comp.h

--
2.24.0

Re: [RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Herbert Xu 3 years, 6 months ago

On Mon, Sep 19, 2022 at 08:05:31PM +0800, Yang Shen wrote:
> Add crypto benchmark - a tool to help users quickly measure the
> performance of an algorithm registered in crypto.

Please explain how this relates to the existing speed testing
functionality in tcrypt.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Yang Shen 3 years, 6 months ago

On 2022/9/20 16:28, Herbert Xu wrote:
> On Mon, Sep 19, 2022 at 08:05:31PM +0800, Yang Shen wrote:
>> Add crypto benchmark - a tool to help users quickly measure the
>> performance of an algorithm registered in crypto.
> Please explain how this relates to the existing speed testing
> functionality in tcrypt.
>
> Thanks,

In fact, my goal is a crypto benchmark tool that is customizable and
easy to invoke. For example, I test the hardware performance at every
rc1 to check whether changes to the common modules affect it. I need to
test multiple threads, multiple NUMA nodes, multiple requests per tfm,
and so on. These test cases simulate some service scenarios, and with
them I can find out whether a patch applied to any common module has an
impact on us.

I know tcrypt.ko has speed test cases, but its test cases are fixed.
If I understand correctly, tcrypt.ko is designed to test algorithms
under predetermined conditions. It can provide standardized testing to
ensure that the implementation of an algorithm meets the requirements.
This is a reasonable developer test tool, but it is not flexible enough
for testers and users.

There are two main reasons for this:
1> For testers, performance does not depend only on the algorithm and
its configuration. Many settings that tcrypt.ko does not expose may
have an obvious effect on performance. Of course, this problem can be
fixed by adding them as module parameters.
2> For users, a friendly tool is one they can use directly, without
having to read the source code to learn how to use it. With tcrypt.ko,
users need to look up the 'mode' number of the case they want to test,
if it exists.

So the original intention of this tool is to let users test more
complex scenarios and see how to use the parameters directly.

If I have misunderstood anything about tcrypt.ko, please correct me,
and I'll try to use tcrypt.ko to meet my needs.

Thanks,

Yang

Re: [RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Herbert Xu 3 years, 6 months ago

On Wed, Sep 21, 2022 at 04:19:18PM +0800, Yang Shen wrote:
>
> I know tcrypt.ko has speed test cases, but its test cases are fixed.
> If I understand correctly, tcrypt.ko is designed to test algorithms
> under predetermined conditions. It can provide standardized testing to
> ensure that the implementation of an algorithm meets the requirements.
> This is a reasonable developer test tool, but it is not flexible enough
> for testers and users.

How about improving tcrypt then? We're not going to have two things
in the kernel that do the same thing unless you provide a clear path
of eliminating one of them.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Yang Shen 3 years, 5 months ago


On 2022/9/30 12:51, Herbert Xu wrote:
> On Wed, Sep 21, 2022 at 04:19:18PM +0800, Yang Shen wrote:
>> I know tcrypt.ko has speed test cases, but its test cases are fixed.
>> If I understand correctly, tcrypt.ko is designed to test algorithms
>> under predetermined conditions. It can provide standardized testing to
>> ensure that the implementation of an algorithm meets the requirements.
>> This is a reasonable developer test tool, but it is not flexible enough
>> for testers and users.
> How about improving tcrypt then? We're not going to have two things
> in the kernel that do the same thing unless you provide a clear path
> of eliminating one of them.
>
> Cheers,

Got it. I'll try to support this on the tcrypt.

Thanks,

Yang

Re: [RFC PATCH 0/6] crypto: benchmark - add the crypto benchmark

Posted by Herbert Xu 3 years, 5 months ago

On Fri, Oct 14, 2022 at 09:43:40AM +0800, Yang Shen wrote:
>
> Got it. I'll try to support this on the tcrypt.

Before you get too far into this, please note that I have no
preference as to whether you go with tcrypt or your new benchmark
code.

My only requirement is that we pick one mechanism.

But obviously others might have a preference so you should try
to produce RFCs as early as possible.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt