[Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support

longpeng.mike@gmail.com posted 18 patches 6 years, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1500032321-13951-1-git-send-email-longpeng.mike@gmail.com
Test FreeBSD passed
Test checkpatch passed
Test docker passed
Test s390x passed
There is a newer version of this series
[Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
Posted by longpeng.mike@gmail.com 6 years, 9 months ago
From: "Longpeng(Mike)" <longpeng2@huawei.com>

The AF_ALG socket family is the userspace interface to the Linux
crypto API; users can use it to access hardware accelerators.
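
For readers unfamiliar with AF_ALG, here is a minimal sketch of the kernel
interface. It is not code from this patchset: the helper name afalg_sha256 is
made up for illustration, and it assumes a Linux host that ships
<linux/if_alg.h> and a kernel that provides the "sha256" hash. It returns -1
instead of aborting when AF_ALG is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/if_alg.h>

#ifndef AF_ALG
#define AF_ALG 38   /* value used by Linux; older libc headers may lack it */
#endif

/*
 * Hypothetical helper: hash a small buffer with the kernel's sha256
 * through AF_ALG. Returns 0 on success, -1 if the socket family or the
 * algorithm is unavailable or the I/O falls short.
 */
static int afalg_sha256(const void *data, size_t len, unsigned char out[32])
{
    struct sockaddr_alg sa;
    int tfm, op, ret = -1;

    assert(data != NULL && out != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.salg_family = AF_ALG;
    strcpy((char *)sa.salg_type, "hash");
    strcpy((char *)sa.salg_name, "sha256");

    /* The bound socket names the algorithm ... */
    tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0) {
        return -1;
    }
    if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        goto out_tfm;
    }

    /* ... and accept() yields the socket that carries the data. */
    op = accept(tfm, NULL, NULL);
    if (op < 0) {
        goto out_tfm;
    }

    /* One-shot: send the message, then read back the 32-byte digest. */
    if (send(op, data, len, 0) == (ssize_t)len && read(op, out, 32) == 32) {
        ret = 0;
    }

    close(op);
out_tfm:
    close(tfm);
    return ret;
}
```

Every operation costs at least two system calls, which is the per-call
overhead that matters for small chunk sizes.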

This patchset adds an afalg-backend to the QEMU crypto subsystem. Currently,
when performing encryption/decryption, we try the afalg-backend first and
fall back to the library-backend if it fails.
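
The "try afalg first, then fall back" behaviour can be sketched as an ordered
list of drivers. This is only an illustration of the idea under assumed names
(hash_driver, hash_with_fallback and the two stand-in backends are all
hypothetical); the real dispatch lives in the patches themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver signature: 0 on success, -1 on failure. */
typedef int (*hash_fn)(const void *data, size_t len, unsigned char *out);

struct hash_driver {
    const char *name;
    hash_fn hash;
};

/*
 * Try each driver in order and stop at the first one that succeeds.
 * Returns the index of the driver that produced the result, or -1 if
 * every driver failed.
 */
static int hash_with_fallback(const struct hash_driver *drv, size_t ndrv,
                              const void *data, size_t len,
                              unsigned char *out)
{
    assert(drv != NULL && out != NULL);

    for (size_t i = 0; i < ndrv; i++) {
        if (drv[i].hash != NULL && drv[i].hash(data, len, out) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Stand-in backends, for illustration only. */
static int fake_afalg_unavailable(const void *data, size_t len,
                                  unsigned char *out)
{
    (void)data; (void)len; (void)out;
    return -1;              /* pretend the kernel lacks the algorithm */
}

static int fake_library_ok(const void *data, size_t len, unsigned char *out)
{
    (void)data; (void)len;
    out[0] = 0x42;          /* placeholder "digest" */
    return 0;
}
```

With the drivers listed as { afalg, library }, a failing afalg driver makes
the call return index 1, i.e. the library result.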

As a next step, it would support a command-line parameter to specify
which backend to prefer, along with some other improvements.

I measured the performance of the afalg-backend implementations by testing
how much data could be encrypted in 5 seconds.
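
The measurement idea (process fixed-size chunks for a fixed time, then divide)
can be sketched as follows. This is not the benchmark code from the series:
measure_mb_per_sec, chunk_fn and xor_chunk are hypothetical names, and
treating 1 MB as 2^20 bytes is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: process one chunk of data. */
typedef void (*chunk_fn)(const unsigned char *buf, size_t len);

static double now_secs(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Call fn on the same chunk until about `secs` seconds have passed and
 * return the throughput in MB/sec (assuming 1 MB = 2^20 bytes).
 */
static double measure_mb_per_sec(chunk_fn fn, const unsigned char *buf,
                                 size_t chunk_size, double secs)
{
    double start, elapsed, total_bytes = 0.0;

    assert(fn != NULL && chunk_size > 0 && secs > 0.0);

    start = now_secs();
    do {
        fn(buf, chunk_size);
        total_bytes += (double)chunk_size;
        elapsed = now_secs() - start;
    } while (elapsed < secs);

    return total_bytes / (1024.0 * 1024.0) / elapsed;
}

/* Trivial stand-in workload so the sketch can be exercised. */
static volatile unsigned char sink;

static void xor_chunk(const unsigned char *buf, size_t len)
{
    unsigned char acc = 0;

    for (size_t i = 0; i < len; i++) {
        acc ^= buf[i];
    }
    sink = acc;
}
```

Calling measure_mb_per_sec(xor_chunk, buf, 512, 5.0) would mirror the
5-second runs reported here, with the real cipher or hash call in place of
xor_chunk.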

NOTE: If we use specific hardware crypto cards, I think the afalg-backend
      would be even faster.

test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz

*sha256*
chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
512                 93.03                       185.87
1024                146.32                      201.78
2048                213.32                      210.93
4096                275.48                      215.26
8192                321.77                      217.49
16384               349.60                      219.26
32768               363.59                      219.73
65536               375.79                      219.99

*hmac(sha256)*
chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
512                 71.26                       165.55
1024                117.43                      189.15
2048                180.96                      203.24
4096                247.60                      211.38
8192                301.99                      215.65
16384               340.79                      218.22
32768               365.51                      219.49
65536               377.92                      220.24

*cbc(aes128)*
chunk_size(bytes)   MB/sec(afalg:cbc-aes-aesni)  MB/sec(nettle)
512                 371.76                       188.41
1024                559.86                       189.64
2048                768.66                       192.11
4096                939.15                       192.40
8192                1029.48                      192.49
16384               1072.79                      190.52
32768               1109.38                      190.41
65536               1102.38                      190.40

---
Changes since v4:
    - remove 'name' field in 'struct CryptoAFAlg'. [Daniel]
    - add error handling for read() returning less than requested. [Daniel]
    - use iov_send_recv to recv msg in hash-afalg.c. [Daniel]
    - refactor hmac benchmark as suggested. [Daniel]

Changes since v3:
    - add "Reviewed-by: Daniel P. Berrange <address@hidden>" in
      commit messages of PATCH 1/2/3/4/5/7/8/9/10/11.
    - PATCH 12: use strlen() instead of qemu_strnlen() in
                qcrypto_afalg_build_saddr(). [Daniel]
    - PATCH 12: rather than indenting the entire method, just return immediately
                if afalg=NULL. [Daniel]
    - PATCH 13: use g_strdup_printf() instead of g_new0+snprintf() and remove
                redundant bounds check in qcrypto_afalg_cipher_format_name().
                [Daniel]
    - PATCH 13: s/except_niv/expect_niv  s/origin_contorllen/origin_controllen.
                [Daniel]
    - PATCH 13: use '%zu' to print 'size_t' in qcrypto_afalg_cipher_setiv().
                [Daniel]
    - PATCH 13: remove qcrypto_cipher_using_afalg_drv(). [Daniel]
    - PATCH 13: refactor qcrypto_cipher_new() as Daniel suggested. [Daniel]
    - PATCH 13: correct the ->cmsg initialization in
                qcrypto_afalg_cipher_ctx_new() to avoid different behaviour
                in test_cipher_null_iv(). [Daniel]
    - PATCH 14: use g_strdup_printf() instead of g_new0+snprintf() and remove
                redundant bounds check in qcrypto_afalg_hash_format_name().
                [Daniel]
    - PATCH 14: s/except_len/expect_len. [Daniel]
    - PATCH 14: free 'errp' if afalg_driver.hash_bytesv() failed. [Daniel]
    - PATCH 14: maybe some afalg errors should be treated as fatal, but we
                have no idea yet, so add a "TODO" comment.
    - PATCH 15: refactor qcrypto_hmac_new() as Daniel suggested. [Daniel]

Changes since v2:
    - init sockaddr_alg object when it's defined. [Gonglei]
    - fix some superfluous initialization. [Gonglei]
    - s/opeartion/operation/g in crypto/afalgpriv.h. [Gonglei]
    - check 'niv' in qcrypto_afalg_cipher_setiv. [Gonglei]

Changes since v1:
    - use "make check-speed" to test the performance. [Daniel]
    - put private definitions into crypto/***priv.h. [Daniel]
    - remove afalg socket from qapi-schema, put them into crypto/. [Daniel]
    - some Error report changes. [Daniel]
    - s/QCryptoAfalg/QCryptoAFAlg. [Daniel]
    - use snprintf with bounds checking instead of sprintf. [Daniel]
    - use "qcrypto_afalg_" prefix and "qcrypto_nettle(gcrypt,glib,builtin)_"
      prefix. [Daniel]
    - add testing results in cover-letter. [Gonglei]

---
Longpeng(Mike) (18):
  crypto: cipher: introduce context free function
  crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend
  crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend
  crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend
  crypto: cipher: add cipher driver framework
  crypto: hash: add hash driver framework
  crypto: hmac: move crypto/hmac.h into include/crypto/
  crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend
  crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend
  crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend
  crypto: hmac: add hmac driver framework
  crypto: introduce some common functions for af_alg backend
  crypto: cipher: add afalg-backend cipher support
  crypto: hash: add afalg-backend hash support
  crypto: hmac: add af_alg-backend hmac support
  tests: crypto: add cipher speed benchmark support
  tests: crypto: add hash speed benchmark support
  tests: crypto: add hmac speed benchmark support

 configure                       |  22 ++++
 crypto/Makefile.objs            |   3 +
 crypto/afalg.c                  | 116 +++++++++++++++++++++
 crypto/afalgpriv.h              |  64 ++++++++++++
 crypto/cipher-afalg.c           | 226 ++++++++++++++++++++++++++++++++++++++++
 crypto/cipher-builtin.c         | 125 +++++++++++-----------
 crypto/cipher-gcrypt.c          | 105 ++++++++++---------
 crypto/cipher-nettle.c          |  84 ++++++++-------
 crypto/cipher.c                 |  80 ++++++++++++++
 crypto/cipherpriv.h             |  56 ++++++++++
 crypto/hash-afalg.c             | 214 +++++++++++++++++++++++++++++++++++++
 crypto/hash-gcrypt.c            |  19 ++--
 crypto/hash-glib.c              |  19 ++--
 crypto/hash-nettle.c            |  19 ++--
 crypto/hash.c                   |  30 ++++++
 crypto/hashpriv.h               |  39 +++++++
 crypto/hmac-gcrypt.c            |  42 ++++----
 crypto/hmac-glib.c              |  63 ++++++-----
 crypto/hmac-nettle.c            |  42 ++++----
 crypto/hmac.c                   |  58 +++++++++++
 crypto/hmac.h                   | 166 -----------------------------
 crypto/hmacpriv.h               |  48 +++++++++
 include/crypto/cipher.h         |   1 +
 include/crypto/hmac.h           | 167 +++++++++++++++++++++++++++++
 tests/Makefile.include          |  13 ++-
 tests/benchmark-crypto-cipher.c |  88 ++++++++++++++++
 tests/benchmark-crypto-hash.c   |  67 ++++++++++++
 tests/benchmark-crypto-hmac.c   |  82 +++++++++++++++
 28 files changed, 1647 insertions(+), 411 deletions(-)
 create mode 100644 crypto/afalg.c
 create mode 100644 crypto/afalgpriv.h
 create mode 100644 crypto/cipher-afalg.c
 create mode 100644 crypto/cipherpriv.h
 create mode 100644 crypto/hash-afalg.c
 create mode 100644 crypto/hashpriv.h
 delete mode 100644 crypto/hmac.h
 create mode 100644 crypto/hmacpriv.h
 create mode 100644 include/crypto/hmac.h
 create mode 100644 tests/benchmark-crypto-cipher.c
 create mode 100644 tests/benchmark-crypto-hash.c
 create mode 100644 tests/benchmark-crypto-hmac.c

-- 
1.8.3.1


Re: [Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
Posted by Daniel P. Berrange 6 years, 9 months ago
On Fri, Jul 14, 2017 at 07:38:22AM -0400, longpeng.mike@gmail.com wrote:
> From: "Longpeng(Mike)" <longpeng2@huawei.com>
> 
> The AF_ALG socket family is the userspace interface to the Linux
> crypto API; users can use it to access hardware accelerators.
> 
> This patchset adds an afalg-backend to the QEMU crypto subsystem. Currently,
> when performing encryption/decryption, we try the afalg-backend first and
> fall back to the library-backend if it fails.
> 
> As a next step, it would support a command-line parameter to specify
> which backend to prefer, along with some other improvements.
> 
> I measured the performance of the afalg-backend implementations by testing
> how much data could be encrypted in 5 seconds.
> 
> NOTE: If we use specific hardware crypto cards, I think the afalg-backend
>       would be even faster.
> 
> test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
> 
> *sha256*
> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
> 512                 93.03                       185.87
> 1024                146.32                      201.78
> 2048                213.32                      210.93
> 4096                275.48                      215.26
> 8192                321.77                      217.49
> 16384               349.60                      219.26
> 32768               363.59                      219.73
> 65536               375.79                      219.99
> 
> *hmac(sha256)*
> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
> 512                 71.26                       165.55
> 1024                117.43                      189.15
> 2048                180.96                      203.24
> 4096                247.60                      211.38
> 8192                301.99                      215.65
> 16384               340.79                      218.22
> 32768               365.51                      219.49
> 65536               377.92                      220.24
> 
> *cbc(aes128)*
> chunk_size(bytes)   MB/sec(afalg:cbc-aes-aesni)  MB/sec(nettle)
> 512                 371.76                       188.41
> 1024                559.86                       189.64
> 2048                768.66                       192.11
> 4096                939.15                       192.40
> 8192                1029.48                      192.49
> 16384               1072.79                      190.52
> 32768               1109.38                      190.41
> 65536               1102.38                      190.40

So I've attempted to replicate these results, and I see a totally
different outcome. NB, I hacked your code so that setting
QEMU_DISABLE_AF_ALG=1 would skip the af-alg impl. The results
I get are:

$ tests/benchmark-crypto-hash --quiet
sha256: Testing chunk_size 512 bytes done: 197.31 MB in 5.00 secs: 39.46 MB/sec
sha256: Testing chunk_size 1024 bytes done: 337.03 MB in 5.00 secs: 67.41 MB/sec
sha256: Testing chunk_size 2048 bytes done: 516.27 MB in 5.00 secs: 103.25 MB/sec
sha256: Testing chunk_size 4096 bytes done: 675.18 MB in 5.00 secs: 135.04 MB/sec
sha256: Testing chunk_size 8192 bytes done: 837.73 MB in 5.00 secs: 167.55 MB/sec
sha256: Testing chunk_size 16384 bytes done: 946.78 MB in 5.00 secs: 189.35 MB/sec
sha256: Testing chunk_size 32768 bytes done: 1008.56 MB in 5.00 secs: 201.71 MB/sec
sha256: Testing chunk_size 65536 bytes done: 1037.19 MB in 5.00 secs: 207.43 MB/sec

$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-hash --quiet
sha256: Testing chunk_size 512 bytes done: 1099.92 MB in 5.00 secs: 219.98 MB/sec
sha256: Testing chunk_size 1024 bytes done: 1223.40 MB in 5.00 secs: 244.68 MB/sec
sha256: Testing chunk_size 2048 bytes done: 1304.04 MB in 5.00 secs: 260.81 MB/sec
sha256: Testing chunk_size 4096 bytes done: 1339.29 MB in 5.00 secs: 267.86 MB/sec
sha256: Testing chunk_size 8192 bytes done: 1359.68 MB in 5.00 secs: 271.94 MB/sec
sha256: Testing chunk_size 16384 bytes done: 1363.58 MB in 5.00 secs: 272.71 MB/sec
sha256: Testing chunk_size 32768 bytes done: 1364.66 MB in 5.00 secs: 272.93 MB/sec
sha256: Testing chunk_size 65536 bytes done: 1326.56 MB in 5.00 secs: 265.30 MB/sec


  ==> AF_ALG is slower in every case, by as much as x5.6 (at 512 bytes)
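
The size of the gap can be read off the two sha256 runs above by dividing the
library figure by the AF_ALG figure at each chunk size; the gap is largest at
the smallest chunk and shrinks as chunks grow. A small sketch of that
arithmetic, with the MB/sec values copied from the output above and a
hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* sha256 MB/sec from the two runs above, chunk sizes 512 .. 65536. */
static const double sha256_afalg_mbs[] = {
    39.46, 67.41, 103.25, 135.04, 167.55, 189.35, 201.71, 207.43
};
static const double sha256_library_mbs[] = {
    219.98, 244.68, 260.81, 267.86, 271.94, 272.71, 272.93, 265.30
};

/*
 * Return the index with the largest fast/slow ratio and store that
 * ratio in *worst.
 */
static size_t worst_slowdown(const double *slow, const double *fast,
                             size_t n, double *worst)
{
    size_t idx = 0;

    assert(n > 0 && worst != NULL);

    *worst = fast[0] / slow[0];
    for (size_t i = 1; i < n; i++) {
        double ratio = fast[i] / slow[i];

        if (ratio > *worst) {
            *worst = ratio;
            idx = i;
        }
    }
    return idx;
}
```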



$ tests/benchmark-crypto-hmac --quiet
hmac(sha256): Testing chunk_size 512 bytes done: 173.83 MB in 5.00 secs: 34.77 MB/sec
hmac(sha256): Testing chunk_size 1024 bytes done: 302.32 MB in 5.00 secs: 60.46 MB/sec
hmac(sha256): Testing chunk_size 2048 bytes done: 469.93 MB in 5.00 secs: 93.99 MB/sec
hmac(sha256): Testing chunk_size 4096 bytes done: 648.27 MB in 5.00 secs: 129.65 MB/sec
hmac(sha256): Testing chunk_size 8192 bytes done: 800.80 MB in 5.00 secs: 160.16 MB/sec
hmac(sha256): Testing chunk_size 16384 bytes done: 887.09 MB in 5.00 secs: 177.42 MB/sec
hmac(sha256): Testing chunk_size 32768 bytes done: 932.09 MB in 5.00 secs: 186.41 MB/sec
hmac(sha256): Testing chunk_size 65536 bytes done: 1013.25 MB in 5.00 secs: 202.64 MB/sec

$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-hmac --quiet
hmac(sha256): Testing chunk_size 512 bytes done: 751.36 MB in 5.00 secs: 150.27 MB/sec
hmac(sha256): Testing chunk_size 1024 bytes done: 961.43 MB in 5.00 secs: 192.29 MB/sec
hmac(sha256): Testing chunk_size 2048 bytes done: 1110.92 MB in 5.00 secs: 222.18 MB/sec
hmac(sha256): Testing chunk_size 4096 bytes done: 1225.78 MB in 5.00 secs: 245.16 MB/sec
hmac(sha256): Testing chunk_size 8192 bytes done: 1300.52 MB in 5.00 secs: 260.10 MB/sec
hmac(sha256): Testing chunk_size 16384 bytes done: 1327.00 MB in 5.00 secs: 265.40 MB/sec
hmac(sha256): Testing chunk_size 32768 bytes done: 1345.72 MB in 5.00 secs: 269.14 MB/sec
hmac(sha256): Testing chunk_size 65536 bytes done: 1348.50 MB in 5.00 secs: 269.69 MB/sec


  ==> AF_ALG is slower in every case, by as much as x4



$ tests/benchmark-crypto-cipher --quiet
cbc(aes128): Testing chunk_size 512 bytes done: 1571.74 MB in 5.00 secs: 314.35 MB/sec
cbc(aes128): Testing chunk_size 1024 bytes done: 2436.54 MB in 5.00 secs: 487.31 MB/sec
cbc(aes128): Testing chunk_size 2048 bytes done: 3412.53 MB in 5.00 secs: 682.50 MB/sec
cbc(aes128): Testing chunk_size 4096 bytes done: 4307.00 MB in 5.00 secs: 861.40 MB/sec
cbc(aes128): Testing chunk_size 8192 bytes done: 4854.20 MB in 5.00 secs: 970.84 MB/sec
cbc(aes128): Testing chunk_size 16384 bytes done: 5180.72 MB in 5.00 secs: 1036.14 MB/sec
cbc(aes128): Testing chunk_size 32768 bytes done: 5390.25 MB in 5.00 secs: 1078.05 MB/sec
cbc(aes128): Testing chunk_size 65536 bytes done: 5427.94 MB in 5.00 secs: 1085.59 MB/sec


$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-cipher --quiet
cbc(aes128): Testing chunk_size 512 bytes done: 4204.65 MB in 5.00 secs: 840.93 MB/sec
cbc(aes128): Testing chunk_size 1024 bytes done: 4362.01 MB in 5.00 secs: 872.40 MB/sec
cbc(aes128): Testing chunk_size 2048 bytes done: 4347.91 MB in 5.00 secs: 869.58 MB/sec
cbc(aes128): Testing chunk_size 4096 bytes done: 4432.54 MB in 5.00 secs: 886.51 MB/sec
cbc(aes128): Testing chunk_size 8192 bytes done: 4416.47 MB in 5.00 secs: 883.29 MB/sec
cbc(aes128): Testing chunk_size 16384 bytes done: 4469.45 MB in 5.00 secs: 893.89 MB/sec
cbc(aes128): Testing chunk_size 32768 bytes done: 4454.56 MB in 5.00 secs: 890.91 MB/sec
cbc(aes128): Testing chunk_size 65536 bytes done: 4518.50 MB in 5.00 secs: 903.70 MB/sec


  => AF_ALG is slower until chunk_size is 8192 or larger.
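
The crossover point can be checked mechanically from the two cbc(aes128) runs
above. The sketch below uses the MB/sec values copied from that output; the
array and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

static const size_t cbc_chunk_sizes[] = {
    512, 1024, 2048, 4096, 8192, 16384, 32768, 65536
};
/* cbc(aes128) MB/sec with AF_ALG enabled. */
static const double cbc_afalg_mbs[] = {
    314.35, 487.31, 682.50, 861.40, 970.84, 1036.14, 1078.05, 1085.59
};
/* cbc(aes128) MB/sec with QEMU_DISABLE_AF_ALG=1. */
static const double cbc_library_mbs[] = {
    840.93, 872.40, 869.58, 886.51, 883.29, 893.89, 890.91, 903.70
};

/*
 * Return the first chunk size at which a[] is faster than b[],
 * or 0 if a[] never wins.
 */
static size_t crossover_chunk(const size_t *sizes, const double *a,
                              const double *b, size_t n)
{
    assert(sizes != NULL && a != NULL && b != NULL);

    for (size_t i = 0; i < n; i++) {
        if (a[i] > b[i]) {
            return sizes[i];
        }
    }
    return 0;
}
```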


I of course don't have the same CPU as you, but it is a representative
current model: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz

I can, however, imagine that there are scenarios where this is faster,
particularly if using this in an embedded scenario with a relatively
low perf main CPU, but a hardware accelerator available.

Based on this though, I'm very reluctant to enable AF_ALG by default
when building QEMU, because I think it'll likely cause a major perf
regression for the common case of people with fast CPUs and no
hardware accelerator.

I think in the immediate term we should add a switch to configure,
--enable-crypto-afalg, that must be opt-in when building QEMU,
so that people who know they have a good hardware accelerator
present can use it, but in the general case we avoid it.
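
One way such an opt-in could look in C, combined with the
QEMU_DISABLE_AF_ALG override used for the measurements above. This is a
sketch under assumptions: the macro name CONFIG_AF_ALG and both helper names
are hypothetical, and the environment check only mirrors the local hack, not
anything in the posted series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Assumption: the build system defines CONFIG_AF_ALG only when QEMU is
 * configured with --enable-crypto-afalg.
 */
#ifdef CONFIG_AF_ALG
#define AFALG_BUILT_IN true
#else
#define AFALG_BUILT_IN false
#endif

/*
 * AF_ALG is used only if it was compiled in and the override variable
 * is not set to "1".
 */
static bool afalg_wanted(bool built_in, const char *disable_env)
{
    if (!built_in) {
        return false;
    }
    return !(disable_env != NULL && strcmp(disable_env, "1") == 0);
}

static bool afalg_wanted_now(void)
{
    return afalg_wanted(AFALG_BUILT_IN, getenv("QEMU_DISABLE_AF_ALG"));
}
```

A default build leaves the macro undefined, so the library backend is used
without any runtime probing.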

For the general case, I think we need to figure out how to make
direct use of CPU instructions for crypto, e.g. Intel AES-NI. This
might be possible by using GNUTLS for ciphers (though it lacks
coverage for all the combinations we want).

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
Posted by Longpeng(Mike) 6 years, 9 months ago
2017-07-14 21:04 GMT+08:00 Daniel P. Berrange <berrange@redhat.com>:
> On Fri, Jul 14, 2017 at 07:38:22AM -0400, longpeng.mike@gmail.com wrote:
>> From: "Longpeng(Mike)" <longpeng2@huawei.com>
>>
[...]

>>
>> NOTE: If we use specific hardware crypto cards, I think the afalg-backend
>>       would be even faster.
>>
>> test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>>
>> *sha256*
>> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
>> 512                 93.03                       185.87
>> 1024                146.32                      201.78
>> 2048                213.32                      210.93
>> 4096                275.48                      215.26
>> 8192                321.77                      217.49
>> 16384               349.60                      219.26
>> 32768               363.59                      219.73
>> 65536               375.79                      219.99
>>
>> *hmac(sha256)*
>> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
>> 512                 71.26                       165.55
>> 1024                117.43                      189.15
>> 2048                180.96                      203.24
>> 4096                247.60                      211.38
>> 8192                301.99                      215.65
>> 16384               340.79                      218.22
>> 32768               365.51                      219.49
>> 65536               377.92                      220.24
>>
>> *cbc(aes128)*
>> chunk_size(bytes)   MB/sec(afalg:cbc-aes-aesni)  MB/sec(nettle)
>> 512                 371.76                       188.41
>> 1024                559.86                       189.64
>> 2048                768.66                       192.11
>> 4096                939.15                       192.40
>> 8192                1029.48                      192.49
>> 16384               1072.79                      190.52
>> 32768               1109.38                      190.41
>> 65536               1102.38                      190.40
>
> So I've attempted to replicate these results, and I see a totally
> different outcome. NB, I hacked your code so that setting
> QEMU_DISABLE_AF_ALG=1 would skip the af-alg impl. The results
> I get are:
>
> $ tests/benchmark-crypto-hash --quiet
> sha256: Testing chunk_size 512 bytes done: 197.31 MB in 5.00 secs: 39.46 MB/sec
> sha256: Testing chunk_size 1024 bytes done: 337.03 MB in 5.00 secs: 67.41 MB/sec
> sha256: Testing chunk_size 2048 bytes done: 516.27 MB in 5.00 secs: 103.25 MB/sec
> sha256: Testing chunk_size 4096 bytes done: 675.18 MB in 5.00 secs: 135.04 MB/sec
> sha256: Testing chunk_size 8192 bytes done: 837.73 MB in 5.00 secs: 167.55 MB/sec
> sha256: Testing chunk_size 16384 bytes done: 946.78 MB in 5.00 secs: 189.35 MB/sec
> sha256: Testing chunk_size 32768 bytes done: 1008.56 MB in 5.00 secs: 201.71 MB/sec
> sha256: Testing chunk_size 65536 bytes done: 1037.19 MB in 5.00 secs: 207.43 MB/sec
[...]

>
> I of course don't have the same CPU as you, but it is a representative
> current model: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz
>
> I can, however, imagine that there are scenarios where this is faster,
> particularly if using this in an embedded scenario with a relatively
> low perf main CPU, but a hardware accelerator available.
>
> Based on this though, I'm very reluctant to enable AF_ALG by default
> when building QEMU, because I think it'll likely cause a major perf
> regression for the common case of people with fast CPUs and no
> hardware accelerator.
>
> I think in the immediate term we should add a switch to configure,
> --enable-crypto-afalg, that must be opt-in when building QEMU,
> so that people who know they have a good hardware accelerator
> present can use it, but in the general case we avoid it.
>

OK.

We can take this approach for now.

But some hardware accelerators only support a limited number of algorithms,
so maybe in the next step we need a command-line parameter to specify which
algorithms use the afalg-backend, while the other algorithms still use the
library-backend even when '--enable-crypto-afalg' is set.

Anyway, I'll modify the code as you suggested first.  :)


> For the general case, I think we need to figure out how to make
> direct use of CPU instructions for crypto, e.g. Intel AES-NI. This
> might be possible by using GNUTLS for ciphers (though it lacks
> coverage for all the combinations we want).
>

IIUC, newer gcrypt/nettle will use CPU instructions for crypto if the
CPU supports them.

-- 
Regards,
Longpeng

> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

[Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
Posted by longpeng.mike@gmail.com 6 years, 9 months ago
From: "Longpeng(Mike)" <longpeng2@huawei.com>

 configure                       |  22 ++++
 crypto/Makefile.objs            |   3 +
 crypto/afalg.c                  | 116 +++++++++++++++++++++
 crypto/afalgpriv.h              |  64 ++++++++++++
 crypto/cipher-afalg.c           | 226 ++++++++++++++++++++++++++++++++++++++++
 crypto/cipher-builtin.c         | 125 +++++++++++-----------
 crypto/cipher-gcrypt.c          | 105 ++++++++++---------
 crypto/cipher-nettle.c          |  84 ++++++++-------
 crypto/cipher.c                 |  80 ++++++++++++++
 crypto/cipherpriv.h             |  56 ++++++++++
 crypto/hash-afalg.c             | 214 +++++++++++++++++++++++++++++++++++++
 crypto/hash-gcrypt.c            |  19 ++--
 crypto/hash-glib.c              |  19 ++--
 crypto/hash-nettle.c            |  19 ++--
 crypto/hash.c                   |  30 ++++++
 crypto/hashpriv.h               |  39 +++++++
 crypto/hmac-gcrypt.c            |  42 ++++----
 crypto/hmac-glib.c              |  63 ++++++-----
 crypto/hmac-nettle.c            |  42 ++++----
 crypto/hmac.c                   |  58 +++++++++++
 crypto/hmac.h                   | 166 -----------------------------
 crypto/hmacpriv.h               |  48 +++++++++
 hw/virtio/vhost-backend.c       |   8 +-
 include/crypto/cipher.h         |   1 +
 include/crypto/hmac.h           | 167 +++++++++++++++++++++++++++++
 tests/Makefile.include          |  13 ++-
 tests/benchmark-crypto-cipher.c |  88 ++++++++++++++++
 tests/benchmark-crypto-hash.c   |  67 ++++++++++++
 tests/benchmark-crypto-hmac.c   |  82 +++++++++++++++
 29 files changed, 1654 insertions(+), 412 deletions(-)
 create mode 100644 crypto/afalg.c
 create mode 100644 crypto/afalgpriv.h
 create mode 100644 crypto/cipher-afalg.c
 create mode 100644 crypto/cipherpriv.h
 create mode 100644 crypto/hash-afalg.c
 create mode 100644 crypto/hashpriv.h
 delete mode 100644 crypto/hmac.h
 create mode 100644 crypto/hmacpriv.h
 create mode 100644 include/crypto/hmac.h
 create mode 100644 tests/benchmark-crypto-cipher.c
 create mode 100644 tests/benchmark-crypto-hash.c
 create mode 100644 tests/benchmark-crypto-hmac.c

-- 
1.8.3.1