The XTS cipher mode is significantly slower than CBC mode. This series
approximately doubles the XTS performance which will improve the I/O
rate for LUKS disks.

Daniel P. Berrangé (6):
  crypto: expand algorithm coverage for cipher benchmark
  crypto: remove code duplication in tweak encrypt/decrypt
  crypto: introduce a xts_uint128 data type
  crypto: convert xts_tweak_encdec to use xts_uint128 type
  crypto: convert xts_mult_x to use xts_uint128 type
  crypto: annotate xts_tweak_encdec as inlineable

 crypto/xts.c                    | 147 ++++++++++++++-----------------
 tests/benchmark-crypto-cipher.c | 149 +++++++++++++++++++++++++++-----
 2 files changed, 191 insertions(+), 105 deletions(-)

--
2.17.1
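For readers who have not looked at the tweak update before, here is a minimal sketch of a multiply-by-x in GF(2^128) on a tweak held as two 64-bit words. It is an illustration of the general idea behind the xts_uint128 conversion, not the code from crypto/xts.c: the type name sketch_uint128, its field layout, and the function sketch_mult_x are hypothetical, and the sketch assumes the 16-byte little-endian tweak has already been loaded into host-order words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit tweak stored as two 64-bit words.
 * Name and layout are assumptions for illustration only. */
typedef struct {
    uint64_t lo;   /* bits 0..63 of the tweak */
    uint64_t hi;   /* bits 64..127 of the tweak */
} sketch_uint128;

/* Multiply the tweak by x in GF(2^128). The field polynomial is
 * x^128 + x^7 + x^2 + x + 1, so a bit shifted out of position 127
 * is folded back in as 0x87 in the low byte. */
static void sketch_mult_x(sketch_uint128 *t)
{
    assert(t != NULL);

    uint64_t carry = t->hi >> 63;          /* bit 127 before the shift */

    t->hi = (t->hi << 1) | (t->lo >> 63);  /* bit 63 moves into bit 64 */
    t->lo <<= 1;
    if (carry) {
        t->lo ^= 0x87;                     /* reduce modulo the polynomial */
    }
}
```

Two shifts and a conditional XOR replace a byte-at-a-time loop over 16 bytes, which is one plausible reason a word-based representation would be faster.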
Hi
On Tue, Oct 9, 2018 at 4:57 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The XTS cipher mode is significantly slower than CBC mode. This series
> approximately doubles the XTS performance which will improve the I/O
> rate for LUKS disks.
>
> Daniel P. Berrangé (6):
> crypto: expand algorithm coverage for cipher benchmark
> crypto: remove code duplication in tweak encrypt/decrypt
> crypto: introduce a xts_uint128 data type
> crypto: convert xts_tweak_encdec to use xts_uint128 type
> crypto: convert xts_mult_x to use xts_uint128 type
> crypto: annotate xts_tweak_encdec as inlineable
>
> crypto/xts.c | 147 ++++++++++++++-----------------
> tests/benchmark-crypto-cipher.c | 149 +++++++++++++++++++++++++++-----
> 2 files changed, 191 insertions(+), 105 deletions(-)
By using a constant amount of data to process, it's easier to measure
performance with perf stat:
diff --git a/tests/benchmark-crypto-cipher.c b/tests/benchmark-crypto-cipher.c
index a8325a9510..32a19987e6 100644
--- a/tests/benchmark-crypto-cipher.c
+++ b/tests/benchmark-crypto-cipher.c
@@ -65,7 +65,7 @@ static void test_cipher_speed(size_t chunk_size,
chunk_size,
&err) == 0);
total += chunk_size;
- } while (g_test_timer_elapsed() < 1.0);
+ } while (total / MiB < 500);
total /= MiB;
g_print("Enc chunk %zu bytes ", chunk_size);
@@ -80,7 +80,7 @@ static void test_cipher_speed(size_t chunk_size,
chunk_size,
&err) == 0);
total += chunk_size;
- } while (g_test_timer_elapsed() < 1.0);
+ } while (total / MiB < 500);
On my laptop, before your series:
       3701.625051      task-clock:u (msec)       #    0.997 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
               438      page-faults:u             #    0.118 K/sec
    10,823,305,761      cycles:u                  #    2.924 GHz
    29,774,419,538      instructions:u            #    2.75  insn per cycle
     4,919,267,782      branches:u                # 1328.948 M/sec
        32,923,105      branch-misses:u           #    0.67% of all branches

       3.712998264 seconds time elapsed
After:
       2151.201355      task-clock:u (msec)       #    1.000 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
               431      page-faults:u             #    0.200 K/sec
     7,073,869,618      cycles:u                  #    3.288 GHz
     8,573,595,534      instructions:u            #    1.21  insn per cycle
     1,576,926,668      branches:u                #  733.045 M/sec
           148,987      branch-misses:u           #    0.01% of all branches

       2.151520872 seconds time elapsed
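
To put the two perf stat runs side by side, here is a small sketch that derives ratios from the counters quoted above. The struct and function names are made up for illustration; the numbers are copied from the output. Note that the runs cover the whole benchmark binary with the fixed 500 MiB loops, so the ratios describe that run as a whole rather than any single chunk size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the counters of interest from one run. */
struct perf_sample {
    double   task_clock_ms;
    uint64_t cycles;
    uint64_t instructions;
    uint64_t branch_misses;
};

/* Values copied from the perf stat output in this mail. */
static const struct perf_sample run_before = {
    3701.625051, 10823305761ULL, 29774419538ULL, 32923105ULL
};
static const struct perf_sample run_after = {
    2151.201355, 7073869618ULL, 8573595534ULL, 148987ULL
};

/* CPU time ratio: values above 1.0 mean the second run used less CPU time. */
static double cpu_time_ratio(const struct perf_sample *b,
                             const struct perf_sample *a)
{
    assert(b != NULL && a != NULL && a->task_clock_ms > 0.0);
    return b->task_clock_ms / a->task_clock_ms;
}

/* How many times fewer instructions the second run retired. */
static double insn_ratio(const struct perf_sample *b,
                         const struct perf_sample *a)
{
    assert(b != NULL && a != NULL && a->instructions > 0);
    return (double)b->instructions / (double)a->instructions;
}

/* Instructions per cycle, as a cross-check against perf's own column. */
static double insn_per_cycle(const struct perf_sample *s)
{
    assert(s != NULL && s->cycles > 0);
    return (double)s->instructions / (double)s->cycles;
}
```

With these inputs the CPU time ratio comes out at roughly 1.7 and the instruction count drops by roughly 3.5 times, while instructions per cycle falls from about 2.75 to about 1.21, matching the perf columns.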
--
Marc-André Lureau
On Tue, Oct 09, 2018 at 05:59:46PM +0400, Marc-André Lureau wrote:
> Hi
>
> On Tue, Oct 9, 2018 at 4:57 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > The XTS cipher mode is significantly slower than CBC mode. This series
> > approximately doubles the XTS performance which will improve the I/O
> > rate for LUKS disks.
> >
> > Daniel P. Berrangé (6):
> >   crypto: expand algorithm coverage for cipher benchmark
> >   crypto: remove code duplication in tweak encrypt/decrypt
> >   crypto: introduce a xts_uint128 data type
> >   crypto: convert xts_tweak_encdec to use xts_uint128 type
> >   crypto: convert xts_mult_x to use xts_uint128 type
> >   crypto: annotate xts_tweak_encdec as inlineable
> >
> >  crypto/xts.c                    | 147 ++++++++++++++-----------------
> >  tests/benchmark-crypto-cipher.c | 149 +++++++++++++++++++++++++++-----
> >  2 files changed, 191 insertions(+), 105 deletions(-)
>
> By using a constant amount of data to process, it's easier to measure
> performance with perf stat:

The problem is that the different encryption modes have wildly
different performance, e.g. while XTS gets 400 MB/s, ECB gets
3000 MB/s. I want the test to run long enough to minimize the
noise, and picking a data size large enough for best ECB
perf while not being excessively large for XTS is hard. Thus
I prefer to have a fixed execution time for each test.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
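
The two loop termination strategies under discussion can be sketched as follows. This is not the code in tests/benchmark-crypto-cipher.c: the helper names, the work_fn callback type, and the use of clock_gettime() in place of the GLib test timer are all assumptions made to keep the example self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_MIB (1024.0 * 1024.0)

/* Hypothetical callback that processes one chunk of data. */
typedef void (*work_fn)(void *opaque, size_t chunk_size);

/* Trivial sample workload: only counts how often it was called. */
static void count_calls(void *opaque, size_t chunk_size)
{
    (void)chunk_size;
    (*(uint64_t *)opaque)++;
}

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Fixed time: keep going until min_seconds have passed, whatever the
 * throughput. Fast and slow modes get the same measurement window. */
static double bench_fixed_time(work_fn fn, void *opaque,
                               size_t chunk_size, double min_seconds)
{
    double start = now_seconds();
    double elapsed;
    uint64_t total = 0;

    do {
        fn(opaque, chunk_size);
        total += chunk_size;
        elapsed = now_seconds() - start;
    } while (elapsed < min_seconds);

    return elapsed > 0.0 ? ((double)total / SKETCH_MIB) / elapsed : 0.0;
}

/* Fixed size: process at least total_bytes. The amount of work is the
 * same on every run, which makes perf stat counters comparable. */
static double bench_fixed_size(work_fn fn, void *opaque,
                               size_t chunk_size, uint64_t total_bytes)
{
    double start = now_seconds();
    uint64_t total = 0;

    do {
        fn(opaque, chunk_size);
        total += chunk_size;
    } while (total < total_bytes);

    double elapsed = now_seconds() - start;
    /* Returns 0.0 if the clock did not advance during a very short run. */
    return elapsed > 0.0 ? ((double)total / SKETCH_MIB) / elapsed : 0.0;
}
```

The fixed-time variant reads the clock on every iteration, which adds a little overhead for small chunk sizes. The fixed-size variant avoids that but needs a byte count that suits both the fastest and the slowest mode, which is the difficulty described above.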
Hi

On Tue, Oct 9, 2018 at 6:13 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Tue, Oct 09, 2018 at 05:59:46PM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Tue, Oct 9, 2018 at 4:57 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > The XTS cipher mode is significantly slower than CBC mode. This series
> > > approximately doubles the XTS performance which will improve the I/O
> > > rate for LUKS disks.
> > >
> > > Daniel P. Berrangé (6):
> > >   crypto: expand algorithm coverage for cipher benchmark
> > >   crypto: remove code duplication in tweak encrypt/decrypt
> > >   crypto: introduce a xts_uint128 data type
> > >   crypto: convert xts_tweak_encdec to use xts_uint128 type
> > >   crypto: convert xts_mult_x to use xts_uint128 type
> > >   crypto: annotate xts_tweak_encdec as inlineable
> > >
> > >  crypto/xts.c                    | 147 ++++++++++++++-----------------
> > >  tests/benchmark-crypto-cipher.c | 149 +++++++++++++++++++++++++++-----
> > >  2 files changed, 191 insertions(+), 105 deletions(-)
> >
> > By using a constant amount of data to process, it's easier to measure
> > performance with perf stat:
>
> The problem is that the different encryption modes have wildly
> different performance, e.g. while XTS gets 400 MB/s, ECB gets
> 3000 MB/s. I want the test to run long enough to minimize the
> noise, and picking a data size large enough for best ECB
> perf while not being excessively large for XTS is hard. Thus
> I prefer to have a fixed execution time for each test.

I understand, I was just giving you some nice numbers to back your
patches ;)

OTOH, I think a fixed-size workload is more reliable for a benchmark,
even if the test runs quickly. I wouldn't rely on the current
benchmark results; they are quite unpredictable on my system.

>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
Marc-André Lureau