This returns MD5 checksum of all RAM blocks for migration debugging
as this is way faster than saving the entire RAM to a file and checking
that.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
I am actually wondering if there is an easier way of getting these
checksums and I just do not see it, it cannot be that we fixed all
memory migration bugs :)
---
qapi/misc.json | 27 +++++++++++++++++++++++++++
include/exec/cpu-common.h | 1 +
exec.c | 16 ++++++++++++++++
monitor/qmp-cmds.c | 9 +++++++++
4 files changed, 53 insertions(+)
diff --git a/qapi/misc.json b/qapi/misc.json
index a7fba7230cfa..e7475f30a844 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1194,6 +1194,33 @@
##
{ 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
+##
+# @MemoryChecksum:
+#
+# A string with MD5 checksum of all RAMBlocks.
+#
+# @checksum: the checksum.
+#
+# Since: 3.2.0
+##
+{ 'struct': 'MemoryChecksum',
+ 'data' : { 'checksum': 'str' } }
+
+##
+# @query-memory-checksum:
+#
+# Return the MD5 checksum of all RAMBlocks.
+#
+# Example:
+#
+# -> { "execute": "query-memory-checksum" }
+# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
+#
+# Since: 3.2.0
+##
+{ 'command': 'query-memory-checksum',
+ 'returns': 'MemoryChecksum' }
+
##
# @AddfdInfo:
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index f7dbe75fbc38..15dbf18c2d5d 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -57,6 +57,7 @@ void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
void qemu_ram_unset_idstr(RAMBlock *block);
const char *qemu_ram_get_idstr(RAMBlock *rb);
void *qemu_ram_get_host_addr(RAMBlock *rb);
+gchar *qemu_ram_chksum(void);
ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
bool qemu_ram_is_shared(RAMBlock *rb);
diff --git a/exec.c b/exec.c
index 3e78de3b8f8b..76f7f63cf71b 100644
--- a/exec.c
+++ b/exec.c
@@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
return rb->host;
}
+gchar *qemu_ram_chksum(void)
+{
+ struct RAMBlock *rb;
+ GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
+ gchar *ret;
+
+ RAMBLOCK_FOREACH(rb) {
+ g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
+ qemu_ram_get_used_length(rb));
+ }
+ ret = g_strdup(g_checksum_get_string(chksum));
+ g_checksum_free(chksum);
+
+ return ret;
+}
+
ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
{
return rb->offset;
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index b9ae40eec751..ec52bd82588e 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -413,3 +413,12 @@ MemoryInfo *qmp_query_memory_size_summary(Error **errp)
return mem_info;
}
+
+MemoryChecksum *qmp_query_memory_checksum(Error **errp)
+{
+ MemoryChecksum *chk = g_malloc0(sizeof(MemoryChecksum));
+
+ chk->checksum = qemu_ram_chksum();
+
+ return chk;
+}
--
2.17.1
On 8/21/19 8:16 PM, Alexey Kardashevskiy wrote:
> This returns MD5 checksum of all RAM blocks for migration debugging
> as this is way faster than saving the entire RAM to a file and checking
> that.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>
>
> I am actually wondering if there is an easier way of getting these
> checksums and I just do not see it, it cannot be that we fixed all
> memory migration bugs :)
I'm not sure whether the command itself makes sense, but for the interface:
> +++ b/qapi/misc.json
> @@ -1194,6 +1194,33 @@
> ##
> { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
>
> +##
> +# @MemoryChecksum:
> +#
> +# A string with MD5 checksum of all RAMBlocks.
> +#
> +# @checksum: the checksum.
> +#
> +# Since: 3.2.0
This should be 4.2, not 3.2.
> +##
> +{ 'struct': 'MemoryChecksum',
> + 'data' : { 'checksum': 'str' } }
> +
> +##
> +# @query-memory-checksum:
> +#
> +# Return the MD5 checksum of all RAMBlocks.
> +#
> +# Example:
> +#
> +# -> { "execute": "query-memory-checksum" }
> +# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
> +#
> +# Since: 3.2.0
and again
> +##
> +{ 'command': 'query-memory-checksum',
> + 'returns': 'MemoryChecksum' }
> +
>
> +++ b/exec.c
> @@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
> return rb->host;
> }
>
> +gchar *qemu_ram_chksum(void)
gchar is a pointless glib type. Use 'char' instead.
> +{
> + struct RAMBlock *rb;
> + GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
> + gchar *ret;
> +
> + RAMBLOCK_FOREACH(rb) {
> + g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
> + qemu_ram_get_used_length(rb));
> + }
> + ret = g_strdup(g_checksum_get_string(chksum));
> + g_checksum_free(chksum);
> +
> + return ret;
> +}
How long does this take to run? Is it something where you really want
to block the guest while chewing over the guest's entire memory?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
On 22/08/2019 11:33, Eric Blake wrote:
> On 8/21/19 8:16 PM, Alexey Kardashevskiy wrote:
>> This returns MD5 checksum of all RAM blocks for migration debugging
>> as this is way faster than saving the entire RAM to a file and checking
>> that.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>
>>
>> I am actually wondering if there is an easier way of getting these
>> checksums and I just do not see it, it cannot be that we fixed all
>> memory migration bugs :)
>
> I'm not sure whether the command itself makes sense, but for the interface:
>
>
>> +++ b/qapi/misc.json
>> @@ -1194,6 +1194,33 @@
>> ##
>> { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
>>
>> +##
>> +# @MemoryChecksum:
>> +#
>> +# A string with MD5 checksum of all RAMBlocks.
>> +#
>> +# @checksum: the checksum.
>> +#
>> +# Since: 3.2.0
>
> This should be 4.2, not 3.2.
>
>> +##
>> +{ 'struct': 'MemoryChecksum',
>> + 'data' : { 'checksum': 'str' } }
>> +
>> +##
>> +# @query-memory-checksum:
>> +#
>> +# Return the MD5 checksum of all RAMBlocks.
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "query-memory-checksum" }
>> +# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
>> +#
>> +# Since: 3.2.0
>
> and again
>
>> +##
>> +{ 'command': 'query-memory-checksum',
>> + 'returns': 'MemoryChecksum' }
>> +
>>
>
>> +++ b/exec.c
>> @@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
>> return rb->host;
>> }
>>
>> +gchar *qemu_ram_chksum(void)
>
> gchar is a pointless glib type. Use 'char' instead.
>
>> +{
>> + struct RAMBlock *rb;
>> + GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
>> + gchar *ret;
>> +
>> + RAMBLOCK_FOREACH(rb) {
>> + g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
>> + qemu_ram_get_used_length(rb));
>> + }
>> + ret = g_strdup(g_checksum_get_string(chksum));
>> + g_checksum_free(chksum);
>> +
>> + return ret;
>> +}
>
> How long does this take to run? Is it something where you really want
> to block the guest while chewing over the guest's entire memory?
It is 10-20 times faster than "pmemsave", and blocking the guest is not a
problem here as both guests, source and destination, are stopped
(otherwise the checksum does not make sense).
--
Alexey
Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> This returns MD5 checksum of all RAM blocks for migration debugging
> as this is way faster than saving the entire RAM to a file and checking
> that.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Any particular reason for MD5? Have you measured the other choices
offered by GLib?

I understand you don't need crypto-strength here. Both MD5 and SHA-1
would be bad choices then.
On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>
> > This returns MD5 checksum of all RAM blocks for migration debugging
> > as this is way faster than saving the entire RAM to a file and checking
> > that.
> >
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Any particular reason for MD5? Have you measured the other choices
> offered by GLib?
>
> I understand you don't need crypto-strength here. Both MD5 and SHA-1
> would be bad choices then.

We have a tests/bench-crypto-hash test but it's hardcoded for sha256.
I hacked it to report all algorithms and got these results for varying
input chunk sizes:

/crypto/hash/md5/speed-512: 519.12 MB/sec OK
/crypto/hash/md5/speed-1024: 560.39 MB/sec OK
/crypto/hash/md5/speed-4096: 591.39 MB/sec OK
/crypto/hash/md5/speed-16384: 576.46 MB/sec OK
/crypto/hash/sha1/speed-512: 443.12 MB/sec OK
/crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
/crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
/crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
/crypto/hash/sha224/speed-512: 221.90 MB/sec OK
/crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
/crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
/crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
/crypto/hash/sha256/speed-512: 222.75 MB/sec OK
/crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
/crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
/crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
/crypto/hash/sha384/speed-512: 322.73 MB/sec OK
/crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
/crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
/crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
/crypto/hash/sha512/speed-512: 320.62 MB/sec OK
/crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
/crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
/crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
/crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
/crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
/crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
/crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK

IOW, md5 is clearly the quickest, by a considerable margin over
SHA256/512. SHA1 is slightly slower.

Assuming that we document that this command is intentionally
*not* trying to guarantee collision resistance we're ok.

In fact we should not document what kind of checksum is
reported by query-memory-checksum. The impl should be a black
box from user's POV.

If we're just aiming for a debugging tool to detect accidental
corruption, could we even just ignore cryptographic hashes
entirely and do a crc32 - that'd be way faster than even
md5.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
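
For reference, a minimal sketch of the crc32 idea in C. This is not code from the patch: `struct mem_block`, `crc32_update()` and `checksum_blocks()` are hypothetical names, and the bitwise loop is only a dependency-free stand-in for a library routine (it uses the common reflected polynomial 0xEDB88320 and will be far slower than an optimized implementation). It shows one running CRC being carried across several memory blocks, the way the patch carries one GChecksum across all RAMBlocks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block: a host pointer plus a length. */
struct mem_block {
    const void *host;
    size_t len;
};

/*
 * Bitwise CRC-32 over the reflected polynomial 0xEDB88320.
 * Pass 0 as the initial value; feed the result back in to continue
 * the same checksum over the next buffer.
 */
static uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Fold every block into a single running CRC, in array order. */
static uint32_t checksum_blocks(const struct mem_block *blocks, size_t n)
{
    uint32_t crc = 0;

    for (size_t i = 0; i < n; i++) {
        crc = crc32_update(crc, blocks[i].host, blocks[i].len);
    }
    return crc;
}
```

Because the result depends on block order, source and destination would have to walk their blocks in the same order for the two values to be comparable.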
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>
>> > This returns MD5 checksum of all RAM blocks for migration debugging
>> > as this is way faster than saving the entire RAM to a file and checking
>> > that.
>> >
>> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> Any particular reason for MD5? Have you measured the other choices
>> offered by GLib?
>>
>> I understand you don't need crypto-strength here. Both MD5 and SHA-1
>> would be bad choices then.
>
> We have a tests/bench-crypto-hash test but it's hardcoded for sha256.
> I hacked it to report all algorithms and got these results for varying
> input chunk sizes:
>
> /crypto/hash/md5/speed-512: 519.12 MB/sec OK
> /crypto/hash/md5/speed-1024: 560.39 MB/sec OK
> /crypto/hash/md5/speed-4096: 591.39 MB/sec OK
> /crypto/hash/md5/speed-16384: 576.46 MB/sec OK
> /crypto/hash/sha1/speed-512: 443.12 MB/sec OK
> /crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
> /crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
> /crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
> /crypto/hash/sha224/speed-512: 221.90 MB/sec OK
> /crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
> /crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
> /crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
> /crypto/hash/sha256/speed-512: 222.75 MB/sec OK
> /crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
> /crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
> /crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
> /crypto/hash/sha384/speed-512: 322.73 MB/sec OK
> /crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
> /crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
> /crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
> /crypto/hash/sha512/speed-512: 320.62 MB/sec OK
> /crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
> /crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
> /crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
> /crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
> /crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
> /crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
> /crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK
>
>
> IOW, md5 is clearly the quickest, by a considerable margin over
> SHA256/512. SHA1 is slightly slower.
>
> Assuming that we document that this command is intentionally
> *not* trying to guarantee collision resistance we're ok.
>
> In fact we should not document what kind of checksum is
> reported by query-memory-checksum. The impl should be a black
> box from user's POV.
>
> If we're just aiming for a debugging tool to detect accidental
> corruption, could we even just ignore cryptographic hashes
> entirely and do a crc32 - that'd be way faster than even
> md5.

Good points.

The doc strings should spell out "for debugging", like the commit
message does, and both should spell out "weak collision resistance".

I can't find CRC-32 in GLib, but zlib appears to provide it:
http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html

Care to compare its speed to MD5?
On Fri, Aug 23, 2019 at 07:49:31AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
> > On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
> >> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> >>
> >> > This returns MD5 checksum of all RAM blocks for migration debugging
> >> > as this is way faster than saving the entire RAM to a file and checking
> >> > that.
> >> >
> >> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>
> >> Any particular reason for MD5? Have you measured the other choices
> >> offered by GLib?
> >>
> >> I understand you don't need crypto-strength here. Both MD5 and SHA-1
> >> would be bad choices then.
> >
> > We have a tests/bench-crypto-hash test but it's hardcoded for sha256.
> > I hacked it to report all algorithms and got these results for varying
> > input chunk sizes:
> >
> > /crypto/hash/md5/speed-512: 519.12 MB/sec OK
> > /crypto/hash/md5/speed-1024: 560.39 MB/sec OK
> > /crypto/hash/md5/speed-4096: 591.39 MB/sec OK
> > /crypto/hash/md5/speed-16384: 576.46 MB/sec OK
> > /crypto/hash/sha1/speed-512: 443.12 MB/sec OK
> > /crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
> > /crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
> > /crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
> > /crypto/hash/sha224/speed-512: 221.90 MB/sec OK
> > /crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
> > /crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
> > /crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
> > /crypto/hash/sha256/speed-512: 222.75 MB/sec OK
> > /crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
> > /crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
> > /crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
> > /crypto/hash/sha384/speed-512: 322.73 MB/sec OK
> > /crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
> > /crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
> > /crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
> > /crypto/hash/sha512/speed-512: 320.62 MB/sec OK
> > /crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
> > /crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
> > /crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
> > /crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
> > /crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
> > /crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
> > /crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK
> >
> >
> > IOW, md5 is clearly the quickest, by a considerable margin over
> > SHA256/512. SHA1 is slightly slower.
> >
> > Assuming that we document that this command is intentionally
> > *not* trying to guarantee collision resistance we're ok.
> >
> > In fact we should not document what kind of checksum is
> > reported by query-memory-checksum. The impl should be a black
> > box from user's POV.
> >
> > If we're just aiming for a debugging tool to detect accidental
> > corruption, could we even just ignore cryptographic hashes
> > entirely and do a crc32 - that'd be way faster than even
> > md5.
>
> Good points.
>
> The doc strings should spell out "for debugging", like the commit
> message does, and both should spell out "weak collision resistance".
>
> I can't find CRC-32 in GLib, but zlib appears to provide it:
> http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html
>
> Care to compare its speed to MD5?

I hacked the code to use zlib's crc32 impl and got these for comparison:

/crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
/crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
/crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
/crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
/crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
/crypto/hash/md5/speed-512: 476.27 MB/sec OK
/crypto/hash/md5/speed-1024: 517.16 MB/sec OK
/crypto/hash/md5/speed-4096: 554.70 MB/sec OK
/crypto/hash/md5/speed-16384: 564.44 MB/sec OK
/crypto/hash/md5/speed-1048576: 566.78 MB/sec OK

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
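
To put these throughput figures in terms of the earlier "how long does this take to run?" question, here is a back-of-the-envelope helper in C. It is only a sketch: `checksum_seconds()` is a hypothetical name, the 16 GiB guest size in the usage note is an arbitrary example, and it assumes the benchmark's "MB" can be read as MiB and that hashing guest RAM would reach the same single-threaded throughput as the benchmark.

```c
/*
 * Estimated wall-clock seconds to checksum ram_mib MiB of guest RAM
 * at a sustained throughput of mib_per_sec MiB per second.
 */
static double checksum_seconds(double ram_mib, double mib_per_sec)
{
    return ram_mib / mib_per_sec;
}

/* How many times faster algorithm A is than algorithm B. */
static double speedup(double a_mib_per_sec, double b_mib_per_sec)
{
    return a_mib_per_sec / b_mib_per_sec;
}
```

With the 1048576-byte chunk results above, `checksum_seconds(16384.0, 566.78)` comes out at roughly 29 seconds for md5 and `checksum_seconds(16384.0, 1165.68)` at roughly 14 seconds for crc32, and `speedup(1165.68, 566.78)` is a little over 2.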
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Aug 23, 2019 at 07:49:31AM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>
>> > On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
>> >> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>> >>
>> >> > This returns MD5 checksum of all RAM blocks for migration debugging
>> >> > as this is way faster than saving the entire RAM to a file and checking
>> >> > that.
>> >> >
>> >> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> >>
>> >> Any particular reason for MD5? Have you measured the other choices
>> >> offered by GLib?
>> >>
>> >> I understand you don't need crypto-strength here. Both MD5 and SHA-1
>> >> would be bad choices then.
>> >
>> > We have a tests/bench-crypto-hash test but it's hardcoded for sha256.
>> > I hacked it to report all algorithms and got these results for varying
>> > input chunk sizes:
>> >
>> > /crypto/hash/md5/speed-512: 519.12 MB/sec OK
>> > /crypto/hash/md5/speed-1024: 560.39 MB/sec OK
>> > /crypto/hash/md5/speed-4096: 591.39 MB/sec OK
>> > /crypto/hash/md5/speed-16384: 576.46 MB/sec OK
>> > /crypto/hash/sha1/speed-512: 443.12 MB/sec OK
>> > /crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
>> > /crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
>> > /crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
>> > /crypto/hash/sha224/speed-512: 221.90 MB/sec OK
>> > /crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
>> > /crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
>> > /crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
>> > /crypto/hash/sha256/speed-512: 222.75 MB/sec OK
>> > /crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
>> > /crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
>> > /crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
>> > /crypto/hash/sha384/speed-512: 322.73 MB/sec OK
>> > /crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
>> > /crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
>> > /crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
>> > /crypto/hash/sha512/speed-512: 320.62 MB/sec OK
>> > /crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
>> > /crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
>> > /crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
>> > /crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
>> > /crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
>> > /crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
>> > /crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK
>> >
>> >
>> > IOW, md5 is clearly the quickest, by a considerable margin over
>> > SHA256/512. SHA1 is slightly slower.
>> >
>> > Assuming that we document that this command is intentionally
>> > *not* trying to guarantee collision resistance we're ok.
>> >
>> > In fact we should not document what kind of checksum is
>> > reported by query-memory-checksum. The impl should be a black
>> > box from user's POV.
>> >
>> > If we're just aiming for a debugging tool to detect accidental
>> > corruption, could we even just ignore cryptographic hashes
>> > entirely and do a crc32 - that'd be way faster than even
>> > md5.
>>
>> Good points.
>>
>> The doc strings should spell out "for debugging", like the commit
>> message does, and both should spell out "weak collision resistance".
>>
>> I can't find CRC-32 in GLib, but zlib appears to provide it:
>> http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html
>>
>> Care to compare its speed to MD5?
>
> I hacked the code to use zlib's crc32 impl and got these for comparison:
>
> /crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
> /crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
> /crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
> /crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
> /crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
> /crypto/hash/md5/speed-512: 476.27 MB/sec OK
> /crypto/hash/md5/speed-1024: 517.16 MB/sec OK
> /crypto/hash/md5/speed-4096: 554.70 MB/sec OK
> /crypto/hash/md5/speed-16384: 564.44 MB/sec OK
> /crypto/hash/md5/speed-1048576: 566.78 MB/sec OK

Twice as fast. Alexey, what do you think?
On 23/08/2019 21:41, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
>> On Fri, Aug 23, 2019 at 07:49:31AM +0200, Markus Armbruster wrote:
>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>
>>>> On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
>>>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>>>
>>>>>> This returns MD5 checksum of all RAM blocks for migration debugging
>>>>>> as this is way faster than saving the entire RAM to a file and checking
>>>>>> that.
>>>>>>
>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>>
>>>>> Any particular reason for MD5? Have you measured the other choices
>>>>> offered by GLib?
>>>>>
>>>>> I understand you don't need crypto-strength here. Both MD5 and SHA-1
>>>>> would be bad choices then.
>>>>
>>>> We have a tests/bench-crypto-hash test but it's hardcoded for sha256.
>>>> I hacked it to report all algorithms and got these results for varying
>>>> input chunk sizes:
>>>>
>>>> /crypto/hash/md5/speed-512: 519.12 MB/sec OK
>>>> /crypto/hash/md5/speed-1024: 560.39 MB/sec OK
>>>> /crypto/hash/md5/speed-4096: 591.39 MB/sec OK
>>>> /crypto/hash/md5/speed-16384: 576.46 MB/sec OK
>>>> /crypto/hash/sha1/speed-512: 443.12 MB/sec OK
>>>> /crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
>>>> /crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
>>>> /crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
>>>> /crypto/hash/sha224/speed-512: 221.90 MB/sec OK
>>>> /crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
>>>> /crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
>>>> /crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
>>>> /crypto/hash/sha256/speed-512: 222.75 MB/sec OK
>>>> /crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
>>>> /crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
>>>> /crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
>>>> /crypto/hash/sha384/speed-512: 322.73 MB/sec OK
>>>> /crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
>>>> /crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
>>>> /crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
>>>> /crypto/hash/sha512/speed-512: 320.62 MB/sec OK
>>>> /crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
>>>> /crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
>>>> /crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
>>>> /crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
>>>> /crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
>>>> /crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
>>>> /crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK
>>>>
>>>>
>>>> IOW, md5 is clearly the quickest, by a considerable margin over
>>>> SHA256/512. SHA1 is slightly slower.
>>>>
>>>> Assuming that we document that this command is intentionally
>>>> *not* trying to guarantee collision resistance we're ok.
>>>>
>>>> In fact we should not document what kind of checksum is
>>>> reported by query-memory-checksum. The impl should be a black
>>>> box from user's POV.
>>>>
>>>> If we're just aiming for a debugging tool to detect accidental
>>>> corruption, could we even just ignore cryptographic hashes
>>>> entirely and do a crc32 - that'd be way faster than even
>>>> md5.
>>>
>>> Good points.
>>>
>>> The doc strings should spell out "for debugging", like the commit
>>> message does, and both should spell out "weak collision resistance".
>>>
>>> I can't find CRC-32 in GLib, but zlib appears to provide it:
>>> http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html
>>>
>>> Care to compare its speed to MD5?
>>
>> I hacked the code to use zlib's crc32 impl and got these for comparison:
>>
>> /crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
>> /crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
>> /crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
>> /crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
>> /crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
>> /crypto/hash/md5/speed-512: 476.27 MB/sec OK
>> /crypto/hash/md5/speed-1024: 517.16 MB/sec OK
>> /crypto/hash/md5/speed-4096: 554.70 MB/sec OK
>> /crypto/hash/md5/speed-16384: 564.44 MB/sec OK
>> /crypto/hash/md5/speed-1048576: 566.78 MB/sec OK
>
> Twice as fast. Alexey, what do you think?

This is even better. TBH I picked md5 as I could not spot a crc32 helper
in the first minute (I can see it now) and MD5 felt the fastest
available from GLib :)

I'll probably add start..end range(s) and repost. Thanks for all these
numbers and reviews.

--
Alexey
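
The repost is not part of this thread, so the following is only a guess at what a "start..end range" could mean: a hypothetical C helper that intersects a requested half-open range [start, end) with one block occupying [blk_offset, blk_offset + blk_len). The name `clamp_range_to_block()` and all of its parameters are invented for illustration; nothing here is taken from QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Intersect [start, end) with the block [blk_offset, blk_offset + blk_len).
 * On overlap, return true and report how many bytes to skip from the
 * start of the block (*skip) and how many bytes to checksum (*len).
 * Return false, leaving the outputs untouched, if there is no overlap.
 */
static bool clamp_range_to_block(uint64_t start, uint64_t end,
                                 uint64_t blk_offset, uint64_t blk_len,
                                 uint64_t *skip, uint64_t *len)
{
    uint64_t blk_end = blk_offset + blk_len;
    uint64_t lo = start > blk_offset ? start : blk_offset;
    uint64_t hi = end < blk_end ? end : blk_end;

    if (lo >= hi) {
        return false;
    }
    *skip = lo - blk_offset;
    *len = hi - lo;
    return true;
}
```

A per-block loop could call this for each block and feed only the `len` bytes starting `skip` bytes into the block's host mapping to the checksum update, skipping blocks for which it returns false.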