qapi: Add query-memory-checksum

[Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Alexey Kardashevskiy 6 years, 5 months ago

This returns the MD5 checksum of all RAM blocks for migration debugging,
as this is much faster than saving the entire RAM to a file and checking
that.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---


I am actually wondering if there is an easier way of getting these
checksums that I just do not see; it cannot be that we fixed all
memory migration bugs :)


---
 qapi/misc.json            | 27 +++++++++++++++++++++++++++
 include/exec/cpu-common.h |  1 +
 exec.c                    | 16 ++++++++++++++++
 monitor/qmp-cmds.c        |  9 +++++++++
 4 files changed, 53 insertions(+)

diff --git a/qapi/misc.json b/qapi/misc.json
index a7fba7230cfa..e7475f30a844 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1194,6 +1194,33 @@
 ##
 { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
 
+##
+# @MemoryChecksum:
+#
+# A string with MD5 checksum of all RAMBlocks.
+#
+# @checksum: the checksum.
+#
+# Since: 3.2.0
+##
+{ 'struct': 'MemoryChecksum',
+  'data'  : { 'checksum': 'str' } }
+
+##
+# @query-memory-checksum:
+#
+# Return the MD5 checksum of all RAMBlocks.
+#
+# Example:
+#
+# -> { "execute": "query-memory-checksum" }
+# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
+#
+# Since: 3.2.0
+##
+{ 'command': 'query-memory-checksum',
+  'returns': 'MemoryChecksum' }
+
 
 ##
 # @AddfdInfo:
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index f7dbe75fbc38..15dbf18c2d5d 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -57,6 +57,7 @@ void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
 void qemu_ram_unset_idstr(RAMBlock *block);
 const char *qemu_ram_get_idstr(RAMBlock *rb);
 void *qemu_ram_get_host_addr(RAMBlock *rb);
+gchar *qemu_ram_chksum(void);
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
 ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
 bool qemu_ram_is_shared(RAMBlock *rb);
diff --git a/exec.c b/exec.c
index 3e78de3b8f8b..76f7f63cf71b 100644
--- a/exec.c
+++ b/exec.c
@@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
     return rb->host;
 }
 
+gchar *qemu_ram_chksum(void)
+{
+    struct RAMBlock *rb;
+    GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
+    gchar *ret;
+
+    RAMBLOCK_FOREACH(rb) {
+        g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
+                          qemu_ram_get_used_length(rb));
+    }
+    ret = g_strdup(g_checksum_get_string(chksum));
+    g_checksum_free(chksum);
+
+    return ret;
+}
+
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
 {
     return rb->offset;
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index b9ae40eec751..ec52bd82588e 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -413,3 +413,12 @@ MemoryInfo *qmp_query_memory_size_summary(Error **errp)
 
     return mem_info;
 }
+
+MemoryChecksum *qmp_query_memory_checksum(Error **errp)
+{
+    MemoryChecksum *chk = g_malloc0(sizeof(MemoryChecksum));
+
+    chk->checksum = qemu_ram_chksum();
+
+    return chk;
+}
-- 
2.17.1

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Eric Blake 6 years, 5 months ago

On 8/21/19 8:16 PM, Alexey Kardashevskiy wrote:
> This returns MD5 checksum of all RAM blocks for migration debugging
> as this is way faster than saving the entire RAM to a file and checking
> that.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> 
> I am actually wondering if there is an easier way of getting these
> checksums and I just do not see it, it cannot be that we fixed all
> memory migration bugs :)

I'm not sure whether the command itself makes sense, but for the interface:


> +++ b/qapi/misc.json
> @@ -1194,6 +1194,33 @@
>  ##
>  { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
>  
> +##
> +# @MemoryChecksum:
> +#
> +# A string with MD5 checksum of all RAMBlocks.
> +#
> +# @checksum: the checksum.
> +#
> +# Since: 3.2.0

This should be 4.2, not 3.2.

> +##
> +{ 'struct': 'MemoryChecksum',
> +  'data'  : { 'checksum': 'str' } }
> +
> +##
> +# @query-memory-checksum:
> +#
> +# Return the MD5 checksum of all RAMBlocks.
> +#
> +# Example:
> +#
> +# -> { "execute": "query-memory-checksum" }
> +# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
> +#
> +# Since: 3.2.0

and again

> +##
> +{ 'command': 'query-memory-checksum',
> +  'returns': 'MemoryChecksum' }
> +
>  

> +++ b/exec.c
> @@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
>      return rb->host;
>  }
>  
> +gchar *qemu_ram_chksum(void)

gchar is a pointless glib type.  Use 'char' instead.

> +{
> +    struct RAMBlock *rb;
> +    GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
> +    gchar *ret;
> +
> +    RAMBLOCK_FOREACH(rb) {
> +        g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
> +                          qemu_ram_get_used_length(rb));
> +    }
> +    ret = g_strdup(g_checksum_get_string(chksum));
> +    g_checksum_free(chksum);
> +
> +    return ret;
> +}

How long does this take to run?  Is it something where you really want
to block the guest while chewing over the guest's entire memory?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Alexey Kardashevskiy 6 years, 5 months ago


On 22/08/2019 11:33, Eric Blake wrote:
> How long does this take to run?  Is it something where you really want
> to block the guest while chewing over the guest's entire memory?


It is 10-20 times faster than "pmemsave", and blocking the guest is not a
problem here because both guests, source and destination, are stopped
(otherwise comparing the checksums would not make sense).



-- 
Alexey

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Markus Armbruster 6 years, 5 months ago

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> This returns MD5 checksum of all RAM blocks for migration debugging
> as this is way faster than saving the entire RAM to a file and checking
> that.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Any particular reason for MD5?  Have you measured the other choices
offered by GLib?

I understand you don't need crypto-strength here.  Both MD5 and SHA-1
would be bad choices then.

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Daniel P. Berrangé 6 years, 5 months ago

On Thu, Aug 22, 2019 at 04:16:53PM +0200, Markus Armbruster wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
> > This returns MD5 checksum of all RAM blocks for migration debugging
> > as this is way faster than saving the entire RAM to a file and checking
> > that.
> >
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Any particular reason for MD5?  Have you measured the other choices
> offered by GLib?
> 
> I understand you don't need crypto-strength here.  Both MD5 and SHA-1
> would be bad choices then.

We have a tests/bench-crypto-hash test, but it's hardcoded for sha256.
I hacked it to report all algorithms and got these results for varying
input chunk sizes:

/crypto/hash/md5/speed-512: 519.12 MB/sec OK
/crypto/hash/md5/speed-1024: 560.39 MB/sec OK
/crypto/hash/md5/speed-4096: 591.39 MB/sec OK
/crypto/hash/md5/speed-16384: 576.46 MB/sec OK
/crypto/hash/sha1/speed-512: 443.12 MB/sec OK
/crypto/hash/sha1/speed-1024: 518.82 MB/sec OK
/crypto/hash/sha1/speed-4096: 555.60 MB/sec OK
/crypto/hash/sha1/speed-16384: 568.16 MB/sec OK
/crypto/hash/sha224/speed-512: 221.90 MB/sec OK
/crypto/hash/sha224/speed-1024: 239.79 MB/sec OK
/crypto/hash/sha224/speed-4096: 269.37 MB/sec OK
/crypto/hash/sha224/speed-16384: 274.87 MB/sec OK
/crypto/hash/sha256/speed-512: 222.75 MB/sec OK
/crypto/hash/sha256/speed-1024: 253.25 MB/sec OK
/crypto/hash/sha256/speed-4096: 272.80 MB/sec OK
/crypto/hash/sha256/speed-16384: 275.59 MB/sec OK
/crypto/hash/sha384/speed-512: 322.73 MB/sec OK
/crypto/hash/sha384/speed-1024: 369.84 MB/sec OK
/crypto/hash/sha384/speed-4096: 406.71 MB/sec OK
/crypto/hash/sha384/speed-16384: 417.87 MB/sec OK
/crypto/hash/sha512/speed-512: 320.62 MB/sec OK
/crypto/hash/sha512/speed-1024: 361.93 MB/sec OK
/crypto/hash/sha512/speed-4096: 404.91 MB/sec OK
/crypto/hash/sha512/speed-16384: 418.53 MB/sec OK
/crypto/hash/ripemd160/speed-512: 226.45 MB/sec OK
/crypto/hash/ripemd160/speed-1024: 239.25 MB/sec OK
/crypto/hash/ripemd160/speed-4096: 251.31 MB/sec OK
/crypto/hash/ripemd160/speed-16384: 255.01 MB/sec OK


IOW, md5 is clearly the quickest, by a considerable margin over
SHA256/512. SHA1 is slightly slower.

Assuming that we document that this command is intentionally
*not* trying to guarantee collision resistance, we're ok.

In fact, we should not document what kind of checksum is
reported by query-memory-checksum. The impl should be a black
box from the user's POV.

If we're just aiming for a debugging tool to detect accidental
corruption, could we even just ignore cryptographic hashes
entirely and do a crc32? That'd be way faster than even
md5.
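
For illustration only, here is a minimal sketch of what a non-cryptographic
variant could look like: a CRC-32 folded incrementally over a list of
blocks, much as the patch loops over RAMBlocks. The names struct mem_block,
crc32_update() and checksum_blocks() are hypothetical and not QEMU code, and
the bitwise loop is a slow, dependency-free stand-in for an optimized
implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block: host pointer plus used length. */
struct mem_block {
    const void *host;
    size_t used_length;
};

/*
 * Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the common
 * zlib/Ethernet variant).  Pass 0 as the initial value, then feed the
 * previous result back in to continue over further data.
 */
uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Shift out one bit; XOR in the polynomial if it was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Fold every block into one running checksum, in list order. */
uint32_t checksum_blocks(const struct mem_block *blocks, size_t n)
{
    uint32_t crc = 0;

    for (size_t i = 0; i < n; i++) {
        crc = crc32_update(crc, blocks[i].host, blocks[i].used_length);
    }
    return crc;
}
```

Because the checksum is a running value, the result depends only on the
concatenated bytes and not on where the block boundaries fall, so source
and destination would need to walk their blocks in the same order.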

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Markus Armbruster 6 years, 5 months ago

Daniel P. Berrangé <berrange@redhat.com> writes:

> IOW, md5 is clearly the quickest, by a considerable margin over
> SHA256/512. SHA1 is slightly slower.
>
> Assuming that we document that this command is intentionally
> *not* trying to guarantee collision resistances we're ok.
>
> In fact we should not document what kind of checksum is
> reported by query-memory-checksum. The impl should be a black
> box from user's POV.
>
> If we're just aiming for debugging tool to detect accidental
> corruption, could we even just ignore cryptographic hashs
> entirely and do a crc32 - that'd be way faster than even
> md5.

Good points.

The doc strings should spell out "for debugging", like the commit
message does, and both should spell out "weak collision resistance".

I can't find CRC-32 in GLib, but zlib appears to provide it:
http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html

Care to compare its speed to MD5?

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Daniel P. Berrangé 6 years, 5 months ago

On Fri, Aug 23, 2019 at 07:49:31AM +0200, Markus Armbruster wrote:
> I can't find CRC-32 in GLib, but zlib appears to provide it:
> http://refspecs.linuxbase.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/zlib-crc32-1.html
> 
> Care to compare its speed to MD5?

I hacked the code to use zlib's crc32 impl and got these for comparison:

/crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
/crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
/crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
/crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
/crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
/crypto/hash/md5/speed-512: 476.27 MB/sec OK
/crypto/hash/md5/speed-1024: 517.16 MB/sec OK
/crypto/hash/md5/speed-4096: 554.70 MB/sec OK
/crypto/hash/md5/speed-16384: 564.44 MB/sec OK
/crypto/hash/md5/speed-1048576: 566.78 MB/sec OK


Regards,
Daniel

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Markus Armbruster 6 years, 5 months ago

Daniel P. Berrangé <berrange@redhat.com> writes:

> I hacked the code to use zlib's crc32 impl and got these for comparison:
>
> /crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
> /crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
> /crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
> /crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
> /crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
> /crypto/hash/md5/speed-512: 476.27 MB/sec OK
> /crypto/hash/md5/speed-1024: 517.16 MB/sec OK
> /crypto/hash/md5/speed-4096: 554.70 MB/sec OK
> /crypto/hash/md5/speed-16384: 564.44 MB/sec OK
> /crypto/hash/md5/speed-1048576: 566.78 MB/sec OK

Twice as fast.  Alexey, what do you think?
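
As a back-of-the-envelope check of that ratio, the sketch below turns the
1048576-byte chunk figures from the benchmark into a speedup and an
estimated run time. The helper names are hypothetical, and the estimate
assumes the benchmark throughput carries over unchanged to hashing guest
RAM, which may not hold on other hardware.

```c
#include <stdio.h>

/* How many times faster one throughput is than another (same units). */
double speedup(double fast_mb_per_sec, double slow_mb_per_sec)
{
    return fast_mb_per_sec / slow_mb_per_sec;
}

/* Seconds to checksum ram_mb megabytes at a steady throughput. */
double checksum_seconds(double ram_mb, double mb_per_sec)
{
    return ram_mb / mb_per_sec;
}

/* Print the comparison for a guest of the given size (an assumed input). */
void print_estimate(double ram_mb)
{
    const double crc32_rate = 1165.68; /* MB/sec, from the benchmark */
    const double md5_rate = 566.78;    /* MB/sec, from the benchmark */

    printf("crc32 is %.2fx faster than md5\n",
           speedup(crc32_rate, md5_rate));
    printf("md5:   %.1f s for %.0f MB\n",
           checksum_seconds(ram_mb, md5_rate), ram_mb);
    printf("crc32: %.1f s for %.0f MB\n",
           checksum_seconds(ram_mb, crc32_rate), ram_mb);
}
```

For an assumed 16384 MB guest, print_estimate(16384.0) works out to roughly
29 seconds for MD5 against roughly 14 for CRC-32, a ratio of a little over
2x.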

Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

Posted by Alexey Kardashevskiy 6 years, 5 months ago


On 23/08/2019 21:41, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
>> I hacked the code to use zlib's crc32 impl and got these for comparison:
>>
>> /crypto/hash/crc32/speed-512: 1089.18 MB/sec OK
>> /crypto/hash/crc32/speed-1024: 1124.63 MB/sec OK
>> /crypto/hash/crc32/speed-4096: 1162.73 MB/sec OK
>> /crypto/hash/crc32/speed-16384: 1171.58 MB/sec OK
>> /crypto/hash/crc32/speed-1048576: 1165.68 MB/sec OK
>> /crypto/hash/md5/speed-512: 476.27 MB/sec OK
>> /crypto/hash/md5/speed-1024: 517.16 MB/sec OK
>> /crypto/hash/md5/speed-4096: 554.70 MB/sec OK
>> /crypto/hash/md5/speed-16384: 564.44 MB/sec OK
>> /crypto/hash/md5/speed-1048576: 566.78 MB/sec OK
> 
> Twice as fast.  Alexey, what do you think?


This is even better. TBH I picked md5 as I could not spot a crc32 helper
in the first minute (I can see it now) and MD5 felt like the fastest
available from GLib :)  I'll probably add start..end range(s) and
repost. Thanks for all these numbers and reviews.
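
One possible shape for the start..end idea, purely as a sketch: clip the
requested range against each block and checksum only the overlap. The
struct and function below are hypothetical, and they assume the range is
half-open and expressed in the same address space as the block offsets;
the thread does not say which address space a repost would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one RAM block. */
struct block_span {
    uint64_t offset;      /* start address of the block */
    uint64_t used_length; /* bytes in use */
};

/*
 * Clip the half-open range [start, end) to one block.  On overlap, store
 * the byte offset into the block and the number of bytes to checksum,
 * and return true; return false if the range misses the block.
 */
bool clip_range_to_block(const struct block_span *b,
                         uint64_t start, uint64_t end,
                         uint64_t *skip, uint64_t *len)
{
    uint64_t b_end = b->offset + b->used_length;
    uint64_t lo = start > b->offset ? start : b->offset;
    uint64_t hi = end < b_end ? end : b_end;

    if (lo >= hi) {
        return false;
    }
    *skip = lo - b->offset;
    *len = hi - lo;
    return true;
}
```

A caller would then pass host + skip and len to whichever checksum update
function is in use, for every block where the helper returns true.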



-- 
Alexey