The following changes since commit a2a9d4adabe340617a24eb73a8b2a116d28a6b38:
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190712' into staging (2019-07-12 11:06:48 +0100)
are available in the Git repository at:
https://github.com/juanquintela/qemu.git tags/migration-pull-request
for you to fetch changes up to a48ad5602f496236b4e1955d9e2e8228a7d0ad56:
migration: allow private destination ram with x-ignore-shared (2019-07-12 16:25:59 +0200)
----------------------------------------------------------------
Migration pull request
Fix the issues with the previous pull request and 32 bits.
Please apply.
----------------------------------------------------------------
Juan Quintela (3):
migration: fix multifd_recv event typo
migration-test: rename parameter to parameter_int
migration-test: Add migration multifd test
Peng Tao (1):
migration: allow private destination ram with x-ignore-shared
Peter Xu (10):
migration: No need to take rcu during sync_dirty_bitmap
memory: Don't set migration bitmap when without migration
bitmap: Add bitmap_copy_with_{src|dst}_offset()
memory: Pass mr into snapshot_and_clear_dirty
memory: Introduce memory listener hook log_clear()
kvm: Update comments for sync_dirty_bitmap
kvm: Persistent per kvmslot dirty bitmap
kvm: Introduce slots lock for memory listener
kvm: Support KVM_CLEAR_DIRTY_LOG
migration: Split log_clear() into smaller chunks
Wei Yang (5):
migration/multifd: call multifd_send_sync_main when sending
RAM_SAVE_FLAG_EOS
migration/xbzrle: update cache and current_data in one place
cutils: remove one unnecessary pointer operation
migration/multifd: sync packet_num after all thread are done
migration/ram.c: reset complete_round when we gets a queued page
accel/kvm/kvm-all.c | 260 ++++++++++++++++++++++++++++++++++++---
accel/kvm/trace-events | 1 +
exec.c | 15 ++-
include/exec/memory.h | 19 +++
include/exec/ram_addr.h | 92 +++++++++++++-
include/qemu/bitmap.h | 9 ++
include/sysemu/kvm_int.h | 4 +
memory.c | 56 ++++++++-
migration/migration.c | 4 +
migration/migration.h | 27 ++++
migration/ram.c | 93 ++++++++++----
migration/trace-events | 3 +-
tests/Makefile.include | 2 +
tests/migration-test.c | 103 ++++++++++++----
tests/test-bitmap.c | 72 +++++++++++
util/bitmap.c | 85 +++++++++++++
util/cutils.c | 8 +-
17 files changed, 770 insertions(+), 83 deletions(-)
create mode 100644 tests/test-bitmap.c
--
2.21.0
On Fri, 12 Jul 2019 at 15:32, Juan Quintela <quintela@redhat.com> wrote:
>
> The following changes since commit a2a9d4adabe340617a24eb73a8b2a116d28a6b38:
>
> Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190712' into staging (2019-07-12 11:06:48 +0100)
>
> are available in the Git repository at:
>
> https://github.com/juanquintela/qemu.git tags/migration-pull-request
>
> for you to fetch changes up to a48ad5602f496236b4e1955d9e2e8228a7d0ad56:
>
> migration: allow private destination ram with x-ignore-shared (2019-07-12 16:25:59 +0200)
>
> ----------------------------------------------------------------
> Migration pull request
>
> Fix the issues with the previous pull request and 32 bits.
>
> Please apply.
>
Still fails on aarch32 host, I'm afraid:
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64
QTEST_QEMU_IMG=qemu-img tests/migration-test -m=quick -k --tap <
/dev/null | ./scripts/tap-driver.pl --test-name="migration-test"
PASS 1 migration-test /aarch64/migration/deprecated
PASS 2 migration-test /aarch64/migration/bad_dest
PASS 3 migration-test /aarch64/migration/fd_proto
PASS 4 migration-test /aarch64/migration/postcopy/unix
PASS 5 migration-test /aarch64/migration/postcopy/recovery
PASS 6 migration-test /aarch64/migration/precopy/unix
PASS 7 migration-test /aarch64/migration/precopy/tcp
PASS 8 migration-test /aarch64/migration/xbzrle/unix
malloc(): memory corruption
Broken pipe
qemu-system-aarch64: load of migration failed: Invalid argument
/home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
terminate QEMU process but encountered exit status 1
Aborted
ERROR - too few tests run (expected 9, got 8)
/home/peter.maydell/qemu/tests/Makefile.include:899: recipe for target
'check-qtest-aarch64' failed
thanks
-- PMM
On Fri, 12 Jul 2019 at 17:33, Peter Maydell <peter.maydell@linaro.org> wrote:
> Still fails on aarch32 host, I'm afraid:
>
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64
> QTEST_QEMU_IMG=qemu-img tests/migration-test -m=quick -k --tap <
> /dev/null | ./scripts/tap-driver.pl --test-name="migration-test"
> PASS 1 migration-test /aarch64/migration/deprecated
> PASS 2 migration-test /aarch64/migration/bad_dest
> PASS 3 migration-test /aarch64/migration/fd_proto
> PASS 4 migration-test /aarch64/migration/postcopy/unix
> PASS 5 migration-test /aarch64/migration/postcopy/recovery
> PASS 6 migration-test /aarch64/migration/precopy/unix
> PASS 7 migration-test /aarch64/migration/precopy/tcp
> PASS 8 migration-test /aarch64/migration/xbzrle/unix
> malloc(): memory corruption
> Broken pipe
> qemu-system-aarch64: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1
> Aborted
> ERROR - too few tests run (expected 9, got 8)
> /home/peter.maydell/qemu/tests/Makefile.include:899: recipe for target
> 'check-qtest-aarch64' failed
A run with valgrind:
(armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
QTEST_QEMU_BINARY='valgrind aarch64-softmmu/qemu-system-aarch64'
tests/migration-test -v -p '/aarch64/migration/multifd/tcp'
/aarch64/migration/multifd/tcp: ==4034== Memcheck, a memory error detector
==4034== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4034== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4034== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
chardev=char0,mode=control -machine accel=qtest -display none -machine
virt,accel=kvm:tcg,gic-version=max -name vmsource,debug-threads=on
-cpu max -m 150M -serial file:/tmp/migration-test-mSLr4A/src_serial
-kernel /tmp/migration-test-mSLr4A/bootsect
==4034==
==4040== Memcheck, a memory error detector
==4040== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4040== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4040== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
chardev=char0,mode=control -machine accel=qtest -display none -machine
virt,accel=kvm:tcg,gic-version=max -name vmdest,debug-threads=on -cpu
max -m 150M -serial file:/tmp/migration-test-mSLr4A/dest_serial
-kernel /tmp/migration-test-mSLr4A/bootsect -incoming tcp:127.0.0.1:0
==4040==
==4034== Thread 5 multifdsend_0:
==4034== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==4034== at 0x5299F06: __libc_do_syscall (libc-do-syscall.S:47)
==4034== by 0x5298FCB: sendmsg (sendmsg.c:28)
==4034== by 0x60135D: qio_channel_socket_writev (channel-socket.c:544)
==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
==4034== by 0x26BA73: multifd_send_initial_packet (ram.c:711)
==4034== by 0x26BA73: multifd_send_thread (ram.c:1085)
==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
==4034== by 0x5290613: start_thread (pthread_create.c:463)
==4034== by 0x53487FB: ??? (clone.S:73)
==4034== Address 0x2320048d is on thread 5's stack
==4034== in frame #5, created by multifd_send_thread (ram.c:1077)
==4034==
==4034== Thread 6 multifdsend_1:
==4034== Invalid write of size 4
==4034== at 0x26BB7C: multifd_send_fill_packet (ram.c:806)
==4034== by 0x26BB7C: multifd_send_thread (ram.c:1101)
==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
==4034== by 0x5290613: start_thread (pthread_create.c:463)
==4034== by 0x53487FB: ??? (clone.S:73)
==4034== Address 0x224ed668 is 0 bytes after a block of size 832 alloc'd
==4034== at 0x4841BC4: calloc (vg_replace_malloc.c:711)
==4034== by 0x5018269: g_malloc0 (in
/usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
==4034==
==4034== Invalid write of size 4
==4034== at 0x26BB82: multifd_send_fill_packet (ram.c:806)
==4034== by 0x26BB82: multifd_send_thread (ram.c:1101)
==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
==4034== by 0x5290613: start_thread (pthread_create.c:463)
==4034== by 0x53487FB: ??? (clone.S:73)
==4034== Address 0x224ed66c is 4 bytes after a block of size 832 alloc'd
==4034== at 0x4841BC4: calloc (vg_replace_malloc.c:711)
==4034== by 0x5018269: g_malloc0 (in
/usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
==4034==
==4034== Invalid read of size 4
==4034== at 0x5FF1DA: qio_channel_writev_full (channel.c:86)
==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
==4034== by 0x26BBD9: multifd_send_thread (ram.c:1111)
==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
==4034== by 0x5290613: start_thread (pthread_create.c:463)
==4034== by 0x53487FB: ??? (clone.S:73)
==4034== Address 0x30 is not stack'd, malloc'd or (recently) free'd
==4034==
==4034==
==4034== Process terminating with default action of signal 11 (SIGSEGV)
==4034== Access not within mapped region at address 0x30
==4034== at 0x5FF1DA: qio_channel_writev_full (channel.c:86)
==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
==4034== by 0x26BBD9: multifd_send_thread (ram.c:1111)
==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
==4034== by 0x5290613: start_thread (pthread_create.c:463)
==4034== by 0x53487FB: ??? (clone.S:73)
==4034== If you believe this happened as a result of a stack
==4034== overflow in your program's main thread (unlikely but
==4034== possible), you can try to increase the size of the
==4034== main thread stack using the --main-stacksize= flag.
==4034== The main thread stack size used in this run was 8388608.
==4034==
==4034== HEAP SUMMARY:
==4034== in use at exit: 5,994,911 bytes in 23,588 blocks
==4034== total heap usage: 87,487 allocs, 63,899 frees, 17,732,188
bytes allocated
==4034==
==4034== LEAK SUMMARY:
==4034== definitely lost: 56 bytes in 1 blocks
==4034== indirectly lost: 64 bytes in 2 blocks
==4034== possibly lost: 1,620 bytes in 26 blocks
==4034== still reachable: 5,993,171 bytes in 23,559 blocks
==4034== suppressed: 0 bytes in 0 blocks
==4034== Rerun with --leak-check=full to see details of leaked memory
==4034==
==4034== For counts of detected and suppressed errors, rerun with: -v
==4034== Use --track-origins=yes to see where uninitialised values come from
==4034== ERROR SUMMARY: 66 errors from 4 contexts (suppressed: 6 from 3)
Broken pipe
qemu-system-aarch64: load of migration failed: Input/output error
==4040==
==4040== HEAP SUMMARY:
==4040== in use at exit: 4,893,269 bytes in 19,702 blocks
==4040== total heap usage: 86,196 allocs, 66,494 frees, 17,438,183
bytes allocated
==4040==
==4040== LEAK SUMMARY:
==4040== definitely lost: 0 bytes in 0 blocks
==4040== indirectly lost: 0 bytes in 0 blocks
==4040== possibly lost: 1,160 bytes in 5 blocks
==4040== still reachable: 4,892,109 bytes in 19,697 blocks
==4040== suppressed: 0 bytes in 0 blocks
==4040== Rerun with --leak-check=full to see details of leaked memory
==4040==
==4040== For counts of detected and suppressed errors, rerun with: -v
==4040== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 3)
/home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
terminate QEMU process but encountered exit status 1
Aborted
thanks
-- PMM
On Mon, Jul 15, 2019 at 12:16:57PM +0100, Peter Maydell wrote:
> On Fri, 12 Jul 2019 at 17:33, Peter Maydell <peter.maydell@linaro.org> wrote:
> > Still fails on aarch32 host, I'm afraid:
> >
> > MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> > QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64
> > QTEST_QEMU_IMG=qemu-img tests/migration-test -m=quick -k --tap <
> > /dev/null | ./scripts/tap-driver.pl --test-name="migration-test"
> > PASS 1 migration-test /aarch64/migration/deprecated
> > PASS 2 migration-test /aarch64/migration/bad_dest
> > PASS 3 migration-test /aarch64/migration/fd_proto
> > PASS 4 migration-test /aarch64/migration/postcopy/unix
> > PASS 5 migration-test /aarch64/migration/postcopy/recovery
> > PASS 6 migration-test /aarch64/migration/precopy/unix
> > PASS 7 migration-test /aarch64/migration/precopy/tcp
> > PASS 8 migration-test /aarch64/migration/xbzrle/unix
> > malloc(): memory corruption
> > Broken pipe
> > qemu-system-aarch64: load of migration failed: Invalid argument
> > /home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
> > terminate QEMU process but encountered exit status 1
> > Aborted
> > ERROR - too few tests run (expected 9, got 8)
> > /home/peter.maydell/qemu/tests/Makefile.include:899: recipe for target
> > 'check-qtest-aarch64' failed
>
> A run with valgrind:
>
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY='valgrind aarch64-softmmu/qemu-system-aarch64'
> tests/migration-test -v -p '/aarch64/migration/multifd/tcp'
> /aarch64/migration/multifd/tcp: ==4034== Memcheck, a memory error detector
> ==4034== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==4034== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==4034== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
> chardev=char0,mode=control -machine accel=qtest -display none -machine
> virt,accel=kvm:tcg,gic-version=max -name vmsource,debug-threads=on
> -cpu max -m 150M -serial file:/tmp/migration-test-mSLr4A/src_serial
> -kernel /tmp/migration-test-mSLr4A/bootsect
> ==4034==
> ==4040== Memcheck, a memory error detector
> ==4040== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==4040== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==4040== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
> chardev=char0,mode=control -machine accel=qtest -display none -machine
> virt,accel=kvm:tcg,gic-version=max -name vmdest,debug-threads=on -cpu
> max -m 150M -serial file:/tmp/migration-test-mSLr4A/dest_serial
> -kernel /tmp/migration-test-mSLr4A/bootsect -incoming tcp:127.0.0.1:0
> ==4040==
> ==4034== Thread 5 multifdsend_0:
> ==4034== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
> ==4034== at 0x5299F06: __libc_do_syscall (libc-do-syscall.S:47)
> ==4034== by 0x5298FCB: sendmsg (sendmsg.c:28)
> ==4034== by 0x60135D: qio_channel_socket_writev (channel-socket.c:544)
> ==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
> ==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
> ==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
> ==4034== by 0x26BA73: multifd_send_initial_packet (ram.c:711)
> ==4034== by 0x26BA73: multifd_send_thread (ram.c:1085)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x2320048d is on thread 5's stack
> ==4034== in frame #5, created by multifd_send_thread (ram.c:1077)
This is a simple missing initialization.
multifd_send_initial_packet has a local variable:
MultiFDInit_t msg;
The code initializes 4 fields, but does *not* initialize the 2
padding fields, so we're writing random data. That is harmless, as the
receiving end will ignore the padding too, but we should really fill
it with zeros. So
MultiFDInit_t msg = {0};
should fix it.
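To illustrate the point, here is a minimal sketch, not the actual QEMU code. DemoInit and both helper functions are hypothetical names, and the field layout is an assumption made only to have some padding members to look at. It shows that an initializer on the declaration zeroes the members the code never assigns:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an initial packet: real fields plus padding. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint8_t  id;
    uint8_t  unused1[7];   /* padding, never assigned explicitly */
    uint64_t unused2[4];   /* padding, never assigned explicitly */
} DemoInit;

/* Assign only the named fields; "= {0}" has already zeroed the rest. */
static DemoInit demo_init_make(uint8_t id)
{
    DemoInit msg = {0};

    msg.magic = 0x11223344u;
    msg.version = 1;
    msg.id = id;
    return msg;
}

/* Return 1 if every padding member is zero, 0 otherwise. */
static int demo_init_padding_is_zero(const DemoInit *msg)
{
    size_t i;

    for (i = 0; i < sizeof(msg->unused1); i++) {
        if (msg->unused1[i] != 0) {
            return 0;
        }
    }
    for (i = 0; i < sizeof(msg->unused2) / sizeof(msg->unused2[0]); i++) {
        if (msg->unused2[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Without the initializer the padding members would hold whatever was on the stack, which is what valgrind flags when the struct is handed to sendmsg().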
> ==4034==
> ==4034== Thread 6 multifdsend_1:
> ==4034== Invalid write of size 4
> ==4034== at 0x26BB7C: multifd_send_fill_packet (ram.c:806)
> ==4034== by 0x26BB7C: multifd_send_thread (ram.c:1101)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x224ed668 is 0 bytes after a block of size 832 alloc'd
> ==4034== at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==4034== by 0x5018269: g_malloc0 (in
> /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
multifd_send_fill_packet is getting the oob write in:
for (i = 0; i < p->pages->used; i++) {
packet->offset[i] = cpu_to_be64(p->pages->offset[i]);
}
offset is a variable length struct field at the end of MultiFDPacket_t:
typedef struct {
...snip...
char ramblock[256];
uint64_t offset[];
} __attribute__((packed)) MultiFDPacket_t;
but the packet data is allocated back in multifd_save_setup using:
p->packet_len = sizeof(MultiFDPacket_t)
+ sizeof(ram_addr_t) * page_count;
p->packet = g_malloc0(p->packet_len);
Notice the field in the struct is "uint64_t" but the length we're
allocating is "ram_addr_t".
Since this is a 32-bit build, I'm guessing ram_addr_t is a 32-bit
integer and thus we're under-allocating the variable length offset
field by half.
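The arithmetic can be sketched as follows. This is an illustration under assumptions, not the QEMU code: DemoPacket and the helper functions are hypothetical, and the narrow address type is modelled as uint32_t to match the guess above about a 32-bit host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet: fixed header followed by a flexible array. */
typedef struct {
    uint32_t used;
    char     ramblock[256];
    uint64_t offset[];      /* entries are always 64-bit */
} __attribute__((packed)) DemoPacket;

/* Length when sized by the array's own element type. */
static size_t demo_packet_len(size_t page_count)
{
    return sizeof(DemoPacket) + sizeof(uint64_t) * page_count;
}

/* Length when sized by a 32-bit address type (assumed, see lead-in). */
static size_t demo_packet_len_narrow(size_t page_count)
{
    return sizeof(DemoPacket) + sizeof(uint32_t) * page_count;
}

/* Bytes missing from the narrow allocation. */
static size_t demo_packet_shortfall(size_t page_count)
{
    return demo_packet_len(page_count) - demo_packet_len_narrow(page_count);
}
```

Sizing the allocation with sizeof(uint64_t), or with sizeof() of the array element itself, keeps the length tied to the field type rather than to a host-dependent typedef.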
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, 15 Jul 2019 at 15:04, Daniel P. Berrangé <berrange@redhat.com> wrote:
> MultiFDInit_t msg = {0};
>
> should fix it.
A minor nit, but "= {}" is our standard struct-zero-initializer
so we should prefer that, I think. (I know it is not the C-spec
recommended version but some C compilers incorrectly warn about
"= {0}" whereas no compiler we care about warns about the
gnu-extension "= {}".)
thanks
-- PMM
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Fri, 12 Jul 2019 at 17:33, Peter Maydell <peter.maydell@linaro.org> wrote:
>> Still fails on aarch32 host, I'm afraid:
Hi
Dropping the multifd test patch for now. For "some" reason, having a
packed struct and 32 bits is getting ugly, not sure yet _why_.
Resending the pull request.
>>
>> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
>> QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64
>> QTEST_QEMU_IMG=qemu-img tests/migration-test -m=quick -k --tap <
>> /dev/null | ./scripts/tap-driver.pl --test-name="migration-test"
>> PASS 1 migration-test /aarch64/migration/deprecated
>> PASS 2 migration-test /aarch64/migration/bad_dest
>> PASS 3 migration-test /aarch64/migration/fd_proto
>> PASS 4 migration-test /aarch64/migration/postcopy/unix
>> PASS 5 migration-test /aarch64/migration/postcopy/recovery
>> PASS 6 migration-test /aarch64/migration/precopy/unix
>> PASS 7 migration-test /aarch64/migration/precopy/tcp
>> PASS 8 migration-test /aarch64/migration/xbzrle/unix
>> malloc(): memory corruption
>> Broken pipe
>> qemu-system-aarch64: load of migration failed: Invalid argument
>> /home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
>> terminate QEMU process but encountered exit status 1
>> Aborted
>> ERROR - too few tests run (expected 9, got 8)
>> /home/peter.maydell/qemu/tests/Makefile.include:899: recipe for target
>> 'check-qtest-aarch64' failed
>
> A run with valgrind:
>
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY='valgrind aarch64-softmmu/qemu-system-aarch64'
> tests/migration-test -v -p '/aarch64/migration/multifd/tcp'
> /aarch64/migration/multifd/tcp: ==4034== Memcheck, a memory error detector
> ==4034== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==4034== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==4034== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
> chardev=char0,mode=control -machine accel=qtest -display none -machine
> virt,accel=kvm:tcg,gic-version=max -name vmsource,debug-threads=on
> -cpu max -m 150M -serial file:/tmp/migration-test-mSLr4A/src_serial
> -kernel /tmp/migration-test-mSLr4A/bootsect
> ==4034==
> ==4040== Memcheck, a memory error detector
> ==4040== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==4040== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==4040== Command: aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-4033.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-4033.qmp,id=char0 -mon
> chardev=char0,mode=control -machine accel=qtest -display none -machine
> virt,accel=kvm:tcg,gic-version=max -name vmdest,debug-threads=on -cpu
> max -m 150M -serial file:/tmp/migration-test-mSLr4A/dest_serial
> -kernel /tmp/migration-test-mSLr4A/bootsect -incoming tcp:127.0.0.1:0
> ==4040==
> ==4034== Thread 5 multifdsend_0:
> ==4034== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
> ==4034== at 0x5299F06: __libc_do_syscall (libc-do-syscall.S:47)
> ==4034== by 0x5298FCB: sendmsg (sendmsg.c:28)
> ==4034== by 0x60135D: qio_channel_socket_writev (channel-socket.c:544)
> ==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
> ==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
> ==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
> ==4034== by 0x26BA73: multifd_send_initial_packet (ram.c:711)
> ==4034== by 0x26BA73: multifd_send_thread (ram.c:1085)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x2320048d is on thread 5's stack
> ==4034== in frame #5, created by multifd_send_thread (ram.c:1077)
> ==4034==
> ==4034== Thread 6 multifdsend_1:
> ==4034== Invalid write of size 4
> ==4034== at 0x26BB7C: multifd_send_fill_packet (ram.c:806)
> ==4034== by 0x26BB7C: multifd_send_thread (ram.c:1101)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x224ed668 is 0 bytes after a block of size 832 alloc'd
> ==4034== at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==4034== by 0x5018269: g_malloc0 (in
> /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
> ==4034==
> ==4034== Invalid write of size 4
> ==4034== at 0x26BB82: multifd_send_fill_packet (ram.c:806)
> ==4034== by 0x26BB82: multifd_send_thread (ram.c:1101)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x224ed66c is 4 bytes after a block of size 832 alloc'd
> ==4034== at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==4034== by 0x5018269: g_malloc0 (in
> /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
> ==4034==
> ==4034== Invalid read of size 4
> ==4034== at 0x5FF1DA: qio_channel_writev_full (channel.c:86)
> ==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
> ==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
> ==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
> ==4034== by 0x26BBD9: multifd_send_thread (ram.c:1111)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== Address 0x30 is not stack'd, malloc'd or (recently) free'd
> ==4034==
> ==4034==
> ==4034== Process terminating with default action of signal 11 (SIGSEGV)
> ==4034== Access not within mapped region at address 0x30
> ==4034== at 0x5FF1DA: qio_channel_writev_full (channel.c:86)
> ==4034== by 0x5FF995: qio_channel_writev (channel.c:207)
> ==4034== by 0x5FF995: qio_channel_writev_all (channel.c:171)
> ==4034== by 0x5FFA0F: qio_channel_write_all (channel.c:257)
> ==4034== by 0x26BBD9: multifd_send_thread (ram.c:1111)
> ==4034== by 0x63C0B1: qemu_thread_start (qemu-thread-posix.c:502)
> ==4034== by 0x5290613: start_thread (pthread_create.c:463)
> ==4034== by 0x53487FB: ??? (clone.S:73)
> ==4034== If you believe this happened as a result of a stack
> ==4034== overflow in your program's main thread (unlikely but
> ==4034== possible), you can try to increase the size of the
> ==4034== main thread stack using the --main-stacksize= flag.
> ==4034== The main thread stack size used in this run was 8388608.
> ==4034==
> ==4034== HEAP SUMMARY:
> ==4034== in use at exit: 5,994,911 bytes in 23,588 blocks
> ==4034== total heap usage: 87,487 allocs, 63,899 frees, 17,732,188
> bytes allocated
> ==4034==
> ==4034== LEAK SUMMARY:
> ==4034== definitely lost: 56 bytes in 1 blocks
> ==4034== indirectly lost: 64 bytes in 2 blocks
> ==4034== possibly lost: 1,620 bytes in 26 blocks
> ==4034== still reachable: 5,993,171 bytes in 23,559 blocks
> ==4034== suppressed: 0 bytes in 0 blocks
> ==4034== Rerun with --leak-check=full to see details of leaked memory
> ==4034==
> ==4034== For counts of detected and suppressed errors, rerun with: -v
> ==4034== Use --track-origins=yes to see where uninitialised values come from
> ==4034== ERROR SUMMARY: 66 errors from 4 contexts (suppressed: 6 from 3)
> Broken pipe
> qemu-system-aarch64: load of migration failed: Input/output error
> ==4040==
> ==4040== HEAP SUMMARY:
> ==4040== in use at exit: 4,893,269 bytes in 19,702 blocks
> ==4040== total heap usage: 86,196 allocs, 66,494 frees, 17,438,183
> bytes allocated
> ==4040==
> ==4040== LEAK SUMMARY:
> ==4040== definitely lost: 0 bytes in 0 blocks
> ==4040== indirectly lost: 0 bytes in 0 blocks
> ==4040== possibly lost: 1,160 bytes in 5 blocks
> ==4040== still reachable: 4,892,109 bytes in 19,697 blocks
> ==4040== suppressed: 0 bytes in 0 blocks
> ==4040== Rerun with --leak-check=full to see details of leaked memory
> ==4040==
> ==4040== For counts of detected and suppressed errors, rerun with: -v
> ==4040== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 3)
> /home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1
> Aborted
>
> thanks
> -- PMM
On Mon, 15 Jul 2019 at 14:44, Juan Quintela <quintela@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> wrote:
> > On Fri, 12 Jul 2019 at 17:33, Peter Maydell <peter.maydell@linaro.org> wrote:
> >> Still fails on aarch32 host, I'm afraid:
>
> Hi
>
> dropping the multifd test patch from now. For "some" reason, having a
> packed struct and 32bits is getting ugly, not sure yet _why_.
IMHO 'packed' structs are usually a bad idea. They have a bunch
of behaviours you may not be expecting (for instance they're
also not naturally aligned, and arrays of them won't be the
size you expect).
thanks
-- PMM
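A small sketch of the layout differences being described, assuming a GCC-compatible compiler. DemoNatural, DemoPacked and the helpers are hypothetical names used only for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Same members, natural layout: the compiler may pad after 'tag'. */
typedef struct {
    uint8_t  tag;
    uint32_t value;
} DemoNatural;

/* Same members, packed: no padding, so 'value' sits at offset 1. */
typedef struct {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed)) DemoPacked;

static size_t demo_natural_size(void)
{
    return sizeof(DemoNatural);
}

static size_t demo_packed_size(void)
{
    return sizeof(DemoPacked);
}

/* Offset of the 32-bit member; not a multiple of 4 when packed. */
static size_t demo_packed_value_offset(void)
{
    return offsetof(DemoPacked, value);
}

/* An array of n packed structs has no per-element tail padding. */
static size_t demo_packed_array_size(size_t n)
{
    return n * sizeof(DemoPacked);
}
```

In the packed variant the 32-bit member is misaligned in every array element, so code that takes its address or assumes a rounded-up element size can go wrong.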
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Mon, 15 Jul 2019 at 14:44, Juan Quintela <quintela@redhat.com> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> wrote:
>> > On Fri, 12 Jul 2019 at 17:33, Peter Maydell <peter.maydell@linaro.org> wrote:
>> >> Still fails on aarch32 host, I'm afraid:
>>
>> Hi
>>
>> dropping the multifd test patch from now. For "some" reason, having a
>> packed struct and 32bits is getting ugly, not sure yet _why_.
>
> IMHO 'packed' structs are usually a bad idea. They have a bunch
> of behaviours you may not be expecting (for instance they're
> also not naturally aligned, and arrays of them won't be the
> size you expect).
I can't make everybody happy O:-)
For the multifd initial packet, I used to write the fields
by hand. Then danp asked that I use a packed struct, and convert the
values inside it. So ..... Impossible to make everybody happy.
Anyway, the struct is packed, both sides are i386 32bits, and it should
be exactly the same, but that appears to be where your valgrind
problems show up. Still investigating _where_ the problem is. What is
even weirder is that there is no error at all on 64bits.
Thanks, Juan.
PS. BTW, did you launch the guests with valgrind by hand, or is there
a trick that I am missing for launching a qtest with valgrind?
On Mon, 15 Jul 2019 at 15:10, Juan Quintela <quintela@redhat.com> wrote:
> PS. BTW, did you launched by hand the guests with valgrind, or there is
> a trick that I am missing for launching a qtest with valgrind?
I quoted the command line I used:
QTEST_QEMU_BINARY='valgrind aarch64-softmmu/qemu-system-aarch64'
tests/migration-test -v -p '/aarch64/migration/multifd/tcp'
(https://wiki.qemu.org/Features/QTest lists this and some other
things you might want to do with qtest tests.)
thanks
-- PMM
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Fri, 12 Jul 2019 at 15:32, Juan Quintela <quintela@redhat.com> wrote:
> >
> > The following changes since commit a2a9d4adabe340617a24eb73a8b2a116d28a6b38:
> >
> > Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190712' into staging (2019-07-12 11:06:48 +0100)
> >
> > are available in the Git repository at:
> >
> > https://github.com/juanquintela/qemu.git tags/migration-pull-request
> >
> > for you to fetch changes up to a48ad5602f496236b4e1955d9e2e8228a7d0ad56:
> >
> > migration: allow private destination ram with x-ignore-shared (2019-07-12 16:25:59 +0200)
> >
> > ----------------------------------------------------------------
> > Migration pull request
> >
> > Fix the issues with the previous pull request and 32 bits.
> >
> > Please apply.
> >
>
> Still fails on aarch32 host, I'm afraid:
>
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64
> QTEST_QEMU_IMG=qemu-img tests/migration-test -m=quick -k --tap <
> /dev/null | ./scripts/tap-driver.pl --test-name="migration-test"
> PASS 1 migration-test /aarch64/migration/deprecated
> PASS 2 migration-test /aarch64/migration/bad_dest
> PASS 3 migration-test /aarch64/migration/fd_proto
> PASS 4 migration-test /aarch64/migration/postcopy/unix
> PASS 5 migration-test /aarch64/migration/postcopy/recovery
> PASS 6 migration-test /aarch64/migration/precopy/unix
> PASS 7 migration-test /aarch64/migration/precopy/tcp
> PASS 8 migration-test /aarch64/migration/xbzrle/unix
> malloc(): memory corruption
> Broken pipe
> qemu-system-aarch64: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:137: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1
> Aborted
> ERROR - too few tests run (expected 9, got 8)
> /home/peter.maydell/qemu/tests/Makefile.include:899: recipe for target
> 'check-qtest-aarch64' failed
Hmm, I think the only one to go near xbzrle specifically is
'migration/xbzrle: update cache and current_data in one place'
it did look right to me; but that is a subtle function so hmmmm.
Dave
>
> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK