[PULL 00/28] Migration pull patches

Juan Quintela posted 28 patches 4 years, 3 months ago
Test docker-mingw@fedora passed
Test checkpatch passed
Test docker-quick@centos7 passed
Test FreeBSD passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20200110173215.3865-1-quintela@redhat.com
Maintainers: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, David Gibson <david@gibson.dropbear.id.au>, Laurent Vivier <lvivier@redhat.com>, Thomas Huth <thuth@redhat.com>, Jason Wang <jasowang@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, Corey Minyard <cminyard@mvista.com>, Stefan Weil <sw@weilnetz.de>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <rth@twiddle.net>, Stefan Berger <stefanb@linux.ibm.com>, Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Andrzej Zaborowski <balrogg@gmail.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>
There is a newer version of this series
[PULL 00/28] Migration pull patches
Posted by Juan Quintela 4 years, 3 months ago
The following changes since commit f38a71b01f839c7b65ea73ddd507903cb9489ed6:

  Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-semihosting-090120-2' into staging (2020-01-10 13:19:34 +0000)

are available in the Git repository at:

  https://github.com/juanquintela/qemu.git tags/migration-pull-pull-request

for you to fetch changes up to cc708d2411d3ed2ab4a428c996b778c7c7a47a04:

  apic: Use 32bit APIC ID for migration instance ID (2020-01-10 18:19:18 +0100)

----------------------------------------------------------------
Migration pull request

- several multifd fixes (jiahui, me)
- rate limit host pages (david)
- remove unneeded labels (daniel)
- several multifd fixes (wei)
- improve handler insert (scott)
- qlist migration (eric)
- power fixes (laurent)
- migration improvements (yury)
- lots of fixes (wei)

----------------------------------------------------------------

Alexey Romko (1):
  Bug #1829242 correction.

Daniel Henrique Barboza (1):
  ram.c: remove unneeded labels

Dr. David Alan Gilbert (1):
  migration: Rate limit inside host pages

Eric Auger (1):
  migration: Support QLIST migration

Fangrui Song (1):
  migration: Fix incorrect integer->float conversion caught by clang

Jiahui Cen (2):
  migration/multifd: fix nullptr access in terminating multifd threads
  migration/multifd: fix destroyed mutex access in terminating multifd
    threads

Juan Quintela (3):
  migration-test: Add migration multifd test
  migration: Make sure that we don't call write() in case of error
  migration-test: introduce functions to handle string parameters

Laurent Vivier (2):
  migration-test: ppc64: fix FORTH test program
  runstate: ignore finishmigrate -> prelaunch transition

Marc-André Lureau (1):
  misc: use QEMU_IS_ALIGNED

Peter Xu (3):
  migration: Define VMSTATE_INSTANCE_ID_ANY
  migration: Change SaveStateEntry.instance_id into uint32_t
  apic: Use 32bit APIC ID for migration instance ID

Scott Cheloha (2):
  migration: add savevm_state_handler_remove()
  migration: savevm_state_handler_insert: constant-time element
    insertion

Wei Yang (8):
  migration/postcopy: reduce memset when it is zero page and
    matches_target_page_size
  migration/postcopy: wait for decompress thread in precopy
  migration/postcopy: count target page number to decide the
    place_needed
  migration/postcopy: set all_zero to true on the first target page
  migration/postcopy: enable random order target page arrival
  migration/postcopy: enable compress during postcopy
  migration/multifd: clean pages after filling packet
  migration/multifd: not use multifd during postcopy

Yury Kotov (2):
  migration: Fix the re-run check of the migrate-incoming command
  migration/ram: Yield periodically to the main loop

 backends/dbus-vmstate.c      |   3 +-
 exec.c                       |   4 +-
 hw/arm/stellaris.c           |   2 +-
 hw/core/qdev.c               |   3 +-
 hw/display/ads7846.c         |   2 +-
 hw/i2c/core.c                |   2 +-
 hw/input/stellaris_input.c   |   3 +-
 hw/intc/apic_common.c        |   7 +-
 hw/misc/max111x.c            |   3 +-
 hw/net/eepro100.c            |   3 +-
 hw/pci/pci.c                 |   2 +-
 hw/ppc/spapr.c               |   2 +-
 hw/timer/arm_timer.c         |   2 +-
 hw/tpm/tpm_emulator.c        |   3 +-
 include/migration/register.h |   2 +-
 include/migration/vmstate.h  |  25 ++++-
 include/qemu/queue.h         |  39 ++++++++
 migration/migration.c        |  72 +++++++-------
 migration/migration.h        |   1 +
 migration/ram.c              | 181 ++++++++++++++++++++++++++---------
 migration/savevm.c           |  61 ++++++++----
 migration/trace-events       |   9 +-
 migration/vmstate-types.c    |  70 ++++++++++++++
 stubs/vmstate.c              |   2 +-
 tests/migration-test.c       |  97 ++++++++++++++++++-
 tests/test-vmstate.c         | 170 ++++++++++++++++++++++++++++++++
 vl.c                         |  10 +-
 27 files changed, 652 insertions(+), 128 deletions(-)

-- 
2.24.1


Re: [PULL 00/28] Migration pull patches
Posted by Peter Maydell 4 years, 3 months ago
On Fri, 10 Jan 2020 at 17:32, Juan Quintela <quintela@redhat.com> wrote:
>
> The following changes since commit f38a71b01f839c7b65ea73ddd507903cb9489ed6:
>
>   Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-semihosting-090120-2' into staging (2020-01-10 13:19:34 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/juanquintela/qemu.git tags/migration-pull-pull-request
>
> for you to fetch changes up to cc708d2411d3ed2ab4a428c996b778c7c7a47a04:
>
>   apic: Use 32bit APIC ID for migration instance ID (2020-01-10 18:19:18 +0100)
>
> ----------------------------------------------------------------
> Migration pull request
>
> - several multifd fixes (jiahui, me)
> - rate limit host pages (david)
> - remove unneeded labels (daniel)
> - several multifd fixes (wei)
> - improve handler insert (scott)
> - qlist migration (eric)
> - power fixes (laurent)
> - migration improvements (yury)
> - lots of fixes (wei)

Hi. This causes a new compile warning for the netbsd VM:

In file included from
/home/qemu/qemu-test.tqjNTZ/src/include/hw/qdev-core.h:4:0,
                 from
/home/qemu/qemu-test.tqjNTZ/src/tests/../migration/migration.h:18,
                 from /home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c:27:
/home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c: In function
'manipulate_container':
/home/qemu/qemu-test.tqjNTZ/src/include/qemu/queue.h:130:34: warning:
'prev' may be used uninitialized in this function
[-Wmaybe-uninitialized]
         (listelm)->field.le_prev = &(elm)->field.le_next;               \
                                  ^
/home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c:1337:24: note:
'prev' was declared here
      TestQListElement *prev, *iter = QLIST_FIRST(&c->list);
                        ^
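
For readers unfamiliar with this class of warning: the compiler cannot prove that 'prev' is assigned on every path before it is used. The sketch below is not the test-vmstate.c code and not necessarily the fix that was applied; it uses a hypothetical singly linked list (DemoNode, demo_find_prev) to show the usual remedy, which is to give the trailing pointer a defined starting value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
typedef struct DemoNode {
    int id;
    struct DemoNode *next;
} DemoNode;

/*
 * Return the node just before the first node whose id equals 'id',
 * or NULL if that node is the head or no node matches.
 */
static DemoNode *demo_find_prev(DemoNode *head, int id)
{
    DemoNode *prev = NULL;  /* defined even if the loop body never runs */
    DemoNode *iter;

    for (iter = head; iter != NULL; iter = iter->next) {
        if (iter->id == id) {
            return prev;
        }
        prev = iter;
    }
    return NULL;
}
```

Initializing at the declaration costs nothing at run time and keeps the function well defined for an empty list.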


I also saw this on aarch32 host (more precisely, on the
aarch32-environment-in-aarch64-chroot setup I use for aarch32 build
and test):

malloc_consolidate(): invalid chunk size
Broken pipe
qemu-system-i386: check_section_footer: Read section footer failed: -5
qemu-system-i386: load of migration failed: Invalid argument
/home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
terminate QEMU process but encountered exit status 1 (expected 0)
Aborted
ERROR - too few tests run (expected 14, got 13)

The memory corruption is reproducible running just the
/x86_64/migration/multifd/tcp subtest:

(armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
tests/migration-test -p /x86_64/migration/multifd/tcp
/x86_64/migration/multifd/tcp: qemu-system-x86_64: -accel kvm: invalid
accelerator kvm
qemu-system-x86_64: falling back to tcg
qemu-system-x86_64: -accel kvm: invalid accelerator kvm
qemu-system-x86_64: falling back to tcg
qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
qemu-system-x86_64: failed to save SaveStateEntry with id(name): 3(ram)
double free or corruption (!prev)
Broken pipe
qemu-system-x86_64: Unknown combination of migration flags: 0
qemu-system-x86_64: error while loading state section id 3(ram)
qemu-system-x86_64: load of migration failed: Invalid argument
/home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
terminate QEMU process but encountered exit status 1 (expected 0)
Aborted

Here's what a valgrind run in that aarch32 setup produces:

(armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
QTEST_QEMU_BINARY='valgrind --smc-check=all-non-file
x86_64-softmmu/qemu-system-x86_64' tests/migration-test -p
/x86_64/migration/multifd/tcp
/x86_64/migration/multifd/tcp: ==12102== Memcheck, a memory error detector
==12102== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12102== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12102== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
chardev=char0,mode=control -display none -accel kvm -accel tcg -name
source,debug-threads=on -m 150M -serial
file:/tmp/migration-test-UlotFX/src_serial -drive
file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
==12102==
qemu-system-x86_64: -accel kvm: invalid accelerator kvm
qemu-system-x86_64: falling back to tcg
==12108== Memcheck, a memory error detector
==12108== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12108== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12108== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
chardev=char0,mode=control -display none -accel kvm -accel tcg -name
target,debug-threads=on -m 150M -serial
file:/tmp/migration-test-UlotFX/dest_serial -incoming defer -drive
file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
==12108==
qemu-system-x86_64: -accel kvm: invalid accelerator kvm
qemu-system-x86_64: falling back to tcg
==12102== Thread 22 multifdsend_15:
==12102== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==12102==    at 0x53C7F06: __libc_do_syscall (libc-do-syscall.S:47)
==12102==    by 0x53C6FCB: sendmsg (sendmsg.c:28)
==12102==    by 0x51B9A9: qio_channel_socket_writev (channel-socket.c:561)
==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
==12102==    by 0x25CB17: multifd_send_initial_packet (ram.c:714)
==12102==    by 0x25CB17: multifd_send_thread (ram.c:1136)
==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
==12102==    by 0x54767FB: ??? (clone.S:73)
==12102==  Address 0x262103fd is on thread 22's stack
==12102==  in frame #5, created by multifd_send_thread (ram.c:1127)
==12102==
==12102== Thread 6 multifdsend_1:
==12102== Invalid write of size 4
==12102==    at 0x25CC08: multifd_send_fill_packet (ram.c:806)
==12102==    by 0x25CC08: multifd_send_thread (ram.c:1157)
==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
==12102==    by 0x54767FB: ??? (clone.S:73)
==12102==  Address 0x1d89c470 is 0 bytes after a block of size 832 alloc'd
==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
==12102==    by 0x49EE269: g_malloc0 (in
/usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
==12102==
==12102== Invalid write of size 4
==12102==    at 0x25CC0E: multifd_send_fill_packet (ram.c:806)
==12102==    by 0x25CC0E: multifd_send_thread (ram.c:1157)
==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
==12102==    by 0x54767FB: ??? (clone.S:73)
==12102==  Address 0x1d89c474 is 4 bytes after a block of size 832 alloc'd
==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
==12102==    by 0x49EE269: g_malloc0 (in
/usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
==12102==
==12102== Invalid read of size 4
==12102==    at 0x519812: qio_channel_writev_full (channel.c:86)
==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
==12102==    by 0x25CC6D: multifd_send_thread (ram.c:1168)
==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
==12102==    by 0x54767FB: ??? (clone.S:73)
==12102==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==12102==
==12102==
==12102== Process terminating with default action of signal 11 (SIGSEGV)
==12102==  Access not within mapped region at address 0x30
==12102==    at 0x519812: qio_channel_writev_full (channel.c:86)
==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
==12102==    by 0x25CC6D: multifd_send_thread (ram.c:1168)
==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
==12102==    by 0x54767FB: ??? (clone.S:73)
==12102==  If you believe this happened as a result of a stack
==12102==  overflow in your program's main thread (unlikely but
==12102==  possible), you can try to increase the size of the
==12102==  main thread stack using the --main-stacksize= flag.
==12102==  The main thread stack size used in this run was 8388608.
==12102==
==12102== HEAP SUMMARY:
==12102==     in use at exit: 7,159,914 bytes in 28,035 blocks
==12102==   total heap usage: 370,889 allocs, 342,854 frees,
34,875,720 bytes allocated
==12102==
==12102== LEAK SUMMARY:
==12102==    definitely lost: 56 bytes in 1 blocks
==12102==    indirectly lost: 64 bytes in 2 blocks
==12102==      possibly lost: 5,916 bytes in 58 blocks
==12102==    still reachable: 7,153,878 bytes in 27,974 blocks
==12102==                       of which reachable via heuristic:
==12102==                         newarray           : 832 bytes in 16 blocks
==12102==         suppressed: 0 bytes in 0 blocks
==12102== Rerun with --leak-check=full to see details of leaked memory
==12102==
==12102== For counts of detected and suppressed errors, rerun with: -v
==12102== Use --track-origins=yes to see where uninitialised values come from
==12102== ERROR SUMMARY: 80 errors from 4 contexts (suppressed: 6 from 3)
Broken pipe
qemu-system-x86_64: load of migration failed: Input/output error
==12108==
==12108== HEAP SUMMARY:
==12108==     in use at exit: 6,321,388 bytes in 21,290 blocks
==12108==   total heap usage: 59,082 allocs, 37,792 frees, 23,874,965
bytes allocated
==12108==
==12108== LEAK SUMMARY:
==12108==    definitely lost: 0 bytes in 0 blocks
==12108==    indirectly lost: 0 bytes in 0 blocks
==12108==      possibly lost: 5,440 bytes in 37 blocks
==12108==    still reachable: 6,315,948 bytes in 21,253 blocks
==12108==                       of which reachable via heuristic:
==12108==                         newarray           : 832 bytes in 16 blocks
==12108==         suppressed: 0 bytes in 0 blocks
==12108== Rerun with --leak-check=full to see details of leaked memory
==12108==
==12108== For counts of detected and suppressed errors, rerun with: -v
==12108== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 3)
/home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
terminate QEMU process but encountered exit status 1 (expected 0)
Aborted


thanks
-- PMM

Re: [PULL 00/28] Migration pull patches
Posted by Daniel P. Berrangé 4 years, 3 months ago
On Mon, Jan 13, 2020 at 01:05:22PM +0000, Peter Maydell wrote:
> On Fri, 10 Jan 2020 at 17:32, Juan Quintela <quintela@redhat.com> wrote:
> >
> > The following changes since commit f38a71b01f839c7b65ea73ddd507903cb9489ed6:
> >
> >   Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-semihosting-090120-2' into staging (2020-01-10 13:19:34 +0000)
> >
> > are available in the Git repository at:
> >
> >   https://github.com/juanquintela/qemu.git tags/migration-pull-pull-request
> >
> > for you to fetch changes up to cc708d2411d3ed2ab4a428c996b778c7c7a47a04:
> >
> >   apic: Use 32bit APIC ID for migration instance ID (2020-01-10 18:19:18 +0100)
> >

[snip]

> I also saw this on aarch32 host (more precisely, on the
> aarch32-environment-in-aarch64-chroot setup I use for aarch32 build
> and test):
> 
> malloc_consolidate(): invalid chunk size
> Broken pipe
> qemu-system-i386: check_section_footer: Read section footer failed: -5
> qemu-system-i386: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1 (expected 0)
> Aborted
> ERROR - too few tests run (expected 14, got 13)
> 
> The memory corruption is reproducible running just the
> /x86_64/migration/multifd/tcp subtest:
> 
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> tests/migration-test -p /x86_64/migration/multifd/tcp
> /x86_64/migration/multifd/tcp: qemu-system-x86_64: -accel kvm: invalid
> accelerator kvm
> qemu-system-x86_64: falling back to tcg
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
> qemu-system-x86_64: failed to save SaveStateEntry with id(name): 3(ram)
> double free or corruption (!prev)
> Broken pipe
> qemu-system-x86_64: Unknown combination of migration flags: 0
> qemu-system-x86_64: error while loading state section id 3(ram)
> qemu-system-x86_64: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1 (expected 0)
> Aborted
> 
> Here's what a valgrind run in that aarch32 setup produces:
> 
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY='valgrind --smc-check=all-non-file
> x86_64-softmmu/qemu-system-x86_64' tests/migration-test -p
> /x86_64/migration/multifd/tcp
> /x86_64/migration/multifd/tcp: ==12102== Memcheck, a memory error detector
> ==12102== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==12102== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==12102== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
> unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
> source,debug-threads=on -m 150M -serial
> file:/tmp/migration-test-UlotFX/src_serial -drive
> file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
> ==12102==
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> ==12108== Memcheck, a memory error detector
> ==12108== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==12108== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==12108== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
> unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
> target,debug-threads=on -m 150M -serial
> file:/tmp/migration-test-UlotFX/dest_serial -incoming defer -drive
> file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
> ==12108==
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> ==12102== Thread 22 multifdsend_15:
> ==12102== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
> ==12102==    at 0x53C7F06: __libc_do_syscall (libc-do-syscall.S:47)
> ==12102==    by 0x53C6FCB: sendmsg (sendmsg.c:28)
> ==12102==    by 0x51B9A9: qio_channel_socket_writev (channel-socket.c:561)
> ==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
> ==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
> ==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
> ==12102==    by 0x25CB17: multifd_send_initial_packet (ram.c:714)
> ==12102==    by 0x25CB17: multifd_send_thread (ram.c:1136)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x262103fd is on thread 22's stack
> ==12102==  in frame #5, created by multifd_send_thread (ram.c:1127)

Missing initialization of MultiFDInit_t msg; to all zeros.
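
A minimal sketch of that kind of fix, assuming a stack-allocated message whose reserved fields are sent on the wire along with the populated ones. DemoInitMsg and its fields are hypothetical stand-ins, not the real MultiFDInit_t layout:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical handshake message; field names and sizes are illustrative. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint8_t  uuid[16];
    uint8_t  id;
    uint8_t  unused1[7];   /* reserved, but still transmitted */
    uint64_t unused2[4];   /* reserved, but still transmitted */
} DemoInitMsg;

/*
 * Fill 'msg' for the given channel. The memset() clears the whole
 * object first, so reserved bytes never carry stale stack contents.
 */
static void demo_build_init_msg(DemoInitMsg *msg, uint8_t channel_id)
{
    assert(msg != NULL);
    memset(msg, 0, sizeof(*msg));
    msg->magic = 0x11223344u;   /* arbitrary example value */
    msg->version = 1;
    msg->id = channel_id;
}

/* Return 1 if every reserved byte of 'msg' is zero, else 0. */
static int demo_reserved_is_zero(const DemoInitMsg *msg)
{
    static const DemoInitMsg zero;  /* static storage is zero-initialized */

    return memcmp(msg->unused1, zero.unused1, sizeof(msg->unused1)) == 0 &&
           memcmp(msg->unused2, zero.unused2, sizeof(msg->unused2)) == 0;
}
```

Without the memset(), only the explicitly assigned fields would be defined, which is the kind of situation Valgrind reports as a sendmsg() buffer pointing to uninitialised bytes.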

> ==12102==
> ==12102== Thread 6 multifdsend_1:
> ==12102== Invalid write of size 4
> ==12102==    at 0x25CC08: multifd_send_fill_packet (ram.c:806)
> ==12102==    by 0x25CC08: multifd_send_thread (ram.c:1157)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x1d89c470 is 0 bytes after a block of size 832 alloc'd
> ==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==12102==    by 0x49EE269: g_malloc0 (in
> /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)

This is the same issue that was reported last time this multifd unit
test was proposed for merge. Back then I pointed out the likely cause.
We were allocating a ram_addr_t sized quantity for an array which is
uint64_t, and ram_addr_t is probably 32-bit on this particular build.

  https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03428.html

That suggested fix doesn't seem to have been included.
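
To illustrate the size mismatch described above, here is a self-contained sketch. DemoPacket, demo_ram_addr_t and the helper names are hypothetical and only model the pattern; they are not the code in migration/ram.c. The assumption is a packet with a flexible array of uint64_t offsets, allocated with an element size taken from an address type that is only 32 bits wide on the affected build:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the address type is 32-bit on this build. */
typedef uint32_t demo_ram_addr_t;

/* Hypothetical packet with one 64-bit offset slot per page. */
typedef struct {
    uint32_t magic;
    uint32_t pages_used;
    uint64_t offset[];     /* flexible array member */
} DemoPacket;

/* Undersized: element size comes from the wrong type. */
static size_t demo_packet_len_buggy(size_t page_count)
{
    return sizeof(DemoPacket) + sizeof(demo_ram_addr_t) * page_count;
}

/* Correct: element size matches the array's declared element type. */
static size_t demo_packet_len_fixed(size_t page_count)
{
    return sizeof(DemoPacket) + sizeof(uint64_t) * page_count;
}

/* Allocate a zeroed packet with room for 'page_count' offsets. */
static DemoPacket *demo_packet_new(size_t page_count)
{
    DemoPacket *p = calloc(1, demo_packet_len_fixed(page_count));

    assert(p != NULL);
    return p;
}
```

With the undersized length, writes to the upper half of offset[] land past the end of the allocation, which matches the shape of the "Invalid write ... bytes after a block" reports. When both types are 64-bit, the two lengths coincide, which would explain why the problem only shows up on a 32-bit build.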


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PULL 00/28] Migration pull patches
Posted by Juan Quintela 4 years, 3 months ago
Daniel P. Berrangé <berrange@redhat.com> wrote:
>> I also saw this on aarch32 host (more precisely, on the
>> aarch32-environment-in-aarch64-chroot setup I use for aarch32 build
>> and test):
>> 
>> malloc_consolidate(): invalid chunk size
>> Broken pipe
>> qemu-system-i386: check_section_footer: Read section footer failed: -5
>> qemu-system-i386: load of migration failed: Invalid argument
>> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
>> terminate QEMU process but encountered exit status 1 (expected 0)
>> Aborted
>> ERROR - too few tests run (expected 14, got 13)
>> 
>> The memory corruption is reproducible running just the
>> /x86_64/migration/multifd/tcp subtest:
>> 
>> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
>> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
>> tests/migration-test -p /x86_64/migration/multifd/tcp
>> /x86_64/migration/multifd/tcp: qemu-system-x86_64: -accel kvm: invalid
>> accelerator kvm
>> qemu-system-x86_64: falling back to tcg
>> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
>> qemu-system-x86_64: falling back to tcg
>> qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
>> qemu-system-x86_64: failed to save SaveStateEntry with id(name): 3(ram)
>> double free or corruption (!prev)
>> Broken pipe
>> qemu-system-x86_64: Unknown combination of migration flags: 0
>> qemu-system-x86_64: error while loading state section id 3(ram)
>> qemu-system-x86_64: load of migration failed: Invalid argument
>> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
>> terminate QEMU process but encountered exit status 1 (expected 0)
>> Aborted
>> 
>> Here's what a valgrind run in that aarch32 setup produces:
>> 
>
> Missing initialization of MultiFDInit_t msg; to all zeros.

I *thought* it was in.  Sorry.

>
>> ==12102==
>> ==12102== Thread 6 multifdsend_1:
>> ==12102== Invalid write of size 4
>> ==12102==    at 0x25CC08: multifd_send_fill_packet (ram.c:806)
>> ==12102==    by 0x25CC08: multifd_send_thread (ram.c:1157)
>> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
>> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
>> ==12102==    by 0x54767FB: ??? (clone.S:73)
>> ==12102==  Address 0x1d89c470 is 0 bytes after a block of size 832 alloc'd
>> ==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
>> ==12102==    by 0x49EE269: g_malloc0 (in
>> /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
>
> This is the same issue that was reported last time this multifd unit
> test was proposed for merge. Back then I pointed out the likely cause.
> We were allocating  ram_addr_t sized quantity for an array which is
> uint64_t, and ram_addr_t is probably 32-bit on this particular build.
>
>   https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03428.html
>
> That suggested fix doesn't seem to have been included

Thanks again.

And sorry for the disturbance.


Re: [PULL 00/28] Migration pull patches
Posted by Auger Eric 4 years, 3 months ago
Hi Juan, Peter,

On 1/13/20 2:05 PM, Peter Maydell wrote:
> On Fri, 10 Jan 2020 at 17:32, Juan Quintela <quintela@redhat.com> wrote:
>>
>> The following changes since commit f38a71b01f839c7b65ea73ddd507903cb9489ed6:
>>
>>   Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-semihosting-090120-2' into staging (2020-01-10 13:19:34 +0000)
>>
>> are available in the Git repository at:
>>
>>   https://github.com/juanquintela/qemu.git tags/migration-pull-pull-request
>>
>> for you to fetch changes up to cc708d2411d3ed2ab4a428c996b778c7c7a47a04:
>>
>>   apic: Use 32bit APIC ID for migration instance ID (2020-01-10 18:19:18 +0100)
>>
>> ----------------------------------------------------------------
>> Migration pull request
>>
>> - several multifd fixes (jiahui, me)
>> - rate limit host pages (david)
>> - remove unneeded labels (daniel)
>> - several multifd fixes (wei)
>> - improve handler insert (scott)
>> - qlist migration (eric)
>> - power fixes (laurent)
>> - migration improvements (yury)
>> - lots of fixes (wei)
> 
> Hi. This causes a new compile warning for the netbsd VM:
> 
> In file included from
> /home/qemu/qemu-test.tqjNTZ/src/include/hw/qdev-core.h:4:0,
>                  from
> /home/qemu/qemu-test.tqjNTZ/src/tests/../migration/migration.h:18,
>                  from /home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c:27:
> /home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c: In function
> 'manipulate_container':
> /home/qemu/qemu-test.tqjNTZ/src/include/qemu/queue.h:130:34: warning:
> 'prev' may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>          (listelm)->field.le_prev = &(elm)->field.le_next;               \
>                                   ^
> /home/qemu/qemu-test.tqjNTZ/src/tests/test-vmstate.c:1337:24: note:
> 'prev' was declared here
>       TestQListElement *prev, *iter = QLIST_FIRST(&c->list);
>                         ^
>
I just sent "[PATCH v7] migration: Support QLIST migration"
It should fix that warning.

Sorry for the inconvenience.

Thanks

Eric
> 
> 
> I also saw this on aarch32 host (more precisely, on the
> aarch32-environment-in-aarch64-chroot setup I use for aarch32 build
> and test):
> 
> malloc_consolidate(): invalid chunk size
> Broken pipe
> qemu-system-i386: check_section_footer: Read section footer failed: -5
> qemu-system-i386: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1 (expected 0)
> Aborted
> ERROR - too few tests run (expected 14, got 13)
> 
> The memory corruption is reproducible running just the
> /x86_64/migration/multifd/tcp subtest:
> 
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> tests/migration-test -p /x86_64/migration/multifd/tcp
> /x86_64/migration/multifd/tcp: qemu-system-x86_64: -accel kvm: invalid
> accelerator kvm
> qemu-system-x86_64: falling back to tcg
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
> qemu-system-x86_64: failed to save SaveStateEntry with id(name): 3(ram)
> double free or corruption (!prev)
> Broken pipe
> qemu-system-x86_64: Unknown combination of migration flags: 0
> qemu-system-x86_64: error while loading state section id 3(ram)
> qemu-system-x86_64: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1 (expected 0)
> Aborted
> 
> Here's what a valgrind run in that aarch32 setup produces:
> 
> (armhf)pmaydell@mustang-maydell:~/qemu/build/all-a32$
> QTEST_QEMU_BINARY='valgrind --smc-check=all-non-file
> x86_64-softmmu/qemu-system-x86_64' tests/migration-test -p
> /x86_64/migration/multifd/tcp
> /x86_64/migration/multifd/tcp: ==12102== Memcheck, a memory error detector
> ==12102== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==12102== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==12102== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
> unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
> source,debug-threads=on -m 150M -serial
> file:/tmp/migration-test-UlotFX/src_serial -drive
> file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
> ==12102==
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> ==12108== Memcheck, a memory error detector
> ==12108== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==12108== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==12108== Command: x86_64-softmmu/qemu-system-x86_64 -qtest
> unix:/tmp/qtest-12100.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-12100.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -accel kvm -accel tcg -name
> target,debug-threads=on -m 150M -serial
> file:/tmp/migration-test-UlotFX/dest_serial -incoming defer -drive
> file=/tmp/migration-test-UlotFX/bootsect,format=raw -accel qtest
> ==12108==
> qemu-system-x86_64: -accel kvm: invalid accelerator kvm
> qemu-system-x86_64: falling back to tcg
> ==12102== Thread 22 multifdsend_15:
> ==12102== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
> ==12102==    at 0x53C7F06: __libc_do_syscall (libc-do-syscall.S:47)
> ==12102==    by 0x53C6FCB: sendmsg (sendmsg.c:28)
> ==12102==    by 0x51B9A9: qio_channel_socket_writev (channel-socket.c:561)
> ==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
> ==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
> ==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
> ==12102==    by 0x25CB17: multifd_send_initial_packet (ram.c:714)
> ==12102==    by 0x25CB17: multifd_send_thread (ram.c:1136)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x262103fd is on thread 22's stack
> ==12102==  in frame #5, created by multifd_send_thread (ram.c:1127)
> ==12102==
> ==12102== Thread 6 multifdsend_1:
> ==12102== Invalid write of size 4
> ==12102==    at 0x25CC08: multifd_send_fill_packet (ram.c:806)
> ==12102==    by 0x25CC08: multifd_send_thread (ram.c:1157)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x1d89c470 is 0 bytes after a block of size 832 alloc'd
> ==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==12102==    by 0x49EE269: g_malloc0 (in /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
> ==12102==
> ==12102== Invalid write of size 4
> ==12102==    at 0x25CC0E: multifd_send_fill_packet (ram.c:806)
> ==12102==    by 0x25CC0E: multifd_send_thread (ram.c:1157)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x1d89c474 is 4 bytes after a block of size 832 alloc'd
> ==12102==    at 0x4841BC4: calloc (vg_replace_malloc.c:711)
> ==12102==    by 0x49EE269: g_malloc0 (in /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0.5600.4)
> ==12102==
> ==12102== Invalid read of size 4
> ==12102==    at 0x519812: qio_channel_writev_full (channel.c:86)
> ==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
> ==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
> ==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
> ==12102==    by 0x25CC6D: multifd_send_thread (ram.c:1168)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
> ==12102==
> ==12102==
> ==12102== Process terminating with default action of signal 11 (SIGSEGV)
> ==12102==  Access not within mapped region at address 0x30
> ==12102==    at 0x519812: qio_channel_writev_full (channel.c:86)
> ==12102==    by 0x519FCD: qio_channel_writev (channel.c:207)
> ==12102==    by 0x519FCD: qio_channel_writev_all (channel.c:171)
> ==12102==    by 0x51A047: qio_channel_write_all (channel.c:257)
> ==12102==    by 0x25CC6D: multifd_send_thread (ram.c:1168)
> ==12102==    by 0x557551: qemu_thread_start (qemu-thread-posix.c:519)
> ==12102==    by 0x53BE613: start_thread (pthread_create.c:463)
> ==12102==    by 0x54767FB: ??? (clone.S:73)
> ==12102==  If you believe this happened as a result of a stack
> ==12102==  overflow in your program's main thread (unlikely but
> ==12102==  possible), you can try to increase the size of the
> ==12102==  main thread stack using the --main-stacksize= flag.
> ==12102==  The main thread stack size used in this run was 8388608.
> ==12102==
> ==12102== HEAP SUMMARY:
> ==12102==     in use at exit: 7,159,914 bytes in 28,035 blocks
> ==12102==   total heap usage: 370,889 allocs, 342,854 frees, 34,875,720 bytes allocated
> ==12102==
> ==12102== LEAK SUMMARY:
> ==12102==    definitely lost: 56 bytes in 1 blocks
> ==12102==    indirectly lost: 64 bytes in 2 blocks
> ==12102==      possibly lost: 5,916 bytes in 58 blocks
> ==12102==    still reachable: 7,153,878 bytes in 27,974 blocks
> ==12102==                       of which reachable via heuristic:
> ==12102==                         newarray           : 832 bytes in 16 blocks
> ==12102==         suppressed: 0 bytes in 0 blocks
> ==12102== Rerun with --leak-check=full to see details of leaked memory
> ==12102==
> ==12102== For counts of detected and suppressed errors, rerun with: -v
> ==12102== Use --track-origins=yes to see where uninitialised values come from
> ==12102== ERROR SUMMARY: 80 errors from 4 contexts (suppressed: 6 from 3)
> Broken pipe
> qemu-system-x86_64: load of migration failed: Input/output error
> ==12108==
> ==12108== HEAP SUMMARY:
> ==12108==     in use at exit: 6,321,388 bytes in 21,290 blocks
> ==12108==   total heap usage: 59,082 allocs, 37,792 frees, 23,874,965 bytes allocated
> ==12108==
> ==12108== LEAK SUMMARY:
> ==12108==    definitely lost: 0 bytes in 0 blocks
> ==12108==    indirectly lost: 0 bytes in 0 blocks
> ==12108==      possibly lost: 5,440 bytes in 37 blocks
> ==12108==    still reachable: 6,315,948 bytes in 21,253 blocks
> ==12108==                       of which reachable via heuristic:
> ==12108==                         newarray           : 832 bytes in 16 blocks
> ==12108==         suppressed: 0 bytes in 0 blocks
> ==12108== Rerun with --leak-check=full to see details of leaked memory
> ==12108==
> ==12108== For counts of detected and suppressed errors, rerun with: -v
> ==12108== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 3)
> /home/peter.maydell/qemu/tests/libqtest.c:140: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
> Aborted
> 
> 
> thanks
> -- PMM
>