[Qemu-devel] [PULL 00/10] Migration patches

Juan Quintela posted 10 patches 6 years, 8 months ago
Test asan failed
Test docker-clang@ubuntu passed
Test docker-mingw@fedora passed
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20190313115121.7611-1-quintela@redhat.com
Maintainers: Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Juan Quintela <quintela@redhat.com>, Thomas Huth <thuth@redhat.com>
There is a newer version of this series
[Qemu-devel] [PULL 00/10] Migration patches
Posted by Juan Quintela 6 years, 8 months ago
The following changes since commit 3f3bbfc7cef4490c5ed5550766a81e7d18f08db1:

  Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-03-12' into staging (2019-03-12 21:06:26 +0000)

are available in the Git repository at:

  https://github.com/juanquintela/qemu.git tags/migration-pull-request

for you to fetch changes up to d677dc8d443e00bb9472fda6cc95ed2256f49670:

  migration: add support for a "tls-authz" migration parameter (2019-03-13 12:28:21 +0100)

----------------------------------------------------------------
Migration fixes

- multifd "upgrade" from experimental
- tls auth change  (danp)

Later, Juan.

----------------------------------------------------------------

Daniel P. Berrangé (1):
  migration: add support for a "tls-authz" migration parameter

Juan Quintela (9):
  multifd: Only send pages when packet are not empty
  multifd: Rename "size" member to pages_alloc
  multifd: Create new next_packet_size field
  multifd: Drop x-multifd-page-count parameter
  multifd: Be flexible about packet size
  multifd: Change default packet size
  multifd: Add some padding
  multifd: Drop x-
  tests: Add migration multifd test

 hmp.c                  | 26 +++++++-------
 migration/migration.c  | 64 +++++++++++----------------------
 migration/migration.h  |  1 -
 migration/ram.c        | 82 ++++++++++++++++++++++++++++++------------
 migration/tls.c        |  2 +-
 migration/trace-events |  4 +--
 qapi/migration.json    | 59 +++++++++++++++---------------
 tests/migration-test.c | 48 +++++++++++++++++++++++++
 8 files changed, 176 insertions(+), 110 deletions(-)

-- 
2.20.1


Re: [Qemu-devel] [PULL 00/10] Migration patches
Posted by Peter Maydell 6 years, 8 months ago
On Wed, 13 Mar 2019 at 12:14, Juan Quintela <quintela@redhat.com> wrote:
>
> The following changes since commit 3f3bbfc7cef4490c5ed5550766a81e7d18f08db1:
>
>   Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-03-12' into staging (2019-03-12 21:06:26 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/juanquintela/qemu.git tags/migration-pull-request
>
> for you to fetch changes up to d677dc8d443e00bb9472fda6cc95ed2256f49670:
>
>   migration: add support for a "tls-authz" migration parameter (2019-03-13 12:28:21 +0100)
>
> ----------------------------------------------------------------
> Migration fixes
>
> - multifd "upgrade" from experimental
> - tls auth change  (danp)
>
> Later, Juan.
>
> ----------------------------------------------------------------

For aarch32 host i386 guest I got this failure on the
migration-test /i386/migration/multifd/tcp test:

*** Error in `i386-softmmu/qemu-system-i386': malloc(): smallbin double linked list corrupted: 0x01b564b8 ***
qemu-system-i386: check_section_footer: Read section footer failed: -5
Broken pipe
qemu-system-i386: load of migration failed: Invalid argument
/home/peter.maydell/qemu/tests/libqtest.c:135: kill_qemu() tried to terminate QEMU process but encountered exit status 1
Aborted
ERROR - too few tests run (expected 8, got 7)

I did a retry by hand of 'make check-qtest-i386' and this time
it hung entirely when trying to run that test case.

I also tried a run under valgrind on x86 (you'll need to
nobble KVM somehow as valgrind can't deal with it):
  QTEST_QEMU_BINARY='valgrind --smc-check=all i386-softmmu/qemu-system-i386' \
    QTEST_QEMU_IMG=qemu-img \
    tests/migration-test -m=quick -p '/i386/migration/multifd/tcp'

and that seems to have hung too.

thanks
-- PMM

Re: [Qemu-devel] [PULL 00/10] Migration patches
Posted by Peter Maydell 6 years, 8 months ago
On Thu, 14 Mar 2019 at 11:48, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> For aarch32 host i386 guest I got this failure on the
> migration-test /i386/migration/multifd/tcp test:
>
> *** Error in `i386-softmmu/qemu-system-i386': malloc(): smallbin
> double linked list corrupted: 0x01b564b8 ***
> qemu-system-i386: check_section_footer: Read section footer failed: -5
> Broken pipe
> qemu-system-i386: load of migration failed: Invalid argument
> /home/peter.maydell/qemu/tests/libqtest.c:135: kill_qemu() tried to
> terminate QEMU process but encountered exit status 1
> Aborted
> ERROR - too few tests run (expected 8, got 7)

I ran the test on the arm box under valgrind and got this, which might
or might not be helpful:

==18714== Thread 5 multifdsend_1:
==18714== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==18714==    at 0xF809B14: sendmsg (syscall-template.S:84)
==18714==    by 0x303825: qio_channel_socket_writev (channel-socket.c:543)
==18714==    by 0x302021: qio_channel_writev (channel.c:206)
==18714==    by 0x302021: qio_channel_writev_all (channel.c:170)
==18714==    by 0x302093: qio_channel_write_all (channel.c:256)
==18714==    by 0x7C12F: multifd_send_initial_packet (ram.c:713)
==18714==    by 0x7C12F: multifd_send_thread (ram.c:1090)
==18714==    by 0x33A19F: qemu_thread_start (qemu-thread-posix.c:502)
==18714==    by 0xF8025B3: start_thread (pthread_create.c:335)
==18714==    by 0xF8B3BEB: ??? (clone.S:89)
==18714==  Address 0x2123e67d is on thread 5's stack
==18714==  in frame #4, created by multifd_send_thread (ram.c:1082)
==18714==
==18714== Thread 6 multifdsend_0:
==18714== Invalid write of size 4
==18714==    at 0x7C25A: multifd_send_fill_packet (ram.c:808)
==18714==    by 0x7C25A: multifd_send_thread (ram.c:1106)
==18714==    by 0x33A19F: qemu_thread_start (qemu-thread-posix.c:502)
==18714==    by 0xF8025B3: start_thread (pthread_create.c:335)
==18714==    by 0xF8B3BEB: ??? (clone.S:89)
==18714==  Address 0x1e9bc108 is 0 bytes after a block of size 832 alloc'd
==18714==    at 0x4840AB4: calloc (vg_replace_malloc.c:711)
==18714==    by 0xF59BB99: g_malloc0 (in /lib/arm-linux-gnueabihf/libglib-2.0.so.0.4800.2)
==18714==
==18714== Invalid write of size 4
==18714==    at 0x7C260: multifd_send_fill_packet (ram.c:808)
==18714==    by 0x7C260: multifd_send_thread (ram.c:1106)
==18714==    by 0x33A19F: qemu_thread_start (qemu-thread-posix.c:502)
==18714==    by 0xF8025B3: start_thread (pthread_create.c:335)
==18714==    by 0xF8B3BEB: ??? (clone.S:89)
==18714==  Address 0x1e9bc10c is 4 bytes after a block of size 832 alloc'd
==18714==    at 0x4840AB4: calloc (vg_replace_malloc.c:711)
==18714==    by 0xF59BB99: g_malloc0 (in /lib/arm-linux-gnueabihf/libglib-2.0.so.0.4800.2)
==18714==
==18714== Invalid read of size 4
==18714==    at 0x3018F6: qio_channel_writev_full (channel.c:85)
==18714==    by 0x302021: qio_channel_writev (channel.c:206)
==18714==    by 0x302021: qio_channel_writev_all (channel.c:170)
==18714==    by 0x302093: qio_channel_write_all (channel.c:256)
==18714==    by 0x7C2B9: multifd_send_thread (ram.c:1116)
==18714==    by 0x33A19F: qemu_thread_start (qemu-thread-posix.c:502)
==18714==    by 0xF8025B3: start_thread (pthread_create.c:335)
==18714==    by 0xF8B3BEB: ??? (clone.S:89)
==18714==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==18714==
==18714==
==18714== Process terminating with default action of signal 11 (SIGSEGV)
==18714==  Access not within mapped region at address 0x30
==18714==    at 0x3018F6: qio_channel_writev_full (channel.c:85)
==18714==    by 0x302021: qio_channel_writev (channel.c:206)
==18714==    by 0x302021: qio_channel_writev_all (channel.c:170)
==18714==    by 0x302093: qio_channel_write_all (channel.c:256)
==18714==    by 0x7C2B9: multifd_send_thread (ram.c:1116)
==18714==    by 0x33A19F: qemu_thread_start (qemu-thread-posix.c:502)
==18714==    by 0xF8025B3: start_thread (pthread_create.c:335)
==18714==    by 0xF8B3BEB: ??? (clone.S:89)
==18714==  If you believe this happened as a result of a stack
==18714==  overflow in your program's main thread (unlikely but
==18714==  possible), you can try to increase the size of the
==18714==  main thread stack using the --main-stacksize= flag.
==18714==  The main thread stack size used in this run was 8388608.
==18714==

-- we seem to be writing off the end of a buffer.
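
To make the "0 bytes after a block of size 832" report concrete, here is a minimal sketch of one way such an overrun can arise: a flexible array of 64-bit entries whose allocation is sized with a 4-byte element type. DemoPacket, the 320-byte header and the 128-entry count are hypothetical values chosen so that the arithmetic lands on 832. Nothing in this thread confirms that this is the actual cause in migration/ram.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout for illustration; NOT copied from ram.c. */
typedef struct {
    uint8_t  header[320];   /* assumed size of the fixed header */
    uint64_t offset[];      /* one 64-bit entry per page */
} DemoPacket;

/* Bytes needed when entries are sized by the array's real element type. */
static size_t packet_len_correct(size_t pages)
{
    return sizeof(DemoPacket) + pages * sizeof(uint64_t);
}

/* Bytes obtained when entries are (wrongly) sized with a 4-byte type. */
static size_t packet_len_narrow(size_t pages)
{
    return sizeof(DemoPacket) + pages * sizeof(uint32_t);
}

/* How many 64-bit offset[] entries fit in an allocation of len bytes. */
static size_t entries_that_fit(size_t len)
{
    assert(len >= sizeof(DemoPacket));
    return (len - sizeof(DemoPacket)) / sizeof(uint64_t);
}
```

With 128 entries the narrow sizing yields 832 bytes, which holds only 64 of the 64-bit entries, so a write to entry 64 would start exactly at the end of the block. On a 32-bit host a 64-bit store may show up as two 4-byte writes, which would be consistent with the two "Invalid write of size 4" reports at 0 and 4 bytes past the block.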

thanks
-- PMM

Re: [Qemu-devel] [PULL 00/10] Migration patches
Posted by Juan Quintela 6 years, 8 months ago
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Thu, 14 Mar 2019 at 11:48, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> For aarch32 host i386 guest I got this failure on the
>> migration-test /i386/migration/multifd/tcp test:
>>
>> *** Error in `i386-softmmu/qemu-system-i386': malloc(): smallbin
>> double linked list corrupted: 0x01b564b8 ***
>> qemu-system-i386: check_section_footer: Read section footer failed: -5
>> Broken pipe
>> qemu-system-i386: load of migration failed: Invalid argument
>> /home/peter.maydell/qemu/tests/libqtest.c:135: kill_qemu() tried to
>> terminate QEMU process but encountered exit status 1
>> Aborted
>> ERROR - too few tests run (expected 8, got 7)
>
> I ran the test on the arm box under valgrind and got this, which might
> or might not be helpful:

Taking a look, thanks.
Grrr, I think that I am just "stoopid" (TM).

Just a wild guess, in this ARM box, page size is not 4KB, right?

/me goes to look at this anyways.

Later, Juan.



Re: [Qemu-devel] [PULL 00/10] Migration patches
Posted by Peter Maydell 6 years, 8 months ago
On Thu, 14 Mar 2019 at 15:07, Juan Quintela <quintela@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> wrote:
> > On Thu, 14 Mar 2019 at 11:48, Peter Maydell <peter.maydell@linaro.org> wrote:
> >> For aarch32 host i386 guest I got this failure on the
> >> migration-test /i386/migration/multifd/tcp test:
> >>
> >> *** Error in `i386-softmmu/qemu-system-i386': malloc(): smallbin
> >> double linked list corrupted: 0x01b564b8 ***
> >> qemu-system-i386: check_section_footer: Read section footer failed: -5
> >> Broken pipe
> >> qemu-system-i386: load of migration failed: Invalid argument
> >> /home/peter.maydell/qemu/tests/libqtest.c:135: kill_qemu() tried to
> >> terminate QEMU process but encountered exit status 1
> >> Aborted
> >> ERROR - too few tests run (expected 8, got 7)
> >
> > I ran the test on the arm box under valgrind and got this, which might
> > or might not be helpful:
>
> Taking a look, thanks.
> Grrr, I think that I am just "stoopid" (TM).
>
> Just a wild guess, in this ARM box, page size is not 4KB, right?

No, it's 4K. (It's a 32-bit chroot on a 64-bit box but
it's Ubuntu rather than RedHat so it's using 4K pages.)
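
For anyone wanting to check the page-size question on their own host, a minimal sketch using POSIX sysconf() follows. host_page_size() and host_uses_4k_pages() are hypothetical helper names, not QEMU functions, and the result describes the host only; it says nothing about the guest's target page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Host page size in bytes as reported by the OS, or -1 on error. */
static long host_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* True when the host uses 4 KiB pages, as on the box described above. */
static bool host_uses_4k_pages(void)
{
    long size = host_page_size();

    assert(size > 0);   /* sysconf() failing here would be unexpected */
    return size == 4096;
}
```

This only builds on POSIX hosts; a mingw build would need a different query.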

thanks
-- PMM