[PULL 00/21] Migration 20230530 patches

Juan Quintela posted 21 patches 11 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230530182531.6371-1-quintela@redhat.com
Maintainers: Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras <leobras@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
[PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months, 1 week ago
The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:

  Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu into staging (2023-05-29 14:31:52 -0700)

are available in the Git repository at:

  https://gitlab.com/juan.quintela/qemu.git tags/migration-20230530-pull-request

for you to fetch changes up to c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:

  migration/rdma: Check sooner if we are in postcopy for save_page() (2023-05-30 19:23:50 +0200)

----------------------------------------------------------------
Migration 20230530 Pull request (take 2)

Hi

Resend last PULL request, this time it compiles when CONFIG_RDMA is
not configured in.

[take 1]
On this PULL request:

- Set vmstate migration failure right (vladimir)
- Migration QEMUFileHook removal (juan)
- Migration Atomic counters (juan)

Please apply.

----------------------------------------------------------------

Juan Quintela (16):
  migration: Don't abuse qemu_file transferred for RDMA
  migration/RDMA: It is accounting for zero/normal pages in two places
  migration/rdma: Remove QEMUFile parameter when not used
  migration/rdma: Don't use imaginary transfers
  migration: Remove unused qemu_file_credit_transfer()
  migration/rdma: Simplify the function that saves a page
  migration: Create migrate_rdma()
  migration/rdma: Unfold ram_control_before_iterate()
  migration/rdma: Unfold ram_control_after_iterate()
  migration/rdma: Remove all uses of RAM_CONTROL_HOOK
  migration/rdma: Unfold hook_ram_load()
  migration/rdma: Create rdma_control_save_page()
  qemu-file: Remove QEMUFileHooks
  migration/rdma: Move rdma constants from qemu-file.h to rdma.h
  migration/rdma: Remove qemu_ prefix from exported functions
  migration/rdma: Check sooner if we are in postcopy for save_page()

Vladimir Sementsov-Ogievskiy (5):
  runstate: add runstate_get()
  migration: never fail in global_state_store()
  runstate: drop unused runstate_store()
  migration: switch from .vm_was_running to .vm_old_state
  migration: restore vmstate on migration failure

 include/migration/global_state.h |   2 +-
 include/sysemu/runstate.h        |   2 +-
 migration/migration-stats.h      |   4 +
 migration/migration.h            |  12 +-
 migration/options.h              |   1 +
 migration/qemu-file.h            |  59 ----------
 migration/rdma.h                 |  42 +++++++
 migration/global_state.c         |  29 +++--
 migration/migration-stats.c      |   5 +-
 migration/migration.c            |  55 +++++----
 migration/options.c              |   7 ++
 migration/qemu-file.c            |  69 +-----------
 migration/ram.c                  |  66 ++++++-----
 migration/rdma.c                 | 185 +++++++++++++++----------------
 migration/savevm.c               |   6 +-
 softmmu/runstate.c               |  25 ++---
 migration/trace-events           |  30 ++---
 17 files changed, 268 insertions(+), 331 deletions(-)

-- 
2.40.1
Re: [PULL 00/21] Migration 20230530 patches
Posted by Richard Henderson 11 months, 1 week ago
On 5/30/23 11:25, Juan Quintela wrote:
> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
> 
>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu into staging (2023-05-29 14:31:52 -0700)
> 
> are available in the Git repository at:
> 
>    https://gitlab.com/juan.quintela/qemu.git tags/migration-20230530-pull-request
> 
> for you to fetch changes up to c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
> 
>    migration/rdma: Check sooner if we are in postcopy for save_page() (2023-05-30 19:23:50 +0200)
> 
> ----------------------------------------------------------------
> Migration 20230530 Pull request (take 2)
> 
> Hi
> 
> Resend last PULL request, this time it compiles when CONFIG_RDMA is
> not configured in.
> 
> [take 1]
> On this PULL request:
> 
> - Set vmstate migration failure right (vladimir)
> - Migration QEMUFileHook removal (juan)
> - Migration Atomic counters (juan)
> 
> Please apply.
> 
> ----------------------------------------------------------------
> 
> Juan Quintela (16):
>    migration: Don't abuse qemu_file transferred for RDMA
>    migration/RDMA: It is accounting for zero/normal pages in two places
>    migration/rdma: Remove QEMUFile parameter when not used
>    migration/rdma: Don't use imaginary transfers
>    migration: Remove unused qemu_file_credit_transfer()
>    migration/rdma: Simplify the function that saves a page
>    migration: Create migrate_rdma()
>    migration/rdma: Unfold ram_control_before_iterate()
>    migration/rdma: Unfold ram_control_after_iterate()
>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>    migration/rdma: Unfold hook_ram_load()
>    migration/rdma: Create rdma_control_save_page()
>    qemu-file: Remove QEMUFileHooks
>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>    migration/rdma: Remove qemu_ prefix from exported functions
>    migration/rdma: Check sooner if we are in postcopy for save_page()
> 
> Vladimir Sementsov-Ogievskiy (5):
>    runstate: add runstate_get()
>    migration: never fail in global_state_store()
>    runstate: drop unused runstate_store()
>    migration: switch from .vm_was_running to .vm_old_state
>    migration: restore vmstate on migration failure

Appears to introduce multiple avocado failures:

https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286

Test summary:
tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1

https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387

Test summary:
tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1

Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test

../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is not an instance of type qio-channel-rdma
qemu-system-aarch64: Not a migration stream
qemu-system-aarch64: load of migration failed: Invalid argument
Broken pipe


r~
Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months ago
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> into staging (2023-05-29 14:31:52 -0700)
>> are available in the Git repository at:
>>    https://gitlab.com/juan.quintela/qemu.git
>> tags/migration-20230530-pull-request
>> for you to fetch changes up to
>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>    migration/rdma: Check sooner if we are in postcopy for
>> save_page() (2023-05-30 19:23:50 +0200)
>> ----------------------------------------------------------------

Added Markus and Daniel.

>> Migration 20230530 Pull request (take 2)
>> Hi
>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> not configured in.
>> [take 1]
>> On this PULL request:
>> - Set vmstate migration failure right (vladimir)
>> - Migration QEMUFileHook removal (juan)
>> - Migration Atomic counters (juan)
>> Please apply.
>> ----------------------------------------------------------------
>> Juan Quintela (16):
>>    migration: Don't abuse qemu_file transferred for RDMA
>>    migration/RDMA: It is accounting for zero/normal pages in two places
>>    migration/rdma: Remove QEMUFile parameter when not used
>>    migration/rdma: Don't use imaginary transfers
>>    migration: Remove unused qemu_file_credit_transfer()
>>    migration/rdma: Simplify the function that saves a page
>>    migration: Create migrate_rdma()
>>    migration/rdma: Unfold ram_control_before_iterate()
>>    migration/rdma: Unfold ram_control_after_iterate()
>>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>    migration/rdma: Unfold hook_ram_load()
>>    migration/rdma: Create rdma_control_save_page()
>>    qemu-file: Remove QEMUFileHooks
>>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>    migration/rdma: Remove qemu_ prefix from exported functions
>>    migration/rdma: Check sooner if we are in postcopy for save_page()
>> Vladimir Sementsov-Ogievskiy (5):
>>    runstate: add runstate_get()
>>    migration: never fail in global_state_store()
>>    runstate: drop unused runstate_store()
>>    migration: switch from .vm_was_running to .vm_old_state
>>    migration: restore vmstate on migration failure
>
> Appears to introduce multiple avocado failures:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> not an instance of type qio-channel-rdma

I am looking at the other errors, but this one is weird.  It is failing
here:

#define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)

In the OBJECT_DECLARE_SIMPLE_TYPE line.

I have no clue what problem we are having here with the object system,
that it decides at declaration time that a variable is not of the type
that we are declaring.

Am I missing something obvious here?

Later, Juan.



> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument


> Broken pipe
>
>
> r~
Re: [PULL 00/21] Migration 20230530 patches
Posted by Daniel P. Berrangé 11 months ago
On Wed, May 31, 2023 at 11:03:23PM +0200, Juan Quintela wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
> > On 5/30/23 11:25, Juan Quintela wrote:
> >> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
> >>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
> >> into staging (2023-05-29 14:31:52 -0700)
> >> are available in the Git repository at:
> >>    https://gitlab.com/juan.quintela/qemu.git
> >> tags/migration-20230530-pull-request
> >> for you to fetch changes up to
> >> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
> >>    migration/rdma: Check sooner if we are in postcopy for
> >> save_page() (2023-05-30 19:23:50 +0200)
> >> ----------------------------------------------------------------
> 
> Added Markus and Daniel.
> 
> >> Migration 20230530 Pull request (take 2)
> >> Hi
> >> Resend last PULL request, this time it compiles when CONFIG_RDMA is
> >> not configured in.
> >> [take 1]
> >> On this PULL request:
> >> - Set vmstate migration failure right (vladimir)
> >> - Migration QEMUFileHook removal (juan)
> >> - Migration Atomic counters (juan)
> >> Please apply.
> >> ----------------------------------------------------------------
> >> Juan Quintela (16):
> >>    migration: Don't abuse qemu_file transferred for RDMA
> >>    migration/RDMA: It is accounting for zero/normal pages in two places
> >>    migration/rdma: Remove QEMUFile parameter when not used
> >>    migration/rdma: Don't use imaginary transfers
> >>    migration: Remove unused qemu_file_credit_transfer()
> >>    migration/rdma: Simplify the function that saves a page
> >>    migration: Create migrate_rdma()
> >>    migration/rdma: Unfold ram_control_before_iterate()
> >>    migration/rdma: Unfold ram_control_after_iterate()
> >>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
> >>    migration/rdma: Unfold hook_ram_load()
> >>    migration/rdma: Create rdma_control_save_page()
> >>    qemu-file: Remove QEMUFileHooks
> >>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
> >>    migration/rdma: Remove qemu_ prefix from exported functions
> >>    migration/rdma: Check sooner if we are in postcopy for save_page()
> >> Vladimir Sementsov-Ogievskiy (5):
> >>    runstate: add runstate_get()
> >>    migration: never fail in global_state_store()
> >>    runstate: drop unused runstate_store()
> >>    migration: switch from .vm_was_running to .vm_old_state
> >>    migration: restore vmstate on migration failure
> >
> > Appears to introduce multiple avocado failures:
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
> >
> > Test summary:
> > tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
> > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
> >
> > Test summary:
> > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
> >
> > Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
> >
> > ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> > not an instance of type qio-channel-rdma
> 
> I am looking at the other errors, but this one is weird.  It is failing
> here:
> 
> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
> 
> In the OBJECT line.
> 
> I have no clue what problem are we having here with the object system to
> decide at declaration time that a variable is not of the type that we
> are declaring.
> 
> I am missing something obvious here?

I expect that somewhere the code has either corrupted memory or is
using freed memory. Either way, you'll need to get a stack trace
to debug this kind of thing.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PULL 00/21] Migration 20230530 patches
Posted by Daniel P. Berrangé 11 months ago
On Thu, Jun 01, 2023 at 09:27:09AM +0100, Daniel P. Berrangé wrote:
> On Wed, May 31, 2023 at 11:03:23PM +0200, Juan Quintela wrote:
> > Richard Henderson <richard.henderson@linaro.org> wrote:
> > > On 5/30/23 11:25, Juan Quintela wrote:
> > >> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
> > >>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
> > >> into staging (2023-05-29 14:31:52 -0700)
> > >> are available in the Git repository at:
> > >>    https://gitlab.com/juan.quintela/qemu.git
> > >> tags/migration-20230530-pull-request
> > >> for you to fetch changes up to
> > >> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
> > >>    migration/rdma: Check sooner if we are in postcopy for
> > >> save_page() (2023-05-30 19:23:50 +0200)
> > >> ----------------------------------------------------------------
> > 
> > Added Markus and Daniel.
> > 
> > >> Migration 20230530 Pull request (take 2)
> > >> Hi
> > >> Resend last PULL request, this time it compiles when CONFIG_RDMA is
> > >> not configured in.
> > >> [take 1]
> > >> On this PULL request:
> > >> - Set vmstate migration failure right (vladimir)
> > >> - Migration QEMUFileHook removal (juan)
> > >> - Migration Atomic counters (juan)
> > >> Please apply.
> > >> ----------------------------------------------------------------
> > >> Juan Quintela (16):
> > >>    migration: Don't abuse qemu_file transferred for RDMA
> > >>    migration/RDMA: It is accounting for zero/normal pages in two places
> > >>    migration/rdma: Remove QEMUFile parameter when not used
> > >>    migration/rdma: Don't use imaginary transfers
> > >>    migration: Remove unused qemu_file_credit_transfer()
> > >>    migration/rdma: Simplify the function that saves a page
> > >>    migration: Create migrate_rdma()
> > >>    migration/rdma: Unfold ram_control_before_iterate()
> > >>    migration/rdma: Unfold ram_control_after_iterate()
> > >>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
> > >>    migration/rdma: Unfold hook_ram_load()
> > >>    migration/rdma: Create rdma_control_save_page()
> > >>    qemu-file: Remove QEMUFileHooks
> > >>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
> > >>    migration/rdma: Remove qemu_ prefix from exported functions
> > >>    migration/rdma: Check sooner if we are in postcopy for save_page()
> > >> Vladimir Sementsov-Ogievskiy (5):
> > >>    runstate: add runstate_get()
> > >>    migration: never fail in global_state_store()
> > >>    runstate: drop unused runstate_store()
> > >>    migration: switch from .vm_was_running to .vm_old_state
> > >>    migration: restore vmstate on migration failure
> > >
> > > Appears to introduce multiple avocado failures:
> > >
> > > https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
> > >
> > > Test summary:
> > > tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
> > > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> > > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> > > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
> > >
> > > https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
> > >
> > > Test summary:
> > > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> > > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> > > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
> > >
> > > Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
> > >
> > > ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> > > not an instance of type qio-channel-rdma
> > 
> > I am looking at the other errors, but this one is weird.  It is failing
> > here:
> > 
> > #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> > OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
> > 
> > In the OBJECT line.
> > 
> > I have no clue what problem are we having here with the object system to
> > decide at declaration time that a variable is not of the type that we
> > are declaring.
> > 
> > I am missing something obvious here?
> 
> I expect somewhere in the code has either corrupted memory, or is
> using free'd memory. Either way you'll need to get a stack trace
> to debug this kind of thing

I've replied to the patches pointing out 4 places where the code
casts to QIOChannelRDMA without first checking that this is an
RDMA migration, which looks likely to be the cause of this.
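
For illustration only, here is a minimal sketch of the pattern described
above: check that the channel really is an RDMA channel before doing the
checked downcast. This is not QEMU code. Channel, ChannelRDMA,
channel_is_rdma(), CHANNEL_RDMA() and rdma_before_iterate() are all
hypothetical stand-ins, and the checked cast is modelled with a plain
assert() rather than the real QOM machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the QEMU/QOM definitions. */
typedef struct Channel {
    const char *type_name;      /* runtime type tag */
} Channel;

typedef struct ChannelRDMA {
    Channel parent;             /* first member, so the downcast is valid */
    int hooks_run;
} ChannelRDMA;

#define TYPE_CHANNEL_RDMA "channel-rdma"

/* True only when the channel really is an RDMA channel. */
static bool channel_is_rdma(const Channel *c)
{
    return c != NULL && strcmp(c->type_name, TYPE_CHANNEL_RDMA) == 0;
}

/* Checked downcast: aborts when handed a channel of the wrong type. */
static ChannelRDMA *CHANNEL_RDMA(Channel *c)
{
    assert(channel_is_rdma(c));
    return (ChannelRDMA *)c;
}

/*
 * An RDMA-only hook that is now called unconditionally. It must return
 * early for tcp/unix/exec migrations instead of casting first.
 * Returns 0 for "nothing to do" and 1 for "handled".
 */
static int rdma_before_iterate(Channel *c)
{
    if (!channel_is_rdma(c)) {
        return 0;               /* not RDMA: never reach the cast */
    }
    ChannelRDMA *r = CHANNEL_RDMA(c);
    r->hooks_run++;
    return 1;
}
```

Calling rdma_before_iterate() on a channel whose type_name is anything
other than "channel-rdma" returns 0 without touching the cast. Without
the early return, the same call would trip the assertion in
CHANNEL_RDMA().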


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months ago
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Jun 01, 2023 at 09:27:09AM +0100, Daniel P. Berrangé wrote:
>> On Wed, May 31, 2023 at 11:03:23PM +0200, Juan Quintela wrote:
>> > Richard Henderson <richard.henderson@linaro.org> wrote:
>> > > On 5/30/23 11:25, Juan Quintela wrote:
>> > >> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>> > >>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> > >> into staging (2023-05-29 14:31:52 -0700)
>> > >> are available in the Git repository at:
>> > >>    https://gitlab.com/juan.quintela/qemu.git
>> > >> tags/migration-20230530-pull-request
>> > >> for you to fetch changes up to
>> > >> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>> > >>    migration/rdma: Check sooner if we are in postcopy for
>> > >> save_page() (2023-05-30 19:23:50 +0200)
>> > >> ----------------------------------------------------------------
>> > 
>> > Added Markus and Daniel.
>> > 
>> > >> Migration 20230530 Pull request (take 2)
>> > >> Hi
>> > >> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> > >> not configured in.
>> > >> [take 1]
>> > >> On this PULL request:
>> > >> - Set vmstate migration failure right (vladimir)
>> > >> - Migration QEMUFileHook removal (juan)
>> > >> - Migration Atomic counters (juan)
>> > >> Please apply.
>> > >> ----------------------------------------------------------------
>> > >> Juan Quintela (16):
>> > >>    migration: Don't abuse qemu_file transferred for RDMA
>> > >>    migration/RDMA: It is accounting for zero/normal pages in two places
>> > >>    migration/rdma: Remove QEMUFile parameter when not used
>> > >>    migration/rdma: Don't use imaginary transfers
>> > >>    migration: Remove unused qemu_file_credit_transfer()
>> > >>    migration/rdma: Simplify the function that saves a page
>> > >>    migration: Create migrate_rdma()
>> > >>    migration/rdma: Unfold ram_control_before_iterate()
>> > >>    migration/rdma: Unfold ram_control_after_iterate()
>> > >>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>> > >>    migration/rdma: Unfold hook_ram_load()
>> > >>    migration/rdma: Create rdma_control_save_page()
>> > >>    qemu-file: Remove QEMUFileHooks
>> > >>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>> > >>    migration/rdma: Remove qemu_ prefix from exported functions
>> > >>    migration/rdma: Check sooner if we are in postcopy for save_page()
>> > >> Vladimir Sementsov-Ogievskiy (5):
>> > >>    runstate: add runstate_get()
>> > >>    migration: never fail in global_state_store()
>> > >>    runstate: drop unused runstate_store()
>> > >>    migration: switch from .vm_was_running to .vm_old_state
>> > >>    migration: restore vmstate on migration failure
>> > >
>> > > Appears to introduce multiple avocado failures:
>> > >
>> > > https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>> > >
>> > > Test summary:
>> > > tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
>> > > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> > > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> > > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>> > >
>> > > https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>> > >
>> > > Test summary:
>> > > tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> > > tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> > > make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>> > >
>> > > Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>> > >
>> > > ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>> > > not an instance of type qio-channel-rdma
>> > 
>> > I am looking at the other errors, but this one is weird.  It is failing
>> > here:
>> > 
>> > #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
>> > OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
>> > 
>> > In the OBJECT line.
>> > 
>> > I have no clue what problem are we having here with the object system to
>> > decide at declaration time that a variable is not of the type that we
>> > are declaring.
>> > 
>> > I am missing something obvious here?
>> 
>> I expect somewhere in the code has either corrupted memory, or is
>> using free'd memory. Either way you'll need to get a stack trace
>> to debug this kind of thing
>
> I've replied to the patches pointing out 4 places where the code
> casts to QIOChannelRDMA, without first checking that this is an
> RDMA migration, which look likely to be the cause of this.

Good catch.

I can only say: Ouch.

And I wonder why it didn't fail for me.  It passes for me:
- make check (with every target/device/... that can be compiled on
  Fedora 38)

- I ran migration-test hundreds of times during development, and it
  never failed like that

I am switching to aarch64 tcg as my main test target, because it appears
to find way more bugs in migration-test.

Thanks again.

Later, Juan.
Re: [PULL 00/21] Migration 20230530 patches
Posted by Richard Henderson 11 months ago
On 5/31/23 14:03, Juan Quintela wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
>> On 5/30/23 11:25, Juan Quintela wrote:
>>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>>     Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>>> into staging (2023-05-29 14:31:52 -0700)
>>> are available in the Git repository at:
>>>     https://gitlab.com/juan.quintela/qemu.git
>>> tags/migration-20230530-pull-request
>>> for you to fetch changes up to
>>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>>     migration/rdma: Check sooner if we are in postcopy for
>>> save_page() (2023-05-30 19:23:50 +0200)
>>> ----------------------------------------------------------------
> 
> Added Markus and Daniel.
> 
>>> Migration 20230530 Pull request (take 2)
>>> Hi
>>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>>> not configured in.
>>> [take 1]
>>> On this PULL request:
>>> - Set vmstate migration failure right (vladimir)
>>> - Migration QEMUFileHook removal (juan)
>>> - Migration Atomic counters (juan)
>>> Please apply.
>>> ----------------------------------------------------------------
>>> Juan Quintela (16):
>>>     migration: Don't abuse qemu_file transferred for RDMA
>>>     migration/RDMA: It is accounting for zero/normal pages in two places
>>>     migration/rdma: Remove QEMUFile parameter when not used
>>>     migration/rdma: Don't use imaginary transfers
>>>     migration: Remove unused qemu_file_credit_transfer()
>>>     migration/rdma: Simplify the function that saves a page
>>>     migration: Create migrate_rdma()
>>>     migration/rdma: Unfold ram_control_before_iterate()
>>>     migration/rdma: Unfold ram_control_after_iterate()
>>>     migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>>     migration/rdma: Unfold hook_ram_load()
>>>     migration/rdma: Create rdma_control_save_page()
>>>     qemu-file: Remove QEMUFileHooks
>>>     migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>>     migration/rdma: Remove qemu_ prefix from exported functions
>>>     migration/rdma: Check sooner if we are in postcopy for save_page()
>>> Vladimir Sementsov-Ogievskiy (5):
>>>     runstate: add runstate_get()
>>>     migration: never fail in global_state_store()
>>>     runstate: drop unused runstate_store()
>>>     migration: switch from .vm_was_running to .vm_old_state
>>>     migration: restore vmstate on migration failure
>>
>> Appears to introduce multiple avocado failures:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>>
>> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>> not an instance of type qio-channel-rdma
> 
> I am looking at the other errors, but this one is weird.  It is failing
> here:
> 
> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
> 
> In the OBJECT line.
> 
> I have no clue what problem are we having here with the object system to
> decide at declaration time that a variable is not of the type that we
> are declaring.
> 
> I am missing something obvious here?

This is where the inline function is declared, but you need to look at the backtrace, 
where you have applied QIO_CHANNEL_RDMA to an object that is *not* of that type.
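
A small sketch of why the message names the declaration line rather than
the call site. It assumes, as in QOM, that the declaring macro expands to
an inline function that passes __FILE__, __LINE__ and __func__ to the
checker. Everything here (Obj, cast_or_report(), DECLARE_CHECKED_CAST,
RDMA_OBJ) is a hypothetical stand-in, and the checker returns NULL
instead of aborting so that the behaviour can be observed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a QOM-style object header. */
typedef struct Obj {
    const char *type_name;
} Obj;

static int last_report_line;    /* line recorded by the last failed cast */

/* Reports a type mismatch using the location handed in by the caller. */
static void *cast_or_report(Obj *o, const char *want,
                            const char *file, int line, const char *func)
{
    assert(o != NULL && want != NULL);
    if (strcmp(o->type_name, want) != 0) {
        last_report_line = line;
        fprintf(stderr, "%s:%d:%s: Object %p is not an instance of type %s\n",
                file, line, func, (void *)o, want);
        return NULL;            /* a real checked cast would abort() here */
    }
    return o;
}

/*
 * The macro defines the cast function. __LINE__ and __func__ are expanded
 * inside that function, so they name the line where the macro is used and
 * the cast function itself, not whoever calls the cast later.
 */
#define DECLARE_CHECKED_CAST(Type, NAME, type_str)                      \
    static inline Type *NAME(Obj *o)                                    \
    {                                                                   \
        return (Type *)cast_or_report(o, type_str,                      \
                                      __FILE__, __LINE__, __func__);    \
    }

typedef struct RdmaObj {
    Obj parent;
} RdmaObj;

enum { DECL_LINE = __LINE__ + 1 };  /* line number of the declaration below */
DECLARE_CHECKED_CAST(RdmaObj, RDMA_OBJ, "rdma-obj")
```

Any failed RDMA_OBJ() call, wherever it is made, records DECL_LINE and
prints RDMA_OBJ as the function name. That has the same shape as the
rdma.c:408:QIO_CHANNEL_RDMA prefix in the log, which is why only a
backtrace shows the real caller.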


r~
Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months ago
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/31/23 14:03, Juan Quintela wrote:
>> Richard Henderson <richard.henderson@linaro.org> wrote:
>>> On 5/30/23 11:25, Juan Quintela wrote:
>>>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>>>     Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>>>> into staging (2023-05-29 14:31:52 -0700)
>>>> are available in the Git repository at:
>>>>     https://gitlab.com/juan.quintela/qemu.git
>>>> tags/migration-20230530-pull-request
>>>> for you to fetch changes up to
>>>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>>>     migration/rdma: Check sooner if we are in postcopy for
>>>> save_page() (2023-05-30 19:23:50 +0200)
>>>> ----------------------------------------------------------------
>> Added Markus and Daniel.
>> 
>>>> Migration 20230530 Pull request (take 2)
>>>> Hi
>>>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>>>> not configured in.
>>>> [take 1]
>>>> On this PULL request:
>>>> - Set vmstate migration failure right (vladimir)
>>>> - Migration QEMUFileHook removal (juan)
>>>> - Migration Atomic counters (juan)
>>>> Please apply.
>>>> ----------------------------------------------------------------
>>>> Juan Quintela (16):
>>>>     migration: Don't abuse qemu_file transferred for RDMA
>>>>     migration/RDMA: It is accounting for zero/normal pages in two places
>>>>     migration/rdma: Remove QEMUFile parameter when not used
>>>>     migration/rdma: Don't use imaginary transfers
>>>>     migration: Remove unused qemu_file_credit_transfer()
>>>>     migration/rdma: Simplify the function that saves a page
>>>>     migration: Create migrate_rdma()
>>>>     migration/rdma: Unfold ram_control_before_iterate()
>>>>     migration/rdma: Unfold ram_control_after_iterate()
>>>>     migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>>>     migration/rdma: Unfold hook_ram_load()
>>>>     migration/rdma: Create rdma_control_save_page()
>>>>     qemu-file: Remove QEMUFileHooks
>>>>     migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>>>     migration/rdma: Remove qemu_ prefix from exported functions
>>>>     migration/rdma: Check sooner if we are in postcopy for save_page()
>>>> Vladimir Sementsov-Ogievskiy (5):
>>>>     runstate: add runstate_get()
>>>>     migration: never fail in global_state_store()
>>>>     runstate: drop unused runstate_store()
>>>>     migration: switch from .vm_was_running to .vm_old_state
>>>>     migration: restore vmstate on migration failure
>>>
>>> Appears to introduce multiple avocado failures:
>>>
>>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>>>
>>> Test summary:
>>> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
>>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142:
>>> check-avocado] Error 1
>>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>>>
>>> Test summary:
>>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>>
>>> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>>>
>>> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>>> not an instance of type qio-channel-rdma
>> I am looking at the other errors, but this one is weird.  It is
>> failing
>> here:
>> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
>> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
>> In the OBJECT line.
>> I have no clue what problem are we having here with the object
>> system to
>> decide at declaration time that a variable is not of the type that we
>> are declaring.
>> I am missing something obvious here?
>
> This is where the inline function is declared, but you need to look at
> the backtrace, where you have applied QIO_CHANNEL_RDMA to an object
> that is *not* of that type.

Where is the stack trace?

Are you running aarch64 natively?  Here running qemu-system-aarch64 on
x86_64 works for me.  Neither avocado test nor migration-test fails with
my changes.

Cleber found the reason why I was having trouble running avocado
locally.  It appears that something changed and there are now several
tests that can't be run in parallel.  The (temporary) solution is to run
it serially:

make -j1 check-avocado

until they sort out which tests can and can't run in parallel.

Later, Juan.
Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months, 1 week ago
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> into staging (2023-05-29 14:31:52 -0700)
>> are available in the Git repository at:
>>    https://gitlab.com/juan.quintela/qemu.git
>> tags/migration-20230530-pull-request
>> for you to fetch changes up to
>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>    migration/rdma: Check sooner if we are in postcopy for
>> save_page() (2023-05-30 19:23:50 +0200)
>> ----------------------------------------------------------------
>> Migration 20230530 Pull request (take 2)
>> Hi
>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> not configured in.
>> [take 1]
>> On this PULL request:
>> - Set vmstate migration failure right (vladimir)
>> - Migration QEMUFileHook removal (juan)
>> - Migration Atomic counters (juan)
>> Please apply.
>> ----------------------------------------------------------------
>> Juan Quintela (16):
>>    migration: Don't abuse qemu_file transferred for RDMA
>>    migration/RDMA: It is accounting for zero/normal pages in two places
>>    migration/rdma: Remove QEMUFile parameter when not used
>>    migration/rdma: Don't use imaginary transfers
>>    migration: Remove unused qemu_file_credit_transfer()
>>    migration/rdma: Simplify the function that saves a page
>>    migration: Create migrate_rdma()
>>    migration/rdma: Unfold ram_control_before_iterate()
>>    migration/rdma: Unfold ram_control_after_iterate()
>>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>    migration/rdma: Unfold hook_ram_load()
>>    migration/rdma: Create rdma_control_save_page()
>>    qemu-file: Remove QEMUFileHooks
>>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>    migration/rdma: Remove qemu_ prefix from exported functions
>>    migration/rdma: Check sooner if we are in postcopy for save_page()
>> Vladimir Sementsov-Ogievskiy (5):
>>    runstate: add runstate_get()
>>    migration: never fail in global_state_store()
>>    runstate: drop unused runstate_store()
>>    migration: switch from .vm_was_running to .vm_old_state
>>    migration: restore vmstate on migration failure
>
> Appears to introduce multiple avocado failures:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> not an instance of type qio-channel-rdma
> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument
> Broken pipe

And, as it couldn't be any other way, on my machine (with upstream qemu,
i.e. without any of my changes):

$ make check-avocado (check-acceptance fails in the same test)

...

STARTED
 (063/243) tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR: ConnectError: Failed to establish session: EOFError\n	Exit code: 1\n	Command: ./qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,fd=9 -mon chardev=mon,mode=control -machine q35 -chardev socket,id=console,path=/var/tmp/qemu_yv2aqehm/7f8b6cf5... (2.52 s)
Interrupting job (failfast).
RESULTS    : PASS 42 | ERROR 1 | FAIL 0 | SKIP 199 | WARN 0 | INTERRUPT 0 | CANCEL 1
JOB TIME   : 71.66 s

Test summary:
tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR
make: *** [/mnt/code/qemu/full/tests/Makefile.include:142: check-avocado] Error 9


The good news: some tests passed (42 according to the summary above).
The bad news: it is failing on the kvm xen guest test and I have no clue
              why or where.

Is there some documentation that can help me here?

Thanks, Juan.
Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months ago
Juan Quintela <quintela@redhat.com> wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
>> On 5/30/23 11:25, Juan Quintela wrote:
>>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>>> into staging (2023-05-29 14:31:52 -0700)
>>> are available in the Git repository at:
>>>    https://gitlab.com/juan.quintela/qemu.git
>>> tags/migration-20230530-pull-request
>>> for you to fetch changes up to
>>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>>    migration/rdma: Check sooner if we are in postcopy for
>>> save_page() (2023-05-30 19:23:50 +0200)
>>> ----------------------------------------------------------------
>>> Migration 20230530 Pull request (take 2)
>>> Hi
>>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>>> not configured in.
>>> [take 1]
>>> On this PULL request:
>>> - Set vmstate migration failure right (vladimir)
>>> - Migration QEMUFileHook removal (juan)
>>> - Migration Atomic counters (juan)
>>> Please apply.
>>> ----------------------------------------------------------------
>>> Juan Quintela (16):
>>>    migration: Don't abuse qemu_file transferred for RDMA
>>>    migration/RDMA: It is accounting for zero/normal pages in two places
>>>    migration/rdma: Remove QEMUFile parameter when not used
>>>    migration/rdma: Don't use imaginary transfers
>>>    migration: Remove unused qemu_file_credit_transfer()
>>>    migration/rdma: Simplify the function that saves a page
>>>    migration: Create migrate_rdma()
>>>    migration/rdma: Unfold ram_control_before_iterate()
>>>    migration/rdma: Unfold ram_control_after_iterate()
>>>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>>    migration/rdma: Unfold hook_ram_load()
>>>    migration/rdma: Create rdma_control_save_page()
>>>    qemu-file: Remove QEMUFileHooks
>>>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>>    migration/rdma: Remove qemu_ prefix from exported functions
>>>    migration/rdma: Check sooner if we are in postcopy for save_page()
>>> Vladimir Sementsov-Ogievskiy (5):
>>>    runstate: add runstate_get()
>>>    migration: never fail in global_state_store()
>>>    runstate: drop unused runstate_store()
>>>    migration: switch from .vm_was_running to .vm_old_state
>>>    migration: restore vmstate on migration failure
>>
>> Appears to introduce multiple avocado failures:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>>
>> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>> not an instance of type qio-channel-rdma
>> qemu-system-aarch64: Not a migration stream
>> qemu-system-aarch64: load of migration failed: Invalid argument

And now I am stuck here.  Neither migration-test nor avocado tests RDMA
migration, so clearly I broke something that doesn't show up in
migration-test but does show up in avocado.

Still trying to get avocado to behave on my local test machine.

Later, Juan.


Re: [PULL 00/21] Migration 20230530 patches
Posted by Juan Quintela 11 months, 1 week ago
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>    Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> into staging (2023-05-29 14:31:52 -0700)
>> are available in the Git repository at:
>>    https://gitlab.com/juan.quintela/qemu.git
>> tags/migration-20230530-pull-request
>> for you to fetch changes up to
>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>    migration/rdma: Check sooner if we are in postcopy for
>> save_page() (2023-05-30 19:23:50 +0200)
>> ----------------------------------------------------------------
>> Migration 20230530 Pull request (take 2)
>> Hi
>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> not configured in.
>> [take 1]
>> On this PULL request:
>> - Set vmstate migration failure right (vladimir)
>> - Migration QEMUFileHook removal (juan)
>> - Migration Atomic counters (juan)
>> Please apply.
>> ----------------------------------------------------------------
>> Juan Quintela (16):
>>    migration: Don't abuse qemu_file transferred for RDMA
>>    migration/RDMA: It is accounting for zero/normal pages in two places
>>    migration/rdma: Remove QEMUFile parameter when not used
>>    migration/rdma: Don't use imaginary transfers
>>    migration: Remove unused qemu_file_credit_transfer()
>>    migration/rdma: Simplify the function that saves a page
>>    migration: Create migrate_rdma()
>>    migration/rdma: Unfold ram_control_before_iterate()
>>    migration/rdma: Unfold ram_control_after_iterate()
>>    migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>    migration/rdma: Unfold hook_ram_load()
>>    migration/rdma: Create rdma_control_save_page()
>>    qemu-file: Remove QEMUFileHooks
>>    migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>    migration/rdma: Remove qemu_ prefix from exported functions
>>    migration/rdma: Check sooner if we are in postcopy for save_page()
>> Vladimir Sementsov-Ogievskiy (5):
>>    runstate: add runstate_get()
>>    migration: never fail in global_state_store()
>>    runstate: drop unused runstate_store()
>>    migration: switch from .vm_was_running to .vm_old_state
>>    migration: restore vmstate on migration failure

Self nack

> Appears to introduce multiple avocado failures:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR

No clue what is going on here.
I haven't touched exec migration (famous last words).

> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost:
> ERROR

This is tested by ./tests/qtest/migration-test

It passes for me with qemu-system-x86_64 (kvm), qemu-system-i386 (kvm)
and aarch64 (tcg).  That is what make check tests.

> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR

Dunno again.

> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is

This is something that changed in the series, so I will check with these
avocado tests.  No clue why it passes with x86_64 and not with aarch64.

> not an instance of type qio-channel-rdma

Here there was an error creating the RDMA channel somehow.  Somehow is
the important word in the previous sentence.

> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument
> Broken pipe

I will isolate the non-RDMA changes and try to pass the RDMA ones
through avocado.

I am sorry for the noise.

Later, Juan.