The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:

  Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu into staging (2023-05-29 14:31:52 -0700)

are available in the Git repository at:

  https://gitlab.com/juan.quintela/qemu.git tags/migration-20230530-pull-request

for you to fetch changes up to c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:

  migration/rdma: Check sooner if we are in postcopy for save_page() (2023-05-30 19:23:50 +0200)

----------------------------------------------------------------
Migration 20230530 Pull request (take 2)

Hi

Resend last PULL request, this time it compiles when CONFIG_RDMA is
not configured in.

[take 1]
On this PULL request:

- Set vmstate migration failure right (vladimir)
- Migration QEMUFileHook removal (juan)
- Migration Atomic counters (juan)

Please apply.

----------------------------------------------------------------

Juan Quintela (16):
  migration: Don't abuse qemu_file transferred for RDMA
  migration/RDMA: It is accounting for zero/normal pages in two places
  migration/rdma: Remove QEMUFile parameter when not used
  migration/rdma: Don't use imaginary transfers
  migration: Remove unused qemu_file_credit_transfer()
  migration/rdma: Simplify the function that saves a page
  migration: Create migrate_rdma()
  migration/rdma: Unfold ram_control_before_iterate()
  migration/rdma: Unfold ram_control_after_iterate()
  migration/rdma: Remove all uses of RAM_CONTROL_HOOK
  migration/rdma: Unfold hook_ram_load()
  migration/rdma: Create rdma_control_save_page()
  qemu-file: Remove QEMUFileHooks
  migration/rdma: Move rdma constants from qemu-file.h to rdma.h
  migration/rdma: Remove qemu_ prefix from exported functions
  migration/rdma: Check sooner if we are in postcopy for save_page()

Vladimir Sementsov-Ogievskiy (5):
  runstate: add runstate_get()
  migration: never fail in global_state_store()
  runstate: drop unused runstate_store()
  migration: switch from .vm_was_running to .vm_old_state
  migration: restore vmstate on migration failure

 include/migration/global_state.h |   2 +-
 include/sysemu/runstate.h        |   2 +-
 migration/migration-stats.h      |   4 +
 migration/migration.h            |  12 +-
 migration/options.h              |   1 +
 migration/qemu-file.h            |  59 ----------
 migration/rdma.h                 |  42 +++++++
 migration/global_state.c         |  29 +++--
 migration/migration-stats.c      |   5 +-
 migration/migration.c            |  55 +++++----
 migration/options.c              |   7 ++
 migration/qemu-file.c            |  69 +-----------
 migration/ram.c                  |  66 ++++++-----
 migration/rdma.c                 | 185 +++++++++++++++----------------
 migration/savevm.c               |   6 +-
 softmmu/runstate.c               |  25 ++---
 migration/trace-events           |  30 ++---
 17 files changed, 268 insertions(+), 331 deletions(-)

-- 
2.40.1

On 5/30/23 11:25, Juan Quintela wrote:
> Migration 20230530 Pull request (take 2)
>
> Hi
>
> Resend last PULL request, this time it compiles when CONFIG_RDMA is
> not configured in.

Appears to introduce multiple avocado failures:

https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286

Test summary:
tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1

https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387

Test summary:
tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1

Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test

../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is not an instance of type qio-channel-rdma
qemu-system-aarch64: Not a migration stream
qemu-system-aarch64: load of migration failed: Invalid argument
Broken pipe


r~

Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> Migration 20230530 Pull request (take 2)

Added Markus and Daniel.

> Appears to introduce multiple avocado failures:
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> not an instance of type qio-channel-rdma

I am looking at the other errors, but this one is weird.  It is failing
here:

#define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)

In the OBJECT line.

I have no clue what problem we are having here with the object system,
for it to decide at declaration time that a variable is not of the type
that we are declaring.

Am I missing something obvious here?

Later, Juan.

> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument
> Broken pipe
>
>
> r~

On Wed, May 31, 2023 at 11:03:23PM +0200, Juan Quintela wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
> > Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
> >
> > ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> > not an instance of type qio-channel-rdma
>
> I am looking at the other errors, but this one is weird.  It is failing
> here:
>
> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
>
> In the OBJECT line.
>
> I have no clue what problem we are having here with the object system,
> for it to decide at declaration time that a variable is not of the type
> that we are declaring.
>
> Am I missing something obvious here?

I expect that something in the code has either corrupted memory or is
using freed memory. Either way, you'll need to get a stack trace
to debug this kind of thing.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

On Thu, Jun 01, 2023 at 09:27:09AM +0100, Daniel P. Berrangé wrote:
> On Wed, May 31, 2023 at 11:03:23PM +0200, Juan Quintela wrote:
> > Richard Henderson <richard.henderson@linaro.org> wrote:
> > > ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> > > not an instance of type qio-channel-rdma
> >
> > I am looking at the other errors, but this one is weird.  It is failing
> > here:
> >
> > #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> > OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
> >
> > In the OBJECT line.
>
> I expect that something in the code has either corrupted memory or is
> using freed memory. Either way, you'll need to get a stack trace
> to debug this kind of thing.

I've replied to the patches pointing out 4 places where the code
casts to QIOChannelRDMA, without first checking that this is an
RDMA migration, which look likely to be the cause of this.

With regards,
Daniel

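To illustrate the point about casting without first checking the transport, here is a minimal, self-contained sketch in C. It is not the code from the patches: `DemoChannel`, `DemoChannelRDMA`, `demo_is_rdma()`, `demo_rdma_save_page()` and `DEMO_NOT_RDMA` are all hypothetical names invented for this example, and the type check is a plain string comparison rather than QEMU's object model. It only shows the shape of the fix: verify that the channel really is an RDMA channel, and bail out early if it is not, before converting the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical base channel: only carries the name of its concrete type. */
typedef struct DemoChannel {
    const char *type_name;
} DemoChannel;

/* Hypothetical RDMA channel; the base struct is deliberately the first member. */
typedef struct DemoChannelRDMA {
    DemoChannel parent;
    int pages_sent;
} DemoChannelRDMA;

/* Hypothetical return code meaning "this channel is not RDMA, use the normal path". */
enum { DEMO_NOT_RDMA = -1 };

/* Assumption: the transport can be identified from the channel's type name. */
static bool demo_is_rdma(const DemoChannel *ch)
{
    return strcmp(ch->type_name, "demo-channel-rdma") == 0;
}

static int demo_rdma_save_page(DemoChannel *ch)
{
    assert(ch != NULL);

    /* Check the transport first: a TCP or unix channel must never be cast. */
    if (!demo_is_rdma(ch)) {
        return DEMO_NOT_RDMA;
    }

    /* Safe here: the type was verified and parent is the first member. */
    DemoChannelRDMA *rdma = (DemoChannelRDMA *)ch;
    rdma->pages_sent++;
    return 0;
}
```

With this ordering, a non-RDMA channel gets `DEMO_NOT_RDMA` back and the caller can fall through to its ordinary save path, instead of tripping a type check on an object that was never an RDMA channel.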
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Jun 01, 2023 at 09:27:09AM +0100, Daniel P. Berrangé wrote:
>> I expect that something in the code has either corrupted memory or is
>> using freed memory. Either way, you'll need to get a stack trace
>> to debug this kind of thing.
>
> I've replied to the patches pointing out 4 places where the code
> casts to QIOChannelRDMA, without first checking that this is an
> RDMA migration, which look likely to be the cause of this.

Good catch.

I can only say: Ouch.

And why didn't it fail for me?

It passes for me:

- make check (compiled every target/device/... that can be compiled on
  Fedora 38)
- I ran migration-test hundreds of times during development, and it
  never failed like that
- I am switching to testing aarch64 TCG as my main target, because it
  appears to find way more bugs in migration-test.

Thanks again.

Later, Juan.

On 5/31/23 14:03, Juan Quintela wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
>> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>> not an instance of type qio-channel-rdma
>
> I am looking at the other errors, but this one is weird.  It is failing
> here:
>
> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
>
> In the OBJECT line.
>
> I have no clue what problem we are having here with the object system,
> for it to decide at declaration time that a variable is not of the type
> that we are declaring.
>
> Am I missing something obvious here?

This is where the inline function is declared, but you need to look at the backtrace, where
you have applied QIO_CHANNEL_RDMA to an object that is *not* of that type.


r~

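A rough sketch of why the message names the declaration line rather than the faulty caller, under stated assumptions: the names below (`Obj`, `checked_cast()`, `DECLARE_CAST`, `DEMO_CHANNEL_RDMA`, `last_cast_error`) are hypothetical and are not QEMU's actual macros. The only assumption borrowed from the thread is that the declaring macro generates an inline cast function. Because `__FILE__`, `__LINE__` and `__func__` are expanded inside that generated function, a failed runtime check reports where the function was declared, and only a backtrace shows who called it with the wrong object. In this sketch the failure is recorded in a buffer instead of terminating the process, so that it can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an object: it only carries a type name. */
typedef struct Obj {
    const char *type_name;
} Obj;

/* Last failure message, kept so that a caller can inspect it. */
static char last_cast_error[256];

/* Runtime check shared by every generated cast function. */
static inline const Obj *checked_cast(const Obj *obj, const char *type_name,
                                      const char *file, int line,
                                      const char *func)
{
    assert(obj != NULL && type_name != NULL);
    if (strcmp(obj->type_name, type_name) != 0) {
        /* Record the failure instead of aborting, to keep the sketch testable. */
        snprintf(last_cast_error, sizeof(last_cast_error),
                 "%s:%d:%s: Object %p is not an instance of type %s",
                 file, line, func, (const void *)obj, type_name);
        return NULL;
    }
    return obj;
}

/*
 * Generates an inline cast function. The location macros expand here, inside
 * the generated function, so every failure reports the line of this
 * declaration and the name of the cast, not the line of the bad call.
 */
#define DECLARE_CAST(NAME, TYPE_NAME)                                      \
    static inline const Obj *NAME(const Obj *obj)                          \
    {                                                                      \
        return checked_cast(obj, TYPE_NAME, __FILE__, __LINE__, __func__); \
    }

DECLARE_CAST(DEMO_CHANNEL_RDMA, "demo-channel-rdma")
```

Calling `DEMO_CHANNEL_RDMA()` on an object whose type name is something else leaves a message in `last_cast_error` whose function field is `DEMO_CHANNEL_RDMA` and whose line is that of the `DECLARE_CAST` line, which is the same shape as the `rdma.c:408:QIO_CHANNEL_RDMA:` prefix in the report above.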
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/31/23 14:03, Juan Quintela wrote:
>> I am looking at the other errors, but this one is weird.  It is
>> failing here:
>> #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
>> OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelRDMA, QIO_CHANNEL_RDMA)
>> In the OBJECT line.
>
> This is where the inline function is declared, but you need to look at
> the backtrace, where you have applied QIO_CHANNEL_RDMA to an object
> that is *not* of that type.

Where is the stack trace?

Are you running aarch64 natively?

Here, running qemu-system-aarch64 on x86_64 works for me. Neither the
avocado tests nor migration-test fail with my changes.

Cleber found the reason why I was having trouble running avocado
locally. It appears that some change happened and there are several
tests that can't be run in parallel. The (temporary) solution is to run
it as:

make -j1 check-avocado

until they sort out which tests are/aren't able to run in parallel.

Later, Juan.

Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>> Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> into staging (2023-05-29 14:31:52 -0700)
>> are available in the Git repository at:
>> https://gitlab.com/juan.quintela/qemu.git
>> tags/migration-20230530-pull-request
>> for you to fetch changes up to
>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>> migration/rdma: Check sooner if we are in postcopy for
>> save_page() (2023-05-30 19:23:50 +0200)
>> ----------------------------------------------------------------
>> Migration 20230530 Pull request (take 2)
>> Hi
>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> not configured in.
>> [take 1]
>> On this PULL request:
>> - Set vmstate migration failure right (vladimir)
>> - Migration QEMUFileHook removal (juan)
>> - Migration Atomic counters (juan)
>> Please apply.
>> ----------------------------------------------------------------
>> Juan Quintela (16):
>>       migration: Don't abuse qemu_file transferred for RDMA
>>       migration/RDMA: It is accounting for zero/normal pages in two places
>>       migration/rdma: Remove QEMUFile parameter when not used
>>       migration/rdma: Don't use imaginary transfers
>>       migration: Remove unused qemu_file_credit_transfer()
>>       migration/rdma: Simplify the function that saves a page
>>       migration: Create migrate_rdma()
>>       migration/rdma: Unfold ram_control_before_iterate()
>>       migration/rdma: Unfold ram_control_after_iterate()
>>       migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>       migration/rdma: Unfold hook_ram_load()
>>       migration/rdma: Create rdma_control_save_page()
>>       qemu-file: Remove QEMUFileHooks
>>       migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>       migration/rdma: Remove qemu_ prefix from exported functions
>>       migration/rdma: Check sooner if we are in postcopy for save_page()
>> Vladimir Sementsov-Ogievskiy (5):
>>       runstate: add runstate_get()
>>       migration: never fail in global_state_store()
>>       runstate: drop unused runstate_store()
>>       migration: switch from .vm_was_running to .vm_old_state
>>       migration: restore vmstate on migration failure
>
> Appears to introduce multiple avocado failures:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
> not an instance of type qio-channel-rdma
> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument
> Broken pipe

And, as it couldn't be any other way, on my machine (with upstream qemu,
i.e. with none of my changes in):

$ make check-avocado (check-acceptance fails in the same test)

...

STARTED
 (063/243) tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR: ConnectError: Failed to establish session: EOFError\n Exit code: 1\n Command: ./qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,fd=9 -mon chardev=mon,mode=control -machine q35 -chardev socket,id=console,path=/var/tmp/qemu_yv2aqehm/7f8b6cf5... (2.52 s)
Interrupting job (failfast).
RESULTS : PASS 42 | ERROR 1 | FAIL 0 | SKIP 199 | WARN 0 | INTERRUPT 0 | CANCEL 1
JOB TIME : 71.66 s

Test summary:
tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR
make: *** [/mnt/code/qemu/full/tests/Makefile.include:142: check-avocado] Error 9

The good news: some tests passed. Sixty something.
The bad news: it is failing on the kvm xen guest test and I have no clue
why/where.

Is there some documentation that can help me here?

Thanks, Juan.
Juan Quintela <quintela@redhat.com> wrote:
> Richard Henderson <richard.henderson@linaro.org> wrote:
>> On 5/30/23 11:25, Juan Quintela wrote:
>>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>>> Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>>> into staging (2023-05-29 14:31:52 -0700)
>>> are available in the Git repository at:
>>> https://gitlab.com/juan.quintela/qemu.git
>>> tags/migration-20230530-pull-request
>>> for you to fetch changes up to
>>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>>> migration/rdma: Check sooner if we are in postcopy for
>>> save_page() (2023-05-30 19:23:50 +0200)
>>> ----------------------------------------------------------------
>>> Migration 20230530 Pull request (take 2)
>>> Hi
>>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>>> not configured in.
>>> [take 1]
>>> On this PULL request:
>>> - Set vmstate migration failure right (vladimir)
>>> - Migration QEMUFileHook removal (juan)
>>> - Migration Atomic counters (juan)
>>> Please apply.
>>> ----------------------------------------------------------------
>>> Juan Quintela (16):
>>>       migration: Don't abuse qemu_file transferred for RDMA
>>>       migration/RDMA: It is accounting for zero/normal pages in two places
>>>       migration/rdma: Remove QEMUFile parameter when not used
>>>       migration/rdma: Don't use imaginary transfers
>>>       migration: Remove unused qemu_file_credit_transfer()
>>>       migration/rdma: Simplify the function that saves a page
>>>       migration: Create migrate_rdma()
>>>       migration/rdma: Unfold ram_control_before_iterate()
>>>       migration/rdma: Unfold ram_control_after_iterate()
>>>       migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>>       migration/rdma: Unfold hook_ram_load()
>>>       migration/rdma: Create rdma_control_save_page()
>>>       qemu-file: Remove QEMUFileHooks
>>>       migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>>       migration/rdma: Remove qemu_ prefix from exported functions
>>>       migration/rdma: Check sooner if we are in postcopy for save_page()
>>> Vladimir Sementsov-Ogievskiy (5):
>>>       runstate: add runstate_get()
>>>       migration: never fail in global_state_store()
>>>       runstate: drop unused runstate_store()
>>>       migration: switch from .vm_was_running to .vm_old_state
>>>       migration: restore vmstate on migration failure
>>
>> Appears to introduce multiple avocado failures:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>>
>> Test summary:
>> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
>> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
>> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>>
>> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>>
>> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is
>> not an instance of type qio-channel-rdma
>> qemu-system-aarch64: Not a migration stream
>> qemu-system-aarch64: load of migration failed: Invalid argument

And now I am stuck here. Neither migration-test nor the avocado tests
exercise RDMA migration, so clearly I broke something that doesn't show
up in migration-test but does show up in avocado.

Still trying to get avocado to behave on my local test machine.

Later, Juan.

>> Broken pipe
>
> And as it couldn't be anyother way, on my machine (with upstream qemu,
> i.e. none of my changes in):
>
> $ make check-avocado (check-acceptance fails in the same test)
>
> ...
>
> STARTED
> (063/243) tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR: ConnectError: Failed to establish session: EOFError\n Exit code: 1\n Command: ./qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,fd=9 -mon chardev=mon,mode=control -machine q35 -chardev socket,id=console,path=/var/tmp/qemu_yv2aqehm/7f8b6cf5... (2.52 s)
> Interrupting job (failfast).
> RESULTS : PASS 42 | ERROR 1 | FAIL 0 | SKIP 199 | WARN 0 | INTERRUPT 0 | CANCEL 1
> JOB TIME : 71.66 s
>
> Test summary:
> tests/avocado/kvm_xen_guest.py:KVMXenGuest.test_kvm_xen_guest_nomsi: ERROR
> make: *** [/mnt/code/qemu/full/tests/Makefile.include:142: check-avocado] Error 9
>
>
> the good news: Some tests passed. sixty something.
> the bad news: it is failing on kvm xen guest and I have no clue
> why/where.
>
> Is there some documentation that can help me here?
>
> Thanks, Juan.
Richard Henderson <richard.henderson@linaro.org> wrote:
> On 5/30/23 11:25, Juan Quintela wrote:
>> The following changes since commit aa9bbd865502ed517624ab6fe7d4b5d89ca95e43:
>> Merge tag 'pull-ppc-20230528' of https://gitlab.com/danielhb/qemu
>> into staging (2023-05-29 14:31:52 -0700)
>> are available in the Git repository at:
>> https://gitlab.com/juan.quintela/qemu.git
>> tags/migration-20230530-pull-request
>> for you to fetch changes up to
>> c63c544005e6b1375a9c038f0e0fb8dfb8b249f4:
>> migration/rdma: Check sooner if we are in postcopy for
>> save_page() (2023-05-30 19:23:50 +0200)
>> ----------------------------------------------------------------
>> Migration 20230530 Pull request (take 2)
>> Hi
>> Resend last PULL request, this time it compiles when CONFIG_RDMA is
>> not configured in.
>> [take 1]
>> On this PULL request:
>> - Set vmstate migration failure right (vladimir)
>> - Migration QEMUFileHook removal (juan)
>> - Migration Atomic counters (juan)
>> Please apply.
>> ----------------------------------------------------------------
>> Juan Quintela (16):
>>       migration: Don't abuse qemu_file transferred for RDMA
>>       migration/RDMA: It is accounting for zero/normal pages in two places
>>       migration/rdma: Remove QEMUFile parameter when not used
>>       migration/rdma: Don't use imaginary transfers
>>       migration: Remove unused qemu_file_credit_transfer()
>>       migration/rdma: Simplify the function that saves a page
>>       migration: Create migrate_rdma()
>>       migration/rdma: Unfold ram_control_before_iterate()
>>       migration/rdma: Unfold ram_control_after_iterate()
>>       migration/rdma: Remove all uses of RAM_CONTROL_HOOK
>>       migration/rdma: Unfold hook_ram_load()
>>       migration/rdma: Create rdma_control_save_page()
>>       qemu-file: Remove QEMUFileHooks
>>       migration/rdma: Move rdma constants from qemu-file.h to rdma.h
>>       migration/rdma: Remove qemu_ prefix from exported functions
>>       migration/rdma: Check sooner if we are in postcopy for save_page()
>> Vladimir Sementsov-Ogievskiy (5):
>>       runstate: add runstate_get()
>>       migration: never fail in global_state_store()
>>       runstate: drop unused runstate_store()
>>       migration: switch from .vm_was_running to .vm_old_state
>>       migration: restore vmstate on migration failure

Self nack.

> Appears to introduce multiple avocado failures:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066518#L286
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_exec: ERROR

No clue what is going on here. I haven't touched exec migration (famous
last words).

> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost:
> ERROR

This is tested by ./tests/migration-test.
It passes for me with qemu-system-x86_64 (kvm), qemu-system-i386 (kvm)
and aarch64 (tcg), which is what make check tests.

> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR

Dunno again.

> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> https://gitlab.com/qemu-project/qemu/-/jobs/4378066523#L387
>
> Test summary:
> tests/avocado/migration.py:X86_64.test_migration_with_tcp_localhost: ERROR
> tests/avocado/migration.py:X86_64.test_migration_with_unix: ERROR
> make: *** [/builds/qemu-project/qemu/tests/Makefile.include:142: check-avocado] Error 1
>
> Also fails QTEST_QEMU_BINARY=./qemu-system-aarch64 ./tests/qtest/migration-test
>
> ../src/migration/rdma.c:408:QIO_CHANNEL_RDMA: Object 0xaaaaf7bba680 is

This is something that changed in the series, so I will check it with
these avocado tests. No clue why it passes with x86_64 and not with
aarch64.

> not an instance of type qio-channel-rdma

Here there was an error creating the RDMA channel somehow. "Somehow" is
the important word in the previous sentence.

> qemu-system-aarch64: Not a migration stream
> qemu-system-aarch64: load of migration failed: Invalid argument
> Broken pipe

I will isolate the non-RDMA changes and try to pass the RDMA ones
through avocado.

I am sorry for the noise.

Later, Juan.