27.07.2020 20:06, Vladimir Sementsov-Ogievskiy wrote:
> 27.07.2020 16:21, Dr. David Alan Gilbert wrote:
>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>> If target is turned off prior to postcopy finished, target crashes
>>> because busy bitmaps are found at shutdown.
>>> Canceling incoming migration helps, as it removes all unfinished (and
>>> therefore busy) bitmaps.
>>>
>>> Similarly on source we crash in bdrv_close_all which asserts that all
>>> bdrv states are removed, because bdrv states involved into dirty bitmap
>>> migration are referenced by it. So, we need to cancel outgoing
>>> migration as well.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
>>> ---
>>> migration/migration.h | 2 ++
>>> migration/block-dirty-bitmap.c | 16 ++++++++++++++++
>>> migration/migration.c | 13 +++++++++++++
>>> 3 files changed, 31 insertions(+)
>>>
>>> diff --git a/migration/migration.h b/migration/migration.h
>>> index ab20c756f5..6c6a931d0d 100644
>>> --- a/migration/migration.h
>>> +++ b/migration/migration.h
>>> @@ -335,6 +335,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
>>> void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
>>> void dirty_bitmap_mig_before_vm_start(void);
>>> +void dirty_bitmap_mig_cancel_outgoing(void);
>>> +void dirty_bitmap_mig_cancel_incoming(void);
>>> void migrate_add_address(SocketAddress *address);
>>> int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
>>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
>>> index c24d4614bf..a198ec7278 100644
>>> --- a/migration/block-dirty-bitmap.c
>>> +++ b/migration/block-dirty-bitmap.c
>>> @@ -657,6 +657,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
>>> s->bitmaps = NULL;
>>> }
>>> +void dirty_bitmap_mig_cancel_outgoing(void)
>>> +{
>>> + dirty_bitmap_do_save_cleanup(&dbm_state.save);
>>> +}
>>> +
>>> +void dirty_bitmap_mig_cancel_incoming(void)
>>> +{
>>> + DBMLoadState *s = &dbm_state.load;
>>> +
>>> + qemu_mutex_lock(&s->lock);
>>> +
>>> + cancel_incoming_locked(s);
>>> +
>>> + qemu_mutex_unlock(&s->lock);
>>> +}
>>> +
>>> static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
>>> {
>>> GSList *item;
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 1c61428988..8fe36339db 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -188,6 +188,19 @@ void migration_shutdown(void)
>>> */
>>> migrate_fd_cancel(current_migration);
>>> object_unref(OBJECT(current_migration));
>>> +
>>> + /*
>>> + * Cancel outgoing migration of dirty bitmaps. It should
>>> + * at least unref used block nodes.
>>> + */
>>> + dirty_bitmap_mig_cancel_outgoing();
>>> +
>>> + /*
>>> + * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
>>> + * are non-critical data, and their loss never considered as
>>> + * something serious.
>>> + */
>>> + dirty_bitmap_mig_cancel_incoming();
>>
>> Are you sure this is the right place to put them - I'm thinking that
>> perhaps the object_unref of current_migration should still be after
>> them?
>
> Hmm, looks strange, I will check.
It's OK. These functions are operate on global bitmap migration state
which is separate from current_migration, and do post-processing of
dirty bitmaps, so it's seems OK to do it at last.
>
>>
>>> }
>>> /* For outgoing */
>>> --
>>> 2.21.0
>>>
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>
>
>
--
Best regards,
Vladimir