Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 18.05.23 14:23, Juan Quintela wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>>> Hi all.
>>>
>>> The problem I want to solve is that the guest-panicked state may be lost
>>> when migration fails (or is cancelled) after the source has stopped.
>>>
>>> Still, I try to go further and restore all possible paused states in the
>>> same way. The key patch is the last one and others are refactoring and
>>> preparation.
>> Hi
>> I like and agree with the spirit of the series in general. But I think
>> that we need to drop the "never fail in global_state_store()". We
>> shouldn't kill a guest because we found a bug on migration.
>>
>
> Why is migration better in this sense than non-migration? We have a
> lot of places where we just assert things instead of creating
> unreachable error messages. I think assert/abort is always better in
> such cases. Really, if we fail in this assertion it means that memory
> is corrupted, and stopping the execution is the best thing to do.
>
> (Should we consider the case that in future we add 100 character length vmstate? I hope we should not)
OK, I give up and will integrate the series as it is O:-)
I agree that this is a case that shouldn't happen, so assert() is not
out of the question.
What I am trying to get migration to do is to really detect errors and be
able to recover from them. My long-term crusade is getting rid of
qemu_file_get_error() and just checking the return value of functions that
do IO. Yes, it is a big long-term project because we need to change the
whole interface to something saner.
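
To make the distinction concrete, here is a rough sketch of what I mean.
It is not QEMU code: Stream, stream_put_sticky(), stream_get_error(),
stream_put() and save_state() are all hypothetical names made up for this
mail, and the 16-byte buffer is arbitrary. It only contrasts a "sticky
error" interface with one that returns an error code, and uses assert()
only for programmer errors that should never happen.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a migration stream; NOT QEMU's QEMUFile. */
typedef struct Stream {
    unsigned char buf[16];  /* arbitrary small capacity for the example */
    size_t used;            /* bytes already written, always <= sizeof(buf) */
    int last_error;         /* sticky error, used only by the old-style API */
} Stream;

/* Old style: returns nothing, the failure is recorded inside the stream. */
void stream_put_sticky(Stream *s, const void *data, size_t len)
{
    assert(s != NULL && data != NULL);  /* programmer error, not an IO error */
    if (s->last_error) {
        return;  /* writes after the first failure are silently dropped */
    }
    if (len > sizeof(s->buf) - s->used) {
        s->last_error = -ENOSPC;
        return;
    }
    memcpy(s->buf + s->used, data, len);
    s->used += len;
}

/* Old style: the caller has to remember to ask for the error later. */
int stream_get_error(const Stream *s)
{
    assert(s != NULL);
    return s->last_error;
}

/* New style: returns 0 on success or a negative errno value on failure. */
int stream_put(Stream *s, const void *data, size_t len)
{
    assert(s != NULL && data != NULL);  /* programmer error, not an IO error */
    if (len > sizeof(s->buf) - s->used) {
        return -ENOSPC;  /* nothing written, stream left unchanged */
    }
    memcpy(s->buf + s->used, data, len);
    s->used += len;
    return 0;
}

/* A caller that sees the failure at the call site and can propagate it. */
int save_state(Stream *s, const char *name)
{
    int ret = stream_put(s, name, strlen(name) + 1);  /* include the NUL */
    if (ret < 0) {
        return ret;  /* caller can cancel the migration and keep the guest */
    }
    return 0;
}
```

With the second form the error is visible exactly where it happens, so
the caller can unwind instead of discovering it much later (or never).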
Later, Juan.