[PATCH 0/1] Do not stop guest when panic event is received

Alejandro Jimenez posted 1 patch 3 years, 7 months ago
Test docker-quick@centos7 failed
Test docker-mingw@fedora failed
Test checkpatch failed
Test FreeBSD failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1601606494-1154-1-git-send-email-alejandro.j.jimenez@oracle.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>
qemu-options.hx | 11 +++++++++++
softmmu/vl.c    | 17 ++++++++++++++---
2 files changed, 25 insertions(+), 3 deletions(-)
[PATCH 0/1] Do not stop guest when panic event is received
Posted by Alejandro Jimenez 3 years, 7 months ago
The following patch adds an option to request that QEMU does not stop the VM when a panic event is received.
This allows guests in cloud environments to report the panic condition to the control plane, but be able to
proceed to collect a crash dump and automatically reboot, without waiting to receive one or several 'cont'
monitor commands.

I am aware of a previous discussion regarding the decision to stop the guest on panic event:
https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/
that is why I propose explicitly using a parameter to change the default behavior when necessary.

The PVPANIC_CRASHLOADED event was introduced in the v5.6 kernel, and it is intended to tell QEMU that the guest
will handle the panic condition by itself, but unfortunately older kernels will only support sending the
PVPANIC_PANICKED event, for which the default behavior is to pause the VM.

Having a '-no-panicstop' option allows for older guest kernels that do not support the PVPANIC_CRASHLOADED event
to behave in the same way as newer kernels, simplifying control plane code. It also provides the same advantage
when launching Windows guests with the hv-crash enlightenment, since the hv-crash MSR writes are ultimately
handled by QEMU as if the guest had sent a PVPANIC_PANICKED event.

The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).

Please let me know if you have any comments or suggestions.

Regards,
Alejandro

Alejandro Jimenez (1):
  vl: Add -no-panicstop option

 qemu-options.hx | 11 +++++++++++
 softmmu/vl.c    | 17 ++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

-- 
1.8.3.1


Re: [PATCH 0/1] Do not stop guest when panic event is received
Posted by Paolo Bonzini 3 years, 6 months ago
On 02/10/20 04:41, Alejandro Jimenez wrote:
> The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
> option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).
> 
> Please let me know if you have any comments or suggestions.

Hi Alejandro, sorry for the delayed response.

The concept is fine, and I agree this should not be a device property.

On the other hand, we already have many similar options: -no-reboot,
-no-shutdown, -watchdog-action and now --no-panicstop.

I think it's time to group them into a single option:

* -action reboot=pause|shutdown|none
* -action shutdown=pause|poweroff|none
* -action panic=pause|shutdown|none
* -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi

where the existing options would translate to the new option, like:

* -no-reboot "-action reboot=shutdown"
* -no-shutdown "-action shutdown=pause"

The implementation should be relatively easy too; there's already an
enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
parsing function select_watchdog_action that can be changed to just
return the RunstateAction.

Would you like to take a look at this?

Thanks,

Paolo


Re: [PATCH 0/1] Do not stop guest when panic event is received
Posted by Alejandro Jimenez 3 years, 6 months ago

On 10/20/2020 1:14 PM, Paolo Bonzini wrote:
> On 02/10/20 04:41, Alejandro Jimenez wrote:
>> The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
>> option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).
>>
>> Please let me know if you have any comments or suggestions.
> Hi Alejandro, sorry for the delayed response.
>
> The concept is fine, and I agree this should not be a device property.
>
> On the other hand, we already have many similar options: -no-reboot,
> -no-shutdown, -watchdog-action and now --no-panicstop.
>
> I think it's time to group them into a single option:
>
> * -action reboot=pause|shutdown|none
> * -action shutdown=pause|poweroff|none
> * -action panic=pause|shutdown|none
> * -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi
>
> where the existing options would translate to the new option, like:
>
> * -no-reboot "-action reboot=shutdown"
> * -no-shutdown "-action shutdown=pause"
>
> The implementation should be relatively easy too; there's already an
> enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
> parsing function select_watchdog_action that can be changed to just
> return the RunstateAction.
>
> Would you like to take a look at this?
Hi Paolo,

Thank you for your reply and the advice/hints above. I'll take a look 
and try to implement what you propose.

Regards,
Alejandro
>
> Thanks,
>
> Paolo
>


Re: [PATCH 0/1] Do not stop guest when panic event is received
Posted by Paolo Bonzini 3 years, 6 months ago
On 21/10/20 15:26, Alejandro Jimenez wrote:
> 
> 
> On 10/20/2020 1:14 PM, Paolo Bonzini wrote:
>> On 02/10/20 04:41, Alejandro Jimenez wrote:
>>> The fact that the behavior of hv-crash is also affected is why I
>>> chose to implement this change as an independent
>>> option, as opposed to making it a property of the pvpanic device
>>> (e.g. -device pvpanic,no-panicstop).
>>>
>>> Please let me know if you have any comments or suggestions.
>> Hi Alejandro, sorry for the delayed response.
>>
>> The concept is fine, and I agree this should not be a device property.
>>
>> On the other hand, we already have many similar options: -no-reboot,
>> -no-shutdown, -watchdog-action and now --no-panicstop.
>>
>> I think it's time to group them into a single option:
>>
>> * -action reboot=pause|shutdown|none
>> * -action shutdown=pause|poweroff|none
>> * -action panic=pause|shutdown|none
>> * -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi
>>
>> where the existing options would translate to the new option, like:
>>
>> * -no-reboot "-action reboot=shutdown"
>> * -no-shutdown "-action shutdown=pause"
>>
>> The implementation should be relatively easy too; there's already an
>> enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
>> parsing function select_watchdog_action that can be changed to just
>> return the RunstateAction.
>>
>> Would you like to take a look at this?
> Hi Paolo,
> 
> Thank you for your reply and the advice/hints above. I'll take a look
> and try to implement what you propose.

Just one thing, for the parsing you can place it close to the existing

    qemu_opts_foreach(qemu_find_opts("name"),
                      parse_name, NULL, &error_fatal);

Paolo