On 9/21/20 8:09 PM, Paolo Bonzini wrote:
> On 21/09/20 04:22, zhenwei pi wrote:
>> Hi,
>>
>> A patchset about handling 'MCE' might have been ignored, can anyone tell
>> me whether the purpose is reasonable?
>>
>> https://patchwork.kernel.org/cover/11773795/
>
> Yes, it's very useful. Just one thing, "guest-mce" can be reported for
> both AR and AO faults. Is it worth adding a 'type' field to distinguish
> the two?
>
> Paolo
>
Sure. How about adding a 'flags' structure with a field named
'action-required' to distinguish AO from AR?
>> On 9/14/20 9:43 PM, zhenwei pi wrote:
>>> Although QEMU can catch SIGBUS to handle a hardware memory
>>> corruption event, sadly, it only prints a short log message and tries
>>> to fix it silently.
>>>
>>> These patches introduce a 'MEMORY_FAILURE' event with 4 detailed
>>> actions of QEMU, so that the upper layer knows what situation QEMU hit
>>> and what it did. As a further step: if a host server hits
>>> 'hypervisor-ignore' or 'guest-mce', the scheduler could migrate the VM
>>> to another host; if it hits 'hypervisor-stop' or 'guest-triple-fault',
>>> the scheduler could select other healthy servers to launch the VM.
>>>
>>> zhenwei pi (3):
>>> target-i386: seperate MCIP & MCE_MASK error reason
>>> qapi/run-state.json: introduce memory failure event
>>> target-i386: post memory failure event to uplayer
>>>
>>> qapi/run-state.json | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>> target/i386/helper.c | 30 +++++++++++++++++++++++-------
>>> target/i386/kvm.c | 5 ++++-
>>> 3 files changed, 73 insertions(+), 8 deletions(-)
>>>
>>
>
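To make the intended use concrete, here is a rough sketch of how an upper-layer scheduler might react to the event. The four action names are the ones from the cover letter above. Everything else is an assumption for illustration only: the JSON layout of the event, the 'flags' / 'action-required' field (still only a proposal in this thread), and the decision names are hypothetical and not a final schema.

```python
import json

# Hypothetical scheduler decisions (names invented for this sketch).
MIGRATE = "migrate-vm"
RELAUNCH = "relaunch-elsewhere"
IGNORE = "no-op"

# Policy from the cover letter: migrate when the VM keeps running,
# relaunch on a healthy host when the VM was stopped or reset.
POLICY = {
    "hypervisor-ignore": MIGRATE,
    "guest-mce": MIGRATE,
    "hypervisor-stop": RELAUNCH,
    "guest-triple-fault": RELAUNCH,
}


def decide(event_line):
    """Return (decision, action_required) for one JSON event line."""
    msg = json.loads(event_line)
    if msg.get("event") != "MEMORY_FAILURE":
        return IGNORE, None
    data = msg.get("data", {})
    # Assumed layout: data.flags.action-required is True for AR, False
    # for AO, and None when the field is absent.
    action_required = data.get("flags", {}).get("action-required")
    return POLICY.get(data.get("action"), IGNORE), action_required


# Assumed event shape, for illustration only.
sample = ('{"event": "MEMORY_FAILURE", "data": {"action": "guest-mce", '
          '"flags": {"action-required": true}}}')
print(decide(sample))  # ('migrate-vm', True)
```

With a flag like this, a scheduler could treat an AR 'guest-mce' as more urgent than an AO one without parsing logs.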
--
zhenwei pi