[PATCH v2 0/3] add MEMORY_FAILURE event

zhenwei pi posted 3 patches 3 years, 7 months ago
Test docker-quick@centos7 failed
Test docker-mingw@fedora passed
Test checkpatch passed
Test FreeBSD passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200922095630.394893-1-pizhenwei@bytedance.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>, Richard Henderson <rth@twiddle.net>, Marcelo Tosatti <mtosatti@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>
There is a newer version of this series
qapi/run-state.json  | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++
target/i386/helper.c | 40 +++++++++++++++++++++++++------
target/i386/kvm.c    |  7 +++++-
3 files changed, 106 insertions(+), 8 deletions(-)
[PATCH v2 0/3] add MEMORY_FAILURE event
Posted by zhenwei pi 3 years, 7 months ago
v1->v2:
Suggested by Peter Maydell, rename events to make them
architecture-neutral:
'PC-RAM' -> 'guest-memory'
'guest-triple-fault' -> 'guest-mce-fatal'

Suggested by Paolo, add more fields in event:
'action-required': boolean type to distinguish a guest-mce is AR/AO.
'recursive': boolean type. set true if: previous MCE in processing
             in guest, another AO MCE occurs.

v1:
Although QEMU could catch signal BUS to handle hardware memory
corrupted event, sadly, QEMU just prints a little log and try to fix
it silently.

In these patches, introduce a 'MEMORY_FAILURE' event with 4 detailed
actions of QEMU, then uplayer could know what situaction QEMU hit and
did. And further step we can do: if a host server hits a 'hypervisor-ignore'
or 'guest-mce', scheduler could migrate VM to another host; if hitting
'hypervisor-stop' or 'guest-triple-fault', scheduler could select other
healthy servers to launch VM.

Zhenwei Pi (3):
  target-i386: seperate MCIP & MCE_MASK error reason
  qapi/run-state.json: introduce memory failure event
  target-i386: post memory failure event to uplayer

 qapi/run-state.json  | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/i386/helper.c | 40 +++++++++++++++++++++++++------
 target/i386/kvm.c    |  7 +++++-
 3 files changed, 106 insertions(+), 8 deletions(-)

-- 
2.11.0


Re: [PATCH v2 0/3] add MEMORY_FAILURE event
Posted by no-reply@patchew.org 3 years, 7 months ago
Patchew URL: https://patchew.org/QEMU/20200922095630.394893-1-pizhenwei@bytedance.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

C linker for the host machine: cc ld.bfd 2.27-43
Host machine cpu family: x86_64
Host machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Not run: 259
Failures: 192
Failed 1 of 121 iotests
make: *** [check-block] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=c2bcee7055544144a0155e97d2f7a118', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-prfzegjw/src/docker-src.2020-09-22-11.22.51.1488:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=c2bcee7055544144a0155e97d2f7a118
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-prfzegjw/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    17m39.310s
user    0m20.428s


The full log is available at
http://patchew.org/logs/20200922095630.394893-1-pizhenwei@bytedance.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
PING: [PATCH v2 0/3] add MEMORY_FAILURE event
Posted by zhenwei pi 3 years, 7 months ago
PING

On 9/22/20 5:56 PM, zhenwei pi wrote:
> v1->v2:
> Suggested by Peter Maydell, rename events to make them
> architecture-neutral:
> 'PC-RAM' -> 'guest-memory'
> 'guest-triple-fault' -> 'guest-mce-fatal'
> 
> Suggested by Paolo, add more fields in event:
> 'action-required': boolean type to distinguish a guest-mce is AR/AO.
> 'recursive': boolean type. set true if: previous MCE in processing
>               in guest, another AO MCE occurs.
> 
> v1:
> Although QEMU could catch signal BUS to handle hardware memory
> corrupted event, sadly, QEMU just prints a little log and try to fix
> it silently.
> 
> In these patches, introduce a 'MEMORY_FAILURE' event with 4 detailed
> actions of QEMU, then uplayer could know what situaction QEMU hit and
> did. And further step we can do: if a host server hits a 'hypervisor-ignore'
> or 'guest-mce', scheduler could migrate VM to another host; if hitting
> 'hypervisor-stop' or 'guest-triple-fault', scheduler could select other
> healthy servers to launch VM.
> 
> Zhenwei Pi (3):
>    target-i386: seperate MCIP & MCE_MASK error reason
>    qapi/run-state.json: introduce memory failure event
>    target-i386: post memory failure event to uplayer
> 
>   qapi/run-state.json  | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   target/i386/helper.c | 40 +++++++++++++++++++++++++------
>   target/i386/kvm.c    |  7 +++++-
>   3 files changed, 106 insertions(+), 8 deletions(-)
> 

-- 
zhenwei pi