RE: [PATCH V2 0/4] Introduce Advanced Watch Dog module

Zhang, Chen posted 4 patches 4 years, 5 months ago
RE: [PATCH V2 0/4] Introduce Advanced Watch Dog module
Posted by Zhang, Chen 4 years, 5 months ago
Hi~ All~ 

Ping.... Anyone have time to review this series? I need more comments~

Thanks
Zhang Chen

> -----Original Message-----
> From: Zhang, Chen <chen.zhang@intel.com>
> Sent: Friday, November 1, 2019 10:49 AM
> To: Jason Wang <jasowang@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; Philippe Mathieu-Daudé <philmd@redhat.com>;
> qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Zhang, Chen
> <chen.zhang@intel.com>
> Subject: [PATCH V2 0/4] Introduce Advanced Watch Dog module
> 
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Advanced Watch Dog (AWD) is a universal monitoring module on the VMM side.
> It can be used to detect a network outage (VMM to guest, VMM to VMM, or
> VMM to another remote server) and then perform a previously configured
> operation. The current AWD patch simply accepts any input as the signal to
> refresh the watchdog timer; a specific interactive protocol could also be
> defined here. For the output, the user can pre-write commands or messages
> in the AWD opt-script. We noticed that there is no way for VMMs to
> communicate directly. Some people may think we don't need such a thing
> (upper-layer software like OpenStack can handle it), but in working with
> real customers we found that in some cases they need a lightweight and
> efficient mechanism to solve practical problems (OpenStack is too heavy).
> For example: when it detects a lost connection with the paired node, it
> can send a message to the admin, notify another VMM, send a QMP command to
> QEMU to perform an operation such as restarting the VM, build a VMM
> heartbeat system, etc.
> It gives users basic VM/host network monitoring tools and a basic fault
> tolerance and recovery solution.
> 
> Demo usage (for a COLO heartbeat service):
> 
> On the primary node:
> 
> -chardev socket,id=h1,host=3.3.3.3,port=9009,server,nowait
> -chardev socket,id=heartbeat0,host=3.3.3.3,port=4445
> -object iothread,id=iothread2
> -object advanced-watchdog,id=heart1,server=on,awd_node=h1,notification_node=heartbeat0,opt_script=colo_opt_script_path,iothread=iothread1,pulse_interval=1000,timeout=5000
> 
> On the secondary node:
> 
> -monitor tcp::4445,server,nowait
> -chardev socket,id=h1,host=3.3.3.3,port=9009,reconnect=1
> -chardev socket,id=heart1,host=3.3.3.8,port=4445
> -object iothread,id=iothread1
> -object advanced-watchdog,id=heart1,server=off,awd_node=h1,notification_node=heart1,opt_script=colo_secondary_opt_script,iothread=iothread1,timeout=10000
> 
> 
> V2:
>  - Addressed Philippe's comments: added a configure selector for AWD.
> 
> Initial:
>  - Initial version.
> 
> Zhang Chen (4):
>   net/awd.c: Introduce Advanced Watch Dog module framework
>   net/awd.c: Initailize input/output chardev
>   net/awd.c: Load advanced watch dog worker thread job
>   vl.c: Make Advanced Watch Dog delayed initialization
> 
>  configure         |   9 +
>  net/Makefile.objs |   1 +
>  net/awd.c         | 491 ++++++++++++++++++++++++++++++++++++++++++++++
>  qemu-options.hx   |   6 +
>  vl.c              |   7 +
>  5 files changed, 514 insertions(+)
>  create mode 100644 net/awd.c
> 
> --
> 2.17.1
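
For readers skimming the thread, the timer behaviour described in the cover letter above (any input refreshes the timer, and a previously set operation runs once the timeout passes with no input) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in net/awd.c: the class name `WatchdogSketch`, the methods `feed` and `poll`, and the `on_timeout` callback (standing in for whatever the opt-script would do) are all hypothetical, and the injectable clock exists only to make the sketch easy to exercise.

```python
import time
from typing import Callable


class WatchdogSketch:
    """Toy model of the timer described in the cover letter (not net/awd.c)."""

    def __init__(self, timeout_ms: int, on_timeout: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_ms / 1000.0
        self.on_timeout = on_timeout      # hypothetical stand-in for the opt-script
        self.clock = clock                # injectable so the sketch can be tested
        self.deadline = self.clock() + self.timeout_s
        self.fired = False

    def feed(self, data: bytes) -> None:
        """Any non-empty input refreshes the timer; the content is ignored."""
        if data:
            self.deadline = self.clock() + self.timeout_s
            self.fired = False            # re-arm after the peer comes back

    def poll(self) -> bool:
        """Return True if expired; run the action once per expiry."""
        expired = self.clock() >= self.deadline
        if expired and not self.fired:
            self.fired = True
            self.on_timeout()
        return expired
```

With `timeout_ms=5000` (the value used on the primary node in the demo), a caller would invoke `feed()` whenever bytes arrive from the peer and `poll()` periodically; the callback then runs once if five seconds pass without any input, and the watchdog re-arms on the next input.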


Re: [PATCH V2 0/4] Introduce Advanced Watch Dog module
Posted by Markus Armbruster 4 years, 4 months ago
"Zhang, Chen" <chen.zhang@intel.com> writes:

> Hi~ All~ 
>
> Ping.... Anyone have time to review this series? I need more comments~

Any takers?


RE: [PATCH V2 0/4] Introduce Advanced Watch Dog module
Posted by Zhang, Chen 4 years, 4 months ago

> -----Original Message-----
> From: Markus Armbruster <armbru@redhat.com>
> Sent: Wednesday, November 27, 2019 11:49 PM
> To: Zhang, Chen <chen.zhang@intel.com>
> Cc: Jason Wang <jasowang@redhat.com>; Paolo Bonzini
> <pbonzini@redhat.com>; Philippe Mathieu-Daudé <philmd@redhat.com>;
> qemu-dev <qemu-devel@nongnu.org>; Zhang Chen <zhangckid@gmail.com>
> Subject: Re: [PATCH V2 0/4] Introduce Advanced Watch Dog module
> 
> "Zhang, Chen" <chen.zhang@intel.com> writes:
> 
> > Hi~ All~
> >
> > Ping.... Anyone have time to review this series? I need more comments~
> 
> Any takers?

Hi Markus,

Thank you for your attention.
This is a very simple module that handles error detection and automatic recovery actions.
I have written up the detailed reasons why we need it in real environments in the commit log.
Here is the latest patch:
https://lists.nongnu.org/archive/html/qemu-devel/2019-11/msg02872.html

Thanks
Zhang Chen