From: Zhang Chen <chen.zhang@intel.com>

Advanced Watch Dog (AWD) is a universal monitoring module on the VMM side. It
can be used to detect a network outage (VMM to guest, VMM to VMM, or VMM to
another remote server) and then perform a previously configured operation. The
current AWD patch accepts any input as the signal to refresh the watchdog
timer; a specific interactive protocol could also be defined here. For the
outputs, the user can pre-write commands or messages in the AWD opt-script.

We noticed that there is currently no way for VMMs to communicate directly.
Some people may think we don't need such a thing, since upper-layer software
like OpenStack can handle it. However, in engaging with real customers we found
that they need a lightweight and efficient mechanism to solve some practical
problems, for example in Edge Computing cases (they think high-level software
is too heavy to use at the edge, or too hard to manage and combine with VM
instances). AWD gives users basic VM/host network monitoring tools and a basic
fault tolerance and recovery solution.

Please see the detailed documentation in the last patch.

V4:
 - Add more introduction in qemu-options.hx.
 - Addressed Paolo's comments: add docs/awd.txt for the AWD module details.

V3:
 - Rebased on QEMU 4.2.0-rc1 code.
 - Fix commit message issue.

V2:
 - Addressed Philippe's comments: add configure selector for AWD.

Initial:
 - Initial version.

Zhang Chen (5):
  net/awd.c: Introduce Advanced Watch Dog module framework
  net/awd.c: Initailize input/output chardev
  net/awd.c: Load advanced watch dog worker thread job
  vl.c: Make Advanced Watch Dog delayed initialization
  docs/awd.txt: Add doc to introduce Advanced WatchDog(AWD) module

 configure         |   9 +
 docs/awd.txt      |  88 +++++++++
 net/Makefile.objs |   1 +
 net/awd.c         | 491 ++++++++++++++++++++++++++++++++++++++++++++++
 qemu-options.hx   |  20 ++
 vl.c              |   7 +
 6 files changed, 616 insertions(+)
 create mode 100644 docs/awd.txt
 create mode 100644 net/awd.c

-- 
2.17.1
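As a rough illustration of the behavior the cover letter describes (any input refreshes the timer, and a pre-set action runs on expiry), here is a minimal Python sketch. It is not taken from net/awd.c: the class name `HeartbeatWatchdog`, its parameters, and the injectable `clock` are all hypothetical and exist only to make the semantics concrete.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical model of the AWD timer semantics; not code from net/awd.c."""

    def __init__(self, timeout_s, on_timeout, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout  # stands in for the opt-script action
        self.clock = clock            # injectable so the logic can be tested
        self.deadline = clock() + timeout_s
        self.fired = False

    def feed(self, data):
        """Any non-empty input counts as a heartbeat and refreshes the timer."""
        if data and not self.fired:
            self.deadline = self.clock() + self.timeout_s

    def poll(self):
        """Run the action once when the deadline has passed; return True if expired."""
        if not self.fired and self.clock() >= self.deadline:
            self.fired = True
            self.on_timeout()
        return self.fired
```

In this sketch the content of `data` is never inspected, which mirrors the "accept any input" behavior; a stricter protocol would only change `feed`.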
Hi All,

There has been no news about this series for a while.

This version already adds new docs to address Paolo's comments.

Please give me more comments.

Thanks
Zhang Chen

On 12/17/2019 8:45 PM, Zhang, Chen wrote:
> From: Zhang Chen <chen.zhang@intel.com>
>
> Advanced Watch Dog is a universal monitoring module on the VMM side
> [...]
Hi~

Does anyone have comments about this module?
Some of our clients are already trying to use this module with COLO, so please review this part.
If no one wants to maintain this module, I can maintain it myself.

Thanks
Zhang Chen

> -----Original Message-----
> From: Qemu-devel <qemu-devel-bounces+chen.zhang=intel.com@nongnu.org> On Behalf Of Zhang, Chen
> Sent: Tuesday, January 7, 2020 12:33 PM
> To: Jason Wang <jasowang@redhat.com>; Paolo Bonzini <pbonzini@redhat.com>; Philippe Mathieu-Daudé <philmd@redhat.com>; qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>
> Subject: Re: [PATCH V4 0/5] Introduce Advanced Watch Dog module
>
> Hi All,
>
> There has been no news about this series for a while.
>
> This version already adds new docs to address Paolo's comments.
>
> Please give me more comments.
>
> Thanks
> Zhang Chen
>
> [...]
On 2020/1/19 5:10 PM, Zhang, Chen wrote:
> Hi~
>
> Does anyone have comments about this module?

Hi Chen:

I will take a look at this series.

Two general questions:

- if it can detect more than a network stall, it should not belong in /net
- we need to convince the libvirt guys about this proposal, since this is usually the duty of the upper layer rather than QEMU itself

Thanks

> Some of our clients are already trying to use this module with COLO, so please review this part.
> If no one wants to maintain this module, I can maintain it myself.
>
> Thanks
> Zhang Chen
>
> [...]
> -----Original Message-----
> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, January 20, 2020 10:57 AM
> To: Zhang, Chen <chen.zhang@intel.com>; Paolo Bonzini <pbonzini@redhat.com>; Philippe Mathieu-Daudé <philmd@redhat.com>; qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>
> Subject: Re: [PATCH V4 0/5] Introduce Advanced Watch Dog module
>
> On 2020/1/19 5:10 PM, Zhang, Chen wrote:
> > Hi~
> >
> > Does anyone have comments about this module?
>
> Hi Chen:
>
> I will take a look at this series.

Sorry for the slow reply; I was away for CNY and an extended leave.
OK, I am waiting for your comments. Thanks~

> Two general questions:
>
> - if it can detect more than a network stall, it should not belong in /net

This module uses network connection status to detect all of these issues (host to guest, host to host, host to admin...).
The target is more than the network, but every check goes over the network, so this looks like a tricky problem.

> - we need to convince the libvirt guys about this proposal, since this is
> usually the duty of the upper layer rather than QEMU itself

Yes, it looks like an upper-layer responsibility, but in the cover letter I explained why we need this in QEMU.
I try to keep this module as simple as possible. It gives upper-layer software a new way to connect to and monitor QEMU.
And because all of the COLO code is implemented on the QEMU side, many customers want to use this FT solution without other dependencies, which makes it very easy to integrate into a real product.

Thanks
Zhang Chen

> Thanks
>
> [...]
On 2020/2/11 4:58 PM, Zhang, Chen wrote:
>> Two general questions:
>>
>> - if it can detect more than a network stall, it should not belong in /net
> This module uses network connection status to detect all of these issues (host to guest, host to host, host to admin...).
> The target is more than the network, but every check goes over the network, so this looks like a tricky problem.

Ok.

>> - we need to convince the libvirt guys about this proposal, since this is
>> usually the duty of the upper layer rather than QEMU itself
>>
> Yes, it looks like an upper-layer responsibility, but in the cover letter I explained why we need this in QEMU.
> I try to keep this module as simple as possible. It gives upper-layer software a new way to connect to and monitor QEMU.
> And because all of the COLO code is implemented on the QEMU side, many customers want to use this FT solution without other dependencies,
> which makes it very easy to integrate into a real product.
>
> Thanks
> Zhang Chen

I would like to hear from libvirt about such a design.

Thanks
On 2/12/2020 10:56 AM, Jason Wang wrote:
> On 2020/2/11 4:58 PM, Zhang, Chen wrote:
> [...]
>
> I would like to hear from libvirt about such a design.

Hi Jason,

OK. I have added the libvirt mailing list to this thread.

The full mail discussion and patches:

https://lists.nongnu.org/archive/html/qemu-devel/2020-02/msg02611.html

By the way, I noticed Eric is a libvirt maintainer.

Hi Eric and Paolo, can you give some comments about this series?

Thanks
Zhang Chen

> Thanks
> [...]
> > I would like to hear from libvirt about such a design.
>
> Hi Jason,
>
> OK. I have added the libvirt mailing list to this thread.
>
> The full mail discussion and patches:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2020-02/msg02611.html
>
> By the way, I noticed Eric is a libvirt maintainer.
>
> Hi Eric and Paolo, can you give some comments about this series?

No news for a while...
We already have some users (cloud service providers) trying to use this module in their product.
But they also need to follow the QEMU upstream code.

Thanks
Zhang Chen
On 04/03/20 09:06, Zhang, Chen wrote:
>> Hi Eric and Paolo, can you give some comments about this series?
>>
> No news for a while...
> We already have some users (cloud service providers) trying to use this module in their product.
> But they also need to follow the QEMU upstream code.

My main comment about this series is that it's not clear why it is
needed and how to use it. The documentation includes a demo, but no
description of what is an awd_node, a notification_node and an
opt_script. I can more or less understand the notification_node and
opt_script role from the documentation, but not entirely because, for
example, the two-host demo has hardcoded IP addresses without saying
which host is which IP address.

The documentation does not describe the protocol, which is absolutely
necessary, and does not describe _why_ the protocol was designed like
that. Without such documentation it's not clear if, for example, the
watchdog protocol could be implemented as QMP commands (e.g.
start-watchdog, stop-watchdog, notify-watchdog). Another possibility
could be to use the systemd watchdog protocol, which consists of
essentially three commands (WATCHDOG=1, WATCHDOG=trigger,
WATCHDOG_USEC=...) which are transmitted as datagrams. Documentation is
important for reviewers to judge the merits of the protocol without (or
before) diving into the code.

In the demo, the opt_script mechanism is currently using the "human"
monitor as opposed to QMP. The human monitor interface is not stable
and not meant for consumption by management interfaces. It is not clear
if this is just a sample usage, and in practice the notification_node
would be outside of QEMU, or not. In general I would prefer to have the
script as an optional feature, and report the triggering of the watchdog
via QMP events.

Paolo
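For reference, the systemd-style keep-alive mentioned above can be sent with a few lines of code. The sketch below is a Python illustration, assuming the receiver listens on an AF_UNIX datagram socket whose path is given in the `NOTIFY_SOCKET` environment variable, as systemd arranges for its services. The `notify` helper and its `path` parameter are hypothetical names, and nothing here is part of the AWD series.

```python
import os
import socket


def notify(message, path=None):
    """Send one systemd-style notification datagram, e.g. "WATCHDOG=1".

    Returns False when no socket path is configured, True after sending.
    """
    path = path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # nobody is listening; treat as a no-op
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), path)
    return True
```

A monitored process would call `notify("WATCHDOG=1")` periodically, at an interval comfortably shorter than the configured timeout. Because each message is a separate datagram, the receiver needs no framing logic.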
On 3/4/2020 9:37 PM, Paolo Bonzini wrote:
> On 04/03/20 09:06, Zhang, Chen wrote:
>>> Hi Eric and Paolo, can you give some comments about this series?
>>>
>> No news for a while...
>> We already have some users (cloud service providers) trying to use this module in their product.
>> But they also need to follow the QEMU upstream code.
> My main comment about this series is that it's not clear why it is
> needed and how to use it. The documentation includes a demo, but no
> description of what is an awd_node, a notification_node and an
> opt_script. I can more or less understand the notification_node and
> opt_script role from the documentation, but not entirely because, for
> example, the two-host demo has hardcoded IP addresses without saying
> which host is which IP address.

Hi Paolo,

Sorry for the slow reply, and thank you for your comments.

Let me summarize your main points and my responses:

1. Why AWD is needed.

Advanced Watch Dog is a universal monitoring module on the VMM side. It can
be used to detect a network outage (VMM to guest, VMM to VMM, or VMM to
another remote server) and then perform a previously configured operation.
The current AWD patch accepts any input as the signal to refresh the
watchdog timer; a specific interactive protocol could also be defined here.
For the outputs, the user can pre-write commands or messages in the AWD
opt-script. We noticed that there is currently no way for VMMs to
communicate directly. Some people may think we don't need such a thing,
since upper-layer software like OpenStack can handle it. However, in
engaging with real customers we found that they need a lightweight and
efficient mechanism to solve some practical problems, for example in Edge
Computing cases (they think high-level software is too heavy to use at the
edge, or too hard to manage and combine with VM instances). AWD gives users
basic VM/host network monitoring tools and a basic fault tolerance and
recovery solution.

For the COLO FT/HA solution, we already have some CSPs trying to use AWD with COLO.

2. Documentation issues, including how to use it.

I will address all your comments and fill in the details in the documentation.

3. Communication protocol issue.

The current AWD has no protocol: any data it receives is treated as a
heartbeat signal. Using the QMP format is fine with me.

4. Implementation issue.

Making the AWD script an optional feature is OK for me.

Reporting the triggering of the watchdog via QMP events is enough for
current usage, but it looks limited when it comes to notifying parties
outside QEMU. I don't know which is the better choice.

If the QMP events solution is better, I will switch to it in the next version.

I am not sure I have understood your meaning correctly, so please give me
more guidance on this series. :-)

Thanks
Zhang Chen

> The documentation does not describe the protocol, which is absolutely
> necessary, and does not describe _why_ the protocol was designed like
> that.
> [...]
>
> Paolo
On 09/03/20 10:32, Zhang, Chen wrote:
> 4. Implementation issue.
>
> The AWD script as an optional feature is OK for me.
>
> And report the triggering of the watchdog via QMP events is enough for
> current usage.
>
> But it looks have limitation to notify outside Qemu. I don't know which
> is better choice.
>
> If the QMP events solution is better, I will fix it in next version.

Good, thanks. Naming-wise, it's ugly that we already have a WATCHDOG
event for guest watchdog devices. The following design however should
allow setting up multiple watchdogs:

- Creating a watchdog from the command line:

    -object watchdog,id=STR,timeout=NNN,chardev=CHR

  and object_add/object-add can also be used for HMP and QMP.

- Reporting a watchdog timeout via QMP:

    { 'event': 'WATCHDOG_TIMEOUT',
      'data': { 'id': 'str' } }

- Protocol: the data sent on the chardev to QEMU must be

    WATCHDOG=1

  optionally followed by exactly one \n character. All other data is
  ignored.

Paolo
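The protocol rule and the event in this proposal are small enough to sketch directly. The Python below is an illustration only, under two assumptions that the thread does not settle: each read from the chardev delivers exactly one message (as a datagram-style backend would), and the helper names `is_heartbeat` and `timeout_event` are hypothetical. The `timestamp` member follows the general shape of QMP events; the proposed event itself only defines `id`.

```python
import json


def is_heartbeat(chunk: bytes) -> bool:
    """True only for WATCHDOG=1, optionally followed by exactly one newline.

    Assumes one chardev read equals one message; everything else is ignored.
    """
    return chunk in (b"WATCHDOG=1", b"WATCHDOG=1\n")


def timeout_event(watchdog_id: str, seconds: int, microseconds: int) -> str:
    """Serialize a WATCHDOG_TIMEOUT event as it might appear on the QMP wire."""
    return json.dumps({
        "event": "WATCHDOG_TIMEOUT",
        "data": {"id": watchdog_id},
        # QMP events carry a timestamp; the values are supplied by the caller.
        "timestamp": {"seconds": seconds, "microseconds": microseconds},
    })
```

Carrying the object `id` in the event is what lets a management client tell several watchdogs apart, which matches the stated goal of allowing multiple watchdogs.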