[libvirt] [PATCH 0/4] libvirtd: fix hang on termination in qemu driver

Nikolay Shirokovskiy posted 4 patches 6 years, 6 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/1508922308-554701-1-git-send-email-nshirokovskiy@virtuozzo.com
daemon/libvirtd.c        |  2 ++
src/driver-state.h       |  4 ++++
src/libvirt.c            | 18 ++++++++++++++++++
src/libvirt_internal.h   |  1 +
src/libvirt_private.syms |  1 +
src/qemu/qemu_agent.c    | 14 +++++++-------
src/qemu/qemu_driver.c   | 39 +++++++++++++++++++++++++++++++++++++++
src/qemu/qemu_monitor.c  | 27 +++++++++++++--------------
8 files changed, 85 insertions(+), 21 deletions(-)
[libvirt] [PATCH 0/4] libvirtd: fix hang on termination in qemu driver
Posted by Nikolay Shirokovskiy 6 years, 6 months ago
Libvirtd termination can hang. For example if some API call in qemu
driver awaiting monitor response it will never finish because event
loop does not functional during termination. As a result we hang
in virNetDaemonClose call during termination as this call finishes RPC
threads.

Let's ask hypervisor drivers to finish all API calls by calling
introduced state driver shutdown function before call to virNetDaemonClose.

Nikolay Shirokovskiy (4):
  libvirt: introduce hypervisor driver shutdown function
  qemu: implement state driver shutdown function
  qemu: agent: fix monitor close during first sync
  qemu: monitor: check monitor not closed upon send

 daemon/libvirtd.c        |  2 ++
 src/driver-state.h       |  4 ++++
 src/libvirt.c            | 18 ++++++++++++++++++
 src/libvirt_internal.h   |  1 +
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_agent.c    | 14 +++++++-------
 src/qemu/qemu_driver.c   | 39 +++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor.c  | 27 +++++++++++++--------------
 8 files changed, 85 insertions(+), 21 deletions(-)

-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/4] libvirtd: fix hang on termination in qemu driver
Posted by John Ferlan 6 years, 5 months ago

On 10/25/2017 05:05 AM, Nikolay Shirokovskiy wrote:
> Libvirtd termination can hang. For example if some API call in qemu
> driver awaiting monitor response it will never finish because event
> loop does not functional during termination. As a result we hang
> in virNetDaemonClose call during termination as this call finishes RPC
> threads.
> 
> Let's ask hypervisor drivers to finish all API calls by calling
> introduced state driver shutdown function before call to virNetDaemonClose.
> 
> Nikolay Shirokovskiy (4):
>   libvirt: introduce hypervisor driver shutdown function
>   qemu: implement state driver shutdown function
>   qemu: agent: fix monitor close during first sync
>   qemu: monitor: check monitor not closed upon send
> 
>  daemon/libvirtd.c        |  2 ++
>  src/driver-state.h       |  4 ++++
>  src/libvirt.c            | 18 ++++++++++++++++++
>  src/libvirt_internal.h   |  1 +
>  src/libvirt_private.syms |  1 +
>  src/qemu/qemu_agent.c    | 14 +++++++-------
>  src/qemu/qemu_driver.c   | 39 +++++++++++++++++++++++++++++++++++++++
>  src/qemu/qemu_monitor.c  | 27 +++++++++++++--------------
>  8 files changed, 85 insertions(+), 21 deletions(-)
> 

Moving the discussion started in the refcount thread to here via just
simple cut-n-paste... I didn't CC Erik or Martin directly because I'm
fairly confident they read libvir-list posts anyway ;-)

John

[jferlan comment]
https://www.redhat.com/archives/libvir-list/2017-November/msg00236.html

The other series seems to be adding a state shutdown step to be run
before state cleanup; however, if we alter the cleanup code in libvirtd
does that make the state shutdown step unnecessary? I don't know and
really didn't want to think about it until we got through thinking about
the cleanup processing order. Still if one considers that state
initialization is really two steps (init and auto start), then splitting
cleanup into shutdown and cleanup seems reasonable, but I don't think
affects whether we have a crash or latent usage of dmn->servers. I could
be wrong, but it's so hard to tell.


[eskultety comment/followup]
https://www.redhat.com/archives/libvir-list/2017-November/msg00249.html

Actually, no, the stateShutdown was addressing the issue with running
<driver>StateCleanup while there still was an active worker thread that
could potentially access the @driver object, in which case - SIGSEGV,
but as I wrote in my response, I don't like the idea of having another
state driver callback, as I feel it introduces a bit of ambiguity, since
by just looking at the callback names, you can't have an idea, what the
difference between stateCleanup and stateShutdown is. I think this
should be handled as part of the final stage of leaving the event loop
and simply register an appropriate callback to the handle, that way, you
don't need to define a new state callback which would most likely used
by a couple of drivers.



[mkletzan comment/followup]
https://www.redhat.com/archives/libvir-list/2017-November/msg00450.html

If stateShutdown is what makes everything go away, then why not?  The
names are pretty descriptive, shutdown is called when the daemon is
shutting down and cleanup when you need to clean up after yourself.
Used by only few drivers? Well, we can implement it everywhere and not
make it make optional, but rather mandatory.  If that is checked on the
registration then we'll know right away when we missed something (aat
runtime, but right after someone tries running it).

-----

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/4] libvirtd: fix hang on termination in qemu driver
Posted by Nikolay Shirokovskiy 6 years, 4 months ago
On 13.11.2017 18:46, John Ferlan wrote:
> 
> 
> On 10/25/2017 05:05 AM, Nikolay Shirokovskiy wrote:
>> Libvirtd termination can hang. For example if some API call in qemu
>> driver awaiting monitor response it will never finish because event
>> loop does not functional during termination. As a result we hang
>> in virNetDaemonClose call during termination as this call finishes RPC
>> threads.
>>
>> Let's ask hypervisor drivers to finish all API calls by calling
>> introduced state driver shutdown function before call to virNetDaemonClose.
>>
>> Nikolay Shirokovskiy (4):
>>   libvirt: introduce hypervisor driver shutdown function
>>   qemu: implement state driver shutdown function
>>   qemu: agent: fix monitor close during first sync
>>   qemu: monitor: check monitor not closed upon send
>>
>>  daemon/libvirtd.c        |  2 ++
>>  src/driver-state.h       |  4 ++++
>>  src/libvirt.c            | 18 ++++++++++++++++++
>>  src/libvirt_internal.h   |  1 +
>>  src/libvirt_private.syms |  1 +
>>  src/qemu/qemu_agent.c    | 14 +++++++-------
>>  src/qemu/qemu_driver.c   | 39 +++++++++++++++++++++++++++++++++++++++
>>  src/qemu/qemu_monitor.c  | 27 +++++++++++++--------------
>>  8 files changed, 85 insertions(+), 21 deletions(-)
>>
> 
> Moving the discussion started in the refcount thread to here via just
> simple cut-n-paste... I didn't CC Erik or Martin directly because I'm
> fairly confident they read libvir-list posts anyway ;-)
> 
> John
> 
> [jferlan comment]
> https://www.redhat.com/archives/libvir-list/2017-November/msg00236.html
> 
> The other series seems to be adding a state shutdown step to be run
> before state cleanup; however, if we alter the cleanup code in libvirtd
> does that make the state shutdown step unnecessary? I don't know and
> really didn't want to think about it until we got through thinking about
> the cleanup processing order. Still if one considers that state
> initialization is really two steps (init and auto start), then splitting
> cleanup into shutdown and cleanup seems reasonable, but I don't think
> affects whether we have a crash or latent usage of dmn->servers. I could
> be wrong, but it's so hard to tell.
> 
> 
> [eskultety comment/followup]
> https://www.redhat.com/archives/libvir-list/2017-November/msg00249.html
> 
> Actually, no, the stateShutdown was addressing the issue with running
> <driver>StateCleanup while there still was an active worker thread that
> could potentially access the @driver object, in which case - SIGSEGV,
> but as I wrote in my response, I don't like the idea of having another
> state driver callback, as I feel it introduces a bit of ambiguity, since
> by just looking at the callback names, you can't have an idea, what the
> difference between stateCleanup and stateShutdown is. I think this
> should be handled as part of the final stage of leaving the event loop
> and simply register an appropriate callback to the handle, that way, you
> don't need to define a new state callback which would most likely used
> by a couple of drivers.
> 

This way we will indroduce some function in event API to notify event
loop clients of shutdown, something like virEventNotifyShutdown.
And we will probably introduce another function for clients to register
for shutdown event. Thus in terms of implementation and API this new
piece of functionality is fully apart from the current event loop function.
So I'm not for this functional mixup.

Nikolay

> 
> 
> [mkletzan comment/followup]
> https://www.redhat.com/archives/libvir-list/2017-November/msg00450.html
> 
> If stateShutdown is what makes everything go away, then why not?  The
> names are pretty descriptive, shutdown is called when the daemon is
> shutting down and cleanup when you need to clean up after yourself.
> Used by only few drivers? Well, we can implement it everywhere and not
> make it make optional, but rather mandatory.  If that is checked on the
> registration then we'll know right away when we missed something (aat
> runtime, but right after someone tries running it).
> 
> -----
> 


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list