[libvirt] [PATCH v4 0/2] Fix detection of slow guest shutdown

Christian Ehrhardt posted 2 patches 5 years, 8 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/libvirt tags/patchew/20180821123326.5721-1-christian.ehrhardt@canonical.com
Test syntax-check passed
src/libvirt_private.syms |  1 +
src/qemu/qemu_process.c  |  7 +++++--
src/util/virprocess.c    | 22 ++++++++++++++++++----
src/util/virprocess.h    |  3 +++
4 files changed, 27 insertions(+), 6 deletions(-)
[libvirt] [PATCH v4 0/2] Fix detection of slow guest shutdown
Posted by Christian Ehrhardt 5 years, 8 months ago
Hi,
after a good discussion a few days ago in
 https://www.redhat.com/archives/libvir-list/2018-August/msg00122.html
and a short-lived v2, untested at the time, in
 https://www.redhat.com/archives/libvir-list/2018-August/msg00199.html
I finally got access to the right hardware again and completed the series.

Now that the series has been retested and works, I feel safe submitting it
without an RFC prefix. I think it would be a great addition for better
handling of guests with many host devices passed through.

With the new code in place I can shut down systems that have 12, 16 or
even more hostdevs attached without getting into the "zombie" mode where
libvirt forever considers the guest to be "in shutdown". That happened when
libvirt gave up waiting too early, while signal zero was still able to
reach the process.
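
To illustrate the failure mode described above, here is a minimal sketch of a
signal-zero poll loop. It is not the libvirt implementation; the function
names and the max_polls parameter are hypothetical. The only grounded facts
are that kill(pid, 0) sends no signal and only reports whether the process can
still be reached, and that the poll interval is a fifth of a second.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: true while the process can still be signalled.
 * kill(pid, 0) delivers nothing; it only runs the existence and
 * permission checks. */
bool process_still_reachable(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    /* EPERM means the process exists but we may not signal it. */
    return errno == EPERM;
}

/* Hypothetical helper: poll every 200 ms, at most max_polls times.
 * Returns 0 once the process is gone, -1 if it is still reachable
 * when the poll budget is used up. */
int wait_for_exit(pid_t pid, unsigned int max_polls)
{
    const struct timespec interval = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };

    for (unsigned int i = 0; i < max_polls; i++) {
        if (!process_still_reachable(pid))
            return 0;
        nanosleep(&interval, NULL);
    }
    return process_still_reachable(pid) ? -1 : 0;
}
```

If the poll budget is too small for a guest that is slow to release its host
devices, a loop like this returns -1 even though the process would have
exited shortly afterwards.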

Scaling examples (extracted with gdb):
16 Devices: virProcessKillPainfullyDelay (pid=67096, force=true, extradelay=32)
12 Devices: virProcessKillPainfullyDelay (pid=68251, force=true, extradelay=24)
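
The extradelay values in the gdb output follow from the request for two
seconds per host device. A minimal sketch of that arithmetic, with a
hypothetical helper name (the series itself passes the value to
virProcessKillPainfullyDelay):

```c
#include <stddef.h>

/* Hypothetical helper: two extra seconds of wait per passed-through
 * host device, matching extradelay=32 for 16 devices and
 * extradelay=24 for 12 devices. */
unsigned int extra_delay_seconds(size_t nhostdevs)
{
    return (unsigned int)(nhostdevs * 2);
}
```

A guest without host devices gets no extra delay at all, so its shutdown
behaviour is unchanged by the per-device scaling.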

*Updates in v4*
- virDebug now reports the extradelay as requested (in seconds) and
  thereby mostly matches the gdb output seen above
- header function prototype defines the variable name
- clarify the usage of delay units
  - seconds (API call)
  - fifths of a second (internal poll loop)
- explain the request for 2*nhostdevs from the qemu shutdown code
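
The two delay units above can be related with a small conversion. This is a
sketch only: the helper name and the base_polls parameter are assumptions for
illustration, not the values or names used in virprocess.c.

```c
/* The internal loop polls five times per second (every 0.2 s). */
enum { POLLS_PER_SECOND = 5 };

/* Hypothetical helper: total number of poll iterations, given a base
 * budget already expressed in polls and an extra delay requested in
 * seconds through the API. */
unsigned int total_polls(unsigned int base_polls, unsigned int extradelay_seconds)
{
    return base_polls + extradelay_seconds * POLLS_PER_SECOND;
}
```

For example, an extradelay of 32 seconds adds 160 poll iterations to whatever
base budget the loop already has.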

*Updates in v3*
- fixup some issues found in testing and code checks

*Updates in v2*
- removed the "accept the lack of /proc/<pid> as valid process removal"
  approach due to valid concerns about reusing resources.
- added a dynamic extra wait that scales with the number of hostdevs

Christian Ehrhardt (2):
  process: wait longer on kill per assigned Hostdev
  process: wait longer 5->30s on hard shutdown

 src/libvirt_private.syms |  1 +
 src/qemu/qemu_process.c  |  7 +++++--
 src/util/virprocess.c    | 22 ++++++++++++++++++----
 src/util/virprocess.h    |  3 +++
 4 files changed, 27 insertions(+), 6 deletions(-)

-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v4 0/2] Fix detection of slow guest shutdown
Posted by Christian Ehrhardt 5 years, 8 months ago
On Tue, Aug 21, 2018 at 2:34 PM Christian Ehrhardt <
christian.ehrhardt@canonical.com> wrote:

> [...]

FYI, since there was no further feedback, I pushed v4 with the
appropriate Reviewed-by tags.
Thanks everybody for your participation!



-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd