[libvirt] [PATCH 0/7] util: abort when out of memory

Daniel P. Berrangé posted 7 patches 4 years, 7 months ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20190829180250.3290-1-berrange@redhat.com
There is a newer version of this series
configure.ac                      |  19 +-
docs/docs.html.in                 |   3 -
docs/internals/oomtesting.html.in | 213 --------------------
libvirt.spec.in                   |   1 +
m4/virt-glib.m4                   |  30 +++
mingw-libvirt.spec.in             |   2 +
src/Makefile.am                   |   2 +
src/internal.h                    |   1 +
src/libvirt_private.syms          |   4 -
src/lxc/Makefile.inc.am           |   2 +
src/remote/Makefile.inc.am        |   1 +
src/util/Makefile.inc.am          |   1 +
src/util/viralloc.c               | 313 ++++--------------------------
src/util/viralloc.h               | 204 ++++---------------
src/util/virstring.c              |  93 +++------
src/util/virstring.h              |  79 +++-----
tests/Makefile.am                 |   4 +-
tests/oomtrace.pl                 |  41 ----
tests/qemuxml2argvtest.c          |  12 +-
tests/testutils.c                 | 189 +-----------------
tests/testutils.h                 |   2 -
tests/virfirewalltest.c           |  12 --
tools/Makefile.am                 |   1 +
23 files changed, 179 insertions(+), 1050 deletions(-)
delete mode 100644 docs/internals/oomtesting.html.in
create mode 100644 m4/virt-glib.m4
delete mode 100755 tests/oomtrace.pl
[libvirt] [PATCH 0/7] util: abort when out of memory
Posted by Daniel P. Berrangé 4 years, 7 months ago
See this previous thread:

  https://www.redhat.com/archives/libvir-list/2019-May/msg00273.html

The executive summary is that catching & reporting ENOMEM is not worth
the maint cost because:

  - this code will almost never run on Linux hosts

  - if it does run it will likely have bad behaviour silently dropping
    data or crashing the process

  - apps using libvirt often do so via a non-C language that aborts/exits
    the app on OOM regardless, or use other C libraries that abort

  - we can build a system that is more reliable when OOM happens by
    not catching OOM, instead simply letting apps exit, restart and
    carry on where they left off

The first part of the series does the bare minimum to cull OOM handling.

With this done, the main reason why we never adopted glib is now
removed. Thus the second part of this series introduces use of glib into
libvirt and converts our our allocation APIs to use the glib allocation
APIs internally.

In future I'd especially like to use glib to replace virObject code,
which I knowingly wrote as a somewhat pathetic clone of GObject.
Eliminating all our DBus code is also another thing I'd like, since
Glib's DBus client code is better IMHO.

Note that none of the callers are updated at this point, so they are all
still attempting to handle errors from VIR_ALLOC, VIR_STRDUP, etc. If
we convert VIR_ALLOC calls to remove return value checks, and then later
convert to glib's  g_new0 API, we go through two lots of churn which I
think is undesirable.

Thus I think it is highly desirable to introduce glib straight away and
do a single conversion step. It also means we can introduce other data
structures from glib to replace ours and avoid wasting time converting
those too.

Overall the code in this series is all quite simple and a nice cleanup.
50% of the lines culled come from the first patch removing OOM testing,
the other 50% come from viralloc.c impl simplification

The interesting question is the impact is has on downstreams who have
to backport patches to older versions.

If we start converting callers to g_new0, etc, then downstream have
to either

 * Change g_new0 back into VIR_ALLOC and likely re-add  many goto
   calls we purged

Or

 * Backport all the patches in this series that drop OOM handling
   and introduce glib usage

If we start adopting other glib features beyond g_new0, then downstreams
are pretty much forced into option 2.

Given the benefit I think we'll see from this series in our codebase,
I'm inclined to say we should prioritize the future, over prioritizing
the past (backports), and thus freely adopt use of glib APIs rightaway.

IOW, I think we should expect vendors to backport this series as a
starting point, before any other patches. I've not tested but I'm
hopeful that such a backport of this series is fairly easy. The
viralloc.{c,h} file hasn't seen much change in recent times so
ought to be a clean cherry-pick. The glib additions might hit
some conflicts in makefiles, but not too terrible I hope. Probably
worth doing a test to see just how easy backports are over the past
1 year of releases.

Daniel P. Berrangé (7):
  util: purge all code for testing OOM handling
  util: make allocation functions abort on OOM
  util: remove several unused _QUIET allocation macro variants
  util: make string functions abort on OOM
  build: probe for glib-2 library in configure
  build: link to glib library
  util: use glib allocation functions

 configure.ac                      |  19 +-
 docs/docs.html.in                 |   3 -
 docs/internals/oomtesting.html.in | 213 --------------------
 libvirt.spec.in                   |   1 +
 m4/virt-glib.m4                   |  30 +++
 mingw-libvirt.spec.in             |   2 +
 src/Makefile.am                   |   2 +
 src/internal.h                    |   1 +
 src/libvirt_private.syms          |   4 -
 src/lxc/Makefile.inc.am           |   2 +
 src/remote/Makefile.inc.am        |   1 +
 src/util/Makefile.inc.am          |   1 +
 src/util/viralloc.c               | 313 ++++--------------------------
 src/util/viralloc.h               | 204 ++++---------------
 src/util/virstring.c              |  93 +++------
 src/util/virstring.h              |  79 +++-----
 tests/Makefile.am                 |   4 +-
 tests/oomtrace.pl                 |  41 ----
 tests/qemuxml2argvtest.c          |  12 +-
 tests/testutils.c                 | 189 +-----------------
 tests/testutils.h                 |   2 -
 tests/virfirewalltest.c           |  12 --
 tools/Makefile.am                 |   1 +
 23 files changed, 179 insertions(+), 1050 deletions(-)
 delete mode 100644 docs/internals/oomtesting.html.in
 create mode 100644 m4/virt-glib.m4
 delete mode 100755 tests/oomtrace.pl

-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/7] util: abort when out of memory
Posted by Jim Fehlig 4 years, 7 months ago
On 8/29/19 12:02 PM, Daniel P. Berrangé  wrote:
> See this previous thread:
> 
>    https://www.redhat.com/archives/libvir-list/2019-May/msg00273.html
> 
> The executive summary is that catching & reporting ENOMEM is not worth
> the maint cost because:
> 
>    - this code will almost never run on Linux hosts
> 
>    - if it does run it will likely have bad behaviour silently dropping
>      data or crashing the process
> 
>    - apps using libvirt often do so via a non-C language that aborts/exits
>      the app on OOM regardless, or use other C libraries that abort
> 
>    - we can build a system that is more reliable when OOM happens by
>      not catching OOM, instead simply letting apps exit, restart and
>      carry on where they left off
> 
> The first part of the series does the bare minimum to cull OOM handling.
> 
> With this done, the main reason why we never adopted glib is now
> removed. Thus the second part of this series introduces use of glib into
> libvirt and converts our our allocation APIs to use the glib allocation
> APIs internally.
> 
> In future I'd especially like to use glib to replace virObject code,
> which I knowingly wrote as a somewhat pathetic clone of GObject.
> Eliminating all our DBus code is also another thing I'd like, since
> Glib's DBus client code is better IMHO.
> 
> Note that none of the callers are updated at this point, so they are all
> still attempting to handle errors from VIR_ALLOC, VIR_STRDUP, etc. If
> we convert VIR_ALLOC calls to remove return value checks, and then later
> convert to glib's  g_new0 API, we go through two lots of churn which I
> think is undesirable.
> 
> Thus I think it is highly desirable to introduce glib straight away and
> do a single conversion step. It also means we can introduce other data
> structures from glib to replace ours and avoid wasting time converting
> those too.
> 
> Overall the code in this series is all quite simple and a nice cleanup.
> 50% of the lines culled come from the first patch removing OOM testing,
> the other 50% come from viralloc.c impl simplification
> 
> The interesting question is the impact is has on downstreams who have
> to backport patches to older versions.
> 
> If we start converting callers to g_new0, etc, then downstream have
> to either
> 
>   * Change g_new0 back into VIR_ALLOC and likely re-add  many goto
>     calls we purged
> 
> Or
> 
>   * Backport all the patches in this series that drop OOM handling
>     and introduce glib usage
> 
> If we start adopting other glib features beyond g_new0, then downstreams
> are pretty much forced into option 2.
> 
> Given the benefit I think we'll see from this series in our codebase,
> I'm inclined to say we should prioritize the future, over prioritizing
> the past (backports), and thus freely adopt use of glib APIs rightaway.
> 
> IOW, I think we should expect vendors to backport this series as a
> starting point, before any other patches. I've not tested but I'm
> hopeful that such a backport of this series is fairly easy. The
> viralloc.{c,h} file hasn't seen much change in recent times so
> ought to be a clean cherry-pick. The glib additions might hit
> some conflicts in makefiles, but not too terrible I hope. Probably
> worth doing a test to see just how easy backports are over the past
> 1 year of releases.

Given the primary maintenance burden I'll be seeing over the next years is on 
versions 5.1.0 and 4.0.0, I took at stab at backporting this series to those 
releases. Here are some notes I scribbled while backporting to 5.1.0:

Patch1 has conflicts due to moving of virtauto out of viralloc.h by commit a4bfc252.

Patch2 has conflicts, mostly due to whitespace changes from switching to 
'#pragma once'. E.g. commit a6d438a9 in the case of viralloc.h

Patch3 has similar conflicts to 2. Too bad you have to patch functions (that 
have conflicts) in patch 2 that are removed in this patch.

Patch4 has conflicts due to '#pragma once'

Patch5 applies cleanly!

Patch6 has conflicts due to '#pragma once'

Patch7 applies cleanly!

I then took the refreshed patches and tried applying to 4.0.0:

Patch1 has conflicts due to no autoptr stuff in 4.0.0 and due to some changes in 
testutils.c (hunk 5).

Patch2 and patch3 apply cleanly!

Patch4 has conflicts in hunk 5 of virstring.h, but I couldn't spot cause of 
conflict.

Patch5 has some fuzz due to introduction of *_FIREWALLD_ZONE

Patch6 has conflicts since there are no */Makefile.inc.am in 4.0.0

Regards,
Jim

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list