[Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()

Yang Zhong posted 1 patch 6 years, 5 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1510899814-19372-1-git-send-email-yang.zhong@intel.com
Test checkpatch passed
Test docker passed
Test ppc passed
Test s390x passed
util/rcu.c | 4 ++++
1 file changed, 4 insertions(+)
[Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Yang Zhong 6 years, 5 months ago
Since there are some issues in memory alloc/free machenism
in glibc for little chunk memory, if Qemu frequently
alloc/free little chunk memory, the glibc doesn't alloc
little chunk memory from free list of glibc and still
allocate from OS, which make the heap size bigger and bigger.

This patch introduce malloc_trim(), which will free heap memory.

Below are test results from smaps file.
55f0783e1000-55f07992a000 rw-p 00000000 00:00 0  [heap]
Size:              21796 kB
Rss:               14260 kB
Pss:               14260 kB

55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0  [heap]
Size:              21668 kB
Rss:                6940 kB
Pss:                6940 kB

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
---
 util/rcu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/util/rcu.c b/util/rcu.c
index ca5a63e..8d491a6 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -26,6 +26,7 @@
  * IBM's contributions to this file may be relicensed under LGPLv2 or later.
  */
 
+#include <malloc.h>
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/rcu.h"
@@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
             node->func(node);
         }
         qemu_mutex_unlock_iothread();
+#ifdef CONFIG_LINUX
+        malloc_trim(0);
+#endif
     }
     abort();
 }
-- 
1.9.1


Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Stefan Hajnoczi 6 years, 5 months ago
On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> diff --git a/util/rcu.c b/util/rcu.c
> index ca5a63e..8d491a6 100644
> --- a/util/rcu.c
> +++ b/util/rcu.c
> @@ -26,6 +26,7 @@
>   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
>   */
>  
> +#include <malloc.h>

This header file is not mentioned in the C99 standard or POSIX.  It is
probably not available on all host OSes that QEMU supports.  Please use
#ifdef CONFIG_LINUX.

>  #include "qemu/osdep.h"
>  #include "qemu-common.h"
>  #include "qemu/rcu.h"
> @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
>              node->func(node);
>          }
>          qemu_mutex_unlock_iothread();
> +#ifdef CONFIG_LINUX
> +        malloc_trim(0);
> +#endif

It is important that the rcu thread isn't overzealous in minimizing heap
size if that means ordinary malloc(3) calls will experience latency
spikes.  Please leave a few MB free so that malloc(3) doesn't take the
slow path.

Stefan
Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Daniel P. Berrange 6 years, 5 months ago
On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > diff --git a/util/rcu.c b/util/rcu.c
> > index ca5a63e..8d491a6 100644
> > --- a/util/rcu.c
> > +++ b/util/rcu.c
> > @@ -26,6 +26,7 @@
> >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> >   */
> >  
> > +#include <malloc.h>
> 
> This header file is not mentioned in the C99 standard or POSIX.  It is
> probably not available on all host OSes that QEMU supports.  Please use
> #ifdef CONFIG_LINUX.
> 
> >  #include "qemu/osdep.h"
> >  #include "qemu-common.h"
> >  #include "qemu/rcu.h"
> > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> >              node->func(node);
> >          }
> >          qemu_mutex_unlock_iothread();
> > +#ifdef CONFIG_LINUX
> > +        malloc_trim(0);
> > +#endif
> 
> It is important that the rcu thread isn't overzealous in minimizing heap
> size if that means ordinary malloc(3) calls will experience latency
> spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> slow path.

If you pass '0' the docs say that the minimum amount is left in the
heap, per M_TOP_PAD, which is 128kb.

Strangely the mallopt(3) man page suggests, that free() should automatically
trim the heap when its size exceeds M_TOP_TRIM, which is again 128kb by
default.  So I'm puzzelled by malloc_trim() would be needed unless there
are scenarios in which free() won't trim, that aren't mentioned in the
manpage.

Also, how does malloc_trim interact with tcmalloc.so that people often
use in preference to glibc's built in malloc ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Zhong Yang 6 years, 5 months ago
On Fri, Nov 17, 2017 at 02:06:20PM +0000, Daniel P. Berrange wrote:
> On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > diff --git a/util/rcu.c b/util/rcu.c
> > > index ca5a63e..8d491a6 100644
> > > --- a/util/rcu.c
> > > +++ b/util/rcu.c
> > > @@ -26,6 +26,7 @@
> > >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> > >   */
> > >  
> > > +#include <malloc.h>
> > 
> > This header file is not mentioned in the C99 standard or POSIX.  It is
> > probably not available on all host OSes that QEMU supports.  Please use
> > #ifdef CONFIG_LINUX.
> > 
> > >  #include "qemu/osdep.h"
> > >  #include "qemu-common.h"
> > >  #include "qemu/rcu.h"
> > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > >              node->func(node);
> > >          }
> > >          qemu_mutex_unlock_iothread();
> > > +#ifdef CONFIG_LINUX
> > > +        malloc_trim(0);
> > > +#endif
> > 
> > It is important that the rcu thread isn't overzealous in minimizing heap
> > size if that means ordinary malloc(3) calls will experience latency
> > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > slow path.
> 
> If you pass '0' the docs say that the minimum amount is left in the
> heap, per M_TOP_PAD, which is 128kb.
> 
> Strangely the mallopt(3) man page suggests, that free() should automatically
> trim the heap when its size exceeds M_TOP_TRIM, which is again 128kb by
> default.  So I'm puzzelled by malloc_trim() would be needed unless there
> are scenarios in which free() won't trim, that aren't mentioned in the
> manpage.
  Hello Daniel,

  In fact, i firstly adopted mallopt() solution to optimize the heap memory,
  but i found this function is NOT useful, which are difference with MAN's
  description, so i had to swith to use malloc_trim().
  
  Regards,

  Yang

> Also, how does malloc_trim interact with tcmalloc.so that people often
> use in preference to glibc's built in malloc ?
  Thanks, you reminded me to consider tcmalloc or jemalloc.
 
  Whether below code is more suitable? thanks!
  #if defined(CONFIG_LINUX) && defined(__GLIBC__)
         malloc_trim(0);
  #endif

  Regards,

  Yang

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Daniel P. Berrange 6 years, 5 months ago
On Mon, Nov 20, 2017 at 04:54:42PM +0800, Zhong Yang wrote:
> On Fri, Nov 17, 2017 at 02:06:20PM +0000, Daniel P. Berrange wrote:
> > On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> > > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > > diff --git a/util/rcu.c b/util/rcu.c
> > > > index ca5a63e..8d491a6 100644
> > > > --- a/util/rcu.c
> > > > +++ b/util/rcu.c
> > > > @@ -26,6 +26,7 @@
> > > >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> > > >   */
> > > >  
> > > > +#include <malloc.h>
> > > 
> > > This header file is not mentioned in the C99 standard or POSIX.  It is
> > > probably not available on all host OSes that QEMU supports.  Please use
> > > #ifdef CONFIG_LINUX.
> > > 
> > > >  #include "qemu/osdep.h"
> > > >  #include "qemu-common.h"
> > > >  #include "qemu/rcu.h"
> > > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > > >              node->func(node);
> > > >          }
> > > >          qemu_mutex_unlock_iothread();
> > > > +#ifdef CONFIG_LINUX
> > > > +        malloc_trim(0);
> > > > +#endif
> > > 
> > > It is important that the rcu thread isn't overzealous in minimizing heap
> > > size if that means ordinary malloc(3) calls will experience latency
> > > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > > slow path.
> > 
> > If you pass '0' the docs say that the minimum amount is left in the
> > heap, per M_TOP_PAD, which is 128kb.
> > 
> > Strangely the mallopt(3) man page suggests, that free() should automatically
> > trim the heap when its size exceeds M_TOP_TRIM, which is again 128kb by
> > default.  So I'm puzzelled by malloc_trim() would be needed unless there
> > are scenarios in which free() won't trim, that aren't mentioned in the
> > manpage.
> 
>   In fact, i firstly adopted mallopt() solution to optimize the heap memory,
>   but i found this function is NOT useful, which are difference with MAN's
>   description, so i had to swith to use malloc_trim().


> > Also, how does malloc_trim interact with tcmalloc.so that people often
> > use in preference to glibc's built in malloc ?
>   Thanks, you reminded me to consider tcmalloc or jemalloc.
>  
>   Whether below code is more suitable? thanks!
>   #if defined(CONFIG_LINUX) && defined(__GLIBC__)
>          malloc_trim(0);
>   #endif

Both of those macro symbols will still be defined even when tcmalloc/jemalloc
are in use.

I wonder if tcmalloc/jemalloc even suffer from the same problem that libc's
builtin malloc has ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Zhong Yang 6 years, 5 months ago
On Mon, Nov 20, 2017 at 02:14:50PM +0000, Daniel P. Berrange wrote:
> On Mon, Nov 20, 2017 at 04:54:42PM +0800, Zhong Yang wrote:
> > On Fri, Nov 17, 2017 at 02:06:20PM +0000, Daniel P. Berrange wrote:
> > > On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> > > > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > > > diff --git a/util/rcu.c b/util/rcu.c
> > > > > index ca5a63e..8d491a6 100644
> > > > > --- a/util/rcu.c
> > > > > +++ b/util/rcu.c
> > > > > @@ -26,6 +26,7 @@
> > > > >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> > > > >   */
> > > > >  
> > > > > +#include <malloc.h>
> > > > 
> > > > This header file is not mentioned in the C99 standard or POSIX.  It is
> > > > probably not available on all host OSes that QEMU supports.  Please use
> > > > #ifdef CONFIG_LINUX.
> > > > 
> > > > >  #include "qemu/osdep.h"
> > > > >  #include "qemu-common.h"
> > > > >  #include "qemu/rcu.h"
> > > > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > > > >              node->func(node);
> > > > >          }
> > > > >          qemu_mutex_unlock_iothread();
> > > > > +#ifdef CONFIG_LINUX
> > > > > +        malloc_trim(0);
> > > > > +#endif
> > > > 
> > > > It is important that the rcu thread isn't overzealous in minimizing heap
> > > > size if that means ordinary malloc(3) calls will experience latency
> > > > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > > > slow path.
> > > 
> > > If you pass '0' the docs say that the minimum amount is left in the
> > > heap, per M_TOP_PAD, which is 128kb.
> > > 
> > > Strangely the mallopt(3) man page suggests, that free() should automatically
> > > trim the heap when its size exceeds M_TOP_TRIM, which is again 128kb by
> > > default.  So I'm puzzelled by malloc_trim() would be needed unless there
> > > are scenarios in which free() won't trim, that aren't mentioned in the
> > > manpage.
> > 
> >   In fact, i firstly adopted mallopt() solution to optimize the heap memory,
> >   but i found this function is NOT useful, which are difference with MAN's
> >   description, so i had to swith to use malloc_trim().
> 
> 
> > > Also, how does malloc_trim interact with tcmalloc.so that people often
> > > use in preference to glibc's built in malloc ?
> >   Thanks, you reminded me to consider tcmalloc or jemalloc.
> >  
> >   Whether below code is more suitable? thanks!
> >   #if defined(CONFIG_LINUX) && defined(__GLIBC__)
> >          malloc_trim(0);
> >   #endif
> 
> Both of those macro symbols will still be defined even when tcmalloc/jemalloc
> are in use.
  
  Hello Daniel,

  If those two macro are NOT useful for this scenario, how about below changes?

  In configure file, 

  if test "$tcmalloc" = "yes" ; then
      echo "CONFIG__TCMALLOC=y" >> $config_host_mak
  ........ 

  then use !CONFIG_TCMALLOC macro to check malloc_trim()? thanks!

  Regards,

  Yang

 
> I wonder if tcmalloc/jemalloc even suffer from the same problem that libc's
> builtin malloc has ?
 
  Hello Xulei,

  Did you compiled Qemu with tcmalloc or jemalloc ? If you did , would you 
  please share those datas with different malloc compile option?  Many thanks!

  Regards,

  Yang


> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Zhong Yang 6 years, 5 months ago
On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > diff --git a/util/rcu.c b/util/rcu.c
> > index ca5a63e..8d491a6 100644
> > --- a/util/rcu.c
> > +++ b/util/rcu.c
> > @@ -26,6 +26,7 @@
> >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> >   */
> >  
> > +#include <malloc.h>
> 
> This header file is not mentioned in the C99 standard or POSIX.  It is
> probably not available on all host OSes that QEMU supports.  Please use
> #ifdef CONFIG_LINUX.
> 
  Hello Stefan,

  Thanks for your remind! 
  
  Regards,

  Yang 
> >  #include "qemu/osdep.h"
> >  #include "qemu-common.h"
> >  #include "qemu/rcu.h"
> > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> >              node->func(node);
> >          }
> >          qemu_mutex_unlock_iothread();
> > +#ifdef CONFIG_LINUX
> > +        malloc_trim(0);
> > +#endif
> 
> It is important that the rcu thread isn't overzealous in minimizing heap
> size if that means ordinary malloc(3) calls will experience latency
> spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> slow path.
> 
  Hello Stefan,

  From the Qemu bootup procedure, the qemu malloc chunk memory from OS, not
  from glibc free list which is freed before by Qemu. Maybe there are some 
  issues in glibc memory mechanism. I will continue to fine this parameter
  to get better balance, Many thanks!

  Regards,

  Yang 

> Stefan



Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Stefan Hajnoczi 6 years, 5 months ago
On Mon, Nov 20, 2017 at 04:41:41PM +0800, Zhong Yang wrote:
> On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > >              node->func(node);
> > >          }
> > >          qemu_mutex_unlock_iothread();
> > > +#ifdef CONFIG_LINUX
> > > +        malloc_trim(0);
> > > +#endif
> > 
> > It is important that the rcu thread isn't overzealous in minimizing heap
> > size if that means ordinary malloc(3) calls will experience latency
> > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > slow path.
> > 
>   Hello Stefan,
> 
>   From the Qemu bootup procedure, the qemu malloc chunk memory from OS, not
>   from glibc free list which is freed before by Qemu. Maybe there are some 
>   issues in glibc memory mechanism. I will continue to fine this parameter
>   to get better balance, Many thanks!

I was suggesting malloc_trim(4 * 1024 * 1024) or similar instead of
malloc_trim(0).

Stefan
Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Zhong Yang 6 years, 5 months ago
On Mon, Nov 20, 2017 at 02:03:44PM +0000, Stefan Hajnoczi wrote:
> On Mon, Nov 20, 2017 at 04:41:41PM +0800, Zhong Yang wrote:
> > On Fri, Nov 17, 2017 at 01:54:09PM +0000, Stefan Hajnoczi wrote:
> > > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > > >              node->func(node);
> > > >          }
> > > >          qemu_mutex_unlock_iothread();
> > > > +#ifdef CONFIG_LINUX
> > > > +        malloc_trim(0);
> > > > +#endif
> > > 
> > > It is important that the rcu thread isn't overzealous in minimizing heap
> > > size if that means ordinary malloc(3) calls will experience latency
> > > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > > slow path.
> > > 
> >   Hello Stefan,
> > 
> >   From the Qemu bootup procedure, the qemu malloc chunk memory from OS, not
> >   from glibc free list which is freed before by Qemu. Maybe there are some 
> >   issues in glibc memory mechanism. I will continue to fine this parameter
> >   to get better balance, Many thanks!
> 
> I was suggesting malloc_trim(4 * 1024 * 1024) or similar instead of
> malloc_trim(0).
>  
  Hello Stefan,

  Thanks for your suggestion!
  I tried your suggested changes, malloc_trim(4 * 1024 * 1024), the heap size with
  this change almost same with malloc_trim(0), thanks!

  Regards,

  Yang   
 
> Stefan



Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Fam Zheng 6 years, 5 months ago
On Fri, 11/17 14:23, Yang Zhong wrote:
> diff --git a/util/rcu.c b/util/rcu.c
> index ca5a63e..8d491a6 100644
> --- a/util/rcu.c
> +++ b/util/rcu.c
> @@ -26,6 +26,7 @@
>   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
>   */
>  
> +#include <malloc.h>

BTW, if you respin, the "qemu/osdep.h" header should always be included first as
per HACKING file.

Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()
Posted by Zhong Yang 6 years, 5 months ago
On Mon, Nov 20, 2017 at 10:28:28PM +0800, Fam Zheng wrote:
> On Fri, 11/17 14:23, Yang Zhong wrote:
> > diff --git a/util/rcu.c b/util/rcu.c
> > index ca5a63e..8d491a6 100644
> > --- a/util/rcu.c
> > +++ b/util/rcu.c
> > @@ -26,6 +26,7 @@
> >   * IBM's contributions to this file may be relicensed under LGPLv2 or later.
> >   */
> >  
> > +#include <malloc.h>
> 
> BTW, if you respin, the "qemu/osdep.h" header should always be included first as
> per HACKING file.

  Hello Fam,

  Thanks for your reminder!
  I ever thought the header file include in Qemu like linux kernel did, thanks again!

  Regards,

  Yang