[PATCH] lxc_container: Increase stack size for lxcContainerChild()

Michal Privoznik posted 1 patch 9 months ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/69dd2a305fdd99934d65401e5d939b8eca787af8.1691156287.git.mprivozn@redhat.com
There is a newer version of this series
src/lxc/lxc_container.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] lxc_container: Increase stack size for lxcContainerChild()
Posted by Michal Privoznik 9 months ago
When spawning a new container (via clone()) we allocate stack for
lxcContainerChild(). So far, we allocate 4 pages for the stack
and this used to be enough until we started rewriting everything
to glib. With glib we switched to g_strerror() which localizes
errno strings and thus increases stack usage, while the
previously used strerror_r() was more compact.

Fortunately, the solution is easy - just increase how much stack
the child can use (16 pages ought to be enough for anybody).

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/511
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/lxc/lxc_container.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 63cf283285..f741a754ce 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -2132,7 +2132,7 @@ int lxcContainerStart(virDomainDef *def,
 {
     pid_t pid;
     int cflags;
-    int stacksize = getpagesize() * 4;
+    int stacksize = getpagesize() * 16;
     g_autofree char *stack = NULL;
     char *stacktop;
     lxc_child_argv_t args = {
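Review bot: The multiplier on the added `int stacksize = getpagesize() * 16;` line is a bare magic number, and nothing in the hunk records why 16 pages is the right size. A named constant with a short comment on what drove the increase (the g_strerror() stack usage described in the commit message) would let a later reader judge whether the figure still holds. Because the size is a page count, the byte size of the stack also varies with the system page size, which is worth stating in that comment.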
-- 
2.41.0
Re: [PATCH] lxc_container: Increase stack size for lxcContainerChild()
Posted by Daniel P. Berrangé 9 months ago
On Fri, Aug 04, 2023 at 03:38:07PM +0200, Michal Privoznik wrote:
> When spawning a new container (via clone()) we allocate stack for
> lxcContainerChild(). So far, we allocate 4 pages for the stack
> and this used to be enough until we started rewriting everything
> to glib. With glib we switched to g_strerror() which localizes
> errno strings and thus increases stack usage, while the
> previously used strerror_r() was more compact.

We're allocating the stack using g_new0, so when we overflowed
the stack we started scribbling over other allocations which
is horrible to diagnose.
 
> Fortunately, the solution is easy - just increase how much stack
> the child can use (16 pages ought to be enough for anybody).

I wonder if we're better off switching to mmap(), allocating
17 pages, and then using mprotect() to remove read+write
perms from first and/or last page, so that any future overflow
will generate SIGBUS immediately.

> 
> Resolves: https://gitlab.com/libvirt/libvirt/-/issues/511
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/lxc/lxc_container.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
> index 63cf283285..f741a754ce 100644
> --- a/src/lxc/lxc_container.c
> +++ b/src/lxc/lxc_container.c
> @@ -2132,7 +2132,7 @@ int lxcContainerStart(virDomainDef *def,
>  {
>      pid_t pid;
>      int cflags;
> -    int stacksize = getpagesize() * 4;
> +    int stacksize = getpagesize() * 16;
>      g_autofree char *stack = NULL;
>      char *stacktop;
>      lxc_child_argv_t args = {
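Review bot: The context line `g_autofree char *stack = NULL;` shows the child stack remains an ordinary heap buffer, so a child that outgrows 16 pages would overwrite neighbouring allocations as silently as it did at 4. Raising the count moves the limit without adding any detection; a follow-up that places an inaccessible guard page at the end the stack grows towards would turn a future overflow into an immediate fault.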
> -- 
> 2.41.0
> 

With regards,
Daniel

[1] first or last - arches differ on whether stack grows up vs down IIRC
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
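
The mmap() plus mprotect() guard-page idea from the reply above, worked through as an added example; it is not libvirt code or code from either poster. guarded_stack_alloc(), burn_stack() and run_on_guarded_stack() are hypothetical names, the clone() call assumes a downward-growing stack, and on Linux a child that touches a guard page is killed with SIGSEGV.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
int clone(int (*)(void *), void *, int, void *, ...); /* as in <sched.h> */

/* Hypothetical helper: `pages` usable pages with a PROT_NONE guard page on
 * each side, covering either growth direction. Returns NULL on failure. */
static char *guarded_stack_alloc(size_t pages)
{
    size_t pg = (size_t)getpagesize(), total = (pages + 2) * pg;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base, pg, PROT_NONE) < 0 ||
        mprotect(base + total - pg, pg, PROT_NONE) < 0) {
        munmap(base, total);
        return NULL;
    }
    return base + pg; /* lowest usable address */
}

/* Stand-in for lxcContainerChild(): writes `want` bytes, top down. */
static int burn_stack(void *arg)
{
    size_t want = *(size_t *)arg;
    volatile char buf[want];
    while (want-- > 0)
        buf[want] = 0;
    return 0;
}

/* Returns the child's fatal signal, 0 on clean exit, -1 on setup error. */
static int run_on_guarded_stack(size_t pages, size_t want)
{
    size_t pg = (size_t)getpagesize();
    int status = 0, sig = -1;
    char *stack = guarded_stack_alloc(pages);
    if (!stack)
        return -1;
    /* Assumes a downward-growing stack: clone() gets the top address. */
    pid_t pid = clone(burn_stack, stack + pages * pg, SIGCHLD, &want);
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        sig = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    munmap(stack - pg, (pages + 2) * pg);
    return sig;
}
```

With 5 pages of stack demand, run_on_guarded_stack(4, ...) reports a fatal signal raised at the guard page, while run_on_guarded_stack(16, ...) reports a clean exit.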