[RFC PATCH] linux-user: expand reserved brk space for 64bit guests

Alex Bennée posted 1 patch 2 years, 3 months ago
Test checkpatch passed
Posted by Alex Bennée 2 years, 3 months ago
A recent change to fix commpage allocation issues on 32-bit hosts
revealed another intermittent issue on s390x. The root cause was that
the headroom we give for the brk space wasn't enough, causing the
guest to attempt to map something on top of QEMU's own pages. We do
not currently do anything to protect against this (see #555).

By inspection, the brk mmap moves around, and the top of the address
range has been measured as far as 19MB away from the top of the
binary. As we chose a smallish number to keep 32-bit on 32-bit
feasible, we only increase the gap (from 16MB to 32MB) for 64-bit
guests. This does mean that 64-on-32 static binaries are more likely
to fail to find a hole in the address space, but that is hopefully a
fairly rare situation.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 linux-user/elfload.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 64b87d37e8..9628a38361 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2800,11 +2800,17 @@ static void load_elf_image(const char *image_name, int image_fd,
          * and the stack, lest they be placed immediately after
          * the data segment and block allocation from the brk.
          *
-         * 16MB is chosen as "large enough" without being so large
-         * as to allow the result to not fit with a 32-bit guest on
-         * a 32-bit host.
+         * 16MB is chosen as "large enough" without being so large as
+         * to allow the result to not fit with a 32-bit guest on a
+         * 32-bit host. However some 64-bit guests (e.g. s390x)
+         * attempt to place their heap further ahead and currently
+         * nothing stops them smashing into QEMU's address space.
          */
+#if TARGET_LONG_BITS == 64
+        info->reserve_brk = 32 * MiB;
+#else
         info->reserve_brk = 16 * MiB;
+#endif
         hiaddr += info->reserve_brk;
 
         if (ehdr->e_type == ET_EXEC) {
-- 
2.30.2
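
To make the headroom arithmetic from the commit message concrete, here is a minimal standalone sketch. It is not QEMU code: `reserve_brk_for()` and `fits_in_reserve()` are hypothetical helpers invented for illustration, the guest width is passed at run time instead of being selected with `TARGET_LONG_BITS`, and the 19MB figure from the commit message is assumed to mean 19 MiB.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition for this sketch; one mebibyte in bytes. */
#define MiB ((uint64_t)1 << 20)

/*
 * Hypothetical helper: mirrors the patch's compile-time #if as a
 * run-time choice. 64-bit guests get 32 MiB of brk headroom, all
 * other guests keep the original 16 MiB.
 */
static uint64_t reserve_brk_for(int guest_long_bits)
{
    assert(guest_long_bits == 32 || guest_long_bits == 64);
    return guest_long_bits == 64 ? 32 * MiB : 16 * MiB;
}

/*
 * Hypothetical helper: returns 1 if a heap mapping whose top lies
 * `distance` bytes above the end of the binary still falls inside
 * the reserved gap, 0 otherwise.
 */
static int fits_in_reserve(uint64_t distance, uint64_t reserve)
{
    return distance <= reserve;
}
```

Under these assumptions, `fits_in_reserve(19 * MiB, reserve_brk_for(32))` is 0 while `fits_in_reserve(19 * MiB, reserve_brk_for(64))` is 1, which is the case the patch is meant to cover. The sketch only models the size comparison; it says nothing about whether the host can actually find a hole of that size.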


Re: [RFC PATCH] linux-user: expand reserved brk space for 64bit guests
Posted by Thomas Huth 2 years, 3 months ago
On 13/01/2022 17.55, Alex Bennée wrote:
> +#if TARGET_LONG_BITS == 64
> +        info->reserve_brk = 32 * MiB;
> +#else
>           info->reserve_brk = 16 * MiB;
> +#endif
>           hiaddr += info->reserve_brk;
>   
>           if (ehdr->e_type == ET_EXEC) {

Should be ok as a temporary work-around at least, I guess...

FWIW,
Reviewed-by: Thomas Huth <thuth@redhat.com>


Re: [RFC PATCH] linux-user: expand reserved brk space for 64bit guests
Posted by Richard Henderson 2 years, 2 months ago
On 1/14/22 3:55 AM, Alex Bennée wrote:
> +#if TARGET_LONG_BITS == 64
> +        info->reserve_brk = 32 * MiB;
> +#else
>           info->reserve_brk = 16 * MiB;
> +#endif

I'd prefer to use 32M unconditionally, but...
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~