[PATCH v7 00/14] linux-user: brk fixes

Richard Henderson posted 14 patches 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230803015302.407219-1-richard.henderson@linaro.org
Maintainers: Warner Losh <imp@bsdimp.com>, Kyle Evans <kevans@freebsd.org>, Laurent Vivier <laurent@vivier.eu>, Brian Cain <bcain@quicinc.com>
There is a newer version of this series
bsd-user/qemu.h                      |  1 -
linux-user/aarch64/target_mman.h     | 13 ++++
linux-user/alpha/target_mman.h       | 11 ++++
linux-user/arm/target_mman.h         | 11 ++++
linux-user/cris/target_mman.h        | 12 ++++
linux-user/hexagon/target_mman.h     | 13 ++++
linux-user/hppa/target_mman.h        |  6 ++
linux-user/i386/target_mman.h        | 16 +++++
linux-user/loongarch64/target_mman.h | 11 ++++
linux-user/m68k/target_mman.h        |  5 ++
linux-user/microblaze/target_mman.h  | 11 ++++
linux-user/mips/target_mman.h        | 10 +++
linux-user/nios2/target_mman.h       | 10 +++
linux-user/openrisc/target_mman.h    | 10 +++
linux-user/ppc/target_mman.h         | 20 ++++++
linux-user/qemu.h                    |  2 -
linux-user/riscv/target_mman.h       | 10 +++
linux-user/s390x/target_mman.h       | 20 ++++++
linux-user/sh4/target_mman.h         |  7 +++
linux-user/sparc/target_mman.h       | 25 ++++++++
linux-user/user-mmap.h               |  6 +-
linux-user/x86_64/target_mman.h      | 15 +++++
linux-user/xtensa/target_mman.h      | 10 +++
bsd-user/mmap.c                      |  2 -
linux-user/elfload.c                 | 94 ++++++++++++++++------------
linux-user/flatload.c                |  2 +-
linux-user/main.c                    | 43 ++++++++++++-
linux-user/mmap.c                    | 68 ++++++++++++--------
linux-user/syscall.c                 | 69 +++++---------------
29 files changed, 401 insertions(+), 132 deletions(-)
[PATCH v7 00/14] linux-user: brk fixes
Posted by Richard Henderson 9 months ago
Builds on Helge's v6, incorporating my feedback plus
some other minor cleanup.


r~


Akihiko Odaki (6):
  linux-user: Unset MAP_FIXED_NOREPLACE for host
  linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
  linux-user: Do not call get_errno() in do_brk()
  linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
  linux-user: Do nothing if too small brk is specified
  linux-user: Do not align brk with host page size

Helge Deller (1):
  linux-user: Adjust initial brk when interpreter is close to executable

Richard Henderson (7):
  linux-user: Remove last_brk
  bsd-user: Remove last_brk
  linux-user: Adjust task_unmapped_base for reserved_va
  linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
  linux-user: Add ELF_ET_DYN_BASE
  linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
  linux-user: Properly set image_info.brk in flatload

 bsd-user/qemu.h                      |  1 -
 linux-user/aarch64/target_mman.h     | 13 ++++
 linux-user/alpha/target_mman.h       | 11 ++++
 linux-user/arm/target_mman.h         | 11 ++++
 linux-user/cris/target_mman.h        | 12 ++++
 linux-user/hexagon/target_mman.h     | 13 ++++
 linux-user/hppa/target_mman.h        |  6 ++
 linux-user/i386/target_mman.h        | 16 +++++
 linux-user/loongarch64/target_mman.h | 11 ++++
 linux-user/m68k/target_mman.h        |  5 ++
 linux-user/microblaze/target_mman.h  | 11 ++++
 linux-user/mips/target_mman.h        | 10 +++
 linux-user/nios2/target_mman.h       | 10 +++
 linux-user/openrisc/target_mman.h    | 10 +++
 linux-user/ppc/target_mman.h         | 20 ++++++
 linux-user/qemu.h                    |  2 -
 linux-user/riscv/target_mman.h       | 10 +++
 linux-user/s390x/target_mman.h       | 20 ++++++
 linux-user/sh4/target_mman.h         |  7 +++
 linux-user/sparc/target_mman.h       | 25 ++++++++
 linux-user/user-mmap.h               |  6 +-
 linux-user/x86_64/target_mman.h      | 15 +++++
 linux-user/xtensa/target_mman.h      | 10 +++
 bsd-user/mmap.c                      |  2 -
 linux-user/elfload.c                 | 94 ++++++++++++++++------------
 linux-user/flatload.c                |  2 +-
 linux-user/main.c                    | 43 ++++++++++++-
 linux-user/mmap.c                    | 68 ++++++++++++--------
 linux-user/syscall.c                 | 69 +++++---------------
 29 files changed, 401 insertions(+), 132 deletions(-)

-- 
2.34.1
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Joel Stanley 9 months ago
Hi Richard,

On Thu, 3 Aug 2023 at 01:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Builds on Helge's v6, incorporating my feedback plus
> some other minor cleanup.

This succeeds for the armhf static binary on ppc64le host that was
previously segfaulting.

However, the arm static binary on ppc64le host now segfaults:

$ gdb -q -ex r --args ./build/qemu-arm -d guest_errors,page,strace ~/hello
Reading symbols from ./build/qemu-arm...
Starting program: /scratch/joel/qemu/build/qemu-arm -d
guest_errors,page,strace /home/joel/hello
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff762ece0 (LWP 143553)]
host mmap_min_addr=0x10000
pgb_find_hole: base @ 140420000 for 4294967296 bytes
pgb_static: base @ 140420000 for 4294967295 bytes
pgb_reserved_va: base @ 0x140420000 for 4294967296 bytes
Locating guest address space @ 0x140420000
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 ---
00090000-0009b000 0000b000 ---
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-0009b000 0000b000 ---
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40810000 00810000 rw-
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40010000 00010000 ---
40010000-40811000 00801000 rw-
ffff0000-00000000 00010000 r-x
guest_base  0x140420000
page layout changed following binary load
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40010000 00010000 ---
40010000-40810000 00800000 rw-
40810000-40811000 00001000 r-x
ffff0000-00000000 00010000 r-x
end_code    0x00084f7c
start_code  0x00010000
start_data  0x00095098
end_data    0x00098394
start_stack 0x4080f410
brk         0x0009b000
entry       0x00010418
argv_start  0x4080f414
env_start   0x4080f41c
auxv_start  0x4080f4a0
143551 brk(NULL) = 0x0009b000
143551 brk(0x0009b8fc) = 0x0009b000

Thread 1 "qemu-arm" received signal SIGSEGV, Segmentation fault.
0x00007fffeed9bb74 in code_gen_buffer ()
(gdb) bt
#0  0x00007fffeed9bb74 in code_gen_buffer ()
#1  0x0000000100169fdc in cpu_tb_exec (cpu=cpu@entry=0x1003d4a90,
    itb=itb@entry=0x7fffeed9ba60 <code_gen_buffer+47512>,
tb_exit=tb_exit@entry=0x7fffffffe51c)
    at ../accel/tcg/cpu-exec.c:457
#2  0x000000010016a704 in cpu_loop_exec_tb (tb_exit=0x7fffffffe51c,
last_tb=<synthetic pointer>,
    pc=<optimised out>, tb=0x7fffeed9ba60 <code_gen_buffer+47512>,
cpu=<optimised out>)
    at ../accel/tcg/cpu-exec.c:919
#3  cpu_exec_loop (cpu=cpu@entry=0x1003d4a90, sc=<optimised out>) at
../accel/tcg/cpu-exec.c:1040
#4  0x000000010016abac in cpu_exec_setjmp (cpu=cpu@entry=0x1003d4a90,
sc=<optimised out>)
    at ../accel/tcg/cpu-exec.c:1057
#5  0x000000010016b270 in cpu_exec (cpu=0x1003d4a90) at
../accel/tcg/cpu-exec.c:1083
#6  0x000000010004d7b0 in cpu_loop (env=0x1003d4fa0) at
../linux-user/arm/cpu_loop.c:328
#7  0x0000000100047548 in main (argc=<optimised out>,
argv=0x7ffffffff188, envp=<optimised out>)
    at ../linux-user/main.c:1012
(gdb)


>
>
> r~
>
>
> Akihiko Odaki (6):
>   linux-user: Unset MAP_FIXED_NOREPLACE for host
>   linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
>   linux-user: Do not call get_errno() in do_brk()
>   linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
>   linux-user: Do nothing if too small brk is specified
>   linux-user: Do not align brk with host page size
>
> Helge Deller (1):
>   linux-user: Adjust initial brk when interpreter is close to executable
>
> Richard Henderson (7):
>   linux-user: Remove last_brk
>   bsd-user: Remove last_brk
>   linux-user: Adjust task_unmapped_base for reserved_va
>   linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
>   linux-user: Add ELF_ET_DYN_BASE
>   linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
>   linux-user: Properly set image_info.brk in flatload
>
>  bsd-user/qemu.h                      |  1 -
>  linux-user/aarch64/target_mman.h     | 13 ++++
>  linux-user/alpha/target_mman.h       | 11 ++++
>  linux-user/arm/target_mman.h         | 11 ++++
>  linux-user/cris/target_mman.h        | 12 ++++
>  linux-user/hexagon/target_mman.h     | 13 ++++
>  linux-user/hppa/target_mman.h        |  6 ++
>  linux-user/i386/target_mman.h        | 16 +++++
>  linux-user/loongarch64/target_mman.h | 11 ++++
>  linux-user/m68k/target_mman.h        |  5 ++
>  linux-user/microblaze/target_mman.h  | 11 ++++
>  linux-user/mips/target_mman.h        | 10 +++
>  linux-user/nios2/target_mman.h       | 10 +++
>  linux-user/openrisc/target_mman.h    | 10 +++
>  linux-user/ppc/target_mman.h         | 20 ++++++
>  linux-user/qemu.h                    |  2 -
>  linux-user/riscv/target_mman.h       | 10 +++
>  linux-user/s390x/target_mman.h       | 20 ++++++
>  linux-user/sh4/target_mman.h         |  7 +++
>  linux-user/sparc/target_mman.h       | 25 ++++++++
>  linux-user/user-mmap.h               |  6 +-
>  linux-user/x86_64/target_mman.h      | 15 +++++
>  linux-user/xtensa/target_mman.h      | 10 +++
>  bsd-user/mmap.c                      |  2 -
>  linux-user/elfload.c                 | 94 ++++++++++++++++------------
>  linux-user/flatload.c                |  2 +-
>  linux-user/main.c                    | 43 ++++++++++++-
>  linux-user/mmap.c                    | 68 ++++++++++++--------
>  linux-user/syscall.c                 | 69 +++++---------------
>  29 files changed, 401 insertions(+), 132 deletions(-)
>
> --
> 2.34.1
>
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Helge Deller 9 months ago
On 8/3/23 15:11, Joel Stanley wrote:
> Hi Richard,
>
> On Thu, 3 Aug 2023 at 01:53, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Builds on Helge's v6, incorporating my feedback plus
>> some other minor cleanup.
>
> This succeeds for the armhf static binary on ppc64le host that was
> previously segfaulting.
>
> However, the arm static binary on ppc64le host now segfaults:
>
> $ gdb -q -ex r --args ./build/qemu-arm -d guest_errors,page,strace ~/hello
> Reading symbols from ./build/qemu-arm...
> Starting program: /scratch/joel/qemu/build/qemu-arm -d
> guest_errors,page,strace /home/joel/hello
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
> [New Thread 0x7ffff762ece0 (LWP 143553)]
> host mmap_min_addr=0x10000
> pgb_find_hole: base @ 140420000 for 4294967296 bytes
> pgb_static: base @ 140420000 for 4294967295 bytes
> pgb_reserved_va: base @ 0x140420000 for 4294967296 bytes
> Locating guest address space @ 0x140420000
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 ---
> 00090000-0009b000 0000b000 ---
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-0009b000 0000b000 ---
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40810000 00810000 rw-
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40010000 00010000 ---
> 40010000-40811000 00801000 rw-
> ffff0000-00000000 00010000 r-x
> guest_base  0x140420000
> page layout changed following binary load
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40010000 00010000 ---
> 40010000-40810000 00800000 rw-
> 40810000-40811000 00001000 r-x
> ffff0000-00000000 00010000 r-x
> end_code    0x00084f7c
> start_code  0x00010000
> start_data  0x00095098
> end_data    0x00098394
> start_stack 0x4080f410
> brk         0x0009b000
> entry       0x00010418
> argv_start  0x4080f414
> env_start   0x4080f41c
> auxv_start  0x4080f4a0
> 143551 brk(NULL) = 0x0009b000
> 143551 brk(0x0009b8fc) = 0x0009b000

I think the problem is the brk with 9b000 here.
It's not 64k aligned (=pages size of your ppc64le).

Please try with this patch on top of Richard's series:

> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>       info->end_code = 0;
>       info->start_data = -1;
>       info->end_data = 0;
> -    info->brk = .....
change that to become:
     info->brk = HOST_PAGE_ALIGN(hiaddr);

Helge
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Joel Stanley 9 months ago
On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
> > 143551 brk(NULL) = 0x0009b000
> > 143551 brk(0x0009b8fc) = 0x0009b000
>
> I think the problem is the brk with 9b000 here.
> It's not 64k aligned (=pages size of your ppc64le).
>
> Please try with this patch on top of Richard's series:
>
> > @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
> >       info->end_code = 0;
> >       info->start_data = -1;
> >       info->end_data = 0;
> > -    info->brk = .....
> change that to become:
>      info->brk = HOST_PAGE_ALIGN(hiaddr);

That stopped the crashing, and the binaries seem to run fine. I tested
on two hosts: ppc64le (64K) and arm64 (16K).

Cheers,

Joel
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Helge Deller 9 months ago
* Joel Stanley <joel@jms.id.au>:
> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
> > > 143551 brk(NULL) = 0x0009b000
> > > 143551 brk(0x0009b8fc) = 0x0009b000
> >
> > I think the problem is the brk with 9b000 here.
> > It's not 64k aligned (=pages size of your ppc64le).
> >
> > Please try with this patch on top of Richard's series:
> >
> > > @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
> > >       info->end_code = 0;
> > >       info->start_data = -1;
> > >       info->end_data = 0;
> > > -    info->brk = .....
> > change that to become:
> >      info->brk = HOST_PAGE_ALIGN(hiaddr);
>
> That stopped the crashing, and the binaries seem to run fine. I tested
> on two hosts: ppc64le (64K) and arm64 (16K).

Great!

That made re-read Akihiko's patch:
----
Author: Akihiko Odaki <akihiko.odaki@daynix.com>
    linux-user: Do not align brk with host page size

    do_brk() minimizes calls into target_mmap() by aligning the address
    with host page size, which is potentially larger than the target page
    size. However, the current implementation of this optimization has two
    bugs:

    - The start of brk is rounded up with the host page size while brk
      advertises an address aligned with the target page size as the
      beginning of brk. This makes the beginning of brk unmapped.
----
this patch has wrong assumptions.

The start of brk always needs to be host page aligned.
It's not an optimization, but a requirement, since brk needs to be
located on a host-aligned page which may get different permissions
than the page before it (where code from the binary may be located).

I wonder if we need that patch at all.


Joel, could you give the patch below on top of git head (no other
patches applied) a spin?
(I just tested it here locally on a full range of linux-user chroots)

I think this is ALL what's needed for git head to fix the static binary
issues, has a nice preparation for Richard's ELF_ET_DYN_BASE patches.

If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
series.

Helge



diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..88d9e4056e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3021,8 +3021,10 @@ static void load_elf_image(const char *image_name, int image_fd,
     struct elfhdr *ehdr = (struct elfhdr *)bprm_buf;
     struct elf_phdr *phdr;
     abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
+    unsigned long load_offset = 0;
     int i, retval, prot_exec;
     Error *err = NULL;
+    bool is_main_executable;

     /* First of all, some simple consistency checks */
     if (!elf_check_ident(ehdr)) {
@@ -3106,28 +3108,8 @@ static void load_elf_image(const char *image_name, int image_fd,
         }
     }

-    if (pinterp_name != NULL) {
-        /*
-         * This is the main executable.
-         *
-         * Reserve extra space for brk.
-         * We hold on to this space while placing the interpreter
-         * and the stack, lest they be placed immediately after
-         * the data segment and block allocation from the brk.
-         *
-         * 16MB is chosen as "large enough" without being so large as
-         * to allow the result to not fit with a 32-bit guest on a
-         * 32-bit host. However some 64 bit guests (e.g. s390x)
-         * attempt to place their heap further ahead and currently
-         * nothing stops them smashing into QEMUs address space.
-         */
-#if TARGET_LONG_BITS == 64
-        info->reserve_brk = 32 * MiB;
-#else
-        info->reserve_brk = 16 * MiB;
-#endif
-        hiaddr += info->reserve_brk;
-
+    is_main_executable = (pinterp_name != NULL);
+    if (is_main_executable) {
         if (ehdr->e_type == ET_EXEC) {
             /*
              * Make sure that the low address does not conflict with
@@ -3136,10 +3118,11 @@ static void load_elf_image(const char *image_name, int image_fd,
             probe_guest_base(image_name, loaddr, hiaddr);
         } else {
             /*
-             * The binary is dynamic, but we still need to
+             * The binary is dynamic (pie-executabe), but we still need to
              * select guest_base.  In this case we pass a size.
              */
             probe_guest_base(image_name, 0, hiaddr - loaddr);
+            load_offset = 0 /* TODO: should be ELF_ET_DYN_BASE */;
         }
     }

@@ -3157,9 +3140,9 @@ static void load_elf_image(const char *image_name, int image_fd,
      * In both cases, we will overwrite pages in this range with mappings
      * from the executable.
      */
-    load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
+    load_addr = target_mmap(loaddr + load_offset, (size_t)hiaddr - loaddr + 1, PROT_NONE,
                             MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
-                            (ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
+                            (is_main_executable ? MAP_FIXED : 0),
                             -1, 0);
     if (load_addr == -1) {
         goto exit_mmap;
@@ -3194,7 +3177,8 @@ static void load_elf_image(const char *image_name, int image_fd,
     info->end_code = 0;
     info->start_data = -1;
     info->end_data = 0;
-    info->brk = 0;
+    /* possible start for brk is behind all sections of this ELF file. */
+    info->brk = HOST_PAGE_ALIGN(load_offset + hiaddr);
     info->elf_flags = ehdr->e_flags;

     prot_exec = PROT_EXEC;
@@ -3288,9 +3272,6 @@ static void load_elf_image(const char *image_name, int image_fd,
                     info->end_data = vaddr_ef;
                 }
             }
-            if (vaddr_em > info->brk) {
-                info->brk = vaddr_em;
-            }
 #ifdef TARGET_MIPS
         } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
             Mips_elf_abiflags_v0 abiflags;
@@ -3618,6 +3599,15 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+        /*
+	 * Use brk address of interpreter if it was loaded above the
+	 * executable and leaves less than 16 MB for heap.
+	 * This happens e.g. with static binaries on armhf.
+         */
+        if (interp_info.brk > info->brk &&
+            interp_info.load_bias - info->brk < 16 * MiB)  {
+            info->brk = interp_info.brk;
+        }

         /* If the program interpreter is one of these two, then assume
            an iBCS2 image.  Otherwise assume a native linux image.  */
@@ -3672,17 +3662,6 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
     bprm->core_dump = &elf_core_dump;
 #endif

-    /*
-     * If we reserved extra space for brk, release it now.
-     * The implementation of do_brk in syscalls.c expects to be able
-     * to mmap pages in this space.
-     */
-    if (info->reserve_brk) {
-        abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
-        abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
-        target_munmap(start_brk, end_brk - start_brk);
-    }
-
     return 0;
 }
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Richard Henderson 9 months ago
On 8/3/23 08:01, Helge Deller wrote:
> If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
> series.

The patch you gave below has no overlap with 1,2,4,5 at all.


r~
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Helge Deller 9 months ago
On 8/3/23 17:20, Richard Henderson wrote:
> On 8/3/23 08:01, Helge Deller wrote:
>> If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
>> series.
>
> The patch you gave below has no overlap with 1,2,4,5 at all.

Yes, ignore this.... Your patch series is fine as-is....
(the brk()-host-page-alignment issue is the only one left I think)

Helge
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Richard Henderson 9 months ago
On 8/3/23 08:01, Helge Deller wrote:
> * Joel Stanley <joel@jms.id.au>:
>> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
>>>> 143551 brk(NULL) = 0x0009b000
>>>> 143551 brk(0x0009b8fc) = 0x0009b000
>>>
>>> I think the problem is the brk with 9b000 here.
>>> It's not 64k aligned (=pages size of your ppc64le).
>>>
>>> Please try with this patch on top of Richard's series:
>>>
>>>> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>>>>        info->end_code = 0;
>>>>        info->start_data = -1;
>>>>        info->end_data = 0;
>>>> -    info->brk = .....
>>> change that to become:
>>>       info->brk = HOST_PAGE_ALIGN(hiaddr);
>>
>> That stopped the crashing, and the binaries seem to run fine. I tested
>> on two hosts: ppc64le (64K) and arm64 (16K).
> 
> Great!
> 
> That made re-read Akihiko's patch:
> ----
> Author: Akihiko Odaki <akihiko.odaki@daynix.com>
>      linux-user: Do not align brk with host page size
> 
>      do_brk() minimizes calls into target_mmap() by aligning the address
>      with host page size, which is potentially larger than the target page
>      size. However, the current implementation of this optimization has two
>      bugs:
> 
>      - The start of brk is rounded up with the host page size while brk
>        advertises an address aligned with the target page size as the
>        beginning of brk. This makes the beginning of brk unmapped.
> ----
> this patch has wrong assumptions.
> 
> The start of brk always needs to be host page aligned.


There is a bunch of code in target_mmap that attempts to manage adjacent guest pages that 
fall into the same host page.  Akihiko's patch assumes that code actually works.  Which I 
think is entirely reasonable.

You can't move brk up like this either (without other adjustments to the binary mapping), 
since that will leave a hole in the guest address space, which can get filled with 
something else later, which will definitely cause problems.


r~
Re: [PATCH v7 00/14] linux-user: brk fixes
Posted by Helge Deller 9 months ago
On 8/3/23 17:11, Richard Henderson wrote:
> On 8/3/23 08:01, Helge Deller wrote:
>> * Joel Stanley <joel@jms.id.au>:
>>> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
>>>>> 143551 brk(NULL) = 0x0009b000
>>>>> 143551 brk(0x0009b8fc) = 0x0009b000
>>>>
>>>> I think the problem is the brk with 9b000 here.
>>>> It's not 64k aligned (=pages size of your ppc64le).
>>>>
>>>> Please try with this patch on top of Richard's series:
>>>>
>>>>> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>>>>>        info->end_code = 0;
>>>>>        info->start_data = -1;
>>>>>        info->end_data = 0;
>>>>> -    info->brk = .....
>>>> change that to become:
>>>>       info->brk = HOST_PAGE_ALIGN(hiaddr);
>>>
>>> That stopped the crashing, and the binaries seem to run fine. I tested
>>> on two hosts: ppc64le (64K) and arm64 (16K).
>>
>> Great!
>>
>> That made re-read Akihiko's patch:
>> ----
>> Author: Akihiko Odaki <akihiko.odaki@daynix.com>
>>      linux-user: Do not align brk with host page size
>>
>>      do_brk() minimizes calls into target_mmap() by aligning the address
>>      with host page size, which is potentially larger than the target page
>>      size. However, the current implementation of this optimization has two
>>      bugs:
>>
>>      - The start of brk is rounded up with the host page size while brk
>>        advertises an address aligned with the target page size as the
>>        beginning of brk. This makes the beginning of brk unmapped.
>> ----
>> this patch has wrong assumptions.
>>
>> The start of brk always needs to be host page aligned.
>
>
> There is a bunch of code in target_mmap that attempts to manage
> adjacent guest pages that fall into the same host page.  Akihiko's
> patch assumes that code actually works.  Which I think is entirely
> reasonable.

Ok.

> You can't move brk up like this either (without other adjustments to
> the binary mapping), since that will leave a hole in the guest
> address space, which can get filled with something else later, which
> will definitely cause problems.

Ah, true. I have to admit that I didn't thought of that yet.
What is a possible solution to increase brk then (if you agree
that it should be increased here if necessary) ?
Call target_mmap() on the area from current brk to new brk?

Helge