[PATCH] linux-user: Fix zero_bss for RX PT_LOAD segments

Razvan Ghiorghe posted 1 patch 5 days, 2 hours ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260202002615.27441-1-razvanghiorghe16@gmail.com
Maintainers: Laurent Vivier <laurent@vivier.eu>, Pierrick Bouvier <pierrick.bouvier@linaro.org>
linux-user/elfload.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
[PATCH] linux-user: Fix zero_bss for RX PT_LOAD segments
Posted by Razvan Ghiorghe 5 days, 2 hours ago
zero_bss() incorrectly assumed that any PT_LOAD containing .bss must be
writable, rejecting valid ELF binaries where .bss overlaps the tail of
an RX file-backed page.

Instead of failing, temporarily enable write access on the overlapping
page to zero the fractional bss range, then restore the original page
permissions once initialization is complete.
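
The enable/zero/restore sequence described above can be sketched
outside QEMU with the host mprotect(). This is only an illustration
under stated assumptions: zero_page_tail() and zero_page_tail_demo()
are hypothetical names, the page is an anonymous read-only mapping
rather than a file-backed RX one, and the patch itself uses
target_mprotect() on guest addresses.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): zero the bytes from `start` to
 * the end of its page.  `prot` is the protection the page currently
 * has; if it lacks PROT_WRITE, write access is enabled only for the
 * duration of the memset and then restored.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int zero_page_tail(unsigned char *start, int prot)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    uintptr_t page_addr = (uintptr_t)start & ~((uintptr_t)page_size - 1);
    void *page = (void *)page_addr;
    size_t len = page_size - ((uintptr_t)start - page_addr);
    int need_write = !(prot & PROT_WRITE);

    assert(start != NULL);

    if (need_write && mprotect(page, page_size, prot | PROT_WRITE) == -1) {
        return -1;
    }
    memset(start, 0, len);
    if (need_write && mprotect(page, page_size, prot) == -1) {
        return -1;
    }
    return 0;
}

/*
 * Demo: fill one anonymous page with 0xAA, make it read-only, zero the
 * last three quarters through the helper, then verify that the head is
 * untouched and the tail is zero.  Returns 0 if everything matches.
 */
int zero_page_tail_demo(void)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t split = page_size / 4;
    size_t i;
    int ret;
    unsigned char *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (page == MAP_FAILED) {
        return -1;
    }
    memset(page, 0xAA, page_size);
    if (mprotect(page, page_size, PROT_READ) == -1) {
        munmap(page, page_size);
        return -1;
    }

    ret = zero_page_tail(page + split, PROT_READ);
    for (i = 0; ret == 0 && i < page_size; i++) {
        unsigned char want = i < split ? 0xAA : 0x00;
        if (page[i] != want) {
            ret = -1;
        }
    }
    munmap(page, page_size);
    return ret;
}
```

The demo uses PROT_READ instead of PROT_READ | PROT_EXEC so that it
also runs on hosts whose security policy refuses writable and
executable mappings; the control flow is the same either way.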

To validate the modified zero_bss(), two targeted test cases were
constructed to exercise the edge case where the .bss section overlaps a
partially filled page belonging to an R-X mapping. The test binaries
were built without a main() function and instead define a custom ELF
entry point via the _start symbol. This bypasses the CRT, the dynamic
loader, and libc initialization, so that execution begins immediately
after QEMU completes ELF loading and memory mapping.

The first binary defines a minimal _start routine that immediately
terminates via a system call, without ever referencing the .bss symbol.
Although a .bss section is present in the ELF, it is not accessed at
runtime, and the resulting PT_LOAD mapping can be established without
triggering any writes to a file-backed RX page. In this configuration,
QEMU loads the binary successfully and the loader reaches the zero_bss()
path, validating that the fractional .bss region is zeroed while the
original segment permissions are left intact.

The second binary explicitly reads from the global .bss symbol (x) at
program entry. This forces the linker to materialize the .bss region
within the same PT_LOAD segment as the RX code, yielding a segment with
p_filesz < p_memsz and flags R|X. In this case, QEMU fails during the
initial file-backed mmap() of the PT_LOAD segment, returning EINVAL.
This is consistent with the Linux kernel's ELF loader semantics, which
prohibit mapping a file-backed segment as RX when it (logically)
contains writable memory. The failure therefore occurs before zero_bss()
is reached, which is the expected and correct behaviour.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3179
Signed-off-by: Razvan Ghiorghe <razvanghiorghe16@gmail.com>
---
 linux-user/elfload.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 35471c0c9a..fa3f7cda69 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -449,18 +449,11 @@ static bool zero_bss(abi_ulong start_bss, abi_ulong end_bss,
 {
     abi_ulong align_bss;
 
-    /* We only expect writable bss; the code segment shouldn't need this. */
-    if (!(prot & PROT_WRITE)) {
-        error_setg(errp, "PT_LOAD with non-writable bss");
-        return false;
-    }
-
     align_bss = TARGET_PAGE_ALIGN(start_bss);
     end_bss = TARGET_PAGE_ALIGN(end_bss);
 
     if (start_bss < align_bss) {
         int flags = page_get_flags(start_bss);
-
         if (!(flags & PAGE_RWX)) {
             /*
              * The whole address space of the executable was reserved
@@ -472,20 +465,35 @@ static bool zero_bss(abi_ulong start_bss, abi_ulong end_bss,
              */
             align_bss -= TARGET_PAGE_SIZE;
         } else {
+            abi_ulong start_page_aligned = start_bss & TARGET_PAGE_MASK;
             /*
-             * The start of the bss shares a page with something.
-             * The only thing that we expect is the data section,
-             * which would already be marked writable.
-             * Overlapping the RX code segment seems malformed.
+             * The bitwise OR of prot and PAGE_WRITE works because
+             * in include/exec/page-protection.h the PAGE_* bits are defined
+             * with PROT_* values, matching mprotect().
+             * Temporarily enable write access to zero the fractional bss.
+             * target_mprotect() handles TB invalidation if needed.
              */
             if (!(flags & PAGE_WRITE)) {
-                error_setg(errp, "PT_LOAD with bss overlapping "
-                           "non-writable page");
-                return false;
+                if (target_mprotect(start_page_aligned,
+                                    TARGET_PAGE_SIZE,
+                                    prot | PAGE_WRITE) == -1) {
+                    error_setg_errno(errp, errno,
+                                     "Error enabling write access for bss");
+                    return false;
+                }
             }
 
-            /* The page is already mapped and writable. */
+            /* The page is already mapped and now guaranteed writable. */
             memset(g2h_untagged(start_bss), 0, align_bss - start_bss);
+
+            if (!(flags & PAGE_WRITE)) {
+                if (target_mprotect(start_page_aligned,
+                                    TARGET_PAGE_SIZE, prot) == -1) {
+                    error_setg_errno(errp, errno,
+                                     "Error restoring first bss page permissions");
+                    return false;
+                }
+            }
         }
     }
 
-- 
2.43.0


Re: [PATCH] linux-user: Fix zero_bss for RX PT_LOAD segments
Posted by Richard Henderson 3 days, 22 hours ago
On 2/2/26 10:24, Razvan Ghiorghe wrote:
> To validate the correctness of the modified zero_bss() implementation,
> two targeted test cases were constructed, designed to exercise the edge cases where
> the .bss segment overlaps a partially filled virtual memory page belonging to a
> R_X region. The test binaries were intentionally built without a main() function
> and instead defined a custom ELF entry-point via the _start symbol.
> This approach bypasses CRT, dynamic loader, libc initialization etc. ensuring that
> execution begins immediately after QEMU completes ELF loading and memory mapping.

It would be nice to include those test cases.


r~