[PATCH 4/4] mm: align file-backed mmap to exec folio order in thp_get_unmapped_area

Usama Arif posted 4 patches 4 weeks, 1 day ago
thp_get_unmapped_area() is the get_unmapped_area callback for
filesystems like ext4, xfs, and btrfs. It attempts to align the virtual
address for PMD_SIZE THP mappings, but on arm64 with 64K base pages
PMD_SIZE is 512M, which is too large for typical shared library mappings,
so the alignment always fails and falls back to PAGE_SIZE.

This means shared libraries loaded by ld.so via mmap() get 64K-aligned
virtual addresses, preventing contpte mapping even when 2M folios are
allocated with properly aligned file offsets and physical addresses.

Add a fallback in thp_get_unmapped_area_vmflags() that tries
PAGE_SIZE << exec_folio_order() alignment (2M on arm64 64K pages)
when PMD_SIZE alignment fails. This is small enough that shared
libraries could qualify, enabling contpte mapping for their executable
segments.

This applies to all file-backed mappings (not just exec). Non-exec
file-backed mappings also benefit from contpte mapping when large
folios are used. Aligning all file-backed mappings ensures that any
large folio in the page cache can be contpte-mapped regardless of
the mapping's protection flags, reducing dTLB misses for read-heavy
workloads.

The fallback is gated by exec_folio_order() which returns 0 by default,
making this a no-op on architectures that don't define it.

Signed-off-by: Usama Arif <usama.arif@linux.dev>
---
 mm/huge_memory.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..1c9476a5ed51c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1242,6 +1242,23 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
 	if (ret)
 		return ret;
 
+	/*
+	 * If the arch requested large folios for exec memory, try to align
+	 * to the folio size as a fallback. This is much smaller than PMD_SIZE
+	 * (e.g. 2M vs 512M on arm64 64K pages), so it succeeds for mappings
+	 * that are too small for PMD alignment. Proper alignment ensures that
+	 * the hardware can coalesce PTEs (e.g. arm64 contpte) when large
+	 * folios are mapped.
+	 */
+	if (exec_folio_order()) {
+		unsigned long folio_size = PAGE_SIZE << exec_folio_order();
+
+		ret = __thp_get_unmapped_area(filp, addr, len, off, flags,
+					      folio_size, vm_flags);
+		if (ret)
+			return ret;
+	}
+
 	return mm_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags,
 					    vm_flags);
 }
-- 
2.47.3
Re: [PATCH 4/4] mm: align file-backed mmap to exec folio order in thp_get_unmapped_area
Posted by WANG Rui 3 weeks, 5 days ago
> +	if (exec_folio_order()) {
> +		unsigned long folio_size = PAGE_SIZE << exec_folio_order();
> +
> +		ret = __thp_get_unmapped_area(filp, addr, len, off, flags,
> +					      folio_size, vm_flags);
> +		if (ret)
> +			return ret;
> +	}
> +

I noticed that even when the code segment of a user-space shared library
is large enough for PMD_SIZE (32MB here), it still doesn't end up at a
PMD-aligned virtual address. This might be due to the fallback you
mentioned. Adjusting p_align in the ld.so ELF loader does work, though
it also avoids extremely large PMD_SIZE values (capped at ≤32M).

It would probably be better to skip the PMD_SIZE == folio_sz case here,
so we don’t end up calling __thp_get_unmapped_area() twice with the same
parameters.

Thanks,
Rui