exec_folio_order() was introduced [1] to request readahead of executable
file-backed pages at an arch-preferred folio order, so that the hardware
can coalesce contiguous PTEs into fewer iTLB entries (contpte).

The current implementation uses ilog2(SZ_64K >> PAGE_SHIFT), which
requests 64K folios. This is optimal for 4K base pages (where CONT_PTES
= 16, contpte size = 64K), but suboptimal for 16K and 64K base pages:

Page size | Before (order) | After (order) | contpte
----------|----------------|---------------|----------------
4K        | 4 (64K)        | 4 (64K)       | Yes (unchanged)
16K       | 2 (64K)        | 7 (2M)        | Yes (new)
64K       | 0 (64K)        | 5 (2M)        | Yes (new)

For 16K pages, CONT_PTES = 128 and the contpte size is 2M (order 7).
For 64K pages, CONT_PTES = 32 and the contpte size is 2M (order 5).

Use ilog2(CONT_PTES) instead, which directly evaluates to contpte-aligned
order for all page sizes.
The worst-case waste is bounded to one folio (up to 2MB - 64KB)
at the end of the file, since page_cache_ra_order() reduces the folio
order near EOF to avoid allocating past i_size.
[1] https://lore.kernel.org/all/20250430145920.3748738-6-ryan.roberts@arm.com/
Signed-off-by: Usama Arif <usama.arif@linux.dev>
---
arch/arm64/include/asm/pgtable.h | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b3e58735c49bd..a1110a33acb35 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1600,12 +1600,11 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
 #define arch_wants_old_prefaulted_pte cpu_has_hw_af
 
 /*
- * Request exec memory is read into pagecache in at least 64K folios. This size
- * can be contpte-mapped when 4K base pages are in use (16 pages into 1 iTLB
- * entry), and HPA can coalesce it (4 pages into 1 TLB entry) when 16K base
- * pages are in use.
+ * Request exec memory is read into pagecache in contpte-sized folios. The
+ * contpte size is the number of contiguous PTEs that the hardware can coalesce
+ * into a single iTLB entry: 64K for 4K pages, 2M for 16K and 64K pages.
  */
-#define exec_folio_order() ilog2(SZ_64K >> PAGE_SHIFT)
+#define exec_folio_order() ilog2(CONT_PTES)
 
 static inline bool pud_sect_supported(void)
 {
--
2.47.3

On 3/10/26 15:51, Usama Arif wrote:
> exec_folio_order() was introduced [1] to request readahead of executable
> file-backed pages at an arch-preferred folio order, so that the hardware
> can coalesce contiguous PTEs into fewer iTLB entries (contpte).
>
> The current implementation uses ilog2(SZ_64K >> PAGE_SHIFT), which
> requests 64K folios. This is optimal for 4K base pages (where CONT_PTES
> = 16, contpte size = 64K), but suboptimal for 16K and 64K base pages:
>
> Page size | Before (order) | After (order) | contpte
> ----------|----------------|---------------|----------------
> 4K        | 4 (64K)        | 4 (64K)       | Yes (unchanged)
> 16K       | 2 (64K)        | 7 (2M)        | Yes (new)
> 64K       | 0 (64K)        | 5 (2M)        | Yes (new)
>
> For 16K pages, CONT_PTES = 128 and the contpte size is 2M (order 7).
> For 64K pages, CONT_PTES = 32 and the contpte size is 2M (order 5).
>
> Use ilog2(CONT_PTES) instead, which directly evaluates to contpte-aligned
> order for all page sizes.
>
> The worst-case waste is bounded to one folio (up to 2MB - 64KB)
> at the end of the file, since page_cache_ra_order() reduces the folio
> order near EOF to avoid allocating past i_size.

So, if you have a smallish text segment in a larger file, we'd always try
to allocate 2M on 16k/64k? That feels wrong.

Asking the other way around: why not also use 2M on a 4k system and end up
with a PMD?

And no, I don't think we should default to that, just emphasizing my point
that *maybe* we really want to consider mapping (vma) size as well.

-- 
Cheers,

David