[PATCH v2] mm: Respect mmap hint address when aligning for THP

Kalesh Singh posted 1 patch 5 days, 2 hours ago
mm/mmap.c | 1 +
1 file changed, 1 insertion(+)
[PATCH v2] mm: Respect mmap hint address when aligning for THP
Posted by Kalesh Singh 5 days, 2 hours ago
Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
boundaries") updated __get_unmapped_area() to align the start address
for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y.

It does this by effectively looking up a region that is of size,
request_size + PMD_SIZE, and aligning up the start to a PMD boundary.

Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment
on 32 bit") opted out of this for 32bit due to regressions in mmap base
randomization.

Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous
mappings to PMD-aligned sizes") restricted this to only mmap sizes that
are multiples of the PMD_SIZE due to reported regressions in some
performance benchmarks -- which seemed mostly due to the reduced spatial
locality of related mappings due to the forced PMD-alignment.

Another unintended side effect has emerged: When a user specifies an mmap
hint address, the THP alignment logic modifies the behavior, potentially
ignoring the hint even if a sufficiently large gap exists at the requested
hint location.

Example Scenario:

Consider the following simplified virtual address (VA) space:

    ...

    0x200000-0x400000 --- VMA A
    0x400000-0x600000 --- Hole
    0x600000-0x800000 --- VMA B

    ...

A call to mmap() with hint=0x400000 and len=0x200000 behaves differently:

  - Before THP alignment: The requested region (size 0x200000) fits into
    the gap at 0x400000, so the hint is respected.

  - After alignment: The logic searches for a region of size
    0x400000 (len + PMD_SIZE) starting at 0x400000.
    This search fails due to the mapping at 0x600000 (VMA B), and the hint
    is ignored, falling back to arch_get_unmapped_area[_topdown]().

In general the hint is effectively ignored, if there is any
existing mapping in the below range:

     [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE)

This changes the semantics of mmap hint; from ""Respect the hint if a
sufficiently large gap exists at the requested location" to "Respect the
hint only if an additional PMD-sized gap exists beyond the requested size".

This has performance implications for allocators that allocate their heap
using mmap but try to keep it "as contiguous as possible" by using the
end of the exisiting heap as the address hint. With the new behavior
it's more likely to get a much less contiguous heap, adding extra
fragmentation and performance overhead.

To restore the expected behavior; don't use thp_get_unmapped_area_vmflags()
when the user provided a hint address, for anonymous mappings.

Note: As, Yang Shi, pointed out: the issue still remains for filesystems
which are using thp_get_unmapped_area() for their get_unmapped_area() op.
It is unclear what worklaods will regress for if we ignore THP alignment
when the hint address is provided for such file backed mappings -- so this
fix will be handled separately.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hans Boehm <hboehm@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: <stable@vger.kernel.org>
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---

Changes in v2:
  - Clarify the handling of file backed mappings, as highlighted by Yang
  - Collect Vlastimil's and Rik's Reviewed-by's

 mm/mmap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 79d541f1502b..2f01f1a8e304 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -901,6 +901,7 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (get_area) {
 		addr = get_area(file, addr, len, pgoff, flags);
 	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+		   && !addr /* no hint */
 		   && IS_ALIGNED(len, PMD_SIZE)) {
 		/* Ensures that larger anonymous mappings are THP aligned. */
 		addr = thp_get_unmapped_area_vmflags(file, addr, len,

base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623
--
2.47.0.338.g60cca15819-goog
Re: [PATCH v2] mm: Respect mmap hint address when aligning for THP
Posted by David Hildenbrand 4 days, 9 hours ago
On 18.11.24 22:46, Kalesh Singh wrote:
> Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> boundaries") updated __get_unmapped_area() to align the start address
> for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y.
> 
> It does this by effectively looking up a region that is of size,
> request_size + PMD_SIZE, and aligning up the start to a PMD boundary.
> 
> Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment
> on 32 bit") opted out of this for 32bit due to regressions in mmap base
> randomization.
> 
> Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous
> mappings to PMD-aligned sizes") restricted this to only mmap sizes that
> are multiples of the PMD_SIZE due to reported regressions in some
> performance benchmarks -- which seemed mostly due to the reduced spatial
> locality of related mappings due to the forced PMD-alignment.
> 
> Another unintended side effect has emerged: When a user specifies an mmap
> hint address, the THP alignment logic modifies the behavior, potentially
> ignoring the hint even if a sufficiently large gap exists at the requested
> hint location.
> 
> Example Scenario:
> 
> Consider the following simplified virtual address (VA) space:
> 
>      ...
> 
>      0x200000-0x400000 --- VMA A
>      0x400000-0x600000 --- Hole
>      0x600000-0x800000 --- VMA B
> 
>      ...
> 
> A call to mmap() with hint=0x400000 and len=0x200000 behaves differently:
> 
>    - Before THP alignment: The requested region (size 0x200000) fits into
>      the gap at 0x400000, so the hint is respected.
> 
>    - After alignment: The logic searches for a region of size
>      0x400000 (len + PMD_SIZE) starting at 0x400000.
>      This search fails due to the mapping at 0x600000 (VMA B), and the hint
>      is ignored, falling back to arch_get_unmapped_area[_topdown]().
> 
> In general the hint is effectively ignored, if there is any
> existing mapping in the below range:
> 
>       [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE)
> 
> This changes the semantics of mmap hint; from ""Respect the hint if a
> sufficiently large gap exists at the requested location" to "Respect the
> hint only if an additional PMD-sized gap exists beyond the requested size".
> 
> This has performance implications for allocators that allocate their heap
> using mmap but try to keep it "as contiguous as possible" by using the
> end of the exisiting heap as the address hint. With the new behavior
> it's more likely to get a much less contiguous heap, adding extra
> fragmentation and performance overhead.
> 
> To restore the expected behavior; don't use thp_get_unmapped_area_vmflags()
> when the user provided a hint address, for anonymous mappings.
> 
> Note: As, Yang Shi, pointed out: the issue still remains for filesystems
> which are using thp_get_unmapped_area() for their get_unmapped_area() op.
> It is unclear what worklaods will regress for if we ignore THP alignment
> when the hint address is provided for such file backed mappings -- so this
> fix will be handled separately.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Yang Shi <yang@os.amperecomputing.com>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Hans Boehm <hboehm@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: <stable@vger.kernel.org>
> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> Reviewed-by: Rik van Riel <riel@surriel.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---

LGTM. Hopefully that's the end of this story :)

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb
Re: [PATCH v2] mm: Respect mmap hint address when aligning for THP
Posted by Kalesh Singh 3 days, 5 hours ago
On Tue, Nov 19, 2024 at 6:35 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 18.11.24 22:46, Kalesh Singh wrote:
> > Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> > boundaries") updated __get_unmapped_area() to align the start address
> > for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y.
> >
> > It does this by effectively looking up a region that is of size,
> > request_size + PMD_SIZE, and aligning up the start to a PMD boundary.
> >
> > Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment
> > on 32 bit") opted out of this for 32bit due to regressions in mmap base
> > randomization.
> >
> > Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous
> > mappings to PMD-aligned sizes") restricted this to only mmap sizes that
> > are multiples of the PMD_SIZE due to reported regressions in some
> > performance benchmarks -- which seemed mostly due to the reduced spatial
> > locality of related mappings due to the forced PMD-alignment.
> >
> > Another unintended side effect has emerged: When a user specifies an mmap
> > hint address, the THP alignment logic modifies the behavior, potentially
> > ignoring the hint even if a sufficiently large gap exists at the requested
> > hint location.
> >
> > Example Scenario:
> >
> > Consider the following simplified virtual address (VA) space:
> >
> >      ...
> >
> >      0x200000-0x400000 --- VMA A
> >      0x400000-0x600000 --- Hole
> >      0x600000-0x800000 --- VMA B
> >
> >      ...
> >
> > A call to mmap() with hint=0x400000 and len=0x200000 behaves differently:
> >
> >    - Before THP alignment: The requested region (size 0x200000) fits into
> >      the gap at 0x400000, so the hint is respected.
> >
> >    - After alignment: The logic searches for a region of size
> >      0x400000 (len + PMD_SIZE) starting at 0x400000.
> >      This search fails due to the mapping at 0x600000 (VMA B), and the hint
> >      is ignored, falling back to arch_get_unmapped_area[_topdown]().
> >
> > In general the hint is effectively ignored, if there is any
> > existing mapping in the below range:
> >
> >       [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE)
> >
> > This changes the semantics of mmap hint; from ""Respect the hint if a
> > sufficiently large gap exists at the requested location" to "Respect the
> > hint only if an additional PMD-sized gap exists beyond the requested size".
> >
> > This has performance implications for allocators that allocate their heap
> > using mmap but try to keep it "as contiguous as possible" by using the
> > end of the exisiting heap as the address hint. With the new behavior
> > it's more likely to get a much less contiguous heap, adding extra
> > fragmentation and performance overhead.
> >
> > To restore the expected behavior; don't use thp_get_unmapped_area_vmflags()
> > when the user provided a hint address, for anonymous mappings.
> >
> > Note: As, Yang Shi, pointed out: the issue still remains for filesystems
> > which are using thp_get_unmapped_area() for their get_unmapped_area() op.
> > It is unclear what worklaods will regress for if we ignore THP alignment
> > when the hint address is provided for such file backed mappings -- so this
> > fix will be handled separately.
> >
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Vlastimil Babka <vbabka@suse.cz>
> > Cc: Yang Shi <yang@os.amperecomputing.com>
> > Cc: Rik van Riel <riel@surriel.com>
> > Cc: Ryan Roberts <ryan.roberts@arm.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: Hans Boehm <hboehm@google.com>
> > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > Cc: <stable@vger.kernel.org>
> > Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
> > Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> > Reviewed-by: Rik van Riel <riel@surriel.com>
> > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
>
> LGTM. Hopefully that's the end of this story :)
>
> Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks David.

Andrew, when you have a chance, could you take this for mm fixes please.

Thanks,
Kalesh

>
> --
> Cheers,
>
> David / dhildenb
>