Hi,
The motivation for this patch series is guest_memfd, which would like
to use HugeTLB as a generic source of huge pages but not adopt
HugeTLB's reservation at mmap() time.
By refactoring alloc_hugetlb_folio() and some dependent functions,
there is now an option to allocate HugeTLB folios without providing a
VMA. Specifically, HugeTLB allocation used to be dependent on the VMA
to
1. Look up reservations in the resv_map
2. Get mpol, stored at vma->vm_policy
This refactoring provides hugetlb_alloc_folio(), which focuses on just
the allocation itself, and associated memory and HugeTLB charging
(cgroups). alloc_hugetlb_folio() still handles reservations in the
resv_map and subpools.
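
To illustrate the split described above, here is a minimal userspace
sketch. It is not the code in this series: every toy_* name, the
two-node pool, and the charge counter are hypothetical stand-ins, and
the charge-before-allocate ordering with an uncharge on failure is an
assumption made for illustration. The core routine takes the memory
policy explicitly instead of reading it from a VMA, and the VMA-based
wrapper keeps the reservation bookkeeping to itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_NODES 2

/* Hypothetical stand-ins; none of these are the kernel's types. */
struct toy_mpol  { int preferred_node; };
struct toy_vma   { struct toy_mpol policy; int reserved; };
struct toy_pool  { int free_pages[TOY_NR_NODES]; int charged; int charge_limit; };
struct toy_folio { int node; };

/*
 * VMA-free core, loosely modelled on the role of hugetlb_alloc_folio():
 * the caller passes the policy in, and the function only charges and
 * allocates. It knows nothing about reservations.
 */
static bool toy_hugetlb_alloc_folio(struct toy_pool *pool,
				    const struct toy_mpol *mpol,
				    struct toy_folio *out)
{
	assert(pool != NULL && mpol != NULL && out != NULL);
	assert(mpol->preferred_node >= 0 && mpol->preferred_node < TOY_NR_NODES);

	/* Charge first (assumed ordering); fail if over the limit. */
	if (pool->charged >= pool->charge_limit)
		return false;
	pool->charged++;

	/* Try the preferred node, then fall back to the remaining nodes. */
	for (int i = 0; i < TOY_NR_NODES; i++) {
		int node = (mpol->preferred_node + i) % TOY_NR_NODES;

		if (pool->free_pages[node] > 0) {
			pool->free_pages[node]--;
			out->node = node;
			return true;
		}
	}

	/* No page available anywhere: undo the charge. */
	pool->charged--;
	return false;
}

/*
 * VMA-based wrapper, loosely modelled on the role that
 * alloc_hugetlb_folio() keeps: consume a reservation if there is one,
 * take the policy from the VMA, and delegate to the core.
 */
static bool toy_alloc_hugetlb_folio(struct toy_pool *pool,
				    struct toy_vma *vma,
				    struct toy_folio *out)
{
	assert(vma != NULL);

	bool had_resv = vma->reserved > 0;

	if (had_resv)
		vma->reserved--;

	if (toy_hugetlb_alloc_folio(pool, &vma->policy, out))
		return true;

	/* Allocation failed: give the reservation back. */
	if (had_resv)
		vma->reserved++;
	return false;
}
```

A caller without a VMA, such as guest_memfd in this analogy, would call
toy_hugetlb_alloc_folio() directly with its own policy, while a
VMA-based caller would go through toy_alloc_hugetlb_folio().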
Regarding naming, I'm definitely open to alternative names :) I chose
hugetlb_alloc_folio() because I'm seeing this function as a general
allocation function that is provided by the HugeTLB subsystem (hence
the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
later refactored as a static function for use just by HugeTLB, and
HugeTLBfs should probably use hugetlb_alloc_folio() directly.
To see how hugetlb_alloc_folio() is used by guest_memfd, the most
recent patch series that uses this more generic HugeTLB allocation
routine is at [1], and a newer revision of that patch series is at
[2].
Independently of guest_memfd, I believe this change is useful in
simplifying alloc_hugetlb_folio(). alloc_hugetlb_folio() was so
coupled to a VMA that even HugeTLBfs allocates HugeTLB folios using a
pseudo-VMA.
Testing:
+ libhugetlbfs tests pass
+ ./tools/testing/selftests/mm/ksft_hugetlb.sh passes
Changes in this revision:
+ No longer reintroduces try-commit-cancel protocol for hugetlb's memcg charging.
+ "mm: memcontrol: eliminate the problem of dying memory cgroup
for LRU folios" was merged, and memcg seems to be moving away
from the try-commit-cancel protocol, with try_charge() no longer
having any users [3].
[1] https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/
[2] https://github.com/googleprodkernel/linux-cc/tree/wip-gmem-conversions-hugetlb-restructuring-12-08-25
[3] https://lore.kernel.org/all/bb35a69a-5be9-45f5-a557-1902487a1bc2@linux.dev/
---
Ackerley Tng (6):
mm: hugetlb: Consolidate interpretation of gbl_chg within alloc_hugetlb_folio()
mm: hugetlb: Move mpol interpretation out of alloc_buddy_hugetlb_folio_with_mpol()
mm: hugetlb: Move mpol interpretation out of dequeue_hugetlb_folio_vma()
mm: hugetlb: Use error variable in alloc_hugetlb_folio
mm: hugetlb: Move mem_cgroup_charge_hugetlb() earlier in allocation
mm: hugetlb: Refactor out hugetlb_alloc_folio()
include/linux/hugetlb.h | 3 +
mm/hugetlb.c | 209 ++++++++++++++++++++++++++----------------------
2 files changed, 117 insertions(+), 95 deletions(-)
---
base-commit: adc1e5c6203cf13fe05a1ead08edcb3d3a3baae8
change-id: 20260504-hugetlb-open-up-eaba80571b09
Best regards,
--
Ackerley Tng <ackerleytng@google.com>
On Wed, May 06, 2026 at 08:54:36AM -0700, Ackerley Tng via B4 Relay wrote:
> Hi,
>
> The motivation for this patch series is guest_memfd, which would like
> to use HugeTLB as a generic source of huge pages but not adopt
> HugeTLB's reservation at mmap() time.
>
> By refactoring alloc_hugetlb_folio() and some dependent functions,
> there is now an option to allocate HugeTLB folios without providing a
> VMA. Specifically, HugeTLB allocation used to be dependent on the VMA
> to
>
> 1. Look up reservations in the resv_map
> 2. Get mpol, stored at vma->vm_policy
>
> This refactoring provides hugetlb_alloc_folio(), which focuses on just
> the allocation itself, and associated memory and HugeTLB charging
> (cgroups). alloc_hugetlb_folio() still handles reservations in the
> resv_map and subpools.
>
> Regarding naming, I'm definitely open to alternative names :) I chose
> hugetlb_alloc_folio() because I'm seeing this function as a general
> allocation function that is provided by the HugeTLB subsystem (hence
> the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
> later refactored as a static function for use just by HugeTLB, and
> HugeTLBfs should probably use hugetlb_alloc_folio() directly.
>
> To see how hugetlb_alloc_folio() is used by guest_memfd, the most
> recent patch series that uses this more generic HugeTLB allocation
> routine is at [1], and a newer revision of that patch series is at
> [2].

Would that be

https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/#me2152fa2cc79d651ecea7a2bce8b57725fb57465

?

--
Oscar Salvador
SUSE Labs
Oscar Salvador <osalvador@suse.de> writes:

> On Wed, May 06, 2026 at 08:54:36AM -0700, Ackerley Tng via B4 Relay wrote:
>> Hi,
>>
>> The motivation for this patch series is guest_memfd, which would like
>> to use HugeTLB as a generic source of huge pages but not adopt
>> HugeTLB's reservation at mmap() time.
>>
>> By refactoring alloc_hugetlb_folio() and some dependent functions,
>> there is now an option to allocate HugeTLB folios without providing a
>> VMA. Specifically, HugeTLB allocation used to be dependent on the VMA
>> to
>>
>> 1. Look up reservations in the resv_map
>> 2. Get mpol, stored at vma->vm_policy
>>
>> This refactoring provides hugetlb_alloc_folio(), which focuses on just
>> the allocation itself, and associated memory and HugeTLB charging
>> (cgroups). alloc_hugetlb_folio() still handles reservations in the
>> resv_map and subpools.
>>
>> Regarding naming, I'm definitely open to alternative names :) I chose
>> hugetlb_alloc_folio() because I'm seeing this function as a general
>> allocation function that is provided by the HugeTLB subsystem (hence
>> the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
>> later refactored as a static function for use just by HugeTLB, and
>> HugeTLBfs should probably use hugetlb_alloc_folio() directly.
>>
>> To see how hugetlb_alloc_folio() is used by guest_memfd, the most
>> recent patch series that uses this more generic HugeTLB allocation
>> routine is at [1], and a newer revision of that patch series is at
>> [2].
>
> Would that be
>
> https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/#me2152fa2cc79d651ecea7a2bce8b57725fb57465
>
> ?

Thanks for checking :) The link you have is the latest patch series I
posted (let's call that [*]), but [2], or

https://github.com/googleprodkernel/linux-cc/tree/wip-gmem-conversions-hugetlb-restructuring-12-08-25

is even newer than [*].

[2] has some more bugfixes, and is also on a newer version of guest_memfd
conversions patch series than [*].

>
>
> --
> Oscar Salvador
> SUSE Labs