[v2] Open HugeTLB allocation routine for more generic use

[PATCH v2 0/6] Open HugeTLB allocation routine for more generic use

Posted by Ackerley Tng via B4 Relay 1 month, 1 week ago

Hi,

The motivation for this patch series is guest_memfd, which would like
to use HugeTLB as a generic source of huge pages but not adopt
HugeTLB's reservation at mmap() time.

By refactoring alloc_hugetlb_folio() and some dependent functions,
there is now an option to allocate HugeTLB folios without providing a
VMA. Specifically, HugeTLB allocation used to depend on the VMA to:

1. Look up reservations in the resv_map
2. Get mpol, stored at vma->vm_policy

This refactoring provides hugetlb_alloc_folio(), which focuses on just
the allocation itself, and associated memory and HugeTLB charging
(cgroups). alloc_hugetlb_folio() still handles reservations in the
resv_map and subpools.
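
As a rough userspace illustration of that split (not the code in this
series): every toy_* name, type and signature below is hypothetical, the
pool is modelled as a plain counter, and cgroup charging and subpools
are left out. The point is only that the core routine takes the memory
policy as an argument, while the wrapper is the one place that looks at
the VMA for the reservation and the policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT the kernel's types or signatures. */
struct toy_folio { int nid; };
struct toy_mpol { int preferred_nid; };
struct toy_vma { struct toy_mpol *policy; long reserved; };

/* Hypothetical global pool of free huge pages. */
long toy_free_huge_pages = 2;

/*
 * VMA-free core: only performs the allocation. The caller passes the
 * memory policy explicitly (NULL falls back to node 0 in this toy).
 */
struct toy_folio *toy_alloc_core(const struct toy_mpol *mpol)
{
	struct toy_folio *folio;

	if (toy_free_huge_pages == 0)
		return NULL;

	folio = malloc(sizeof(*folio));
	if (!folio)
		return NULL;

	toy_free_huge_pages--;
	folio->nid = mpol ? mpol->preferred_nid : 0;
	return folio;
}

/*
 * VMA-based wrapper: consumes a reservation recorded against the VMA,
 * reads the policy from the VMA, then defers to the core routine.
 */
struct toy_folio *toy_alloc_with_vma(struct toy_vma *vma)
{
	struct toy_folio *folio;
	bool had_reservation;

	assert(vma != NULL);

	had_reservation = vma->reserved > 0;
	if (had_reservation)
		vma->reserved--;

	folio = toy_alloc_core(vma->policy);
	if (!folio && had_reservation)
		vma->reserved++;	/* restore the reservation on failure */

	return folio;
}
```

In this model a VMA-less caller would call toy_alloc_core() directly
with whatever policy it tracks itself, and would never touch the
reservation bookkeeping.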

Regarding naming, I'm definitely open to alternative names :) I chose
hugetlb_alloc_folio() because I'm seeing this function as a general
allocation function that is provided by the HugeTLB subsystem (hence
the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
later refactored as a static function for use just by HugeTLB, and
HugeTLBfs should probably use hugetlb_alloc_folio() directly.

To see how hugetlb_alloc_folio() is used by guest_memfd, the most
recent patch series posted to the list that uses this more generic
HugeTLB allocation routine is at [1], and a newer revision of that
series is available as a git tree at [2].

Independently of guest_memfd, I believe this change is useful in
simplifying alloc_hugetlb_folio(). alloc_hugetlb_folio() was so
coupled to a VMA that even HugeTLBfs allocates HugeTLB folios using a
pseudo-VMA.

Testing:

+ libhugetlbfs tests pass
+ ./tools/testing/selftests/mm/ksft_hugetlb.sh passes

Changes in this revision:

+ No longer reintroduces try-commit-cancel protocol for hugetlb's memcg charging.
    + "mm: memcontrol: eliminate the problem of dying memory cgroup
      for LRU folios" was merged, and memcg seems to be moving away
      from the try-commit-cancel protocol, with try_charge() no longer
      having any users [3].
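
For readers unfamiliar with the pattern, a generic userspace sketch of
a try-commit-cancel charging scheme next to a single-step charge. This
is not memcg code: the toy_* names and the counter-based accounting are
hypothetical and only show the shape of the two approaches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy accounting objects; not the kernel's structures. */
struct toy_cgroup { long usage; long limit; };
struct toy_page { struct toy_cgroup *owner; };

/* Phase 1: reserve the charge against the limit, before allocating. */
bool toy_try_charge(struct toy_cgroup *cg, long nr)
{
	if (cg->usage + nr > cg->limit)
		return false;
	cg->usage += nr;
	return true;
}

/* Phase 2a: allocation succeeded, bind the reserved charge to the page. */
void toy_commit_charge(struct toy_cgroup *cg, struct toy_page *page)
{
	page->owner = cg;
}

/* Phase 2b: allocation failed, hand the reserved charge back. */
void toy_cancel_charge(struct toy_cgroup *cg, long nr)
{
	cg->usage -= nr;
}

/* Three-phase flow; alloc_ok stands in for the allocation result. */
bool toy_alloc_three_phase(struct toy_cgroup *cg, struct toy_page *page,
			   bool alloc_ok)
{
	assert(cg != NULL && page != NULL);

	if (!toy_try_charge(cg, 1))
		return false;
	if (!alloc_ok) {
		toy_cancel_charge(cg, 1);
		return false;
	}
	toy_commit_charge(cg, page);
	return true;
}

/* Single-step alternative: account and bind in one call. */
bool toy_charge_page(struct toy_cgroup *cg, struct toy_page *page)
{
	assert(cg != NULL && page != NULL);

	if (cg->usage + 1 > cg->limit)
		return false;
	cg->usage += 1;
	page->owner = cg;
	return true;
}
```

With the single-step form, the caller has only one failure path to
handle and no reserved-but-uncommitted state to unwind.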

[1] https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/
[2] https://github.com/googleprodkernel/linux-cc/tree/wip-gmem-conversions-hugetlb-restructuring-12-08-25
[3] https://lore.kernel.org/all/bb35a69a-5be9-45f5-a557-1902487a1bc2@linux.dev/

---
Ackerley Tng (6):
      mm: hugetlb: Consolidate interpretation of gbl_chg within alloc_hugetlb_folio()
      mm: hugetlb: Move mpol interpretation out of alloc_buddy_hugetlb_folio_with_mpol()
      mm: hugetlb: Move mpol interpretation out of dequeue_hugetlb_folio_vma()
      mm: hugetlb: Use error variable in alloc_hugetlb_folio
      mm: hugetlb: Move mem_cgroup_charge_hugetlb() earlier in allocation
      mm: hugetlb: Refactor out hugetlb_alloc_folio()

 include/linux/hugetlb.h |   3 +
 mm/hugetlb.c            | 209 ++++++++++++++++++++++++++----------------------
 2 files changed, 117 insertions(+), 95 deletions(-)
---
base-commit: adc1e5c6203cf13fe05a1ead08edcb3d3a3baae8
change-id: 20260504-hugetlb-open-up-eaba80571b09

Best regards,
--
Ackerley Tng <ackerleytng@google.com>

Re: [PATCH v2 0/6] Open HugeTLB allocation routine for more generic use

Posted by Oscar Salvador 1 month ago

On Wed, May 06, 2026 at 08:54:36AM -0700, Ackerley Tng via B4 Relay wrote:
> Hi,
> 
> The motivation for this patch series is guest_memfd, which would like
> to use HugeTLB as a generic source of huge pages but not adopt
> HugeTLB's reservation at mmap() time.
> 
> By refactoring alloc_hugetlb_folio() and some dependent functions,
> there is now an option to allocate HugeTLB folios without providing a
> VMA. Specifically, HugeTLB allocation used to be dependent on the VMA
> to
> 
> 1. Look up reservations in the resv_map
> 2. Get mpol, stored at vma->vm_policy
> 
> This refactoring provides hugetlb_alloc_folio(), which focuses on just
> the allocation itself, and associated memory and HugeTLB charging
> (cgroups). alloc_hugetlb_folio() still handles reservations in the
> resv_map and subpools.
> 
> Regarding naming, I'm definitely open to alternative names :) I chose
> hugetlb_alloc_folio() because I'm seeing this function as a general
> allocation function that is provided by the HugeTLB subsystem (hence
> the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
> later refactored as a static function for use just by HugeTLB, and
> HugeTLBfs should probably use hugetlb_alloc_folio() directly.
> 
> To see how hugetlb_alloc_folio() is used by guest_memfd, the most
> recent patch series that uses this more generic HugeTLB allocation
> routine is at [1], and a newer revision of that patch series is at
> [2].

Would that be

https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/#me2152fa2cc79d651ecea7a2bce8b57725fb57465

?


-- 
Oscar Salvador
SUSE Labs

Re: [PATCH v2 0/6] Open HugeTLB allocation routine for more generic use

Posted by Ackerley Tng 3 weeks, 4 days ago

Oscar Salvador <osalvador@suse.de> writes:

> On Wed, May 06, 2026 at 08:54:36AM -0700, Ackerley Tng via B4 Relay wrote:
>> Hi,
>>
>> The motivation for this patch series is guest_memfd, which would like
>> to use HugeTLB as a generic source of huge pages but not adopt
>> HugeTLB's reservation at mmap() time.
>>
>> By refactoring alloc_hugetlb_folio() and some dependent functions,
>> there is now an option to allocate HugeTLB folios without providing a
>> VMA. Specifically, HugeTLB allocation used to be dependent on the VMA
>> to
>>
>> 1. Look up reservations in the resv_map
>> 2. Get mpol, stored at vma->vm_policy
>>
>> This refactoring provides hugetlb_alloc_folio(), which focuses on just
>> the allocation itself, and associated memory and HugeTLB charging
>> (cgroups). alloc_hugetlb_folio() still handles reservations in the
>> resv_map and subpools.
>>
>> Regarding naming, I'm definitely open to alternative names :) I chose
>> hugetlb_alloc_folio() because I'm seeing this function as a general
>> allocation function that is provided by the HugeTLB subsystem (hence
>> the hugetlb_ prefix). I'm intending for alloc_hugetlb_folio() to be
>> later refactored as a static function for use just by HugeTLB, and
>> HugeTLBfs should probably use hugetlb_alloc_folio() directly.
>>
>> To see how hugetlb_alloc_folio() is used by guest_memfd, the most
>> recent patch series that uses this more generic HugeTLB allocation
>> routine is at [1], and a newer revision of that patch series is at
>> [2].
>
> Would that be
>
> https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/#me2152fa2cc79d651ecea7a2bce8b57725fb57465
>
> ?

Thanks for checking :)

The link you have is the latest patch series I posted (let's call that
[*]), but [2], or
https://github.com/googleprodkernel/linux-cc/tree/wip-gmem-conversions-hugetlb-restructuring-12-08-25
is even newer than [*].

[2] has some more bugfixes, and is also based on a newer version of the
guest_memfd conversions patch series than [*].

>
>
> --
> Oscar Salvador
> SUSE Labs