[PATCH] mm: hugetlb: Add Kconfig option to set default nr_overcommit_hugepages

Josh Triplett posted 1 patch 2 years, 8 months ago
[PATCH] mm: hugetlb: Add Kconfig option to set default nr_overcommit_hugepages
Posted by Josh Triplett 2 years, 8 months ago
The default kernel configuration does not allow any huge page allocation
until after setting nr_hugepages or nr_overcommit_hugepages to a
non-zero value; without setting those, mmap attempts with MAP_HUGETLB
will always fail with -ENOMEM. nr_overcommit_hugepages allows userspace
to attempt to allocate huge pages at runtime, succeeding if the kernel
can find or assemble a free huge page.

Provide a Kconfig option to make nr_overcommit_hugepages default to
unlimited, which permits userspace to always attempt huge page
allocation on a best-effort basis. This makes it easier and more
worthwhile for random applications and libraries to opportunistically
attempt MAP_HUGETLB allocations without special configuration.

In particular, current versions of liburing with IORING_SETUP_NO_MMAP
attempt to allocate the rings in a huge page. This seems likely to lead
to more applications and libraries attempting to use huge pages.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---
 mm/Kconfig   | 14 ++++++++++++++
 mm/hugetlb.c |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 7672a22647b4..32c13610c5c4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -824,6 +824,20 @@ config READ_ONLY_THP_FOR_FS
 
 endif # TRANSPARENT_HUGEPAGE
 
+config HUGEPAGE_OVERCOMMIT_DEFAULT_UNLIMITED
+	bool "Allow huge page allocation attempts by default"
+	depends on HUGETLB_PAGE
+	help
+	  By default, the kernel does not allow any huge page allocation until
+	  after setting nr_hugepages or nr_overcommit_hugepages to a non-zero
+	  value. nr_overcommit_hugepages allows userspace to attempt to
+	  allocate huge pages at runtime, succeeding if the kernel can find or
+	  assemble a free huge page.
+
+	  Enable this option to make nr_overcommit_hugepages default to
+	  unlimited, which permits userspace to always attempt hugepage
+	  allocation.
+
 #
 # UP and nommu archs use km based percpu allocator
 #
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f154019e6b84..65abbe254e10 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4305,6 +4305,8 @@ void __init hugetlb_add_hstate(unsigned int order)
 	mutex_init(&h->resize_lock);
 	h->order = order;
 	h->mask = ~(huge_page_size(h) - 1);
+	if (IS_ENABLED(CONFIG_HUGEPAGE_OVERCOMMIT_DEFAULT_UNLIMITED))
+		h->nr_overcommit_huge_pages = ULONG_MAX;
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-- 
2.40.1
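
For readers following along, the "opportunistic" pattern the commit message describes can be sketched from userspace roughly as follows. This is a minimal illustration, not code from liburing or from the patch: map_prefer_huge() is a hypothetical helper name, and it assumes the caller passes a length that is a multiple of the default huge page size (2 MiB on x86-64).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): try to back `len` bytes with
 * hugetlb pages and fall back to ordinary pages if the kernel refuses,
 * for example with ENOMEM when both nr_hugepages and
 * nr_overcommit_hugepages are 0.
 *
 * Returns the mapping, or NULL if even the fallback failed. *used_huge
 * reports which path was taken.
 */
static void *map_prefer_huge(size_t len, bool *used_huge)
{
	assert(used_huge != NULL);

	/* First attempt: explicit hugetlb mapping. */
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p != MAP_FAILED) {
		*used_huge = true;
		return p;
	}

	/* Fallback: ordinary anonymous memory. */
	*used_huge = false;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

With the default configuration described above, the first mmap() is expected to fail and the helper takes the fallback path; with a non-zero nr_overcommit_hugepages the first attempt can succeed when the kernel is able to find or assemble a huge page.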
Re: [PATCH] mm: hugetlb: Add Kconfig option to set default nr_overcommit_hugepages
Posted by David Rientjes 2 years, 8 months ago
On Fri, 9 Jun 2023, Josh Triplett wrote:

> The default kernel configuration does not allow any huge page allocation
> until after setting nr_hugepages or nr_overcommit_hugepages to a
> non-zero value; without setting those, mmap attempts with MAP_HUGETLB
> will always fail with -ENOMEM. nr_overcommit_hugepages allows userspace
> to attempt to allocate huge pages at runtime, succeeding if the kernel
> can find or assemble a free huge page.
> 
> Provide a Kconfig option to make nr_overcommit_hugepages default to
> unlimited, which permits userspace to always attempt huge page
> allocation on a best-effort basis. This makes it easier and more
> worthwhile for random applications and libraries to opportunistically
> attempt MAP_HUGETLB allocations without special configuration.
> 
> In particular, current versions of liburing with IORING_SETUP_NO_MMAP
> attempt to allocate the rings in a huge page. This seems likely to lead
> to more applications and libraries attempting to use huge pages.
> 
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>

Why not do this in an initscript?

Or, if absolutely necessary, a kernel command line parameter?

A Kconfig option to set a default value to be ULONG_MAX seems strange if 
you can just write the value to procfs.
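
As a rough illustration of the procfs route suggested here, an init-time helper could look like the sketch below. write_ulong() and the OVERCOMMIT_SYSCTL macro are hypothetical names chosen for this example; the path is the standard sysctl file for the default huge page size, and writing to it requires root.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Sysctl file for the default huge page size; needs root to write. */
#define OVERCOMMIT_SYSCTL "/proc/sys/vm/nr_overcommit_hugepages"

/*
 * Hypothetical helper (illustration only): write `value` in decimal to
 * `path`. Returns 0 on success or a negative errno-style code.
 */
static int write_ulong(const char *path, unsigned long value)
{
	assert(path != NULL);

	FILE *f = fopen(path, "w");
	if (!f)
		return -errno;

	int rc = 0;
	if (fprintf(f, "%lu\n", value) < 0)
		rc = -EIO;
	/* Buffered output is flushed here, so write errors show up at close. */
	if (fclose(f) != 0 && rc == 0)
		rc = -errno;
	return rc;
}
```

An initscript equivalent would call write_ulong(OVERCOMMIT_SYSCTL, limit) once at boot, with the limit chosen by the administrator. Other huge page sizes have their own nr_overcommit_hugepages file under /sys/kernel/mm/hugepages/.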
Re: [PATCH] mm: hugetlb: Add Kconfig option to set default nr_overcommit_hugepages
Posted by David Hildenbrand 2 years, 8 months ago
On 11.06.23 07:20, David Rientjes wrote:
> On Fri, 9 Jun 2023, Josh Triplett wrote:
> 
>> The default kernel configuration does not allow any huge page allocation
>> until after setting nr_hugepages or nr_overcommit_hugepages to a
>> non-zero value; without setting those, mmap attempts with MAP_HUGETLB
>> will always fail with -ENOMEM. nr_overcommit_hugepages allows userspace
>> to attempt to allocate huge pages at runtime, succeeding if the kernel
>> can find or assemble a free huge page.
>>
>> Provide a Kconfig option to make nr_overcommit_hugepages default to
>> unlimited, which permits userspace to always attempt huge page
>> allocation on a best-effort basis. This makes it easier and more
>> worthwhile for random applications and libraries to opportunistically
>> attempt MAP_HUGETLB allocations without special configuration.
>>
>> In particular, current versions of liburing with IORING_SETUP_NO_MMAP
>> attempt to allocate the rings in a huge page. This seems likely to lead
>> to more applications and libraries attempting to use huge pages.
>>
>> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> 
> Why not do this in an initscript?
> 
> Or, if absolutely necessary, a kernel command line parameter?
> 
> A Kconfig option to set a default value to be ULONG_MAX seems strange if
> you can just write the value to procfs.
> 

Agreed, not to mention that huge pages in some environments can cause 
trouble (some architectures -- or with gigantic huge pages -- don't 
support huge page migration and you can run into trouble with 
ZONE_MOVABLE or MIGRATE_CMA, because you'll end up "consuming" all 
memory for unmovable allocations in the system), and we shouldn't 
advocate the use of unlimited overcommit for huge pages ...

-- 
Cheers,

David / dhildenb
Re: [PATCH] mm: hugetlb: Add Kconfig option to set default nr_overcommit_hugepages
Posted by Mike Kravetz 2 years, 8 months ago
On 06/12/23 11:12, David Hildenbrand wrote:
> On 11.06.23 07:20, David Rientjes wrote:
> > On Fri, 9 Jun 2023, Josh Triplett wrote:
> > 
> > > The default kernel configuration does not allow any huge page allocation
> > > until after setting nr_hugepages or nr_overcommit_hugepages to a
> > > non-zero value; without setting those, mmap attempts with MAP_HUGETLB
> > > will always fail with -ENOMEM. nr_overcommit_hugepages allows userspace
> > > to attempt to allocate huge pages at runtime, succeeding if the kernel
> > > can find or assemble a free huge page.
> > > 
> > > Provide a Kconfig option to make nr_overcommit_hugepages default to
> > > unlimited, which permits userspace to always attempt huge page
> > > allocation on a best-effort basis. This makes it easier and more
> > > worthwhile for random applications and libraries to opportunistically
> > > attempt MAP_HUGETLB allocations without special configuration.
> > > 
> > > In particular, current versions of liburing with IORING_SETUP_NO_MMAP
> > > attempt to allocate the rings in a huge page. This seems likely to lead
> > > to more applications and libraries attempting to use huge pages.
> > > 
> > > Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> > 
> > Why not do this in an initscript?
> > 
> > Or, if absolutely necessary, a kernel command line parameter?
> > 
> > A Kconfig option to set a default value to be ULONG_MAX seems strange if
> > you can just write the value to procfs.
> > 
> 
> Agreed, not to mention that huge pages in some environments can cause trouble
> (some architectures -- or with gigantic huge pages -- don't support huge
> page migration and you can run into trouble with ZONE_MOVABLE or
> MIGRATE_CMA, because you'll end up "consuming" all memory for unmovable
> allocations in the system), and we shouldn't advocate the use of unlimited
> overcommit for huge pages ...
> 

Agree with David(s).  Such an option should really be decided by a sysadmin.

Any reason why liburing cannot use THP?  Seems like that would provide the
desired functionality.
-- 
Mike Kravetz
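
For context on the THP alternative raised here, a userspace request for transparent huge pages might be sketched as below. This is only an illustration under stated assumptions: map_thp_hint() is a hypothetical helper, THP_ALIGN assumes a 2 MiB PMD size as on x86-64, and madvise(MADV_HUGEPAGE) is a hint that the kernel may ignore. The thread does not establish whether this would satisfy liburing's requirements for IORING_SETUP_NO_MMAP.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD-sized huge pages, as on x86-64. */
#define THP_ALIGN (2UL << 20)

/*
 * Hypothetical helper (illustration only): map `len` bytes of anonymous
 * memory aligned to THP_ALIGN and ask the kernel to back it with
 * transparent huge pages. Returns NULL if the mapping itself fails.
 */
static void *map_thp_hint(size_t len)
{
	assert(len > 0 && len % THP_ALIGN == 0);

	/* Over-allocate so an aligned window of `len` bytes must exist. */
	size_t span = len + THP_ALIGN;
	char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	uintptr_t start = ((uintptr_t)raw + THP_ALIGN - 1) &
			  ~(uintptr_t)(THP_ALIGN - 1);
	char *aligned = (char *)start;
	size_t head = (size_t)(aligned - raw);
	size_t tail = span - head - len;

	/* Trim the unaligned head and the unused tail. */
	if (head)
		munmap(raw, head);
	if (tail)
		munmap(aligned + len, tail);

	/* Best-effort hint; failure (e.g. THP disabled) is not an error. */
	(void)madvise(aligned, len, MADV_HUGEPAGE);
	return aligned;
}
```

Unlike MAP_HUGETLB, this path does not depend on nr_hugepages or nr_overcommit_hugepages, but it also gives no guarantee that a huge page is actually used.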