mm/hugetlb: A fix and some cleanups

[PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool

Posted by Joshua Hahn 3 weeks, 3 days ago

Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
fixed an underflow error for hstate->resv_huge_pages caused by
incorrectly attributing globally requested pages to the subpool's
reservation.

Unfortunately, this fix also introduced the opposite problem, which would
leave spool->used_hpages elevated if the globally requested pages could
not be acquired. This is because while a subpool's reserve pages only
accounts for what is requested and allocated from the subpool, its
"used" counter keeps track of what is consumed in total, both from the
subpool and globally. Thus, we need to adjust spool->used_hpages in the
other direction, and make sure that globally requested pages are
uncharged from the subpool's used counter.

Each failed allocation attempt increments the used_hpages counter by
how many pages were requested from the global pool. Ultimately, this
renders the subpool unusable, as used_hpages approaches the max limit.

The issue can be reproduced as follows:
1. Allocate 4 hugetlb pages
2. Create a hugetlb mount with max=4, min=2
3. Consume 2 pages globally
4. Request 3 pages from the subpool (2 from subpool + 1 from global)
	4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
		used_hpages += 3
	4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
		used_hpages -= 2
5. Subpool now has used_hpages = 1, despite not being able to
   successfully allocate any hugepages. It believes it can now only
   allocate 3 more hugepages, not 4.

Repeating this process will ultimately render the subpool unable to
allocate any hugepages, since it believes that it is using the maximum
number of hugepages that the subpool has been allotted.
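
For illustration, the used_hpages arithmetic from the steps above can be
modeled in userspace. This is only a sketch under simplifying assumptions,
not kernel code: struct model_spool and model_failed_reserve() are
hypothetical names, locking and the min/reserve bookkeeping are left out,
and the caller supplies how many of the requested pages would have come
from the global pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the kernel's struct hugepage_subpool. */
struct model_spool {
	long max_hpages;	/* limit from the mount's max= option */
	long used_hpages;	/* pages charged to the subpool, from any source */
};

/*
 * Model one reservation of chg pages in which gbl_reserve of them must
 * come from the global pool and the global request fails.
 *
 * Returns false if the subpool itself rejects the request because it
 * would exceed max_hpages, true if the request got as far as the failed
 * global accounting and was unwound.
 */
bool model_failed_reserve(struct model_spool *sp, long chg,
			  long gbl_reserve, bool with_fix)
{
	assert(gbl_reserve >= 0 && gbl_reserve <= chg);

	if (sp->used_hpages + chg > sp->max_hpages)
		return false;

	/* Step 4.1: the whole request is charged to the subpool. */
	sp->used_hpages += chg;

	/* Step 4.2: global accounting fails; the subpool's own share is put back. */
	sp->used_hpages -= chg - gbl_reserve;

	/* The adjustment this patch adds: also uncharge the global share. */
	if (with_fix)
		sp->used_hpages -= gbl_reserve;

	return true;
}
```

In this model, with max_hpages = 4, calling
model_failed_reserve(&sp, 3, 1, false) twice leaves used_hpages at 2, and
a third call is rejected because 2 + 3 exceeds the limit. With with_fix
set, used_hpages is back at 0 after every call.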

The underflow issue that commit a833a693a490 fixes still remains fixed
as well.

Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2e296d30a8d7..88b9e997c9da 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6560,6 +6560,7 @@ long hugetlb_reserve_pages(struct inode *inode,
 	struct resv_map *resv_map;
 	struct hugetlb_cgroup *h_cg = NULL;
 	long gbl_reserve, regions_needed = 0;
+	unsigned long flags;
 	int err;
 
 	/* This should never happen */
@@ -6704,6 +6705,13 @@ long hugetlb_reserve_pages(struct inode *inode,
 		 */
 		hugetlb_acct_memory(h, -gbl_resv);
 	}
+	/* Restore used_hpages for pages that failed global reservation */
+	if (gbl_reserve && spool) {
+		spin_lock_irqsave(&spool->lock, flags);
+		if (spool->max_hpages != -1)
+			spool->used_hpages -= gbl_reserve;
+		unlock_or_release_subpool(spool, flags);
+	}
 out_uncharge_cgroup:
 	hugetlb_cgroup_uncharge_cgroup_rsvd(hstate_index(h),
 					    chg * pages_per_huge_page(h), h_cg);
-- 
2.47.3

Re: [PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool

Posted by Andrew Morton 3 weeks, 3 days ago

On Thu, 15 Jan 2026 13:14:35 -0500 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

> Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> fixed an underflow error for hstate->resv_huge_pages caused by
> incorrectly attributing globally requested pages to the subpool's
> reservation.
> 
> Unfortunately, this fix also introduced the opposite problem, which would
> leave spool->used_hpages elevated if the globally requested pages could
> not be acquired. This is because while a subpool's reserve pages only
> accounts for what is requested and allocated from the subpool, its
> "used" counter keeps track of what is consumed in total, both from the
> subpool and globally. Thus, we need to adjust spool->used_hpages in the
> other direction, and make sure that globally requested pages are
> uncharged from the subpool's used counter.
> 
> Each failed allocation attempt increments the used_hpages counter by
> how many pages were requested from the global pool. Ultimately, this
> renders the subpool unusable, as used_hpages approaches the max limit.
> 
> The issue can be reproduced as follows:
> 1. Allocate 4 hugetlb pages
> 2. Create a hugetlb mount with max=4, min=2
> 3. Consume 2 pages globally
> 4. Request 3 pages from the subpool (2 from subpool + 1 from global)
> 	4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
> 		used_hpages += 3
> 	4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
> 		used_hpages -= 2
> 5. Subpool now has used_hpages = 1, despite not being able to
>    successfully allocate any hugepages. It believes it can now only
>    allocate 3 more hugepages, not 4.
> 
> Repeating this process will ultimately render the subpool unable to
> allocate any hugepages, since it believes that it is using the maximum
> number of hugepages that the subpool has been allotted.
> 
> The underflow issue that commit a833a693a490 fixes still remains fixed
> as well.

Thanks, I submitted the above to the Changelog Of The Year judging
committee.

> Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> Cc: stable@vger.kernel.org

I'll add this to mm.git's mm-hotfixes queue, for testing and review
input.

> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6560,6 +6560,7 @@ long hugetlb_reserve_pages(struct inode *inode,
>  	struct resv_map *resv_map;
>  	struct hugetlb_cgroup *h_cg = NULL;
>  	long gbl_reserve, regions_needed = 0;
> +	unsigned long flags;

This could have been local to the {block} which uses it, which would be
nicer, no?

>  	int err;
>  
>  	/* This should never happen */
> @@ -6704,6 +6705,13 @@ long hugetlb_reserve_pages(struct inode *inode,
>  		 */
>  		hugetlb_acct_memory(h, -gbl_resv);
>  	}
> +	/* Restore used_hpages for pages that failed global reservation */
> +	if (gbl_reserve && spool) {
> +		spin_lock_irqsave(&spool->lock, flags);
> +		if (spool->max_hpages != -1)
> +			spool->used_hpages -= gbl_reserve;
> +		unlock_or_release_subpool(spool, flags);
> +	}

I'll add [2/3] and [3/3] to the mm-new queue while discarding your
perfectly good [0/N] :(

Please, let's try not to mix backportable patches with the
non-backportable ones?

Re: [PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool

Posted by Joshua Hahn 3 weeks, 3 days ago

On Thu, 15 Jan 2026 11:19:46 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Thu, 15 Jan 2026 13:14:35 -0500 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

Hello Andrew, I hope you are doing well. Thank you for your help as always!

> > Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> > fixed an underflow error for hstate->resv_huge_pages caused by
> > incorrectly attributing globally requested pages to the subpool's
> > reservation.
> > 
> > Unfortunately, this fix also introduced the opposite problem, which would
> > leave spool->used_hpages elevated if the globally requested pages could
> > not be acquired. This is because while a subpool's reserve pages only
> > accounts for what is requested and allocated from the subpool, its
> > "used" counter keeps track of what is consumed in total, both from the
> > subpool and globally. Thus, we need to adjust spool->used_hpages in the
> > other direction, and make sure that globally requested pages are
> > uncharged from the subpool's used counter.
> > 
> > Each failed allocation attempt increments the used_hpages counter by
> > how many pages were requested from the global pool. Ultimately, this
> > renders the subpool unusable, as used_hpages approaches the max limit.
> > 
> > The issue can be reproduced as follows:
> > 1. Allocate 4 hugetlb pages
> > 2. Create a hugetlb mount with max=4, min=2
> > 3. Consume 2 pages globally
> > 4. Request 3 pages from the subpool (2 from subpool + 1 from global)
> > 	4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
> > 		used_hpages += 3
> > 	4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
> > 		used_hpages -= 2
> > 5. Subpool now has used_hpages = 1, despite not being able to
> >    successfully allocate any hugepages. It believes it can now only
> >    allocate 3 more hugepages, not 4.
> > 
> > Repeating this process will ultimately render the subpool unable to
> > allocate any hugepages, since it believes that it is using the maximum
> > number of hugepages that the subpool has been allotted.
> > 
> > The underflow issue that commit a833a693a490 fixes still remains fixed
> > as well.
> 
> Thanks, I submitted the above to the Changelog Of The Year judging
> committee.

:-) Thank you for the kind words!

> > Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > Cc: stable@vger.kernel.org
> 
> I'll add this to mm.git's mm-hotfixes queue, for testing and review
> input.

Sounds good to me! I'll wait a bit in case others have different concerns,
but I'll send out a new version which addresses your comments below (and
any future comments) in a day or two.

> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6560,6 +6560,7 @@ long hugetlb_reserve_pages(struct inode *inode,
> >  	struct resv_map *resv_map;
> >  	struct hugetlb_cgroup *h_cg = NULL;
> >  	long gbl_reserve, regions_needed = 0;
> > +	unsigned long flags;
> 
> This could have been local to the {block} which uses it, which would be
> nicer, no?

Definitely, I'll address this in v2!

> >  	int err;
> >  
> >  	/* This should never happen */
> > @@ -6704,6 +6705,13 @@ long hugetlb_reserve_pages(struct inode *inode,
> >  		 */
> >  		hugetlb_acct_memory(h, -gbl_resv);
> >  	}
> > +	/* Restore used_hpages for pages that failed global reservation */
> > +	if (gbl_reserve && spool) {
> > +		spin_lock_irqsave(&spool->lock, flags);
> > +		if (spool->max_hpages != -1)
> > +			spool->used_hpages -= gbl_reserve;
> > +		unlock_or_release_subpool(spool, flags);
> > +	}
> 
> I'll add [2/3] and [3/3] to the mm-new queue while discarding your
> perfectly good [0/N] :(
> 
> Please, let's try not to mix backportable patches with the
> non-backportable ones?

Oh no! Sorry, this is my first time Cc-ing stable so I wasn't aware of the
implications. In v2, I'll send the two out as separate patches, so that it's
easier to backport. I was just eager to send out 2/3 and 3/3 because I've
been waiting for a functional hugetlb patch to smoosh these cleanups into.

I'll be more mindful in the future.

Thank you again, I hope you have a great day!!
Joshua

[PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool
[PATCH 2/3] mm/hugetlb: Remove unnecessary if condition
[PATCH 3/3] mm/hugetlb: Enforce brace style