[PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts

Nico Pache posted 15 patches 3 months, 2 weeks ago
There is a newer version of this series
[PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Nico Pache 3 months, 2 weeks ago
There are cases where, if an attempted collapse fails, all subsequent
orders are guaranteed to also fail. Avoid these collapse attempts by
bailing out early.

Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e2319bfd0065..54f5c7888e46 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
 			ret = collapse_huge_page(mm, address, referenced,
 						 unmapped, cc, mmap_locked,
 						 order, offset);
-			if (ret == SCAN_SUCCEED) {
+
+			/*
+			 * Analyze failure reason to determine next action:
+			 * - goto next_order: try smaller orders in same region
+			 * - continue: try other regions at same order
+			 * - break: stop all attempts (system-wide failure)
+			 */
+			switch (ret) {
+			/* Cases were we should continue to the next region */
+			case SCAN_SUCCEED:
 				collapsed += 1UL << order;
+				fallthrough;
+			case SCAN_PTE_MAPPED_HUGEPAGE:
 				continue;
+			/* Cases were lower orders might still succeed */
+			case SCAN_LACK_REFERENCED_PAGE:
+			case SCAN_EXCEED_NONE_PTE:
+			case SCAN_EXCEED_SWAP_PTE:
+			case SCAN_EXCEED_SHARED_PTE:
+			case SCAN_PAGE_LOCK:
+			case SCAN_PAGE_COUNT:
+			case SCAN_PAGE_LRU:
+			case SCAN_PAGE_NULL:
+			case SCAN_DEL_PAGE_LRU:
+			case SCAN_PTE_NON_PRESENT:
+			case SCAN_PTE_UFFD_WP:
+			case SCAN_ALLOC_HUGE_PAGE_FAIL:
+				goto next_order;
+			/* All other cases should stop collapse attempts */
+			default:
+				break;
 			}
+			break;
 		}
 
 next_order:
-- 
2.51.0
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
> There are cases where, if an attempted collapse fails, all subsequent
> orders are guaranteed to also fail. Avoid these collapse attempts by
> bailing out early.
>
> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
>  mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index e2319bfd0065..54f5c7888e46 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
>  			ret = collapse_huge_page(mm, address, referenced,
>  						 unmapped, cc, mmap_locked,
>  						 order, offset);
> -			if (ret == SCAN_SUCCEED) {
> +
> +			/*
> +			 * Analyze failure reason to determine next action:
> +			 * - goto next_order: try smaller orders in same region
> +			 * - continue: try other regions at same order

The stack is a DFS, so this isn't correct, you may have pushed a bunch of higher
order candidate mTHPs (I don't like plain 'region') which you will also true.

> +			 * - break: stop all attempts (system-wide failure)
> +			 */

This comment isn't hugely helpful, just put the relevant comments next to each
of the goto, continue, break (soon to be return re: review below) statements
please.

> +			switch (ret) {
> +			/* Cases were we should continue to the next region */
> +			case SCAN_SUCCEED:
>  				collapsed += 1UL << order;
> +				fallthrough;
> +			case SCAN_PTE_MAPPED_HUGEPAGE:
>  				continue;

Would we not run into trouble potentially in the previous patch's implementation
of this examing candidate mTHPs that are part of an already existing huge page,
or would a folio check in the collapse just make this wasted work?

> +			/* Cases were lower orders might still succeed */
> +			case SCAN_LACK_REFERENCED_PAGE:
> +			case SCAN_EXCEED_NONE_PTE:

How can we, having checked the max_pte_none, still fail due to this?

> +			case SCAN_EXCEED_SWAP_PTE:
> +			case SCAN_EXCEED_SHARED_PTE:
> +			case SCAN_PAGE_LOCK:
> +			case SCAN_PAGE_COUNT:
> +			case SCAN_PAGE_LRU:
> +			case SCAN_PAGE_NULL:
> +			case SCAN_DEL_PAGE_LRU:
> +			case SCAN_PTE_NON_PRESENT:
> +			case SCAN_PTE_UFFD_WP:
> +			case SCAN_ALLOC_HUGE_PAGE_FAIL:
> +				goto next_order;
> +			/* All other cases should stop collapse attempts */

I don't love us having a catch-all, plase spell out all cases.

> +			default:
> +				break;
>  			}
> +			break;

_Hate_ this double break. Just return collapsed please.

>  		}
>
>  next_order:
> --
> 2.51.0
>
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Nico Pache 2 months, 2 weeks ago
On Wed, Nov 19, 2025 at 5:06 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
> > There are cases where, if an attempted collapse fails, all subsequent
> > orders are guaranteed to also fail. Avoid these collapse attempts by
> > bailing out early.
> >
> > Signed-off-by: Nico Pache <npache@redhat.com>
> > ---
> >  mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
> >  1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index e2319bfd0065..54f5c7888e46 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
> >                       ret = collapse_huge_page(mm, address, referenced,
> >                                                unmapped, cc, mmap_locked,
> >                                                order, offset);
> > -                     if (ret == SCAN_SUCCEED) {
> > +
> > +                     /*
> > +                      * Analyze failure reason to determine next action:
> > +                      * - goto next_order: try smaller orders in same region
> > +                      * - continue: try other regions at same order
>
> The stack is a DFS, so this isn't correct, you may have pushed a bunch of higher
> order candidate mTHPs (I don't like plain 'region') which you will also true.
>
> > +                      * - break: stop all attempts (system-wide failure)
> > +                      */
>
> This comment isn't hugely helpful, just put the relevant comments next to each
> of the goto, continue, break (soon to be return re: review below) statements
> please.
>
> > +                     switch (ret) {
> > +                     /* Cases were we should continue to the next region */
> > +                     case SCAN_SUCCEED:
> >                               collapsed += 1UL << order;
> > +                             fallthrough;
> > +                     case SCAN_PTE_MAPPED_HUGEPAGE:
> >                               continue;
>
> Would we not run into trouble potentially in the previous patch's implementation
> of this examing candidate mTHPs that are part of an already existing huge page,
> or would a folio check in the collapse just make this wasted work?

whoops almost missed this comment.

There is a folio check in the __collapse_huge_page_isolate function
that handles this. I think Dev has some plans to try and add
partially-mapped support as the todo comment suggests (I think he
already has some patches from earlier mTHP work).

/*
* TODO: In some cases of partially-mapped folios, we'd actually
* want to collapse.
*/

>
> > +                     /* Cases were lower orders might still succeed */
> > +                     case SCAN_LACK_REFERENCED_PAGE:
> > +                     case SCAN_EXCEED_NONE_PTE:
>
> How can we, having checked the max_pte_none, still fail due to this?
>
> > +                     case SCAN_EXCEED_SWAP_PTE:
> > +                     case SCAN_EXCEED_SHARED_PTE:
> > +                     case SCAN_PAGE_LOCK:
> > +                     case SCAN_PAGE_COUNT:
> > +                     case SCAN_PAGE_LRU:
> > +                     case SCAN_PAGE_NULL:
> > +                     case SCAN_DEL_PAGE_LRU:
> > +                     case SCAN_PTE_NON_PRESENT:
> > +                     case SCAN_PTE_UFFD_WP:
> > +                     case SCAN_ALLOC_HUGE_PAGE_FAIL:
> > +                             goto next_order;
> > +                     /* All other cases should stop collapse attempts */
>
> I don't love us having a catch-all, plase spell out all cases.
>
> > +                     default:
> > +                             break;
> >                       }
> > +                     break;
>
> _Hate_ this double break. Just return collapsed please.
>
> >               }
> >
> >  next_order:
> > --
> > 2.51.0
> >
>
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Nico Pache 2 months, 2 weeks ago
On Wed, Nov 19, 2025 at 5:06 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
> > There are cases where, if an attempted collapse fails, all subsequent
> > orders are guaranteed to also fail. Avoid these collapse attempts by
> > bailing out early.
> >
> > Signed-off-by: Nico Pache <npache@redhat.com>
> > ---
> >  mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
> >  1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index e2319bfd0065..54f5c7888e46 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
> >                       ret = collapse_huge_page(mm, address, referenced,
> >                                                unmapped, cc, mmap_locked,
> >                                                order, offset);
> > -                     if (ret == SCAN_SUCCEED) {
> > +
> > +                     /*
> > +                      * Analyze failure reason to determine next action:
> > +                      * - goto next_order: try smaller orders in same region
> > +                      * - continue: try other regions at same order
>
> The stack is a DFS, so this isn't correct, you may have pushed a bunch of higher
> order candidate mTHPs (I don't like plain 'region') which you will also true.

Ah yeah so it should just be try other "regions" or in this case we
want something like "try to collapse another mTHP candidate in the
stack"

>
> > +                      * - break: stop all attempts (system-wide failure)
> > +                      */
>
> This comment isn't hugely helpful, just put the relevant comments next to each
> of the goto, continue, break (soon to be return re: review below) statements
> please.

ack

>
> > +                     switch (ret) {
> > +                     /* Cases were we should continue to the next region */
> > +                     case SCAN_SUCCEED:
> >                               collapsed += 1UL << order;
> > +                             fallthrough;
> > +                     case SCAN_PTE_MAPPED_HUGEPAGE:
> >                               continue;
>
> Would we not run into trouble potentially in the previous patch's implementation
> of this examing candidate mTHPs that are part of an already existing huge page,
> or would a folio check in the collapse just make this wasted work?
>
> > +                     /* Cases were lower orders might still succeed */
> > +                     case SCAN_LACK_REFERENCED_PAGE:
> > +                     case SCAN_EXCEED_NONE_PTE:
>
> How can we, having checked the max_pte_none, still fail due to this?

There are two phases in the khugepaged code, scan and collapse. in
between them is an alloc which requires dropping the lock, and
reconfirming values (in the collapse phase) after relocking.

During this time, the state of the PMD range might have changed and
our thresholds may have been exceeded.

This was true for PMD collapse and holds true for mTHP collapse too.

>
> > +                     case SCAN_EXCEED_SWAP_PTE:
> > +                     case SCAN_EXCEED_SHARED_PTE:
> > +                     case SCAN_PAGE_LOCK:
> > +                     case SCAN_PAGE_COUNT:
> > +                     case SCAN_PAGE_LRU:
> > +                     case SCAN_PAGE_NULL:
> > +                     case SCAN_DEL_PAGE_LRU:
> > +                     case SCAN_PTE_NON_PRESENT:
> > +                     case SCAN_PTE_UFFD_WP:
> > +                     case SCAN_ALLOC_HUGE_PAGE_FAIL:
> > +                             goto next_order;
> > +                     /* All other cases should stop collapse attempts */
>
> I don't love us having a catch-all, plase spell out all cases.

Ok sounds good, quick question, do we spell out ALL the enums or just
the ones that are reachable from here?

>
> > +                     default:
> > +                             break;
> >                       }
> > +                     break;
>
> _Hate_ this double break. Just return collapsed please.

ack, yeah that's much better. Thanks!

-- Nico

>
> >               }
> >
> >  next_order:
> > --
> > 2.51.0
> >
>
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Wei Yang 3 months ago
On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
>There are cases where, if an attempted collapse fails, all subsequent
>orders are guaranteed to also fail. Avoid these collapse attempts by
>bailing out early.
>
>Signed-off-by: Nico Pache <npache@redhat.com>
>---
> mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index e2319bfd0065..54f5c7888e46 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
> 			ret = collapse_huge_page(mm, address, referenced,
> 						 unmapped, cc, mmap_locked,
> 						 order, offset);
>-			if (ret == SCAN_SUCCEED) {
>+
>+			/*
>+			 * Analyze failure reason to determine next action:
>+			 * - goto next_order: try smaller orders in same region
>+			 * - continue: try other regions at same order
>+			 * - break: stop all attempts (system-wide failure)
>+			 */
>+			switch (ret) {
>+			/* Cases were we should continue to the next region */
>+			case SCAN_SUCCEED:
> 				collapsed += 1UL << order;
>+				fallthrough;
>+			case SCAN_PTE_MAPPED_HUGEPAGE:
> 				continue;
>+			/* Cases were lower orders might still succeed */
>+			case SCAN_LACK_REFERENCED_PAGE:
>+			case SCAN_EXCEED_NONE_PTE:
>+			case SCAN_EXCEED_SWAP_PTE:
>+			case SCAN_EXCEED_SHARED_PTE:
>+			case SCAN_PAGE_LOCK:
>+			case SCAN_PAGE_COUNT:
>+			case SCAN_PAGE_LRU:
>+			case SCAN_PAGE_NULL:
>+			case SCAN_DEL_PAGE_LRU:
>+			case SCAN_PTE_NON_PRESENT:
>+			case SCAN_PTE_UFFD_WP:
>+			case SCAN_ALLOC_HUGE_PAGE_FAIL:
>+				goto next_order;
>+			/* All other cases should stop collapse attempts */
>+			default:
>+				break;
> 			}
>+			break;

One question here:

Suppose we have iterated several orders and not collapse successfully yet. So
the mthp_bitmap_stack[] would look like this:

[8 7 6 6]
       ^
       |

Now we found this one pass the threshold check, but it fails with other
result.

Current code looks it would give up at all, but we may still have a chance to
collapse the above 3 range?

> 		}
> 
> next_order:
>-- 
>2.51.0

-- 
Wei Yang
Help you, Help me
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Nico Pache 2 months, 3 weeks ago
On Sat, Nov 8, 2025 at 7:40 PM Wei Yang <richard.weiyang@gmail.com> wrote:
>
> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
> >There are cases where, if an attempted collapse fails, all subsequent
> >orders are guaranteed to also fail. Avoid these collapse attempts by
> >bailing out early.
> >
> >Signed-off-by: Nico Pache <npache@redhat.com>
> >---
> > mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> >diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> >index e2319bfd0065..54f5c7888e46 100644
> >--- a/mm/khugepaged.c
> >+++ b/mm/khugepaged.c
> >@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
> >                       ret = collapse_huge_page(mm, address, referenced,
> >                                                unmapped, cc, mmap_locked,
> >                                                order, offset);
> >-                      if (ret == SCAN_SUCCEED) {
> >+
> >+                      /*
> >+                       * Analyze failure reason to determine next action:
> >+                       * - goto next_order: try smaller orders in same region
> >+                       * - continue: try other regions at same order
> >+                       * - break: stop all attempts (system-wide failure)
> >+                       */
> >+                      switch (ret) {
> >+                      /* Cases were we should continue to the next region */
> >+                      case SCAN_SUCCEED:
> >                               collapsed += 1UL << order;
> >+                              fallthrough;
> >+                      case SCAN_PTE_MAPPED_HUGEPAGE:
> >                               continue;
> >+                      /* Cases were lower orders might still succeed */
> >+                      case SCAN_LACK_REFERENCED_PAGE:
> >+                      case SCAN_EXCEED_NONE_PTE:
> >+                      case SCAN_EXCEED_SWAP_PTE:
> >+                      case SCAN_EXCEED_SHARED_PTE:
> >+                      case SCAN_PAGE_LOCK:
> >+                      case SCAN_PAGE_COUNT:
> >+                      case SCAN_PAGE_LRU:
> >+                      case SCAN_PAGE_NULL:
> >+                      case SCAN_DEL_PAGE_LRU:
> >+                      case SCAN_PTE_NON_PRESENT:
> >+                      case SCAN_PTE_UFFD_WP:
> >+                      case SCAN_ALLOC_HUGE_PAGE_FAIL:
> >+                              goto next_order;
> >+                      /* All other cases should stop collapse attempts */
> >+                      default:
> >+                              break;
> >                       }
> >+                      break;
>
> One question here:

Hi Wei Yang,

Sorry I forgot to get back to this email.

>
> Suppose we have iterated several orders and not collapse successfully yet. So
> the mthp_bitmap_stack[] would look like this:
>
> [8 7 6 6]
>        ^
>        |

so we always pop before pushing. So it would go

[9]
pop
if (collapse fails)
[8 8]
lets say we pop and successfully collapse a order 8
[8]
Then we fail the other order 8
[7 7]
now if we succeed the first order 7
[7 6 6]
I believe we are now in the state you wanted to describe.

>
> Now we found this one pass the threshold check, but it fails with other
> result.

ok lets say we pass the threshold checks, but the collapse fails for
any reason that is described in the
/* Cases were lower orders might still succeed */
In this case we would continue to order 5 (or lower). Once we are done
with this branch of the tree we go back to the other order 6 collapse.
and eventually the order 7.

>
> Current code looks it would give up at all, but we may still have a chance to
> collapse the above 3 range?

for cases under /* All other cases should stop collapse attempts */
Yes we would bail out and skip some collapses. I tried to think about
all the cases were we would still want to continue trying, vs cases
where the system is probably out of resources or hitting some major
failure, and we should just break out (as others will probably fail
too).

But this is also why I separated this patch out on its own. I was
hoping to have some more focus on the different cases, and make sure I
handled them in the best possible way. So I really appreciate the
question :)

* I did some digging through old message to find this *

I believe these are the remaining cases. If these are hit I figured
it's better to abort.

/* cases where we must stop collapse attempts */
case SCAN_CGROUP_CHARGE_FAIL:
case SCAN_COPY_MC:
case SCAN_ADDRESS_RANGE:
case SCAN_PMD_NULL:
case SCAN_ANY_PROCESS:
case SCAN_VMA_NULL:
case SCAN_VMA_CHECK:
case SCAN_SCAN_ABORT:
case SCAN_PMD_NONE:
case SCAN_PAGE_ANON:
case SCAN_PMD_MAPPED:
case SCAN_FAIL:

Please let me know if you think we should move these to either the
`continue` or `next order` cases.

Cheers,
-- Nico

>
> >               }
> >
> > next_order:
> >--
> >2.51.0
>
> --
> Wei Yang
> Help you, Help me
>
Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Posted by Wei Yang 2 months, 3 weeks ago
On Mon, Nov 17, 2025 at 11:16:53AM -0700, Nico Pache wrote:
>On Sat, Nov 8, 2025 at 7:40 PM Wei Yang <richard.weiyang@gmail.com> wrote:
>>
>> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
>> >There are cases where, if an attempted collapse fails, all subsequent
>> >orders are guaranteed to also fail. Avoid these collapse attempts by
>> >bailing out early.
>> >
>> >Signed-off-by: Nico Pache <npache@redhat.com>
>> >---
>> > mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
>> > 1 file changed, 30 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> >index e2319bfd0065..54f5c7888e46 100644
>> >--- a/mm/khugepaged.c
>> >+++ b/mm/khugepaged.c
>> >@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
>> >                       ret = collapse_huge_page(mm, address, referenced,
>> >                                                unmapped, cc, mmap_locked,
>> >                                                order, offset);
>> >-                      if (ret == SCAN_SUCCEED) {
>> >+
>> >+                      /*
>> >+                       * Analyze failure reason to determine next action:
>> >+                       * - goto next_order: try smaller orders in same region
>> >+                       * - continue: try other regions at same order
>> >+                       * - break: stop all attempts (system-wide failure)
>> >+                       */
>> >+                      switch (ret) {
>> >+                      /* Cases were we should continue to the next region */
>> >+                      case SCAN_SUCCEED:
>> >                               collapsed += 1UL << order;
>> >+                              fallthrough;
>> >+                      case SCAN_PTE_MAPPED_HUGEPAGE:
>> >                               continue;
>> >+                      /* Cases were lower orders might still succeed */
>> >+                      case SCAN_LACK_REFERENCED_PAGE:
>> >+                      case SCAN_EXCEED_NONE_PTE:
>> >+                      case SCAN_EXCEED_SWAP_PTE:
>> >+                      case SCAN_EXCEED_SHARED_PTE:
>> >+                      case SCAN_PAGE_LOCK:
>> >+                      case SCAN_PAGE_COUNT:
>> >+                      case SCAN_PAGE_LRU:
>> >+                      case SCAN_PAGE_NULL:
>> >+                      case SCAN_DEL_PAGE_LRU:
>> >+                      case SCAN_PTE_NON_PRESENT:
>> >+                      case SCAN_PTE_UFFD_WP:
>> >+                      case SCAN_ALLOC_HUGE_PAGE_FAIL:
>> >+                              goto next_order;
>> >+                      /* All other cases should stop collapse attempts */
>> >+                      default:
>> >+                              break;
>> >                       }
>> >+                      break;
>>
>> One question here:
>
>Hi Wei Yang,
>
>Sorry I forgot to get back to this email.
>

No problem, thanks for taking a look.

>>
>> Suppose we have iterated several orders and not collapse successfully yet. So
>> the mthp_bitmap_stack[] would look like this:
>>
>> [8 7 6 6]
>>        ^
>>        |
>
>so we always pop before pushing. So it would go
>
>[9]
>pop
>if (collapse fails)
>[8 8]
>lets say we pop and successfully collapse a order 8
>[8]
>Then we fail the other order 8
>[7 7]
>now if we succeed the first order 7
>[7 6 6]
>I believe we are now in the state you wanted to describe.
>
>>
>> Now we found this one pass the threshold check, but it fails with other
>> result.
>
>ok lets say we pass the threshold checks, but the collapse fails for
>any reason that is described in the
>/* Cases were lower orders might still succeed */
>In this case we would continue to order 5 (or lower). Once we are done
>with this branch of the tree we go back to the other order 6 collapse.
>and eventually the order 7.
>
>>
>> Current code looks it would give up at all, but we may still have a chance to
>> collapse the above 3 range?
>
>for cases under /* All other cases should stop collapse attempts */
>Yes we would bail out and skip some collapses. I tried to think about
>all the cases were we would still want to continue trying, vs cases
>where the system is probably out of resources or hitting some major
>failure, and we should just break out (as others will probably fail
>too).
>

Thanks, your explanation is very clear.

>But this is also why I separated this patch out on its own. I was
>hoping to have some more focus on the different cases, and make sure I
>handled them in the best possible way. So I really appreciate the
>question :)
>
>* I did some digging through old message to find this *
>
>I believe these are the remaining cases. If these are hit I figured
>it's better to abort.
>

I agree we need to take care of those cases.

>/* cases where we must stop collapse attempts */
>case SCAN_CGROUP_CHARGE_FAIL:
>case SCAN_COPY_MC:
>case SCAN_ADDRESS_RANGE:
>case SCAN_PMD_NULL:
>case SCAN_ANY_PROCESS:
>case SCAN_VMA_NULL:
>case SCAN_VMA_CHECK:
>case SCAN_SCAN_ABORT:
>case SCAN_PMD_NONE:
>case SCAN_PAGE_ANON:
>case SCAN_PMD_MAPPED:
>case SCAN_FAIL:
>
>Please let me know if you think we should move these to either the
>`continue` or `next order` cases.

Take a look into these cases, it looks good to me now.

Also one of my concern is this coding style is a little hard to maintain. In
case we introduce a new result, we should remember to add it here. Otherwise
we may stop the collapse too early.

While it maybe a separate work after this patch set merged.

>
>Cheers,
>-- Nico
>
>>
>> >               }
>> >
>> > next_order:
>> >--
>> >2.51.0
>>
>> --
>> Wei Yang
>> Help you, Help me
>>

-- 
Wei Yang
Help you, Help me