[PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders

Nico Pache posted 15 patches 3 months, 1 week ago
[PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders
Posted by Nico Pache 3 months, 1 week ago
Allow madvise_collapse to scan/collapse all mTHP orders without the
strict requirement of needing the PMD-order enabled.

Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/khugepaged.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b96a7327b9c0..6ea681b81647 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2930,11 +2930,13 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 	unsigned long hstart, hend, addr;
 	int thps = 0, last_fail = SCAN_FAIL;
 	bool mmap_locked = true;
+	unsigned long orders = vma_is_anonymous(vma) ?
+					THP_ORDERS_ALL_ANON : BIT(PMD_ORDER);
 
 	BUG_ON(vma->vm_start > start);
 	BUG_ON(vma->vm_end < end);
 
-	if (!thp_vma_allowable_order(vma, vma->vm_flags, 0, PMD_ORDER))
+	if (!thp_vma_allowable_orders(vma, vma->vm_flags, 0, orders))
 		return -EINVAL;
 
 	cc = kmalloc(sizeof(*cc), GFP_KERNEL);
@@ -2956,7 +2958,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 			mmap_read_lock(mm);
 			mmap_locked = true;
 			result = hugepage_vma_revalidate(mm, addr, false, &vma,
-							 cc, BIT(HPAGE_PMD_ORDER));
+							 cc, orders);
 			if (result  != SCAN_SUCCEED) {
 				last_fail = result;
 				goto out_nolock;
-- 
2.49.0
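
To make the mask selection in the hunk above concrete, here is a minimal user-space sketch of which orders end up in `orders`. It is not kernel code. `PMD_ORDER` is assumed to be 9 (4 KiB base pages, 2 MiB PMD, as on x86-64), the `THP_ORDERS_ALL_ANON` definition is assumed to cover orders 2 through `PMD_ORDER`, and `madvise_orders()` and `mask_has_order()` are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages and a 2 MiB PMD, so the PMD order is 9. */
#define PMD_ORDER 9
#define BIT(n) (1UL << (n))

/* Assumption: all anonymous orders means orders 2..PMD_ORDER (0 and 1 excluded). */
#define THP_ORDERS_ALL_ANON ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1)))

/* Hypothetical helper: the order mask the patch would pass for a VMA. */
static unsigned long madvise_orders(bool vma_is_anon)
{
	return vma_is_anon ? THP_ORDERS_ALL_ANON : BIT(PMD_ORDER);
}

/* Hypothetical helper: test whether a given order is set in the mask. */
static bool mask_has_order(unsigned long mask, unsigned int order)
{
	assert(order < 8 * sizeof(mask));
	return (mask & BIT(order)) != 0;
}
```

Under these assumptions the anonymous mask is 0x3fc (orders 2 through 9) and the non-anonymous mask is 0x200 (order 9 only), which is why the patch widens the check only for anonymous VMAs.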
Re: [PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders
Posted by Baolin Wang 3 months ago

On 2025/7/2 13:57, Nico Pache wrote:
> Allow madvise_collapse to scan/collapse all mTHP orders without the
> strict requirement of needing the PMD-order enabled.
> 
> Signed-off-by: Nico Pache <npache@redhat.com>

I am afraid we should drop this patch from the series, since Hugh 
explicitly opposed my modification of the madvise_collapse control logic 
in this thread[1].

Moreover, since we have not yet clarified how to handle the conflict 
between the semantics of madvise_collapse and the THP interfaces, we 
should keep the current logic[2] (meaning madvise_collapse still only 
allows PMD collapse).

If madvise_collapse is to support mTHP collapse, there will be more 
semantic conflicts to discuss.

[1] https://lore.kernel.org/all/cover.1750815384.git.baolin.wang@linux.alibaba.com/
[2] https://lore.kernel.org/all/23b8ad10-cd1f-45df-a25c-78d01c8af44f@redhat.com/

> ---
>   mm/khugepaged.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b96a7327b9c0..6ea681b81647 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2930,11 +2930,13 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>   	unsigned long hstart, hend, addr;
>   	int thps = 0, last_fail = SCAN_FAIL;
>   	bool mmap_locked = true;
> +	unsigned long orders = vma_is_anonymous(vma) ?
> +					THP_ORDERS_ALL_ANON : BIT(PMD_ORDER);
>   
>   	BUG_ON(vma->vm_start > start);
>   	BUG_ON(vma->vm_end < end);
>   
> -	if (!thp_vma_allowable_order(vma, vma->vm_flags, 0, PMD_ORDER))
> +	if (!thp_vma_allowable_orders(vma, vma->vm_flags, 0, orders))
>   		return -EINVAL;
>   
>   	cc = kmalloc(sizeof(*cc), GFP_KERNEL);
> @@ -2956,7 +2958,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>   			mmap_read_lock(mm);
>   			mmap_locked = true;
>   			result = hugepage_vma_revalidate(mm, addr, false, &vma,
> -							 cc, BIT(HPAGE_PMD_ORDER));
> +							 cc, orders);
>   			if (result  != SCAN_SUCCEED) {
>   				last_fail = result;
>   				goto out_nolock;
Re: [PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders
Posted by Andrew Morton 3 months ago
On Fri, 4 Jul 2025 14:11:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> On 2025/7/2 13:57, Nico Pache wrote:
> > Allow madvise_collapse to scan/collapse all mTHP orders without the
> > strict requirement of needing the PMD-order enabled.
> > 
> > Signed-off-by: Nico Pache <npache@redhat.com>
> 
> I am afraid we should drop this patch from the series, since Hugh 
> explicitly opposed my modification of the madvise_collapse control logic 
> in this thread[1].

Well it drops easily enough.  I don't know if it compiles yet.

Nico, can you confirm that you're OK with the droppage and that the
series is still good without this patch?

> Moreover, since we have not yet clarified how to handle the conflict 
> between the semantics of madvise_collapse and the THP interfaces, we 
> should keep the current logic[2] (means madvise_collapse still only 
> allows PMD collapse).
> 
> If madvise_collapse is to support mTHP collapse, there will be more 
> semantic conflicts to discuss.
> 
> [1] 
> https://lore.kernel.org/all/cover.1750815384.git.baolin.wang@linux.alibaba.com/
> [2] 
> https://lore.kernel.org/all/23b8ad10-cd1f-45df-a25c-78d01c8af44f@redhat.com/
Re: [PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders
Posted by Nico Pache 3 months ago
On Fri, Jul 4, 2025 at 3:14 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Fri, 4 Jul 2025 14:11:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
> > On 2025/7/2 13:57, Nico Pache wrote:
> > > Allow madvise_collapse to scan/collapse all mTHP orders without the
> > > strict requirement of needing the PMD-order enabled.
> > >
> > > Signed-off-by: Nico Pache <npache@redhat.com>
> >
> > I am afraid we should drop this patch from the series, since Hugh
> > explicitly opposed my modification of the madvise_collapse control logic
> > in this thread[1].
>
> Well it drops easily enough.  I don't know if it compiles yet.
>
> Nico, can you confirm that you're OK with the droppage and that the
> series is still good without this patch?
Yes, that is fine; it shouldn't conflict at all. Although this isn't
the only patch that will need modification... I'll reply to Baolin
inline here.
>
> > Moreover, since we have not yet clarified how to handle the conflict
> > between the semantics of madvise_collapse and the THP interfaces, we
> > should keep the current logic[2] (means madvise_collapse still only
> > allows PMD collapse).
Hi Baolin,

Sorry, I saw most of that discussion but missed that point by David.

We should come to an agreement on what the proper approach is, because
if not we'll ALWAYS have this problem and NEVER solve it (pun
intended).

Jokes aside, if we are leaving MADV_COLLAPSE untouched then I need to
make some slight modifications to this series so that MADV_COLLAPSE
never tries anything other than PMD collapse (should be a pretty small
change). Dropping this commit alone does not achieve that, but rather
just allows MADV_COLLAPSE to work if PMD is disabled, and one other
mTHP size is enabled. On second thought this isn't great either as
MADV_COLLAPSE ignores sysfs, so if you have any mTHP size enabled, it
will still continue to collapse to other PMD+mTHP sizes... wow this
really is a mess.
> >
> > If madvise_collapse is to support mTHP collapse, there will be more
> > semantic conflicts to discuss.
I guess a V9 is inevitable. I will drop mTHP support for
MADV_COLLAPSE, and drop this patch too. I'll let this series sit for a
week or so to gather reviews (and make sure nothing else is missing).

Thanks,
-- Nico
> >
> > [1]
> > https://lore.kernel.org/all/cover.1750815384.git.baolin.wang@linux.alibaba.com/
> > [2]
> > https://lore.kernel.org/all/23b8ad10-cd1f-45df-a25c-78d01c8af44f@redhat.com/
>
Re: [PATCH v8 11/15] khugepaged: allow madvise_collapse to check all anonymous mTHP orders
Posted by Baolin Wang 3 months ago

On 2025/7/8 12:37, Nico Pache wrote:
> On Fri, Jul 4, 2025 at 3:14 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> On Fri, 4 Jul 2025 14:11:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>>> On 2025/7/2 13:57, Nico Pache wrote:
>>>> Allow madvise_collapse to scan/collapse all mTHP orders without the
>>>> strict requirement of needing the PMD-order enabled.
>>>>
>>>> Signed-off-by: Nico Pache <npache@redhat.com>
>>>
>>> I am afraid we should drop this patch from the series, since Hugh
>>> explicitly opposed my modification of the madvise_collapse control logic
>>> in this thread[1].
>>
>> Well it drops easily enough.  I don't know if it compiles yet.
>>
>> Nico, can you confirm that you're OK with the droppage and that the
>> series is still good without this patch?
> Yes, that is fine; it shouldn't conflict at all. Although this isn't
> the only patch that will need modification... I'll reply to Baolin
> inline here.
>>
>>> Moreover, since we have not yet clarified how to handle the conflict
>>> between the semantics of madvise_collapse and the THP interfaces, we
>>> should keep the current logic[2] (means madvise_collapse still only
>>> allows PMD collapse).
> Hi Baolin,
> 
> Sorry I saw most of that discussion but missed that point by David.
> 
> We should come to an agreement on what the proper approach is, because
> if not we'll ALWAYS have this problem and NEVER solve it (pun
> intended).
> 
> Jokes aside, if we are leaving MADV_COLLAPSE untouched then I need to
> make some slight modifications to this series so that MADV_COLLAPSE
> never tries anything other than PMD collapse (should be a pretty small
> change). Dropping this commit alone does not achieve that, but rather
> just allows MADV_COLLAPSE to work if PMD is disabled, and one other
> mTHP size is enabled. On second thought this isn't great either as
> MADV_COLLAPSE ignores sysfs, so if you have any mTHP size enabled, it
> will still continue to collapse to other PMD+mTHP sizes... wow this
> really is a mess.

Yes, this is what I mentioned earlier about the conflict between 
MADV_COLLAPSE and mTHP collapse :(

>>> If madvise_collapse is to support mTHP collapse, there will be more
>>> semantic conflicts to discuss.
> I guess a V9 is inevitable, I will drop mTHP support for
> MADV_COLLAPSE, and drop this patch too. I'll let this series sit for a
> week or so to gather reviews (and make sure nothing else is missing).

Sure.