[PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()

Barry Song posted 1 patch 8 months, 1 week ago
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Barry Song 8 months, 1 week ago
From: Barry Song <v-songbaohua@oppo.com>

We've already found the VMA before calling madvise_free_single_vma(),
so calling walk_page_range() and doing find_vma() again seems
unnecessary. It also prevents potential optimizations for MADV_FREE
to use a per-VMA lock.

Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/madvise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index d408ffa404b3..c6a28a2d3ff8 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
 
 	mmu_notifier_invalidate_range_start(&range);
 	tlb_start_vma(tlb, vma);
-	walk_page_range(vma->vm_mm, range.start, range.end,
+	walk_page_range_vma(vma, range.start, range.end,
 			&madvise_free_walk_ops, tlb);
 	tlb_end_vma(tlb, vma);
 	mmu_notifier_invalidate_range_end(&range);
-- 
2.39.3 (Apple Git-146)
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Lorenzo Stoakes 8 months, 1 week ago
On Tue, Jun 03, 2025 at 01:31:54PM +1200, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> We've already found the VMA before calling madvise_free_single_vma(),
> so calling walk_page_range() and doing find_vma() again seems
> unnecessary. It also prevents potential optimizations for MADV_FREE
> to use a per-VMA lock.

Really nice find, great work Barry!

Lord above, why on earth weren't we doing this before...

>
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

I can't see anything wrong with this, so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  mm/madvise.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d408ffa404b3..c6a28a2d3ff8 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
>
>  	mmu_notifier_invalidate_range_start(&range);
>  	tlb_start_vma(tlb, vma);
> -	walk_page_range(vma->vm_mm, range.start, range.end,
> +	walk_page_range_vma(vma, range.start, range.end,
>  			&madvise_free_walk_ops, tlb);
>  	tlb_end_vma(tlb, vma);
>  	mmu_notifier_invalidate_range_end(&range);
> --
> 2.39.3 (Apple Git-146)
>
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Oscar Salvador 8 months, 1 week ago
On Tue, Jun 03, 2025 at 01:31:54PM +1200, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> We've already found the VMA before calling madvise_free_single_vma(),
> so calling walk_page_range() and doing find_vma() again seems
> unnecessary. It also prevents potential optimizations for MADV_FREE
> to use a per-VMA lock.
> 
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE Labs
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by David Hildenbrand 8 months, 1 week ago
On 03.06.25 03:31, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> We've already found the VMA before calling madvise_free_single_vma(),
> so calling walk_page_range() and doing find_vma() again seems
> unnecessary. It also prevents potential optimizations for MADV_FREE
> to use a per-VMA lock.
> 
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Dev Jain 8 months, 1 week ago
On 03/06/25 7:01 am, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> We've already found the VMA before calling madvise_free_single_vma(),
> so calling walk_page_range() and doing find_vma() again seems
> unnecessary. It also prevents potential optimizations for MADV_FREE
> to use a per-VMA lock.
>
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>   mm/madvise.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d408ffa404b3..c6a28a2d3ff8 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
>   
>   	mmu_notifier_invalidate_range_start(&range);
>   	tlb_start_vma(tlb, vma);
> -	walk_page_range(vma->vm_mm, range.start, range.end,
> +	walk_page_range_vma(vma, range.start, range.end,
>   			&madvise_free_walk_ops, tlb);
>   	tlb_end_vma(tlb, vma);
>   	mmu_notifier_invalidate_range_end(&range);

Can similar optimizations be made in madvise_willneed(), madvise_cold_page_range(), etc?
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Barry Song 8 months, 1 week ago
On Tue, Jun 3, 2025 at 6:11 PM Dev Jain <dev.jain@arm.com> wrote:
>
>
> On 03/06/25 7:01 am, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > We've already found the VMA before calling madvise_free_single_vma(),
> > so calling walk_page_range() and doing find_vma() again seems
> > unnecessary. It also prevents potential optimizations for MADV_FREE
> > to use a per-VMA lock.
> >
> > Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Vlastimil Babka <vbabka@suse.cz>
> > Cc: Jann Horn <jannh@google.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > ---
> >   mm/madvise.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index d408ffa404b3..c6a28a2d3ff8 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
> >
> >       mmu_notifier_invalidate_range_start(&range);
> >       tlb_start_vma(tlb, vma);
> > -     walk_page_range(vma->vm_mm, range.start, range.end,
> > +     walk_page_range_vma(vma, range.start, range.end,
> >                       &madvise_free_walk_ops, tlb);
> >       tlb_end_vma(tlb, vma);
> >       mmu_notifier_invalidate_range_end(&range);
>
> Can similar optimizations be made in madvise_willneed(), madvise_cold_page_range(), etc?

Yes, I think the same code flow applies to madvise_willneed,
madvise_cold_page_range, and similar functions, though my current
interest is more on madvise_free.

Let me prepare a v2 that includes those as well.

>

Thanks
Barry
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Lorenzo Stoakes 8 months, 1 week ago
On Tue, Jun 03, 2025 at 08:47:04PM +1200, Barry Song wrote:
> On Tue, Jun 3, 2025 at 6:11 PM Dev Jain <dev.jain@arm.com> wrote:
> >
> >
> > On 03/06/25 7:01 am, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > We've already found the VMA before calling madvise_free_single_vma(),
> > > so calling walk_page_range() and doing find_vma() again seems
> > > unnecessary. It also prevents potential optimizations for MADV_FREE
> > > to use a per-VMA lock.
> > >
> > > Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > Cc: David Hildenbrand <david@redhat.com>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Jann Horn <jannh@google.com>
> > > Cc: Suren Baghdasaryan <surenb@google.com>
> > > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > > Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > ---
> > >   mm/madvise.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > index d408ffa404b3..c6a28a2d3ff8 100644
> > > --- a/mm/madvise.c
> > > +++ b/mm/madvise.c
> > > @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
> > >
> > >       mmu_notifier_invalidate_range_start(&range);
> > >       tlb_start_vma(tlb, vma);
> > > -     walk_page_range(vma->vm_mm, range.start, range.end,
> > > +     walk_page_range_vma(vma, range.start, range.end,
> > >                       &madvise_free_walk_ops, tlb);
> > >       tlb_end_vma(tlb, vma);
> > >       mmu_notifier_invalidate_range_end(&range);
> >
> > Can similar optimizations be made in madvise_willneed(), madvise_cold_page_range(), etc?
>
> Yes, I think the same code flow applies to madvise_willneed,
> madvise_cold_page_range, and similar functions, though my current
> interest is more on madvise_free.
>
> Let me prepare a v2 that includes those as well.

FWIW Dev makes a great point here and I agree wholeheartedly, let's fix all such
cases...

As an aside, I wonder if we previously didn't do this because we hadn't
previously exposed the walk_page_range_vma() API or something?

>
> >
>
> Thanks
> Barry

Cheers!
Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by David Hildenbrand 8 months, 1 week ago
On 03.06.25 11:41, Lorenzo Stoakes wrote:
> On Tue, Jun 03, 2025 at 08:47:04PM +1200, Barry Song wrote:
>> On Tue, Jun 3, 2025 at 6:11 PM Dev Jain <dev.jain@arm.com> wrote:
>>>
>>>
>>> On 03/06/25 7:01 am, Barry Song wrote:
>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>
>>>> We've already found the VMA before calling madvise_free_single_vma(),
>>>> so calling walk_page_range() and doing find_vma() again seems
>>>> unnecessary. It also prevents potential optimizations for MADV_FREE
>>>> to use a per-VMA lock.
>>>>
>>>> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
>>>> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>>> Cc: David Hildenbrand <david@redhat.com>
>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>>> Cc: Jann Horn <jannh@google.com>
>>>> Cc: Suren Baghdasaryan <surenb@google.com>
>>>> Cc: Lokesh Gidra <lokeshgidra@google.com>
>>>> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
>>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>>>> ---
>>>>    mm/madvise.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/madvise.c b/mm/madvise.c
>>>> index d408ffa404b3..c6a28a2d3ff8 100644
>>>> --- a/mm/madvise.c
>>>> +++ b/mm/madvise.c
>>>> @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
>>>>
>>>>        mmu_notifier_invalidate_range_start(&range);
>>>>        tlb_start_vma(tlb, vma);
>>>> -     walk_page_range(vma->vm_mm, range.start, range.end,
>>>> +     walk_page_range_vma(vma, range.start, range.end,
>>>>                        &madvise_free_walk_ops, tlb);
>>>>        tlb_end_vma(tlb, vma);
>>>>        mmu_notifier_invalidate_range_end(&range);
>>>
>>> Can similar optimizations be made in madvise_willneed(), madvise_cold_page_range(), etc?
>>
>> Yes, I think the same code flow applies to madvise_willneed,
>> madvise_cold_page_range, and similar functions, though my current
>> interest is more on madvise_free.
>>
>> Let me prepare a v2 that includes those as well.
> 
> FWIW Dev makes a great point here and I agree wholeheartedly, let's fix all such
> cases...
> 
> As an aside, I wonder if we previously didn't do this because we hadn't
> previously exposed the walk_page_range_vma() API or something?

IIRC, yes:

commit e07cda5f232fac4de0925d8a4c92e51e41fa2f6e
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Oct 21 12:11:39 2022 +0200

     mm/pagewalk: add walk_page_range_vma()
     
     Let's add walk_page_range_vma(), which is similar to walk_page_vma(),
     however, is only interested in a subset of the VMA range.
     
     To be used in KSM code to stop using follow_page() next.


-- 
Cheers,

David / dhildenb

Re: [PATCH RFC] mm: madvise: use walk_page_range_vma() for madvise_free_single_vma()
Posted by Anshuman Khandual 8 months, 1 week ago
On 6/3/25 07:01, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> We've already found the VMA before calling madvise_free_single_vma(),
> so calling walk_page_range() and doing find_vma() again seems
> unnecessary. It also prevents potential optimizations for MADV_FREE
> to use a per-VMA lock.
> 
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Cc: Tangquan Zheng <zhengtangquan@oppo.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  mm/madvise.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d408ffa404b3..c6a28a2d3ff8 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -826,7 +826,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
>  
>  	mmu_notifier_invalidate_range_start(&range);
>  	tlb_start_vma(tlb, vma);
> -	walk_page_range(vma->vm_mm, range.start, range.end,
> +	walk_page_range_vma(vma, range.start, range.end,
>  			&madvise_free_walk_ops, tlb);
>  	tlb_end_vma(tlb, vma);
>  	mmu_notifier_invalidate_range_end(&range);

This indeed looks like an improvement dropping a redundant VMA look up
and also opening up potential optimization later using per-VMA lock.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>