Use PTE batching to optimize mremap().

Mapping 512K of memory, memsetting it, remapping it to src + 512K, and
munmapping it 10,000 times, the average execution time reduces from 1.9 to
1.2 seconds, giving a 37% performance optimization. (Apple M3)

Dev Jain (3):
  mm: Call pointers to ptes as ptep
  mm: Add generic helper to hint a large folio
  mm: Optimize mremap() by PTE batching

 include/linux/pgtable.h | 16 +++++++++++++++
 mm/mremap.c             | 44 +++++++++++++++++++++++++++--------------
 2 files changed, 45 insertions(+), 15 deletions(-)

--
2.30.2
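A minimal sketch of the benchmark described above, reconstructed from the cover-letter prose (the actual test program was not posted in this thread, so the fill pattern, the exact mremap flags, and the timing code are assumptions; hitting the batched path presumably also requires the 512K region to be backed by large folios, e.g. with anon mTHP enabled):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/mman.h>

#define LEN	(512 * 1024UL)
#define ITERS	10000

int main(void)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);

	for (int i = 0; i < ITERS; i++) {
		/* Map 512K of anonymous memory. */
		char *src = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (src == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* Fault in and dirty every page of the mapping. */
		memset(src, 0xab, LEN);

		/*
		 * Move the mapping to src + 512K; MREMAP_FIXED forces the
		 * destination, so the page tables for the whole range must
		 * be moved.
		 */
		char *dst = mremap(src, LEN, LEN,
				   MREMAP_MAYMOVE | MREMAP_FIXED, src + LEN);
		if (dst == MAP_FAILED) {
			perror("mremap");
			return 1;
		}

		munmap(dst, LEN);
	}

	clock_gettime(CLOCK_MONOTONIC, &end);
	printf("total: %.3f s\n",
	       (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9);
	return 0;
}

Built with cc -O2 and run as-is, the per-iteration average is the reported total divided by 10,000.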
On 5/6/25 10:30, Dev Jain wrote:
> Use PTE batching to optimize mremap().
>
> Mapping 512K of memory, memsetting it, remapping it to src + 512K, and
> munmapping it 10,000 times, the average execution time reduces from 1.9 to
> 1.2 seconds, giving a 37% performance optimization. (Apple M3)

That's an impressive improvement. But could you please re-organize the test
description into a pseudo-code format, or better, provide the test program
itself (which should be compact anyway), just to be clearer about the
scenario where this helps?

>
> Dev Jain (3):
>   mm: Call pointers to ptes as ptep
>   mm: Add generic helper to hint a large folio
>   mm: Optimize mremap() by PTE batching
>
>  include/linux/pgtable.h | 16 +++++++++++++++
>  mm/mremap.c             | 44 +++++++++++++++++++++++++++--------------
>  2 files changed, 45 insertions(+), 15 deletions(-)
>
On 06/05/25 2:46 pm, Anshuman Khandual wrote:
> On 5/6/25 10:30, Dev Jain wrote:
>> Use PTE batching to optimize mremap().
>>
>> Mapping 512K of memory, memsetting it, remapping it to src + 512K, and
>> munmapping it 10,000 times, the average execution time reduces from 1.9 to
>> 1.2 seconds, giving a 37% performance optimization. (Apple M3)
>
> That's an impressive improvement. But could you please re-organize the test
> description into a pseudo-code format, or better, provide the test program
> itself (which should be compact anyway), just to be clearer about the
> scenario where this helps?

Sure.

>
>>
>> Dev Jain (3):
>>   mm: Call pointers to ptes as ptep
>>   mm: Add generic helper to hint a large folio
>>   mm: Optimize mremap() by PTE batching
>>
>>  include/linux/pgtable.h | 16 +++++++++++++++
>>  mm/mremap.c             | 44 +++++++++++++++++++++++++++--------------
>>  2 files changed, 45 insertions(+), 15 deletions(-)
>>
On Tue, May 06, 2025 at 03:52:39PM +0530, Dev Jain wrote:
>
> On 06/05/25 2:46 pm, Anshuman Khandual wrote:
> > On 5/6/25 10:30, Dev Jain wrote:
> > > Use PTE batching to optimize mremap().
> > >
> > > Mapping 512K of memory, memsetting it, remapping it to src + 512K, and
> > > munmapping it 10,000 times, the average execution time reduces from 1.9 to
> > > 1.2 seconds, giving a 37% performance optimization. (Apple M3)
> >
> > That's an impressive improvement. But could you please re-organize the test
> > description into a pseudo-code format, or better, provide the test program
> > itself (which should be compact anyway), just to be clearer about the
> > scenario where this helps?
>
> Sure.

I echo Anshuman's comment; I would definitely like to see that.

And wrt the perf improvement, whether it's a microbenchmark or not, that's a
great result, so well done :) I echo this also!

However, it'd be good to see some more detail here as well. You're currently
missing everything about why this improvement happens: what the intent of the
series is, anything about large folios, under what circumstances you'll see an
improvement, etc.

While this might duplicate comments you've made elsewhere, it's mandatory for a
series, and Andrew is unlikely to take this without it.

In mm we place the contents of the cover letter in the first commit in the
series, so it gets stored for posterity also!

Cheers, Lorenzo

>
> >
> > >
> > > Dev Jain (3):
> > >   mm: Call pointers to ptes as ptep
> > >   mm: Add generic helper to hint a large folio
> > >   mm: Optimize mremap() by PTE batching
> > >
> > >  include/linux/pgtable.h | 16 +++++++++++++++
> > >  mm/mremap.c             | 44 +++++++++++++++++++++++++++--------------
> > >  2 files changed, 45 insertions(+), 15 deletions(-)
> > >
> >
On 06/05/25 4:14 pm, Lorenzo Stoakes wrote:
> On Tue, May 06, 2025 at 03:52:39PM +0530, Dev Jain wrote:
>> On 06/05/25 2:46 pm, Anshuman Khandual wrote:
>>> On 5/6/25 10:30, Dev Jain wrote:
>>>> Use PTE batching to optimize mremap().
>>>>
>>>> Mapping 512K of memory, memsetting it, remapping it to src + 512K, and
>>>> munmapping it 10,000 times, the average execution time reduces from 1.9 to
>>>> 1.2 seconds, giving a 37% performance optimization. (Apple M3)
>>>
>>> That's an impressive improvement. But could you please re-organize the test
>>> description into a pseudo-code format, or better, provide the test program
>>> itself (which should be compact anyway), just to be clearer about the
>>> scenario where this helps?
>>
>> Sure.
>
> I echo Anshuman's comment; I would definitely like to see that.
>
> And wrt the perf improvement, whether it's a microbenchmark or not, that's a
> great result, so well done :) I echo this also!
>
> However, it'd be good to see some more detail here as well. You're currently
> missing everything about why this improvement happens: what the intent of the
> series is, anything about large folios, under what circumstances you'll see an
> improvement, etc.
>
> While this might duplicate comments you've made elsewhere, it's mandatory for a
> series, and Andrew is unlikely to take this without it.
>
> In mm we place the contents of the cover letter in the first commit in the
> series, so it gets stored for posterity also!
>
> Cheers, Lorenzo

Thanks for your feedback. I'll make the required changes.

>
>>
>>>
>>>>
>>>> Dev Jain (3):
>>>>   mm: Call pointers to ptes as ptep
>>>>   mm: Add generic helper to hint a large folio
>>>>   mm: Optimize mremap() by PTE batching
>>>>
>>>>  include/linux/pgtable.h | 16 +++++++++++++++
>>>>  mm/mremap.c             | 44 +++++++++++++++++++++++++++--------------
>>>>  2 files changed, 45 insertions(+), 15 deletions(-)
>>>>
>>
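On the "why this improvement happens" question raised above: as I understand the series, the idea is that when consecutive source PTEs map pages of one large folio, mremap's PTE-moving loop can clear the whole run at the source and re-install it at the destination in one batch, instead of one PTE at a time, which saves per-PTE overhead (and contiguous-PTE management on arm64). The snippet below is only an illustrative sketch of that idea, not the actual patch: nr_ptes_mapping_same_folio() is a hypothetical stand-in for the real batch-detection helper, it assumes all PTEs in the range are present, and it omits the soft-dirty, swap-entry, and arch move_pte() handling the real code needs.

#include <linux/mm.h>
#include <linux/pgtable.h>

/*
 * Hypothetical helper: how many consecutive PTEs starting at ptep map
 * pages of the same (large) folio?  Stands in for the series' real
 * batch-detection logic.
 */
static int nr_ptes_mapping_same_folio(struct vm_area_struct *vma,
				      unsigned long addr, pte_t *ptep,
				      pte_t pte);

/* Illustrative sketch only -- not the actual kernel patch. */
static void move_present_ptes_sketch(struct vm_area_struct *vma,
				     unsigned long old_addr,
				     unsigned long new_addr,
				     unsigned long end,
				     pte_t *old_ptep, pte_t *new_ptep)
{
	struct mm_struct *mm = vma->vm_mm;

	while (old_addr < end) {
		pte_t pte = ptep_get(old_ptep);
		int nr = nr_ptes_mapping_same_folio(vma, old_addr,
						    old_ptep, pte);

		/*
		 * Clear the whole run at the source and install it at the
		 * destination in one go, rather than per PTE.
		 */
		pte = get_and_clear_full_ptes(mm, old_addr, old_ptep, nr, 0);
		set_ptes(mm, new_addr, new_ptep, pte, nr);

		old_addr += nr * PAGE_SIZE;
		new_addr += nr * PAGE_SIZE;
		old_ptep += nr;
		new_ptep += nr;
	}
}

With the 512K benchmark above backed by large folios, nr would typically cover many PTEs per iteration, which is presumably where the measured reduction from 1.9 to 1.2 seconds comes from.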