Similar to folio_referenced_one(), we can apply batched unmapping to file-backed
large folios to optimize the performance of file folio reclamation.
Performance testing:
Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to
reclaim 8G file-backed folios via the memory.reclaim interface. I observe a
75% performance improvement on my Arm64 32-core server.
W/o patch:
real 0m1.018s
user 0m0.000s
sys 0m1.018s
W/ patch:
real 0m0.249s
user 0m0.000s
sys 0m0.249s
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/rmap.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index ec232165c47d..4c9d5777c8da 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
 	end_addr = pmd_addr_end(addr, vma->vm_end);
 	max_nr = (end_addr - addr) >> PAGE_SHIFT;
 
-	/* We only support lazyfree batching for now ... */
-	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
+	/* We only support lazyfree or file folios batching for now ... */
+	if (folio_test_anon(folio) && folio_test_swapbacked(folio))
 		return 1;
+
 	if (pte_unused(pte))
 		return 1;
 
@@ -2223,7 +2224,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 *
 			 * See Documentation/mm/mmu_notifier.rst
 			 */
-			dec_mm_counter(mm, mm_counter_file(folio));
+			add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
 		}
 discard:
 		if (unlikely(folio_test_hugetlb(folio))) {
--
2.47.3
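As a point of reference for the test described in the commit message above, a
userspace reproducer can look roughly like the sketch below. The file path,
cgroup path, base page size and exact sizes are illustrative assumptions, not
the setup that produced the numbers above.

/*
 * Minimal reproducer sketch (illustrative assumptions only): populate clean
 * page cache by mmap()ing a large file from inside a cgroup, then request
 * partial reclaim through memory.reclaim and time that write.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_SIZE	(10UL << 30)	/* 10G of clean file-backed folios */
#define STEP		4096UL		/* assumed base page size */

int main(void)
{
	const char *reclaim = "/sys/fs/cgroup/test/memory.reclaim"; /* assumed cgroup */
	int fd = open("/mnt/test/file-10g", O_RDONLY);	/* assumed test file */
	unsigned long i;
	char *p;
	int cg;

	if (fd < 0) {
		perror("open test file");
		return 1;
	}

	p = mmap(NULL, MAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fault in every page so the file folios are mapped by this task. */
	for (i = 0; i < MAP_SIZE; i += STEP)
		(void)*(volatile char *)(p + i);

	/* Reclaim 8G from the cgroup; this is the step being timed. */
	cg = open(reclaim, O_WRONLY);
	if (cg < 0 || write(cg, "8G", 2) < 0)
		perror("memory.reclaim");
	if (cg >= 0)
		close(cg);

	munmap(p, MAP_SIZE);
	close(fd);
	return 0;
}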
On Thu, Dec 11, 2025 at 04:16:56PM +0800, Baolin Wang wrote:
> Similar to folio_referenced_one(), we can apply batched unmapping for file
> large folios to optimize the performance of file folios reclamation.
>
> Performance testing:
> Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to
> reclaim 8G file-backed folios via the memory.reclaim interface. I can observe
> 75% performance improvement on my Arm64 32-core server.
Again, you must test on non-arm64 architectures and report the numbers for this
also.
>
> W/o patch:
> real 0m1.018s
> user 0m0.000s
> sys 0m1.018s
>
> W/ patch:
> real 0m0.249s
> user 0m0.000s
> sys 0m0.249s
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> mm/rmap.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index ec232165c47d..4c9d5777c8da 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> end_addr = pmd_addr_end(addr, vma->vm_end);
> max_nr = (end_addr - addr) >> PAGE_SHIFT;
>
> - /* We only support lazyfree batching for now ... */
> - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> + /* We only support lazyfree or file folios batching for now ... */
> + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
Why is it now ok to support file-backed batched unmapping when it wasn't in
Barry's series (see [0])? You don't seem to be justifying this?
[0]:https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
> return 1;
> +
> if (pte_unused(pte))
> return 1;
>
> @@ -2223,7 +2224,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> *
> * See Documentation/mm/mmu_notifier.rst
> */
> - dec_mm_counter(mm, mm_counter_file(folio));
> + add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
Was this just a bug before?
> }
> discard:
> if (unlikely(folio_test_hugetlb(folio))) {
> --
> 2.47.3
>
Thanks, Lorenzo
On 2025/12/15 20:38, Lorenzo Stoakes wrote:
> On Thu, Dec 11, 2025 at 04:16:56PM +0800, Baolin Wang wrote:
>> Similar to folio_referenced_one(), we can apply batched unmapping for file
>> large folios to optimize the performance of file folios reclamation.
>>
>> Performance testing:
>> Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to
>> reclaim 8G file-backed folios via the memory.reclaim interface. I can observe
>> 75% performance improvement on my Arm64 32-core server.
>
> Again, you must test on non-arm64 architectures and report the numbers for this
> also.

Yes, I've tested on the x86 machine, and will add the data in the commit
message.

>> W/o patch:
>> real 0m1.018s
>> user 0m0.000s
>> sys 0m1.018s
>>
>> W/ patch:
>> real 0m0.249s
>> user 0m0.000s
>> sys 0m0.249s
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>> mm/rmap.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index ec232165c47d..4c9d5777c8da 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
>> end_addr = pmd_addr_end(addr, vma->vm_end);
>> max_nr = (end_addr - addr) >> PAGE_SHIFT;
>>
>> - /* We only support lazyfree batching for now ... */
>> - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
>> + /* We only support lazyfree or file folios batching for now ... */
>> + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
>
> Why is it now ok to support file-backed batched unmapping when it wasn't in
> Barry's series (see [0])? You don't seem to be justifying this?

Barry's series[0] is merely aimed at optimizing lazyfree anonymous large
folios and does not continue to optimize anonymous large folios or
file-backed large folios at that point.

Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous
large folios. As for file-backed large folios, the batched unmapping support
is relatively simple, since we only need to clear the PTE entries for
file-backed large folios.

> [0]:https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
[1] https://lore.kernel.org/all/20250513084620.58231-1-21cnbao@gmail.com/

>> return 1;
>> +
>> if (pte_unused(pte))
>> return 1;
>>
>> @@ -2223,7 +2224,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> *
>> * See Documentation/mm/mmu_notifier.rst
>> */
>> - dec_mm_counter(mm, mm_counter_file(folio));
>> + add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
>
> Was this just a bug before?

Nope. Before this patch, we never supported batched unmapping for
file-backed large folios, so the 'nr_pages' was always 1. After this patch,
we should use the number of pages in this file-backed large folio.
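A note on the accounting question answered above: dec_mm_counter() decrements
by exactly one the same per-mm RSS counter that add_mm_counter() adjusts by an
arbitrary value, so switching to add_mm_counter() with -nr_pages keeps the old
behaviour whenever nr_pages is 1 and simply generalises it for a batch. A
rough illustration, not code from the patch:

/* Illustration only: the two calls below are alternatives, not a sequence. */

/* Old form, one PTE at a time (equivalent to add_mm_counter(mm, member, -1)): */
dec_mm_counter(mm, mm_counter_file(folio));

/*
 * New form, one call per batch; nr_pages == 1 when nothing could be batched,
 * so the single-page case is accounted exactly as before.
 */
add_mm_counter(mm, mm_counter_file(folio), -nr_pages);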
On Tue, Dec 16, 2025 at 01:48:52PM +0800, Baolin Wang wrote:
>
>
> On 2025/12/15 20:38, Lorenzo Stoakes wrote:
> > On Thu, Dec 11, 2025 at 04:16:56PM +0800, Baolin Wang wrote:
> > > Similar to folio_referenced_one(), we can apply batched unmapping for file
> > > large folios to optimize the performance of file folios reclamation.
> > >
> > > Performance testing:
> > > Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to
> > > reclaim 8G file-backed folios via the memory.reclaim interface. I can observe
> > > 75% performance improvement on my Arm64 32-core server.
> >
> > Again, you must test on non-arm64 architectures and report the numbers for this
> > also.
>
> Yes, I've tested on the x86 machine, and will add the data in the commit
> message.
>
> > > W/o patch:
> > > real 0m1.018s
> > > user 0m0.000s
> > > sys 0m1.018s
> > >
> > > W/ patch:
> > > real 0m0.249s
> > > user 0m0.000s
> > > sys 0m0.249s
> > >
> > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > > ---
> > > mm/rmap.c | 7 ++++---
> > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index ec232165c47d..4c9d5777c8da 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> > > end_addr = pmd_addr_end(addr, vma->vm_end);
> > > max_nr = (end_addr - addr) >> PAGE_SHIFT;
> > >
> > > - /* We only support lazyfree batching for now ... */
> > > - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> > > + /* We only support lazyfree or file folios batching for now ... */
> > > + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
> >
> > Why is it now ok to support file-backed batched unmapping when it wasn't in
> > Barry's series (see [0])? You don't seem to be justifying this?
>
> Barry's series[0] is merely aimed at optimizing lazyfree anonymous large
> folios and does not continue to optimize anonymous large folios or
> file-backed large folios at that point.
>
> Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous
> large folios. As for file-backed large folios, the batched unmapping support
> is relatively simple, since we only need to clear the PTE entries for
> file-backed large folios.

Yeah, but he sent an entire patch changing a bunch of logic to accommodate
this, you're just changing the conditional and not really justifying it in
the commit message?

It really needs a 'it is safe to allow this for file-backed because blah
blah blah'. Is this relying on the prior commits you've added? If so you
should say so.

If it was as easy as just changing the conditional then it begs the question
as to why Barry didn't do that in the first place :)

>
> > [0]:https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
>
> [1] https://lore.kernel.org/all/20250513084620.58231-1-21cnbao@gmail.com/
>
> > > return 1;
> > > +
> > > if (pte_unused(pte))
> > > return 1;
> > >
> > > @@ -2223,7 +2224,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> > > *
> > > * See Documentation/mm/mmu_notifier.rst
> > > */
> > > - dec_mm_counter(mm, mm_counter_file(folio));
> > > + add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
> >
> > Was this just a bug before?
>
> Nope. Before this patch, we never supported batched unmapping for
> file-backed large folios, so the 'nr_pages' was always 1. After this patch,
> we should use the number of pages in this file-backed large folio.

Right ok :)

Cheers, Lorenzo
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index ec232165c47d..4c9d5777c8da 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> >> end_addr = pmd_addr_end(addr, vma->vm_end);
> >> max_nr = (end_addr - addr) >> PAGE_SHIFT;
> >>
> >> - /* We only support lazyfree batching for now ... */
> >> - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> >> + /* We only support lazyfree or file folios batching for now ... */
> >> + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
> >
> > Why is it now ok to support file-backed batched unmapping when it wasn't in
> > Barry's series (see [0])? You don't seem to be justifying this?
>
> Barry's series[0] is merely aimed at optimizing lazyfree anonymous large
> folios and does not continue to optimize anonymous large folios or
> file-backed large folios at that point.

Yep. At that time, I didn’t have an Android machine with a filesystem
that supported large folios, so I focused on lazyfree. But I
agree that lazyfree anon folios and file folios are quite
similar.

>
> Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous
> large folios. As for file-backed large folios, the batched unmapping
> support is relatively simple, since we only need to clear the PTE
> entries for file-backed large folios.

Yep. It is actually quite straightforward to go from lazyfree
anon folios to file folios. Swap-backed anon folios are much
more tricky, though.

>
> > [0]:https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
>
> [1] https://lore.kernel.org/all/20250513084620.58231-1-21cnbao@gmail.com/

Thanks
Barry
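Putting the two replies above together, the resulting batching policy can be
read roughly as follows; this is a paraphrase of the check in
folio_unmap_pte_batch() after the patch, not a verbatim copy of the function:

/* Paraphrased sketch of the post-patch batching policy: */
if (folio_test_anon(folio) && folio_test_swapbacked(folio))
	return 1;	/* swap-backed anon: needs a swap entry per PTE, no batching yet */

/*
 * Lazyfree anon folios (anon && !swapbacked) and file-backed folios (!anon)
 * only need their PTEs cleared, which is why batching them is straightforward.
 */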
On 2025/12/16 14:13, Barry Song wrote:
>>>>
>>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>>> index ec232165c47d..4c9d5777c8da 100644
>>>> --- a/mm/rmap.c
>>>> +++ b/mm/rmap.c
>>>> @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
>>>> end_addr = pmd_addr_end(addr, vma->vm_end);
>>>> max_nr = (end_addr - addr) >> PAGE_SHIFT;
>>>>
>>>> - /* We only support lazyfree batching for now ... */
>>>> - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
>>>> + /* We only support lazyfree or file folios batching for now ... */
>>>> + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
>>>
>>> Why is it now ok to support file-backed batched unmapping when it wasn't in
>>> Barry's series (see [0])? You don't seem to be justifying this?
>>
>> Barry's series[0] is merely aimed at optimizing lazyfree anonymous large
>> folios and does not continue to optimize anonymous large folios or
>> file-backed large folios at that point.
>
> Yep. At that time, I didn’t have an Android machine with a filesystem
> that supported large folios, so I focused on lazyfree. But I
> agree that lazyfree anon folios and file folios are quite
> similar.
>
>>
>> Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous
>> large folios. As for file-backed large folios, the batched unmapping
>> support is relatively simple, since we only need to clear the PTE
>> entries for file-backed large folios.
>
> Yep. It is actually quite straightforward to go from lazyfree
> anon folios to file folios. Swap-backed anon folios are much
> more tricky, though.

Agree. Thanks Barry for reviewing and confirming.
On Tue, Dec 16, 2025 at 02:22:11PM +0800, Baolin Wang wrote:
>
>
> On 2025/12/16 14:13, Barry Song wrote:
> > > > >
> > > > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > > > index ec232165c47d..4c9d5777c8da 100644
> > > > > --- a/mm/rmap.c
> > > > > +++ b/mm/rmap.c
> > > > > @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> > > > > end_addr = pmd_addr_end(addr, vma->vm_end);
> > > > > max_nr = (end_addr - addr) >> PAGE_SHIFT;
> > > > >
> > > > > - /* We only support lazyfree batching for now ... */
> > > > > - if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> > > > > + /* We only support lazyfree or file folios batching for now ... */
> > > > > + if (folio_test_anon(folio) && folio_test_swapbacked(folio))
> > > >
> > > > Why is it now ok to support file-backed batched unmapping when it wasn't in
> > > > Barry's series (see [0])? You don't seem to be justifying this?
> > >
> > > Barry's series[0] is merely aimed at optimizing lazyfree anonymous large
> > > folios and does not continue to optimize anonymous large folios or
> > > file-backed large folios at that point.
> >
> > Yep. At that time, I didn’t have an Android machine with a filesystem
> > that supported large folios, so I focused on lazyfree. But I
> > agree that lazyfree anon folios and file folios are quite
> > similar.
> >
> > >
> > > Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous
> > > large folios. As for file-backed large folios, the batched unmapping
> > > support is relatively simple, since we only need to clear the PTE
> > > entries for file-backed large folios.
> >
> > Yep. It is actually quite straightforward to go from lazyfree
> > anon folios to file folios. Swap-backed anon folios are much
> > more tricky, though.
>
> Agree. Thanks Barry for reviewing and confirming.

OK that makes me less concerned, but you do need to put some more
justification in the commit message.

Cheers, Lorenzo
On Thu, Dec 11, 2025 at 4:17 PM Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
> Similar to folio_referenced_one(), we can apply batched unmapping for file
> large folios to optimize the performance of file folios reclamation.
>
> Performance testing:
> Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to
> reclaim 8G file-backed folios via the memory.reclaim interface. I can observe
> 75% performance improvement on my Arm64 32-core server.
>
> W/o patch:
> real 0m1.018s
> user 0m0.000s
> sys 0m1.018s
>
> W/ patch:
> real 0m0.249s
> user 0m0.000s
> sys 0m0.249s
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

It appears quite straightforward to introduce file folios support
based on the current lazyfree implementation.

Acked-by: Barry Song <baohua@kernel.org>