From: Barry Song <v-songbaohua@oppo.com>
In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
memory remains allocated until it is either unmapped or memory
reclamation occurs.
The following small program demonstrates this behavior:
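#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SIZE (1024 * 1024 * 1024UL)

int main(void)
{
	void *p = malloc(SIZE);

	memset(p, 0x11, SIZE);	/* fault in the whole region */
	if (fork() == 0)
		_exit(0);	/* child exits without writing */
	memset(p, 0x12, SIZE);	/* parent write-faults every page (CoW) */
	printf("done\n");
	while (1)
		;		/* stay alive so "free -m" can be checked */
}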
For example, with 1024KiB mTHP enabled by:
echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled
(1) w/o the patch, it takes 2GiB,
Before running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754          84        5692           0          17        5669
Swap:             0           0           0
/ # /a.out &
/ # done
After running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754        2149        3627           0          19        3605
Swap:             0           0           0
(2) w/ the patch, it takes 1GiB only,
Before running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754          89        5687           0          17        5664
Swap:             0           0           0
/ # /a.out &
/ # done
After running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754        1122        4655           0          17        4632
Swap:             0           0           0
This patch migrates the last subpage to a small folio and immediately
returns the large folio to the system. It benefits both memory availability
and anti-fragmentation.
Cc: David Hildenbrand <david@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
-v2:
* return early, at the very beginning, for a large folio, per David's comment,
thanks!
-v1:
https://lore.kernel.org/linux-mm/20240308085653.124180-1-21cnbao@gmail.com/
mm/memory.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index e17669d4f72f..f2bc6dd15eb8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3498,6 +3498,16 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio)
 static bool wp_can_reuse_anon_folio(struct folio *folio,
 				    struct vm_area_struct *vma)
 {
+	/*
+	 * We could currently only reuse a subpage of a large folio if no
+	 * other subpages of the large folios are still mapped. However,
+	 * let's just consistently not reuse subpages even if we could
+	 * reuse in that scenario, and give back a large folio a bit
+	 * sooner.
+	 */
+	if (folio_test_large(folio))
+		return false;
+
 	/*
 	 * We have to verify under folio lock: these early checks are
 	 * just an optimization to avoid locking the folio and freeing
--
2.34.1
On 08.03.24 10:27, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
> large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
> memory remains allocated until it is either unmapped or memory
> reclamation occurs.
>
> The following small program can serve as evidence of this behavior
>
> main()
> {
> #define SIZE 1024 * 1024 * 1024UL
> void *p = malloc(SIZE);
> memset(p, 0x11, SIZE);
> if (fork() == 0)
> _exit(0);
> memset(p, 0x12, SIZE);
> printf("done\n");
> while(1);
> }
>
> For example, using a 1024KiB mTHP by:
> echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled
>
> (1) w/o the patch, it takes 2GiB,
>
> Before running the test program,
> / # free -m
> total used free shared buff/cache available
> Mem: 5754 84 5692 0 17 5669
> Swap: 0 0 0
>
> / # /a.out &
> / # done
>
> After running the test program,
> / # free -m
> total used free shared buff/cache available
> Mem: 5754 2149 3627 0 19 3605
> Swap: 0 0 0
>
> (2) w/ the patch, it takes 1GiB only,
>
> Before running the test program,
> / # free -m
> total used free shared buff/cache available
> Mem: 5754 89 5687 0 17 5664
> Swap: 0 0 0
>
> / # /a.out &
> / # done
>
> After running the test program,
> / # free -m
> total used free shared buff/cache available
> Mem: 5754 1122 4655 0 17 4632
> Swap: 0 0 0
>
> This patch migrates the last subpage to a small folio and immediately
> returns the large folio to the system. It benefits both memory availability
> and anti-fragmentation.
It might be a controversial optimization, and as Ryan said, there are
likely other cases where we'd want to migrate off of a THP earlier, if
possible.
But I like that it just handles large folios now in a consistent way for
the time being.
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
On 08/03/2024 09:34, David Hildenbrand wrote:
> On 08.03.24 10:27, Barry Song wrote:
>> From: Barry Song <v-songbaohua@oppo.com>
>>
>> In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
>> large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
>> memory remains allocated until it is either unmapped or memory
>> reclamation occurs.
>>
>> The following small program can serve as evidence of this behavior
>>
>> main()
>> {
>> #define SIZE 1024 * 1024 * 1024UL
>> void *p = malloc(SIZE);
>> memset(p, 0x11, SIZE);
>> if (fork() == 0)
>> _exit(0);
>> memset(p, 0x12, SIZE);
>> printf("done\n");
>> while(1);
>> }
>>
>> For example, using a 1024KiB mTHP by:
>> echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled
>>
>> (1) w/o the patch, it takes 2GiB,
>>
>> Before running the test program,
>> / # free -m
>> total used free shared buff/cache
>> available
>> Mem: 5754 84 5692 0 17
>> 5669
>> Swap: 0 0 0
>>
>> / # /a.out &
>> / # done
>>
>> After running the test program,
>> / # free -m
>> total used free shared buff/cache
>> available
>> Mem: 5754 2149 3627 0 19
>> 3605
>> Swap: 0 0 0
>>
>> (2) w/ the patch, it takes 1GiB only,
>>
>> Before running the test program,
>> / # free -m
>> total used free shared buff/cache
>> available
>> Mem: 5754 89 5687 0 17
>> 5664
>> Swap: 0 0 0
>>
>> / # /a.out &
>> / # done
>>
>> After running the test program,
>> / # free -m
>> total used free shared buff/cache
>> available
>> Mem: 5754 1122 4655 0 17
>> 4632
>> Swap: 0 0 0
>>
>> This patch migrates the last subpage to a small folio and immediately
>> returns the large folio to the system. It benefits both memory availability
>> and anti-fragmentation.
>
> It might be a controversial optimization, and as Ryan said, there are likely
> other cases where we'd want to migrate off of a THP earlier, if possible.
Personally, I think there might also be cases where you want to copy/reuse the
entire large folio. If your application is using 16K THPs perhaps it's a
bigger win to just treat it like a base page? I expect the cost/benefit will
change as the THP size increases?
I know we have previously talked about using a khugepaged-like mechanism to
re-collapse after CoW, but for the smaller sizes maybe that's just a lot more
effort?
>
> But I like that it just handles large folios now in a consistent way for the
> time being.
Yes agreed.
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
>>> This patch migrates the last subpage to a small folio and immediately
>>> returns the large folio to the system. It benefits both memory availability
>>> and anti-fragmentation.
>>
>> It might be a controversial optimization, and as Ryan said, there are likely
>> other cases where we'd want to migrate off of a THP earlier, if possible.
>
> Personally, I think there might also be cases where you want to copy/reuse the
> entire large folio. If your application is using 16K THPs perhaps it's a
> bigger win to just treat it like a base page? I expect the cost/benefit will
> change as the THP size increases?
Yes, I think for small folios (i.e., 16KiB) it will be rather easy to make a
decision. The larger the folio, the larger the page fault latency due to
scanning, copying, modifying, which can easily turn undesirable.
At least when it comes to page reuse, I have some simple backup plans for small
folios if I won't be able to make progress with my other approach. For larger
folios, it won't really work/be desirable, though.
--
Cheers,
David / dhildenb
On 08/03/2024 13:24, David Hildenbrand wrote:
>>>> This patch migrates the last subpage to a small folio and immediately
>>>> returns the large folio to the system. It benefits both memory availability
>>>> and anti-fragmentation.
>>>
>>> It might be a controversial optimization, and as Ryan said, there are likely
>>> other cases where we'd want to migrate off of a THP earlier, if possible.
>>
>> Personally, I think there might also be cases where you want to copy/reuse the
>> entire large folio. If your application is using 16K THPs perhaps it's a
>> bigger win to just treat it like a base page? I expect the cost/benefit will
>> change as the THP size increases?
>
> Yes, I think for small folios (i.e., 16KiB) it will be rather easy to make a
> decision. The larger the folio, the larger the page fault latency due to
> scanning, copying, modifying, which can easily turn undesirable.
>
> At least when it comes to page reuse, I have some simple backup plans for small
> folios if I won't be able to make progress with my other approach.
Do you mean "small large folios" here? i.e. order >= 1? If so, great!
> For larger
> folios, it won't really work/be desirable, though.
>
On 08.03.24 14:45, Ryan Roberts wrote:
> On 08/03/2024 13:24, David Hildenbrand wrote:
>>>>> This patch migrates the last subpage to a small folio and immediately
>>>>> returns the large folio to the system. It benefits both memory availability
>>>>> and anti-fragmentation.
>>>>
>>>> It might be a controversial optimization, and as Ryan said, there are likely
>>>> other cases where we'd want to migrate off of a THP earlier, if possible.
>>>
>>> Personally, I think there might also be cases where you want to copy/reuse the
>>> entire large folio. If your application is using 16K THPs perhaps it's a
>>> bigger win to just treat it like a base page? I expect the cost/benefit will
>>> change as the THP size increases?
>>
>> Yes, I think for small folios (i.e., 16KiB) it will be rather easy to make a
>> decision. The larger the folio, the larger the page fault latency due to
>> scanning, copying, modifying, which can easily turn undesirable.
>>
>> At least when it comes to page reuse, I have some simple backup plans for small
>> folios if I won't be able to make progress with my other approach.
>
> Do you mean "small large folios" here? i.e. order >= 1? If so, great!
*smaller*, yes :)
--
Cheers,
David / dhildenb