From: Shivam Kalra <shivamkalra98@zohomail.in>

When vrealloc() shrinks an allocation and the new size crosses a page
boundary, unmap and free the tail pages that are no longer needed. This
reclaims physical memory that was previously wasted for the lifetime
of the allocation.

The heuristic is simple: always free when at least one full page becomes
unused. Huge page allocations (page_order > 0) are skipped, as partial
freeing would require splitting.

The virtual address reservation (vm->size / vmap_area) is intentionally
kept unchanged, preserving the address for potential future grow-in-place
support.

Fix the grow-in-place check to compare against vm->nr_pages rather than
get_vm_area_size(), since the latter reflects the virtual reservation
which does not shrink. Without this fix, a grow after shrink would
access freed pages.

Signed-off-by: Shivam Kalra <shivamkalra98@zohomail.in>
---
 mm/vmalloc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b29bf58c0e3f..2c455f2038f6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -4345,14 +4345,23 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
 			goto need_realloc;
 	}
 
-	/*
-	 * TODO: Shrink the vm_area, i.e. unmap and free unused pages. What
-	 * would be a good heuristic for when to shrink the vm_area?
-	 */
 	if (size <= old_size) {
+		unsigned int new_nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
 		/* Zero out "freed" memory, potentially for future realloc. */
 		if (want_init_on_free() || want_init_on_alloc(flags))
 			memset((void *)p + size, 0, old_size - size);
+
+		/* Free tail pages when shrink crosses a page boundary. */
+		if (new_nr_pages < vm->nr_pages && !vm_area_page_order(vm)) {
+			unsigned long addr = (unsigned long)p;
+
+			vunmap_range(addr + (new_nr_pages << PAGE_SHIFT),
+				     addr + (vm->nr_pages << PAGE_SHIFT));
+
+			vm_area_free_pages(vm, new_nr_pages, vm->nr_pages);
+			vm->nr_pages = new_nr_pages;
+		}
 		vm->requested_size = size;
 		kasan_vrealloc(p, old_size, size);
 		return (void *)p;
@@ -4361,7 +4370,7 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
 	/*
 	 * We already have the bytes available in the allocation; use them.
 	 */
-	if (size <= alloced_size) {
+	if (size <= (size_t)vm->nr_pages << PAGE_SHIFT) {
 		/*
 		 * No need to zero memory here, as unused memory will have
 		 * already been zeroed at initial allocation time or during
--
2.43.0
On Sat, Mar 14, 2026 at 02:34:14PM +0530, Shivam Kalra via B4 Relay wrote:
> From: Shivam Kalra <shivamkalra98@zohomail.in>
>
> [...]
>
> +		/* Free tail pages when shrink crosses a page boundary. */
> +		if (new_nr_pages < vm->nr_pages && !vm_area_page_order(vm)) {
> +			unsigned long addr = (unsigned long)p;
> +
> +			vunmap_range(addr + (new_nr_pages << PAGE_SHIFT),
> +				     addr + (vm->nr_pages << PAGE_SHIFT));
> +
> +			vm_area_free_pages(vm, new_nr_pages, vm->nr_pages);
> +			vm->nr_pages = new_nr_pages;
> +		}
>
> [...]
>
Do we perform vm_reset_perms(vm) for tail pages? As I see, you update
vm->nr_pages when shrinking. Then on vfree() we have:
<snip>
/*
 * Flush the vm mapping and reset the direct map.
 */
static void vm_reset_perms(struct vm_struct *area)
{
	unsigned long start = ULONG_MAX, end = 0;
	unsigned int page_order = vm_area_page_order(area);
	int flush_dmap = 0;
	int i;

	/*
	 * Find the start and end range of the direct mappings to make sure that
	 * the vm_unmap_aliases() flush includes the direct map.
	 */
	for (i = 0; i < area->nr_pages; i += 1U << page_order) {
	...
<snip>
i.e. tail pages go back to the page allocator without their permissions
being reset.
--
Uladzislau Rezki
On 16/03/26 22:42, Uladzislau Rezki wrote:
> On Sat, Mar 14, 2026 at 02:34:14PM +0530, Shivam Kalra via B4 Relay wrote:
>>
>> [...]
>>
> Do we perform vm_reset_perms(vm) for tail pages? As I see, you update
> vm->nr_pages when shrinking. Then on vfree() we have:
>
> [...]
>
> i.e. tail pages go back to the page allocator without their permissions
> being reset.
>
> --
> Uladzislau Rezki
Hi Uladzislau,
Good catch, thank you for spotting this. You are absolutely right: we
are currently returning the tail pages to the page allocator without
resetting their direct-map permissions when VM_FLUSH_RESET_PERMS is set.
While my specific use case doesn't use VM_FLUSH_RESET_PERMS, vrealloc()
needs to handle all vmalloc flags safely as a generic API.
I will fix this in the next version (v5). I plan to add a helper
function to perform the permission reset specifically for the range of
tail pages being freed during the shrink.
Thanks,
Shivam