[PATCH v3 00/10] mm/mremap: permit mremap() move of multiple VMAs

Lorenzo Stoakes posted 10 patches 2 months, 3 weeks ago
There is a newer version of this series
fs/userfaultfd.c                         |  15 +-
include/linux/userfaultfd_k.h            |   5 +
mm/mremap.c                              | 553 +++++++++++++++--------
tools/testing/selftests/mm/mremap_test.c | 146 +++++-
4 files changed, 518 insertions(+), 201 deletions(-)
[PATCH v3 00/10] mm/mremap: permit mremap() move of multiple VMAs
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
Historically we've made it a uAPI requirement that mremap() may only
operate on a single VMA at a time.

For instances where VMAs need to be resized, this makes sense, as it
becomes very difficult to determine what a user actually wants should they
indicate a desire to expand or shrink the size of multiple VMAs (truncate?
Adjust sizes individually? Some other strategy?).

However, in instances where a user is moving VMAs, it is restrictive to
disallow this.

This is especially the case when anonymous mapping remap may or may not be
mergeable depending on whether VMAs have or have not been faulted due to
anon_vma assignment and folio index alignment with vma->vm_pgoff.

Often this can result in surprising impact where a moved region is faulted,
then moved back and a user fails to observe a merge from otherwise
compatible, adjacent VMAs.

This change allows such cases to work without the user having to be
cognizant of whether a prior mremap() move or other VMA operations has
resulted in VMA fragmentation.

In order to do this, this series performs a large amount of refactoring,
most pertinently - grouping sanity checks together, separately those that
check input parameters and those relating to VMAs.

we also simplify the post-mmap lock drop processing for uffd and mlock()'d
VMAs.

With this done, we can then fairly straightforwardly implement this
functionality.

This works exclusively for mremap() invocations which specify
MREMAP_FIXED. It is not compatible with VMAs which use userfaultfd, as the
notification of the userland fault handler would require us to drop the
mmap lock.

It is also not compatible with file-backed mappings with customised
get_unmapped_area() handlers as these may not honour MREMAP_FIXED.

The input and output addresses ranges must not overlap. We carefully
account for moves which would result in VMA iterator invalidation.

While there can be gaps between VMAs in the input range, there can be no
gap before the first VMA in the range.


v3:
* Disallowed move operation except for MREMAP_FIXED.
* Disallow gap at start of aggregate range to avoid confusion.
* Disallow any file-baked VMAs with custom get_unmapped_area.
* Renamed multi_vma to seen_vma to be clearer. Stop reusing new_addr, use
  separate target_addr var to track next target address.
* Check if first VMA fails multi VMA check, if so we'll allow one VMA but
  not multiple.
* Updated the commit message for patch 9 to be clearer about gap behaviour.
* Removed accidentally included debug goto statement in test (doh!). Test
  was and is passing regardless.
* Unmap target range in test, previously we ended up moving additional VMAs
  unintentionally. This still all passed :) but was not what was intended.
* Removed self-merge check - there is absolutely no way this can happen
  across multiple VMAs, as there is no means of moving VMAs such that a VMA
  merges with itself.

v2:
* Squashed uffd stub fix into series.
* Propagated tags, thanks!
* Fixed param naming in patch 4 as per Vlastimil.
* Renamed vma_reset to vmi_needs_reset + dropped reset on unmap as per
  Liam.
* Correctly return -EFAULT if no VMAs in input range.
* Account for get_unmapped_area() disregarding MAP_FIXED and returning an
  altered address.
* Added additional explanatatory comment to the remap_move() function.
https://lore.kernel.org/all/cover.1751865330.git.lorenzo.stoakes@oracle.com/

v1:
https://lore.kernel.org/all/cover.1751865330.git.lorenzo.stoakes@oracle.com/


Lorenzo Stoakes (10):
  mm/mremap: perform some simple cleanups
  mm/mremap: refactor initial parameter sanity checks
  mm/mremap: put VMA check and prep logic into helper function
  mm/mremap: cleanup post-processing stage of mremap
  mm/mremap: use an explicit uffd failure path for mremap
  mm/mremap: check remap conditions earlier
  mm/mremap: move remap_is_valid() into check_prep_vma()
  mm/mremap: clean up mlock populate behaviour
  mm/mremap: permit mremap() move of multiple VMAs
  tools/testing/selftests: extend mremap_test to test multi-VMA mremap

 fs/userfaultfd.c                         |  15 +-
 include/linux/userfaultfd_k.h            |   5 +
 mm/mremap.c                              | 553 +++++++++++++++--------
 tools/testing/selftests/mm/mremap_test.c | 146 +++++-
 4 files changed, 518 insertions(+), 201 deletions(-)

--
2.50.0
Re: [PATCH v3 00/10] mm/mremap: permit mremap() move of multiple VMAs
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
Apologies, I've done it again... :) Friday eh?

+cc linux-api - FYI guys.

I will update manpage to reflect this change of course also.

On Fri, Jul 11, 2025 at 12:38:14PM +0100, Lorenzo Stoakes wrote:
> Historically we've made it a uAPI requirement that mremap() may only
> operate on a single VMA at a time.
>
> For instances where VMAs need to be resized, this makes sense, as it
> becomes very difficult to determine what a user actually wants should they
> indicate a desire to expand or shrink the size of multiple VMAs (truncate?
> Adjust sizes individually? Some other strategy?).
>
> However, in instances where a user is moving VMAs, it is restrictive to
> disallow this.
>
> This is especially the case when anonymous mapping remap may or may not be
> mergeable depending on whether VMAs have or have not been faulted due to
> anon_vma assignment and folio index alignment with vma->vm_pgoff.
>
> Often this can result in surprising impact where a moved region is faulted,
> then moved back and a user fails to observe a merge from otherwise
> compatible, adjacent VMAs.
>
> This change allows such cases to work without the user having to be
> cognizant of whether a prior mremap() move or other VMA operations has
> resulted in VMA fragmentation.
>
> In order to do this, this series performs a large amount of refactoring,
> most pertinently - grouping sanity checks together, separately those that
> check input parameters and those relating to VMAs.
>
> we also simplify the post-mmap lock drop processing for uffd and mlock()'d
> VMAs.
>
> With this done, we can then fairly straightforwardly implement this
> functionality.
>
> This works exclusively for mremap() invocations which specify
> MREMAP_FIXED. It is not compatible with VMAs which use userfaultfd, as the
> notification of the userland fault handler would require us to drop the
> mmap lock.
>
> It is also not compatible with file-backed mappings with customised
> get_unmapped_area() handlers as these may not honour MREMAP_FIXED.
>
> The input and output addresses ranges must not overlap. We carefully
> account for moves which would result in VMA iterator invalidation.
>
> While there can be gaps between VMAs in the input range, there can be no
> gap before the first VMA in the range.
>
>
> v3:
> * Disallowed move operation except for MREMAP_FIXED.
> * Disallow gap at start of aggregate range to avoid confusion.
> * Disallow any file-baked VMAs with custom get_unmapped_area.
> * Renamed multi_vma to seen_vma to be clearer. Stop reusing new_addr, use
>   separate target_addr var to track next target address.
> * Check if first VMA fails multi VMA check, if so we'll allow one VMA but
>   not multiple.
> * Updated the commit message for patch 9 to be clearer about gap behaviour.
> * Removed accidentally included debug goto statement in test (doh!). Test
>   was and is passing regardless.
> * Unmap target range in test, previously we ended up moving additional VMAs
>   unintentionally. This still all passed :) but was not what was intended.
> * Removed self-merge check - there is absolutely no way this can happen
>   across multiple VMAs, as there is no means of moving VMAs such that a VMA
>   merges with itself.
>
> v2:
> * Squashed uffd stub fix into series.
> * Propagated tags, thanks!
> * Fixed param naming in patch 4 as per Vlastimil.
> * Renamed vma_reset to vmi_needs_reset + dropped reset on unmap as per
>   Liam.
> * Correctly return -EFAULT if no VMAs in input range.
> * Account for get_unmapped_area() disregarding MAP_FIXED and returning an
>   altered address.
> * Added additional explanatatory comment to the remap_move() function.
> https://lore.kernel.org/all/cover.1751865330.git.lorenzo.stoakes@oracle.com/
>
> v1:
> https://lore.kernel.org/all/cover.1751865330.git.lorenzo.stoakes@oracle.com/
>
>
> Lorenzo Stoakes (10):
>   mm/mremap: perform some simple cleanups
>   mm/mremap: refactor initial parameter sanity checks
>   mm/mremap: put VMA check and prep logic into helper function
>   mm/mremap: cleanup post-processing stage of mremap
>   mm/mremap: use an explicit uffd failure path for mremap
>   mm/mremap: check remap conditions earlier
>   mm/mremap: move remap_is_valid() into check_prep_vma()
>   mm/mremap: clean up mlock populate behaviour
>   mm/mremap: permit mremap() move of multiple VMAs
>   tools/testing/selftests: extend mremap_test to test multi-VMA mremap
>
>  fs/userfaultfd.c                         |  15 +-
>  include/linux/userfaultfd_k.h            |   5 +
>  mm/mremap.c                              | 553 +++++++++++++++--------
>  tools/testing/selftests/mm/mremap_test.c | 146 +++++-
>  4 files changed, 518 insertions(+), 201 deletions(-)
>
> --
> 2.50.0