[PATCH 0/3] fix incorrectly disallowed anonymous VMA merges

Lorenzo Stoakes posted 3 patches 9 months ago
There is a newer version of this series
mm/vma.c                                  |  81 ++--
tools/testing/selftests/mm/.gitignore     |   1 +
tools/testing/selftests/mm/Makefile       |   1 +
tools/testing/selftests/mm/merge.c        | 454 ++++++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh |   2 +
tools/testing/selftests/mm/vm_util.c      |  62 +++
tools/testing/selftests/mm/vm_util.h      |  21 +
tools/testing/vma/vma.c                   | 100 ++---
8 files changed, 652 insertions(+), 70 deletions(-)
create mode 100644 tools/testing/selftests/mm/merge.c
[PATCH 0/3] fix incorrectly disallowed anonymous VMA merges
Posted by Lorenzo Stoakes 9 months ago
It appears that we have been incorrectly rejecting merge cases for 15
years, apparently by mistake.

Imagine a range of anonymous mapped momemory divided into two VMAs like
this, with incompatible protection bits:

              RW         RWX
	  unfaulted    faulted
	|-----------|-----------|
	|    prev   |    vma    |
	|-----------|-----------|
	             mprotect(RW)

Now imagine mprotect()'ing vma so it is RW. This appears as if it should
merge, it does not.

Neither does this case, again mprotect()'ing vma RW:

              RWX        RW
	   faulted    unfaulted
	|-----------|-----------|
	|    vma    |   next    |
	|-----------|-----------|
	 mprotect(RW)

Nor:

              RW         RWX          RW
	  unfaulted    faulted    unfaulted
	|-----------|-----------|-----------|
	|    prev   |    vma    |    next   |
	|-----------|-----------|-----------|
	             mprotect(RW)

What's going on here?

In commit 5beb49305251 ("mm: change anon_vma linking to fix multi-process
server scalability issue"), from 2010, Rik von Riel took careful care to
account for these cases - commenting that '[this is] easily overlooked:
when mprotect shifts the boundary, make sure the expanding vma has anon_vma
set if the shrinking vma had, to cover any anon pages imported.'

However, commit 965f55dea0e3 ("mmap: avoid merging cloned VMAs") introduced
a little over a year later, appears to have accidentally disallowed this.

By adjusting the is_mergeable_anon_vma() function to avoid lock contention
across large trees of forked anon_vma's, this commit wrongly assumed the
VMA being checked (the ostensible merge 'target') should be faulted, that
is, have an anon_vma, and thus an anon_vma_chain list established, but only
of length 1.

This appears to have been unintentional, as disallowing empty target VMAs
like this across the board makes no sense.

We already have logic that accounts for this case, the same logic Rik
introduced in 2010, now via dup_anon_vma() (and ultimately
anon_vma_clone()), so there is no problem permitting this.

This series fixes this mistake and also ensures that scalability concerns
remain addressed by explicitly checking that whatever VMA is being merged
has not been forked.

A full set of self tests which reproduce the issue are provided, as well as
updating userland VMA tests to assert this behaviour.

The self tests additionally assert scalability concerns are addressed.


Lorenzo Stoakes (3):
  mm/vma: fix incorrectly disallowed anonymous VMA merges
  tools/testing: add PROCMAP_QUERY helper functions in mm self tests
  tools/testing/selftests: assert that anon merge cases behave as
    expected

 mm/vma.c                                  |  81 ++--
 tools/testing/selftests/mm/.gitignore     |   1 +
 tools/testing/selftests/mm/Makefile       |   1 +
 tools/testing/selftests/mm/merge.c        | 454 ++++++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh |   2 +
 tools/testing/selftests/mm/vm_util.c      |  62 +++
 tools/testing/selftests/mm/vm_util.h      |  21 +
 tools/testing/vma/vma.c                   | 100 ++---
 8 files changed, 652 insertions(+), 70 deletions(-)
 create mode 100644 tools/testing/selftests/mm/merge.c

--
2.48.1
Re: [PATCH 0/3] fix incorrectly disallowed anonymous VMA merges
Posted by Yeoreum Yun 9 months ago
On Mon, Mar 17, 2025 at 09:15:02PM +0000, Lorenzo Stoakes wrote:
> It appears that we have been incorrectly rejecting merge cases for 15
> years, apparently by mistake.
>
> Imagine a range of anonymous mapped momemory divided into two VMAs like
> this, with incompatible protection bits:
>
>               RW         RWX
> 	  unfaulted    faulted
> 	|-----------|-----------|
> 	|    prev   |    vma    |
> 	|-----------|-----------|
> 	             mprotect(RW)
>
> Now imagine mprotect()'ing vma so it is RW. This appears as if it should
> merge, it does not.
>
> Neither does this case, again mprotect()'ing vma RW:
>
>               RWX        RW
> 	   faulted    unfaulted
> 	|-----------|-----------|
> 	|    vma    |   next    |
> 	|-----------|-----------|
> 	 mprotect(RW)
>
> Nor:
>
>               RW         RWX          RW
> 	  unfaulted    faulted    unfaulted
> 	|-----------|-----------|-----------|
> 	|    prev   |    vma    |    next   |
> 	|-----------|-----------|-----------|
> 	             mprotect(RW)
>
> What's going on here?
>
> In commit 5beb49305251 ("mm: change anon_vma linking to fix multi-process
> server scalability issue"), from 2010, Rik von Riel took careful care to
> account for these cases - commenting that '[this is] easily overlooked:
> when mprotect shifts the boundary, make sure the expanding vma has anon_vma
> set if the shrinking vma had, to cover any anon pages imported.'
>
> However, commit 965f55dea0e3 ("mmap: avoid merging cloned VMAs") introduced
> a little over a year later, appears to have accidentally disallowed this.
>
> By adjusting the is_mergeable_anon_vma() function to avoid lock contention
> across large trees of forked anon_vma's, this commit wrongly assumed the
> VMA being checked (the ostensible merge 'target') should be faulted, that
> is, have an anon_vma, and thus an anon_vma_chain list established, but only
> of length 1.
>
> This appears to have been unintentional, as disallowing empty target VMAs
> like this across the board makes no sense.
>
> We already have logic that accounts for this case, the same logic Rik
> introduced in 2010, now via dup_anon_vma() (and ultimately
> anon_vma_clone()), so there is no problem permitting this.
>
> This series fixes this mistake and also ensures that scalability concerns
> remain addressed by explicitly checking that whatever VMA is being merged
> has not been forked.
>
> A full set of self tests which reproduce the issue are provided, as well as
> updating userland VMA tests to assert this behaviour.
>
> The self tests additionally assert scalability concerns are addressed.
>
>
> Lorenzo Stoakes (3):
>   mm/vma: fix incorrectly disallowed anonymous VMA merges
>   tools/testing: add PROCMAP_QUERY helper functions in mm self tests
>   tools/testing/selftests: assert that anon merge cases behave as
>     expected
>
>  mm/vma.c                                  |  81 ++--
>  tools/testing/selftests/mm/.gitignore     |   1 +
>  tools/testing/selftests/mm/Makefile       |   1 +
>  tools/testing/selftests/mm/merge.c        | 454 ++++++++++++++++++++++
>  tools/testing/selftests/mm/run_vmtests.sh |   2 +
>  tools/testing/selftests/mm/vm_util.c      |  62 +++
>  tools/testing/selftests/mm/vm_util.h      |  21 +
>  tools/testing/vma/vma.c                   | 100 ++---
>  8 files changed, 652 insertions(+), 70 deletions(-)
>  create mode 100644 tools/testing/selftests/mm/merge.c
>
> --
> 2.48.1
>

Look good to me.
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>