[PATCH 0/3] mm: improve map count checks

Lorenzo Stoakes (Oracle) posted 3 patches 3 weeks, 6 days ago
[PATCH 0/3] mm: improve map count checks
Posted by Lorenzo Stoakes (Oracle) 3 weeks, 6 days ago
Firstly, in mremap(), it appears that our map count checks have been overly
conservative - there is simply no reason to require that we have headroom
of 4 mappings prior to moving the VMA, we only need headroom of 2 VMAs
since commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
temporarily exceeded in munmap()").

Likely the original headroom of 4 mappings was a mistake, and 3 was
actually intended.
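
To make the headroom arithmetic concrete, here is a minimal userspace sketch. It is not the kernel code: map_count_has_headroom() is a hypothetical name, and it assumes "headroom of N" means that N further mappings still fit under the limit.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: returns true if 'needed'
 * additional mappings still fit under 'max'. Assumes needed <= max.
 * Written as a subtraction so the comparison cannot overflow.
 */
static bool map_count_has_headroom(int map_count, int max, int needed)
{
	return map_count <= max - needed;
}
```

With a limit of 10 and 8 existing mappings, requiring headroom of 4 rejects the move while requiring headroom of 2 permits it, which is the over-conservatism described above.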

Next, we access sysctl_max_map_count in a number of places without being
all that careful about how we do so.

We introduce a simple helper that READ_ONCE()'s the field
(get_sysctl_max_map_count()) to ensure that the field is accessed
correctly. The WRITE_ONCE() side is already handled by the sysctl procfs
code in proc_int_conv().
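
As a rough userspace illustration of the helper's shape (a sketch, not the patch itself): the READ_ONCE() below is a simplified stand-in for the kernel macro and relies on the GCC/Clang __typeof__ extension, and both the int type and the initial value are assumptions made for the example.

```c
/*
 * Simplified userspace approximation of the kernel's READ_ONCE(): the
 * volatile access makes the compiler emit one load per call, so it can
 * neither cache the value across calls nor re-read it within one.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Stand-in for the sysctl-controlled limit; the value is illustrative. */
static int sysctl_max_map_count = 65530;

/* All readers go through the helper rather than touching the variable. */
static inline int get_sysctl_max_map_count(void)
{
	return READ_ONCE(sysctl_max_map_count);
}
```

Funnelling every read through one accessor means the marked access lives in a single place, and callers cannot accidentally perform a plain load of a value that may be written concurrently.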

We also move this field to internal.h as there's no reason for anybody else
to access it outside of mm. Unfortunately we have to maintain the extern
variable, as mmap.c implements the procfs code.

Finally, we are accessing current->mm->map_count without holding the mmap
write lock, which is also not correct, so this series ensures the lock is
held before we access it.

We also abstract the check to a helper function, and add ASCII diagrams to
explain why we're doing what we're doing.
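
The lock-then-check ordering can be modelled with a toy, single-threaded sketch. Everything here is hypothetical (struct toy_mm, the toy_* functions, the boolean standing in for the mmap write lock, the headroom of 2 passed by the caller); it only shows the check being made by a helper that insists the lock is already held.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the mm; only the fields this sketch needs. */
struct toy_mm {
	bool write_locked;	/* models "mmap write lock is held" */
	int map_count;
};

static void toy_write_lock(struct toy_mm *mm)   { mm->write_locked = true; }
static void toy_write_unlock(struct toy_mm *mm) { mm->write_locked = false; }

/* The check lives in one helper, which refuses to run without the lock. */
static bool toy_map_count_ok(const struct toy_mm *mm, int max, int needed)
{
	assert(mm->write_locked);
	return mm->map_count <= max - needed;
}

/* Caller pattern: take the lock first, only then look at map_count. */
static int toy_move(struct toy_mm *mm, int max)
{
	int ret;

	toy_write_lock(mm);
	ret = toy_map_count_ok(mm, max, 2) ? 0 : -ENOMEM;
	toy_write_unlock(mm);
	return ret;
}
```

Placing the assertion inside the helper means any future caller that reads the count without first taking the lock fails loudly instead of racing silently.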

Lorenzo Stoakes (Oracle) (3):
  mm/mremap: correct invalid map count check
  mm: abstract reading sysctl_max_map_count, and READ_ONCE()
  mm/mremap: check map count under mmap write lock and abstract

 include/linux/mm.h                 |  2 -
 mm/internal.h                      |  6 ++
 mm/mmap.c                          |  2 +-
 mm/mremap.c                        | 98 ++++++++++++++++++++++++------
 mm/nommu.c                         |  2 +-
 mm/vma.c                           |  6 +-
 tools/testing/vma/include/custom.h |  3 -
 tools/testing/vma/include/dup.h    |  9 +++
 tools/testing/vma/main.c           |  2 +
 9 files changed, 100 insertions(+), 30 deletions(-)

--
2.53.0
Re: [PATCH 0/3] mm: improve map count checks
Posted by Andrew Morton 1 week, 4 days ago
On Wed, 11 Mar 2026 17:24:35 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:

> Firstly, in mremap(), it appears that our map count checks have been overly
> conservative - there is simply no reason to require that we have headroom
> of 4 mappings prior to moving the VMA, we only need headroom of 2 VMAs
> since commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
> temporarily exceeded in munmap()").
> 
> Likely the original headroom of 4 mappings was a mistake, and 3 was
> actually intended.
> 
> Next, we access sysctl_max_map_count in a number of places without being
> all that careful about how we do so.
> 
> We introduce a simple helper that READ_ONCE()'s the field
> (get_sysctl_max_map_count()) to ensure that the field is accessed
> correctly. The WRITE_ONCE() side is already handled by the sysctl procfs
> code in proc_int_conv().
> 
> We also move this field to internal.h as there's no reason for anybody else
> to access it outside of mm. Unfortunately we have to maintain the extern
> variable, as mmap.c implements the procfs code.
> 
> Finally, we are accessing current->mm->map_count without holding the mmap
> write lock, which is also not correct, so this series ensures the lock is
> held before we access it.
> 
> We also abstract the check to a helper function, and add ASCII diagrams to
> explain why we're doing what we're doing.

This little series has no reviews, if anyone is wanting to boost our
R-b stats.

Or not.  I plan to upstream it unless someone stops me.
Re: [PATCH 0/3] mm: improve map count checks
Posted by Pedro Falcato 1 week, 4 days ago
On Thu, Mar 26, 2026 at 10:42:39PM -0700, Andrew Morton wrote:
> On Wed, 11 Mar 2026 17:24:35 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> 
> > Firstly, in mremap(), it appears that our map count checks have been overly
> > conservative - there is simply no reason to require that we have headroom
> > of 4 mappings prior to moving the VMA, we only need headroom of 2 VMAs
> > since commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
> > temporarily exceeded in munmap()").
> > 
> > Likely the original headroom of 4 mappings was a mistake, and 3 was
> > actually intended.
> > 
> > Next, we access sysctl_max_map_count in a number of places without being
> > all that careful about how we do so.
> > 
> > We introduce a simple helper that READ_ONCE()'s the field
> > (get_sysctl_max_map_count()) to ensure that the field is accessed
> > correctly. The WRITE_ONCE() side is already handled by the sysctl procfs
> > code in proc_int_conv().
> > 
> > We also move this field to internal.h as there's no reason for anybody else
> > to access it outside of mm. Unfortunately we have to maintain the extern
> > variable, as mmap.c implements the procfs code.
> > 
> > Finally, we are accessing current->mm->map_count without holding the mmap
> > write lock, which is also not correct, so this series ensures the lock is
> > held before we access it.
> > 
> > We also abstract the check to a helper function, and add ASCII diagrams to
> > explain why we're doing what we're doing.
> 
> This little series has no reviews, if anyone is wanting to boost our
> R-b stats.

Thanks for the heads up! This was too stale for too long :)

-- 
Pedro