Currently VMA allocation, freeing and duplication exist in kernel/fork.c,
which is a violation of separation of concerns, and leaves these functions
exposed to the rest of the kernel when they are in fact internal
implementation details.

Resolve this by moving this logic to mm, and making it internal to vma.c,
vma.h.

This also allows us, in future, to provide userland testing around this
functionality.

We additionally abstract dup_mmap() to mm, being careful to ensure
kernel/fork.c accesses this via the mm internal header so it is not exposed
elsewhere in the kernel.

As part of this change, also abstract initial stack allocation performed in
__bprm_mm_init() out of fs code into mm via create_init_stack_vma(), as
this code uses vm_area_alloc() and vm_area_free().

In order to do so sensibly, we introduce a new mm/vma_exec.c file, which
contains the code that is shared by mm and exec. This file is added to both
memory mapping and exec sections in MAINTAINERS so both sets of maintainers
can maintain oversight.

As part of this change, we also move relocate_vma_down() to mm/vma_exec.c
so all shared mm/exec functionality is kept in one place.

We add code shared between nommu and mmu-enabled configurations in order to
share VMA allocation, freeing and duplication code correctly while also
keeping these functions available in userland VMA testing.

This is achieved by adding a mm/vma_init.c file which is also compiled by
the userland tests.

v3:
* Establish mm/vma_exec.c for shared exec/mm vma logic, as per Kees.
* Add this file both to exec and mm MAINTAINERS sections so correct
  oversight is provided.
* Add a patch to move relocate_vma_down() to the new mm/vma_exec.c file.
* Move the create_init_stack_vma() function to mm/vma_exec.c also.
* Take the opportunity to also move insert_vm_struct() to mm/vma.c since
  this is no longer needed outside of mm.
* Fixup VMA userland tests to account for the new additions, extend the
  userland test build (as well as the kernel build) to account for
  mm/vma_exec.c.
* Remove __bprm_mm_init() and open code as we are simply calling a
  function, as per Kees.

v2:
* Moved vma init, alloc, free, dup functions to newly created vma_init.c
  file as per Suren, Liam.
* Added MAINTAINERS entry for vma_init.c, added to Makefile.
* Updated mmap_init() comment.
* Propagated tags (thanks everyone!)
* Added detach_free_vma() helper and correctly detached vmas in userland
  VMA test code.
* Updated userland test code to also compile the vma_init.c file.
* Corrected create_init_stack_vma() comment as per Suren.
* Updated commit message as per Suren.
https://lore.kernel.org/all/cover.1745592303.git.lorenzo.stoakes@oracle.com/

v1:
https://lore.kernel.org/all/cover.1745528282.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (4):
  mm: establish mm/vma_exec.c for shared exec/mm VMA functionality
  mm: abstract initial stack setup to mm subsystem
  mm: move dup_mmap() to mm
  mm: perform VMA allocation, freeing, duplication in mm

 MAINTAINERS                      |   3 +
 fs/exec.c                        |  69 +------
 include/linux/mm.h               |   1 -
 kernel/fork.c                    | 277 +--------------------------
 mm/Makefile                      |   4 +-
 mm/internal.h                    |   2 +
 mm/mmap.c                        | 309 ++++++++++++++++++-------------
 mm/nommu.c                       |  12 +-
 mm/vma.c                         |  43 +++++
 mm/vma.h                         |  16 ++
 mm/vma_exec.c                    | 161 ++++++++++++++++
 mm/vma_init.c                    | 101 ++++++++++
 tools/testing/vma/Makefile       |   2 +-
 tools/testing/vma/vma.c          |  27 ++-
 tools/testing/vma/vma_internal.h | 215 ++++++++++++++++++---
 15 files changed, 737 insertions(+), 505 deletions(-)
 create mode 100644 mm/vma_exec.c
 create mode 100644 mm/vma_init.c

--
2.49.0
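For readers unfamiliar with the allocate/duplicate/free lifecycle the cover letter refers to, here is a minimal userland sketch of the pattern. It is not the kernel code: struct toy_vma and the toy_vma_*() helpers are hypothetical names invented for illustration, they use plain malloc()/free() rather than the kernel's allocators, and they omit locking, reference counting and everything else the real vm_area_alloc() and vm_area_free() paths deal with.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a VMA descriptor. */
struct toy_vma {
	unsigned long vm_start;	/* first address in the range */
	unsigned long vm_end;	/* first address past the range */
	unsigned long vm_flags;	/* illustrative flags word */
};

/* Allocate a zeroed descriptor; returns NULL on allocation failure. */
struct toy_vma *toy_vma_alloc(void)
{
	return calloc(1, sizeof(struct toy_vma));
}

/* Duplicate a descriptor into a new, independent allocation. */
struct toy_vma *toy_vma_dup(const struct toy_vma *orig)
{
	struct toy_vma *copy;

	assert(orig != NULL);
	copy = malloc(sizeof(*copy));
	if (copy)
		memcpy(copy, orig, sizeof(*copy));
	return copy;
}

/* Release a descriptor obtained from toy_vma_alloc() or toy_vma_dup(). */
void toy_vma_free(struct toy_vma *vma)
{
	free(vma);
}

/* Exercise the lifecycle; returns 0 on success, -1 on failure. */
int toy_vma_selftest(void)
{
	struct toy_vma *a = toy_vma_alloc();
	struct toy_vma *b;
	int ret = -1;

	if (!a)
		return -1;
	a->vm_start = 0x1000;
	a->vm_end = 0x3000;

	b = toy_vma_dup(a);
	if (!b)
		goto out;

	/* Changing the original must not affect the duplicate. */
	a->vm_end = 0x4000;
	if (b->vm_start == 0x1000 && b->vm_end == 0x3000)
		ret = 0;

	toy_vma_free(b);
out:
	toy_vma_free(a);
	return ret;
}
```

Keeping the three helpers together in one small file, with no dependencies beyond the C library, mirrors the motivation given above for a separate file that the userland tests can compile as well; how the series actually arranges this is defined by the patches themselves, not by this sketch.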
On 4/28/25 17:28, Lorenzo Stoakes wrote:
> Currently VMA allocation, freeing and duplication exist in kernel/fork.c,
> which is a violation of separation of concerns, and leaves these functions
> exposed to the rest of the kernel when they are in fact internal
> implementation details.
>
> Resolve this by moving this logic to mm, and making it internal to vma.c,
> vma.h.
>
> This also allows us, in future, to provide userland testing around this
> functionality.
>
> We additionally abstract dup_mmap() to mm, being careful to ensure
> kernel/fork.c accesses this via the mm internal header so it is not exposed
> elsewhere in the kernel.
>
> As part of this change, also abstract initial stack allocation performed in
> __bprm_mm_init() out of fs code into mm via create_init_stack_vma(), as
> this code uses vm_area_alloc() and vm_area_free().
>
> In order to do so sensibly, we introduce a new mm/vma_exec.c file, which
> contains the code that is shared by mm and exec. This file is added to both
> memory mapping and exec sections in MAINTAINERS so both sets of maintainers
> can maintain oversight.

Note that kernel/fork.c itself belongs to no section. Maybe we could put it
somewhere too, maybe also multiple subsystems? I'm thinking something
between MM, SCHEDULER, EXEC, perhaps PIDFD?
On Tue, Apr 29, 2025 at 09:28:05AM +0200, Vlastimil Babka wrote:
> On 4/28/25 17:28, Lorenzo Stoakes wrote:
> > Currently VMA allocation, freeing and duplication exist in kernel/fork.c,
> > which is a violation of separation of concerns, and leaves these functions
> > exposed to the rest of the kernel when they are in fact internal
> > implementation details.
> >
> > Resolve this by moving this logic to mm, and making it internal to vma.c,
> > vma.h.
> >
> > This also allows us, in future, to provide userland testing around this
> > functionality.
> >
> > We additionally abstract dup_mmap() to mm, being careful to ensure
> > kernel/fork.c accesses this via the mm internal header so it is not exposed
> > elsewhere in the kernel.
> >
> > As part of this change, also abstract initial stack allocation performed in
> > __bprm_mm_init() out of fs code into mm via create_init_stack_vma(), as
> > this code uses vm_area_alloc() and vm_area_free().
> >
> > In order to do so sensibly, we introduce a new mm/vma_exec.c file, which
> > contains the code that is shared by mm and exec. This file is added to both
> > memory mapping and exec sections in MAINTAINERS so both sets of maintainers
> > can maintain oversight.
>
> Note that kernel/fork.c itself belongs to no section. Maybe we could put it
> somewhere too, maybe also multiple subsystems? I'm thinking something
> between MM, SCHEDULER, EXEC, perhaps PIDFD?

Thanks, indeed I was wondering about where this should be, and the fact we can
put stuff in multiple places is actually pretty powerful!

This is on my todo, will take a look at this.
On Tue, Apr 29, 2025 at 11:23:25AM +0100, Lorenzo Stoakes wrote:
> On Tue, Apr 29, 2025 at 09:28:05AM +0200, Vlastimil Babka wrote:
> > On 4/28/25 17:28, Lorenzo Stoakes wrote:
> > > Currently VMA allocation, freeing and duplication exist in kernel/fork.c,
> > > which is a violation of separation of concerns, and leaves these functions
> > > exposed to the rest of the kernel when they are in fact internal
> > > implementation details.
> > >
> > > Resolve this by moving this logic to mm, and making it internal to vma.c,
> > > vma.h.
> > >
> > > This also allows us, in future, to provide userland testing around this
> > > functionality.
> > >
> > > We additionally abstract dup_mmap() to mm, being careful to ensure
> > > kernel/fork.c accesses this via the mm internal header so it is not exposed
> > > elsewhere in the kernel.
> > >
> > > As part of this change, also abstract initial stack allocation performed in
> > > __bprm_mm_init() out of fs code into mm via create_init_stack_vma(), as
> > > this code uses vm_area_alloc() and vm_area_free().
> > >
> > > In order to do so sensibly, we introduce a new mm/vma_exec.c file, which
> > > contains the code that is shared by mm and exec. This file is added to both
> > > memory mapping and exec sections in MAINTAINERS so both sets of maintainers
> > > can maintain oversight.
> >
> > Note that kernel/fork.c itself belongs to no section. Maybe we could put it
> > somewhere too, maybe also multiple subsystems? I'm thinking something
> > between MM, SCHEDULER, EXEC, perhaps PIDFD?
>
> Thanks, indeed I was wondering about where this should be, and the fact we can
> put stuff in multiple places is actually pretty powerful!
>
> This is on my todo, will take a look at this.

Yeah, I'd be interested in having fork.c multi-maintainer-sectioned with
EXEC/BINFMT too, when the time comes.

--
Kees Cook