 include/linux/mm.h                      |   6 +-
 mm/madvise.c                            |  63 +++++++++-
 mm/mprotect.c                           |   2 +-
 mm/mremap.c                             |   2 +-
 mm/mseal.c                              | 157 +++++-------------------
 mm/vma.c                                |   4 +-
 mm/vma.h                                |  27 +---
 tools/testing/selftests/mm/mseal_test.c |   3 +-
 tools/testing/vma/vma_internal.h        |   6 +-
 9 files changed, 107 insertions(+), 163 deletions(-)
Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is
treated differently from every other VMA flag; it really doesn't make sense
to do this, so we start by making this consistent with everything else.

Next we place the madvise logic where it belongs - in mm/madvise.c. It
really makes no sense to abstract this elsewhere. In doing so, we go to
great lengths to explain very clearly the previously very confusing logic
as to what sealed mappings are impacted here.

In doing so, we fix an existing logical oversight - previously we permitted
an madvise() discard operation for a sealed, read-only MAP_PRIVATE
file-backed mapping.

However, this is incorrect. To see why, consider:

1. A MAP_PRIVATE R/W file-backed mapping is established.
2. The mapping is written to, which backs it with anonymous memory.
3. The mapping is mprotect()'d read-only.
4. The mapping is mseal()'d.

At this point you have data that, once sealed, a user cannot alter, but a
discard operation can unrecoverably remove. This contradicts the semantics
of mseal(), so should not be permitted.

We then abstract out and explain the 'are there any gaps in this range in
the mm?' check being performed as a prerequisite to mseal being performed.

Finally, we simplify the actual mseal logic which is really quite
straightforward.


v3:
* Propagated more tags, thanks everyone!
* Updated 5/5 to assign curr_start in a smarter way as per Liam. Adjust
  code to more sensibly handle already-sealed case at the same time.
* Updated 4/5 to not move range_contains_unmapped() for better diff.
* Renamed can_modify_vma() to vma_is_sealed() and inverted the logic - this
  is far clearer than the nebulous 'can modify VMA'.

v2:
* Propagated tags, thanks everyone!
* Updated can_madvise_modify() to a more logical order re: the checks
  performed, as per David.
* Replaced vma_is_anonymous() check (which was, in the original code, a
  vma->vm_file or vma->vm_ops check) with a vma->vm_flags & VM_SHARED
  check - to explicitly check for shared mappings vs private to preclude
  MAP_PRIVATE file-backed mappings, as per David.
* Made range_contains_unmapped() static and placed in mm/mseal.c to avoid
  encouraging any other internal users towards this rather silly pattern,
  as per Pedro and Liam.
https://lore.kernel.org/all/cover.1752586090.git.lorenzo.stoakes@oracle.com/

v1:
https://lore.kernel.org/all/cover.1752497324.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (5):
  mm/mseal: always define VM_SEALED
  mm/mseal: update madvise() logic
  mm/mseal: small cleanups
  mm/mseal: Simplify and rename VMA gap check
  mm/mseal: rework mseal apply logic

 include/linux/mm.h                      |   6 +-
 mm/madvise.c                            |  63 +++++++++-
 mm/mprotect.c                           |   2 +-
 mm/mremap.c                             |   2 +-
 mm/mseal.c                              | 157 +++++-------------------
 mm/vma.c                                |   4 +-
 mm/vma.h                                |  27 +---
 tools/testing/selftests/mm/mseal_test.c |   3 +-
 tools/testing/vma/vma_internal.h        |   6 +-
 9 files changed, 107 insertions(+), 163 deletions(-)

--
2.50.1
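
To make the MAP_PRIVATE case in the cover letter concrete, here is a minimal
userspace sketch in C. It is not the kernel code: struct mapping_model,
old_can_discard(), v3_can_discard() and check_cover_letter_scenario() are
hypothetical names invented for illustration, and treating a writable sealed
mapping as discardable is an assumption inferred from the 'read-only' wording
above. The sketch only contrasts a "file-backed?" test (standing in for the
original vm_file/vm_ops style check) with the VM_SHARED test described in the
v2 notes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct vm_area_struct. */
struct mapping_model {
	bool sealed;      /* stands in for VM_SEALED being set */
	bool shared;      /* stands in for VM_SHARED (a MAP_SHARED mapping) */
	bool file_backed; /* stands in for the mapping having a backing file */
	bool writable;    /* assumption: the user may currently write to it */
};

/*
 * Model of the earlier behaviour described in the cover letter: a sealed
 * mapping may be discarded whenever it is file-backed, even if it is
 * MAP_PRIVATE and read-only.
 */
bool old_can_discard(const struct mapping_model *m)
{
	if (!m->sealed)
		return true;
	if (m->file_backed)
		return true;
	return m->writable;
}

/*
 * Model of the v3 behaviour: only shared mappings get the blanket pass,
 * so a sealed, read-only MAP_PRIVATE file-backed mapping is refused.
 */
bool v3_can_discard(const struct mapping_model *m)
{
	if (!m->sealed)
		return true;
	if (m->shared)
		return true;
	return m->writable;
}

/* Steps 1-4 of the cover letter, expressed as the resulting mapping state. */
void check_cover_letter_scenario(void)
{
	struct mapping_model m = {
		.sealed = true,       /* step 4: mseal()'d */
		.shared = false,      /* step 1: MAP_PRIVATE */
		.file_backed = true,  /* step 1: file-backed */
		.writable = false,    /* step 3: mprotect()'d read-only */
	};

	assert(old_can_discard(&m));  /* earlier rule lets the discard through */
	assert(!v3_can_discard(&m));  /* v3-style rule refuses it */
}
```

Calling check_cover_letter_scenario() exercises exactly the state reached
after steps 1-4. The model says nothing about the ordering of checks inside
can_madvise_modify() or about architecture-specific permission checks.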
Hi Lorenzo,

Thanks for including me in this thread. I've just returned from
vacation and am catching up on my emails. I'll respond to each patch
separately in the following emails.

Could you consider adding mm/mseal.c to the HARDENING section of
MAINTAINERS? Please include Kees and linux-hardening in future emails
about mseal - Kees has been helping me with mseal since the beginning.

Thanks and regards,
-Jeff

On Wed, Jul 16, 2025 at 10:38 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is
> treated differently from every other VMA flag; it really doesn't make sense
> to do this, so we start by making this consistent with everything else.
>
> Next we place the madvise logic where it belongs - in mm/madvise.c. It
> really makes no sense to abstract this elsewhere. In doing so, we go to
> great lengths to explain very clearly the previously very confusing logic
> as to what sealed mappings are impacted here.
>
> In doing so, we fix an existing logical oversight - previously we permitted
> an madvise() discard operation for a sealed, read-only MAP_PRIVATE
> file-backed mapping.
>
> However, this is incorrect. To see why, consider:
>
> 1. A MAP_PRIVATE R/W file-backed mapping is established.
> 2. The mapping is written to, which backs it with anonymous memory.
> 3. The mapping is mprotect()'d read-only.
> 4. The mapping is mseal()'d.
>
> At this point you have data that, once sealed, a user cannot alter, but a
> discard operation can unrecoverably remove. This contradicts the semantics
> of mseal(), so should not be permitted.
>
> We then abstract out and explain the 'are there any gaps in this range in
> the mm?' check being performed as a prerequisite to mseal being performed.
>
> Finally, we simplify the actual mseal logic which is really quite
> straightforward.
>
>
> v3:
> * Propagated more tags, thanks everyone!
> * Updated 5/5 to assign curr_start in a smarter way as per Liam. Adjust
>   code to more sensibly handle already-sealed case at the same time.
> * Updated 4/5 to not move range_contains_unmapped() for better diff.
> * Renamed can_modify_vma() to vma_is_sealed() and inverted the logic - this
>   is far clearer than the nebulous 'can modify VMA'.
>
> v2:
> * Propagated tags, thanks everyone!
> * Updated can_madvise_modify() to a more logical order re: the checks
>   performed, as per David.
> * Replaced vma_is_anonymous() check (which was, in the original code, a
>   vma->vm_file or vma->vm_ops check) with a vma->vm_flags & VM_SHARED
>   check - to explicitly check for shared mappings vs private to preclude
>   MAP_PRIVATE file-backed mappings, as per David.
> * Made range_contains_unmapped() static and placed in mm/mseal.c to avoid
>   encouraging any other internal users towards this rather silly pattern,
>   as per Pedro and Liam.
> https://lore.kernel.org/all/cover.1752586090.git.lorenzo.stoakes@oracle.com/
>
> v1:
> https://lore.kernel.org/all/cover.1752497324.git.lorenzo.stoakes@oracle.com/
>
> Lorenzo Stoakes (5):
>   mm/mseal: always define VM_SEALED
>   mm/mseal: update madvise() logic
>   mm/mseal: small cleanups
>   mm/mseal: Simplify and rename VMA gap check
>   mm/mseal: rework mseal apply logic
>
>  include/linux/mm.h                      |   6 +-
>  mm/madvise.c                            |  63 +++++++++-
>  mm/mprotect.c                           |   2 +-
>  mm/mremap.c                             |   2 +-
>  mm/mseal.c                              | 157 +++++-------------------
>  mm/vma.c                                |   4 +-
>  mm/vma.h                                |  27 +---
>  tools/testing/selftests/mm/mseal_test.c |   3 +-
>  tools/testing/vma/vma_internal.h        |   6 +-
>  9 files changed, 107 insertions(+), 163 deletions(-)
>
> --
> 2.50.1
On Thu, Jul 24, 2025 at 11:32:26AM -0700, Jeff Xu wrote:
> Hi Lorenzo,
>
> Thanks for including me in this thread. I've just returned from
> vacation and am catching up on my emails. I'll respond to each patch
> separately in the following emails.

You're welcome, I promised I would always cc you, and I keep my promises as
best I can.

It's unfortunate that you're sending this review on more or less the last
day of the cycle, but there we are.

>
> Could you consider adding mm/mseal.c to the HARDENING section of
> MAINTAINERS? Please include Kees and linux-hardening in future emails
> about mseal - Kees has been helping me with mseal since the beginning.

No, because we might move any of this logic elsewhere and I consider it
fundamental to VMA logic.

I am more than happy to include Kees as well on any emails regarding this.

But it does not serve VMA logic to arbitrarily keep things in a certain
file to satisfy MAINTAINERS.

Since there's debate about the semantics of the MAP_PRIVATE stuff I'll send
a v4 with that taken out.

I very much want the refactorings to land in 6.17. So we can carry on
debating the ins and outs of this and make any _semantic_ changes for 6.18.

Thanks, Lorenzo