[PATCH v3 0/5] mseal cleanups, fixup MAP_PRIVATE file-backed case

Lorenzo Stoakes posted 5 patches 2 months, 3 weeks ago
include/linux/mm.h                      |   6 +-
mm/madvise.c                            |  63 +++++++++-
mm/mprotect.c                           |   2 +-
mm/mremap.c                             |   2 +-
mm/mseal.c                              | 157 +++++-------------------
mm/vma.c                                |   4 +-
mm/vma.h                                |  27 +---
tools/testing/selftests/mm/mseal_test.c |   3 +-
tools/testing/vma/vma_internal.h        |   6 +-
9 files changed, 107 insertions(+), 163 deletions(-)
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is
treated differently from every other VMA flag. It really doesn't make sense
to do this, so we start by making it consistent with everything else.

Next we place the madvise logic where it belongs - in mm/madvise.c. It
really makes no sense to abstract this elsewhere. In doing so, we go to
great lengths to explain very clearly the previously very confusing logic
as to what sealed mappings are impacted here.

In doing so, we fix an existing logical oversight - previously we permitted
an madvise() discard operation for a sealed, read-only MAP_PRIVATE
file-backed mapping.

However, this is incorrect. To see why, consider:

1. A MAP_PRIVATE R/W file-backed mapping is established.
2. The mapping is written to, which backs it with anonymous memory.
3. The mapping is mprotect()'d read-only.
4. The mapping is mseal()'d.

At this point you have data that, once sealed, a user cannot alter, but a
discard operation can unrecoverably remove. This contradicts the semantics
of mseal(), so should not be permitted.
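
For illustration, the four steps above can be reproduced from userspace
roughly as follows. This is a minimal sketch and not part of the series:
run_discard_demo() and struct discard_demo are hypothetical names, the
fallback syscall number 462 for mseal() is an assumption for systems whose
headers predate it, and whether the final madvise(MADV_DONTNEED) succeeds
or fails (e.g. with EPERM) depends on the kernel being run.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mseal
#define __NR_mseal 462	/* assumption: headers predate mseal() (Linux 6.10) */
#endif

/* Hypothetical result record, used by this demo only. */
struct discard_demo {
	int setup_ok;		/* 1 once steps 1-3 have succeeded */
	int mseal_errno;	/* 0, or errno from mseal() (e.g. ENOSYS) */
	int discard_errno;	/* 0, or errno from madvise() (e.g. EPERM) */
	char first_byte;	/* first byte of the mapping afterwards */
};

static struct discard_demo run_discard_demo(void)
{
	struct discard_demo res = { 0, 0, 0, 0 };
	char path[] = "/tmp/mseal-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	char *map;
	int fd;

	assert(page > 0);

	/* A one-page, zero-filled file to back the mapping. */
	fd = mkstemp(path);
	if (fd < 0)
		return res;
	unlink(path);
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return res;
	}

	/* 1. Establish a MAP_PRIVATE R/W file-backed mapping. */
	map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return res;

	/* 2. Write to it: CoW now backs the page with anonymous memory. */
	map[0] = 'A';

	/* 3. Make the mapping read-only. */
	if (mprotect(map, page, PROT_READ) < 0)
		return res;
	res.setup_ok = 1;

	/* 4. Seal it (no libc wrapper is assumed, hence syscall()). */
	if (syscall(__NR_mseal, map, (size_t)page, 0UL) < 0)
		res.mseal_errno = errno;

	/* The discard in question: success drops the private 'A' page. */
	if (madvise(map, page, MADV_DONTNEED) < 0)
		res.discard_errno = errno;

	/* Reads back 'A' if the page survived, 0 (file content) if not. */
	res.first_byte = map[0];

	/* The mapping is left in place: a sealed range cannot be unmapped. */
	return res;
}
```

Calling run_discard_demo() from main() and printing discard_errno and
first_byte shows whether the running kernel let the discard through. If it
did, the 'A' written in step 2 is gone even though the mapping was sealed
read-only.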

We then abstract out and explain the 'are there any gaps in this range in
the mm?' check being performed as a prerequisite to mseal being performed.
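
As a rough userspace model of that gap check (a sketch only, not the kernel
code): struct toy_vma and toy_range_contains_unmapped() are hypothetical
stand-ins, and the VMAs are assumed to be a sorted, non-overlapping array of
half-open [start, end) ranges rather than the real VMA tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: covers [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/*
 * Returns true if any part of [start, end) is not covered by the sorted,
 * non-overlapping array of toy VMAs.
 */
static bool toy_range_contains_unmapped(const struct toy_vma *vmas, size_t n,
					unsigned long start, unsigned long end)
{
	unsigned long prev_end = start;
	size_t i;

	assert(start < end);

	for (i = 0; i < n; i++) {
		/* Skip VMAs that lie entirely before the range. */
		if (vmas[i].end <= start)
			continue;
		/* Stop once we are past the range. */
		if (vmas[i].start >= end)
			break;
		/* A hole between the previous coverage and this VMA. */
		if (vmas[i].start > prev_end)
			return true;
		prev_end = vmas[i].end;
	}

	/* A hole at the tail of the range. */
	return prev_end < end;
}
```

The single prev_end cursor catches a hole at the head of the range, between
adjacent VMAs, and at the tail without needing a separate pass for each.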

Finally, we simplify the actual mseal logic which is really quite
straightforward.


v3:
* Propagated more tags, thanks everyone!
* Updated 5/5 to assign curr_start in a smarter way as per Liam. Adjust
  code to more sensibly handle already-sealed case at the same time.
* Updated 4/5 to not move range_contains_unmapped() for better diff.
* Renamed can_modify_vma() to vma_is_sealed() and inverted the logic - this
  is far clearer than the nebulous 'can modify VMA'.

v2:
* Propagated tags, thanks everyone!
* Updated can_madvise_modify() to a more logical order re: the checks
  performed, as per David.
* Replaced vma_is_anonymous() check (which was, in the original code, a
  vma->vm_file or vma->vm_ops check) with a vma->vm_flags & VM_SHARED
  check - to explicitly check for shared mappings vs private to preclude
  MAP_PRIVATE file-backed mappings, as per David.
* Made range_contains_unmapped() static and placed in mm/mseal.c to avoid
  encouraging any other internal users towards this rather silly pattern,
  as per Pedro and Liam.
https://lore.kernel.org/all/cover.1752586090.git.lorenzo.stoakes@oracle.com/

v1:
https://lore.kernel.org/all/cover.1752497324.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (5):
  mm/mseal: always define VM_SEALED
  mm/mseal: update madvise() logic
  mm/mseal: small cleanups
  mm/mseal: Simplify and rename VMA gap check
  mm/mseal: rework mseal apply logic

 include/linux/mm.h                      |   6 +-
 mm/madvise.c                            |  63 +++++++++-
 mm/mprotect.c                           |   2 +-
 mm/mremap.c                             |   2 +-
 mm/mseal.c                              | 157 +++++-------------------
 mm/vma.c                                |   4 +-
 mm/vma.h                                |  27 +---
 tools/testing/selftests/mm/mseal_test.c |   3 +-
 tools/testing/vma/vma_internal.h        |   6 +-
 9 files changed, 107 insertions(+), 163 deletions(-)

--
2.50.1
Re: [PATCH v3 0/5] mseal cleanups, fixup MAP_PRIVATE file-backed case
Posted by Jeff Xu 2 months, 2 weeks ago
Hi Lorenzo,

Thanks for including me in this thread. I've just returned from
vacation and am catching up on my emails. I'll respond to each patch
separately in the following emails.

Could you consider adding mm/mseal.c to the HARDENING section of
MAINTAINERS? Please include Kees and linux-hardening in future emails
about mseal - Kees has been helping me with mseal since the beginning.

Thanks and regards,
-Jeff

On Wed, Jul 16, 2025 at 10:38 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> Perform a number of cleanups to the mseal logic. Firstly, VM_SEALED is
> treated differently from every other VMA flag. It really doesn't make sense
> to do this, so we start by making it consistent with everything else.
>
> Next we place the madvise logic where it belongs - in mm/madvise.c. It
> really makes no sense to abstract this elsewhere. In doing so, we go to
> great lengths to explain very clearly the previously very confusing logic
> as to what sealed mappings are impacted here.
>
> In doing so, we fix an existing logical oversight - previously we permitted
> an madvise() discard operation for a sealed, read-only MAP_PRIVATE
> file-backed mapping.
>
> However, this is incorrect. To see why, consider:
>
> 1. A MAP_PRIVATE R/W file-backed mapping is established.
> 2. The mapping is written to, which backs it with anonymous memory.
> 3. The mapping is mprotect()'d read-only.
> 4. The mapping is mseal()'d.
>
> At this point you have data that, once sealed, a user cannot alter, but a
> discard operation can unrecoverably remove. This contradicts the semantics
> of mseal(), so should not be permitted.
>
> We then abstract out and explain the 'are there any gaps in this range in
> the mm?' check being performed as a prerequisite to mseal being performed.
>
> Finally, we simplify the actual mseal logic which is really quite
> straightforward.
>
>
> v3:
> * Propagated more tags, thanks everyone!
> * Updated 5/5 to assign curr_start in a smarter way as per Liam. Adjust
>   code to more sensibly handle already-sealed case at the same time.
> * Updated 4/5 to not move range_contains_unmapped() for better diff.
> * Renamed can_modify_vma() to vma_is_sealed() and inverted the logic - this
>   is far clearer than the nebulous 'can modify VMA'.
>
> v2:
> * Propagated tags, thanks everyone!
> * Updated can_madvise_modify() to a more logical order re: the checks
>   performed, as per David.
> * Replaced vma_is_anonymous() check (which was, in the original code, a
>   vma->vm_file or vma->vm_ops check) with a vma->vm_flags & VM_SHARED
>   check - to explicitly check for shared mappings vs private to preclude
>   MAP_PRIVATE file-backed mappings, as per David.
> * Made range_contains_unmapped() static and placed in mm/mseal.c to avoid
>   encouraging any other internal users towards this rather silly pattern,
>   as per Pedro and Liam.
> https://lore.kernel.org/all/cover.1752586090.git.lorenzo.stoakes@oracle.com/
>
> v1:
> https://lore.kernel.org/all/cover.1752497324.git.lorenzo.stoakes@oracle.com/
>
> Lorenzo Stoakes (5):
>   mm/mseal: always define VM_SEALED
>   mm/mseal: update madvise() logic
>   mm/mseal: small cleanups
>   mm/mseal: Simplify and rename VMA gap check
>   mm/mseal: rework mseal apply logic
>
>  include/linux/mm.h                      |   6 +-
>  mm/madvise.c                            |  63 +++++++++-
>  mm/mprotect.c                           |   2 +-
>  mm/mremap.c                             |   2 +-
>  mm/mseal.c                              | 157 +++++-------------------
>  mm/vma.c                                |   4 +-
>  mm/vma.h                                |  27 +---
>  tools/testing/selftests/mm/mseal_test.c |   3 +-
>  tools/testing/vma/vma_internal.h        |   6 +-
>  9 files changed, 107 insertions(+), 163 deletions(-)
>
> --
> 2.50.1
Re: [PATCH v3 0/5] mseal cleanups, fixup MAP_PRIVATE file-backed case
Posted by Lorenzo Stoakes 2 months, 2 weeks ago
On Thu, Jul 24, 2025 at 11:32:26AM -0700, Jeff Xu wrote:
> Hi Lorenzo,
>
> Thanks for including me in this thread. I've just returned from
> vacation and am catching up on my emails. I'll respond to each patch
> separately in the following emails.

You're welcome, I promised I would always cc you, and I keep my promises as
best I can.

It's unfortunate that you're sending this review on more or less the last
day of the cycle, but there we are.

>
> Could you consider adding mm/mseal.c to the HARDENING section of
> MAINTAINERS? Please include Kees and linux-hardening in future emails
> about mseal - Kees has been helping me with mseal since the beginning.

No, because we might move any of this logic elsewhere and I consider it
fundamental to VMA logic.

I am more than happy to include Kees as well on any emails regarding
this. But it does not serve VMA logic to arbitrarily keep things in a
certain file to satisfy MAINTAINERS.
Re: [PATCH v3 0/5] mseal cleanups, fixup MAP_PRIVATE file-backed case
Posted by Lorenzo Stoakes 2 months, 1 week ago
Since there's debate about the semantics of the MAP_PRIVATE stuff I'll send
a v4 with that taken out.

I very much want the refactorings to land in 6.17.

So we can carry on debating the ins and outs of this and make any
_semantic_ changes for 6.18.

Thanks, Lorenzo