[PATCH v2 0/6] mm: Optimize mseal checks

Pedro Falcato posted 6 patches 1 year, 6 months ago
There is a newer version of this series
[PATCH v2 0/6] mm: Optimize mseal checks
Posted by Pedro Falcato 1 year, 6 months ago
[based on mm-unstable, 98808d08]

Optimize mseal checks by removing the separate can_modify_mm() step, and
just doing checks on the individual vmas, when various operations are
themselves iterating through the tree. This provides a nice speedup and restores
performance parity with pre-mseal[3].

This series also depends on (and will eventually very slightly conflict with)
the powerpc series that removes arch_unmap[2].

will-it-scale mmap1_process[1] -t 1 results:

commit 3450fe2b574b4345e4296ccae395149e1a357fee:

min:277605 max:277605 total:277605
min:281784 max:281784 total:281784
min:277238 max:277238 total:277238
min:281761 max:281761 total:281761
min:274279 max:274279 total:274279
min:254854 max:254854 total:254854
measurement
min:269143 max:269143 total:269143
min:270454 max:270454 total:270454
min:243523 max:243523 total:243523
min:251148 max:251148 total:251148
min:209669 max:209669 total:209669
min:190426 max:190426 total:190426
min:231219 max:231219 total:231219
min:275364 max:275364 total:275364
min:266540 max:266540 total:266540
min:242572 max:242572 total:242572
min:284469 max:284469 total:284469
min:278882 max:278882 total:278882
min:283269 max:283269 total:283269
min:281204 max:281204 total:281204

After this patch set:

min:280580 max:280580 total:280580
min:290514 max:290514 total:290514
min:291006 max:291006 total:291006
min:290352 max:290352 total:290352
min:294582 max:294582 total:294582
min:293075 max:293075 total:293075
measurement
min:295613 max:295613 total:295613
min:294070 max:294070 total:294070
min:293193 max:293193 total:293193
min:291631 max:291631 total:291631
min:295278 max:295278 total:295278
min:293782 max:293782 total:293782
min:290361 max:290361 total:290361
min:294517 max:294517 total:294517
min:293750 max:293750 total:293750
min:293572 max:293572 total:293572
min:295239 max:295239 total:295239
min:292932 max:292932 total:292932
min:293319 max:293319 total:293319
min:294954 max:294954 total:294954

This was a Completely Unscientific test, but it seems to show gains of around 5-10% in ops per second.

Oliver performed his own tests and showed[3] a similar ~5% gain in them.

[1]: mmap1_process does mmap and munmap in a loop. I didn't bother testing multithreading cases.
[2]: https://lore.kernel.org/all/20240807124103.85644-1-mpe@ellerman.id.au/
[3]: https://lore.kernel.org/all/ZrMMJfe9aXSWxJz6@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/all/202408041602.caa0372-oliver.sang@intel.com/

Changes in v2:
 - Rebased on top of mm-unstable
 - Removed a superfluous check in mremap (Jeff Xu)

Pedro Falcato (6):
  mm: Move can_modify_vma to mm/internal.h
  mm/munmap: Replace can_modify_mm with can_modify_vma
  mm/mprotect: Replace can_modify_mm with can_modify_vma
  mm/mremap: Replace can_modify_mm with can_modify_vma
  mseal: Replace can_modify_mm_madv with a vma variant
  mm: Remove can_modify_mm()

 mm/internal.h | 30 ++++++++++++++++++++--------
 mm/madvise.c  | 13 +++---------
 mm/mmap.c     | 13 +-----------
 mm/mprotect.c | 12 +++--------
 mm/mremap.c   | 30 ++++------------------------
 mm/mseal.c    | 55 ++++-----------------------------------------------
 mm/vma.c      | 23 ++++++++++-----------
 7 files changed, 49 insertions(+), 127 deletions(-)


base-commit: 98808d08fc0f78ee638e0c0a88020fbbaf581ec6
-- 
2.46.0
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
On Wed, Aug 7, 2024 at 2:13 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> [based on mm-unstable, 98808d08]
>
> Optimize mseal checks by removing the separate can_modify_mm() step, and
> just doing checks on the individual vmas, when various operations are
> themselves iterating through the tree. This provides a nice speedup and restores
> performance parity with pre-mseal[3].
>
> This series also depends on (and will eventually very slightly conflict with)
> the powerpc series that removes arch_unmap[2].
>
> will-it-scale mmap1_process[1] -t 1 results:
>
> commit 3450fe2b574b4345e4296ccae395149e1a357fee:
>
> min:277605 max:277605 total:277605
> min:281784 max:281784 total:281784
> min:277238 max:277238 total:277238
> min:281761 max:281761 total:281761
> min:274279 max:274279 total:274279
> min:254854 max:254854 total:254854
> measurement
> min:269143 max:269143 total:269143
> min:270454 max:270454 total:270454
> min:243523 max:243523 total:243523
> min:251148 max:251148 total:251148
> min:209669 max:209669 total:209669
> min:190426 max:190426 total:190426
> min:231219 max:231219 total:231219
> min:275364 max:275364 total:275364
> min:266540 max:266540 total:266540
> min:242572 max:242572 total:242572
> min:284469 max:284469 total:284469
> min:278882 max:278882 total:278882
> min:283269 max:283269 total:283269
> min:281204 max:281204 total:281204
>
> After this patch set:
>
> min:280580 max:280580 total:280580
> min:290514 max:290514 total:290514
> min:291006 max:291006 total:291006
> min:290352 max:290352 total:290352
> min:294582 max:294582 total:294582
> min:293075 max:293075 total:293075
> measurement
> min:295613 max:295613 total:295613
> min:294070 max:294070 total:294070
> min:293193 max:293193 total:293193
> min:291631 max:291631 total:291631
> min:295278 max:295278 total:295278
> min:293782 max:293782 total:293782
> min:290361 max:290361 total:290361
> min:294517 max:294517 total:294517
> min:293750 max:293750 total:293750
> min:293572 max:293572 total:293572
> min:295239 max:295239 total:295239
> min:292932 max:292932 total:292932
> min:293319 max:293319 total:293319
> min:294954 max:294954 total:294954
>
> This was a Completely Unscientific test, but it seems to show gains of around 5-10% in ops per second.
>
It is likely true that the test "was Completely Unscientific".
Without the necessary information, such as the standard deviation,
standard error, and a large enough sample size, it is impossible to
show that the test can reasonably measure the impact of the patch.
Unless you add that information, it would be prudent to omit this
test data from the cover letter.

> Oliver performed his own tests and showed[3] a similar ~5% gain in them.
>
> [1]: mmap1_process does mmap and munmap in a loop. I didn't bother testing multithreading cases.
> [2]: https://lore.kernel.org/all/20240807124103.85644-1-mpe@ellerman.id.au/
> [3]: https://lore.kernel.org/all/ZrMMJfe9aXSWxJz6@xsang-OptiPlex-9020/
> Link: https://lore.kernel.org/all/202408041602.caa0372-oliver.sang@intel.com/
>
> Changes in v2:
>  - Rebased on top of mm-unstable
>  - Removed a superfluous check in mremap (Jeff Xu)
Thanks for taking this suggestion. When I posted that comment, I was
studying the mremap change in V2, and I'm not 100% sure whether it is
possible, or reasonable, for the code to be written that way (even
before mseal was added).

This week, I wrote more self-tests [1] for the mremap to reason about
the code handling of mremap across the VMA boundary, as part of an
effort to test your patch.

During the process I realized that we can probably improve the mremap
code further, even without considering sealing, so I made a refactor
patch for mremap [2].

Testing is 90% of the time, if not more, when I modify kernel code.
If you agree with my mremap refactor, please feel free to rebase on
it or include it as part of your patch series.

Thanks
-Jeff

[1] https://lore.kernel.org/linux-mm/20240814071424.2655666-2-jeffxu@chromium.org/
[2] https://lore.kernel.org/linux-mm/20240814071424.2655666-3-jeffxu@chromium.org/


> Pedro Falcato (6):
>   mm: Move can_modify_vma to mm/internal.h
>   mm/munmap: Replace can_modify_mm with can_modify_vma
>   mm/mprotect: Replace can_modify_mm with can_modify_vma
>   mm/mremap: Replace can_modify_mm with can_modify_vma
>   mseal: Replace can_modify_mm_madv with a vma variant
>   mm: Remove can_modify_mm()
>
>  mm/internal.h | 30 ++++++++++++++++++++--------
>  mm/madvise.c  | 13 +++---------
>  mm/mmap.c     | 13 +-----------
>  mm/mprotect.c | 12 +++--------
>  mm/mremap.c   | 30 ++++------------------------
>  mm/mseal.c    | 55 ++++-----------------------------------------------
>  mm/vma.c      | 23 ++++++++++-----------
>  7 files changed, 49 insertions(+), 127 deletions(-)
>
>
> base-commit: 98808d08fc0f78ee638e0c0a88020fbbaf581ec6
> --
> 2.46.0
>
>
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Andrew Morton 1 year, 6 months ago
On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:

> This series also depends on (and will eventually very slightly conflict with)
> the powerpc series that removes arch_unmap[2].

That's awkward.  Please describe the dependency?
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Pedro Falcato 1 year, 6 months ago
On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> > This series also depends on (and will eventually very slightly conflict with)
> > the powerpc series that removes arch_unmap[2].
>
> That's awkward.  Please describe the dependency?

One of the transformations done in this patch series (patch 2) assumes
that arch_unmap either doesn't exist or does nothing.
PPC is the only architecture with an arch_unmap implementation, and
through the series I linked they're going to make it work via
->close().

What's the easiest way to deal with this? Can the PPC series go
through the mm tree?

I could also possibly work around this on my end, and either limit the
terribleness to the ppc arch_unmap code or limit the effectiveness of
the patch set a bit. But both of these options feel like somewhat
fighting the inevitable.

What do you think?

Thanks,
Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 6 months ago
On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > > This series also depends on (and will eventually very slightly conflict with)
> > > the powerpc series that removes arch_unmap[2].
> >
> > That's awkward.  Please describe the dependency?
>
> One of the transformations done in this patch series (patch 2) assumes
> that arch_unmap either doesn't exist or does nothing.
> PPC is the only architecture with an arch_unmap implementation, and
> through the series I linked they're going to make it work via
> ->close().
>
> What's the easiest way to deal with this? Can the PPC series go
> through the mm tree?
>
This patch series can't be merged until arch_unmap() is fully removed (the ppc change).

Also, I'm still testing and reviewing this patch series; perhaps it is
better to wait until my testing is done.

Thanks
-Jeff
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
Hi Andrew, Pedro

On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
>
> On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > > This series also depends on (and will eventually very slightly conflict with)
> > > > the powerpc series that removes arch_unmap[2].
> > >
> > > That's awkward.  Please describe the dependency?
> >
> > One of the transformations done in this patch series (patch 2) assumes
> > that arch_unmap either doesn't exist or does nothing.
> > PPC is the only architecture with an arch_unmap implementation, and
> > through the series I linked they're going to make it work via
> > ->close().
> >
> > What's the easiest way to deal with this? Can the PPC series go
> > through the mm tree?
> >
> This patch can't be merged until arch_unmap() is all removed (ppc change)
>
> Also I'm still doing a test/reviewing for this patch,  perhaps it is
> better to wait till my test is done.
>
Sorry that I'm late in updating this thread.

With the arch_unmap() removal change now landed, there is no dependency
for this patch series. However, I have other comments:

1. Testing
Testing is 90% of the work when I modify kernel code and send a patch.
So I'm a little disappointed that this patch set doesn't update or add
any tests, which makes me less confident about the regression risk to
mseal itself, i.e. a sealed mapping being overwritten by
mprotect/mmap/mremap/munmap. I have posted this comment in [1], and I
would like to repeat it to stress my point.

The V2 series doesn't have a selftest change, which indicates a lack of
testing. The out-of-loop check is positioned near the API entry
point and separated from internal business logic, thereby minimizing
the testing requirements. However, as we move the sealing check
further inward and intertwine it with business logic, greater test
coverage becomes necessary to ensure the correctness of sealing
is preserved.

Yes, I promised to run some tests, which I did, with the existing
selftest (which passed), and I also added more tests to the mremap
selftest. However, I'm bound by the time I can spend on this (my other
duties and deliverables), and I can't test the in-loop change as much
as I would like to in the time frame demanded by a dev on this mailing
list. Because this patch set is not getting tested as it should be, my
confidence in the V2 patch is low.

2. Perf testing
stress-ng is not stable in my tests on a Chromebook, and I'm asking
Oliver to take more samples [2]. This due diligence ensures that
this patch set accurately fulfills its purpose. The in-loop approach
adds complexity to the code, i.e. it is harder for future developers
to understand the sealing logic. Additionally, it sacrifices a
security property that makes it harder for an attacker to modify a
mapping (currently, if an attacker uses munmap with a large address
range and one of the addresses is sealed, the entire range is left
unmodified; with the in-loop approach, memory will be unmapped until
it hits the sealed memory). Therefore, I would like to ascertain the
gain.

3. mremap refactor work
I posted mremap refactor work [3], which is aligned with the direction
of the in-loop change, and it is also (imo) a better version of
handling error cases for mremap across multiple vma boundaries. That
patch set can be applied on its own, and the test cases it adds also
enhance the existing selftest. I hope my patch set can be reviewed
and, if it passes perf testing and is approved, applied to mm first.

Thanks
Best regards,
-Jeff

[1] https://lore.kernel.org/linux-mm/20240814071424.2655666-2-jeffxu@chromium.org/
[2] https://lore.kernel.org/linux-mm/CABi2SkXtZLojx3AQU4C=41NtBPGjVB2+fv_KWziOqyXRQ8P7Bg@mail.gmail.com/
[3] https://lore.kernel.org/linux-mm/20240814071424.2655666-1-jeffxu@chromium.org/
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Pedro Falcato 1 year, 5 months ago
On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> Hi Andrew, Pedro
>
> On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> >
> > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > >
> > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > the powerpc series that removes arch_unmap[2].
> > > >
> > > > That's awkward.  Please describe the dependency?
> > >
> > > One of the transformations done in this patch series (patch 2) assumes
> > > that arch_unmap either doesn't exist or does nothing.
> > > PPC is the only architecture with an arch_unmap implementation, and
> > > through the series I linked they're going to make it work via
> > > ->close().
> > >
> > > What's the easiest way to deal with this? Can the PPC series go
> > > through the mm tree?
> > >
> > This patch can't be merged until arch_unmap() is all removed (ppc change)
> >
> > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > better to wait till my test is done.
> >
> Sorry that I'm late for updating this thread.
>
> With removing arch_unmap() change landed , there is no dependency for
> the patch. However: I have other comments:
>
> 1. Testing
> Testing is 90% of work when I modify kernel code and send a patch.  So
> I'm a little disappointed that this patch doesn't have any test
> updates or add new tests. Which makes me less confident about the
> regression risk on mseal itself, i.e. a sealed mapping being
> overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> in  [1], and I would like to repeat it to stress my point.
>
> The V2 series doesn't have selftest change which indicates lack of
> testing. The out-of-loop check is positioned nearer to the API entry
> point and separated from internal business logic, thereby minimizing
> the testing requirements. However, as we move the sealing check
> further inward and intertwine it with business logic, greater test
> coverage becomes necessary to ensure  the correctness of  sealing
> is preserved.

Sorry, cut the crap. Your backhanded concerns are very funny.
mseal was _your_ feature. You wrote the tests. You either didn't write
or didn't understand what kinds of tests need/should be done,
otherwise you would've done them. I wrote some code to fix the awful
performance hit. It is a _fix_, not a feature. Your previous mseal
tests should've covered this. If they didn't, that's okay (we all make
mistakes), but pushing your problems onto me is seriously laughable.

I want to stress this bit: There's no mseal feature, there's no
business logic. There's mmap, munmap, mprotect, madvise, mremap (among
others). These are the things people actually care about, and need to
stay fast. Memory management is a core part of the kernel. All of the
very pretty abstractions you're talking about aren't applicable to
kernel code (for any kernel, really) in general. No one wants to pay
the cost of walking through a data structure 2 or 3 times just to
"separate the business logic" for a random, minimally simple feature.

>
> Yes. I promised to run some tests, which I did, with the existing self
> test (that passed),  also I added more tests in the mremap selftest.
> However I'm bound by the time that I can spend on this  (my other
> duties and deliverables), I can't test it as much as I like to for
> in-loop change (in a time frame demanded by a dev in this ml). Because
> this patch is not getting tested as it should be, my confidence for
> the V2 patch is low .

Ok so you: tried to explain to me how to run selftests in v1 (when I
actively did _before sending_, and found a bug in your tests, and
wrote about it in-depth), pledge to "run some tests", never get back
to us, push the "the testsuite I wrote is lacking" concern onto me,
send a whole separate parallel patch set that tries to address _one_
of the regressions with complete disregard for the work done here,
complain about a lack of time, and now say a backhanded "your
confidence for the V2 patch is low".

I seriously have no words.
I want to stress I have no way to test real software that uses mseal
because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.

>
> 2 perf testing
> stress-ng is not stable in my test with Chromebook, and I'm requesting
>  Oliver to take more samples [2] . This due diligence assures that
> this patch accurately fulfills its purpose. The in-loop approach adds
> complexity to the code, i.e. future dev is harder to understand the
> sealing logic. Additionally, it sacrifices a security feature that
> makes it harder for an attacker to modify mapping (currently if an
> attacker uses munmap with a large address range, if one of the
> addresses is sealed, the entire range is not modified. In the in-loop
> approach,  memory will be unmapped till it hits the sealed memory).

Wrong. munmap is atomic and always has been. It's required by POSIX.

I would also ask you to back these partial mprotect (assuming that's
what you meant) concerns with a real attack that takes this into
account, instead of hand waving "security".
While noting that all of these operations can already partially fail,
this is not new (and is explicitly permitted by POSIX for
syscalls-not-named-munmap).

> Therefore, I would like to ascertain the gain.

The gain is real. I've proven it with will-it-scale, Oliver ran his
tests and found the regression to be gone (and they do seem to do
proper statistical analysis).
I would wager you to find a workload or hardware where *doing
substantially less work* makes for slower code.

>
> 3 mremap refactor work.

mremap refactoring is not related to these mmap regressions. In the v3
I'm cleaning up and sending out tomorrow, I minimally change mremap
(as Liam wanted).

-- 
Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
Hi Pedro,

(resent, the previous email had an html link)

On Thu, Aug 15, 2024 at 6:58 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > Hi Andrew, Pedro
> >
> > On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> > >
> > > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > > >
> > > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > > the powerpc series that removes arch_unmap[2].
> > > > >
> > > > > That's awkward.  Please describe the dependency?
> > > >
> > > > One of the transformations done in this patch series (patch 2) assumes
> > > > that arch_unmap either doesn't exist or does nothing.
> > > > PPC is the only architecture with an arch_unmap implementation, and
> > > > through the series I linked they're going to make it work via
> > > > ->close().
> > > >
> > > > What's the easiest way to deal with this? Can the PPC series go
> > > > through the mm tree?
> > > >
> > > This patch can't be merged until arch_unmap() is all removed (ppc change)
> > >
> > > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > > better to wait till my test is done.
> > >
> > Sorry that I'm late for updating this thread.
> >
> > With removing arch_unmap() change landed , there is no dependency for
> > the patch. However: I have other comments:
> >
> > 1. Testing
> > Testing is 90% of work when I modify kernel code and send a patch.  So
> > I'm a little disappointed that this patch doesn't have any test
> > updates or add new tests. Which makes me less confident about the
> > regression risk on mseal itself, i.e. a sealed mapping being
> > overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> > in  [1], and I would like to repeat it to stress my point.
> >
> > The V2 series doesn't have selftest change which indicates lack of
> > testing. The out-of-loop check is positioned nearer to the API entry
> > point and separated from internal business logic, thereby minimizing
> > the testing requirements. However, as we move the sealing check
> > further inward and intertwine it with business logic, greater test
> > coverage becomes necessary to ensure  the correctness of  sealing
> > is preserved.
>
> Sorry, cut the crap. Your backhanded concerns are very funny.
> mseal was _your_ feature. You wrote the tests. You either didn't write
> or didn't understand what kinds of tests need/should be done,
> otherwise you would've done them. I wrote some code to fix the awful
> performance hit. It is a _fix_, not a feature. Your previous mseal
> tests should've covered this. If they didn't, that's okay (we all make
> mistakes), but pushing your problems onto me is seriously laughable.
>
I posted the comments about the lack of a test update in V2 last
Monday, and there was no response from you until Thursday night (which
is OK). If you were expecting me to update the test cases in support of
your patch series, you should have explicitly called that out.

So your point here is that you wrote the code, passed the existing
selftest, and fixed the perf, so the job is done. If the tests don't
cover the new cases you added, it is not your problem; you only need
to care about the perf part.

This is how regressions happened in the past, e.g.:

commit 77795f900e2a07c1cbedc375789aefb43843b6c2
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Tue Jun 6 14:29:12 2023 -0400

    mm/mprotect: fix do_mprotect_pkey() limit check

    The return of do_mprotect_pkey() can still be incorrectly returned as
    success if there is a gap that spans to or beyond the end address passed
    in.  Update the check to ensure that the end address has indeed been seen.

    Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/
    Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com
    Fixes: 82f951340f25 ("mm/mprotect: fix do_mprotect_pkey() return on error")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Jeff Xu <jeffxu@chromium.org>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


Had I not written a selftest to discover this mprotect regression, it
would have gone unnoticed and stayed that way.

My point is: the existing selftest for mseal was written for the
out-of-loop check, and will not fully test the in-loop one. Your patch
set makes a big change to mseal; more tests are needed.

To move forward from this situation with your patch series, either you
or I will need to update the tests. How about sharing the load?

> I want to stress this bit: There's no mseal feature, there's no
> business logic. There's mmap, munmap, mprotect, madvise, mremap (among
> others). These are the things people actually care about, and need to
> stay fast. Memory management is a core part of the kernel. All of the
> very pretty abstractions you're talking about aren't applicable to
> kernel code (for any kernel, really) in general. No one wants to pay
> the cost of walking through a data structure 2 or 3 times just to
> "separate the business logic" for a random, minimally simple feature.
>
The testing is about making sure that sealing is preserved after
mmap/mremap/munmap/mprotect calls; there is no real software that will
do that kind of testing, even in the future, so selftests are the only way.

Security features never get adopted quickly; adding a syscall to the
kernel is the first step toward letting apps use it.

> >
> > Yes. I promised to run some tests, which I did, with the existing self
> > test (that passed),  also I added more tests in the mremap selftest.
> > However I'm bound by the time that I can spend on this  (my other
> > duties and deliverables), I can't test it as much as I like to for
> > in-loop change (in a time frame demanded by a dev in this ml). Because
> > this patch is not getting tested as it should be, my confidence for
> > the V2 patch is low .
>
> Ok so you: tried to explain to me how to run selftests in v1 (when I
> actively did _before sending_, and found a bug in your tests, and
> wrote about it in-depth), pledge to "run some tests", never get back
> to us, push the "the testsuite I wrote is lacking" concern onto me,
> send a whole separate parallel patch set that tries to address _one_
> of the regressions with complete disregard for the work done here,
> complain about a lack of time, and now say a backhanded "your
> confidence for the V2 patch is low".
>
> I seriously have no words.
> I want to stress I have no way to test real software that uses mseal
> because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
> CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.
>
> >
> > 2 perf testing
> > stress-ng is not stable in my test with Chromebook, and I'm requesting
> >  Oliver to take more samples [2] . This due diligence assures that
> > this patch accurately fulfills its purpose. The in-loop approach adds
> > complexity to the code, i.e. future dev is harder to understand the
> > sealing logic. Additionally, it sacrifices a security feature that
> > makes it harder for an attacker to modify mapping (currently if an
> > attacker uses munmap with a large address range, if one of the
> > addresses is sealed, the entire range is not modified. In the in-loop
> > approach,  memory will be unmapped till it hits the sealed memory).
>
> Wrong. munmap is atomic and always has been. It's required by POSIX.
>
Please run this test on the latest kernel branch to verify:

static void test_munmap_free_multiple_ranges(bool seal)
{
        void *ptr;
        unsigned long page_size = getpagesize();
        unsigned long size = 8 * page_size;
        int ret;
        int prot;

        setup_single_address(size, &ptr);
        FAIL_TEST_IF_FALSE(ptr != (void *)-1);

        /* unmap one page from the beginning. */
        ret = sys_munmap(ptr, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* unmap one page from the middle. */
        ret = sys_munmap(ptr + 4 * page_size, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* seal the last page */
        if (seal) {
                ret = sys_mseal(ptr + 7 * page_size, page_size);
                FAIL_TEST_IF_FALSE(!ret);
        }

        /* munmap all 8 pages from the beginning */
        ret = sys_munmap(ptr, 8 * page_size);
        if (seal) {
                FAIL_TEST_IF_FALSE(ret < 0);

                /* verify none of the existing pages in the range are unmapped */
                size = get_vma_size(ptr + page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 3 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr + 5 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 2 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr + 7 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 1 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);
        } else {
                FAIL_TEST_IF_FALSE(!ret);

                /* verify all pages are unmapped */
                for (int i = 0; i < 8; i++) {
                        size = get_vma_size(ptr + i * page_size, &prot);
                        FAIL_TEST_IF_FALSE(size == 0);
                }
        }

        REPORT_TEST_PASS();
}

test_munmap_free_multiple_ranges(true)
test_munmap_free_multiple_ranges(false)

> I would also ask you to back these partial mprotect (assuming that's
> what you meant) concerns with a real attack that takes this into
> account, instead of hand waving "security".
> While noting that all of these operations can already partially fail,
> this is not new (and is explicitly permitted by POSIX for
> syscalls-not-named-munmap).
>
As defences get stronger with seccomp, SELinux, and Landlock, attackers
now have to find an easier route.

> > Therefore, I would like to ascertain the gain.
>
> The gain is real. I've proven it with will-it-scale, Oliver ran his
> tests and found the regression to be gone (and they do seem to do
> proper statistical analysis).
> I would wager you to find a workload or hardware where *doing
> substantially less work* makes for slower code.
>
> >
> > 3 mremap refactor work.
>
> mremap refactoring is not related to these mmap regressions. In the v3
> I'm cleaning up and sending out tomorrow, I minimally change mremap
> (as Liam wanted).
>
If the test issue is not resolved, V3 will be in the same situation as V2.

Best Regards,
-Jeff

> --
> Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
Hi Pedro

On Thu, Aug 15, 2024 at 6:58 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > Hi Andrew, Pedro
> >
> > On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> > >
> > > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > > >
> > > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > > the powerpc series that removes arch_unmap[2].
> > > > >
> > > > > That's awkward.  Please describe the dependency?
> > > >
> > > > One of the transformations done in this patch series (patch 2) assumes
> > > > that arch_unmap either doesn't exist or does nothing.
> > > > PPC is the only architecture with an arch_unmap implementation, and
> > > > through the series I linked they're going to make it work via
> > > > ->close().
> > > >
> > > > What's the easiest way to deal with this? Can the PPC series go
> > > > through the mm tree?
> > > >
> > > This patch can't be merged until arch_unmap() is all removed (ppc change)
> > >
> > > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > > better to wait till my test is done.
> > >
> > Sorry that I'm late for updating this thread.
> >
> > With removing arch_unmap() change landed , there is no dependency for
> > the patch. However: I have other comments:
> >
> > 1. Testing
> > Testing is 90% of work when I modify kernel code and send a patch.  So
> > I'm a little disappointed that this patch doesn't have any test
> > updates or add new tests. Which makes me less confident about the
> > regression risk on mseal itself, i.e. a sealed mapping being
> > overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> > in  [1], and I would like to repeat it to stress my point.
> >
> > The V2 series doesn't have selftest change which indicates lack of
> > testing. The out-of-loop check is positioned nearer to the API entry
> > point and separated from internal business logic, thereby minimizing
> > the testing requirements. However, as we move the sealing check
> > further inward and intertwine it with business logic, greater test
> > coverage becomes necessary to ensure  the correctness of  sealing
> > is preserved.
>
> Sorry, cut the crap. Your backhanded concerns are very funny.
> mseal was _your_ feature. You wrote the tests. You either didn't write
> or didn't understand what kinds of tests need/should be done,
> otherwise you would've done them. I wrote some code to fix the awful
> performance hit. It is a _fix_, not a feature. Your previous mseal
> tests should've covered this. If they didn't, that's okay (we all make
> mistakes), but pushing your problems onto me is seriously laughable.
>
I posted the comments about the lack of test updates in V2 last
Monday; there was no response from you until Thursday night (which is
OK). If you were expecting me to update the test cases to support
your patch series, you should have explicitly called that out.

So your point here is that you wrote the code, passed the existing
selftests, fixed the perf, and the job is done. If the tests don't
cover the new cases you added, that is not your problem; you only need
to care about the perf part.

This is how regressions happened in the past, e.g.:

commit 77795f900e2a07c1cbedc375789aefb43843b6c2
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Tue Jun 6 14:29:12 2023 -0400

    mm/mprotect: fix do_mprotect_pkey() limit check

    The return of do_mprotect_pkey() can still be incorrectly returned as
    success if there is a gap that spans to or beyond the end address passed
    in.  Update the check to ensure that the end address has indeed been seen.

    Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/
    Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com
    Fixes: 82f951340f25 ("mm/mprotect: fix do_mprotect_pkey() return on error")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Jeff Xu <jeffxu@chromium.org>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Had I not written a selftest that discovered this mprotect
regression, it would have gone unnoticed and stayed that way.

My point is: the existing selftests for mseal are written for the
out-of-loop check, and will not fully test the in-loop approach. Your
patch makes a big change to mseal; more tests are needed.

To move forward from this situation with your patch series, either
you or I will need to update the tests. How about sharing the load?

> I seriously have no words.
> I want to stress I have no way to test real software that uses mseal
> because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
> CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.
>
The testing is about making sure that sealing is preserved across
mmap/mremap/munmap/mprotect calls; there is no real software that will
do that kind of testing, even in the future. Selftests are the only way.

Security features never come quickly; adding a syscall to the kernel
is the first step to allow apps to use it.

> >
> > 2 perf testing
> > stress-ng is not stable in my test with Chromebook, and I'm requesting
> >  Oliver to take more samples [2] . This due diligence assures that
> > this patch accurately fulfills its purpose. The in-loop approach adds
> > complexity to the code, i.e. future dev is harder to understand the
> > sealing logic. Additionally, it sacrifices a security feature that
> > makes it harder for an attacker to modify mapping (currently if an
> > attacker uses munmap with a large address range, if one of the
> > addresses is sealed, the entire range is not modified. In the in-loop
> > approach,  memory will be unmapped till it hits the sealed memory).
>
> Wrong. munmap is atomic and always has been. It's required by POSIX.
>
Please run this test on the latest kernel branch to verify:

static void test_munmap_free_multiple_ranges(bool seal)
{
        void *ptr;
        unsigned long page_size = getpagesize();
        unsigned long size = 8 * page_size;
        int ret;
        int prot;

        setup_single_address(size, &ptr);
        FAIL_TEST_IF_FALSE(ptr != (void *)-1);

        /* unmap one page from beginning. */
        ret = sys_munmap(ptr, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* unmap one page from middle. */
        ret = sys_munmap(ptr + 4 * page_size, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* seal the last page */
        if (seal) {
                ret = sys_mseal(ptr + 7 * page_size, page_size);
                FAIL_TEST_IF_FALSE(!ret);
        }

        /* munmap all 8  pages from beginning */
        ret = sys_munmap(ptr, 8 * page_size);
        if (seal) {
                FAIL_TEST_IF_FALSE(ret < 0);

                /* verify none of existing pages in  the range are unmapped */
                size = get_vma_size(ptr + page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 3 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  5 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 2 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  7 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 1 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);
        } else {
                FAIL_TEST_IF_FALSE(!ret);

                /* verify all pages are unmapped */
                for (int i = 0; i < 8; i++) {
                        size = get_vma_size(ptr, &prot);
                        FAIL_TEST_IF_FALSE(size == 0);
                }
        }

        REPORT_TEST_PASS();
}

test_munmap_free_multiple_ranges(true)
test_munmap_free_multiple_ranges(false)

> I would also ask you to back these partial mprotect (assuming that's
> what you meant) concerns with a real attack that takes this into
> account, instead of hand waving "security".
> While noting that all of these operations can already partially fail,
> this is not new (and is explicitly permitted by POSIX for
> syscalls-not-named-munmap).
>
As defences get stronger with seccomp, SELinux, and Landlock,
attackers now have to find an easier route.

> > Therefore, I would like to ascertain the gain.
>
> The gain is real. I've proven it with will-it-scale, Oliver ran his
> tests and found the regression to be gone (and they do seem to do
> proper statistical analysis).
> I would wager you to find a workload or hardware where *doing
> substantially less work* makes for slower code.
>
> >
> > 3 mremap refactor work.
>
> mremap refactoring is not related to these mmap regressions. In the v3
> I'm cleaning up and sending out tomorrow, I minimally change mremap
> (as Liam wanted).
>
If the test issue is not resolved, V3 will be in the same situation as V2.

Best Regards,
-Jeff

> --
> Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Pedro Falcato 1 year, 5 months ago
On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> Please run this test on the latest kernel branch to verify:
>
> [test code snipped]
>

FWIW this test does not work correctly on my end due to sealed VMAs
getting merged. I hacked up setup_single_address to work around that,
and the test does pass on both 6.10.5 and my local mseal changes
branch.

-- 
Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > Please run this test on the latest kernel branch to verify:
> >
> > [test code snipped]
> >
>
> FWIW this test does not work correctly on my end due to sealed VMAs
> getting merged. I hacked up setup_single_address to work around that,
> and the test does pass on both 6.10.5 and my local mseal changes
> branch.
Yes, you would need to comment out the other tests and run this test
only; it didn't consider the case where a sealed VMA merges with
another sealed VMA (created by another test).

The test didn't pass with the V2 patch (in the seal == true case).

-Jeff

>
> --
> Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Pedro Falcato 1 year, 5 months ago
On Fri, Aug 16, 2024 at 7:20 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > > Please run this test on the latest kernel branch to verify:
> > >
> > > [test code snipped]
> > >
> >
> > FWIW this test does not work correctly on my end due to sealed VMAs
> > getting merged. I hacked up setup_single_address to work around that,
> > and the test does pass on both 6.10.5 and my local mseal changes
> > branch.
> Yes. you would need to comment out other tests and run this test only,
> it didn't consider the case that sealed vma will merge with another
> sealed vma (created from another test)
>
> The test didn't pass with the V2 patch (the seal = true) case.

Because we... found a bug in my munmap changes. The fixed v3 I'm
planning to send out does indeed pass.

-- 
Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Jeff Xu 1 year, 5 months ago
On Fri, Aug 16, 2024 at 11:26 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 7:20 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > > > Please run this test on the latest kernel branch to verify:
> > > >
> > > > [test code snipped]
> > > >
> > >
> > > FWIW this test does not work correctly on my end due to sealed VMAs
> > > getting merged. I hacked up setup_single_address to work around that,
> > > and the test does pass on both 6.10.5 and my local mseal changes
> > > branch.
> > Yes. you would need to comment out other tests and run this test only,
> > it didn't consider the case that sealed vma will merge with another
> > sealed vma (created from another test)
> >
> > The test didn't pass with the V2 patch (the seal = true) case.
>
> Because we... found a bug in my munmap changes. The fixed v3 I'm
> planning to send out does indeed pass.
>
OK, I think you got my point.
Glad to hear that you are doing more testing.

Thanks
-Jeff

> --
> Pedro
Re: [PATCH v2 0/6] mm: Optimize mseal checks
Posted by Liam R. Howlett 1 year, 6 months ago
* Jeff Xu <jeffxu@google.com> [240808 21:03]:
> On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > > This series also depends on (and will eventually very slightly conflict with)
> > > > the powerpc series that removes arch_unmap[2].
> > >
> > > That's awkward.  Please describe the dependency?
> >
> > One of the transformations done in this patch series (patch 2) assumes
> > that arch_unmap either doesn't exist or does nothing.
> > PPC is the only architecture with an arch_unmap implementation, and
> > through the series I linked they're going to make it work via
> > ->close().
> >
> > What's the easiest way to deal with this? Can the PPC series go
> > through the mm tree?
> >
> This patch can't be merged until arch_unmap() is all removed (ppc change)

I would say that you should just leave the arch_unmap() call in place
for now.

Are we seriously worried that a powerpc user will try to unmap the VDSO
and then get an error that it was mseal'ed after setting the VDSO
pointer to NULL?  I'm more worried about the Perseid meteor shower
hitting me - aka the sky falling.

Note that we could already run into an error after setting the vdso
pointer to null today.

...

Thanks,
Liam