[PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag

Lorenzo Stoakes posted 2 patches 2 months, 3 weeks ago
There is a newer version of this series
include/linux/mm.h                      | 23 ++++++-----
tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
tools/testing/vma/vma_internal.h        | 23 ++++++-----
3 files changed, 72 insertions(+), 25 deletions(-)
[PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
establishing a new VMA, or via merge) as implemented in __mmap_complete()
and do_brk_flags().

However, when performing a merge of existing mappings such as when
performing mprotect(), we may lose the VM_SOFTDIRTY flag.


Lorenzo Stoakes (2):
  mm: propagate VM_SOFTDIRTY on merge
  testing/selftests/mm: add soft-dirty merge self-test

 include/linux/mm.h                      | 23 ++++++-----
 tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
 tools/testing/vma/vma_internal.h        | 23 ++++++-----
 3 files changed, 72 insertions(+), 25 deletions(-)

--
2.51.0
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Andrei Vagin 2 months, 3 weeks ago
On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
>
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.

Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.

A separate concern is whether merging two VMAs should be permitted when
one has the VM_SOFTDIRTY flag set and another does not. I think the
merging operation should be disallowed.The  issue is that
PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
Consider a scenario where a large VMA has only a small number of pages
marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
these pages again, even though the vast majority of them were unchanged
since the prior checkpoint iteration.

Thanks,
Andrei

>
>
> Lorenzo Stoakes (2):
>   mm: propagate VM_SOFTDIRTY on merge
>   testing/selftests/mm: add soft-dirty merge self-test
>
>  include/linux/mm.h                      | 23 ++++++-----
>  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
>  tools/testing/vma/vma_internal.h        | 23 ++++++-----
>  3 files changed, 72 insertions(+), 25 deletions(-)
>
> --
> 2.51.0
>
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
>
> Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
>
> A separate concern is whether merging two VMAs should be permitted when
> one has the VM_SOFTDIRTY flag set and another does not. I think the
> merging operation should be disallowed.The  issue is that


This patch doesn't change anything in terms of merging, it only _correctly_
marks VMAs as soft-dirty where certain, very specific, circumstances might
result in a merged VMA being incorrectly indicated to not be soft-dirty
when it in fact contains pages which are.

Since VMA fragmentation is an issue that impacts non-softydirty users, I'm
afraid we cannot split on this parameter.

It'd also be a user-visible change that could cause breaking issues
(mremap() for instances in _most_ cases requires that it operates on a
single VMA).

So this isn't possible.


> PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
> Consider a scenario where a large VMA has only a small number of pages
> marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
> VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
> reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
> these pages again, even though the vast majority of them were unchanged
> since the prior checkpoint iteration.

I think there's some confusion about what is possible here.

Currently if you don't invoke /proc/$pid/clear_refs, all VMAs will have
soft-dirty set until you do.

So this is a situation that _already exists_.

And intentionally so - we default all VMAs to soft-dirty so users can
detect new mappings in order not to perceive e.g. mmap()'ing over an
existing range as as being no change.

OK so what if you clear references? Considering:

1. Map large VMA
2. Clear references
3. Dirty several pages (VM_SOFTDIRTY clear)
4. Map new VMA immediately _after_ it (VM_SOFTDIRTY set)
5. Merge - Before this patch: VM_SOFTDIRTY bit cleared on merge BUT SET
   AGAIN due to it being an mmap() invocation. After this patch:
   VM_SOFTDIRTY bit retained on merge but also set again due to it being an
   mmap() invocation.

So this kind of merge has no change in behaviour.

And again, it's correct - the user needs to be able to identify what's
changed.

This change fixes this behaviour to be consistent for other types of merge,
when previously it was not.

In the past, you'd get soft-dirty set/not set _depending on the type of
merge_. So if the target VMA had the flag set, you'd have it marked
soft-dirty, otherwise not.

Since it's unacceptabale to fragment VMAs on the basis of soft-dirty, we're
_only_ improving correctness here, and this patch is a net good no matter
what.


>
> Thanks,
> Andrei
>
> >
> >
> > Lorenzo Stoakes (2):
> >   mm: propagate VM_SOFTDIRTY on merge
> >   testing/selftests/mm: add soft-dirty merge self-test
> >
> >  include/linux/mm.h                      | 23 ++++++-----
> >  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
> >  tools/testing/vma/vma_internal.h        | 23 ++++++-----
> >  3 files changed, 72 insertions(+), 25 deletions(-)
> >
> > --
> > 2.51.0
> >

Cheers, Lorenzo
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Andrei Vagin 2 months, 3 weeks ago
On Mon, Nov 17, 2025 at 3:33 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> > On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > >
> > > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > > and do_brk_flags().
> > >
> > > However, when performing a merge of existing mappings such as when
> > > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
> > Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> >
> > A separate concern is whether merging two VMAs should be permitted when
> > one has the VM_SOFTDIRTY flag set and another does not. I think the
> > merging operation should be disallowed.The  issue is that
>
>
> This patch doesn't change anything in terms of merging, it only _correctly_
> marks VMAs as soft-dirty where certain, very specific, circumstances might
> result in a merged VMA being incorrectly indicated to not be soft-dirty
> when it in fact contains pages which are.

As I mentioned in the previous message, this patch is correct, and I
appreciate your effort to solve this issue. My comment was about whether
we should allow merging VMAs if one has VM_SOFTDIRTY and the other does
not. You are right, this is a separate question unrelated to this patch.

I recall correctly that initially, merging vma-s with different
VM_SORTDIRTY bit values was not allowed. It was a bit surprising that
this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
active CRIU contributor at the time, so we can't even blame anyone for
breaking CRIU :).

Thanks,
Andrei
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
On Mon, Nov 17, 2025 at 10:26:34AM -0800, Andrei Vagin wrote:
> On Mon, Nov 17, 2025 at 3:33 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> > > On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> > > <lorenzo.stoakes@oracle.com> wrote:
> > > >
> > > > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > > > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > > > and do_brk_flags().
> > > >
> > > > However, when performing a merge of existing mappings such as when
> > > > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> > >
> > > Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> > >
> > > A separate concern is whether merging two VMAs should be permitted when
> > > one has the VM_SOFTDIRTY flag set and another does not. I think the
> > > merging operation should be disallowed.The  issue is that
> >
> >
> > This patch doesn't change anything in terms of merging, it only _correctly_
> > marks VMAs as soft-dirty where certain, very specific, circumstances might
> > result in a merged VMA being incorrectly indicated to not be soft-dirty
> > when it in fact contains pages which are.
>
> As I mentioned in the previous message, this patch is correct, and I
> appreciate your effort to solve this issue. My comment was about whether
> we should allow merging VMAs if one has VM_SOFTDIRTY and the other does
> not. You are right, this is a separate question unrelated to this patch.

Thanks :)

>
> I recall correctly that initially, merging vma-s with different
> VM_SORTDIRTY bit values was not allowed. It was a bit surprising that
> this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> active CRIU contributor at the time, so we can't even blame anyone for
> breaking CRIU :).

Well I think Cyrill is in the right here :) the problem described there -
that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
- is very serious and clearly highlights the issue that arises from not
merging these - that is VMA fragmentation.

>
> Thanks,
> Andrei

Cheers, Lorenzo
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Cyrill Gorcunov 2 months, 2 weeks ago
On Mon, Nov 17, 2025 at 07:57:30PM +0000, Lorenzo Stoakes wrote:
...
> > I recall correctly that initially, merging vma-s with different
> > VM_SORTDIRTY bit values was not allowed. It was a bit surprising that
> > this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> > active CRIU contributor at the time, so we can't even blame anyone for
> > breaking CRIU :).
> 
> Well I think Cyrill is in the right here :) the problem described there -
> that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
> - is very serious and clearly highlights the issue that arises from not
> merging these - that is VMA fragmentation.

Hi guys! Happen to miss this thread due to high message traffic, thanks
for CC'ing me ;) Yeah, disability to merge VMAs due to softdirty bit has
been a serious issue, so better criu to dump more (redundant) memory
than apps got broken. As to this patch series, I think it is good, thanks
a huge Lorenzo!

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Lorenzo Stoakes 2 months, 2 weeks ago
On Wed, Nov 19, 2025 at 04:09:49PM +0300, Cyrill Gorcunov wrote:
> On Mon, Nov 17, 2025 at 07:57:30PM +0000, Lorenzo Stoakes wrote:
> ...
> > > I recall correctly that initially, merging vma-s with different
> > > VM_SORTDIRTY bit values was not allowed. It was a bit surprising that
> > > this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> > > active CRIU contributor at the time, so we can't even blame anyone for
> > > breaking CRIU :).
> >
> > Well I think Cyrill is in the right here :) the problem described there -
> > that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
> > - is very serious and clearly highlights the issue that arises from not
> > merging these - that is VMA fragmentation.
>
> Hi guys! Happen to miss this thread due to high message traffic, thanks
> for CC'ing me ;) Yeah, disability to merge VMAs due to softdirty bit has
> been a serious issue, so better criu to dump more (redundant) memory
> than apps got broken. As to this patch series, I think it is good, thanks
> a huge Lorenzo!
>
> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

Thanks :)
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Anshuman Khandual 2 months, 3 weeks ago
On 17/11/25 6:23 AM, Andrei Vagin wrote:
> On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
>>
>> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
>> establishing a new VMA, or via merge) as implemented in __mmap_complete()
>> and do_brk_flags().
>>
>> However, when performing a merge of existing mappings such as when
>> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 
> Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> 
> A separate concern is whether merging two VMAs should be permitted when
> one has the VM_SOFTDIRTY flag set and another does not. I think the
> merging operation should be disallowed.The  issue is that

If merging VM_SOFTDIRTY and non-VM_SOFTDIRTY VMAs would not be allowed then
what is the point for moving VM_SOFTDIRTY as VM_STICKY ?
> PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
> Consider a scenario where a large VMA has only a small number of pages
> marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
> VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
> reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
> these pages again, even though the vast majority of them were unchanged
> since the prior checkpoint iteration.
> 
> Thanks,
> Andrei
> 
>>
>>
>> Lorenzo Stoakes (2):
>>   mm: propagate VM_SOFTDIRTY on merge
>>   testing/selftests/mm: add soft-dirty merge self-test
>>
>>  include/linux/mm.h                      | 23 ++++++-----
>>  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
>>  tools/testing/vma/vma_internal.h        | 23 ++++++-----
>>  3 files changed, 72 insertions(+), 25 deletions(-)
>>
>> --
>> 2.51.0
>>
> 

Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Andrew Morton 2 months, 3 weeks ago
On Fri, 14 Nov 2025 17:53:17 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 

userspace-visible effects?

Documentation/admin-guide/mm/soft-dirty.rst tells me that this can
already happen in other circumstances so I guess it isn't very serious.
CRIU inefficiency.  perhaps?

Please review Documentation/admin-guide/mm/soft-dirty.rst, check that
it is complete and accurate?
Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
Posted by Lorenzo Stoakes 2 months, 3 weeks ago
On Fri, Nov 14, 2025 at 01:53:03PM -0800, Andrew Morton wrote:
> On Fri, 14 Nov 2025 17:53:17 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
>
> userspace-visible effects?

Simply more correct accounting of soft-dirty :)

>
> Documentation/admin-guide/mm/soft-dirty.rst tells me that this can
> already happen in other circumstances so I guess it isn't very serious.
> CRIU inefficiency.  perhaps?

I don't think it should cause inefficiency other than us already _accidentally_
being more efficient, see the discussion in thread :)

>
> Please review Documentation/admin-guide/mm/soft-dirty.rst, check that
> it is complete and accurate?

It LGTM, we are changing some very specific internal implementation detail here
which I don't think is worth mentioning here (effectively - 'we used to be wrong
sometimes, now not so much' :)

(Staring at tihs I realise I _probably_ need to change something very specific
in the original sticky implementation. The VMA implementation is so
finnicky... will send follow up fixpatch/respin on that w/details)

Cheers, Lorenzo