[PATCH] mm: skip folio_activate() for mlocked folios

Dmitry Ilvokhin posted 1 patch 2 months, 2 weeks ago
There is a newer version of this series
Posted by Dmitry Ilvokhin 2 months, 2 weeks ago
__mlock_folio() should update stats, when lruvec_add_folio() is called,
but if folio_test_clear_lru() check failed, then __mlock_folio() gives
up early. From the other hand, folio_mark_accessed() calls
folio_activate() which also calls folio_test_clear_lru() down the line.
When folio_activate() successfully removed folio from LRU,
__mlock_folio() will not update any stats, which will lead to inaccurate
values in /proc/meminfo as well as cgroup memory.stat.

To prevent this case from happening also check for folio_test_mlocked()
in folio_mark_accessed(). If folio is not yet marked as unevictable, but
already marked as mlocked, then skip folio_activate() call to allow
__mlock_folio() to make all necessary updates.

To observe the problem mmap() and mlock() big file and check Unevictable
and Mlocked values from /proc/meminfo. On freshly booted system without
any other mlocked memory we expect them to match or be quite close.

See below for more detailed reproduction steps. Source code of stat.c
is available at [1].

  $ head -c 8G < /dev/urandom > /tmp/random.bin

  $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
  $ /tmp/stat
  Unevictable:     8389668 kB
  Mlocked:         8389700 kB

  Need to run binary twice. Problem does not reproduce on the first run,
  but always reproduces on the second run.

  $ /tmp/stat
  Unevictable:     5374676 kB
  Mlocked:         8389332 kB

[1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd

Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 mm/swap.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/swap.c b/mm/swap.c
index 2260dcd2775e..f682f070160b 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
 		 * this list is never rotated or maintained, so marking an
 		 * unevictable page accessed has no effect.
 		 */
+	} else if (folio_test_mlocked(folio)) {
+		/*
+		 * Pages that are mlocked, but not yet on unevictable LRU.
+		 * They might be still in mlock_fbatch waiting to be processed
+		 * and activating it here might interfere with
+		 * mlock_folio_batch(). __mlock_folio() will fail
+		 * folio_test_clear_lru() check and give up. It happens because
+		 * __folio_batch_add_and_move() clears LRU flag, when adding
+		 * folio to activate batch.
+		 */
 	} else if (!folio_test_active(folio)) {
 		/*
 		 * If the folio is on the LRU, queue it for activation via
-- 
2.47.3
Re: [PATCH] mm: skip folio_activate() for mlocked folios
Posted by Kiryl Shutsemau 2 months, 2 weeks ago
On Fri, Oct 03, 2025 at 02:19:55PM +0000, Dmitry Ilvokhin wrote:
> __mlock_folio() should update stats, when lruvec_add_folio() is called,

The stats update is incidental to moving the folio to the unevictable LRU. But okay.

> but if folio_test_clear_lru() check failed, then __mlock_folio() gives
> up early. From the other hand, folio_mark_accessed() calls
> folio_activate() which also calls folio_test_clear_lru() down the line.
> When folio_activate() successfully removed folio from LRU,
> __mlock_folio() will not update any stats, which will lead to inaccurate
> values in /proc/meminfo as well as cgroup memory.stat.
> 
> To prevent this case from happening also check for folio_test_mlocked()
> in folio_mark_accessed(). If folio is not yet marked as unevictable, but
> already marked as mlocked, then skip folio_activate() call to allow
> __mlock_folio() to make all necessary updates.
> 
> To observe the problem mmap() and mlock() big file and check Unevictable
> and Mlocked values from /proc/meminfo. On freshly booted system without
> any other mlocked memory we expect them to match or be quite close.
> 
> See below for more detailed reproduction steps. Source code of stat.c
> is available at [1].
> 
>   $ head -c 8G < /dev/urandom > /tmp/random.bin
> 
>   $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
>   $ /tmp/stat
>   Unevictable:     8389668 kB
>   Mlocked:         8389700 kB
> 
>   Need to run binary twice. Problem does not reproduce on the first run,
>   but always reproduces on the second run.
> 
>   $ /tmp/stat
>   Unevictable:     5374676 kB
>   Mlocked:         8389332 kB

I think it is worth starting with the problem statement.

I like to follow this pattern of commit messages:

<Background, if needed>

<Issue statement>

<Proposed solution>

> 
> [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> 
> Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>

Your Co-developed-by is missing. See submitting-patches.rst.

> ---
>  mm/swap.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 2260dcd2775e..f682f070160b 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
>  		 * this list is never rotated or maintained, so marking an
>  		 * unevictable page accessed has no effect.
>  		 */
> +	} else if (folio_test_mlocked(folio)) {
> +		/*
> +		 * Pages that are mlocked, but not yet on unevictable LRU.
> +		 * They might be still in mlock_fbatch waiting to be processed
> +		 * and activating it here might interfere with
> +		 * mlock_folio_batch(). __mlock_folio() will fail
> +		 * folio_test_clear_lru() check and give up. It happens because
> +		 * __folio_batch_add_and_move() clears LRU flag, when adding
> +		 * folio to activate batch.
> +		 */
>  	} else if (!folio_test_active(folio)) {
>  		/*
>  		 * If the folio is on the LRU, queue it for activation via
> -- 
> 2.47.3
> 

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH] mm: skip folio_activate() for mlocked folios
Posted by Dmitry Ilvokhin 2 months, 1 week ago
On Fri, Oct 03, 2025 at 03:41:05PM +0100, Kiryl Shutsemau wrote:
> On Fri, Oct 03, 2025 at 02:19:55PM +0000, Dmitry Ilvokhin wrote:
> > __mlock_folio() should update stats, when lruvec_add_folio() is called,
> 
> The stats update is incidental to moving the folio to the unevictable LRU. But okay.
> 

Good point. I'll rephrase the commit message in terms of the unevictable
LRU instead of stat updates in v2.

> > but if folio_test_clear_lru() check failed, then __mlock_folio() gives
> > up early. From the other hand, folio_mark_accessed() calls
> > folio_activate() which also calls folio_test_clear_lru() down the line.
> > When folio_activate() successfully removed folio from LRU,
> > __mlock_folio() will not update any stats, which will lead to inaccurate
> > values in /proc/meminfo as well as cgroup memory.stat.
> > 
> > To prevent this case from happening also check for folio_test_mlocked()
> > in folio_mark_accessed(). If folio is not yet marked as unevictable, but
> > already marked as mlocked, then skip folio_activate() call to allow
> > __mlock_folio() to make all necessary updates.
> > 
> > To observe the problem mmap() and mlock() big file and check Unevictable
> > and Mlocked values from /proc/meminfo. On freshly booted system without
> > any other mlocked memory we expect them to match or be quite close.
> > 
> > See below for more detailed reproduction steps. Source code of stat.c
> > is available at [1].
> > 
> >   $ head -c 8G < /dev/urandom > /tmp/random.bin
> > 
> >   $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
> >   $ /tmp/stat
> >   Unevictable:     8389668 kB
> >   Mlocked:         8389700 kB
> > 
> >   Need to run binary twice. Problem does not reproduce on the first run,
> >   but always reproduces on the second run.
> > 
> >   $ /tmp/stat
> >   Unevictable:     5374676 kB
> >   Mlocked:         8389332 kB
> 
> I think it is worth starting with the problem statement.
> 
> I like to follow this pattern of commit messages:
> 
> <Background, if needed>
> 
> <Issue statement>
> 
> <Proposed solution>
>

Thanks for the suggestion, the v2 commit message will match this pattern.

> > 
> > [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> > 
> > Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> 
> Your Co-developed-by is missing. See submitting-patches.rst.
> 

I followed the example of a patch submitted by the From: author in
submitting-patches.rst. That example doesn't have a Co-developed-by tag
from the From: author. That said, I found both usages in the mm commit
log, so I'll add my Co-developed-by tag in v2.

> > ---
> >  mm/swap.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 2260dcd2775e..f682f070160b 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
> >  		 * this list is never rotated or maintained, so marking an
> >  		 * unevictable page accessed has no effect.
> >  		 */
> > +	} else if (folio_test_mlocked(folio)) {
> > +		/*
> > +		 * Pages that are mlocked, but not yet on unevictable LRU.
> > +		 * They might be still in mlock_fbatch waiting to be processed
> > +		 * and activating it here might interfere with
> > +		 * mlock_folio_batch(). __mlock_folio() will fail
> > +		 * folio_test_clear_lru() check and give up. It happens because
> > +		 * __folio_batch_add_and_move() clears LRU flag, when adding
> > +		 * folio to activate batch.
> > +		 */
> >  	} else if (!folio_test_active(folio)) {
> >  		/*
> >  		 * If the folio is on the LRU, queue it for activation via
> > -- 
> > 2.47.3
> > 
> 
> -- 
>   Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH] mm: skip folio_activate() for mlocked folios
Posted by Dmitry Ilvokhin 2 months, 1 week ago
On Mon, Oct 06, 2025 at 12:07:48PM +0000, Dmitry Ilvokhin wrote:
> On Fri, Oct 03, 2025 at 03:41:05PM +0100, Kiryl Shutsemau wrote:
> > On Fri, Oct 03, 2025 at 02:19:55PM +0000, Dmitry Ilvokhin wrote:
> > > __mlock_folio() should update stats, when lruvec_add_folio() is called,
> > 
> > The stats update is incidental to moving the folio to the unevictable LRU. But okay.
> > 
> 
> Good point. I'll rephrase the commit message in terms of the unevictable
> LRU instead of stat updates in v2.
> 
> > > but if folio_test_clear_lru() check failed, then __mlock_folio() gives
> > > up early. From the other hand, folio_mark_accessed() calls
> > > folio_activate() which also calls folio_test_clear_lru() down the line.
> > > When folio_activate() successfully removed folio from LRU,
> > > __mlock_folio() will not update any stats, which will lead to inaccurate
> > > values in /proc/meminfo as well as cgroup memory.stat.
> > > 
> > > To prevent this case from happening also check for folio_test_mlocked()
> > > in folio_mark_accessed(). If folio is not yet marked as unevictable, but
> > > already marked as mlocked, then skip folio_activate() call to allow
> > > __mlock_folio() to make all necessary updates.
> > > 
> > > To observe the problem mmap() and mlock() big file and check Unevictable
> > > and Mlocked values from /proc/meminfo. On freshly booted system without
> > > any other mlocked memory we expect them to match or be quite close.
> > > 
> > > See below for more detailed reproduction steps. Source code of stat.c
> > > is available at [1].
> > > 
> > >   $ head -c 8G < /dev/urandom > /tmp/random.bin
> > > 
> > >   $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
> > >   $ /tmp/stat
> > >   Unevictable:     8389668 kB
> > >   Mlocked:         8389700 kB
> > > 
> > >   Need to run binary twice. Problem does not reproduce on the first run,
> > >   but always reproduces on the second run.
> > > 
> > >   $ /tmp/stat
> > >   Unevictable:     5374676 kB
> > >   Mlocked:         8389332 kB
> > 
> > I think it is worth starting with the problem statement.
> > 
> > I like to follow this pattern of commit messages:
> > 
> > <Background, if needed>
> > 
> > <Issue statement>
> > 
> > <Proposed solution>
> >
> 
> Thanks for the suggestion, the v2 commit message will match this pattern.
> 
> > > 
> > > [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> > > 
> > > Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> > > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > 
> > Your Co-developed-by is missing. See submitting-patches.rst.
> > 
> 
> I followed the example of a patch submitted by the From: author in
> submitting-patches.rst. That example doesn't have a Co-developed-by tag
> from the From: author. That said, I found both usages in the mm commit
> log, so I'll add my Co-developed-by tag in v2.

Turns out scripts/checkpatch.pl catches that with the following message:
"Co-developed-by: should not be used to attribute nominal patch author",
so I'll follow the tool's suggestion and will not add my own
Co-developed-by tag.

> 
> > > ---
> > >  mm/swap.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/mm/swap.c b/mm/swap.c
> > > index 2260dcd2775e..f682f070160b 100644
> > > --- a/mm/swap.c
> > > +++ b/mm/swap.c
> > > @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
> > >  		 * this list is never rotated or maintained, so marking an
> > >  		 * unevictable page accessed has no effect.
> > >  		 */
> > > +	} else if (folio_test_mlocked(folio)) {
> > > +		/*
> > > +		 * Pages that are mlocked, but not yet on unevictable LRU.
> > > +		 * They might be still in mlock_fbatch waiting to be processed
> > > +		 * and activating it here might interfere with
> > > +		 * mlock_folio_batch(). __mlock_folio() will fail
> > > +		 * folio_test_clear_lru() check and give up. It happens because
> > > +		 * __folio_batch_add_and_move() clears LRU flag, when adding
> > > +		 * folio to activate batch.
> > > +		 */
> > >  	} else if (!folio_test_active(folio)) {
> > >  		/*
> > >  		 * If the folio is on the LRU, queue it for activation via
> > > -- 
> > > 2.47.3
> > > 
> > 
> > -- 
> >   Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH] mm: skip folio_activate() for mlocked folios
Posted by Usama Arif 2 months, 2 weeks ago

On 03/10/2025 15:19, Dmitry Ilvokhin wrote:
> __mlock_folio() should update stats, when lruvec_add_folio() is called,
> but if folio_test_clear_lru() check failed, then __mlock_folio() gives

nit: s/failed/fails/

> up early. From the other hand, folio_mark_accessed() calls

nit: s/From/On/

> folio_activate() which also calls folio_test_clear_lru() down the line.
> When folio_activate() successfully removed folio from LRU,
> __mlock_folio() will not update any stats, which will lead to inaccurate
> values in /proc/meminfo as well as cgroup memory.stat.
> 
> To prevent this case from happening also check for folio_test_mlocked()
> in folio_mark_accessed(). If folio is not yet marked as unevictable, but
> already marked as mlocked, then skip folio_activate() call to allow
> __mlock_folio() to make all necessary updates.

Would it make sense to mention here that it's safe to skip activating
an mlocked folio?

> 
> To observe the problem mmap() and mlock() big file and check Unevictable
> and Mlocked values from /proc/meminfo. On freshly booted system without
> any other mlocked memory we expect them to match or be quite close.
> 
> See below for more detailed reproduction steps. Source code of stat.c
> is available at [1].
> 
>   $ head -c 8G < /dev/urandom > /tmp/random.bin
> 
>   $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
>   $ /tmp/stat
>   Unevictable:     8389668 kB
>   Mlocked:         8389700 kB
> 
>   Need to run binary twice. Problem does not reproduce on the first run,
>   but always reproduces on the second run.
> 
>   $ /tmp/stat
>   Unevictable:     5374676 kB
>   Mlocked:         8389332 kB
> 
> [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> 
> Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> ---


Thanks for the patch!

Personally I would just use the comment you have written below as the commit message.
You probably don't need to spell out all the function call paths?

Also, I don't think you need () after every function name in the commit message, although
that's my personal preference.

Apart from changes in the commit message, lgtm.

Feel free to add after commit message fixups.

Acked-by: Usama Arif <usamaarif642@gmail.com>


>  mm/swap.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 2260dcd2775e..f682f070160b 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
>  		 * this list is never rotated or maintained, so marking an
>  		 * unevictable page accessed has no effect.
>  		 */
> +	} else if (folio_test_mlocked(folio)) {
> +		/*
> +		 * Pages that are mlocked, but not yet on unevictable LRU.
> +		 * They might be still in mlock_fbatch waiting to be processed
> +		 * and activating it here might interfere with
> +		 * mlock_folio_batch(). __mlock_folio() will fail
> +		 * folio_test_clear_lru() check and give up. It happens because
> +		 * __folio_batch_add_and_move() clears LRU flag, when adding
> +		 * folio to activate batch.
> +		 */
>  	} else if (!folio_test_active(folio)) {
>  		/*
>  		 * If the folio is on the LRU, queue it for activation via
Re: [PATCH] mm: skip folio_activate() for mlocked folios
Posted by Dmitry Ilvokhin 2 months, 1 week ago
On Fri, Oct 03, 2025 at 03:36:08PM +0100, Usama Arif wrote:
> 
> 
> On 03/10/2025 15:19, Dmitry Ilvokhin wrote:
> > __mlock_folio() should update stats, when lruvec_add_folio() is called,
> > but if folio_test_clear_lru() check failed, then __mlock_folio() gives
> 
> nit: s/failed/fails/

I'll trim the commit message in v2, so this phrase will no longer be there.

> > up early. From the other hand, folio_mark_accessed() calls
> 
> nit: s/From/On/

This one as well.

> > folio_activate() which also calls folio_test_clear_lru() down the line.
> > When folio_activate() successfully removed folio from LRU,
> > __mlock_folio() will not update any stats, which will lead to inaccurate
> > values in /proc/meminfo as well as cgroup memory.stat.
> > 
> > To prevent this case from happening also check for folio_test_mlocked()
> > in folio_mark_accessed(). If folio is not yet marked as unevictable, but
> > already marked as mlocked, then skip folio_activate() call to allow
> > __mlock_folio() to make all necessary updates.
> 
> Would it make sense to mention here that it's safe to skip activating
> an mlocked folio?

Good point, I'll mention that, since that's my understanding as well.
I think an mlocked folio should end up on the unevictable LRU eventually;
if so, its time on the active LRU is temporary anyway, so it should be
safe to skip folio_activate() for mlocked folios.
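
The interaction discussed here can be illustrated with a small userspace toy
model. It is only a sketch, not kernel code: all toy_* names are
hypothetical, the flags stand in for PG_lru, PG_active, PG_mlocked and
PG_unevictable, the two per-CPU batches are reduced to "queue now, drain
later", and the referenced-bit step in folio_mark_accessed() is collapsed
away.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the folio flags and the two counters involved. */
struct toy_folio { bool lru, active, mlocked, unevictable; };
struct toy_stats { long nr_mlock, nr_unevictable; };

/* Models folio_test_clear_lru(): clear the flag, report the old value. */
bool toy_test_clear_lru(struct toy_folio *f)
{
	bool was_on_lru = f->lru;

	f->lru = false;
	return was_on_lru;
}

/* Models mlock_folio(): mark mlocked, count it, queue in the mlock batch. */
void toy_mlock_folio(struct toy_folio *f, struct toy_stats *s)
{
	if (!f->mlocked) {
		f->mlocked = true;
		s->nr_mlock++;
	}
}

/*
 * Models folio_mark_accessed(). Returns true if the folio was queued for
 * activation, which clears the LRU flag as a side effect. skip_mlocked
 * selects the behaviour proposed in this patch.
 */
bool toy_mark_accessed(struct toy_folio *f, bool skip_mlocked)
{
	if (f->unevictable)
		return false;
	if (skip_mlocked && f->mlocked)
		return false;
	if (f->active)
		return false;
	return toy_test_clear_lru(f);
}

/* Models __mlock_folio() running when the mlock batch is drained. */
void toy_mlock_drain(struct toy_folio *f, struct toy_stats *s)
{
	assert(f->mlocked);	/* only mlocked folios are queued here */
	if (!toy_test_clear_lru(f))
		return;		/* someone else isolated it: give up */
	f->active = false;
	f->unevictable = true;
	s->nr_unevictable++;
	f->lru = true;
}

/* Models the activate batch being drained: back on the LRU, now active. */
void toy_activate_drain(struct toy_folio *f)
{
	f->active = true;
	f->lru = true;
}

/* One folio: mlock, access, then drain the mlock batch, then activation. */
struct toy_stats toy_run(bool skip_mlocked)
{
	struct toy_folio f = { .lru = true };
	struct toy_stats s = { 0, 0 };
	bool queued;

	toy_mlock_folio(&f, &s);
	queued = toy_mark_accessed(&f, skip_mlocked);
	toy_mlock_drain(&f, &s);
	if (queued)
		toy_activate_drain(&f);
	return s;
}
```

In this model toy_run(false) leaves the folio counted as mlocked but not as
unevictable, while toy_run(true) counts it in both, which is the mismatch
between Mlocked and Unevictable reported in the commit message.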

> > 
> > To observe the problem mmap() and mlock() big file and check Unevictable
> > and Mlocked values from /proc/meminfo. On freshly booted system without
> > any other mlocked memory we expect them to match or be quite close.
> > 
> > See below for more detailed reproduction steps. Source code of stat.c
> > is available at [1].
> > 
> >   $ head -c 8G < /dev/urandom > /tmp/random.bin
> > 
> >   $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
> >   $ /tmp/stat
> >   Unevictable:     8389668 kB
> >   Mlocked:         8389700 kB
> > 
> >   Need to run binary twice. Problem does not reproduce on the first run,
> >   but always reproduces on the second run.
> > 
> >   $ /tmp/stat
> >   Unevictable:     5374676 kB
> >   Mlocked:         8389332 kB
> > 
> > [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> > 
> > Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > ---
> 
> 
> Thanks for the patch!
> 
> Personally I would just use the comment you have written below as the commit message.
> You probably don't need to spell out all the function call paths?
> 

I'll trim the commit message in v2 and make it more succinct, thanks
for the feedback.

> Also, I don't think you need () after every function name in the commit message, although
> that's my personal preference.
> 
> Apart from changes in the commit message, lgtm.
> 
> Feel free to add after commit message fixups.
> 
> Acked-by: Usama Arif <usamaarif642@gmail.com>
> 

Thanks for the review.

> 
> >  mm/swap.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 2260dcd2775e..f682f070160b 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
> >  		 * this list is never rotated or maintained, so marking an
> >  		 * unevictable page accessed has no effect.
> >  		 */
> > +	} else if (folio_test_mlocked(folio)) {
> > +		/*
> > +		 * Pages that are mlocked, but not yet on unevictable LRU.
> > +		 * They might be still in mlock_fbatch waiting to be processed
> > +		 * and activating it here might interfere with
> > +		 * mlock_folio_batch(). __mlock_folio() will fail
> > +		 * folio_test_clear_lru() check and give up. It happens because
> > +		 * __folio_batch_add_and_move() clears LRU flag, when adding
> > +		 * folio to activate batch.
> > +		 */
> >  	} else if (!folio_test_active(folio)) {
> >  		/*
> >  		 * If the folio is on the LRU, queue it for activation via
>