[v1] RE: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from debugfs to procfs

RE: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from debugfs to procfs

Posted by wangzicheng 2 months, 1 week ago

Hi Barry,

Thank you for the follow-up questions.

It seems that our main testbed (kernel v6.6/v6.12 for the latest devices)
doesn't have the SWAPPINESS_ANON_ONLY/201-related patches yet.

Since the max swappiness is 200, there are quite a few scenarios in which file
pages are the only option.

Quoting Kairui's reply:
> Right, we are seeing similar problems on our server too. To workaround
> it we force an age iteration before reclaiming when it happens, which
> isn't the best choice. When the LRU is long and the opposite type of
> the folios we want to reclaim is piling up in the oldest gen, a forced
> age will have to move all these folios, which leads to long tailing
> issues. Let's work on a reasonable solution for that.

Again, thank you for your guidance. We will carefully evaluate the
patchset [1] you recommended.

> Hi Zicheng,
> 
> On Mon, Dec 1, 2025 at 5:55 PM wangzicheng <wangzicheng@honor.com>
> wrote:
> >
> > Hi Barry,
> >
> > Thank you for the comment, actually we do know the cgroup file.
> >
> > What we really need is to *proactive aging 2~3 gens* before proactive
> > reclaim.
> > (especially after cold launches when no anon pages in the oldest gens)
> >
> > The proactive aging also helps distribute the anon and file pages evenly in
> > MGLRU gens. And reclaiming won't fall into file caches.
> 
> I’m not quite sure what you mean by “reclaiming won’t fall into file caches.”
> 
> I assume you mean you configured a high swappiness for MGLRU proactive
> reclamation, so when both anon and file have four generations,
> `get_type_to_scan()` effectively always returns anon?
> 
> >
> > > Also note that memcg already has an interface for proactive reclamation,
> > > so I’m not certain whether your patchset can coexist with it or extend
> > > it to meet your requirements—which seems quite impossible to me
> > >
> > > memory.reclaim
> > >         A write-only nested-keyed file which exists for all cgroups.
> > >
> > >         This is a simple interface to trigger memory reclaim in the
> > >         target cgroup.
> > >
> > >         Example::
> > >
> > >           echo "1G" > memory.reclaim
> > >
> > >         Please note that the kernel can over or under reclaim from
> > >         the target cgroup. If less bytes are reclaimed than the
> > >         specified amount, -EAGAIN is returned.
> > >
> > This reminds me that adding a `memory.aging` file under memcg directories
> > rather than adding new procfs files is also a great option.
> 
> I still don’t understand why. Aging is something MGLRU itself should
> handle; components outside MGLRU, such as cgroup v2, do not need to be
> aware of this concept at all. Exposing it will likely lead to another
> immediate NAK.
> 
> In short, aging should remain within MGLRU’s internal scope.

I would like to express a different point of view. We are working on something
interesting in this area and will share it once it is ready.

> 
> But it seems you do want some policy control for your proactive
> reclamation, such as always reclaiming anon pages or reclaiming them
> more aggressively than file pages. I assume Zhongkun’s patch [1] we
> mentioned earlier should provide support for that, correct?
> 
> As a workaround, you can set `swappiness=max` for `memory.reclaim` before
> we internally improve the handling of the aging issue. In short,
> “proactive aging” and similar mechanisms should be handled automatically
> and internally within the scope of the MGLRU code.

Sure, we will make a careful evaluation.
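
For reference, the `swappiness=max` workaround mentioned above could be scripted roughly as follows. This is only a sketch: the helper names `reclaim_request` and `proactive_reclaim` and the cgroup path in the usage note are hypothetical, and it assumes a cgroup v2 hierarchy on a kernel whose `memory.reclaim` accepts the `swappiness=` key, including the `max` value from the patchset [1].

```shell
#!/bin/sh
# Build the string that gets written into memory.reclaim.
#   $1: amount to reclaim, e.g. "1G"
#   $2: optional swappiness (0-200, or "max" on kernels that support it)
reclaim_request() {
    if [ -n "$2" ]; then
        printf '%s swappiness=%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Write the request into a cgroup's memory.reclaim file.
#   $1: cgroup directory (hypothetical, e.g. /sys/fs/cgroup/apps)
#   $2, $3: passed through to reclaim_request
proactive_reclaim() {
    cg="$1"
    shift
    reclaim_request "$@" > "$cg/memory.reclaim"
}
```

For example, `proactive_reclaim /sys/fs/cgroup/apps 1G max` would request 1G of reclaim that should come from anon pages only. As the quoted documentation notes, the write fails with -EAGAIN if fewer bytes are reclaimed than requested.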

> 
> [1] https://lore.kernel.org/linux-mm/cover.1744169302.git.hezhongkun.hzk@bytedance.com/
> 
> Thanks
> Barry

Thanks
Zicheng

Re: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from debugfs to procfs

Posted by Barry Song 2 months, 1 week ago

On Mon, Dec 1, 2025 at 9:32 PM wangzicheng <wangzicheng@honor.com> wrote:
>
> Hi Barry,
>
> Thank you for the follow-up questions.
>
> It seems that our main testbed (kernel v6.6/v6.12 for the latest devices)
> doesn't have the SWAPPINESS_ANON_ONLY/201-related patches yet.

Then please check with Suren whether it is possible to backport this to
the Android common kernel.
My understanding is that this should already be present in the Android 6.12
kernel.

>
> Since the max swappiness is 200, there are quite a few scenarios in which file
> pages are the only option.
>
> Quoting Kairui's reply:
> > Right, we are seeing similar problems on our server too. To workaround
> > it we force an age iteration before reclaiming when it happens, which
> > isn't the best choice. When the LRU is long and the opposite type of
> > the folios we want to reclaim is piling up in the oldest gen, a forced
> > age will have to move all these folios, which leads to long tailing
> > issues. Let's work on a reasonable solution for that.
>

We all agree that MGLRU has this generation issue. You mentioned it, I agreed
and noted that both Kairui and I had observed it. Then Kairui replied that he
had indeed seen it as well. Now you are using Kairui’s reply to argue against
me, and I honestly don’t understand the logic behind your responses.

> Again, thank you for your guidance. We will carefully evaluate the
> patchset [1] you recommended.
>
> > Hi Zicheng,
> >
> > On Mon, Dec 1, 2025 at 5:55 PM wangzicheng <wangzicheng@honor.com>
> > wrote:
> > >
> > > Hi Barry,
> > >
> > > Thank you for the comment, actually we do know the cgroup file.
> > >
> > > What we really need is to *proactive aging 2~3 gens* before proactive
> > > reclaim.
> > > (especially after cold launches when no anon pages in the oldest gens)
> > >
> > > The proactive aging also helps distribute the anon and file pages evenly in
> > > MGLRU gens. And reclaiming won't fall into file caches.
> >
> > I’m not quite sure what you mean by “reclaiming won’t fall into file caches.”
> >
> > I assume you mean you configured a high swappiness for MGLRU proactive
> > reclamation, so when both anon and file have four generations,
> > `get_type_to_scan()` effectively always returns anon?
> >
> > >
> > > > Also note that memcg already has an interface for proactive reclamation,
> > > > so I’m not certain whether your patchset can coexist with it or extend
> > > > it to meet your requirements—which seems quite impossible to me
> > > >
> > > > memory.reclaim
> > > >         A write-only nested-keyed file which exists for all cgroups.
> > > >
> > > >         This is a simple interface to trigger memory reclaim in the
> > > >         target cgroup.
> > > >
> > > >         Example::
> > > >
> > > >           echo "1G" > memory.reclaim
> > > >
> > > >         Please note that the kernel can over or under reclaim from
> > > >         the target cgroup. If less bytes are reclaimed than the
> > > >         specified amount, -EAGAIN is returned.
> > > >
> > > This reminds me that adding a `memory.aging` file under memcg directories
> > > rather than adding new procfs files is also a great option.
> >
> > I still don’t understand why. Aging is something MGLRU itself should
> > handle; components outside MGLRU, such as cgroup v2, do not need to be
> > aware of this concept at all. Exposing it will likely lead to another
> > immediate NAK.
> >
> > In short, aging should remain within MGLRU’s internal scope.
>
> I would like to express a different point of view. We are working on something
> interesting in this area and will share it once it is ready.

You are always welcome to share, but please understand that memory.aging is
not of interest to any module outside the scope of MGLRU itself. An interface
is an interface, and internal implementation should remain internal. In other
words, there is no reason for cgroupv2 to be aware of what “aging” is.

You may submit your new code as a "fix" for the generation issue without
introducing a new interface. That would be a good starting point for
discussing how to resolve the problem.

>
> >
> > But it seems you do want some policy control for your proactive
> > reclamation, such as always reclaiming anon pages or reclaiming them
> > more aggressively than file pages. I assume Zhongkun’s patch [1] we
> > mentioned earlier should provide support for that, correct?
> >
> > As a workaround, you can set `swappiness=max` for `memory.reclaim` before
> > we internally improve the handling of the aging issue. In short,
> > “proactive aging” and similar mechanisms should be handled automatically
> > and internally within the scope of the MGLRU code.
>
> Sure, we will make a careful evaluation.

Thanks
Barry

RE: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from debugfs to procfs

Posted by wangzicheng 2 months, 1 week ago

Hi Barry,

> Then please check with Suren whether it is possible to backport this to
> the Android common kernel.
> My understanding is that this should already be present in the Android 6.12
> kernel.
> 
Thanks for the reminder.

> >
> > Since the max swappiness is 200, there are quite a few scenarios in which file
> > pages are the only option.
> >
> > Quoting Kairui's reply:
> > > Right, we are seeing similar problems on our server too. To workaround
> > > it we force an age iteration before reclaiming when it happens, which
> > > isn't the best choice. When the LRU is long and the opposite type of
> > > the folios we want to reclaim is piling up in the oldest gen, a forced
> > > age will have to move all these folios, which leads to long tailing
> > > issues. Let's work on a reasonable solution for that.
> >
> 
> We all agree that MGLRU has this generation issue. You mentioned it, I agreed
> and noted that both Kairui and I had observed it. Then Kairui replied that he
> had indeed seen it as well. Now you are using Kairui’s reply to argue against
> me, and I honestly don’t understand the logic behind your responses.
> 

My apologies if my previous wording caused any confusion.

The only thing the patchset does (or is meant to do) is force aging of 2~3 gens right
before proactive reclaim, which helps reclaim more anon pages and preserve more file
pages under certain workloads (a 400~800MB MemAvailable improvement).

The reason for quoting Kairui's reply: as I understand it,
`force aging 2~3 gens before reclaim` is roughly similar in spirit to what Kairui
described as `force an age iteration before reclaiming`.

If my understanding is inaccurate, please feel free to correct me.
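
To make the idea concrete, forced aging followed by proactive reclaim can be approximated from userspace with the existing debugfs interface. The sketch below is an illustration under assumptions, not the patchset's implementation: it assumes debugfs is mounted at /sys/kernel/debug, that `lru_gen` accepts the aging command `+ memcg_id node_id max_gen` as described in the kernel's multi-gen LRU admin guide, and the helper names (`max_gen`, `aging_cmd`, `age_then_reclaim`) are hypothetical.

```shell
#!/bin/sh
# Print the newest generation number for a memcg/node pair.
# Reads lru_gen-formatted text on stdin:
#   memcg  memcg_id  memcg_path
#     node  node_id
#       gen_nr  age_in_ms  nr_anon_pages  nr_file_pages
max_gen() {
    awk -v memcg="$1" -v node="$2" '
        $1 == "memcg" { in_memcg = ($2 == memcg); in_node = 0; next }
        $1 == "node"  { in_node = (in_memcg && $2 == node); next }
        in_node && $1 ~ /^[0-9]+$/ { gen = $1 }
        END { if (gen != "") print gen }
    '
}

# Build an aging command; optional can_swap/force_scan fields are omitted.
aging_cmd() {
    printf '+ %s %s %s\n' "$1" "$2" "$3"
}

# Age a memcg several times, then trigger proactive reclaim.
#   $1: memcg id  $2: node id  $3: number of aging rounds
#   $4: cgroup directory  $5: amount to reclaim, e.g. "1G"
age_then_reclaim() {
    lru_gen="${LRU_GEN:-/sys/kernel/debug/lru_gen}"
    i=0
    while [ "$i" -lt "$3" ]; do
        # max_gen changes after every aging round, so re-read it each time.
        gen=$(max_gen "$1" "$2" < "$lru_gen") || return 1
        [ -n "$gen" ] || return 1
        aging_cmd "$1" "$2" "$gen" > "$lru_gen" || return 1
        i=$((i + 1))
    done
    printf '%s\n' "$5" > "$4/memory.reclaim"
}
```

For example, `age_then_reclaim 5 0 2 /sys/fs/cgroup/apps 1G` would age memcg 5 on node 0 twice and then request 1G of reclaim; the IDs and the path are placeholders.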

> > Again, thank you for your guidance. We will carefully evaluate the
> > patchset [1] you recommended.
> >
> > > Hi Zicheng,
> > >
> > > On Mon, Dec 1, 2025 at 5:55 PM wangzicheng <wangzicheng@honor.com>
> > > wrote:
> > > >
> > > > Hi Barry,
> > > >
> > > > Thank you for the comment, actually we do know the cgroup file.
> > > >
> > > > What we really need is to *proactive aging 2~3 gens* before proactive
> > > > reclaim.
> > > > (especially after cold launches when no anon pages in the oldest gens)
> > > >
> > > > The proactive aging also helps distribute the anon and file pages evenly in
> > > > MGLRU gens. And reclaiming won't fall into file caches.
> > >
> > > I’m not quite sure what you mean by “reclaiming won’t fall into file caches.”
> > >
> > > I assume you mean you configured a high swappiness for MGLRU proactive
> > > reclamation, so when both anon and file have four generations,
> > > `get_type_to_scan()` effectively always returns anon?
> > >
> > > >
> > > > > Also note that memcg already has an interface for proactive reclamation,
> > > > > so I’m not certain whether your patchset can coexist with it or extend
> > > > > it to meet your requirements—which seems quite impossible to me
> > > > >
> > > > > memory.reclaim
> > > > >         A write-only nested-keyed file which exists for all cgroups.
> > > > >
> > > > >         This is a simple interface to trigger memory reclaim in the
> > > > >         target cgroup.
> > > > >
> > > > >         Example::
> > > > >
> > > > >           echo "1G" > memory.reclaim
> > > > >
> > > > >         Please note that the kernel can over or under reclaim from
> > > > >         the target cgroup. If less bytes are reclaimed than the
> > > > >         specified amount, -EAGAIN is returned.
> > > > >
> > > > This reminds me that adding a `memory.aging` file under memcg directories
> > > > rather than adding new procfs files is also a great option.
> > >
> > > I still don’t understand why. Aging is something MGLRU itself should
> > > handle; components outside MGLRU, such as cgroup v2, do not need to be
> > > aware of this concept at all. Exposing it will likely lead to another
> > > immediate NAK.
> > >
> > > In short, aging should remain within MGLRU’s internal scope.
> >
> > I would like to express a different point of view. We are working on something
> > interesting in this area and will share it once it is ready.
> 
> You are always welcome to share, but please understand that memory.aging is
> not of interest to any module outside the scope of MGLRU itself. An interface
> is an interface, and internal implementation should remain internal. In other
> words, there is no reason for cgroupv2 to be aware of what “aging” is.
> 
> You may submit your new code as a "fix" for the generation issue without
> introducing a new interface. That would be a good starting point for
> discussing how to resolve the problem.
> 

I completely agree with your guidance.
We will revisit the design for the next version and try to keep the
mechanism internal to MGLRU.

> >
> > >
> > > But it seems you do want some policy control for your proactive
> > > reclamation, such as always reclaiming anon pages or reclaiming them
> > > more aggressively than file pages. I assume Zhongkun’s patch [1] we
> > > mentioned earlier should provide support for that, correct?
> > >
> > > As a workaround, you can set `swappiness=max` for `memory.reclaim` before
> > > we internally improve the handling of the aging issue. In short,
> > > “proactive aging” and similar mechanisms should be handled automatically
> > > and internally within the scope of the MGLRU code.
> >
> > Sure, we will make a careful evaluation.
> 
> Thanks
> Barry

Best,
Zicheng