[v1] RE: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

RE: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by wangtao 1 week, 3 days ago

> > On Thu, May 28, 2026 at 07:57:31AM +0000, wangtao wrote:
> > > > Subject: Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred
> > > > anon_vma creation
> > > >
> > > > OK I've had a look through more thoroughly now and:
> > > >
> > > > NAK and NAK any approach like this.
> > > >
> > > >
> > > > Not only is this structurally all wrong, it does some insane stuff
> > > > (pinning VMAs - no), the RCU usage is highly dubious and I suspect
> > > > you've completely broken the anon rmap for things like migration,
> > > > or have at least added very dubious edge cases.
> > > >
> > > > You've added insane complexity, and also have failed to add even
> > > > perfunctory tests, which is also totally unacceptable.
> > > >
> > > > The implementation is wrong, and the approach is wrong - we do not
> > > > want to extend or build on anon_vma. So this is unmergeable, or any
> > > > approach like it.
> > > >
> > > > I also, unfortunately, strongly suspect AI here. The turn of
> > > > phrase, and poor commit messages, you doing this out of nowhere
> > > > with absolutely no rmap experience before, your total lack of
> > > > communication before.
> > > >
> > > > Claude puts the probability of heavy AI usage at 85-90%, and I'm
> > > > pretty convinced. Either way it's utterly unmergeable but that you
> > > > (likely) used AI to generate this much work for us makes me actually
> > > > pretty annoyed.
> > > >
> > > > As a result, I would strongly suggest you no longer submit patches
> > > > for the reverse mapping part of mm, as there is now a real lack of trust.
> > > >
> > > > If you wish to rebuild that, I suggest you _discuss_ concepts and ideas,
> > > > e.g. send stuff on-list with a [DISCUSSION] tag, and engage with the
> > > > community, and go from there.
> > > >
> > > > It's also important to synchronise - I'm working on an anon rmap
> > > > replacement that I'm more than happy to discuss with you or
> > > > anybody else which should achieve the same numbers in an
> > > > architecturally sound way.
> > > >
> > > > You going off and, in a vacuum, generating a bunch of code with an
> > > > unacceptable approach is not a civil way of engaging nor is it a
> > > > good use of your time, or maintainer time looking at it.
> > > >
> > > > Thanks, Lorenzo
> > >
> > > Your email is very unfriendly. I hope you can point out the specific
> > > problems so we can discuss how to solve them.
> 
> Hi Tao,
> 
> Lorenzo had a discussion about rmap in Zagreb here:
> https://lore.kernel.org/linux-mm/aec533b2-37a7-4f44-a279-c4aa604206ac@lucifer.local/
> 
> He also shared the PoC code here:
> https://git.kernel.org/pub/scm/linux/kernel/git/ljs/linux.git/log/?h=project/cow-context
> 
> and the slides were shared as well. In case you can't find them on linux-mm (I
> actually couldn't find them myself), I am attaching them again here -
> "scalable-cow-lsf-longer-version.pdf"
> 
> After coming back from Zagreb, I kept trying to find one or two full days to
> read Lorenzo's code and slides carefully and write a blog about them.
> Unfortunately, I have been completely busy with other work. Sigh... we
> always seem to have too many non-upstream tasks.
> 
> If possible, I'd really appreciate it if you could take a deep dive into it and
> write a detailed blog post. I'd be very eager to read it and better understand
> the overall design.
> Otherwise, I'll try to find some time next week or later to go through it
> myself.
> 
Hi Barry,

Thank you very much for your reply.

I took an initial look at the cow-context code, and a few points  
might be worth noting:

1. cow_context_walk currently assumes that the rmap walk runs  
   under RCU protection. This may need to be adjusted early,  
   since paths such as try_to_unmap_one, page_vma_mkclean_one,  
   and try_to_migrate_one may involve task switching.

2. In cow_context_walk, traverse_contexts appears to involve  
   multiple nested loops. When there are many child processes  
   across several fork layers, it may not be as simple or  
   efficient as the current anon_vma approach.

   It needs to traverse all child cow_ctx, and within each  
   cow_ctx, remaps_for_each() has two levels of iteration:  
   remaps_for_each_entry and remaps_for_each_entry_offset.

   In other words, it first iterates over cow_ctx and then  
   traverses rmap_mt inside each one. The rough complexity  
   seems to be O(#proc * log(#rmap_entries_in_cow)), which  
   may be somewhat higher than anon_vma's  
   O(#vmas_in_anon_vma). However, in most cases the number  
   of processes is not large, so the impact may be limited.

Previously, I also considered converting anon_vma's rb_tree  
to a mapletree. If one entry records a single VMA, the  
average overhead could be less than two longs per VMA.

However, unlike rb_tree, mapletree does not support storing  
multiple elements under a single key. The key would need to  
look like (vma_id/mm_id + pgoff). On 32-bit platforms, since  
64-bit mapletree keys are not supported yet, the remaining  
12 bits are not enough for vma_id/mm_id.

Because of this limitation, I later started thinking about  
ways to reduce anon_vma allocations instead.

I will try to find some time next week to analyze the  
cow-context design and code more thoroughly, and then  
write up a summary.

Thanks,
Tao

> >
> > I already did, you've not responded to any of them, and I'm simply not
> > spending any more time on this.
> >
> > The series is totally unmergeable, please do not make further rmap
> > submissions.
> >
> > >
> > > I am not good at English and need to use AI to translate commit
> > > messages and comments. This reply email is also translated with AI.
> > > However, the code is written by me. I do not know which AI you are
> > > referring to, but the AI tools I use currently cannot effectively
> > > write kernel code.
> > >
> >
> > We're fine with using AI for language, or in general as long as
> > there's a clear understanding of what's being submitted.
> >
> > However I'm very unconvinced that this series wasn't generated.
> >
> > You have 2 patches in the kernel for the entirety of 2026. One in
> > bluetooth and one in the scheduler.
> >
> > Prior to that you have patches from 2018 in device tree drivers.
> >
> > You have exactly 0 contributions to mm.
> >
> > Out of nowhere this year you have a big series for DMA, this series
> > for anon_vma, having done no work or any contributions to rmap, let
> > alone one of the trickiest and most complicated areas of mm.
> >
> > You have a total of 39 mails on the linux-mm mailing list.
> >
> > Suddenly doing a giant bit of work like this using code that looks
> > entirely like it's AI-generated, and which after assessment by AI
> > gives an 85-90% probability of AI generation is really suspicious.
> >
> > Now, if I'm mistaken, and you have a different name/email/identity I
> > missed with many mm contributions - I will eat my words here (the series
> > is still unmergeable either way though).
> >
> > So sorry, there's simply no trust and as a maintainer of rmap again I
> > must strongly suggest that you no longer submit patches for this part
> > of the kernel.
> >
> > If you wish to build trust up again, begin with discussions, and maybe
> > try some smaller patches in mm to demonstrate that you're genuinely
> > acting in good faith?
> 
> Hi Lorenzo,
> 
> I truly believe Tao is acting with good intentions, although the way this is
> being done is quite messy.
> 
> Memory costs are increasing significantly these days, and as I understand the
> patchset, he is trying to save memory.
> 
> However, I don't think this is being done at the right time or in the right way.
> This may also be due to cultural differences, language barriers, information
> gaps, and a lack of familiarity with the mm community.
> As a non-native speaker, I can see how difficult this can sometimes be.
> 
> I would really ask you to give Tao more chances to build trust step by step.
> 
> Best Regards
> Barry

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Lorenzo Stoakes 1 week, 2 days ago

On Fri, May 29, 2026 at 09:41:20AM +0000, wangtao wrote:
> > Hi Tao,
> >
> > Lorenzo had a discussion about rmap in Zagreb here:
> > https://lore.kernel.org/linux-mm/aec533b2-37a7-4f44-a279-c4aa604206ac@lucifer.local/
> >
> > He also shared the PoC code here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/ljs/linux.git/log/?h=project/cow-context
> >
> > and the slides were shared as well. In case you can't find them on linux-mm (I
> > actually couldn't find them myself), I am attaching them again here -
> > "scalable-cow-lsf-longer-version.pdf"
> >
> > After coming back from Zagreb, I kept trying to find one or two full days to
> > read Lorenzo's code and slides carefully and write a blog about them.
> > Unfortunately, I have been completely busy with other work. Sigh... we
> > always seem to have too many non-upstream tasks.
> >
> > If possible, I'd really appreciate it if you could take a deep dive into it and
> > write a detailed blog post. I'd be very eager to read it and better understand
> > the overall design.
> > Otherwise, I'll try to find some time next week or later to go through it
> > myself.
> >
> Hi Barry,
>
> Thank you very much for your reply.
>
> I took an initial look at the cow-context code, and a few points
> might be worth noting:
>
> 1. cow_context_walk currently assumes that the rmap walk runs
>    under RCU protection. This may need to be adjusted early,
>    since paths such as try_to_unmap_one, page_vma_mkclean_one,
>    and try_to_migrate_one may involve task switching.
>
> 2. In cow_context_walk, traverse_contexts appears to involve
>    multiple nested loops. When there are many child processes
>    across several fork layers, it may not be as simple or
>    efficient as the current anon_vma approach.
>
>    It needs to traverse all child cow_ctx, and within each
>    cow_ctx, remaps_for_each() has two levels of iteration:
>    remaps_for_each_entry and remaps_for_each_entry_offset.
>
>    In other words, it first iterates over cow_ctx and then
>    traverses rmap_mt inside each one. The rough complexity
>    seems to be O(#proc * log(#rmap_entries_in_cow)), which
>    may be somewhat higher than anon_vma's
>    O(#vmas_in_anon_vma). However, in most cases the number
>    of processes is not large, so the impact may be limited.
>
> Previously, I also considered converting anon_vma's rb_tree
> to a mapletree. If one entry records a single VMA, the
> average overhead could be less than two longs per VMA.
>
> However, unlike rb_tree, mapletree does not support storing
> multiple elements under a single key. The key would need to
> look like (vma_id/mm_id + pgoff). On 32-bit platforms, since
> 64-bit mapletree keys are not supported yet, the remaining
> 12 bits are not enough for vma_id/mm_id.
>
> Because of this limitation, I later started thinking about
> ways to reduce anon_vma allocations instead.
>
> I will try to find some time next week to analyze the
> cow-context design and code more thoroughly, and then
> write up a summary.

Tao,

This response is so full of misunderstandings it's not really worth me
responding to any of it. You've even hallucinated an imaginary field which
is REALLY suspicious.

You've no mm expertise or history and came up with this in a few hours. I
asked Claude to analyse it and it puts it at 75-80% chance of being solely
LLM-generated from cow_context.c.

I simply don't have the time to deal with this, so unfortunately I'm going
to have to withdraw the suggestion of further discussion with you on this
topic.

I am working on the scalable CoW project and will solicit opinions of those
with relevant expertise.

We are not interested in your approach or analysis.

Thanks, Lorenzo

RE: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by wangtao 1 week ago

> > Previously, I also considered converting anon_vma's rb_tree to a
> > mapletree. If one entry records a single VMA, the average overhead
> > could be less than two longs per VMA.
> >
> > However, unlike rb_tree, mapletree does not support storing multiple
> > elements under a single key. The key would need to look like
> > (vma_id/mm_id + pgoff). On 32-bit platforms, since 64-bit mapletree
> > keys are not supported yet, the remaining
> > 12 bits are not enough for vma_id/mm_id.
> >
> > Because of this limitation, I later started thinking about ways to
> > reduce anon_vma allocations instead.
> >
> > I will try to find some time next week to analyze the cow-context
> > design and code more thoroughly, and then write up a summary.
> 
> Tao,
> 
> This response is so full of misunderstandings it's not really worth me
> responding to any of it. You've even hallucinated an imaginary field which is
> REALLY suspicious.
> 
> You've no mm expertise or history and came up with this in a few hours. I
> asked Claude to analyse it and it puts it at 75-80% chance of being solely
> LLM-generated from cow_context.c.
> 
> I simply don't have the time to deal with this, so unfortunately I'm going to
> have to withdraw the suggestion of further discussion with you on this topic.
> 
> I am working on the scalable CoW project and will solicit opinions of those
> with relevant expertise.
> 
> We are not interested in your approach or analysis.
> 
> Thanks, Lorenzo

You said discussion was welcome, yet when someone offered even a  
small comment, you refused to continue the discussion.

If I had known you would be this inconsistent, I would not have  
replied to you in the first place.

This will be my last reply to you. I will not respond again.

Consider the following test case:
Process P creates 1000 VMAs with mmap, named vma_1, vma_2, ...,  
vma_1000.

Then it forks child processes C_1, C_2, ..., C_1000. Each child  
process C_k keeps only vma_k and munmaps all other vma_i.

With the current anon_vma, reclaim walking each page only needs  
to handle two VMAs (vma_k in process P and vma_k in process C_k).

But under the CoW approach, reclaiming each page needs to walk  
1000 processes, then spend O(log(#remap_entries)) time to check  
whether a remap_entry exists, and then O(log(#vmas)) time to  
locate the VMA.

Both the code complexity and the time complexity of the reverse  
walk are much higher than the current anon_vma approach.
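
As a rough illustration of the arithmetic behind this claim (not a measurement, and not derived from the actual cow-context code), the two walks can be compared with a toy step count. The per-child cost of 1 + log2(remap entries) + log2(VMAs) is an assumption chosen to mirror the description above, and the names log2_ceil, anon_vma_walk_steps and cow_walk_steps are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(log2(n)) for n >= 1, used as a stand-in for one tree lookup. */
static inline unsigned int log2_ceil(uint64_t n)
{
	unsigned int bits = 0;

	assert(n >= 1);
	while (bits < 63 && (UINT64_C(1) << bits) < n)
		bits++;
	return bits;
}

/* anon_vma side of the claim: one visit per VMA that still maps the page. */
static inline uint64_t anon_vma_walk_steps(uint64_t mapping_vmas)
{
	return mapping_vmas;
}

/*
 * cow-context side of the claim (assumed model): every child is visited
 * once, then pays one remap lookup and one VMA lookup.
 */
static inline uint64_t cow_walk_steps(uint64_t children,
				      uint64_t remap_entries,
				      uint64_t vmas_per_child)
{
	return children * (1 + log2_ceil(remap_entries) +
			   log2_ceil(vmas_per_child));
}
```

With 1000 children, this model gives at least 1000 steps for the cow-context walk against 2 for anon_vma, whatever the per-child lookup costs turn out to be. Whether the real cow-context walk behaves this way is exactly what is disputed in this thread.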

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Barry Song 6 days, 8 hours ago

On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
[...]
>
> You said discussion was welcome, yet when someone offered even a
> small comment, you refused to continue the discussion.
>
> If I had known you would be this inconsistent, I would not have
> replied to you in the first place.
>
> This will be my last reply to you. I will not respond again.
>

Hi Tao,

Please don't walk away from the linux-mm community. I read your
patchset and found it quite valuable. It not only reduces memory
overhead, but also eliminates rmap costs for exclusive folios.

Since I'm not very confident discussing technical topics in English,
I wrote a blog post in Chinese about your patchset:

https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA

I have to admit that I found the implementation quite complex and
in need of significant improvement. However, I think the underlying
idea is very interesting and worth exploring further.

I'm looking forward to seeing a v2 RFC with a cleaner and simpler
implementation while preserving the core concept.

Regardless of whether it ultimately gets merged, I hope the discussion
can continue.

Best regards,
Barry

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Harry Yoo 5 days, 15 hours ago

On 6/2/26 11:15 AM, Barry Song wrote:
> On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> [...]
>>
>> You said discussion was welcome, yet when someone offered even a
>> small comment, you refused to continue the discussion.
>>
>> If I had known you would be this inconsistent, I would not have
>> replied to you in the first place.
>>
>> This will be my last reply to you. I will not respond again.
> 
> Hi Tao,
> 
> Please don't walk away from the linux-mm community. I read your
> patchset and found it quite valuable. It not only reduces memory
> overhead, but also eliminates rmap costs for exclusive folios.
> 
> Since I'm not very confident discussing technical topics in English,
> I wrote a blog post in Chinese about your patchset:
> 
> https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA

The cover letter and commit messages should have been elaborated to a
much greater degree instead of making people guess the design and intent
from the code.

> I have to admit that I found the implementation quite complex and
> in need of significant improvement.

> However, I think the underlying
> idea is very interesting and worth exploring further.

No. What it is trying to achieve is ambitious, but the idea itself is
not worth exploring further as-is unless the correctness and complexity
concerns are addressed.

> I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> implementation while preserving the core concept.

I'm afraid this encouragement would mislead us in the wrong direction,
where all of us end up wasting time.

There isn't much point in posting v2 without addressing fundamental
questions about the design.

> Regardless of whether it ultimately gets merged, I hope the discussion
> can continue.

Regarding the "improving the reverse mapping subsystem" topic, a more
constructive direction would be to carefully revisit the design
decisions and discuss what we can do about them (that's exactly what
Lorenzo has been doing).

But that's not the first thing I would recommend to a relatively new
contributor given that it's really complicated and even the people who
have designed and reworked the reverse mapping subsystem over the past
20+ years haven't come up with a fundamentally better design.

Reverse mapping is a frustratingly complicated subsystem. Without
carefully revisiting the current design, there is not much hope of
improving things at the design level, even slightly.

What I would recommend to new people instead is:

1) starting by reviewing other people's work, so that you have enough
time to learn the historical context and subtleties of the subsystem
without making intrusive changes (which also keeps in touch with the
community), and

2) making progress on smaller tasks with less intrusive changes, to
gradually build trust and be able to do more valuable work.

Unfortunately, looking at how this thread went, I see that the author is
now in a worse position than an entirely new contributor.

-- 
Cheers,
Harry / Hyeonggon

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Barry Song 5 days, 12 hours ago

On Wed, Jun 3, 2026 at 3:57 AM Harry Yoo <harry@kernel.org> wrote:
>
>
>
> On 6/2/26 11:15 AM, Barry Song wrote:
> > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> > [...]
> >>
> >> You said discussion was welcome, yet when someone offered even a
> >> small comment, you refused to continue the discussion.
> >>
> >> If I had known you would be this inconsistent, I would not have
> >> replied to you in the first place.
> >>
> >> This will be my last reply to you. I will not respond again.
> >
> > Hi Tao,
> >
> > Please don't walk away from the linux-mm community. I read your
> > patchset and found it quite valuable. It not only reduces memory
> > overhead, but also eliminates rmap costs for exclusive folios.
> >
> > Since I'm not very confident discussing technical topics in English,
> > I wrote a blog post in Chinese about your patchset:
> >
> > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
>
> The cover letter and commit messages should have been elaborated to a
> much greater degree instead of making people guess the design and intent
> from the code.

Indeed. The cover letter does not clearly tell the story, and yesterday
I needed quite some time to understand what the patchset
was trying to achieve.

>
> > I have to admit that I found the implementation quite complex and
> > in need of significant improvement.
>
> > However, I think the underlying
> > idea is very interesting and worth exploring further.
>
> No. What it is trying to achieve is ambitious, but the idea itself is
> not worth exploring further as-is unless the correctness and complexity
> concerns are addressed.

Can we give Tao more time to address the concerns and explain
the correctness of the approach?

That said, I don't think the patchset is entirely without merit.
The idea that caught my attention is whether knowing that a
process is guaranteed to be a leaf process could allow us to
simplify parts of the rmap machinery and reduce some of the
associated overhead.

Assuming that a fork server (e.g. systemd or zygote) is preferable
to having each application perform its own fork(), Linux already
largely relies on fork servers in practice. Matthew also pointed
out that calling fork() in multithreaded applications is a
terrible idea [1]. This may suggest that, in general, processes
outside of a fork-server model should avoid using fork().

If we were to introduce an API such as prctl(PR_SET_NOFORK) or
something similar, could we eliminate a significant portion of
the rmap-related overhead for such leaf processes, while still
avoiding the complexity of the lazy allocation scheme proposed
by Tao?

I assume that the vast majority of processes in a real system
are leaf processes?

It also seems somewhat unusual that a few Android applications
invoke fork() directly in a multithreaded context, while most
use the zygote to create multiple processes for an app. Perhaps
the Android framework should discourage this pattern entirely,
and require applications to create child processes via the zygote?

If, in real-world systems, more than 95% of processes are leaf
processes, could that imply that the rmap design might be
reconsidered for a different optimization path?

[1] https://marc.info/?l=linuxppc-embedded&m=177912107460825&w=2

>
> > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > implementation while preserving the core concept.
>
> I'm afraid this encouragement would mislead us in the wrong direction,
> where all of us end up wasting time.
>
> There isn't much point in posting v2 without addressing fundamental
> questions about the design.

I suggested a v2 because the current patchset does not clearly
state what it is trying to achieve. A revised version might help
clarify the intent and make it easier to understand. Even if the
overall complexity (such as lazy allocation) makes it hard to
move forward, we may still be able to learn from it and gain
some useful inspiration.

>
> > Regardless of whether it ultimately gets merged, I hope the discussion
> > can continue.
>
> Regarding the "improving the reverse mapping subsystem" topic, a more
> constructive direction would be to carefully revisit the design
> decisions and discuss what we can do about them (that's exactly what
> Lorenzo has been doing).

I have no doubt at all about Lorenzo’s expertise in rmap and many
other mm areas. That is well understood and widely recognized.

I just think that hearing more perspectives could help us gain
additional insight and inspiration.

>
> But that's not the first thing I would recommend to a relatively new
> contributor given that it's really complicated and even the people who
> have designed and reworked the reverse mapping subsystem over the past
> 20+ years haven't come up with a fundamentally better design.
>
> Reverse mapping is a frustratingly complicated subsystem. Without
> carefully revisiting the current design, there is not much hope of
> improving things at the design level, even slightly.
>
> What I would recommend to new people instead is:
>
> 1) starting by reviewing other people's work, so that you have enough
> time to learn the historical context and subtleties of the subsystem
> without making intrusive changes (which also keeps in touch with the
> community), and
>
> 2) making progress on smaller tasks with less intrusive changes, to
> gradually build trust and be able to do more valuable work.
>

Yes, that is a good approach for new contributors.

> Unfortunately, looking at how this thread went, I see that the author is
> now in a worse position than an entirely new contributor.
>
> --
> Cheers,
> Harry / Hyeonggon

Thanks
Barry

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Lance Yang 6 days, 8 hours ago


On 2026/6/2 10:15, Barry Song wrote:
> On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> [...]
>>
>> You said discussion was welcome, yet when someone offered even a
>> small comment, you refused to continue the discussion.
>>
>> If I had known you would be this inconsistent, I would not have
>> replied to you in the first place.
>>
>> This will be my last reply to you. I will not respond again.
>>
> 
> Hi Tao,
> 
> Please don't walk away from the linux-mm community. I read your
> patchset and found it quite valuable. It not only reduces memory
> overhead, but also eliminates rmap costs for exclusive folios.
> 
> Since I'm not very confident discussing technical topics in English,
> I wrote a blog post in Chinese about your patchset:
> 
> https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
> 
> I have to admit that I found the implementation quite complex and
> in need of significant improvement. However, I think the underlying
> idea is very interesting and worth exploring further.
> 
> I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> implementation while preserving the core concept.
> 
> Regardless of whether it ultimately gets merged, I hope the discussion
> can continue.

Same here :)

Tao, please don't let this thread get you down. No first RFC is
perfect, and the idea still looks worth discussing :)

Thanks for working on this!

Cheers, Lance

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Lorenzo Stoakes 5 days, 19 hours ago

On Tue, Jun 02, 2026 at 10:46:35AM +0800, Lance Yang wrote:
>
>
> On 2026/6/2 10:15, Barry Song wrote:
> > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> > [...]
> > >
> > > You said discussion was welcome, yet when someone offered even a
> > > small comment, you refused to continue the discussion.
> > >
> > > If I had known you would be this inconsistent, I would not have
> > > replied to you in the first place.
> > >
> > > This will be my last reply to you. I will not respond again.
> > >
> >
> > Hi Tao,
> >
> > Please don't walk away from the linux-mm community. I read your
> > patchset and found it quite valuable. It not only reduces memory
> > overhead, but also eliminates rmap costs for exclusive folios.
> >
> > Since I'm not very confident discussing technical topics in English,
> > I wrote a blog post in Chinese about your patchset:
> >
> > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
> >
> > I have to admit that I found the implementation quite complex and
> > in need of significant improvement. However, I think the underlying
> > idea is very interesting and worth exploring further.
> >
> > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > implementation while preserving the core concept.
> >
> > Regardless of whether it ultimately gets merged, I hope the discussion
> > can continue.
>
> Same here :)
>
> Tao, please don't let this thread get you down. No first RFC is
> perfect, and the idea still looks worth discussing :)
>
> Thanks for working on this!

Guys, this isn't helpful.

We aren't extending anon_vma, and I am working on replacing it, that's the
bottom line.

I have presented compelling evidence suggesting this is AI generated. In
response I got more AI-generated nonsense. There's no trust, the code and
analysis are all wrong, end of discussion.

>
> Cheers, Lance
>

Thanks, Lorenzo

P.S. maintainership is utterly thankless, and I don't really expect much in
return, but honestly reading this, given the case I've made here, was
really quite disappointing.

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Barry Song 5 days, 11 hours ago

On Tue, Jun 2, 2026 at 11:37 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
>
> On Tue, Jun 02, 2026 at 10:46:35AM +0800, Lance Yang wrote:
> >
> >
> > On 2026/6/2 10:15, Barry Song wrote:
> > > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> > > [...]
> > > >
> > > > You said discussion was welcome, yet when someone offered even a
> > > > small comment, you refused to continue the discussion.
> > > >
> > > > If I had known you would be this inconsistent, I would not have
> > > > replied to you in the first place.
> > > >
> > > > This will be my last reply to you. I will not respond again.
> > > >
> > >
> > > Hi Tao,
> > >
> > > Please don't walk away from the linux-mm community. I read your
> > > patchset and found it quite valuable. It not only reduces memory
> > > overhead, but also eliminates rmap costs for exclusive folios.
> > >
> > > Since I'm not very confident discussing technical topics in English,
> > > I wrote a blog post in Chinese about your patchset:
> > >
> > > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
> > >
> > > I have to admit that I found the implementation quite complex and
> > > in need of significant improvement. However, I think the underlying
> > > idea is very interesting and worth exploring further.
> > >
> > > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > > implementation while preserving the core concept.
> > >
> > > Regardless of whether it ultimately gets merged, I hope the discussion
> > > can continue.
> >
> > Same here :)
> >
> > Tao, please don't let this thread get you down. No first RFC is
> > perfect, and the idea still looks worth discussing :)
> >
> > Thanks for working on this!
>
> Guys, this isn't helpful.
>
> We aren't extending anon_vma, and I am working on replacing it, that's the
> bottom line.

Not trying to challenge your bottom line. As explained to Harry, I
have no doubt about your expertise in rmap and many other mm
areas, and I deeply respect your work on rmap.

With more discussion, we might gain additional insight and
inspiration. What Tao has inspired me with is the idea that if we
assume most real-world processes are leaf processes, could we
simplify parts of the design?

This is why I suggested a v2, to improve the clarity of the cover
letter and make the code easier to understand, and to see whether
there is something worth considering further, even if it is not
suitable for merging.

>
> I have presented compelling evidence suggesting this is AI generated. In
> response I got more AI-generated nonsense. There's no trust, the code and
> analysis are all wrong, end of discussion.

I am not an AI expert, and I do not really use AI in kernel work,
so I am not really sure what counts as AI versus non-AI. Sorry.

>
> >
> > Cheers, Lance
> >
>
> Thanks, Lorenzo
>
> P.S. maintainership is utterly thankless, and I don't really expect much in
> return, but honestly reading this, given the case I've made here, was
> really quite disappointing.

Understood. I see your position, and I personally have great
respect and appreciation for your work on maintenance. Sorry if
my words came across as disappointing.

Best Regards
Barry

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Lorenzo Stoakes 5 days, 3 hours ago

On Wed, Jun 03, 2026 at 07:03:53AM +0800, Barry Song wrote:
> On Tue, Jun 2, 2026 at 11:37 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> >
> > On Tue, Jun 02, 2026 at 10:46:35AM +0800, Lance Yang wrote:
> > >
> > >
> > > On 2026/6/2 10:15, Barry Song wrote:
> > > > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> > > > [...]
> > > > >
> > > > > You said discussion was welcome, yet when someone offered even a
> > > > > small comment, you refused to continue the discussion.
> > > > >
> > > > > If I had known you would be this inconsistent, I would not have
> > > > > replied to you in the first place.
> > > > >
> > > > > This will be my last reply to you. I will not respond again.
> > > > >
> > > >
> > > > Hi Tao,
> > > >
> > > > Please don't walk away from the linux-mm community. I read your
> > > > patchset and found it quite valuable. It not only reduces memory
> > > > overhead, but also eliminates rmap costs for exclusive folios.
> > > >
> > > > Since I'm not very confident discussing technical topics in English,
> > > > I wrote a blog post in Chinese about your patchset:
> > > >
> > > > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
> > > >
> > > > I have to admit that I found the implementation quite complex and
> > > > in need of significant improvement. However, I think the underlying
> > > > idea is very interesting and worth exploring further.
> > > >
> > > > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > > > implementation while preserving the core concept.
> > > >
> > > > Regardless of whether it ultimately gets merged, I hope the discussion
> > > > can continue.
> > >
> > > Same here :)
> > >
> > > Tao, please don't let this thread get you down. No first RFC is
> > > perfect, and the idea still looks worth discussing :)
> > >
> > > Thanks for working on this!
> >
> > Guys, this isn't helpful.
> >
> > We aren't extending anon_vma, and I am working on replacing it, that's the
> > bottom line.
>
> Not trying to challenge your bottom line. As explained to Harry, I
> have no doubt about your expertise in rmap and many other mm
> areas, and I deeply respect your work on rmap.

Thanks, I appreciate that.

I don't mean to be 'mean' here, I'm only acting in what I feel are the best
interests of mm and the kernel.

>
> With more discussion, we might gain additional insight and
> inspiration. What Tao has inspired me with is the idea that if we
> assume most real-world processes are leaf processes, could we
> simplify parts of the design?

Maybe I didn't express it clearly enough at LSF, but this is entirely a key
point of my CoW context design :)

It's true most stuff is leaf, and yes we can take advantage of this, and CoW
context allows us to do it while also unravelling the issues with anon_vma.

I am actually thinking of possibly doing some incremental changes as part of
my work, if I can.

I maybe need to expedite that to bring some clarity to things here...

>
> This is why I suggested a v2, to improve the clarity of the cover
> letter and make the code easier to understand, and to see whether
> there is something worth considering further, even if it is not
> suitable for merging.

Right, I see. Again I'm really trying to tread a fine line here between the
technical discussion and not pouring more and more time into a discussion that's
not useful to me or the community.

See [0] as to my reasoning on this :)

[0]: https://lore.kernel.org/all/ah887A5VkXOcmq-g@lucifer/

>
> >
> > I have presented compelling evidence suggesting this is AI generated. In
> > response I got more AI-generated nonsense. There's no trust, the code and
> > analysis are all wrong, end of discussion.
>
> I am not an AI expert, and I do not really use AI in kernel work,
> so I am not really sure what counts as AI versus non-AI. Sorry.

No worries!

>
> >
> > >
> > > Cheers, Lance
> > >
> >
> > Thanks, Lorenzo
> >
> > P.S. maintainership is utterly thankless, and I don't really expect much in
> > return, but honestly reading this, given the case I've made here, was
> > really quite disappointing.
>
> Understood. I see your position, and I personally have great
> respect and appreciation for your work on maintenance. Sorry if
> my words came across as disappointing.

Thanks, appreciate it. And no worries!

>
> Best Regards
> Barry

Cheers, Lorenzo

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Pedro Falcato 5 days, 15 hours ago

On Tue, Jun 02, 2026 at 04:37:14PM +0100, Lorenzo Stoakes wrote:
> On Tue, Jun 02, 2026 at 10:46:35AM +0800, Lance Yang wrote:
> >
> >
> > On 2026/6/2 10:15, Barry Song wrote:
> > > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@honor.com> wrote:
> > > [...]
> > > >
> > > > You said discussion was welcome, yet when someone offered even a
> > > > small comment, you refused to continue the discussion.
> > > >
> > > > If I had known you would be this inconsistent, I would not have
> > > > replied to you in the first place.
> > > >
> > > > This will be my last reply to you. I will not respond again.
> > > >
> > >
> > > Hi Tao,
> > >
> > > Please don't walk away from the linux-mm community. I read your
> > > patchset and found it quite valuable. It not only reduces memory
> > > overhead, but also eliminates rmap costs for exclusive folios.
> > >
> > > Since I'm not very confident discussing technical topics in English,
> > > I wrote a blog post in Chinese about your patchset:
> > >
> > > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
> > >
> > > I have to admit that I found the implementation quite complex and
> > > in need of significant improvement. However, I think the underlying
> > > idea is very interesting and worth exploring further.
> > >
> > > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > > implementation while preserving the core concept.
> > >
> > > Regardless of whether it ultimately gets merged, I hope the discussion
> > > can continue.
> >
> > Same here :)
> >
> > Tao, please don't let this thread get you down. No first RFC is
> > perfect, and the idea still looks worth discussing :)
> >
> > Thanks for working on this!
> 
> Guys, this isn't helpful.
> 
> We aren't extending anon_vma, and I am working on replacing it, that's the
> bottom line.
> 
> I have presented compelling evidence suggesting this is AI generated. In
> response I got more AI-generated nonsense. There's no trust, the code and
> analysis are all wrong, end of discussion.

100% agree. I think plenty of technical/process/etc reasons as to why this
idea/contribution is not mergeable have been listed. Overriding this with
"keep it up!!!111!11!!" is not helpful.

-- 
Pedro

Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

Posted by Lorenzo Stoakes 5 days, 14 hours ago

On Fri, May 29, 2026 at 01:04:08PM +0100, Lorenzo Stoakes wrote:
> On Fri, May 29, 2026 at 09:41:20AM +0000, wangtao wrote:
> > > Hi Tao,
> > >
> > > Lorenzo had a discussion about rmap in Zagreb here:
> > > https://lore.kernel.org/linux-mm/aec533b2-37a7-4f44-a279-c4aa604206ac@lucifer.local/
> > >
> > > He also shared the PoC code here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/ljs/linux.git/log/?h=project/cow-context
> > >
> > > and the slides were shared as well. In case you can't find them on linux-mm (I
> > > actually couldn't find them myself), I am attaching them again here -
> > > "scalable-cow-lsf-longer-version.pdf"
> > >
> > > After coming back from Zagreb, I kept trying to find one or two full days to
> > > read Lorenzo's code and slides carefully and write a blog about them.
> > > Unfortunately, I have been completely busy with other work. Sigh... we
> > > always seem to have too many non-upstream tasks.
> > >
> > > If possible, I'd really appreciate it if you could take a deep dive into it and
> > > write a detailed blog post. I'd be very eager to read it and better understand
> > > the overall design.
> > > Otherwise, I'll try to find some time next week or later to go through it
> > > myself.
> > >
> > Hi Barry,
> >
> > Thank you very much for your reply.
> >
> > I took an initial look at the cow-context code, and a few points
> > might be worth noting:
> >
> > 1. cow_context_walk currently assumes that the rmap walk runs
> >    under RCU protection. This may need to be adjusted early,
> >    since paths such as try_to_unmap_one, page_vma_mkclean_one,
> >    and try_to_migrate_one may involve task switching.
> >
> > 2. In cow_context_walk, traverse_contexts appears to involve
> >    multiple nested loops. When there are many child processes
> >    across several fork layers, it may not be as simple or
> >    efficient as the current anon_vma approach.
> >
> >    It needs to traverse all child cow_ctx, and within each
> >    cow_ctx, remaps_for_each() has two levels of iteration:
> >    remaps_for_each_entry and remaps_for_each_entry_offset.
> >
> >    In other words, it first iterates over cow_ctx and then
> > >    traverses rmap_mt inside each one. The rough complexity
> > >    seems to be O(#proc * log(#rmap_entries_in_cow)), which
> >    may be somewhat higher than anon_vma's
> >    O(#vmas_in_anon_vma). However, in most cases the number
> >    of processes is not large, so the impact may be limited.
> >
> > Previously, I also considered converting anon_vma's rb_tree
> > to a mapletree. If one entry records a single VMA, the
> > average overhead could be less than two longs per VMA.
> >
> > However, unlike rb_tree, mapletree does not support storing
> > multiple elements under a single key. The key would need to
> > look like (vma_id/mm_id + pgoff). On 32-bit platforms, since
> > 64-bit mapletree keys are not supported yet, the remaining
> > 12 bits are not enough for vma_id/mm_id.
> >
> > Because of this limitation, I later started thinking about
> > ways to reduce anon_vma allocations instead.
> >
> > I will try to find some time next week to analyze the
> > cow-context design and code more thoroughly, and then
> > write up a summary.
>
> Tao,
>
> This response is so full of misunderstandings it's not really worth me
> responding to any of it. You've even hallucinated an imaginary field which
> is REALLY suspicious.
>
> You've no mm expertise or history and came up with this in a few hours. I
> asked Claude to analyse it and it puts it at 75-80% chance of being solely
> LLM-generated from cow_context.c.
>
> I simply don't have the time to deal with this, so unfortunately I'm going
> to have to withdraw the suggestion of further discussion with you on this
> topic.
>
> I am working on the scalable CoW project and will solicit opinions of those
> with relevant expertise.
>
> We are not interested in your approach or analysis.
>
> Thanks, Lorenzo

Apparently there's some misunderstanding about this situation here, sigh.

So for avoidance of doubt - I've now spent many hours on this, and unfortunately
(as I've already said in multiple places) this series has serious architectural
and code flaws.

And unfortunately, the anon_vma approach is not something we wish to extend, for
reasons I've gone into elsewhere - but broadly because it's a broken
abstraction, that uses lots of memory and causes lock contention.

The approach here has multiple technical issues, so many that getting into each
one would require hours more of my time to analyse, maybe all week?

And then if there were further replies and replies to the replies and respins...

However, I also feel there's substantive, overlapping, evidence of the _logic_
(not the text, we are FINE with using AI to assist text for non-native speakers)
being LLM-generated.

However you can never prove this for 100% certain. But you can certainly be more
or less sure. I would never suggest this unless I was really pretty certain.

I am very keen to avoid 'witch hunts', or rash accusations. This is not
that. It's a _carefully considered_ opinion, based on evidence.

But of course - I do not know for SURE. You can never know.

The big problem here is asymmetry of maintainer resource. I simply _cannot_
respond to every single issue here. And when the architecture is something we
don't want, then it's not really necessary to.

And my big deep underlying concern with all this is - people can generate a very
significant amount of this kind of work, and we have limited reviewer time.

I've already dealt with burnout recently that I'm thankfully recovering
from. I'm not really keen to go back to that.

I really truly worry that if we don't have a means by which we can quickly
dismiss/deprioritise things when we have _significant_ evidence of wholesale
AI generation, then maintainer overload will increase exponentially.

And that's really a serious problem.

If we treat it like simply a technically incorrect solution, then it means we
open it up to further discussion on and on, as we're actually observing here. If
the responses are also LLM-generated then it's even more problematic.

This is why I bring it up, and proactively say it's led to a real loss in trust
in this case, and why, after there was a response that included a hallucinated
field in it, I went further and said that I really don't want to have a
discussion either.

It's because of this asymmetry.

And even this reply, written at 9.45pm at night, after several hours of
discussion about this off-list, is evidence of the problem we have with this
kind of asymmetry.

It's nothing personal, it's about managing time and resources.

Thanks, Lorenzo