The zap_huge_pmd() function is overly complicated; clean it up and also add
an assert in the case that we encounter a buggy PMD entry that doesn't
match expectations.

This is motivated by a bug discovered [0] where the PMD entry was none of:

* A non-DAX, PFN or mixed map.
* The huge zero folio
* A present PMD entry
* A softleaf entry

in zap_huge_pmd(), but due to the bug we managed to reach this code.

It is useful to explicitly call this out rather than have an arbitrary NULL
pointer dereference happen, which also improves understanding of what's
going on.

[0]: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/

v2:
* Added tags, thanks everybody!
* Fixed issue with returning false on the bug case potentially looping
  forever, as per Baolin.
* Fixed further issue in the bug path in 5/8 with a double pte unlock.
* Added patch to use vm_normal_folio_pmd(), as per David.

v1: https://lore.kernel.org/all/cover.1773865827.git.ljs@kernel.org/

Lorenzo Stoakes (Oracle) (9):
  mm/huge_memory: simplify vma_is_special_huge()
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: deduplicate zap_huge_pmd() further by tracking state
  mm/huge_memory: have zap_huge_pmd() use vm_normal_folio_pmd()

 include/linux/huge_mm.h |   8 +--
 include/linux/mm.h      |  16 -----
 mm/huge_memory.c        | 141 +++++++++++++++++++++++----------------
 3 files changed, 85 insertions(+), 80 deletions(-)

--
2.53.
On Thu, 19 Mar 2026 13:00:06 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:

> The zap_huge_pmd() function is overly complicated, clean it up and also add
> an assert in the case that we encounter a buggy PMD entry that doesn't
> match expectations.
>
> This is motivated by a bug discovered [0] where the PMD entry was none of:
>
> * A non-DAX, PFN or mixed map.
> * The huge zero folio
> * A present PMD entry
> * A softleaf entry
>
> in zap_huge_pmd(), but due to the bug we managed to reach this code.
>
> It is useful to explicitly call this out rather than have an arbitrary NULL
> pointer dereference happen, which also improves understanding of what's
> going on.
>
> [0]: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/

AI review has questions, which I assume you've seen:
https://sashiko.dev/#/patchset/cover.1773924928.git.ljs%40kernel.org

This isn't going well from a workflow POV. I merge stuff (this was v2),
then half a day later a bunch of potential issues are identified.

If these reviews are useful (they seem to be, enough) then I guess I'll
need to further increase the lag between seeing-it and merging-it. But if
there's a 2-day lag before I get onto a series and I'm the first to look
at Sashiko, then that won't help.

So it needs to be something like:

- series is posted
- 24 hours pass
- submitter takes a look at the AI review, maybe prepares a new series
- 24 hours pass
- rinse, repeat
- it gets merged, hopefully with some Reviewed-by's

Not unreasonable, but it requires that the submitter be made aware of
Sashiko's comments. At present that's via me being tiresome.

Anyway, early days. I'm thinking that an emailed reply-to-all from
Sashiko will help. Much hinges on how useful submitters find these
questions to be - something which I'm paying close attention to...
Andrew Morton <akpm@linux-foundation.org> writes:

> On Thu, 19 Mar 2026 13:00:06 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
>
>> The zap_huge_pmd() function is overly complicated, clean it up and also add
>> an assert in the case that we encounter a buggy PMD entry that doesn't
>> match expectations.
>>
>> This is motivated by a bug discovered [0] where the PMD entry was none of:
>>
>> * A non-DAX, PFN or mixed map.
>> * The huge zero folio
>> * A present PMD entry
>> * A softleaf entry
>>
>> in zap_huge_pmd(), but due to the bug we managed to reach this code.
>>
>> It is useful to explicitly call this out rather than have an arbitrary NULL
>> pointer dereference happen, which also improves understanding of what's
>> going on.
>>
>> [0]: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/
>
> AI review has questions, which I assume you've seen:
> https://sashiko.dev/#/patchset/cover.1773924928.git.ljs%40kernel.org
>
> This isn't going well from a workflow POV. I merge stuff (this was v2),
> then half a day later a bunch of potential issues are identified.
>
> If these reviews are useful (they seem to be, enough) then I guess I'll
> need to further increase the lag between seeing-it and merging-it. But
> if there's a 2-day lag before I get onto a series and I'm the first to
> look at Sashiko, then that won't help.
>
> So it needs to be something like:
>
> - series is posted
> - 24 hours pass
> - submitter takes a look at the AI review, maybe prepares a new series
> - 24 hours pass
> - rinse, repeat
> - it gets merged, hopefully with some Reviewed-by's
>
> Not unreasonable, but it requires that the submitter be made aware of
> Sashiko's comments. At present that's via me being tiresome.
>
> Anyway, early days. I'm thinking that an emailed reply-to-all from
> Sashiko will help. Much hinges on how useful submitters find these
> questions to be - something which I'm paying close attention to...
For bpf, Alexei suggested always sending the review to the author and
cc'ing the bpf mailing list, but ignoring maintainers and other mailing
lists like lkml to minimize the traffic. It sounds like a good trade-off
to me.

If there are concerns about polluting the mm mailing list, maybe something
like a new list like "mm-new" / "mm-early" might work? Idk what's the best
thing to do here, just throwing out some ideas.

Likely next week I'll be able to send reviews over email, and I can send
them to whoever we think is appropriate.

Thanks!
On Fri, 20 Mar 2026 20:21:04 -0700 Roman Gushchin <roman.gushchin@linux.dev> wrote:

> > Anyway, early days. I'm thinking that an emailed reply-to-all from
> > Sashiko will help. Much hinges on how useful submitters find these
> > questions to be - something which I'm paying close attention to...
>
> For bpf, Alexei suggested always sending the review to the author and
> cc'ing the bpf mailing list, but ignoring maintainers and other mailing
> lists like lkml to minimize the traffic. It sounds like a good trade-off
> to me.

I'd like to see them. But I'm figuring it out now - I just let the
patchset chill until Sashiko has looked at it.

> If there are concerns about polluting the mm mailing list, maybe
> something like a new list like "mm-new" / "mm-early" might work?
> Idk what's the best thing to do here, just throwing out some ideas.

Yes, a dedicated list might be the way to go.

> Likely next week I'll be able to send reviews over email
> and I can send them to whoever we think is appropriate.

Cool.

A lot of patchsets are "failed to apply". What is Sashiko trying to apply
MM patches to? It would take some smarts to apply the v2 patchset when v1
is presently in mm.git?
On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> A lot of patchsets are "failed to apply". What is Sashiko trying to
> apply MM patches to? It would take some smarts to apply the v2
> patchset when v1 is presently in mm.git?

?

The way things are going at present, I'm just not going to apply a series
which Sashiko "failed to apply". And that's cool, I'll just wait for a
version which Sashiko was able to apply. And then not apply unless all
Sashiko questions are resolved or convincingly refuted.

Question please: if Sashiko finds an "issue" in v3 and then v4 comes out
with changelog words which justify the questionable alteration, can
Sashiko parse that changelog justification and think "OK, never mind"?
On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > A lot of patchsets are "failed to apply". What is Sashiko trying to
> > apply MM patches to? It would take some smarts to apply the v2
> > patchset when v1 is presently in mm.git?
>
> ?
>
> The way things are going at present, I'm just not going to apply a

50% noise vs. signal?... maybe wait until we're in the 9x%'s?

> series which Sashiko "failed to apply". And that's cool, I'll just
> wait for a version which Sashiko was able to apply. And then not
> apply unless all Sashiko questions are resolved or convincingly refuted.

Andrew, for crying out loud. Please don't do this.

2 of the 3 series I respun on Friday, working a 13-hour day to do so,
don't apply to Sashiko, but do apply to the mm tree.

I haven't the _faintest clue_ how we are supposed to factor a 3rd-party
experimental website applying or not applying series into our work??

And 'not apply unless all Sashiko questions are resolved or convincingly
refuted' is seriously concerning.

The workload is already insane; now you're expecting us to answer every
bit of nonsense Sashiko hallucinates or misunderstands also?

I say that with no disrespect to Roman or his efforts, but as discussed
at length, it is not ready for prime time yet.

It's clear that Sashiko is not correctly handling applies, and produces a
lot of noise. Predicating taking series on this is absurd.

Thanks, Lorenzo
"Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:

> On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
>> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> > A lot of patchsets are "failed to apply". What is Sashiko trying to
>> > apply MM patches to? It would take some smarts to apply the v2
>> > patchset when v1 is presently in mm.git?
>>
>> ?
>>
>> The way things are going at present, I'm just not going to apply a
>
> 50% noise vs. signal?... maybe wait until we're in the 9x%'s?
>
>> series which Sashiko "failed to apply". And that's cool, I'll just
>> wait for a version which Sashiko was able to apply. And then not
>> apply unless all Sashiko questions are resolved or convincingly refuted.
>
> Andrew, for crying out loud. Please don't do this.
>
> 2 of the 3 series I respun on Friday, working a 13-hour day to do so,
> don't apply to Sashiko, but do apply to the mm tree.

I'll look into that.

> I haven't the _faintest clue_ how we are supposed to factor a 3rd-party
> experimental website applying or not applying series into our work??
>
> And 'not apply unless all Sashiko questions are resolved or convincingly
> refuted' is seriously concerning.
>
> The workload is already insane; now you're expecting us to answer every
> bit of nonsense Sashiko hallucinates or misunderstands also?
>
> I say that with no disrespect to Roman or his efforts, but as discussed
> at length, it is not ready for prime time yet.
>
> It's clear that Sashiko is not correctly handling applies, and produces a
> lot of noise. Predicating taking series on this is absurd.

Not trying to pretend that Sashiko is perfect in any way, I think a good
mental exercise is to put down our expectations of how the "perfect"
system would work. The more I work on it, the more I realize it's far
from binary correct/incorrect.
In fact, the same applies to humans: I'm sure every one of us has at some
point had the feeling that someone is too picky and is just annoying us by
finding small nits. At the same time, some of these people are extremely
useful to the community for finding and fixing a lot of issues. In the
end, we argue all the time about questions/issues raised by human
reviewers.

Like, do we prefer a system which finds more real bugs at the cost of
being more noisy, or do we prefer a system which misses more but, if it
points at a bug, it's certainly real? I'm sure you're tempted to prefer
the latter, but imagine a hypothetical system which finds _all_ bugs but
has some false positive rate, e.g. 20%. I think it's pretty attractive.

Also, a lot of raised issues are real but subjectively are not worth our
time. But this is extremely subjective! It depends on the personal level
of perfectionism, amount of time available, the state of the code before,
further plans, etc. etc. For example, syzkaller usually has O(100s) of
open bugs, which are 100% real, but are not always high-priority work.

I think that asking to address 100% of issues raised by any LLM is not
reasonable (especially because its output might be different each time you
run it with the same input), but I also think it's reasonable to address
critical & high severity concerns. And I'm happy to tweak Sashiko to be
more conservative here, but I think it should be based on some specific
examples or data, not purely subjective.

tl;dr I increasingly realize the importance of the social context for
providing good reviews, and it can't be easily derived from the code. What
is acceptable in one subsystem is considered a bad practice in another. I
guess the only way to get a system we all find acceptable (and we still
might not like it, who likes being pointed at their bugs?) is to
collectively codify our expectations in prompts on a per-subsystem basis.

Thanks!
On Mon, Mar 23, 2026 at 06:08:27PM -0700, Roman Gushchin wrote:
> "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:
>
> > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
> >> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> >>
> >> > A lot of patchsets are "failed to apply". What is Sashiko trying to
> >> > apply MM patches to? It would take some smarts to apply the v2
> >> > patchset when v1 is presently in mm.git?
> >>
> >> ?
> >>
> >> The way things are going at present, I'm just not going to apply a
> >
> > 50% noise vs. signal?... maybe wait until we're in the 9x%'s?
> >
> >> series which Sashiko "failed to apply". And that's cool, I'll just
> >> wait for a version which Sashiko was able to apply. And then not
> >> apply unless all Sashiko questions are resolved or convincingly refuted.
> >
> > Andrew, for crying out loud. Please don't do this.
> >
> > 2 of the 3 series I respun on Friday, working a 13-hour day to do so,
> > don't apply to Sashiko, but do apply to the mm tree.
>
> I'll look into that.

Thanks.

> > I haven't the _faintest clue_ how we are supposed to factor a 3rd-party
> > experimental website applying or not applying series into our work??
> >
> > And 'not apply unless all Sashiko questions are resolved or convincingly
> > refuted' is seriously concerning.
> >
> > The workload is already insane; now you're expecting us to answer every
> > bit of nonsense Sashiko hallucinates or misunderstands also?
> >
> > I say that with no disrespect to Roman or his efforts, but as discussed
> > at length, it is not ready for prime time yet.
> >
> > It's clear that Sashiko is not correctly handling applies, and produces a
> > lot of noise. Predicating taking series on this is absurd.
>
> Not trying to pretend that Sashiko is perfect in any way, I think a good
> mental exercise is to put down our expectations of how the "perfect"
> system would work.
> The more I work on it, the more I realize it's far from

Throughout this discussion I have been making practical points. Nobody
expects perfection.

I am simply saying that unilaterally demanding, out of the blue and
without any community input or input from those doing review, that every
single point sashiko raises is responded to, AND requiring that somehow
all series apply, is not good.

BTW, I don't want to make you the scapegoat for complaints about mm
process here :) so I am being careful not to criticise, as I realise that
when people are frustrated with tooling, even if it's _totally
irrelevant_ to you as the maker of the tool, they will instinctively want
to blame you.

I refuse to fall into this trap ;)

> binary correct/incorrect. In fact, the same applies to humans: I'm sure
> every one of us has at some point had the feeling that someone is too
> picky and is just annoying us by finding small nits. At the same time,
> some of these people are extremely useful to the community for finding
> and fixing a lot of issues. In the end, we argue all the time about
> questions/issues raised by human reviewers.

Yes, except human reviewers generally evolve over time to be pretty high
signal if they remain consistent; that is at least how it is in mm. Even
if you think points are trivial.

Sashiko is hallucinating, it is raising irrelevant points that have
nothing to do with the series, and it's creating responses that require
serious time to decode.

I have not encountered review in mm that is even anywhere near the ~50%
hit rate, with the rest potentially violently wrong/wildly irrelevant,
that sashiko generates.

There's an asymmetry too - sashiko can just keep on generating this stuff
indefinitely (well, limited by tokens of course :), and potentially
generate serious useless work for submitters and reviewers.

We _have_ to take that into account when it comes to the review process.

Again, this is nothing to do with the tooling, which I'm grateful for;
again, it's to do with mm process.
And sadly you've been dragged into a debate on this which you are
ultimately more or less orthogonal to :)

> Like, do we prefer a system which finds more real bugs at the cost of
> being more noisy, or do we prefer a system which misses more but, if it
> points at a bug, it's certainly real? I'm sure you're tempted to prefer
> the latter, but imagine a hypothetical system which finds _all_ bugs but
> has some false positive rate, e.g. 20%. I think it's pretty attractive.

I think we are very far from that right now. The issue is how it is
_now_, not in some imagined future.

And it's easy to pontificate about all this, but in the end it's the
sub-maintainers in mm who will have to eventually figure out whether a
series is ok or not, and have to decide stuff people might do based on
hallucinations/irrelevant points etc.

Right now this is going to result in _more work_ for us, and already it
feels like in mm the sub-maintainers are the reason things function
reasonably, but we don't seem to be having our voices heard here.

> Also, a lot of raised issues are real but subjectively are not worth our
> time. But this is extremely subjective! It depends on the personal level
> of perfectionism, amount of time available, the state of the code
> before, further plans, etc. etc. For example, syzkaller usually has
> O(100s) of open bugs, which are 100% real, but are not always
> high-priority work.

I don't think it's anywhere near as subjective as you say, and I think
that's easy to hand-wave.

One issue here is - trust. There are people in the community we trust, to
whom we assign M: and R: entries in MAINTAINERS.

Trust on taste, judgement etc.

Now sashiko is essentially proposed to be given the same trust despite
absolutely not deserving it.

What I propose, as I did in the other sub-thread here, is to use it as a
_tool_ to _help_ sub-maintainers do their job.

Not for it to become a new trusted gatekeeper, out of the blue and
unilaterally, while adding to our workload.
> I think that asking to address 100% of issues raised by any LLM is not
> reasonable (especially because its output might be different each time

Really, again with respect, and trying to dodge the 'blame the tool
maker' thing :) that's something of a strawman; nobody is saying they
require that.

I think >~50% signal is a reasonable ask though.

> you run it with the same input), but I also think it's reasonable to
> address critical & high severity concerns. And I'm happy to tweak

Right, but with respect, you're not an mm maintainer who has to deal with
the resultant fallout :)

> Sashiko to be more conservative here, but I think it should be based on
> some specific examples or data, not purely subjective.

Well, you can't both say all review is highly subjective and
simultaneously ask for objective feedback :)

I have provided detailed feedback on a specific example elsewhere, and
I'm telling you as an experienced mm maintainer that the hit rate is ~50%
in my experience so far.

I'm happy to feed back more, but it's again a time and workload thing -
the default here shouldn't be that mm is just taking sashiko input as
read and we have to jump on everything to explicitly say it's
right/wrong.

Ideally we'd have some way of feeding back on the website, even if it's
as simple as a tick/cross as to which points you are actually accepting
or not. That'd be great I think!

That could be useful as well to Andrew, who could see that in action.

User-login-wise, you could have some system where somebody could send a
mail from the account that is being reviewed to get a login or something?

> tl;dr I increasingly realize the importance of the social context for
> providing good reviews, and it can't be easily derived from the code.

Yes, for sure.

> What is acceptable in one subsystem is considered a bad practice in
> another. I guess the only way to get a system we all find acceptable
> (and we still might not like it, who likes being pointed at their bugs?)
> is to collectively codify our expectations in prompts on a per-subsystem
> basis.

Well, not only that; we need to figure out, per subsystem, what our
process will be.

Again, the contentiousness here is not around your tooling but really
around the unilateral announcement that we're just going to block on
sashiko now.

And on that I am pushing back with detailed points, as per the rest of
the thread.

> Thanks!

Cheers, Lorenzo
"Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:

> On Mon, Mar 23, 2026 at 06:08:27PM -0700, Roman Gushchin wrote:
>> "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:
>>
>> > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
>> >> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>> >>
>> >> > A lot of patchsets are "failed to apply". What is Sashiko trying to
>> >> > apply MM patches to? It would take some smarts to apply the v2
>> >> > patchset when v1 is presently in mm.git?
>> >>
>> >> ?
>> >>
>> >> The way things are going at present, I'm just not going to apply a
>> >
>> > 50% noise vs. signal?... maybe wait until we're in the 9x%'s?
>> >
>> >> series which Sashiko "failed to apply". And that's cool, I'll just
>> >> wait for a version which Sashiko was able to apply. And then not
>> >> apply unless all Sashiko questions are resolved or convincingly refuted.
>> >
>> > Andrew, for crying out loud. Please don't do this.
>> >
>> > 2 of the 3 series I respun on Friday, working a 13-hour day to do so,
>> > don't apply to Sashiko, but do apply to the mm tree.
>>
>> I'll look into that.
>
> Thanks.
>
>> > I haven't the _faintest clue_ how we are supposed to factor a 3rd-party
>> > experimental website applying or not applying series into our work??
>> >
>> > And 'not apply unless all Sashiko questions are resolved or convincingly
>> > refuted' is seriously concerning.
>> >
>> > The workload is already insane; now you're expecting us to answer every
>> > bit of nonsense Sashiko hallucinates or misunderstands also?
>> >
>> > I say that with no disrespect to Roman or his efforts, but as discussed
>> > at length, it is not ready for prime time yet.
>> >
>> > It's clear that Sashiko is not correctly handling applies, and produces a
>> > lot of noise.
>>
>> Not trying to pretend that Sashiko is perfect in any way, I think a good
>> mental exercise is to put down our expectations of how the "perfect"
>> system would work. The more I work on it, the more I realize it's far from
>
> Throughout this discussion I have been making practical points. Nobody
> expects perfection.
>
> I am simply saying that unilaterally demanding, out of the blue and
> without any community input or input from those doing review, that every
> single point sashiko raises is responded to, AND requiring that somehow
> all series apply, is not good.

I never suggested this, and explicitly wrote so below (but it looks like
I wasn't clear enough and you are arguing with this statement).

> BTW, I don't want to make you the scapegoat for complaints about mm
> process here :) so I am being careful not to criticise, as I realise that
> when people are frustrated with tooling, even if it's _totally
> irrelevant_ to you as the maker of the tool, they will instinctively want
> to blame you.
>
> I refuse to fall into this trap ;)

Agree. Let's separate the mm process from everything else here, otherwise
it quickly becomes too messy.

>> binary correct/incorrect. In fact, the same applies to humans: I'm sure
>> every one of us has at some point had the feeling that someone is too
>> picky and is just annoying us by finding small nits. At the same time,
>> some of these people are extremely useful to the community for finding
>> and fixing a lot of issues. In the end, we argue all the time about
>> questions/issues raised by human reviewers.
>
> Yes, except human reviewers generally evolve over time to be pretty high
> signal if they remain consistent; that is at least how it is in mm. Even
> if you think points are trivial.
>
> Sashiko is hallucinating, it is raising irrelevant points that have
> nothing to do with the series, and it's creating responses that require
> serious time to decode.
>
> I have not encountered review in mm that is even anywhere near the ~50%
> hit rate, with the rest potentially violently wrong/wildly irrelevant,
> that sashiko generates.
>
> There's an asymmetry too - sashiko can just keep on generating this stuff
> indefinitely (well, limited by tokens of course :), and potentially
> generate serious useless work for submitters and reviewers.
>
> We _have_ to take that into account when it comes to the review process.
>
> Again, this is nothing to do with the tooling, which I'm grateful for;
> again, it's to do with mm process. And sadly you've been dragged into a
> debate on this which you are ultimately more or less orthogonal to :)
>
>> Like, do we prefer a system which finds more real bugs at the cost of
>> being more noisy, or do we prefer a system which misses more but, if it
>> points at a bug, it's certainly real? I'm sure you're tempted to prefer
>> the latter, but imagine a hypothetical system which finds _all_ bugs but
>> has some false positive rate, e.g. 20%. I think it's pretty attractive.
>
> I think we are very far from that right now. The issue is how it is
> _now_, not in some imagined future.
>
> And it's easy to pontificate about all this, but in the end it's the
> sub-maintainers in mm who will have to eventually figure out whether a
> series is ok or not, and have to decide stuff people might do based on
> hallucinations/irrelevant points etc.
>
> Right now this is going to result in _more work_ for us, and already it
> feels like in mm the sub-maintainers are the reason things function
> reasonably, but we don't seem to be having our voices heard here.
>
>> Also, a lot of raised issues are real but subjectively are not worth our
>> time. But this is extremely subjective! It depends on the personal level
>> of perfectionism, amount of time available, the state of the code
>> before, further plans, etc. etc. For example, syzkaller usually has
>> O(100s) of open bugs, which are 100% real, but are not always
>> high-priority work.
>
> I don't think it's anywhere near as subjective as you say, and I think
> that's easy to hand-wave.
>
> One issue here is - trust. There are people in the community we trust, to
> whom we assign M: and R: entries in MAINTAINERS.
>
> Trust on taste, judgement etc.
>
> Now sashiko is essentially proposed to be given the same trust despite
> absolutely not deserving it.

I don't remember anyone ever saying this; at least I definitely did not.
I think Sashiko can be really useful in finding mechanical bugs, so that
_eventually_ maintainers can spend most of their cycles thinking about
the direction and high-level ideas rather than checking whether all gotos
in error-handling paths are correct.

> What I propose, as I did in the other sub-thread here, is to use it as a
> _tool_ to _help_ sub-maintainers do their job.
>
> Not for it to become a new trusted gatekeeper, out of the blue and
> unilaterally, while adding to our workload.
>
>> I think that asking to address 100% of issues raised by any LLM is not
>> reasonable (especially because its output might be different each time
>
> Really, again with respect, and trying to dodge the 'blame the tool
> maker' thing :) that's something of a strawman; nobody is saying they
> require that.
>
> I think >~50% signal is a reasonable ask though.

I think you misinterpreted me.

>> you run it with the same input), but I also think it's reasonable to
>> address critical & high severity concerns. And I'm happy to tweak
>
> Right, but with respect, you're not an mm maintainer who has to deal with
> the resultant fallout :)

I am, btw :)

>> Sashiko to be more conservative here, but I think it should be based on
>> some specific examples or data, not purely subjective.
>
> Well, you can't both say all review is highly subjective and
> simultaneously ask for objective feedback :)
>
> I have provided detailed feedback on a specific example elsewhere, and
> I'm telling you as an experienced mm maintainer that the hit rate is ~50%
> in my experience so far.
>
> I'm happy to feed back more, but it's again a time and workload thing -
> the default here shouldn't be that mm is just taking sashiko input as
> read and we have to jump on everything to explicitly say it's
> right/wrong.
>
> Ideally we'd have some way of feeding back on the website, even if it's
> as simple as a tick/cross as to which points you are actually accepting
> or not. That'd be great I think!
>
> That could be useful as well to Andrew, who could see that in action.
>
> User-login-wise, you could have some system where somebody could send a
> mail from the account that is being reviewed to get a login or something?

This is an option. We have to agree (at least on a per-subsystem basis)
what the best option is here. For me as the Sashiko developer it doesn't
really matter which way I get the signal - I need the signal.

Thanks
On Tue, Mar 24, 2026 at 08:24:44AM -0700, Roman Gushchin wrote:
> "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:
>
> > On Mon, Mar 23, 2026 at 06:08:27PM -0700, Roman Gushchin wrote:
> >> "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:
> >>
> >> > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
> >> >> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> >> >>
> >> >> > A lot of patchsets are "failed to apply". What is Sashiko trying to
> >> >> > apply MM patches to? It would take some smarts to apply the v2
> >> >> > patchset when v1 is presently in mm.git?
> >> >>
> >> >> ?
> >> >>
> >> >> The way things are going at present, I'm just not going to apply a
> >> >
> >> > 50% noise vs. signal?... maybe wait until we're in the 9x%'s?
> >> >
> >> >> series which Sashiko "failed to apply". And that's cool, I'll just
> >> >> wait for a version which Sashiko was able to apply. And then not
> >> >> apply unless all Sashiko questions are resolved or convincingly refuted.
> >> >
> >> > Andrew, for crying out loud. Please don't do this.
> >> >
> >> > 2 of the 3 series I respun on Friday, working a 13-hour day to do so,
> >> > don't apply to Sashiko, but do apply to the mm tree.
> >>
> >> I'll look into that.
> >
> > Thanks.
> >
> >> > I haven't the _faintest clue_ how we are supposed to factor a 3rd-party
> >> > experimental website applying or not applying series into our work??
> >> >
> >> > And 'not apply unless all Sashiko questions are resolved or convincingly
> >> > refuted' is seriously concerning.
> >> >
> >> > The workload is already insane; now you're expecting us to answer every
> >> > bit of nonsense Sashiko hallucinates or misunderstands also?
> >> >
> >> > I say that with no disrespect to Roman or his efforts, but as discussed
> >> > at length, it is not ready for prime time yet.
> >> >
> >> > It's clear that Sashiko is not correctly handling applies, and produces a
> >> > lot of noise.
Predicating taking series on this is absurd. > >> > >> Not trying to pretend that Sashiko is perfect in any way, I think a good > >> mental exercise is to put down our expectation of how the "perfect" system > >> would work. The more I work on it, the more I realize it's far from > > > > Throughout this discussion I have been making practical points. Nobody > > expects perfection. > > > > I am simply saying unilaterally demanding that every single point sashiko > > raises is responded to out of the blue without any community input or input > > from those doing review AND requiring somehow series all apply is not > > good. > > I never suggested this and explicitly wrote it below (but looks like I > wasn't clear enough and you argue with this statement). Yeah, Andrew has proposed this, nothing to do with you! > > > > > BTW, I don't want to make you the scapegoat for complaints about mm process > > here :) so I am being careful not to criticise, as I realise when people > > are frustrated with tooling even if _totally irrelevant_ to you as the > > maker of the tool, will instinctively want to blame you. > > > > I refuse to fall into this trap ;) > > Agree. Let's separate the mm process from everything else here, > otherwise it quickly becomes too messy. Yup :) > > > > >> binary correct/incorrect. In fact, the same applies to humans: I'm sure > >> every one of us had once this feeling that someone is too picky and just > >> annoying us with finding small nits. At the same time some of these > >> people are extremely useful for the community to find and fix a lot of > >> issues. In the end, we do argue all the time about questions/issues > >> raised by human reviewers. > > > > Yes except human reviewers generally evolve over time to be pretty high > > signal if they remain consistent, that is at least how it is in mm. Even if > > you think points are trivial.
> > > > Sashiko is hallucinating, it is raising irrelevant points that have nothing > > to do with the series, it's creating responses that require serious time to > > decode. > > > > I have not encountered review in mm that is even anywhere near the ~50% hit > > rate that sashiko generates, with the rest potentially violently > > wrong/wildly irrelevant. > > > > There's an asymmetry too - sashiko can just keep on generating this stuff > > indefinitely (well, limited by tokens of course :), and potentially > > generate serious useless work for submitters and reviewers. > > > > We _have_ to take that into account when it comes to the review process. > > > > Again, this is nothing to do with the tooling, for which I'm grateful, again > > it's to do with mm process. And sadly you've been dragged into a debate on > > this which you are ultimately more or less orthogonal to :) > > > >> > >> Like do we prefer a system, which finds more real bugs at the cost of being > >> more noisy or we prefer a system which misses more but if it points at > >> the bug, it's certainly real? I'm sure you're tempted to prefer the latter, > >> but imagine a hypothetical system which finds _all_ bugs, but has some false > >> positive rate, e.g. 20%. I think it's pretty attractive. > > > > I think we are very far from that right now. The issue is how it is _now_ > > not in some imagined future. > > > > And it's easy to pontificate about all this, but in the end it's the > > sub-maintainers in mm who will have to eventually figure out whether a > > series is ok or not, and have to decide stuff people might do based on > > hallucinations/irrelevant points etc. > > > > Right now this is going to result in _more work_ for us, and already it > > feels like in mm the sub-maintainers are the reason things function > > reasonably, but we don't seem to be having our voices heard here. > > > >> > >> Also a lot of raised issues are real, but subjectively are not worth our > >> time. But this is extremely subjective!
Depends on the personal level > >> of perfectionism, amount of time available, the state of code before, > >> further plans, etc etc. For example, syzkaller has usually o(100's) open > >> bugs, which are 100% real, but are not always high priority work. > > > > I don't think it's anywhere near as subjective as you say, and I think > > that's easy to hand wave. > > > > One issue here is - trust. There are people in the community we trust to > > whom we assign M: and R: entries in MAINTAINERS. > > > > Trust on taste, judgement etc. > > > > Now sashiko is essentially proposed to be given the same trust despite > > absolutely not deserving it. > > I don't remember anyone ever saying this, at least I definitely did not. Andrew has said that every single point sashiko raises needs to be addressed or patches will not be taken, that's again a separate process issue. > > I think Sashiko can be really useful in finding mechanical bugs, so that > _eventually_ maintainers can spend most of their cycles thinking about > the direction and high-level ideas rather than checking if all gotos in > error handling paths are correct. > > > > > What I propose, as I did in the other sub-thread here, is to use it as a > > _tool_ to _help_ sub-maintainers do their job. > > > > Not for it to become a new trusted gatekeeper out of the blue and > > unilaterally while adding to our workload. > > > >> > >> I think that asking to address 100% issues raised by any LLM is not > >> reasonable (especially because its output might be different each time > > > > Really, again with respect and trying to dodge the 'blame the tool maker' > > thing :) that's something of a strawman, nobody is saying they require > > that. > > > > I think >~50% signal is a reasonable ask though. > > I think you misinterpreted me. Right, but this is broadly the hit rate I've experienced.
It's not a criticism, just saying that from an RoI point of view, I'd want to see that be higher before putting in _stringent_ requirements as to having to address points. > > > > >> you run it with the same input), but I also think it's reasonable to > >> address critical & high severity concerns. And I'm happy to tweak > > > > Right, but with respect you're not an mm maintainer who has to deal with > > the resultant fallout :) > > I am btw :) Oh damn I am so sorry! That is me being a scatterbrain and not some strange kind of insult or something :P I promise! I was thinking of you with your sashiko hat on :) The point of saying that was to emphasise the process side of things, and it being separate of course. > > > > >> Sashiko to be more conservative here, but I think it should be based on > >> some specific examples or data, not purely subjective. > > > > Well you can't both say all review is highly subjective and simultaneously > > ask for objective feedback :) > > > > I have provided detailed feedback on a specific example elsewhere, and I'm > > telling you as an experienced mm maintainer that the hit rate is ~50% in my > > experience so far. > > > > I'm happy to feedback more, but it's again a time and workload thing - the > > default here shouldn't be that mm is just taking sashiko input as read and > > we have to jump on everything to explicitly say it's right/wrong. > > > > Ideally we'd have some way of feeding back on the website, even if it's as > > simple as a tick/cross as to what points you are actually accepting or > > not. That'd be great I think! > > > > That could be useful as well to Andrew who could see that in action. > > > > User login wise you could have some system where somebody could send a mail > > from the account that is being reviewed to get a login or something?
For me as Sashiko developer it doesn't > really matter which way I get the signal - I need the signal. Right, but from a workflow point of view, it's not really workable to have to respond to every input in any kind of detail. So to me something super simple like tick/cross on responses would be great. > > Thanks Cheers, Lorenzo
On Mon, Mar 23, 2026 at 11:31:29AM +0000, Lorenzo Stoakes (Oracle) wrote: > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote: > > On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > > > > A lot of patchsets are "failed to apply". What is Sashiko trying to > > > apply MM patches to? It would take some smarts to apply the v2 > > > patchset when v1 is presently in mm.git? > > > > ? > > > > The way things are going at present, I'm just not going to apply a > > 50% noise vs. signal?... maybe wait until we're in the 9x'%s? > > > series which Sashiko "failed to apply". And that's cool, I'll just > > wait for a version which Sashiko was able to apply. And then not > > apply unless all Sashiko questions are resolved or convincingly refuted. > > Andrew, for crying out loud. Please don't do this. > > 2 of the 3 series I respan on Friday, working a 13 hour day to do so, don't > apply to Sashiko, but do apply to the mm tree. > > I haven't the _faintest clue_ how we are supposed to factor a 3rd party > experimental website applying or not applying series into our work?? > > And 'not apply unless all Sashiko questions are resolved or convincingly > refuted.' is seriously concerning. FWIW I wholeheartedly agree. I don't understand how we don't require proper M: or R: reviews on patches before merging, but now out of the blue require the magic AI LLM thingy to review it before it's merged. Like, sure, sashiko can be useful, and is better than nothing. But unless sashiko is better than the maintainers, it should be kept as optional. Seriously, I can't wrap my head around the difference in treatment in "human maintainers, experts in the code, aren't required to review a patch" vs "make the fscking AI happy or it's not going anywhere". It's almost insulting. -- Pedro
On Mon, 23 Mar 2026 12:34:31 +0000 Pedro Falcato <pfalcato@suse.de> wrote: > On Mon, Mar 23, 2026 at 11:31:29AM +0000, Lorenzo Stoakes (Oracle) wrote: > > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote: > > > On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > > > > > > A lot of patchsets are "failed to apply". What is Sashiko trying to > > > > apply MM patches to? It would take some smarts to apply the v2 > > > > patchset when v1 is presently in mm.git? > > > > > > ? > > > > > > The way things are going at present, I'm just not going to apply a > > > > 50% noise vs. signal?... maybe wait until we're in the 9x'%s? > > > > > series which Sashiko "failed to apply". And that's cool, I'll just > > > wait for a version which Sashiko was able to apply. And then not > > > apply unless all Sashiko questions are resolved or convincingly refuted. > > > > Andrew, for crying out loud. Please don't do this. > > > > 2 of the 3 series I respan on Friday, working a 13 hour day to do so, don't > > apply to Sashiko, but do apply to the mm tree. > > > > I haven't the _faintest clue_ how we are supposed to factor a 3rd party > > experimental website applying or not applying series into our work?? > > > > And 'not apply unless all Sashiko questions are resolved or convincingly > > refuted.' is seriously concerning. > > FWIW I wholeheartedly agree. I don't understand how we don't require proper > M: or R: reviews on patches before merging I wish people would stop making this claim, without substantiation. I've looked (deeply) at the data, which is equally available to us all. Has anyone else? After weeding out a few special cases (especially DAMON) (this time also maple_tree), the amount of such unreviewed material which enters mm-stable and mainline is very very low. > Like, sure, sashiko can be useful, and is better than nothing. But unless > sashiko is better than the maintainers, it should be kept as optional. 
Rule #1 is, surely, "don't add bugs". This thing finds bugs. If its hit rate is 50% then that's plenty high enough to justify people spending time to go through and check its output. > Seriously, I can't wrap my head around the difference in treatment in > "human maintainers, experts in the code, aren't required to review a patch" Speaking of insulting. > vs "make the fscking AI happy or it's not going anywhere". It's almost > insulting. Look, I know people are busy. If checking these reports slows us down and we end up merging less code and less buggy code then that's a good tradeoff. Also, gimme a break. Like everyone else I'm still trying to wrap my head how best to incorporate this new tool into our development processes.
On Mon, Mar 23, 2026 at 02:36:04PM -0700, Andrew Morton wrote: > On Mon, 23 Mar 2026 12:34:31 +0000 Pedro Falcato <pfalcato@suse.de> wrote: > > > > FWIW I wholeheartedly agree. I don't understand how we don't require proper > > M: or R: reviews on patches before merging > > I wish people would stop making this claim, without substantiation. > I've looked (deeply) at the data, which is equally available to us all. > Has anyone else? > > After weeding out a few special cases (especially DAMON) (this time > also maple_tree), the amount of such unreviewed material which enters > mm-stable and mainline is very very low.

Here's a breakout of MM commit tags (with DAMON excluded) since 6.10:

------------------------------------------------------------------------------
Release   Total   Reviewed-by   Acked-by only   No review   DAMON excl
------------------------------------------------------------------------------
v6.10     318     206 (65%)      36 (11%)        76 (24%)    10
v6.11     270     131 (49%)      72 (27%)        67 (25%)    17
v6.12     333     161 (48%)      65 (20%)       107 (32%)    18
v6.13     180      94 (52%)      29 (16%)        57 (32%)     8
v6.14     217     103 (47%)      40 (18%)        74 (34%)    30
v6.15     289     129 (45%)      45 (16%)       115 (40%)    43
v6.16     198     126 (64%)      44 (22%)        28 (14%)    16
v6.17     245     181 (74%)      41 (17%)        23 (9%)     53
v6.18     205     150 (73%)      28 (14%)        27 (13%)    34
v6.19     228     165 (72%)      33 (14%)        30 (13%)    64
------------------------------------------------------------------------------

There's indeed a sharp reduction in the amount of unreviewed material that gets merged since v6.15, i.e. after the last LSF/MM when we updated the process and nominated people as sub-maintainers and reviewers for different parts of MM. This very much confirms that splitting up the MM entry and letting people step up as sub-maintainers pays off.

But we are still at double digits for percentage of commits without Reviewed-by tags despite the effort people (especially David and Lorenzo) are putting into review. I wouldn't say that even 9% is "very very low".
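As an aside, per-release tag counts like the table above can be derived mechanically from commit trailers. Below is a minimal, illustrative shell sketch - the "@@COMMIT@@" separator, the release range, and the mm/ path filter are my assumptions, not necessarily the exact method used to produce the numbers here:

```shell
#!/bin/sh
# Hypothetical sketch of counting review tags per release. A commit with
# Reviewed-by counts as reviewed; one with only Acked-by counts as
# "acked only"; one with neither counts as "no review".

count_tags() {
	# Reads a stream where "@@COMMIT@@" lines separate commit bodies,
	# and classifies each commit by the trailers it contains.
	awk '
		/^@@COMMIT@@$/ { if (seen) finish(); seen = 1; rb = ab = 0; next }
		/^Reviewed-by:/ { rb = 1 }
		/^Acked-by:/    { ab = 1 }
		function finish() {
			total++
			if (rb) reviewed++
			else if (ab) acked_only++
			else none++
		}
		END {
			if (seen) finish()
			printf "total=%d reviewed=%d acked_only=%d none=%d\n",
			       total, reviewed, acked_only, none
		}'
}

# Against a real tree one might run something like (illustrative):
#   git log --format='@@COMMIT@@%n%b' v6.18..v6.19 -- mm/ | count_tags

# Self-contained demo on fabricated commit bodies:
printf '%s\n' \
	'@@COMMIT@@' 'Reviewed-by: A <a@x>' \
	'@@COMMIT@@' 'Acked-by: B <b@x>' \
	'@@COMMIT@@' 'Some body text' \
	| count_tags
# -> total=3 reviewed=1 acked_only=1 none=1
```

Any real accounting would also need to exclude DAMON, handle merge commits, and decide how to treat author-supplied tags, which is presumably where the small differences between the tables in this thread come from.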
> > Like, sure, sashiko can be useful, and is better than nothing. But unless > > sashiko is better than the maintainers, it should be kept as optional. > > Rule #1 is, surely, "don't add bugs". This thing finds bugs. If its > hit rate is 50% then that's plenty high enough to justify people > spending time to go through and check its output. > > > Seriously, I can't wrap my head around the difference in treatment in > > "human maintainers, experts in the code, aren't required to review a patch" > > Speaking of insulting. > > > vs "make the fscking AI happy or it's not going anywhere". It's almost > > insulting. > > Look, I know people are busy. If checking these reports slows us down > and we end up merging less code and less buggy code then that's a good > tradeoff. If you think this is a good trade-off, then slowing down to wait for human review so we merge less buggy or less maintainable code is a good trade-off too. While LLMs can detect potential bugs, they are not capable of identifying potential maintainability issues. > Also, gimme a break. Like everyone else I'm still trying to wrap my > head how best to incorporate this new tool into our development > processes. It would be nice if we had a more formal description of our development process in Documentation/process/maintainer-mm.rst and then we can add a few sentences about how to incorporate this tool into the process when we figure this out. Right now our process is tribal knowledge; having "Rule #1" and a few others written down would help everyone who participates in MM development. -- Sincerely yours, Mike.
On Tue, Mar 24, 2026 at 09:58:12AM +0200, Mike Rapoport wrote: > On Mon, Mar 23, 2026 at 02:36:04PM -0700, Andrew Morton wrote: > > On Mon, 23 Mar 2026 12:34:31 +0000 Pedro Falcato <pfalcato@suse.de> wrote: > > > > > > FWIW I wholeheartedly agree. I don't understand how we don't require proper > > > M: or R: reviews on patches before merging > > > > I wish people would stop making this claim, without substantiation. > > I've looked (deeply) at the data, which is equally available to us all. > > Has anyone else? > > > > After weeding out a few special cases (especially DAMON) (this time > > also maple_tree), the amount of such unreviewed material which enters > > mm-stable and mainline is very very low. > > Here's a breakout of MM commit tags (with DAMON excluded) since 6.10:
>
> ------------------------------------------------------------------------------
> Release   Total   Reviewed-by   Acked-by only   No review   DAMON excl
> ------------------------------------------------------------------------------
> v6.10     318     206 (65%)      36 (11%)        76 (24%)    10
> v6.11     270     131 (49%)      72 (27%)        67 (25%)    17
> v6.12     333     161 (48%)      65 (20%)       107 (32%)    18
> v6.13     180      94 (52%)      29 (16%)        57 (32%)     8
> v6.14     217     103 (47%)      40 (18%)        74 (34%)    30
> v6.15     289     129 (45%)      45 (16%)       115 (40%)    43
> v6.16     198     126 (64%)      44 (22%)        28 (14%)    16
> v6.17     245     181 (74%)      41 (17%)        23 (9%)     53
> v6.18     205     150 (73%)      28 (14%)        27 (13%)    34
> v6.19     228     165 (72%)      33 (14%)        30 (13%)    64
> ------------------------------------------------------------------------------

Thanks Mike, I've gone a bit deeper, classifying based on the _actually_ requested requirement of sub-maintainer R-b or A-b (not all reviews are equal), and since sub-M's were in place ~6.15.
I exclude DAMON from everything, which seems pretty arbitrary, but for the sake of being generous: (getting some slightly different total numbers, likely due to mildly varying filters)

------------------------------------------------------------------------------
Release        Total   Sub-M signoff       No sub-M signoff
------------------------------------------------------------------------------
v6.15          289      136/289 (47.1%)    153/289 (52.9%)
v6.16          198      147/198 (74.2%)     51/198 (25.8%)
v6.17          245      201/245 (82.0%)     44/245 (18.0%)
v6.18          206      155/206 (75.2%)     51/206 (24.8%)
v6.19          232      181/232 (78.0%)     51/232 (22.0%)
v7.0 (so far)  188      135/188 (71.8%)     53/188 (28.2%)
v6.15..v7.0    1358     955/1358 (70.3%)   403/1358 (29.7%)
------------------------------------------------------------------------------

Now if we consider series _sent_ by sub-M's as being reviewed by default:

------------------------------------------------------------------------------
Release        Total   Sub-M signoff       No sub-M signoff
------------------------------------------------------------------------------
v6.15          289      204/289 (70.6%)     85/289 (29.4%)
v6.16          198      163/198 (82.3%)     35/198 (17.7%)
v6.17          245      212/245 (86.5%)     33/245 (13.5%)
v6.18          206      176/206 (85.4%)     30/206 (14.6%)
v6.19          232      200/232 (86.2%)     32/232 (13.8%)
v7.0 (so far)  188      174/188 (92.6%)     14/188 ( 7.4%)
v6.15..v7.0    1358    1129/1358 (83.1%)   229/1358 (16.9%)
------------------------------------------------------------------------------

So 'the amount of such unreviewed material which enters mm-stable and mainline is very very low' is clearly untrue. In aggregate there were 229 patches merged (and by that I mean to Linus's tree), or 16.9%, without sub-M review or sub-M S-o-b. I seem to recall you claiming there were only one or two series/patches that landed like this for the past year or two, or something like this? None of the data reflects that. Clearly there is still work to be done and clearly there are still patches being sent that are not getting sub-M signoff.
It _is_ improving, but I fear that a lot of that is because of us sub-M's burning ourselves out. Let's look at that. Rather than limiting to mm commits, let's expand and just go with commits which you were the committer for from 6.15 onward to make life easier: Of those, there were 3,339 commits, and 2,284 had at least one A-b or R-b (68.4% review rate). Looking at commits actually A-b/R-b from 6.15 on and taking those in 3 digits or more:

-----------------------------------------
Author              R-b/A-b
-----------------------------------------
David Hildenbrand   484/2284 (21.2%)
Lorenzo Stoakes     356/2284 (15.6%)
Vlastimil Babka     276/2284 (12.1%)
Zi Yan              213/2284 ( 9.3%)
Mike Rapoport       193/2284 ( 8.5%)
SJ Park             174/2284 ( 7.6%)
Liam Howlett        128/2284 ( 5.6%)
Shakeel Butt        115/2284 ( 5.0%)
Oscar Salvador      111/2284 ( 4.9%)
-----------------------------------------

(Keep in mind I reduced my review sharply for a couple of months during this period due to burnout/objecting to mm review policy.) Do you think that maybe some of the people listed here should be consulted about these kinds of decisions at all? Do you notice that the people listed above (apart from Zi, who is exceptional overall anyway :) are sub-M's? The data overwhelmingly backs the fact that the sub-M/R changes have radically improved review in mm. This is something you have pushed back on, so I gently suggest that you should be a little more accepting of the facts the data lays bare here, please. > > There's indeed sharp reduction in amount of unreviewed material that gets > merged since v6.15, i.e. after the last LSF/MM when we updated the process > and nominated people as sub-maintainers and reviewers for different parts > of MM. This very much confirms that splitting up the MM entry and letting > people to step up as sub-maintaners pays off. Yes, that's obviously evident in all the data - I felt it had a huge impact and it's great to see the data demonstrate that!
Andrew - hopefully that helps give some basis for the role of sub-maintainers and reviewers in mm, I know you have expressed in the past (on more than one occasion) that you feel these roles are meaningless as you are able to subjectively interpret reviews - the data clearly shows otherwise. As a man of data, I ask you to take this into account please. And as you are showing you are more than happy to wait for review when AI does it, I genuinely do not understand why you would not accept this sub-M signoff rule at this stage. > > But we are still at double digits for percentage of commits without > Reviewed-by tags despite the effort people (especially David and Lorenzo) > are putting into review. I wouldn't say that even 9% is "very very low". Yes, far from it. > > > > Like, sure, sashiko can be useful, and is better than nothing. But unless > > > sashiko is better than the maintainers, it should be kept as optional. > > > > Rule #1 is, surely, "don't add bugs". This thing finds bugs. If its > > hit rate is 50% then that's plenty high enough to justify people > > spending time to go through and check its output. > > > > > Seriously, I can't wrap my head around the difference in treatment in > > > "human maintainers, experts in the code, aren't required to review a patch" > > > > Speaking of insulting. Honestly I think unilaterally instituting radical changes to review in MM without even bothering to consult those who do the actual review-work, and responding to push back either by ignoring or dismissal isn't hugely respectful. I also feel you are not being quite fair to Pedro here, especially when the data bears out his claims. (I refer you back to the above data.) > > > > > vs "make the fscking AI happy or it's not going anywhere". It's almost > > > insulting. > > > > Look, I know people are busy. If checking these reports slows us down > > and we end up merging less code and less buggy code then that's a good > > tradeoff. 
I mean you're literally ignoring the people who are doing all the review work here and then saying you're fine with adding more work for them (it's clear reviewers will have to account for Sashiko feedback in a regime where that's a hard requirement for merge), as well as to submitters too obviously. So I honestly don't think you do know that, since you are ignoring push-back from the people who are doing the work who are demonstrably VERY busy. > > If you think this is a good trade-off, then slowing down to wait for human > review so we merge up less buggy or less maintainable code is a good > trade-off too. > > While LLMs can detect potential bugs, they are not capable to identify > potential maintainability issues. Yes precisely. > > > Also, gimme a break. Like everyone else I'm still trying to wrap my > > head how best to incorporate this new tool into our development > > processes. > > It would be nice if we had a more formal description of our development > process in Documentation/process/maintainer-mm.rst and then we can add a > few sentences about how to incorporate this tool into the process when we > figure this out. I mean we've been waiting for this for a while :) I actually think at this stage it'd be better for those actually-doing-the-work of review to be writing these documents. But then they won't match what's actually happening, of course. > > Right now our process is a tribal knowledge, having "Rule #1" and a few > others written down would help everyone who participates in MM development. Rule #1 presumably 'don't introduce bugs' has so many caveats in it it's almost meaningless. For instance, as a silly example but one that makes the point - if reviewers were required to do two rounds of review, the second with much more scrutiny after having tagged the first - this would ABSOLUTELY find more bugs. But it'd double the time or more taken to do review. 
It's like saying 'reduce speed limits to save lives' - invariably you will if you do, but there are other considerations. A 5mph limit nationally might have other knock-on effects :) I'd say this requires _discussion_ with those _actually doing the work_ that keeps mm moving and stable, i.e. review. Plus review comprises more than finding bugs - in fact that's almost secondary to ensuring changes are _architecturally_ valid and that we're not causing user interface issues, style and code quality problems, etc. All things that AI frankly sucks at (at least for now). This new approach, taken out of the blue and without community discussion, also FLATLY contradicts mm process thus far - Andrew has repeatedly argued that 'perfectly good series' get 'held up' by review, and he really wants to avoid that. And thus has rejected the reasonable requests, whose requirement is now borne out by statistical evidence, for sub-M signoff. He's even intimated in the past that stable patches don't require proper review. Now AI is being instituted as a trusted gatekeeper and is immediately given full veto power. I don't think documenting this kind of decision making is helpful, but absolutely process docs are needed, were promised, and have not emerged. > > -- > Sincerely yours, > Mike. Hopefully the data helps paint the picture here. Thanks, Lorenzo
On Mon, Mar 23, 2026 at 02:36:04PM -0700, Andrew Morton wrote: > On Mon, 23 Mar 2026 12:34:31 +0000 Pedro Falcato <pfalcato@suse.de> wrote: > > > On Mon, Mar 23, 2026 at 11:31:29AM +0000, Lorenzo Stoakes (Oracle) wrote: > > > On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote: > > > > On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > > > > > > > > A lot of patchsets are "failed to apply". What is Sashiko trying to > > > > > apply MM patches to? It would take some smarts to apply the v2 > > > > > patchset when v1 is presently in mm.git? > > > > > > > > ? > > > > > > > > The way things are going at present, I'm just not going to apply a > > > > > > 50% noise vs. signal?... maybe wait until we're in the 9x'%s? > > > > > > > series which Sashiko "failed to apply". And that's cool, I'll just > > > > wait for a version which Sashiko was able to apply. And then not > > > > apply unless all Sashiko questions are resolved or convincingly refuted. > > > > > > Andrew, for crying out loud. Please don't do this. > > > > > > 2 of the 3 series I respan on Friday, working a 13 hour day to do so, don't > > > apply to Sashiko, but do apply to the mm tree. > > > > > > I haven't the _faintest clue_ how we are supposed to factor a 3rd party > > > experimental website applying or not applying series into our work?? > > > > > > And 'not apply unless all Sashiko questions are resolved or convincingly > > > refuted.' is seriously concerning. > > > > FWIW I wholeheartedly agree. I don't understand how we don't require proper > > M: or R: reviews on patches before merging > > I wish people would stop making this claim, without substantiation. > I've looked (deeply) at the data, which is equally available to us all. > Has anyone else? > > After weeding out a few special cases (especially DAMON) (this time > also maple_tree), the amount of such unreviewed material which enters > mm-stable and mainline is very very low. That is not what I said. 
I said "we don't require proper M: or R: reviews on patches before merging". Which as far as I know is still true when it comes to the process. If I have this wrong, then I'm not the only one. The fact that the end result is still high quality is a result of your work (diligently tracking down review states; yes, i've seen your quilt series file and its annotations) and every single one involved in the review process. This is not however codified into the process. (note: the fact that DAMON and maple tree both lack reviews from !authors just shows there is a very low bus factor at stake. we should fix this...) > > > Like, sure, sashiko can be useful, and is better than nothing. But unless > > sashiko is better than the maintainers, it should be kept as optional. > > Rule #1 is, surely, "don't add bugs". This thing finds bugs. If its > hit rate is 50% then that's plenty high enough to justify people > spending time to go through and check its output. I agree. But I don't think it's flawless enough to become mandatory. > > > Seriously, I can't wrap my head around the difference in treatment in > > "human maintainers, experts in the code, aren't required to review a patch" > > Speaking of insulting. Then I sincerely apologize. I see how I was brash. I did not mean to insult. > > > vs "make the fscking AI happy or it's not going anywhere". It's almost > > insulting. > > Look, I know people are busy. If checking these reports slows us down > and we end up merging less code and less buggy code then that's a good > tradeoff. Sure. But I'm thinking about the human factor - I simply don't think either contributors or maintainers will be particularly less stressed with the introduction of obligatory AI reviews. Maintainers are still hardpressed to review (as is their function), and contributors need to go through the tool's output and figure out what's relevant (and _true_) or what's not. 
IF we were able to codify the MM process like in (https://docs.kernel.org/process/maintainer-netdev.html), with things like:

- NO patch is getting in without being 1) written by a maintainer or 2) getting Rb's and Acks from M's and R's
- Ideally both, but maple and DAMON need special casing for now, I guess.
- NO -next content is being accepted during the merge window. straight to /dev/null.
- review state for each patch is <here>

it would already be a huge, palpable win for everyone involved. Some of these have been asked for and discussed by people that are much more load-bearing in MM than I am, for longer than I've been around. And would make more of a difference than making sashiko (which is not reliable, experimental software, etc) load-bearing. > > Also, gimme a break. Like everyone else I'm still trying to wrap my > head how best to incorporate this new tool into our development > processes. I understand. Ideally, sashiko would be a tool that maintainers and reviewers (and submitters) could use to help find problems. I don't think having you check every AI review scales. But I also don't think we should be treating LLM output as if it were a normal review from an expert. -- Pedro
On Mon, 23 Mar 2026 23:27:35 +0000 Pedro Falcato <pfalcato@suse.de> wrote:

> > also maple_tree), the amount of such unreviewed material which enters
> > mm-stable and mainline is very very low.
>
> That is not what I said. I said "we don't require proper M: or R: reviews
> on patches before merging". Which as far as I know is still true when it
> comes to the process. If I have this wrong, then I'm not the only one.

People never define what they mean by "merged". I define it as "added
to mm-stable". Things that are in mm-unstable are unstable! They're
subject to alteration or removal.

I pipeline things, a lot. The main benefit of this is that material
sometimes gets *weeks* of additional testing which it would not
otherwise have received. Also there are integration benefits -
inter-tree as well as intra-tree.

If something is getting close to mm-stable and doesn't appear
sufficiently reviewed then I'll send out bleats, and if those don't
work, it gets deferred or dropped. And, btw, it's really bad to remove
material late in the cycle - that means moving an untested code
combination into mm-stable, which adds risk. For this reason I do ask
that M:aintainers and R:eviewers be attentive to material which is in
mm-unstable and tell me as early as possible if I should defer or drop
it.

It's a mistake that we've never defined the roles and responsibilities
of maintainers and reviewers. If we were to define their
responsibilities, I'd place "take care of what's in mm.git" high on the
list.

> The fact that the end result is still high quality is a result of your work
> (diligently tracking down review states; yes, I've seen your quilt series file
> and its annotations) and every single one involved in the review process.
> This is not however codified into the process.

Yeah. Mea culpa.

> (note: the fact that DAMON and maple tree both lack reviews from !authors
> just shows there is a very low bus factor at stake. we should fix this...)

Agree. It would be a long haul for someone to effectively pick up
something like mapletree.

> > > Like, sure, sashiko can be useful, and is better than nothing. But unless
> > > sashiko is better than the maintainers, it should be kept as optional.
> >
> > Rule #1 is, surely, "don't add bugs". This thing finds bugs. If its
> > hit rate is 50% then that's plenty high enough to justify people
> > spending time to go through and check its output.
>
> I agree. But I don't think it's flawless enough to become mandatory.

Well, I looked at some numbers. Data!

Searched for linux-mm emails which had from:akpm, message-body contains
"sashiko".

22 emails received replies from authors indicating that alterations
were needed.

2 emails received replies from authors indicating that no alterations
were needed.

1 email received a reply from author in which I wasn't able to decide
either way.

A few more replies said "no alteration, but we need to change other
code".

10-15ish have yet to receive replies.

That's a really high hit rate! How can we possibly not use this, if we
care about Rule #1?

> Sure. But I'm thinking about the human factor - I simply don't think either
> contributors or maintainers will be particularly less stressed with the
> introduction of obligatory AI reviews. Maintainers are still hard-pressed
> to review (as is their function), and contributors need to go through the
> tool's output and figure out what's relevant (and _true_) or what's not.

Yeah, it's a matter of figuring this out as we go along. It will be so
much better if/when people are able to use sashiko privately. But heck,
people forget to run checkpatch ;)

> IF we were able to codify the MM process like in (https://docs.kernel.org/process/maintainer-netdev.html),
> with things like:
> - NO patch is getting in without being 1) written by a maintainer or 2) getting Rb's and Acks from M's and R's

Sure. Where "in" means mm-stable.

> - Ideally both, but maple and DAMON need special casing for now, I guess.

We do get quite a lot of patches from sole maintainers.

> - NO -next content is being accepted during the merge window. straight to /dev/null.

For sure. Well. I usually park these things to take a look at after
we're all merged up, but it's usually all stale by then.

> - review state for each patch is <here>

I generate that now, with the occasional "mm.git review status" emails.
I could run it daily and add it to mm.git or something, but this
doesn't seem to have generated much interest.

> I understand. Ideally, sashiko would be a tool that maintainers and
> reviewers (and submitters) could use to help find problems. I don't think
> having you check every AI review scales. But I also don't think we should be
> treating LLM output as if it were a normal review from an expert.

Sure. But that hit rate is so high!
On Mon, Mar 23, 2026 at 05:05:37PM -0700, Andrew Morton wrote:

> Well, I looked at some numbers. Data!
>
> Searched for linux-mm emails which had from:akpm, message-body contains
> "sashiko".
>
> 22 emails received replies from authors indicating that alterations
> were needed.
>
> 2 emails received replies from authors indicating that no alterations
> were needed
>
> 1 email received a reply from author in which I wasn't able to decide
> either way.
>
> A few more replies said "no alteration, but we need to change other
> code".
>
> 10-15ish have yet to receive replies.
>
> That's a really high hit rate! How can we possibly not use this, if we
> care about Rule #1?

Really this data doesn't support that. If we're generous and say 10
with no replies, that's 22/35 or ~63% _where sashiko was correct in AT
LEAST ONE individual observation_. That is not indicative of a good
signal-to-noise ratio. Do you not think we can do better?

Roughly in my experience, around ~50% of sashiko INDIVIDUAL REPORTS
(i.e. individual comments made line-by-line) have validity.

Roman has said that the strategy he takes, partly for sensible token
usage, partly to avoid throwing out the baby with the bath water, at
this time leads to more noise. And as models improve this is likely to
improve too. This is no criticism of him, I am grateful for this
tooling. The issue is with mm process.

This adds quite a burden to reviewers: having to deal with _every
single thing_ reported. Which is what you unilaterally seemed to say
was now a requirement, to which I object.

There's further problems here:

1. What if a new engineer comes along and sashiko hallucinates a bunch
   of stuff and they respin + respin to match it, and now reviewers
   have to tell them to stop?

2. What if sashiko directly contradicts a human reviewer/maintainer?

3. Are you going to quietly just not take series, and people find out
   in the merge window/when you gather up mm-stable in one of the many
   batches, because they didn't respond to a hallucination?

4. AI often generates new 'thoughts' just from being run for a 2nd
   time, so do we hold series in perpetual flux trying to figure out if
   the latest set are valid?

5. Often the reported 'issues' are so complicated it requires human
   expertise to figure out if they're relevant, thereby increasing the
   already over-strained maintenance workload.

And again, I come back to you requiring sashiko to be able to apply a
series, based on unknown criteria, probably not correctly applying
fix-patches etc. - there is no sensible way for a series author to
fulfill that requirement.

Really we need input from _those doing the actual review_ on how mm
review works. Let me make workable suggestions:

1. Defer this to sub-maintainers. We have the expertise and experience
   to make judgment calls on this.

2. Don't make this silly series-applies demand. It's impossible to
   adhere to.

3. Don't require that every sashiko point be responded to.

4. Sub-maintainers use it as a tool - and only really consider
   critical/high bugs as being potentially important, and only if they
   can determine that the points made are valid AND, importantly, only
   if doing so doesn't take all that much time.

Personally I am _already_ using sashiko as part of review for people to
some degree. I see that as being the more useful means of using it.

Treat it as the experiment it is, rather than reflexively deciding to
demand all points get responded to.

> > Sure. But I'm thinking about the human factor - I simply don't think either
> > contributors or maintainers will be particularly less stressed with the
> > introduction of obligatory AI reviews. Maintainers are still hard-pressed
> > to review (as is their function), and contributors need to go through the
> > tool's output and figure out what's relevant (and _true_) or what's not.
>
> Yeah, it's a matter of figuring this out as we go along. It will be so
> much better if/when people are able to use sashiko privately. But
> heck, people forget to run checkpatch ;)

But we're not 'figuring it out', you're not discussing anything with
sub-maintainers or the community, you're unilaterally telling people
they HAVE to respond to everything sashiko says or you WON'T TAKE the
patch.

And also (you ignored my reply on this and replied to Pedro instead).
So where's the figuring out, exactly?

> > IF we were able to codify the MM process like in (https://docs.kernel.org/process/maintainer-netdev.html),
> > with things like:
> > - NO patch is getting in without being 1) written by a maintainer or 2) getting Rb's and Acks from M's and R's
>
> Sure. Where "in" means mm-stable.

I'm not sure anybody said otherwise??

> > - Ideally both, but maple and DAMON need special casing for now, I guess.
>
> We do get quite a lot of patches from sole maintainers.
>
> > - NO -next content is being accepted during the merge window. straight to /dev/null.
>
> For sure. Well. I usually park these things to take a look at after we're all
> merged up, but it's usually all stale by then.
>
> > - review state for each patch is <here>
>
> I generate that now, with the occasional "mm.git review status" emails.
> I could run it daily and add it to mm.git or something, but this
> doesn't seem to have generated much interest.
>
> > I understand. Ideally, sashiko would be a tool that maintainers and
> > reviewers (and submitters) could use to help find problems. I don't think
> > having you check every AI review scales. But I also don't think we should be
> > treating LLM output as if it were a normal review from an expert.
>
> Sure. But that hit rate is so high!

Addressed above. Disagree. Please listen to the people doing the actual
review in mm.

Thanks, Lorenzo
Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> A lot of patchsets are "failed to apply". What is Sashiko trying to
>> apply MM patches to? It would take some smarts to apply the v2
>> patchset when v1 is presently in mm.git?
>
> ?

It's displayed in the Baseline section for every patchset.

For mm patchsets, if the base commit is not specified it's mm-new, then
mm-unstable, then mm-stable, then linux-next/HEAD and then linus/HEAD
(and now I think that it should not only show HEAD, but the actual
sha).

I don't yet have support for the "previous version is applied, let's
revert it and try the new one" case. Something to add later.

> The way things are going at present, I'm just not going to apply a
> series which Sashiko "failed to apply". And that's cool, I'll just
> wait for a version which Sashiko was able to apply. And then not
> apply unless all Sashiko questions are resolved or convincingly refuted.
>
> Question please: if Sashiko finds an "issue" in v3 and then v4 comes
> out with changelog words which justifies the questionable alteration, can
> Sashiko parse that changelog justification and think "OK, never mind"?

Yes, I'm planning to add it. Sashiko will have access to previous
versions of the patchset and the whole discussion thread and take it
into account.

Thanks!
On Sat, Mar 21, 2026 at 07:12:13PM -0700, Roman Gushchin wrote:

> Andrew Morton <akpm@linux-foundation.org> writes:
>
> > On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> >> A lot of patchsets are "failed to apply". What is Sashiko trying to
> >> apply MM patches to? It would take some smarts to apply the v2
> >> patchset when v1 is presently in mm.git?
> >
> > ?
>
> It's displayed in the Baseline section for every patchset.
>
> For mm patchsets if the base commit is not specified it's mm-new then
> mm-unstable then mm-stable then linux-next/HEAD and then linus/HEAD
> (and now I think that it should not only show HEAD, but the actual sha).
>
> I don't yet have support for the "previous version is applied, let's
> revert it and try the new one" case. Something to add later.
>
> > The way things are going at present, I'm just not going to apply a
> > series which Sashiko "failed to apply". And that's cool, I'll just
> > wait for a version which Sashiko was able to apply. And then not
> > apply unless all Sashiko questions are resolved or convincingly refuted.
> >
> > Question please: if Sashiko finds an "issue" in v3 and then v4 comes
> > out with changelog words which justifies the questionable alteration, can
> > Sashiko parse that changelog justification and think "OK, never mind"?
>
> Yes, I'm planning to add it. Sashiko will have access to previous
> versions of the patchset and the whole discussion thread and take it
> into account.

Hmm, this question presupposes that we should have to respond somehow
to Sashiko feedback, but with ~50% signal vs. noise (my experience so
far) that's just not sensible, and a painful addition to an already
overstrained workload.

For instance
https://sashiko.dev/#/patchset/cover.1774029655.git.ljs%40kernel.org is
full of pretty useless stuff, including a silly hallucination
(VM_WARN_ON_ONCE() cannot be used as a conditional; it's defined as
(void)WARN_ON_ONCE() when CONFIG_DEBUG_VM is enabled).

I don't want to have to explain why exactly I'm ignoring certain things
each time.

Until the noise vs. signal is better, I really don't want Sashiko to
block anything or necessitate responses, which is why I'm very
reluctant to see it send emails other than privately directly to the
author perhaps.

> Thanks!

Thanks, Lorenzo
On 3/23/26 12:19, Lorenzo Stoakes (Oracle) wrote:

> On Sat, Mar 21, 2026 at 07:12:13PM -0700, Roman Gushchin wrote:
>> Andrew Morton <akpm@linux-foundation.org> writes:
>>
>>> ?
>>
>> It's displayed in the Baseline section for every patchset.
>>
>> For mm patchsets if the base commit is not specified it's mm-new then
>> mm-unstable then mm-stable then linux-next/HEAD and then linus/HEAD
>> (and now I think that it should not only show HEAD, but the actual sha).
>>
>> I don't yet have support for the "previous version is applied, let's
>> revert it and try the new one" case. Something to add later.
>>
>>> The way things are going at present, I'm just not going to apply a
>>> series which Sashiko "failed to apply". And that's cool, I'll just
>>> wait for a version which Sashiko was able to apply. And then not
>>> apply unless all Sashiko questions are resolved or convincingly refuted.
>>>
>>> Question please: if Sashiko finds an "issue" in v3 and then v4 comes
>>> out with changelog words which justifies the questionable alteration, can
>>> Sashiko parse that changelog justification and think "OK, never mind"?
>>
>> Yes, I'm planning to add it. Sashiko will have access to previous
>> versions of the patchset and the whole discussion thread and take it
>> into account.
>
> Hmm, this question presupposes that we should have to respond somehow to
> Sashiko feedback, but with ~50% signal vs. noise (my experience so far)
> that's just not sensible, and a painful addition to already overstrained
> workload.
>
> For instance
> https://sashiko.dev/#/patchset/cover.1774029655.git.ljs%40kernel.org is
> full of pretty useless stuff, including a silly hallucination
> (VM_WARN_ON_ONCE() cannot be used as a conditional; it's defined as
> (void)WARN_ON_ONCE() when CONFIG_DEBUG_VM is enabled).
>
> I don't want to have to explain why exactly I'm ignoring certain things
> each time.
>
> Until the noise vs. signal is better, I really don't want Sashiko to block
> anything or necessitate responses, which is why I'm very reluctant to see it
> send emails other than privately directly to the author perhaps.

100% agreed. It's a pain to dig through the AI output to find something
useful. Fortunately there is some useful stuff in there every now and
then.

I've seen the AI either raise wrong stuff or just bring up stuff that
is completely unrelated to the actual code changes, which is quite the
time sink and TBH annoying. Particularly annoying if review on a new
revision suddenly includes new slop.

I wish we could tune Sashiko to focus on serious regressions, and only
report them if it is extremely sure that there is something real in
there.

-- 
Cheers,
David
On Thu, Mar 19, 2026 at 08:09:17PM -0700, Andrew Morton wrote:

> On Thu, 19 Mar 2026 13:00:06 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
>
> > The zap_huge_pmd() function is overly complicated, clean it up and also add
> > an assert in the case that we encounter a buggy PMD entry that doesn't
> > match expectations.
> >
> > This is motivated by a bug discovered [0] where the PMD entry was none of:
> >
> > * A non-DAX, PFN or mixed map.
> > * The huge zero folio
> > * A present PMD entry
> > * A softleaf entry
> >
> > In zap_huge_pmd(), but due to the bug we managed to reach this code.
> >
> > It is useful to explicitly call this out rather than have an arbitrary NULL
> > pointer dereference happen, which also improves understanding of what's
> > going on.
> >
> > [0]: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/
>
> AI review has questions, which I assume you've seen
> https://sashiko.dev/#/patchset/cover.1773924928.git.ljs%40kernel.org

Nope, but I'll have a look through and see what's valid.

> This isn't going well from a workflow POV. I merge stuff (this was v2)
> then half a day later a bunch of potential issues are identified.
>
> If these reviews are useful (they seem to be, enough) then I guess I'll
> need to further increase the lag between seeing-it and merging-it. But
> if there's a 2-day lag before I get onto a series and I'm the first to
> look at Sashiko then that won't help.
>
> So it needs to be something like
>
> - series is posted
> - 24 hours pass
> - submitter takes a look at the AI review, maybe prepares a new
>   series.
> - 24 hours pass
> - rinse, repeat
> - it gets merged, hopefully with some Reviewed-bys.
>
> Not unreasonable, but it requires that submitter be made aware of
> Sashiko's comments. At present that's via me being tiresome.
>
> Anyway, early days. I'm thinking that an emailed reply-to-all from
> Sashiko will help. Much hinges on how useful submitters find these
> questions to be - something which I'm paying close attention to...

Please not yet, it produces a lot of noise.

I've responded at length on the thread on this [0], and while I
appreciate the tooling, it's not ready to be treated as giving entirely
valid feedback yet :)

I think David's on the same page as me on this.

Cheers, Lorenzo

[0]: https://lore.kernel.org/all/39e6b4d2-8a30-4eaa-908d-5d11b746f8d5@lucifer.local/