The common order-0 case is important enough to want its own branch, and
avoids the hairy, large loop logic that the CPU does not seem to handle
particularly well.
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
---
mm/mprotect.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index aa845f5bf14d..23a741f9a4c8 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -170,6 +170,13 @@ static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
 	int sub_batch_idx = 0;
 	int len;
 
+	/* Optimize for the common order-0 case. */
+	if (likely(nr_ptes == 1)) {
+		prot_commit_flush_ptes(vma, addr, ptep, oldpte, ptent, 1,
+				       0, PageAnonExclusive(first_page), tlb);
+		return;
+	}
+
 	while (nr_ptes) {
 		expected_anon_exclusive = PageAnonExclusive(first_page + sub_batch_idx);
 		len = page_anon_exclusive_sub_batch(sub_batch_idx, nr_ptes,
--
2.53.0
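
The shape of the change above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct item`, `commit_run()`, `run_length()` and `commit_batch()` are hypothetical stand-ins for the page flag check, prot_commit_flush_ptes(), page_anon_exclusive_sub_batch() and commit_anon_folio_batch(), and the counter exists only so the behaviour can be observed. The patch additionally wraps the check in likely(), which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-item flag, standing in for PageAnonExclusive(). */
struct item {
	bool exclusive;
};

/* Counts calls to commit_run(), purely so the sketch is observable. */
static int commit_calls;

/* Hypothetical stand-in for prot_commit_flush_ptes(): commits one run. */
static void commit_run(const struct item *items, size_t start, size_t len,
		       bool exclusive)
{
	(void)items;
	(void)start;
	(void)len;
	(void)exclusive;
	commit_calls++;
}

/* Length of the run starting at idx whose items share items[idx]'s flag. */
static size_t run_length(const struct item *items, size_t idx, size_t n)
{
	size_t i = idx + 1;

	while (i < n && items[i].exclusive == items[idx].exclusive)
		i++;
	return i - idx;
}

/* Commit a batch of n items, splitting it into runs with a uniform flag. */
static void commit_batch(const struct item *items, size_t n)
{
	size_t idx = 0;

	assert(n > 0);

	/* Fast path: a single item needs no sub-batch bookkeeping. */
	if (n == 1) {
		commit_run(items, 0, 1, items[0].exclusive);
		return;
	}

	/* Generic path: walk the batch one uniform run at a time. */
	while (idx < n) {
		size_t len = run_length(items, idx, n);

		commit_run(items, idx, len, items[idx].exclusive);
		idx += len;
	}
}
```

With a one-element batch the function issues a single commit and never enters the loop. A batch whose flags read true, true, false, true is committed as three runs.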
On 3/19/26 19:31, Pedro Falcato wrote:
> The common order-0 case is important enough to want its own branch, and
> avoids the hairy, large loop logic that the CPU does not seem to handle
> particularly well.
>
> Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> ---
> mm/mprotect.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index aa845f5bf14d..23a741f9a4c8 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -170,6 +170,13 @@ static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
>  	int sub_batch_idx = 0;
>  	int len;
>
> +	/* Optimize for the common order-0 case. */
> +	if (likely(nr_ptes == 1)) {
> +		prot_commit_flush_ptes(vma, addr, ptep, oldpte, ptent, 1,
> +				       0, PageAnonExclusive(first_page), tlb);
> +		return;
> +	}
> +
>  	while (nr_ptes) {
>  		expected_anon_exclusive = PageAnonExclusive(first_page + sub_batch_idx);
>  		len = page_anon_exclusive_sub_batch(sub_batch_idx, nr_ptes,
That should likely be squashed into patch #1, because only then the
inline makes more sense.
--
Cheers,
David
On Thu, Mar 19, 2026 at 10:43:35PM +0100, David Hildenbrand (Arm) wrote:
> On 3/19/26 19:31, Pedro Falcato wrote:
> > The common order-0 case is important enough to want its own branch, and
> > avoids the hairy, large loop logic that the CPU does not seem to handle
> > particularly well.
> >
> > Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> > ---
> > mm/mprotect.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/mprotect.c b/mm/mprotect.c
> > index aa845f5bf14d..23a741f9a4c8 100644
> > --- a/mm/mprotect.c
> > +++ b/mm/mprotect.c
> > @@ -170,6 +170,13 @@ static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
> >  	int sub_batch_idx = 0;
> >  	int len;
> >
> > +	/* Optimize for the common order-0 case. */
> > +	if (likely(nr_ptes == 1)) {
> > +		prot_commit_flush_ptes(vma, addr, ptep, oldpte, ptent, 1,
> > +				       0, PageAnonExclusive(first_page), tlb);
> > +		return;
> > +	}
> > +
> >  	while (nr_ptes) {
> >  		expected_anon_exclusive = PageAnonExclusive(first_page + sub_batch_idx);
> >  		len = page_anon_exclusive_sub_batch(sub_batch_idx, nr_ptes,
>
> That should likely be squashed into patch #1, because only then the
> inline makes more sense.
Will do, thanks.
--
Pedro
On Thu, Mar 19, 2026 at 06:31:08PM +0000, Pedro Falcato wrote:
> The common order-0 case is important enough to want its own branch, and
> avoids the hairy, large loop logic that the CPU does not seem to handle
> particularly well.
>
I think it'd be good to get a sense per-commit what the perf impact is.
> Signed-off-by: Pedro Falcato <pfalcato@suse.de>
I am iffy on the likely() in the same way I am always iffy on non-profile backed
likely()/unlikely() but this change seems sensible enough so:
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
> mm/mprotect.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index aa845f5bf14d..23a741f9a4c8 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -170,6 +170,13 @@ static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
>  	int sub_batch_idx = 0;
>  	int len;
>
> +	/* Optimize for the common order-0 case. */
> +	if (likely(nr_ptes == 1)) {
I mean we're likely()'ing here without data to back it, but I guess I believe
this one :)
> +		prot_commit_flush_ptes(vma, addr, ptep, oldpte, ptent, 1,
> +				       0, PageAnonExclusive(first_page), tlb);
> +		return;
> +	}
> +
>  	while (nr_ptes) {
>  		expected_anon_exclusive = PageAnonExclusive(first_page + sub_batch_idx);
>  		len = page_anon_exclusive_sub_batch(sub_batch_idx, nr_ptes,
> --
> 2.53.0
>
Cheers, Lorenzo
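
For context on the likely() concern raised above: the annotation is a hint to the compiler about which way a branch usually goes and does not change what the code computes. The sketch below assumes GCC or Clang, since it relies on `__builtin_expect`. The two macros follow the shape of the kernel's definitions when branch profiling is disabled, as far as I know, and `pick_path()` is a hypothetical helper invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hints built on the GCC/Clang builtin; !!(x) normalizes to 0 or 1. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: returns 1 when the single-entry fast path would be
 * taken and 2 when the generic loop would run.
 */
static int pick_path(size_t nr)
{
	assert(nr > 0);

	/* The hint only affects code layout, never the result. */
	if (likely(nr == 1))
		return 1;
	return 2;
}
```

If the hint is wrong for a given workload, the result is still correct. The possible cost is a less favourable code layout for the path that actually runs most often, which is why hints not backed by profile data tend to draw scrutiny.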
On Thu, Mar 19, 2026 at 07:17:31PM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 19, 2026 at 06:31:08PM +0000, Pedro Falcato wrote:
> > The common order-0 case is important enough to want its own branch, and
> > avoids the hairy, large loop logic that the CPU does not seem to handle
> > particularly well.
> >
>
> I think it'd be good to get a sense per-commit what the perf impact is.
I added some numbers on the cover letter - this patch + the __always_inline
one do the heavy lifting.
>
> > Signed-off-by: Pedro Falcato <pfalcato@suse.de>
>
> I am iffy on the likely() in the same way I am always iffy on non-profile backed
> likely()/unlikely() but this change seems sensible enough so:
>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Thanks for the review!
--
Pedro
On Fri, Mar 20, 2026 at 10:36:17AM +0000, Pedro Falcato wrote:
> On Thu, Mar 19, 2026 at 07:17:31PM +0000, Lorenzo Stoakes (Oracle) wrote:
> > On Thu, Mar 19, 2026 at 06:31:08PM +0000, Pedro Falcato wrote:
> > > The common order-0 case is important enough to want its own branch, and
> > > avoids the hairy, large loop logic that the CPU does not seem to handle
> > > particularly well.
> > >
> >
> > I think it'd be good to get a sense per-commit what the perf impact is.
>
> I added some numbers on the cover letter - this patch + the __always_inline
> one do the heavy lifting.
I mean again this makes me wonder if we shouldn't have some generalised batch
logic to handle order-0 cases.
>
> >
> > > Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> >
> > I am iffy on the likely() in the same way I am always iffy on non-profile backed
> > likely()/unlikely() but this change seems sensible enough so:
> >
> > Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> Thanks for the review!
>
> --
> Pedro