[PATCH 3/3] mm: optimize free_area_empty() check using per-migratetype counts

Hongru Zhang posted 3 patches 3 days, 18 hours ago
Posted by Hongru Zhang 3 days, 18 hours ago
From: Hongru Zhang <zhanghongru@xiaomi.com>

Checking the per-migratetype count instead of calling list_empty()
saves a few CPU instructions.

Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
---
 mm/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/internal.h b/mm/internal.h
index 1561fc2ff5b8..7759f8fdf445 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
 
 static inline bool free_area_empty(struct free_area *area, int migratetype)
 {
-	return list_empty(&area->free_list[migratetype]);
+	return !READ_ONCE(area->mt_nr_free[migratetype]);
 }
 
 /* mm/util.c */
-- 
2.43.0

Re: [PATCH 3/3] mm: optimize free_area_empty() check using per-migratetype counts
Posted by Barry Song 2 days, 21 hours ago
On Fri, Nov 28, 2025 at 11:13 AM Hongru Zhang <zhanghongru06@gmail.com> wrote:
>
> From: Hongru Zhang <zhanghongru@xiaomi.com>
>
> Checking the per-migratetype count instead of calling list_empty()
> saves a few CPU instructions.
>
> Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
> ---
>  mm/internal.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..7759f8fdf445 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
>
>  static inline bool free_area_empty(struct free_area *area, int migratetype)
>  {
> -       return list_empty(&area->free_list[migratetype]);
> +       return !READ_ONCE(area->mt_nr_free[migratetype]);

I'm not quite sure about this. Since the counter is written and read more
frequently, cache coherence traffic may actually be higher than for the list
head.

I'd prefer to drop this unless there is real data showing it performs better.
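To make the comparison concrete, here is a minimal userspace sketch of the
two emptiness checks under discussion. It is not kernel code: list_head is
re-declared locally, MIGRATE_TYPES is an assumed value, the volatile load
only approximates READ_ONCE(), and add_block()/del_block() are hypothetical
helpers. It shows that the counter check is only valid if every list update
also writes the counter, which is the extra write traffic questioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the kernel's value is config dependent. */
#define MIGRATE_TYPES 6

/* Local stand-in for the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Layout assumed from the field names used in the diff. */
struct free_area {
	struct list_head free_list[MIGRATE_TYPES];
	unsigned long mt_nr_free[MIGRATE_TYPES];
};

static void free_area_init(struct free_area *area)
{
	for (int mt = 0; mt < MIGRATE_TYPES; mt++) {
		area->free_list[mt].next = &area->free_list[mt];
		area->free_list[mt].prev = &area->free_list[mt];
		area->mt_nr_free[mt] = 0;
	}
}

/* Hypothetical helper: link a block in and bump the matching counter. */
static void add_block(struct free_area *area, struct list_head *blk, int mt)
{
	struct list_head *head = &area->free_list[mt];

	blk->next = head->next;
	blk->prev = head;
	head->next->prev = blk;
	head->next = blk;
	area->mt_nr_free[mt]++;		/* second store on every add */
}

/* Hypothetical helper: unlink a block and drop the matching counter. */
static void del_block(struct free_area *area, struct list_head *blk, int mt)
{
	blk->prev->next = blk->next;
	blk->next->prev = blk->prev;
	area->mt_nr_free[mt]--;		/* second store on every delete */
}

/* Old check: the list is empty when the head points back at itself. */
static bool empty_by_list(struct free_area *area, int mt)
{
	return area->free_list[mt].next == &area->free_list[mt];
}

/* New check: a single volatile load, approximating READ_ONCE(). */
static bool empty_by_count(struct free_area *area, int mt)
{
	return !*(const volatile unsigned long *)&area->mt_nr_free[mt];
}
```

Both checks agree as long as the counter is updated with the list. The
sketch says nothing about which one is cheaper on real hardware; that
still needs measurement.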

Thanks
Barry

Re: [PATCH 3/3] mm: optimize free_area_empty() check using per-migratetype counts
Posted by Barry Song 2 days, 12 hours ago
On Sat, Nov 29, 2025 at 8:04 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Nov 28, 2025 at 11:13 AM Hongru Zhang <zhanghongru06@gmail.com> wrote:
> >
> > From: Hongru Zhang <zhanghongru@xiaomi.com>
> >
> > Checking the per-migratetype count instead of calling list_empty()
> > saves a few CPU instructions.
> >
> > Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
> > ---
> >  mm/internal.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 1561fc2ff5b8..7759f8fdf445 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
> >
> >  static inline bool free_area_empty(struct free_area *area, int migratetype)
> >  {
> > -       return list_empty(&area->free_list[migratetype]);
> > +       return !READ_ONCE(area->mt_nr_free[migratetype]);
>
> I'm not quite sure about this. Since the counter is written and read more
> frequently, cache coherence traffic may actually be higher than for the list
> head.
>
> I'd prefer to drop this unless there is real data showing it performs better.

If the goal is to optimize the free_area list checks and list_add,
a reasonable approach is to reorganize the data structure to reduce
false sharing between entries for different migratetypes and orders.

struct mt_free_area {
        struct list_head        free_list;
        unsigned long           nr_free;
} ____cacheline_aligned;

struct free_area {
        struct mt_free_area     mt_free_area[MIGRATE_TYPES];
};

However, without supporting data, it’s unclear if the space increase
is justified :-)
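As a rough way to quantify that space increase, the userspace sketch below
compares a packed layout with the cacheline-aligned one. Everything here is
an assumption for illustration: a 64-byte cache line, MIGRATE_TYPES of 6,
NR_PAGE_ORDERS of 11, a local list_head stand-in, and C11 alignas in place
of ____cacheline_aligned. The names free_area_packed, free_area_padded and
per_zone_bytes() are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* All three values are assumptions; the kernel's are config dependent. */
#define CACHELINE	64
#define MIGRATE_TYPES	6
#define NR_PAGE_ORDERS	11

/* Local stand-in for the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Packed layout, assumed from the field names used in the diff. */
struct free_area_packed {
	struct list_head free_list[MIGRATE_TYPES];
	unsigned long mt_nr_free[MIGRATE_TYPES];
};

/* One list head plus its counter, padded out to a full cache line. */
struct mt_free_area {
	alignas(CACHELINE) struct list_head free_list;
	unsigned long nr_free;
};

/* Proposed layout: one cache line per migratetype. */
struct free_area_padded {
	struct mt_free_area mt_free_area[MIGRATE_TYPES];
};

/* Bytes used by the free areas of one zone, one free_area per order. */
static size_t per_zone_bytes(size_t area_size)
{
	return area_size * NR_PAGE_ORDERS;
}
```

Under these assumptions each mt_free_area occupies exactly one cache line,
so a padded free_area takes MIGRATE_TYPES * 64 bytes. Comparing
per_zone_bytes(sizeof(struct free_area_packed)) with
per_zone_bytes(sizeof(struct free_area_padded)) gives the per-zone cost;
whether that cost buys a measurable win is the open question.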

Thanks
Barry