From: Hongru Zhang <zhanghongru@xiaomi.com>
Using per-migratetype counts instead of list_empty() helps reduce a
few CPU instructions.
Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
---
mm/internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/internal.h b/mm/internal.h
index 1561fc2ff5b8..7759f8fdf445 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
 
 static inline bool free_area_empty(struct free_area *area, int migratetype)
 {
-	return list_empty(&area->free_list[migratetype]);
+	return !READ_ONCE(area->mt_nr_free[migratetype]);
 }
 
 /* mm/util.c */
--
2.43.0
On Fri, Nov 28, 2025 at 11:13 AM Hongru Zhang <zhanghongru06@gmail.com> wrote:
>
> From: Hongru Zhang <zhanghongru@xiaomi.com>
>
> Using per-migratetype counts instead of list_empty() helps reduce a
> few CPU instructions.
>
> Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
> ---
> mm/internal.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..7759f8fdf445 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
>
> static inline bool free_area_empty(struct free_area *area, int migratetype)
> {
> - return list_empty(&area->free_list[migratetype]);
> + return !READ_ONCE(area->mt_nr_free[migratetype]);
I'm not quite sure about this. Since the counter is written and read more
frequently, cache coherence traffic may actually be higher than for the list
head.
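Roughly what I mean, assuming the earlier patches in the series keep
mt_nr_free[] up to date from the free-list helpers (a sketch only, not
the exact code from this series):

static inline void __add_to_free_list(struct page *page, struct zone *zone,
				      unsigned int order, int migratetype,
				      bool tail)
{
	struct free_area *area = &zone->free_area[order];

	if (tail)
		list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
	else
		list_add(&page->buddy_list, &area->free_list[migratetype]);
	area->nr_free++;
	/*
	 * Extra store on every add (the del helper would decrement the
	 * same counter), dirtying the line that free_area_empty() now reads.
	 */
	WRITE_ONCE(area->mt_nr_free[migratetype],
		   area->mt_nr_free[migratetype] + 1);
}

So free_area_empty() itself becomes a single load, but the cacheline
holding mt_nr_free[] is written on every add/remove for any migratetype
of that order.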
I'd prefer to drop this unless there is real data showing it performs better.
Thanks
Barry
On Sat, Nov 29, 2025 at 8:04 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Nov 28, 2025 at 11:13 AM Hongru Zhang <zhanghongru06@gmail.com> wrote:
> >
> > From: Hongru Zhang <zhanghongru@xiaomi.com>
> >
> > Using per-migratetype counts instead of list_empty() helps reduce a
> > few CPU instructions.
> >
> > Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
> > ---
> > mm/internal.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 1561fc2ff5b8..7759f8fdf445 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -954,7 +954,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
> >
> > static inline bool free_area_empty(struct free_area *area, int migratetype)
> > {
> > - return list_empty(&area->free_list[migratetype]);
> > + return !READ_ONCE(area->mt_nr_free[migratetype]);
>
> I'm not quite sure about this. Since the counter is written and read more
> frequently, cache coherence traffic may actually be higher than for the list
> head.
>
> I'd prefer to drop this unless there is real data showing it performs better.
If the goal is to optimize free_area list checks and list_add,
a reasonable approach is to organize the data structure
to reduce false sharing between different mt and order entries:

struct mt_free_area {
	struct list_head free_list;
	unsigned long nr_free;
} ____cacheline_aligned;

struct free_area {
	struct mt_free_area mt_free_area[MIGRATE_TYPES];
};
However, without supporting data, it’s unclear if the space increase
is justified :-)
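Back-of-the-envelope (assuming 64-byte cachelines, MIGRATE_TYPES == 6
and NR_PAGE_ORDERS == 11, so the numbers shift with config): each
struct mt_free_area pads from 24 to 64 bytes, so one free_area grows
from roughly 6 * 16 + 8 = 104 bytes to 6 * 64 = 384 bytes, i.e. around
3 KB extra per zone across all orders.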
Thanks
Barry