Intermittent workloads that run in bursts spaced more than 100ms apart
on each CPU exhibit poor cache locality and degraded performance compared
to purely per-cpu data indexing. Concurrency IDs end up allocated across
various CPUs and cores, so the data indexed by those IDs loses its cache
locality.
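
To make the locality problem concrete, here is a minimal user-space sketch.
It is not kernel code. The names (cid_sim, cid_get, cid_put,
ids_stable_across_bursts) and the 8-CPU size are hypothetical, and the
"retry the ID this CPU held last" policy is only an assumption used for
contrast, not a description of what the patch implements.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS	8	/* hypothetical machine size for this sketch */
#define NO_CID	(-1)

struct cid_sim {
	bool in_use[NR_CPUS];	/* IDs currently held by a running thread */
	int recent[NR_CPUS];	/* last ID held on each CPU (assumed hint) */
};

static void cid_sim_init(struct cid_sim *s)
{
	for (int i = 0; i < NR_CPUS; i++) {
		s->in_use[i] = false;
		s->recent[i] = NO_CID;
	}
}

/*
 * Allocate an ID for a thread starting to run on @cpu. With @prefer_recent
 * unset this is plain lowest-free-ID allocation; with it set, the ID last
 * held on @cpu is retried first (an assumed policy, for illustration only).
 */
static int cid_get(struct cid_sim *s, int cpu, bool prefer_recent)
{
	int cid;

	assert(cpu >= 0 && cpu < NR_CPUS);
	cid = s->recent[cpu];
	if (prefer_recent && cid != NO_CID && !s->in_use[cid]) {
		s->in_use[cid] = true;
		return cid;
	}
	for (cid = 0; cid < NR_CPUS; cid++) {
		if (!s->in_use[cid]) {
			s->in_use[cid] = true;
			s->recent[cpu] = cid;
			return cid;
		}
	}
	return NO_CID;	/* every ID is taken */
}

/* Release an ID once the burst is over and the CPU goes idle. */
static void cid_put(struct cid_sim *s, int cid)
{
	assert(cid >= 0 && cid < NR_CPUS);
	s->in_use[cid] = false;
}

/*
 * Two bursts on CPUs 2 and 5, separated by an idle gap in which both IDs
 * are released. The CPUs wake up in the opposite order for the second
 * burst. Returns true if each CPU sees the same ID in both bursts.
 */
static bool ids_stable_across_bursts(bool prefer_recent)
{
	struct cid_sim s;
	int a1, b1, a2, b2;

	cid_sim_init(&s);

	a1 = cid_get(&s, 2, prefer_recent);	/* burst 1: CPU 2 first */
	b1 = cid_get(&s, 5, prefer_recent);
	cid_put(&s, a1);
	cid_put(&s, b1);

	b2 = cid_get(&s, 5, prefer_recent);	/* burst 2: CPU 5 first */
	a2 = cid_get(&s, 2, prefer_recent);
	cid_put(&s, a2);
	cid_put(&s, b2);

	return a1 == a2 && b1 == b2;
}
```

With lowest-free-ID allocation, ids_stable_across_bursts(false) returns
false: the two CPUs swap IDs between bursts, so each one indexes data last
touched by the other. With the assumed per-CPU hint,
ids_stable_across_bursts(true) returns true.
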

This series addresses this shortcoming. I observed speedups up to 16.7x
compared to plain mm_cid indexing in benchmarks.

It applies on top of v6.10.6.

This deprecates the prior "sched: NUMA-aware per-memory-map concurrency
IDs" patch series with a simpler and more general approach.

Feedback is welcome!

Thanks,

Mathieu

Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Mathieu Desnoyers (1):
  sched: Improve cache locality of RSEQ concurrency IDs for intermittent
    workloads

 fs/exec.c                |  2 +-
 include/linux/mm_types.h | 72 +++++++++++++++++++++++++++++++++++-----
 kernel/fork.c            |  2 +-
 kernel/sched/core.c      |  7 ++--
 kernel/sched/sched.h     | 47 ++++++++++++++++++--------
 5 files changed, 103 insertions(+), 27 deletions(-)

--
2.39.2