On 8/19/24 08:24, Mathieu Desnoyers wrote:
> The issue addressed by this series is the non-locality of NUMA accesses
> to data structures indexed by concurrency IDs: for example, in a
> scenario where a process has two threads, and they periodically run one
> after the other on different NUMA nodes, each will be assigned mm_cid=0.
> As a consequence, they will end up accessing the same pages, and thus at
> least one of the threads will need to perform remote NUMA accesses,
> which is inefficient.
>
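
For readers following along, the scenario above can be sketched in user
space. This is a minimal illustration, not code from the series: the
names `percid_counters`, `percid_slot()`, `percid_inc()`, `MAX_CIDS` and
`CACHE_LINE` are hypothetical, and the caller is assumed to pass in the
mm_cid it read from its rseq area.

```c
#include <stdint.h>

#define MAX_CIDS   64	/* assumed upper bound on concurrency IDs */
#define CACHE_LINE 64	/* assumed cache line size in bytes */

/* One slot per concurrency ID, padded to avoid false sharing. */
struct percid_counter {
	uint64_t count;
	char pad[CACHE_LINE - sizeof(uint64_t)];
};

static struct percid_counter percid_counters[MAX_CIDS];

/* Return the slot selected by the caller's concurrency ID. */
static struct percid_counter *percid_slot(unsigned int mm_cid)
{
	return &percid_counters[mm_cid];
}

/*
 * Plain increment for illustration only; real code would do this
 * inside an rseq critical section.
 */
static void percid_inc(unsigned int mm_cid)
{
	percid_slot(mm_cid)->count++;
}
```

If two threads alternate on different NUMA nodes and both observe
mm_cid=0, both calls resolve to slot 0, so the page backing that slot
is remote for at least one of them.
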
> Solve this by making the rseq concurrency ID (mm_cid) NUMA-aware. On
> NUMA systems, when a NUMA-aware concurrency ID is observed by user-space
> to be associated with a NUMA node, guarantee that it never changes NUMA
> node unless either a kernel-level NUMA configuration change happens, or
> scheduler migrations end up migrating tasks across NUMA nodes.
>
> There is a tradeoff between NUMA locality and compactness of the
> concurrency ID allocation. Favor compactness over NUMA locality when
> the scheduler migrates tasks across NUMA nodes, as this does not cause
> frequent remote NUMA accesses. This is done by limiting the
> concurrency ID range to the minimum of the number of threads belonging
> to the process and the number of allowed CPUs.
>
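
The range limit described above amounts to a simple minimum. A sketch
under stated assumptions: `cid_range_limit()` and its parameter names
are hypothetical and do not appear in the patches, which may compute
this differently.

```c
/*
 * Hypothetical helper: upper bound on the concurrency ID range for a
 * process, given its thread count and the number of CPUs it is
 * allowed to run on.
 */
static unsigned int cid_range_limit(unsigned int nr_threads,
				    unsigned int nr_cpus_allowed)
{
	return nr_threads < nr_cpus_allowed ? nr_threads : nr_cpus_allowed;
}
```

For example, a two-thread process on a 64-CPU machine would be limited
to two concurrency IDs, which keeps data structures indexed by mm_cid
compact.
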
> This series applies on top of v6.10.3.
>
> Cc: Valentin Schneider <vschneid@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Yury Norov <yury.norov@gmail.com>
> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
>
> Mathieu Desnoyers (5):
>   lib: Implement find_{first,next,nth}_notandnot_bit,
>     find_first_andnot_bit
>   cpumask: Implement cpumask_{first,next}_{not,}andnot
>   sched: NUMA-aware per-memory-map concurrency IDs
>   selftests/rseq: x86: Implement rseq_load_u32_u32
>   selftests/rseq: Implement NUMA node id vs mm_cid invariant test
>
>  include/linux/cpumask.h                      |  60 ++++++++
>  include/linux/find.h                         | 122 ++++++++++++++-
>  include/linux/mm_types.h                     |  57 ++++++-
>  kernel/sched/core.c                          |  10 +-
>  kernel/sched/sched.h                         | 139 +++++++++++++++--
>  lib/find_bit.c                               |  42 +++++
>  tools/testing/selftests/rseq/.gitignore      |   1 +
>  tools/testing/selftests/rseq/Makefile        |   2 +-
>  .../testing/selftests/rseq/basic_numa_test.c | 144 ++++++++++++++++++
>  tools/testing/selftests/rseq/rseq-x86-bits.h |  43 ++++++
>  tools/testing/selftests/rseq/rseq.h          |  14 ++
>  11 files changed, 613 insertions(+), 21 deletions(-)
>  create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c
>
Looks good to me - for selftests:
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
thanks,
-- Shuah