In this series, we make migrate_enable/migrate_disable inline to obtain
better performance in some cases.
In the first patch, we add the macro "COMPILE_OFFSETS" to all the
asm-offsets.c files to avoid a circular dependency in the 3rd patch.
In the 2nd patch, we replace preempt.h with sched.h in
include/linux/rcupdate.h to fix a potential compilation issue.
In the 3rd patch, we generate the offset of nr_pinned in "struct rq" with
rq-offsets.c, as "struct rq" is defined internally to the scheduler and we
need to access the "nr_pinned" field in migrate_enable and migrate_disable.
Then,
we move the definition of migrate_enable/migrate_disable from
kernel/sched/core.c to include/linux/sched.h.
In the 4th patch, we fix some typos in include/linux/preempt.h.
One of the beneficiaries of this series is the BPF trampoline. Without this
series, migrate_enable/migrate_disable are hot when we run the
benchmarks for FENTRY, FEXIT, MODIFY_RETURN, etc.:
54.63% bpf_prog_2dcccf652aac1793_bench_trigger_fentry [k]
bpf_prog_2dcccf652aac1793_bench_trigger_fentry
10.43% [kernel] [k] migrate_enable
10.07% bpf_trampoline_6442517037 [k] bpf_trampoline_6442517037
8.06% [kernel] [k] __bpf_prog_exit_recur
4.11% libc.so.6 [.] syscall
2.15% [kernel] [k] entry_SYSCALL_64
1.48% [kernel] [k] memchr_inv
1.32% [kernel] [k] fput
1.16% [kernel] [k] _copy_to_user
0.73% [kernel] [k] bpf_prog_test_run_raw_tp
Before this series, the performance of BPF FENTRY is:
fentry : 113.030 ± 0.149M/s
fentry : 112.501 ± 0.187M/s
fentry : 112.828 ± 0.267M/s
fentry : 115.287 ± 0.241M/s
After this series, the performance of BPF FENTRY increases to:
fentry : 143.644 ± 0.670M/s
fentry : 149.764 ± 0.362M/s
fentry : 149.642 ± 0.156M/s
fentry : 145.263 ± 0.221M/s
Changes since V4:
* add the 2nd patch to fix a potential compilation issue
* fix the comment style problem in the 3rd patch
Changes since V3:
* some modifications to the 2nd patch, as Alexei advised:
  - rename CREATE_MIGRATE_DISABLE to INSTANTIATE_EXPORTED_MIGRATE_DISABLE
  - add documentation for INSTANTIATE_EXPORTED_MIGRATE_DISABLE
Changes since V2:
* some modifications to the 2nd patch, as Peter advised:
  - don't export runqueues; define migrate_enable and migrate_disable in
    kernel/sched/core.c and use them for kernel modules instead
  - define the macro this_rq_pinned()
  - add comments for this_rq_raw()
Changes since V1:
* use PERCPU_PTR() for this_rq_raw() if !CONFIG_SMP in the 2nd patch
Menglong Dong (4):
arch: add the macro COMPILE_OFFSETS to all the asm-offsets.c
rcu: replace preempt.h with sched.h in include/linux/rcupdate.h
sched: make migrate_enable/migrate_disable inline
sched: fix some typos in include/linux/preempt.h
Kbuild | 13 ++-
arch/alpha/kernel/asm-offsets.c | 1 +
arch/arc/kernel/asm-offsets.c | 1 +
arch/arm/kernel/asm-offsets.c | 2 +
arch/arm64/kernel/asm-offsets.c | 1 +
arch/csky/kernel/asm-offsets.c | 1 +
arch/hexagon/kernel/asm-offsets.c | 1 +
arch/loongarch/kernel/asm-offsets.c | 2 +
arch/m68k/kernel/asm-offsets.c | 1 +
arch/microblaze/kernel/asm-offsets.c | 1 +
arch/mips/kernel/asm-offsets.c | 2 +
arch/nios2/kernel/asm-offsets.c | 1 +
arch/openrisc/kernel/asm-offsets.c | 1 +
arch/parisc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/riscv/kernel/asm-offsets.c | 1 +
arch/s390/kernel/asm-offsets.c | 1 +
arch/sh/kernel/asm-offsets.c | 1 +
arch/sparc/kernel/asm-offsets.c | 1 +
arch/um/kernel/asm-offsets.c | 2 +
arch/xtensa/kernel/asm-offsets.c | 1 +
include/linux/preempt.h | 11 +--
include/linux/rcupdate.h | 2 +-
include/linux/sched.h | 113 +++++++++++++++++++++++++++
kernel/bpf/verifier.c | 1 +
kernel/sched/core.c | 63 ++++-----------
kernel/sched/rq-offsets.c | 12 +++
27 files changed, 181 insertions(+), 58 deletions(-)
create mode 100644 kernel/sched/rq-offsets.c
--
2.51.0