[PATCH v6 0/2] Optimize performance of update hash-map when free is zero

Feng Zhou posted 2 patches 3 years, 10 months ago
[PATCH v6 0/2] Optimize performance of update hash-map when free is zero
Posted by Feng Zhou 3 years, 10 months ago
From: Feng Zhou <zhoufeng.zf@bytedance.com>

We encountered a bad case on a large system with 96 CPUs in which
alloc_htab_elem() could take 1 ms. The reason is that once a
preallocated hashtab has no free elems, an update still grabs the
spin_locks of all CPUs. When several users update concurrently, the
lock contention is severe.

0001: Use head->first to check whether the free list is empty or not before taking
the lock.
0002: Add a benchmark to reproduce this worst case.
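
The idea behind 0001 can be illustrated with a minimal user-space
sketch. This is not the kernel code: NR_CPUS, struct list_head, the
atomic_flag spinlock, the freelist_* helpers and the lock_acquisitions
counter are all hypothetical names made up for the illustration, and
relaxed C11 atomics stand in for READ_ONCE/WRITE_ONCE. It only shows
the pattern of peeking at head->first without the lock and skipping
lists that look empty, so that a pop on a fully drained freelist takes
no locks at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count, for the sketch only */

struct node {
	struct node *next;
};

struct list_head {
	struct node *_Atomic first;      /* peeked locklessly, modified under lock */
	atomic_flag lock;                /* toy spinlock, stands in for raw_spinlock_t */
	unsigned long lock_acquisitions; /* instrumentation, not in the real code */
};

static struct list_head freelist[NR_CPUS];

static void head_lock(struct list_head *head)
{
	while (atomic_flag_test_and_set_explicit(&head->lock, memory_order_acquire))
		; /* spin until the flag is released */
	head->lock_acquisitions++;
}

static void head_unlock(struct list_head *head)
{
	atomic_flag_clear_explicit(&head->lock, memory_order_release);
}

static void freelist_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		atomic_store(&freelist[cpu].first, NULL);
		atomic_flag_clear(&freelist[cpu].lock);
		freelist[cpu].lock_acquisitions = 0;
	}
}

static void freelist_push(int cpu, struct node *n)
{
	struct list_head *head = &freelist[cpu];

	assert(cpu >= 0 && cpu < NR_CPUS);
	head_lock(head);
	n->next = atomic_load_explicit(&head->first, memory_order_relaxed);
	atomic_store_explicit(&head->first, n, memory_order_relaxed);
	head_unlock(head);
}

/* Pop one node, starting at this_cpu and wrapping around all lists. */
static struct node *freelist_pop(int this_cpu)
{
	for (int i = 0; i < NR_CPUS; i++) {
		struct list_head *head = &freelist[(this_cpu + i) % NR_CPUS];
		struct node *n;

		/* Lockless peek: an empty-looking list is skipped without locking. */
		if (!atomic_load_explicit(&head->first, memory_order_relaxed))
			continue;

		head_lock(head);
		/* Re-check under the lock; another thread may have emptied the list. */
		n = atomic_load_explicit(&head->first, memory_order_relaxed);
		if (n) {
			atomic_store_explicit(&head->first, n->next, memory_order_relaxed);
			head_unlock(head);
			return n;
		}
		head_unlock(head);
	}
	return NULL; /* every list looked empty; no lock was taken for those */
}

/* Sum of lock acquisitions over all lists (only meaningful when quiescent). */
static unsigned long total_lock_acquisitions(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += freelist[cpu].lock_acquisitions;
	return sum;
}
```

In this sketch, calling freelist_pop() after every list has been
drained returns NULL and leaves total_lock_acquisitions() unchanged,
which is the behaviour the series aims for when the prealloc hashtab is
full. The lockless peek can race with a concurrent push, so a pop may
occasionally miss a node that was just freed; the sketch accepts that
in exchange for not touching the locks.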

Changelog:
v5->v6: Addressed comments from Alexei Starovoitov.
- Adjust the commit log.
Details:
https://lore.kernel.org/all/20220608021050.47279-1-zhoufeng.zf@bytedance.com/

v4->v5: Addressed comments from Alexei Starovoitov.
- Use head->first.
- Use cpu+max_entries.
Details:
https://lore.kernel.org/bpf/20220601084149.13097-1-zhoufeng.zf@bytedance.com/

v3->v4: Addressed comments from Daniel Borkmann.
- Use READ_ONCE/WRITE_ONCE.
Details:
https://lore.kernel.org/all/20220530091340.53443-1-zhoufeng.zf@bytedance.com/

v2->v3: Addressed comments from Alexei Starovoitov, Andrii Nakryiko.
- Adjust the way the benchmark is tested.
- Adjust the code format.
Details:
https://lore.kernel.org/all/20220524075306.32306-1-zhoufeng.zf@bytedance.com/T/

v1->v2: Addressed comments from Alexei Starovoitov.
- Add a benchmark to reproduce the issue.
- Adjust the code format to avoid adding indentation.
Details:
https://lore.kernel.org/all/877ac441-045b-1844-6938-fcaee5eee7f2@bytedance.com/T/

Feng Zhou (2):
  bpf: avoid grabbing spin_locks of all cpus when no free elems
  selftest/bpf/benchs: Add bpf_map benchmark

 kernel/bpf/percpu_freelist.c                  | 20 ++--
 tools/testing/selftests/bpf/Makefile          |  4 +-
 tools/testing/selftests/bpf/bench.c           |  2 +
 .../benchs/bench_bpf_hashmap_full_update.c    | 96 +++++++++++++++++++
 .../run_bench_bpf_hashmap_full_update.sh      | 11 +++
 .../bpf/progs/bpf_hashmap_full_update_bench.c | 40 ++++++++
 6 files changed, 166 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c

-- 
2.20.1
Re: [PATCH v6 0/2] Optimize performance of update hash-map when free is zero
Posted by patchwork-bot+netdevbpf@kernel.org 3 years, 10 months ago
Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Fri, 10 Jun 2022 10:33:06 +0800 you wrote:
> From: Feng Zhou <zhoufeng.zf@bytedance.com>
> 
> We encountered bad case on big system with 96 CPUs that
> alloc_htab_elem() would last for 1ms. The reason is that after the
> prealloc hashtab has no free elems, when trying to update, it will still
> grab spin_locks of all cpus. If there are multiple update users, the
> competition is very serious.
> 
> [...]

Here is the summary with links:
  - [v6,1/2] bpf: avoid grabbing spin_locks of all cpus when no free elems
    https://git.kernel.org/bpf/bpf-next/c/54a9c3a42d92
  - [v6,2/2] selftest/bpf/benchs: Add bpf_map benchmark
    https://git.kernel.org/bpf/bpf-next/c/89eda98428ce

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html