From: Feng zhou
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com, songmuchun@bytedance.com, wangdongdong.6@bytedance.com, cong.wang@bytedance.com, zhouchengming@bytedance.com, zhoufeng.zf@bytedance.com
Subject: [PATCH v2 1/2] bpf: avoid grabbing spin_locks of all cpus when no free elems
Date: Tue, 24 May 2022 15:53:05 +0800
Message-Id: <20220524075306.32306-2-zhoufeng.zf@bytedance.com>
In-Reply-To: <20220524075306.32306-1-zhoufeng.zf@bytedance.com>
References: <20220524075306.32306-1-zhoufeng.zf@bytedance.com>

From: Feng Zhou

This patch adds an is_empty flag to pcpu_freelist_head that records
whether the freelist has any free elements. If it does, grab the
spin_lock; otherwise move on to the next cpu's freelist.

Before patch:

hash_map performance
./map_perf_test 1
0:hash_map_perf pre-alloc 975345 events per sec
4:hash_map_perf pre-alloc 855367 events per sec
12:hash_map_perf pre-alloc 860862 events per sec
8:hash_map_perf pre-alloc 849561 events per sec
3:hash_map_perf pre-alloc 849074 events per sec
6:hash_map_perf pre-alloc 847120 events per sec
10:hash_map_perf pre-alloc 845047 events per sec
5:hash_map_perf pre-alloc 841266 events per sec
14:hash_map_perf pre-alloc 849740 events per sec
2:hash_map_perf pre-alloc 839598 events per sec
9:hash_map_perf pre-alloc 838695 events per sec
11:hash_map_perf pre-alloc 845390 events per sec
7:hash_map_perf pre-alloc 834865 events per sec
13:hash_map_perf pre-alloc 842619 events per sec
1:hash_map_perf pre-alloc 804231 events per sec
15:hash_map_perf pre-alloc 795314 events per sec

hash_map worst case: no free elems
./map_perf_test 2048
6:worse hash_map_perf pre-alloc 28628 events per sec
5:worse hash_map_perf pre-alloc 28553 events per sec
11:worse hash_map_perf pre-alloc 28543 events per sec
3:worse hash_map_perf pre-alloc 28444 events per sec
1:worse hash_map_perf pre-alloc 28418 events per sec
7:worse hash_map_perf pre-alloc 28427 events per sec
13:worse hash_map_perf pre-alloc 28330 events per sec
14:worse hash_map_perf pre-alloc 28263 events per sec
9:worse hash_map_perf pre-alloc 28211 events per sec
15:worse hash_map_perf pre-alloc 28193 events per sec
12:worse hash_map_perf pre-alloc 28190 events per sec
10:worse hash_map_perf pre-alloc 28129 events per sec
8:worse hash_map_perf pre-alloc 28116 events per sec
4:worse hash_map_perf pre-alloc 27906 events per sec
2:worse hash_map_perf pre-alloc 27801 events per sec
0:worse hash_map_perf pre-alloc 27416 events per sec
3:worse hash_map_perf pre-alloc 28188 events per sec

ftrace trace

 0)               |  htab_map_update_elem() {
 0)   0.198 us    |    migrate_disable();
 0)               |    _raw_spin_lock_irqsave() {
 0)   0.157 us    |      preempt_count_add();
 0)   0.538 us    |    }
 0)   0.260 us    |    lookup_elem_raw();
 0)               |    alloc_htab_elem() {
 0)               |      __pcpu_freelist_pop() {
 0)               |        _raw_spin_lock() {
 0)   0.152 us    |          preempt_count_add();
 0)   0.352 us    |          native_queued_spin_lock_slowpath();
 0)   1.065 us    |        }
                  |        ...
 0)               |        _raw_spin_unlock() {
 0)   0.254 us    |          preempt_count_sub();
 0)   0.555 us    |        }
 0) + 25.188 us   |      }
 0) + 25.486 us   |    }
 0)               |    _raw_spin_unlock_irqrestore() {
 0)   0.155 us    |      preempt_count_sub();
 0)   0.454 us    |    }
 0)   0.148 us    |    migrate_enable();
 0) + 28.439 us   |  }

The test machine has 16 cpus, so a failed pop tries to take a spin_lock
17 times: once for each of the 16 per-cpu freelists, plus once for the
extralist.

After patch:

hash_map performance
./map_perf_test 1
0:hash_map_perf pre-alloc 969348 events per sec
10:hash_map_perf pre-alloc 906526 events per sec
11:hash_map_perf pre-alloc 904557 events per sec
9:hash_map_perf pre-alloc 902384 events per sec
15:hash_map_perf pre-alloc 912287 events per sec
14:hash_map_perf pre-alloc 905689 events per sec
12:hash_map_perf pre-alloc 903680 events per sec
13:hash_map_perf pre-alloc 902631 events per sec
8:hash_map_perf pre-alloc 875369 events per sec
4:hash_map_perf pre-alloc 862808 events per sec
1:hash_map_perf pre-alloc 857218 events per sec
2:hash_map_perf pre-alloc 852875 events per sec
5:hash_map_perf pre-alloc 846497 events per sec
6:hash_map_perf pre-alloc 828467 events per sec
3:hash_map_perf pre-alloc 812542 events per sec
7:hash_map_perf pre-alloc 805336 events per sec

hash_map worst case: no free elems
./map_perf_test 2048
7:worse hash_map_perf pre-alloc 391104 events per sec
4:worse hash_map_perf pre-alloc 388073 events per sec
5:worse hash_map_perf pre-alloc 387038 events per sec
1:worse hash_map_perf pre-alloc 386546 events per sec
0:worse hash_map_perf pre-alloc 384590 events per sec
11:worse hash_map_perf pre-alloc 379378 events per sec
10:worse hash_map_perf pre-alloc 375480 events per sec
12:worse hash_map_perf pre-alloc 372394 events per sec
6:worse hash_map_perf pre-alloc 367692 events per sec
3:worse hash_map_perf pre-alloc 363970 events per sec
9:worse hash_map_perf pre-alloc 364008 events per sec
8:worse hash_map_perf pre-alloc 363759 events per sec
2:worse hash_map_perf pre-alloc 360743 events per sec
14:worse hash_map_perf pre-alloc 361195 events per sec
13:worse hash_map_perf pre-alloc 360276 events per sec
15:worse hash_map_perf pre-alloc 360057 events per sec
0:worse hash_map_perf pre-alloc 378177 events per sec

ftrace trace

 0)               |  htab_map_update_elem() {
 0)   0.317 us    |    migrate_disable();
 0)               |    _raw_spin_lock_irqsave() {
 0)   0.260 us    |      preempt_count_add();
 0)   1.803 us    |    }
 0)   0.276 us    |    lookup_elem_raw();
 0)               |    alloc_htab_elem() {
 0)   0.586 us    |      __pcpu_freelist_pop();
 0)   0.945 us    |    }
 0)               |    _raw_spin_unlock_irqrestore() {
 0)   0.160 us    |      preempt_count_sub();
 0)   0.972 us    |    }
 0)   0.657 us    |    migrate_enable();
 0)   8.669 us    |  }

With this patch applied, map performance in the normal case is
essentially unchanged, and when free=0 the pop path checks is_empty
first instead of acquiring the spin_lock directly.

As for why is_empty is added instead of testing head->first directly:
my understanding is that head->first is modified frequently while the
map is being updated, which keeps invalidating the copy held in other
cpus' caches, whereas is_empty only changes when a freelist runs out of
free elems (or gets one back after being empty), so performance should
be better.

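The effect of the is_empty check on a failed pop can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: it is single-threaded, the "lock" is a counter rather than a raw_spinlock_t, and every name in it (freelist_model, model_pop, locks_for_failed_pop, and so on) is hypothetical. It counts how many lock acquisitions a pop needs when every list is empty, with and without the hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 16 /* assumption: matches the 16-cpu test machine */

struct node {
	struct node *next;
};

/* Hypothetical stand-in for pcpu_freelist_head; the lock is only counted. */
struct list_head_model {
	struct node *first;
	bool is_empty; /* hint that is read before taking the lock */
};

struct freelist_model {
	struct list_head_model percpu[NR_CPUS];
	struct list_head_model extralist;
	unsigned int lock_count; /* instrumentation: lock acquisitions so far */
};

static void model_init(struct freelist_model *s)
{
	for (int i = 0; i < NR_CPUS; i++) {
		s->percpu[i].first = NULL;
		s->percpu[i].is_empty = true;
	}
	s->extralist.first = NULL;
	s->extralist.is_empty = true;
	s->lock_count = 0;
}

static void model_push(struct freelist_model *s, struct list_head_model *h,
		       struct node *n)
{
	s->lock_count++; /* models raw_spin_lock(&h->lock) */
	n->next = h->first;
	h->first = n;
	h->is_empty = false;
}

static struct node *head_pop(struct freelist_model *s,
			     struct list_head_model *h, bool use_hint)
{
	struct node *n;

	if (use_hint && h->is_empty)
		return NULL; /* skip the lock entirely */

	s->lock_count++; /* models raw_spin_lock(&h->lock) */
	n = h->first;
	if (n) {
		h->first = n->next;
		if (!h->first)
			h->is_empty = true;
	}
	return n;
}

/* Walk all per-cpu lists starting at 'cpu', then fall back to extralist. */
static struct node *model_pop(struct freelist_model *s, int cpu, bool use_hint)
{
	for (int i = 0; i < NR_CPUS; i++) {
		struct node *n = head_pop(s, &s->percpu[(cpu + i) % NR_CPUS],
					  use_hint);
		if (n)
			return n;
	}
	return head_pop(s, &s->extralist, use_hint);
}

/* Number of lock acquisitions for one pop when nothing is free. */
static unsigned int locks_for_failed_pop(bool use_hint)
{
	struct freelist_model s;
	struct node *n;

	model_init(&s);
	n = model_pop(&s, 0, use_hint);
	assert(n == NULL);
	return s.lock_count;
}
```

In this model, locks_for_failed_pop(false) counts one acquisition per per-cpu list plus one for the extralist, which corresponds to the 17 attempts described above, while locks_for_failed_pop(true) takes no lock at all. A real multi-threaded version would also need to consider how the unlocked read of is_empty interacts with concurrent pushes, which this sketch does not model.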
Co-developed-by: Chengming Zhou
Signed-off-by: Chengming Zhou
Signed-off-by: Feng Zhou
---
 kernel/bpf/percpu_freelist.c | 28 +++++++++++++++++++++++++---
 kernel/bpf/percpu_freelist.h |  1 +
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/percpu_freelist.c b/kernel/bpf/percpu_freelist.c
index 3d897de89061..f83eb63720d4 100644
--- a/kernel/bpf/percpu_freelist.c
+++ b/kernel/bpf/percpu_freelist.c
@@ -16,9 +16,11 @@ int pcpu_freelist_init(struct pcpu_freelist *s)
 
 		raw_spin_lock_init(&head->lock);
 		head->first = NULL;
+		head->is_empty = true;
 	}
 	raw_spin_lock_init(&s->extralist.lock);
 	s->extralist.first = NULL;
+	s->extralist.is_empty = true;
 	return 0;
 }
 
@@ -32,6 +34,8 @@ static inline void pcpu_freelist_push_node(struct pcpu_freelist_head *head,
 {
 	node->next = head->first;
 	head->first = node;
+	if (head->is_empty)
+		head->is_empty = false;
 }
 
 static inline void ___pcpu_freelist_push(struct pcpu_freelist_head *head,
@@ -130,14 +134,19 @@ static struct pcpu_freelist_node *___pcpu_freelist_pop(struct pcpu_freelist *s)
 	orig_cpu = cpu = raw_smp_processor_id();
 	while (1) {
 		head = per_cpu_ptr(s->freelist, cpu);
+		if (head->is_empty)
+			goto next_cpu;
 		raw_spin_lock(&head->lock);
 		node = head->first;
 		if (node) {
 			head->first = node->next;
+			if (!head->first)
+				head->is_empty = true;
 			raw_spin_unlock(&head->lock);
 			return node;
 		}
 		raw_spin_unlock(&head->lock);
+next_cpu:
 		cpu = cpumask_next(cpu, cpu_possible_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = 0;
@@ -146,10 +155,15 @@ static struct pcpu_freelist_node *___pcpu_freelist_pop(struct pcpu_freelist *s)
 	}
 
 	/* per cpu lists are all empty, try extralist */
+	if (s->extralist.is_empty)
+		return NULL;
 	raw_spin_lock(&s->extralist.lock);
 	node = s->extralist.first;
-	if (node)
+	if (node) {
 		s->extralist.first = node->next;
+		if (!s->extralist.first)
+			s->extralist.is_empty = true;
+	}
 	raw_spin_unlock(&s->extralist.lock);
 	return node;
 }
@@ -164,15 +178,20 @@ ___pcpu_freelist_pop_nmi(struct pcpu_freelist *s)
 	orig_cpu = cpu = raw_smp_processor_id();
 	while (1) {
 		head = per_cpu_ptr(s->freelist, cpu);
+		if (head->is_empty)
+			goto next_cpu;
 		if (raw_spin_trylock(&head->lock)) {
 			node = head->first;
 			if (node) {
 				head->first = node->next;
+				if (!head->first)
+					head->is_empty = true;
 				raw_spin_unlock(&head->lock);
 				return node;
 			}
 			raw_spin_unlock(&head->lock);
 		}
+next_cpu:
 		cpu = cpumask_next(cpu, cpu_possible_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = 0;
@@ -181,11 +200,14 @@ ___pcpu_freelist_pop_nmi(struct pcpu_freelist *s)
 	}
 
 	/* cannot pop from per cpu lists, try extralist */
-	if (!raw_spin_trylock(&s->extralist.lock))
+	if (s->extralist.is_empty || !raw_spin_trylock(&s->extralist.lock))
 		return NULL;
 	node = s->extralist.first;
-	if (node)
+	if (node) {
 		s->extralist.first = node->next;
+		if (!s->extralist.first)
+			s->extralist.is_empty = true;
+	}
 	raw_spin_unlock(&s->extralist.lock);
 	return node;
 }
diff --git a/kernel/bpf/percpu_freelist.h b/kernel/bpf/percpu_freelist.h
index 3c76553cfe57..9e4545631ed5 100644
--- a/kernel/bpf/percpu_freelist.h
+++ b/kernel/bpf/percpu_freelist.h
@@ -9,6 +9,7 @@
 struct pcpu_freelist_head {
 	struct pcpu_freelist_node *first;
 	raw_spinlock_t lock;
+	bool is_empty;
 };
 
 struct pcpu_freelist {
-- 
2.20.1

From: Feng zhou
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com, songmuchun@bytedance.com, wangdongdong.6@bytedance.com, cong.wang@bytedance.com, zhouchengming@bytedance.com, zhoufeng.zf@bytedance.com
Subject: [PATCH v2 2/2] selftest/bpf/benchs: Add bpf_map benchmark
Date: Tue, 24 May 2022 15:53:06 +0800
Message-Id: <20220524075306.32306-3-zhoufeng.zf@bytedance.com>
In-Reply-To: <20220524075306.32306-1-zhoufeng.zf@bytedance.com>
References: <20220524075306.32306-1-zhoufeng.zf@bytedance.com>

From: Feng Zhou

Add a hash_map benchmark that reproduces the worst case: non-stop
updates while the map has no free elems.

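Why this workload keeps hitting the no-free path can be seen in a toy user-space model. This is a hedged sketch, not the kernel hash table: toy_map, toy_map_update and toy_map_fill are hypothetical names, lookup is a linear scan, and returning -E2BIG for a full map is an assumption made for the model. Once the map is filled to max_entries, an update succeeds only if the key already exists; any other key needs a free element and fails, and with random 32-bit keys against 1000 entries that is nearly every call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 1000 /* same capacity as hash_map_bench in the patch */

/* Hypothetical user-space stand-in for a pre-allocated hash map. */
struct toy_map {
	uint32_t keys[MAX_ENTRIES];
	uint64_t vals[MAX_ENTRIES];
	size_t used; /* number of occupied slots */
};

/* Overwrite an existing key, or take a free slot; fail when none is left. */
static int toy_map_update(struct toy_map *m, uint32_t key, uint64_t val)
{
	for (size_t i = 0; i < m->used; i++) {
		if (m->keys[i] == key) {
			m->vals[i] = val;
			return 0;
		}
	}
	if (m->used == MAX_ENTRIES)
		return -E2BIG; /* assumption: error used by this model */

	m->keys[m->used] = key;
	m->vals[m->used] = val;
	m->used++;
	return 0;
}

/* Fill the map with keys 0..MAX_ENTRIES-1, mirroring the benchmark setup. */
static void toy_map_fill(struct toy_map *m)
{
	m->used = 0;
	for (uint32_t i = 0; i < MAX_ENTRIES; i++) {
		int err = toy_map_update(m, i, i);

		assert(err == 0);
	}
}
```

After toy_map_fill(), updating a key below MAX_ENTRIES still succeeds, while updating any new key fails because no free slot is left.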
Signed-off-by: Feng Zhou
---
 tools/testing/selftests/bpf/Makefile               |  4 +-
 tools/testing/selftests/bpf/bench.c                |  2 +
 .../selftests/bpf/benchs/bench_bpf_map.c           | 78 +++++++++++++++++++
 .../selftests/bpf/benchs/run_bench_bpf_map.sh      | 10 +++
 .../selftests/bpf/progs/bpf_map_bench.c            | 27 +++++++
 5 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bpf_map.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bpf_map.sh
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_bench.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 3820608faf57..cd2fada21ed7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -549,6 +549,7 @@ $(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \
 $(OUTPUT)/bench_bloom_filter_map.o: $(OUTPUT)/bloom_filter_bench.skel.h
 $(OUTPUT)/bench_bpf_loop.o: $(OUTPUT)/bpf_loop_bench.skel.h
 $(OUTPUT)/bench_strncmp.o: $(OUTPUT)/strncmp_bench.skel.h
+$(OUTPUT)/bench_bpf_map.o: $(OUTPUT)/bpf_map_bench.skel.h
 $(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
 $(OUTPUT)/bench: LDLIBS += -lm
 $(OUTPUT)/bench: $(OUTPUT)/bench.o \
@@ -560,7 +561,8 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
 		 $(OUTPUT)/bench_ringbufs.o \
 		 $(OUTPUT)/bench_bloom_filter_map.o \
 		 $(OUTPUT)/bench_bpf_loop.o \
-		 $(OUTPUT)/bench_strncmp.o
+		 $(OUTPUT)/bench_strncmp.o \
+		 $(OUTPUT)/bench_bpf_map.o
 	$(call msg,BINARY,,$@)
 	$(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@
 
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index f973320e6dbf..32644c4adc84 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -397,6 +397,7 @@ extern const struct bench bench_hashmap_with_bloom;
 extern const struct bench bench_bpf_loop;
 extern const struct bench bench_strncmp_no_helper;
 extern const struct bench bench_strncmp_helper;
+extern const struct bench bench_bpf_map;
 
 static const struct bench *benchs[] = {
 	&bench_count_global,
@@ -431,6 +432,7 @@ static const struct bench *benchs[] = {
 	&bench_bpf_loop,
 	&bench_strncmp_no_helper,
 	&bench_strncmp_helper,
+	&bench_bpf_map,
 };
 
 static void setup_benchmark()
diff --git a/tools/testing/selftests/bpf/benchs/bench_bpf_map.c b/tools/testing/selftests/bpf/benchs/bench_bpf_map.c
new file mode 100644
index 000000000000..4db08ed23f1f
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_bpf_map.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Bytedance */
+
+#include
+#include "bench.h"
+#include "bpf_map_bench.skel.h"
+
+/* BPF triggering benchmarks */
+static struct ctx {
+	struct bpf_map_bench *skel;
+	struct counter hits;
+} ctx;
+
+static void validate(void)
+{
+	if (env.consumer_cnt != 1) {
+		fprintf(stderr, "benchmark doesn't support multi-consumer!\n");
+		exit(1);
+	}
+}
+
+static void *producer(void *input)
+{
+	while (true) {
+		/* trigger the bpf program */
+		syscall(__NR_getpgid);
+		atomic_inc(&ctx.hits.value);
+	}
+
+	return NULL;
+}
+
+static void *consumer(void *input)
+{
+	return NULL;
+}
+
+static void measure(struct bench_res *res)
+{
+	res->hits = atomic_swap(&ctx.hits.value, 0);
+}
+
+static void setup(void)
+{
+	struct bpf_link *link;
+	int map_fd, i, max_entries;
+
+	setup_libbpf();
+
+	ctx.skel = bpf_map_bench__open_and_load();
+	if (!ctx.skel) {
+		fprintf(stderr, "failed to open skeleton\n");
+		exit(1);
+	}
+
+	link = bpf_program__attach(ctx.skel->progs.benchmark);
+	if (!link) {
+		fprintf(stderr, "failed to attach program!\n");
+		exit(1);
+	}
+
+	//fill hash_map
+	map_fd = bpf_map__fd(ctx.skel->maps.hash_map_bench);
+	max_entries = bpf_map__max_entries(ctx.skel->maps.hash_map_bench);
+	for (i = 0; i < max_entries; i++)
+		bpf_map_update_elem(map_fd, &i, &i, BPF_ANY);
+}
+
+const struct bench bench_bpf_map = {
+	.name = "bpf-map",
+	.validate = validate,
+	.setup = setup,
+	.producer_thread = producer,
+	.consumer_thread = consumer,
+	.measure = measure,
+	.report_progress = ops_report_progress,
+	.report_final = ops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_bpf_map.sh b/tools/testing/selftests/bpf/benchs/run_bench_bpf_map.sh
new file mode 100755
index 000000000000..d7cc969e4f85
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_bpf_map.sh
@@ -0,0 +1,10 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./benchs/run_common.sh
+
+set -eufo pipefail
+
+nr_threads=`expr $(cat /proc/cpuinfo | grep "processor"| wc -l) - 1`
+summary=$($RUN_BENCH -p $nr_threads bpf-map)
+printf "$summary"
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_bench.c b/tools/testing/selftests/bpf/progs/bpf_map_bench.c
new file mode 100644
index 000000000000..655366e6e0f4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_bench.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Bytedance */
+
+#include "vmlinux.h"
+#include
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define MAX_ENTRIES 1000
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, u32);
+	__type(value, u64);
+	__uint(max_entries, MAX_ENTRIES);
+} hash_map_bench SEC(".maps");
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int benchmark(void *ctx)
+{
+	u32 key = bpf_get_prandom_u32();
+	u64 init_val = 1;
+
+	bpf_map_update_elem(&hash_map_bench, &key, &init_val, BPF_ANY);
+	return 0;
+}
-- 
2.20.1