From nobody Sun Oct 5 12:23:57 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A1E71E3DDB; Mon, 4 Aug 2025 02:27:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754274432; cv=none; b=r/p+tXfW/usxMOkotFV8znC6UMcKXHl8ohLNNmVvRutHhG5ufKSe4/CcO1avFR0Vu1jDFEbSgWLsQXZVkfTStiGHwL6/PnW7oAy2o2xifS5Rl6NAEJ2q2NbM+Or0zRrdFDfJBnsVvzK/Buhkuf0gkKK1equpADIkUXetKmz9rbs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754274432; c=relaxed/simple; bh=B+jm6axNN8eKPZKwHfaIwnhbv47UYaZtTGu9HqYUqBQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=W2WLpH8DSuBTvM0yjp8u+r0Uy/+h9d7htOIgOz0+cpOOkPGKaMBsIsOymcN7mfz7rDVLu7CeB6uSQD7JVr8RM12SEP5ZQyzOvOzJR5hqLnJEzZo19D2SYzVWageEFM0TENZH+DQt/CklPrYyjq17tPwPPHIQJILkHOn4696//+s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bwL6h0HkgzYQtLW; Mon, 4 Aug 2025 10:27:08 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id A96761A0E40; Mon, 4 Aug 2025 10:27:06 +0800 (CST) Received: from k-arm6401.huawei.com (unknown [7.217.19.243]) by APP4 (Coremail) with SMTP id gCh0CgAX4BBsGpBoTUL9CQ--.242S3; Mon, 04 Aug 2025 10:27:06 +0800 (CST) From: Xu Kuohai 
To: bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Song Liu,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 Mykola Lysenko, Shuah Khan, Stanislav Fomichev, Willem de Bruijn,
 Jason Xing, Paul Chaignon, Tao Chen, Kumar Kartikeya Dwivedi,
 Martin Kelly
Subject: [PATCH bpf-next 1/4] bpf: Add overwrite mode for bpf ring buffer
Date: Mon, 4 Aug 2025 10:20:57 +0800
Message-ID: <20250804022101.2171981-2-xukuohai@huaweicloud.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
References: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Xu Kuohai

When the bpf ring buffer is full, new events cannot be recorded until
the consumer consumes some events to free space. This may cause critical
events to be discarded, such as in fault diagnosis, where recent events
are more critical than older ones.

So add overwrite mode for the bpf ring buffer. In this mode, the new
event overwrites the oldest event when the buffer is full.

The scheme is as follows:

1. producer_pos tracks the next position to write new data. When there
   is enough free space, the producer simply moves producer_pos forward
   to make space for the new event.

2. To avoid waiting for the consumer to free space when the buffer is
   full, a new variable overwrite_pos is introduced for the producer.
   overwrite_pos tracks the next event to be overwritten (the oldest
   committed event) in the buffer. The producer moves it forward to
   discard the oldest events when the buffer is full.

3. pending_pos tracks the oldest event that is still being committed.
   The producer ensures producer_pos never passes pending_pos when
   making space for new events, so multiple producers never write to
   the same position at the same time.

4. The producer wakes up the consumer every half a round ahead to give
   it a chance to retrieve data. However, for an overwrite-mode ring
   buffer, users typically only care about the ring buffer snapshot
   taken before a fault occurs. In this case, the producer should commit
   data with the BPF_RB_NO_WAKEUP flag to avoid unnecessary wakeups.

The performance data for overwrite mode will be provided in a follow-up
patch that adds overwrite mode benchmarks.

A sample of performance data for non-overwrite mode on an x86_64 and
arm64 CPU, before and after this patch, is shown below.
As we can see, no obvious performance regression occurs.

- x86_64 (AMD EPYC 9654)

Before:

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1   13.218 ± 0.039M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2   15.684 ± 0.015M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3    7.771 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4    6.281 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8    2.842 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12   2.001 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16   1.833 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20   1.508 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24   1.421 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28   1.309 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32   1.265 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36   1.198 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40   1.174 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44   1.113 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48   1.097 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52   1.070 ± 0.002M/s (drops 0.000 ± 0.000M/s)

After:

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1   13.751 ± 0.673M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2   15.592 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3    7.776 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4    6.463 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8    2.883 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12   2.017 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16   1.816 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20   1.512 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24   1.396 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28   1.303 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32   1.267 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36   1.210 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40   1.181 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44   1.136 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48   1.090 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52   1.091 ± 0.002M/s (drops 0.000 ± 0.000M/s)

- arm64 (HiSilicon Kunpeng 920)

Before:

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1   11.602 ± 0.423M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2    9.599 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3    6.669 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4    4.806 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8    3.856 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12   3.368 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16   3.210 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20   3.003 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24   2.944 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28   2.863 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32   2.819 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36   2.887 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40   2.837 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44   2.787 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48   2.738 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52   2.700 ± 0.007M/s (drops 0.000 ± 0.000M/s)

After:

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1   11.614 ± 0.268M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2    9.917 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3    6.920 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4    4.803 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8    3.898 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12   3.426 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16   3.320 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20   3.029 ± 0.013M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24   3.068 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28   2.890 ± 0.009M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32   2.950 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36   2.812 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40   2.834 ± 0.009M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44   2.803 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48   2.766 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52   2.754 ± 0.009M/s (drops 0.000 ± 0.000M/s)

Signed-off-by: Xu Kuohai
---
 include/uapi/linux/bpf.h       |   4 +
 kernel/bpf/ringbuf.c           | 159 +++++++++++++++++++++++++++------
 tools/include/uapi/linux/bpf.h |   4 +
 3 files changed, 141 insertions(+), 26 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 233de8677382..d3b2fd2ae527 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1430,6 +1430,9 @@ enum {
 
 /* Do not translate kernel bpf_arena pointers to user pointers */
 	BPF_F_NO_USER_CONV	= (1U << 18),
+
+/* bpf ringbuf works in overwrite mode? */
+	BPF_F_OVERWRITE		= (1U << 19),
 };
 
 /* Flags for BPF_PROG_QUERY. */
@@ -6215,6 +6218,7 @@ enum {
 	BPF_RB_RING_SIZE = 1,
 	BPF_RB_CONS_POS = 2,
 	BPF_RB_PROD_POS = 3,
+	BPF_RB_OVER_POS = 4,
 };
 
 /* BPF ring buffer constants */
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 719d73299397..6ca41d01f187 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -13,7 +13,7 @@
 #include
 #include
 
-#define RINGBUF_CREATE_FLAG_MASK (BPF_F_NUMA_NODE)
+#define RINGBUF_CREATE_FLAG_MASK (BPF_F_NUMA_NODE | BPF_F_OVERWRITE)
 
 /* non-mmap()'able part of bpf_ringbuf (everything up to consumer page) */
 #define RINGBUF_PGOFF \
@@ -27,7 +27,8 @@
 struct bpf_ringbuf {
 	wait_queue_head_t waitq;
 	struct irq_work work;
-	u64 mask;
+	u64 mask:48;
+	u64 overwrite_mode:1;
 	struct page **pages;
 	int nr_pages;
 	rqspinlock_t spinlock ____cacheline_aligned_in_smp;
@@ -72,6 +73,7 @@ struct bpf_ringbuf {
 	 */
 	unsigned long consumer_pos __aligned(PAGE_SIZE);
 	unsigned long producer_pos __aligned(PAGE_SIZE);
+	unsigned long overwrite_pos; /* to be overwritten in overwrite mode */
 	unsigned long pending_pos;
 	char data[] __aligned(PAGE_SIZE);
 };
@@ -166,7 +168,8 @@ static void bpf_ringbuf_notify(struct irq_work *work)
  * considering that the maximum value of data_sz is (4GB - 1), there
  * will be no overflow, so just note the size limit in the comments.
  */
-static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
+static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node,
+					     int overwrite_mode)
 {
 	struct bpf_ringbuf *rb;
 
@@ -183,17 +186,25 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
 	rb->consumer_pos = 0;
 	rb->producer_pos = 0;
 	rb->pending_pos = 0;
+	rb->overwrite_mode = overwrite_mode;
 
 	return rb;
 }
 
 static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 {
+	int overwrite_mode = 0;
 	struct bpf_ringbuf_map *rb_map;
 
 	if (attr->map_flags & ~RINGBUF_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
 
+	if (attr->map_flags & BPF_F_OVERWRITE) {
+		if (attr->map_type == BPF_MAP_TYPE_USER_RINGBUF)
+			return ERR_PTR(-EINVAL);
+		overwrite_mode = 1;
+	}
+
 	if (attr->key_size || attr->value_size ||
 	    !is_power_of_2(attr->max_entries) ||
 	    !PAGE_ALIGNED(attr->max_entries))
@@ -205,7 +216,8 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 
 	bpf_map_init_from_attr(&rb_map->map, attr);
 
-	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node);
+	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node,
+				       overwrite_mode);
 	if (!rb_map->rb) {
 		bpf_map_area_free(rb_map);
 		return ERR_PTR(-ENOMEM);
@@ -295,11 +307,16 @@ static int ringbuf_map_mmap_user(struct bpf_map *map, struct vm_area_struct *vma
 
 static unsigned long ringbuf_avail_data_sz(struct bpf_ringbuf *rb)
 {
-	unsigned long cons_pos, prod_pos;
+	unsigned long cons_pos, prod_pos, over_pos;
 
 	cons_pos = smp_load_acquire(&rb->consumer_pos);
 	prod_pos = smp_load_acquire(&rb->producer_pos);
-	return prod_pos - cons_pos;
+
+	if (likely(!rb->overwrite_mode))
+		return prod_pos - cons_pos;
+
+	over_pos = READ_ONCE(rb->overwrite_pos);
+	return min(prod_pos - max(cons_pos, over_pos), rb->mask + 1);
 }
 
 static u32 ringbuf_total_data_sz(const struct bpf_ringbuf *rb)
@@ -402,11 +419,43 @@ bpf_ringbuf_restore_from_rec(struct bpf_ringbuf_hdr *hdr)
 	return (void*)((addr & PAGE_MASK) - off);
 }
 
+
+static bool bpf_ringbuf_has_space(const struct bpf_ringbuf *rb,
+				  unsigned long new_prod_pos,
+				  unsigned long cons_pos,
+				  unsigned long pend_pos)
+{
+	/* no space if oldest not yet committed record until the newest
+	 * record span more than (ringbuf_size - 1)
+	 */
+	if (new_prod_pos - pend_pos > rb->mask)
+		return false;
+
+	/* ok, we have space in overwrite mode */
+	if (unlikely(rb->overwrite_mode))
+		return true;
+
+	/* no space if producer position advances more than (ringbuf_size - 1)
+	 * ahead than consumer position when not in overwrite mode
+	 */
+	if (new_prod_pos - cons_pos > rb->mask)
+		return false;
+
+	return true;
+}
+
+static u32 ringbuf_round_up_hdr_len(u32 hdr_len)
+{
+	hdr_len &= ~BPF_RINGBUF_DISCARD_BIT;
+	return round_up(hdr_len + BPF_RINGBUF_HDR_SZ, 8);
+}
+
 static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
 {
-	unsigned long cons_pos, prod_pos, new_prod_pos, pend_pos, flags;
+	unsigned long flags;
 	struct bpf_ringbuf_hdr *hdr;
-	u32 len, pg_off, tmp_size, hdr_len;
+	u32 len, pg_off, hdr_len;
+	unsigned long cons_pos, prod_pos, new_prod_pos, pend_pos, over_pos;
 
 	if (unlikely(size > RINGBUF_MAX_RECORD_SZ))
 		return NULL;
@@ -429,24 +478,39 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
 		hdr_len = READ_ONCE(hdr->len);
 		if (hdr_len & BPF_RINGBUF_BUSY_BIT)
 			break;
-		tmp_size = hdr_len & ~BPF_RINGBUF_DISCARD_BIT;
-		tmp_size = round_up(tmp_size + BPF_RINGBUF_HDR_SZ, 8);
-		pend_pos += tmp_size;
+		pend_pos += ringbuf_round_up_hdr_len(hdr_len);
 	}
 	rb->pending_pos = pend_pos;
 
-	/* check for out of ringbuf space:
-	 * - by ensuring producer position doesn't advance more than
-	 *   (ringbuf_size - 1) ahead
-	 * - by ensuring oldest not yet committed record until newest
-	 *   record does not span more than (ringbuf_size - 1)
-	 */
-	if (new_prod_pos - cons_pos > rb->mask ||
-	    new_prod_pos - pend_pos > rb->mask) {
+	if (!bpf_ringbuf_has_space(rb, new_prod_pos, cons_pos, pend_pos)) {
 		raw_res_spin_unlock_irqrestore(&rb->spinlock, flags);
 		return NULL;
 	}
 
+	/* In overwrite mode, move overwrite_pos to the next record to be
+	 * overwritten if the ring buffer is full
+	 */
+	if (unlikely(rb->overwrite_mode)) {
+		over_pos = rb->overwrite_pos;
+		while (new_prod_pos - over_pos > rb->mask) {
+			hdr = (void *)rb->data + (over_pos & rb->mask);
+			hdr_len = READ_ONCE(hdr->len);
+			/* since pending_pos is the first record with BUSY
+			 * bit set and overwrite_pos is never bigger than
+			 * pending_pos, no need to check BUSY bit here.
+			 */
+			over_pos += ringbuf_round_up_hdr_len(hdr_len);
+		}
+		/* smp_store_release(&rb->producer_pos, new_prod_pos) at
+		 * the end of the function ensures that when consumer sees
+		 * the updated rb->producer_pos, it always sees the updated
+		 * rb->overwrite_pos, so when consumer reads overwrite_pos
+		 * after smp_load_acquire(r->producer_pos), the overwrite_pos
+		 * will always be valid.
+		 */
+		WRITE_ONCE(rb->overwrite_pos, over_pos);
+	}
+
 	hdr = (void *)rb->data + (prod_pos & rb->mask);
 	pg_off = bpf_ringbuf_rec_pg_off(rb, hdr);
 	hdr->len = size | BPF_RINGBUF_BUSY_BIT;
@@ -479,7 +543,50 @@ const struct bpf_func_proto bpf_ringbuf_reserve_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
-static void bpf_ringbuf_commit(void *sample, u64 flags, bool discard)
+static __always_inline
+bool ringbuf_should_wakeup(const struct bpf_ringbuf *rb,
+			   unsigned long rec_pos,
+			   unsigned long cons_pos,
+			   u32 len, u64 flags)
+{
+	unsigned long rec_end;
+
+	if (flags & BPF_RB_FORCE_WAKEUP)
+		return true;
+
+	if (flags & BPF_RB_NO_WAKEUP)
+		return false;
+
+	/* for non-overwrite mode, if consumer caught up and is waiting for
+	 * our record, notify about new data availability
+	 */
+	if (likely(!rb->overwrite_mode))
+		return cons_pos == rec_pos;
+
+	/* for overwrite mode, to give the consumer a chance to catch up
+	 * before being overwritten, wake up consumer every half a round
+	 * ahead.
+	 */
+	rec_end = rec_pos + ringbuf_round_up_hdr_len(len);
+
+	cons_pos &= (rb->mask >> 1);
+	rec_pos &= (rb->mask >> 1);
+	rec_end &= (rb->mask >> 1);
+
+	if (cons_pos == rec_pos)
+		return true;
+
+	if (rec_pos < cons_pos && cons_pos < rec_end)
+		return true;
+
+	if (rec_end < rec_pos && (cons_pos > rec_pos || cons_pos < rec_end))
+		return true;
+
+	return false;
+}
+
+static __always_inline
+void bpf_ringbuf_commit(void *sample, u64 flags, bool discard)
 {
 	unsigned long rec_pos, cons_pos;
 	struct bpf_ringbuf_hdr *hdr;
@@ -495,15 +602,10 @@ static void bpf_ringbuf_commit(void *sample, u64 flags, bool discard)
 	/* update record header with correct final size prefix */
 	xchg(&hdr->len, new_len);
 
-	/* if consumer caught up and is waiting for our record, notify about
-	 * new data availability
-	 */
 	rec_pos = (void *)hdr - (void *)rb->data;
 	cons_pos = smp_load_acquire(&rb->consumer_pos) & rb->mask;
 
-	if (flags & BPF_RB_FORCE_WAKEUP)
-		irq_work_queue(&rb->work);
-	else if (cons_pos == rec_pos && !(flags & BPF_RB_NO_WAKEUP))
+	if (ringbuf_should_wakeup(rb, rec_pos, cons_pos, new_len, flags))
 		irq_work_queue(&rb->work);
 }
 
@@ -576,6 +678,8 @@ BPF_CALL_2(bpf_ringbuf_query, struct bpf_map *, map, u64, flags)
 		return smp_load_acquire(&rb->consumer_pos);
 	case BPF_RB_PROD_POS:
 		return smp_load_acquire(&rb->producer_pos);
+	case BPF_RB_OVER_POS:
+		return READ_ONCE(rb->overwrite_pos);
 	default:
 		return 0;
 	}
@@ -749,6 +853,9 @@ BPF_CALL_4(bpf_user_ringbuf_drain, struct bpf_map *, map,
 
 	rb = container_of(map, struct bpf_ringbuf_map, map)->rb;
 
+	if (unlikely(rb->overwrite_mode))
+		return -EOPNOTSUPP;
+
 	/* If another consumer is already consuming a sample, wait for them to finish. */
 	if (!atomic_try_cmpxchg(&rb->busy, &busy, 1))
 		return -EBUSY;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 233de8677382..d3b2fd2ae527 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1430,6 +1430,9 @@ enum {
 
 /* Do not translate kernel bpf_arena pointers to user pointers */
 	BPF_F_NO_USER_CONV	= (1U << 18),
+
+/* bpf ringbuf works in overwrite mode? */
+	BPF_F_OVERWRITE		= (1U << 19),
 };
 
 /* Flags for BPF_PROG_QUERY. */
@@ -6215,6 +6218,7 @@ enum {
 	BPF_RB_RING_SIZE = 1,
 	BPF_RB_CONS_POS = 2,
 	BPF_RB_PROD_POS = 3,
+	BPF_RB_OVER_POS = 4,
 };
 
 /* BPF ring buffer constants */
-- 
2.43.0

From nobody Sun Oct 5 12:23:57 2025
From: Xu Kuohai
To: bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Song Liu,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 Mykola Lysenko, Shuah Khan, Stanislav Fomichev, Willem de Bruijn,
 Jason Xing, Paul Chaignon, Tao Chen, Kumar Kartikeya Dwivedi,
 Martin Kelly
Subject: [PATCH bpf-next 2/4] libbpf: ringbuf: Add overwrite ring buffer process
Date: Mon, 4 Aug 2025 10:20:58 +0800
Message-ID: <20250804022101.2171981-3-xukuohai@huaweicloud.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
References: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Xu Kuohai

In overwrite mode, the producer does not wait for the consumer, so the
consumer is responsible for handling conflicts. An optimistic method
is used to resolve the conflicts: the consumer first reads consumer_pos,
producer_pos and overwrite_pos, then calculates a read window and copies
the data in the window from the ring buffer. After copying, it checks
the positions to decide whether the data in the copy window has been
overwritten by the producer. If so, it discards the copy and tries
again. On success, the consumer processes the events in the copy.
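
The optimistic read described above can be sketched as a copy step
followed by a validation step. This is a minimal, single-threaded sketch
and not the libbpf implementation: struct demo_ring, demo_copy_window()
and demo_window_intact() are hypothetical names, C11 atomics stand in
for smp_load_acquire() and READ_ONCE(), the copy is split by hand at the
wrap point instead of relying on double-mapped data pages, and the
acquire fence before the re-read is a conservative choice of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of an overwrite-mode ring; not libbpf's struct ring. */
struct demo_ring {
	unsigned char *data;			/* storage; size is a power of two */
	unsigned long mask;			/* size - 1 */
	_Atomic unsigned long producer_pos;	/* next byte the producer will write */
	_Atomic unsigned long overwrite_pos;	/* oldest byte not yet discarded */
};

/*
 * Step 1: copy the window [max(cons_pos, overwrite_pos), producer_pos) into
 * out, which must hold at least mask + 1 bytes. Returns the number of bytes
 * copied and stores the logical start of the window in *start.
 */
static size_t demo_copy_window(struct demo_ring *r, unsigned long cons_pos,
			       unsigned char *out, unsigned long *start)
{
	unsigned long prod, over, off;
	size_t len, first;

	assert(((r->mask + 1) & r->mask) == 0);	/* size must be a power of two */

	/* Read producer_pos before overwrite_pos, in the order the patch uses. */
	prod = atomic_load_explicit(&r->producer_pos, memory_order_acquire);
	over = atomic_load_explicit(&r->overwrite_pos, memory_order_relaxed);

	*start = cons_pos > over ? cons_pos : over;
	if (*start >= prod)
		return 0;	/* nothing new, or prod is stale: caller retries */

	len = prod - *start;
	assert(len <= r->mask + 1);	/* assumed invariant of the producer */

	/* No double mapping in this demo, so split the copy at the wrap point. */
	off = *start & r->mask;
	first = r->mask + 1 - off;
	if (first > len)
		first = len;
	memcpy(out, r->data + off, first);
	memcpy(out + first, r->data, len - first);
	return len;
}

/*
 * Step 2: re-read producer_pos. The copy is only trusted if the producer has
 * not advanced far enough to wrap onto the start of the window.
 */
static bool demo_window_intact(struct demo_ring *r, unsigned long start)
{
	unsigned long prod;

	/* Keep the earlier copy's loads from being reordered past the re-read. */
	atomic_thread_fence(memory_order_acquire);
	prod = atomic_load_explicit(&r->producer_pos, memory_order_acquire);
	return prod - start <= r->mask;
}
```

A caller would repeat demo_copy_window() until demo_window_intact()
returns true for the same start value, and only then parse records out
of the private copy, so a concurrent producer cannot change data that
is being parsed.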

Signed-off-by: Xu Kuohai
---
 tools/lib/bpf/ringbuf.c | 103 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 102 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 9702b70da444..9c072af675ff 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -27,10 +27,13 @@ struct ring {
 	ring_buffer_sample_fn sample_cb;
 	void *ctx;
 	void *data;
+	void *read_buffer;
 	unsigned long *consumer_pos;
 	unsigned long *producer_pos;
+	unsigned long *overwrite_pos;
 	unsigned long mask;
 	int map_fd;
+	bool overwrite_mode;
 };
 
 struct ring_buffer {
@@ -69,6 +72,9 @@ static void ringbuf_free_ring(struct ring_buffer *rb, struct ring *r)
 		r->producer_pos = NULL;
 	}
 
+	if (r->read_buffer)
+		free(r->read_buffer);
+
 	free(r);
 }
 
@@ -119,6 +125,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	r->sample_cb = sample_cb;
 	r->ctx = ctx;
 	r->mask = info.max_entries - 1;
+	r->overwrite_mode = info.map_flags & BPF_F_OVERWRITE;
+	if (unlikely(r->overwrite_mode)) {
+		r->read_buffer = malloc(info.max_entries);
+		if (!r->read_buffer) {
+			err = -ENOMEM;
+			goto err_out;
+		}
+	}
 
 	/* Map writable consumer page */
 	tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);
@@ -148,6 +162,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 		goto err_out;
 	}
 	r->producer_pos = tmp;
+	r->overwrite_pos = r->producer_pos + 1; /* overwrite_pos is next to producer_pos */
 	r->data = tmp + rb->page_size;
 
 	e = &rb->events[rb->ring_cnt];
@@ -232,7 +247,7 @@ static inline int roundup_len(__u32 len)
 	return (len + 7) / 8 * 8;
 }
 
-static int64_t ringbuf_process_ring(struct ring *r, size_t n)
+static int64_t ringbuf_process_normal_ring(struct ring *r, size_t n)
 {
 	int *len_ptr, len, err;
 	/* 64-bit to avoid overflow in case of extreme application behavior */
@@ -278,6 +293,92 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	return cnt;
 }
 
+static int64_t ringbuf_process_overwrite_ring(struct ring *r, size_t n)
+{
+
+	int err;
+	uint32_t *len_ptr, len;
+	/* 64-bit to avoid overflow in case of extreme application behavior */
+	int64_t cnt = 0;
+	size_t size, offset;
+	unsigned long cons_pos, prod_pos, over_pos, tmp_pos;
+	bool got_new_data;
+	void *sample;
+	bool copied;
+
+	size = r->mask + 1;
+
+	cons_pos = smp_load_acquire(r->consumer_pos);
+	do {
+		got_new_data = false;
+
+		/* grab a copy of data */
+		prod_pos = smp_load_acquire(r->producer_pos);
+		do {
+			over_pos = READ_ONCE(*r->overwrite_pos);
+			/* prod_pos may be outdated now */
+			if (over_pos < prod_pos) {
+				tmp_pos = max(cons_pos, over_pos);
+				/* smp_load_acquire(r->producer_pos) before
+				 * READ_ONCE(*r->overwrite_pos) ensures that
+				 * over_pos + r->mask < prod_pos never occurs,
+				 * so size is never larger than r->mask
+				 */
+				size = prod_pos - tmp_pos;
+				if (!size)
+					goto done;
+				memcpy(r->read_buffer,
+				       r->data + (tmp_pos & r->mask), size);
+				copied = true;
+			} else {
+				copied = false;
+			}
+			prod_pos = smp_load_acquire(r->producer_pos);
+		/* retry if data is overwritten by producer */
+		} while (!copied || prod_pos - tmp_pos > r->mask);
+
+		cons_pos = tmp_pos;
+
+		for (offset = 0; offset < size; offset += roundup_len(len)) {
+			len_ptr = r->read_buffer + (offset & r->mask);
+			len = *len_ptr;
+
+			if (len & BPF_RINGBUF_BUSY_BIT)
+				goto done;
+
+			got_new_data = true;
+			cons_pos += roundup_len(len);
+
+			if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) {
+				sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ;
+				err = r->sample_cb(r->ctx, sample, len);
+				if (err < 0) {
+					/* update consumer pos and bail out */
+					smp_store_release(r->consumer_pos,
+							  cons_pos);
+					return err;
+				}
+				cnt++;
+			}
+
+			if (cnt >= n)
+				goto done;
+		}
+	} while (got_new_data);
+
+done:
+	smp_store_release(r->consumer_pos, cons_pos);
+	return cnt;
+}
+
+static int64_t ringbuf_process_ring(struct ring *r, size_t n)
+{
+	if (likely(!r->overwrite_mode))
+		return ringbuf_process_normal_ring(r, n);
+	else
+		return ringbuf_process_overwrite_ring(r, n);
+}
+
 /* Consume available ring buffer(s) data without event polling, up to n
  * records.
  *
-- 
2.43.0

From nobody Sun Oct 5 12:23:57 2025
Received: from k-arm6401.huawei.com (unknown [7.217.19.243])
	by APP4 (Coremail) with SMTP id gCh0CgAX4BBsGpBoTUL9CQ--.242S5;
	Mon, 04 Aug 2025 10:27:08 +0800 (CST)
From: Xu Kuohai
To: bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, Stanislav Fomichev, Willem de Bruijn,
	Jason Xing, Paul Chaignon, Tao Chen, Kumar Kartikeya Dwivedi,
	Martin Kelly
Subject: [PATCH bpf-next 3/4] selftests/bpf: Add test for overwrite ring buffer
Date: Mon, 4 Aug 2025 10:20:59 +0800
Message-ID: <20250804022101.2171981-4-xukuohai@huaweicloud.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
References: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Xu Kuohai

Add test for overwrite mode ring buffer.

Signed-off-by: Xu Kuohai
---
 tools/testing/selftests/bpf/Makefile               |  3 +-
 .../selftests/bpf/prog_tests/ringbuf.c             | 74 ++++++++++++++
 .../bpf/progs/test_ringbuf_overwrite.c             | 98 +++++++++++++++++++
 3 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_ringbuf_overwrite.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 4863106034df..8a3796a2e5f5 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -499,7 +499,8 @@ LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h \
 LSKELS := fentry_test.c fexit_test.c fexit_sleep.c atomics.c \
 	trace_printk.c trace_vprintk.c map_ptr_kern.c \
 	core_kern.c core_kern_overflow.c test_ringbuf.c \
-	test_ringbuf_n.c test_ringbuf_map_key.c test_ringbuf_write.c
+	test_ringbuf_n.c test_ringbuf_map_key.c test_ringbuf_write.c \
+	test_ringbuf_overwrite.c
 
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test.c \
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index d1e4cb28a72c..205a51c725a7 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -17,6 +17,7 @@
 #include "test_ringbuf_n.lskel.h"
 #include "test_ringbuf_map_key.lskel.h"
 #include "test_ringbuf_write.lskel.h"
+#include "test_ringbuf_overwrite.lskel.h"
 
 #define EDONE 7777
 
@@ -497,6 +498,77 @@ static void ringbuf_map_key_subtest(void)
 	test_ringbuf_map_key_lskel__destroy(skel_map_key);
 }
 
+static void ringbuf_overwrite_mode_subtest(void)
+{
+	unsigned long size, len1, len2, len3, len4, len5;
+	unsigned long expect_avail_data, expect_prod_pos, expect_over_pos;
+	struct test_ringbuf_overwrite_lskel *skel;
+	int err;
+
+	skel = test_ringbuf_overwrite_lskel__open();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return;
+
+	size = 0x1000;
+	len1 = 0x800;
+	len2 = 0x400;
+	len3 = size - len1 - len2 - BPF_RINGBUF_HDR_SZ * 3; /* 0x3e8 */
+	len4 = len3 - 8; /* 0x3e0 */
+	len5 = len3; /* retry with len3 */
+
+	skel->maps.ringbuf.max_entries = size;
+	skel->rodata->LEN1 = len1;
+	skel->rodata->LEN2 = len2;
+	skel->rodata->LEN3 = len3;
+	skel->rodata->LEN4 = len4;
+	skel->rodata->LEN5 = len5;
+
+	skel->bss->pid = getpid();
+
+	err = test_ringbuf_overwrite_lskel__load(skel);
+	if (!ASSERT_OK(err, "skel_load"))
+		goto cleanup;
+
+	err = test_ringbuf_overwrite_lskel__attach(skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto cleanup;
+
+	syscall(__NR_getpgid);
+
+	ASSERT_EQ(skel->bss->reserve1_fail, 0, "reserve 1");
+	ASSERT_EQ(skel->bss->reserve2_fail, 0, "reserve 2");
+	ASSERT_EQ(skel->bss->reserve3_fail, 1, "reserve 3");
+	ASSERT_EQ(skel->bss->reserve4_fail, 0, "reserve 4");
+	ASSERT_EQ(skel->bss->reserve5_fail, 0, "reserve 5");
+
+	CHECK(skel->bss->ring_size != size,
+	      "check_ring_size", "exp %lu, got %lu\n",
+	      size, skel->bss->ring_size);
+
+	expect_avail_data = len2 + len4 + len5 + 3 * BPF_RINGBUF_HDR_SZ;
+	CHECK(skel->bss->avail_data != expect_avail_data,
+	      "check_avail_size", "exp %lu, got %lu\n",
+	      expect_avail_data, skel->bss->avail_data);
+
+	CHECK(skel->bss->cons_pos != 0,
+	      "check_cons_pos", "exp 0, got %lu\n",
+	      skel->bss->cons_pos);
+
+	expect_prod_pos = len1 + len2 + len4 + len5 + 4 * BPF_RINGBUF_HDR_SZ;
+	CHECK(skel->bss->prod_pos != expect_prod_pos,
+	      "check_prod_pos", "exp %lu, got %lu\n",
+	      expect_prod_pos, skel->bss->prod_pos);
+
+	expect_over_pos = len1 + BPF_RINGBUF_HDR_SZ;
+	CHECK(skel->bss->over_pos != expect_over_pos,
+	      "check_over_pos", "exp %lu, got %lu\n",
+	      (unsigned long)expect_over_pos, skel->bss->over_pos);
+
+	test_ringbuf_overwrite_lskel__detach(skel);
+cleanup:
+	test_ringbuf_overwrite_lskel__destroy(skel);
+}
+
 void test_ringbuf(void)
 {
 	if (test__start_subtest("ringbuf"))
@@ -507,4 +579,6 @@ void test_ringbuf(void)
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))
 		ringbuf_write_subtest();
+	if (test__start_subtest("ringbuf_overwrite_mode"))
+		ringbuf_overwrite_mode_subtest();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf_overwrite.c b/tools/testing/selftests/bpf/progs/test_ringbuf_overwrite.c
new file mode 100644
index 000000000000..da89ba12a75c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_ringbuf_overwrite.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025. Huawei Technologies Co., Ltd */
+
+#include
+#include
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_RINGBUF);
+	__uint(map_flags, BPF_F_OVERWRITE);
+} ringbuf SEC(".maps");
+
+int pid;
+
+const volatile unsigned long LEN1;
+const volatile unsigned long LEN2;
+const volatile unsigned long LEN3;
+const volatile unsigned long LEN4;
+const volatile unsigned long LEN5;
+
+long reserve1_fail = 0;
+long reserve2_fail = 0;
+long reserve3_fail = 0;
+long reserve4_fail = 0;
+long reserve5_fail = 0;
+
+unsigned long avail_data = 0;
+unsigned long ring_size = 0;
+unsigned long cons_pos = 0;
+unsigned long prod_pos = 0;
+unsigned long over_pos = 0;
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int test_overwrite_ringbuf(void *ctx)
+{
+	char *rec1, *rec2, *rec3, *rec4, *rec5;
+	int cur_pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (cur_pid != pid)
+		return 0;
+
+	rec1 = bpf_ringbuf_reserve(&ringbuf, LEN1, 0);
+	if (!rec1) {
+		reserve1_fail = 1;
+		return 0;
+	}
+
+	rec2 = bpf_ringbuf_reserve(&ringbuf, LEN2, 0);
+	if (!rec2) {
+		bpf_ringbuf_discard(rec1, 0);
+		reserve2_fail = 1;
+		return 0;
+	}
+
+	rec3 = bpf_ringbuf_reserve(&ringbuf, LEN3, 0);
+	/* expect failure */
+	if (!rec3) {
+		reserve3_fail = 1;
+	} else {
+		bpf_ringbuf_discard(rec1, 0);
+		bpf_ringbuf_discard(rec2, 0);
+		bpf_ringbuf_discard(rec3, 0);
+		return 0;
+	}
+
+	rec4 = bpf_ringbuf_reserve(&ringbuf, LEN4, 0);
+	if (!rec4) {
+		reserve4_fail = 1;
+		bpf_ringbuf_discard(rec1, 0);
+		bpf_ringbuf_discard(rec2, 0);
+		return 0;
+	}
+
+	bpf_ringbuf_submit(rec1, 0);
+	bpf_ringbuf_submit(rec2, 0);
+	bpf_ringbuf_submit(rec4, 0);
+
+	rec5 = bpf_ringbuf_reserve(&ringbuf, LEN5, 0);
+	if (!rec5) {
+		reserve5_fail = 1;
+		return 0;
+	}
+
+	for (int i = 0; i < LEN3; i++)
+		rec5[i] = 0xdd;
+
+	bpf_ringbuf_submit(rec5, 0);
+
+	ring_size = bpf_ringbuf_query(&ringbuf, BPF_RB_RING_SIZE);
+	avail_data = bpf_ringbuf_query(&ringbuf, BPF_RB_AVAIL_DATA);
+	cons_pos = bpf_ringbuf_query(&ringbuf, BPF_RB_CONS_POS);
+	prod_pos = bpf_ringbuf_query(&ringbuf, BPF_RB_PROD_POS);
+	over_pos = bpf_ringbuf_query(&ringbuf, BPF_RB_OVER_POS);
+
+	return 0;
+}
-- 
2.43.0

From nobody Sun Oct 5 12:23:57 2025
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8464720127B;
	Mon, 4 Aug 2025 02:27:11 +0000 (UTC)
Received: from mail.maildlp.com (unknown [172.19.93.142])
	by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bwL6k4KDNzYQtxF;
	Mon, 4 Aug 2025 10:27:10 +0800 (CST)
Received: from mail02.huawei.com (unknown [10.116.40.128])
	by mail.maildlp.com (Postfix) with ESMTP
	id 40CCA1A018D; Mon, 4 Aug 2025 10:27:09 +0800 (CST)
Received: from k-arm6401.huawei.com (unknown [7.217.19.243])
	by APP4 (Coremail) with SMTP id gCh0CgAX4BBsGpBoTUL9CQ--.242S6;
	Mon, 04 Aug 2025 10:27:08 +0800 (CST)
From: Xu Kuohai
To: bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, Stanislav Fomichev, Willem de Bruijn,
	Jason Xing, Paul Chaignon, Tao Chen, Kumar Kartikeya Dwivedi,
	Martin Kelly
Subject: [PATCH bpf-next 4/4] selftests/bpf/benchs: Add overwrite mode bench for rb-libbpf
Date: Mon, 4 Aug 2025 10:21:00 +0800
Message-ID: <20250804022101.2171981-5-xukuohai@huaweicloud.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
References: <20250804022101.2171981-1-xukuohai@huaweicloud.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Xu Kuohai

Add overwrite mode bench for ring buffer.

For reference, below are bench numbers collected from x86_64 and arm64.

- x86_64 (AMD EPYC 9654)

Ringbuf, multi-producer contention, overwrite mode
==================================================
rb-libbpf nr_prod 1  14.970 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2  14.064 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3   7.493 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4   6.575 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8   3.696 ± 0.011M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12  2.612 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16  2.335 ± 0.005M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20  2.079 ± 0.005M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24  1.965 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28  1.846 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32  1.790 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36  1.735 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40  1.701 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44  1.669 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48  1.749 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52  1.709 ± 0.001M/s (drops 0.000 ± 0.000M/s)

- arm64 (HiSilicon Kunpeng 920)

Ringbuf, multi-producer contention, overwrite mode
==================================================
rb-libbpf nr_prod 1  10.319 ± 0.231M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2   9.219 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3   6.699 ± 0.013M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4   4.608 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8   3.905 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12  3.282 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16  3.182 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20  3.029 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24  3.116 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28  2.869 ± 0.005M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32  3.075 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36  2.795 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40  2.947 ± 0.005M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44  2.748 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48  2.767 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52  2.858 ± 0.002M/s (drops 0.000 ± 0.000M/s)

Signed-off-by: Xu Kuohai
---
 .../selftests/bpf/benchs/bench_ringbufs.c     | 22 ++++++++++++++++++-
 .../bpf/benchs/run_bench_ringbufs.sh          |  4 ++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c
index e1ee979e6acc..6fdfc61c721b 100644
--- a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c
+++ b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c
@@ -19,6 +19,7 @@ static struct {
 	int ringbuf_sz; /* per-ringbuf, in bytes */
 	bool ringbuf_use_output; /* use slower output API */
 	int perfbuf_sz; /* per-CPU size, in pages */
+	bool overwrite_mode;
 } args = {
 	.back2back = false,
 	.batch_cnt = 500,
@@ -27,6 +28,7 @@ static struct {
 	.ringbuf_sz = 512 * 1024,
 	.ringbuf_use_output = false,
 	.perfbuf_sz = 128,
+	.overwrite_mode = false,
 };
 
 enum {
@@ -35,6 +37,7 @@ enum {
 	ARG_RB_BATCH_CNT = 2002,
 	ARG_RB_SAMPLED = 2003,
 	ARG_RB_SAMPLE_RATE = 2004,
+	ARG_RB_OVERWRITE = 2005,
 };
 
 static const struct argp_option opts[] = {
@@ -43,6 +46,7 @@ static const struct argp_option opts[] = {
 	{ "rb-batch-cnt", ARG_RB_BATCH_CNT, "CNT", 0, "Set BPF-side record batch count"},
 	{ "rb-sampled", ARG_RB_SAMPLED, NULL, 0, "Notification sampling"},
 	{ "rb-sample-rate", ARG_RB_SAMPLE_RATE, "RATE", 0, "Notification sample rate"},
+	{ "rb-overwrite", ARG_RB_OVERWRITE, NULL, 0, "overwrite mode"},
 	{},
 };
 
@@ -72,6 +76,9 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
 			argp_usage(state);
 		}
 		break;
+	case ARG_RB_OVERWRITE:
+		args.overwrite_mode = true;
+		break;
 	default:
 		return ARGP_ERR_UNKNOWN;
 	}
@@ -104,6 +111,11 @@ static void bufs_validate(void)
 		fprintf(stderr, "back-to-back mode makes sense only for single-producer case!\n");
 		exit(1);
 	}
+
+	if (args.overwrite_mode && strcmp(env.bench_name, "rb-libbpf") != 0) {
+		fprintf(stderr, "rb-overwrite mode only supports rb-libbpf!\n");
+		exit(1);
+	}
 }
 
 static void *bufs_sample_producer(void *input)
@@ -134,6 +146,8 @@ static void ringbuf_libbpf_measure(struct bench_res *res)
 
 static struct ringbuf_bench *ringbuf_setup_skeleton(void)
 {
+	__u32 flags;
+	struct bpf_map *ringbuf;
 	struct ringbuf_bench *skel;
 
 	setup_libbpf();
@@ -151,7 +165,13 @@ static struct ringbuf_bench *ringbuf_setup_skeleton(void)
 	/* record data + header take 16 bytes */
 	skel->rodata->wakeup_data_size = args.sample_rate * 16;
 
-	bpf_map__set_max_entries(skel->maps.ringbuf, args.ringbuf_sz);
+	ringbuf = skel->maps.ringbuf;
+	if (args.overwrite_mode) {
+		flags = bpf_map__map_flags(ringbuf) | BPF_F_OVERWRITE;
+		bpf_map__set_map_flags(ringbuf, flags);
+	}
+
+	bpf_map__set_max_entries(ringbuf, args.ringbuf_sz);
 
 	if (ringbuf_bench__load(skel)) {
 		fprintf(stderr, "failed to load skeleton\n");
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
index 91e3567962ff..4e758bc52b73 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
@@ -49,3 +49,7 @@ for b in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
 	summarize "rb-libbpf nr_prod $b" "$($RUN_RB_BENCH -p$b --rb-batch-cnt 50 rb-libbpf)"
 done
 
+header "Ringbuf, multi-producer contention, overwrite mode"
+for b in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
+	summarize "rb-libbpf nr_prod $b" "$($RUN_RB_BENCH -p$b --rb-overwrite --rb-batch-cnt 50 rb-libbpf)"
+done
-- 
2.43.0
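
Editorial note on patch 3/4: the expected values checked by ringbuf_overwrite_mode_subtest() can be worked out by hand. The sketch below is not part of the series. It repeats the arithmetic from the selftest in plain userspace C, assuming BPF_RINGBUF_HDR_SZ is 8 bytes (which is what the "/* 0x3e8 */" comment in the test implies). The names HDR_SZ, struct expected and compute_expected() are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Assumption: mirrors BPF_RINGBUF_HDR_SZ; the value 8 is inferred from the
 * "0x3e8" comment in the selftest, not taken from a header here.
 */
#define HDR_SZ 8UL

/* Hypothetical container for the values the selftest compares against. */
struct expected {
	unsigned long len3, len4, len5;
	unsigned long avail_data, prod_pos, over_pos;
};

/* Repeat the selftest's arithmetic for a given ring size and the first
 * two record lengths.
 */
static struct expected compute_expected(unsigned long size,
					unsigned long len1,
					unsigned long len2)
{
	struct expected e;

	/* The ring must have room for three headers plus a non-empty len4. */
	assert(size > len1 + len2 + 3 * HDR_SZ + 8);

	e.len3 = size - len1 - len2 - 3 * HDR_SZ; /* reservation expected to fail */
	e.len4 = e.len3 - 8;                      /* slightly smaller record */
	e.len5 = e.len3;                          /* retry with len3 */

	/* Same expressions as expect_avail_data, expect_prod_pos and
	 * expect_over_pos in the selftest.
	 */
	e.avail_data = len2 + e.len4 + e.len5 + 3 * HDR_SZ;
	e.prod_pos = len1 + len2 + e.len4 + e.len5 + 4 * HDR_SZ;
	e.over_pos = len1 + HDR_SZ;

	return e;
}
```

With the sizes used in the test (size 0x1000, len1 0x800, len2 0x400) this gives len3 = 0x3e8, len4 = 0x3e0, avail_data = 0xbe0, prod_pos = 0x13e8 and over_pos = 0x808. For these inputs prod_pos - over_pos equals avail_data, which fits the test's expectation that the first record has been overwritten while cons_pos is still 0.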