From: Alexey Nepomnyashih
To: stable@vger.kernel.org, Greg Kroah-Hartman
Cc: Alexey Nepomnyashih, Alexei Starovoitov, Daniel Borkmann,
    John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
    Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
    bpf@vger.kernel.org, "Paul E. McKenney", Frederic Weisbecker,
    Neeraj Upadhyay, Josh Triplett, Steven Rostedt, Mathieu Desnoyers,
    Lai Jiangshan, Joel Fernandes, rcu@vger.kernel.org,
    linux-kernel@vger.kernel.org, lvc-project@linuxtesting.org, Hou Tao
Subject: [PATCH 6.1 09/16] bpf: Change bpf_mem_cache draining process.
Date: Sun, 2 Feb 2025 07:46:46 +0000
Message-ID: <20250202074709.932174-10-sdl@nppct.ru>
In-Reply-To: <20250202074709.932174-1-sdl@nppct.ru>
References: <20250202074709.932174-1-sdl@nppct.ru>

From: Alexei Starovoitov

commit d114dde245f9115b73756203b03a633a6fc1b36a upstream.

The next patch will introduce cross-cpu llist access and existing
irq_work_sync() + drain_mem_cache() + rcu_barrier_tasks_trace() mechanism will
not be enough, since irq_work_sync() + drain_mem_cache() on cpu A won't
guarantee that llist on cpu A are empty. The free_bulk() on cpu B might add
objects back to llist of cpu A. Add 'bool draining' flag.
The modified sequence looks like:
for_each_cpu:
  WRITE_ONCE(c->draining, true); // do_call_rcu_ttrace() won't be doing call_rcu() any more
  irq_work_sync(); // wait for irq_work callback (free_bulk) to finish
  drain_mem_cache(); // free all objects
rcu_barrier_tasks_trace(); // wait for RCU callbacks to execute

Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
Acked-by: Hou Tao
Link: https://lore.kernel.org/bpf/20230706033447.54696-8-alexei.starovoitov@gmail.com
Signed-off-by: Alexey Nepomnyashih
---
 kernel/bpf/memalloc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index bbd3fa2bf119..16a57cc4992c 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -98,6 +98,7 @@ struct bpf_mem_cache {
 	int free_cnt;
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
+	bool draining;
 
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
@@ -301,6 +302,12 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 		 * from __free_rcu() and from drain_mem_cache().
 		 */
 		__llist_add(llnode, &c->waiting_for_gp_ttrace);
+
+	if (unlikely(READ_ONCE(c->draining))) {
+		__free_rcu(&c->rcu_ttrace);
+		return;
+	}
+
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
@@ -538,15 +545,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 		rcu_in_progress = 0;
 		for_each_possible_cpu(cpu) {
 			c = per_cpu_ptr(ma->cache, cpu);
-			/*
-			 * refill_work may be unfinished for PREEMPT_RT kernel
-			 * in which irq work is invoked in a per-CPU RT thread.
-			 * It is also possible for kernel with
-			 * arch_irq_work_has_interrupt() being false and irq
-			 * work is invoked in timer interrupt. So waiting for
-			 * the completion of irq work to ease the handling of
-			 * concurrency.
-			 */
+			WRITE_ONCE(c->draining, true);
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
@@ -562,6 +561,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 				cc = per_cpu_ptr(ma->caches, cpu);
 				for (i = 0; i < NUM_CACHES; i++) {
 					c = &cc->cache[i];
+					WRITE_ONCE(c->draining, true);
 					irq_work_sync(&c->refill_work);
 					drain_mem_cache(c);
 					rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
-- 
2.43.0
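
For readers who want to experiment with the ordering outside the kernel, the
draining sequence from the commit message can be modelled in user space. This
is only a rough, single-threaded sketch and not the kernel implementation:
toy_cache, toy_obj, toy_retire() and toy_destroy() are hypothetical names,
relaxed C11 atomics stand in for READ_ONCE()/WRITE_ONCE(), a counter stands in
for call_rcu_tasks_trace(), and irq_work_sync() has no equivalent here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object sitting on an llist. */
struct toy_obj {
	struct toy_obj *next;
};

/* Hypothetical, heavily reduced stand-in for struct bpf_mem_cache. */
struct toy_cache {
	atomic_bool draining;    /* models the new 'bool draining' field */
	struct toy_obj *waiting; /* models waiting_for_gp_ttrace */
	int deferred;            /* times a deferred free was "queued" */
	int freed;               /* objects actually freed so far */
};

static void toy_cache_init(struct toy_cache *c)
{
	atomic_init(&c->draining, false);
	c->waiting = NULL;
	c->deferred = 0;
	c->freed = 0;
}

static struct toy_obj *toy_obj_new(void)
{
	return calloc(1, sizeof(struct toy_obj));
}

/* Free everything on the waiting list right away. */
static void toy_free_waiting(struct toy_cache *c)
{
	struct toy_obj *o = c->waiting;

	c->waiting = NULL;
	while (o) {
		struct toy_obj *next = o->next;

		free(o);
		c->freed++;
		o = next;
	}
}

/*
 * Models the shape of do_call_rcu_ttrace() after the patch: once the
 * cache is draining, free directly instead of queueing more deferred work.
 */
static void toy_retire(struct toy_cache *c, struct toy_obj *o)
{
	if (!o)
		return;
	o->next = c->waiting;
	c->waiting = o;

	if (atomic_load_explicit(&c->draining, memory_order_relaxed)) {
		toy_free_waiting(c);
		return;
	}
	c->deferred++; /* stand-in for call_rcu_tasks_trace() */
}

/* Models the per-cache steps of the modified destroy sequence. */
static void toy_destroy(struct toy_cache *c)
{
	/* Step 1: set the flag first so later retires stop deferring. */
	atomic_store_explicit(&c->draining, true, memory_order_relaxed);
	/* Step 2: the kernel calls irq_work_sync() here; not modelled. */
	/* Step 3: drain, i.e. free all objects still held by the cache. */
	toy_free_waiting(c);
}
```

The point of the model is the ordering: an object retired after toy_destroy()
has set the flag is freed on the spot, so nothing is left behind on the list,
which is the property the patch needs once another CPU can add objects back.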