From: Chengkaitao
To: martin.lau@linux.dev, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com, jolsa@kernel.org, shuah@kernel.org, chengkaitao@kylinos.cn, linux-kselftest@vger.kernel.org
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/6] bpf: Introduce the bpf_list_del kfunc.
Date: Mon, 2 Mar 2026 20:40:23 +0800
Message-ID: <20260302124028.82420-2-pilgrimtao@gmail.com>
In-Reply-To: <20260302124028.82420-1-pilgrimtao@gmail.com>
References: <20260302124028.82420-1-pilgrimtao@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Kaitao Cheng

Introduce a bpf_list_del() kfunc. If a program holds ownership of a node
in the middle of a list, it can remove that node directly, without being
restricted to deleting from the head or the tail.

When a kfunc takes only a bpf_list_node parameter (and no bpf_list_head),
supplement the initialization of the corresponding btf_field so the
verifier can still resolve the node's container type. Add a new lock_rec
member to struct bpf_reference_state for lock-holding detection: the
locked object's btf_record is recorded at lock time, so bpf_list_del can
verify that the lock protecting the node's list is actually held.

bpf_list_del is typically paired with bpf_refcount. After calling it, the
program generally needs to drop the reference to the list node twice to
avoid a reference-count leak.
Signed-off-by: Kaitao Cheng
---
 include/linux/bpf_verifier.h |  4 +++
 kernel/bpf/btf.c             | 33 +++++++++++++++++++---
 kernel/bpf/helpers.c         | 17 ++++++++++++
 kernel/bpf/verifier.c        | 54 ++++++++++++++++++++++++++++++++++--
 4 files changed, 101 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index ef8e45a362d9..e1358b62d6cc 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -261,6 +261,10 @@ struct bpf_reference_state {
 	 * it matches on unlock.
 	 */
 	void *ptr;
+	/* For REF_TYPE_LOCK_*: btf_record of the locked object, used for lock
+	 * checking in kfuncs such as bpf_list_del.
+	 */
+	struct btf_record *lock_rec;
 };
 
 struct bpf_retval_range {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 4872d2a6c42d..8a977c793d56 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3785,7 +3785,6 @@ static int btf_find_field_one(const struct btf *btf,
 	case BPF_RES_SPIN_LOCK:
 	case BPF_TIMER:
 	case BPF_WORKQUEUE:
-	case BPF_LIST_NODE:
 	case BPF_RB_NODE:
 	case BPF_REFCOUNT:
 	case BPF_TASK_WORK:
@@ -3794,6 +3793,27 @@ static int btf_find_field_one(const struct btf *btf,
 		if (ret < 0)
 			return ret;
 		break;
+	case BPF_LIST_NODE:
+		ret = btf_find_struct(btf, var_type, off, sz, field_type,
+				      info_cnt ? &info[0] : &tmp);
+		if (ret < 0)
+			return ret;
+		/* graph_root for verifier: container type and node member name */
+		if (info_cnt && var_idx >= 0 && (u32)var_idx < btf_type_vlen(var)) {
+			u32 id;
+			const struct btf_member *member;
+
+			for (id = 1; id < btf_nr_types(btf); id++) {
+				if (btf_type_by_id(btf, id) == var) {
+					info[0].graph_root.value_btf_id = id;
+					member = btf_type_member(var) + var_idx;
+					info[0].graph_root.node_name =
+						__btf_name_by_offset(btf, member->name_off);
+					break;
+				}
+			}
+		}
+		break;
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
@@ -4138,6 +4158,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		if (ret < 0)
 			goto end;
 		break;
+	case BPF_LIST_NODE:
 	case BPF_LIST_HEAD:
 		ret = btf_parse_list_head(btf, &rec->fields[i], &info_arr[i]);
 		if (ret < 0)
@@ -4148,7 +4169,6 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		if (ret < 0)
 			goto end;
 		break;
-	case BPF_LIST_NODE:
 	case BPF_RB_NODE:
 		break;
 	default:
@@ -4192,20 +4212,25 @@ int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 	int i;
 
 	/* There are three types that signify ownership of some other type:
-	 * kptr_ref, bpf_list_head, bpf_rb_root.
+	 * kptr_ref, bpf_list_head/node, bpf_rb_root.
 	 * kptr_ref only supports storing kernel types, which can't store
 	 * references to program allocated local types.
 	 *
 	 * Hence we only need to ensure that bpf_{list_head,rb_root} ownership
 	 * does not form cycles.
 	 */
-	if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & (BPF_GRAPH_ROOT | BPF_UPTR)))
+	if (IS_ERR_OR_NULL(rec) || !(rec->field_mask &
+	    (BPF_GRAPH_ROOT | BPF_GRAPH_NODE | BPF_UPTR)))
 		return 0;
+
 	for (i = 0; i < rec->cnt; i++) {
 		struct btf_struct_meta *meta;
 		const struct btf_type *t;
 		u32 btf_id;
 
+		if (rec->fields[i].type & BPF_GRAPH_NODE)
+			rec->fields[i].graph_root.value_rec = rec;
+
 		if (rec->fields[i].type == BPF_UPTR) {
 			/* The uptr only supports pinning one page and cannot
 			 * point to a kernel struct
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 6eb6c82ed2ee..577af62a9f7a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2459,6 +2459,22 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
 	return __bpf_list_del(head, true);
 }
 
+__bpf_kfunc struct bpf_list_node *bpf_list_del(struct bpf_list_node *node)
+{
+	struct bpf_list_node_kern *knode = (struct bpf_list_node_kern *)node;
+
+	if (unlikely(!knode))
+		return NULL;
+
+	if (WARN_ON_ONCE(!READ_ONCE(knode->owner)))
+		return NULL;
+
+	list_del_init(&knode->list_head);
+	WRITE_ONCE(knode->owner, NULL);
+
+	return node;
+}
+
 __bpf_kfunc struct bpf_list_node *bpf_list_front(struct bpf_list_head *head)
 {
 	struct list_head *h = (struct list_head *)head;
@@ -4545,6 +4561,7 @@ BTF_ID_FLAGS(func, bpf_list_push_front_impl)
 BTF_ID_FLAGS(func, bpf_list_push_back_impl)
 BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_list_front, KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_list_back, KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a3390190c26e..8a782772dd36 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1536,7 +1536,7 @@ static int acquire_reference(struct bpf_verifier_env *env, int insn_idx)
 }
 
 static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum ref_state_type type,
-			      int id, void *ptr)
+			      int id, void *ptr, struct btf_record *lock_rec)
 {
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_reference_state *s;
@@ -1547,6 +1547,7 @@ static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum r
 	s->type = type;
 	s->id = id;
 	s->ptr = ptr;
+	s->lock_rec = lock_rec;
 
 	state->active_locks++;
 	state->active_lock_id = id;
@@ -1662,6 +1663,23 @@ static struct bpf_reference_state *find_lock_state(struct bpf_verifier_state *st
 	return NULL;
 }
 
+static bool rec_has_list_matching_node_type(struct bpf_verifier_env *env,
+					    const struct btf_record *rec,
+					    const struct btf *node_btf, u32 node_btf_id)
+{
+	u32 i;
+
+	for (i = 0; i < rec->cnt; i++) {
+		if (!(rec->fields[i].type & BPF_LIST_HEAD))
+			continue;
+		if (btf_struct_ids_match(&env->log, node_btf, node_btf_id, 0,
+					 rec->fields[i].graph_root.btf,
+					 rec->fields[i].graph_root.value_btf_id, true))
+			return true;
+	}
+	return false;
+}
+
 static void update_peak_states(struct bpf_verifier_env *env)
 {
 	u32 cur_states;
@@ -8576,7 +8594,8 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, int flags)
 		type = REF_TYPE_RES_LOCK;
 	else
 		type = REF_TYPE_LOCK;
-	err = acquire_lock_state(env, env->insn_idx, type, reg->id, ptr);
+	err = acquire_lock_state(env, env->insn_idx, type, reg->id, ptr,
+				 reg_btf_record(reg));
 	if (err < 0) {
 		verbose(env, "Failed to acquire lock state\n");
 		return err;
@@ -12431,6 +12450,7 @@ enum special_kfunc_type {
 	KF_bpf_list_push_back_impl,
 	KF_bpf_list_pop_front,
 	KF_bpf_list_pop_back,
+	KF_bpf_list_del,
 	KF_bpf_list_front,
 	KF_bpf_list_back,
 	KF_bpf_cast_to_kern_ctx,
@@ -12491,6 +12511,7 @@ BTF_ID(func, bpf_list_push_front_impl)
 BTF_ID(func, bpf_list_push_back_impl)
 BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
+BTF_ID(func, bpf_list_del)
 BTF_ID(func, bpf_list_front)
 BTF_ID(func, bpf_list_back)
 BTF_ID(func, bpf_cast_to_kern_ctx)
@@ -12966,6 +12987,7 @@ static bool is_bpf_list_api_kfunc(u32 btf_id)
 	       btf_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
 	       btf_id == special_kfunc_list[KF_bpf_list_pop_front] ||
 	       btf_id == special_kfunc_list[KF_bpf_list_pop_back] ||
+	       btf_id == special_kfunc_list[KF_bpf_list_del] ||
 	       btf_id == special_kfunc_list[KF_bpf_list_front] ||
 	       btf_id == special_kfunc_list[KF_bpf_list_back];
 }
@@ -13088,7 +13110,8 @@ static bool check_kfunc_is_graph_node_api(struct bpf_verifier_env *env,
 	switch (node_field_type) {
 	case BPF_LIST_NODE:
 		ret = (kfunc_btf_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
-		       kfunc_btf_id == special_kfunc_list[KF_bpf_list_push_back_impl]);
+		       kfunc_btf_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
+		       kfunc_btf_id == special_kfunc_list[KF_bpf_list_del]);
 		break;
 	case BPF_RB_NODE:
 		ret = (kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_remove] ||
@@ -13211,6 +13234,9 @@ __process_kf_arg_ptr_to_graph_node(struct bpf_verifier_env *env,
 		return -EINVAL;
 	}
 
+	if (!*node_field)
+		*node_field = field;
+
 	field = *node_field;
 
 	et = btf_type_by_id(field->graph_root.btf, field->graph_root.value_btf_id);
@@ -13237,6 +13263,28 @@ __process_kf_arg_ptr_to_graph_node(struct bpf_verifier_env *env,
 		return -EINVAL;
 	}
 
+	/* bpf_list_del: require list head's lock. Use refs[] REF_TYPE_LOCK_MASK
+	 * only. At lock time we stored the locked object's btf_record in ref->
+	 * lock_rec, so we can get the list value type from the ref directly.
+	 */
+	if (node_field_type == BPF_LIST_NODE &&
+	    meta->func_id == special_kfunc_list[KF_bpf_list_del]) {
+		struct bpf_verifier_state *cur = env->cur_state;
+
+		for (int i = 0; i < cur->acquired_refs; i++) {
+			struct bpf_reference_state *s = &cur->refs[i];
+
+			if (!(s->type & REF_TYPE_LOCK_MASK) || !s->lock_rec)
+				continue;
+
+			if (rec_has_list_matching_node_type(env, s->lock_rec,
+							    reg->btf, reg->btf_id))
+				return 0;
+		}
+		verbose(env, "bpf_spin_lock must be held for bpf_list_del\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }

-- 
2.50.1 (Apple Git-155)