From: Jianyun Gao
To: linux-kernel@vger.kernel.org
Cc: Jianyun Gao,
	Andrii Nakryiko,
	Eduard Zingerman,
	Alexei Starovoitov,
	Daniel Borkmann,
	Martin KaFai Lau,
	Song Liu,
	Yonghong Song,
	John Fastabend,
	KP Singh,
	Stanislav Fomichev,
	Hao Luo,
	Jiri Olsa,
	bpf@vger.kernel.org (open list:BPF [LIBRARY] (libbpf))
Subject: [PATCH v2 1/5] libbpf: Add doxygen documentation for bpf_map_* APIs in bpf.h
Date: Fri, 31 Oct 2025 15:59:03 +0800
Message-Id: <20251031075908.1472249-2-jianyungao89@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251031075908.1472249-1-jianyungao89@gmail.com>
References: <20251031075908.1472249-1-jianyungao89@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add doxygen comment blocks for all public bpf_map_* APIs in
tools/lib/bpf/bpf.h. These doc comments are for:

-bpf_map_create()
-bpf_map_update_elem()
-bpf_map_lookup_elem()
-bpf_map_lookup_elem_flags()
-bpf_map_lookup_and_delete_elem()
-bpf_map_lookup_and_delete_elem_flags()
-bpf_map_delete_elem()
-bpf_map_delete_elem_flags()
-bpf_map_get_next_key()
-bpf_map_freeze()
-bpf_map_get_next_id()
-bpf_map_get_fd_by_id()
-bpf_map_get_fd_by_id_opts()

Signed-off-by: Jianyun Gao
---
v1->v2:
- Refined bpf_map_* return value docs: explicit non-negative success vs
  negative -errno failures.
- Fixed the non-ASCII characters in this patch.
The v1 is here:
https://lore.kernel.org/lkml/20251031032627.1414462-2-jianyungao89@gmail.com/

 tools/lib/bpf/bpf.h | 647 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 647 insertions(+)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index e983a3e40d61..35372c0790ee 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -61,6 +61,57 @@ struct bpf_map_create_opts {
 };
 #define bpf_map_create_opts__last_field excl_prog_hash_size
 
+/**
+ * @brief Create a new BPF map.
+ *
+ * This helper wraps the kernel's BPF_MAP_CREATE command and returns a file
+ * descriptor referring to the newly created map. The map's behavior (e.g.
+ * key/value semantics, lookup/update constraints) is determined by its
+ * type and various size parameters.
+ *
+ * @param map_type
+ *        Map type (enum bpf_map_type) selecting the kernel map implementation
+ *        (e.g. BPF_MAP_TYPE_HASH, ARRAY, LRU_HASH, PERCPU_ARRAY, etc.).
+ *
+ * @param map_name
+ *        Optional human-readable name (null-terminated). May appear in
+ *        bpftool output and other introspection tools; can be NULL for unnamed maps.
+ *        Must fit in BPF_OBJ_NAME_LEN (16 bytes, including the terminating NUL).
+ *
+ * @param key_size
+ *        Size (in bytes) of a single key. Must match what the map type expects
+ *        (e.g. sizeof(int) for a prog array, 0 for queue and stack maps).
+ *
+ * @param value_size
+ *        Size (in bytes) of a single value. Some map types have specific or
+ *        implicit value sizes (e.g. perf event array); still pass the
+ *        required size. Must be > 0 unless the map type defines otherwise.
+ *
+ * @param max_entries
+ *        Maximum number of key/value pairs (capacity). For certain map types
+ *        semantics differ (e.g. for a ring buffer this is the buffer size in
+ *        bytes). Must be > 0 except for types that ignore it.
+ *
+ * @param opts
+ *        Optional pointer to bpf_map_create_opts providing extended creation
+ *        parameters. Pass NULL for defaults. Common fields include:
+ *        - .map_flags: Additional BPF map flags (e.g. BPF_F_NO_PREALLOC).
+ *        - .numa_node: Prefer allocation on specified NUMA node.
+ *        - .btf_fd / .btf_key_type_id / .btf_value_type_id: Associate BTF
+ *          types for verification and introspection.
+ *        - .inner_map_fd: For map-in-map types (array_of_maps / hash_of_maps).
+ *        - .map_ifindex: Bind map to a network interface when supported.
+ *        - .map_extra: Map-type-specific data (e.g. hash function count for bloom filters).
+ *        Not all fields may be available in older libbpf versions; zero-init
+ *        the struct and set only known fields.
+ *
+ * @return
+ *        >= 0: File descriptor of the created map (caller owns it and should
+ *        close() when no longer needed).
+ *        < 0 : Negative error code (libbpf style, -errno). The reason can
+ *        be read from -ret or from errno, which is set to the same
+ *        (positive) error code.
+ */
 LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
 			      const char *map_name,
 			      __u32 key_size,
@@ -151,19 +202,457 @@ struct bpf_btf_load_opts {
 LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
 			    struct bpf_btf_load_opts *opts);
 
+/**
+ * @brief Update or insert an element in a BPF map.
+ *
+ * Attempts to store the value referenced by @p value into the BPF map
+ * identified by @p fd under the key referenced by @p key. The semantics
+ * of the operation are controlled by @p flags:
+ *
+ * - BPF_ANY: Create a new element or update an existing one.
+ * - BPF_NOEXIST: Create a new element only; fail if the key already exists (errno = EEXIST).
+ * - BPF_EXIST: Update an existing element only; fail if the key does not exist (errno = ENOENT).
+ * - BPF_F_LOCK (may be OR-ed in): Update the value under its bpf_spin_lock;
+ *   only valid for maps whose value type embeds a bpf_spin_lock.
+ *
+ * The memory pointed to by @p key and @p value must be at least the size of the map's
+ * key and value definitions respectively, and properly aligned for the target architecture.
+ * Callers typically place key/value objects on the stack or in static storage; the kernel
+ * copies their contents during the call, so they need not remain valid after the function
+ * returns.
+ *
+ * Concurrency: For most map types, updates are atomic with respect to lookups and other
+ * updates. For per-CPU maps, @p value must hold one value slot per possible CPU and the
+ * update writes every CPU's copy. BPF_F_LOCK may be needed for values that embed a
+ * bpf_spin_lock to keep multi-field updates consistent.
+ *
+ * Privileges: Some map updates may require CAP_SYS_ADMIN or CAP_BPF depending on the
+ * map type and system configuration (e.g., locked down environments or LSM policies).
+ *
+ * @param fd    File descriptor referring to the opened BPF map.
+ * @param key   Pointer to the key data to be inserted/updated.
+ * @param value Pointer to the value data to be stored for the key.
+ * @param flags Operation control flags (see above).
+ *
+ * @return 0 on success; negative error code, otherwise (errno is also set to
+ * the error code).
+ *
+ * Possible errno values include (not exhaustive):
+ * - E2BIG:        Map is full (e.g. hash map at max_entries) or array index out of range.
+ * - EINVAL:       Invalid flags or unsupported operation for map type.
+ * - EBADF:        @p fd is not a valid BPF map descriptor.
+ * - ENOENT:       Key does not exist (with BPF_EXIST).
+ * - EEXIST:       Key already exists (with BPF_NOEXIST).
+ * - ENOMEM:       Kernel memory allocation failure.
+ * - EPERM/EACCES: Insufficient privileges or rejected by security policy.
+ * - ENOSPC:       Map at capacity (returned by some map types instead of E2BIG).
+ *
+ */
 LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,
 				   __u64 flags);
 
+/**
+ * @brief Look up an element in a BPF map by key.
+ *
+ * Retrieves the value associated with the specified key from a BPF map
+ * identified by its file descriptor. The caller must supply a pointer to
+ * a key of the map's key size, and a writable buffer large enough to hold
+ * the map's value size. On success, the value buffer is filled with the
+ * data stored in the map.
+ *
+ * This wraps the BPF_MAP_LOOKUP_ELEM command of the bpf(2) system call, so
+ * every lookup costs a user/kernel transition. For per-CPU map types the
+ * value buffer must hold one value slot per possible CPU.
+ *
+ * @param fd    File descriptor of an open BPF map (obtained via bpf_obj_get(),
+ *              bpf_map_create(), or via loading an object file).
+ * @param key   Pointer to a buffer containing the key to look up. The buffer
+ *              must be exactly the size of the map's key type.
+ * @param value Pointer to a buffer where the map's value will be copied on
+ *              success. Must be at least the size of the map's value type.
+ *
+ * @return 0 on success (value populated); negative error code, otherwise
+ * (errno is also set to the error code):
+ *         - ENOENT: The key does not exist in the map.
+ *         - EINVAL: Invalid parameters (e.g., wrong sizes or bad map type).
+ *         - EPERM / EACCES: Insufficient privileges (e.g., missing CAP_BPF or
+ *           related capability).
+ *         - EBADF: Invalid map file descriptor.
+ *         - ENOMEM: Kernel could not allocate required memory.
+ *         - EFAULT: key or value points to invalid user memory.
+ *
+ */
 LIBBPF_API int bpf_map_lookup_elem(int fd, const void *key, void *value);
+
+/**
+ * @brief Look up (read) a value stored in a BPF map.
+ *
+ * This is a thin libbpf wrapper around the BPF_MAP_LOOKUP_ELEM command of the
+ * bpf(2) system call. It retrieves the value associated with the provided key
+ * from the map referred to by fd.
+ *
+ * The caller must supply storage for both the key and the value. On success
+ * the memory pointed to by value is filled with the map element's data.
+ *
+ * Concurrency semantics depend on the map type. For maps whose values contain
+ * a bpf_spin_lock (e.g. certain HASH or ARRAY-like map types), you may pass
+ * the BPF_F_LOCK flag in flags to request that the kernel return the value
+ * while holding the spin lock, guaranteeing a consistent snapshot for complex
+ * composite data. The lock is released immediately after copying the value
+ * out to user space. Pass 0 for default (unlocked) lookup semantics.
+ *
+ * Note: Only flags supported by the running kernel (currently BPF_F_LOCK) are
+ * valid; unsupported flags will cause the lookup to fail with EINVAL.
+ *
+ * Key requirements:
+ *  - For array-like maps (e.g., BPF_MAP_TYPE_ARRAY, PERCPU_ARRAY), key points
+ *    to an integer index.
+ *  - For hash-like maps, key points to a full key of the map's declared key
+ *    size.
+ *
+ * Value requirements:
+ *  - value must point to a buffer at least as large as the map's value size
+ *    (use bpf_obj_get_info_by_fd() or bpf_map__value_size() helpers to query
+ *    this).
+ *
+ * @param fd     File descriptor of the BPF map obtained via bpf_map_create(),
+ *               bpf_obj_get(), or a libbpf helper.
+ * @param key    Pointer to the key (or index) identifying the element to read.
+ *               Must not be NULL.
+ * @param value  Pointer to caller-allocated buffer that receives the value on
+ *               success. Must not be NULL.
+ * @param flags  Bitmask of lookup flags. Use 0 for a normal lookup. Specify
+ *               BPF_F_LOCK (if supported) to perform a locked read of values
+ *               containing a bpf_spin_lock.
+ *
+ * @return 0 on success; negative error code, otherwise
+ * (errno is also set to the error code):
+ *         - ENOENT: No element with the specified key exists.
+ *         - EINVAL: Invalid arguments (bad flags, key/value pointers, or map type).
+ *         - EPERM / EACCES: Insufficient privileges (e.g., map access restrictions).
+ *         - EBADF: Invalid map file descriptor.
+ *         - EFAULT: key points to unreadable memory or value to unwritable memory.
+ *         - Other errors: additional standard Linux error codes, depending on
+ *           the map type and kernel version.
+ *
+ */
 LIBBPF_API int bpf_map_lookup_elem_flags(int fd, const void *key, void *value,
 					 __u64 flags);
+/**
+ * @brief Atomically look up and delete a single element from a BPF map.
+ *
+ * Performs a combined "lookup-and-delete" operation for the element
+ * identified by the key pointed to by @p key in the map referred to by
+ * @p fd. If the key exists, its value is copied into the user-provided
+ * @p value buffer (if non-null) and the element is removed from the map
+ * as one atomic kernel operation, preventing races between a separate
+ * lookup and delete sequence.
+ *
+ * Supported map types are those for which the kernel implements
+ * BPF_MAP_LOOKUP_AND_DELETE_ELEM (e.g., queue/stack-like maps and
+ * certain hash variants). On unsupported map types the call fails.
+ *
+ * Concurrency:
+ *  - The lookup and deletion are performed atomically with respect to
+ *    other map operations on the same key, avoiding TOCTOU races.
+ *  - For per-CPU hash maps the element is removed for all CPUs and
+ *    @p value receives one value slot per possible CPU.
+ *
+ * Memory requirements:
+ *  - @p key must point to a buffer exactly equal to the declared key
+ *    size of the map.
+ *  - @p value must point to a buffer at least as large as the map's
+ *    value size. If @p value is NULL, no value is copied; the element
+ *    is still deleted (kernel may return EFAULT on older kernels that
+ *    require a non-null value pointer).
+ *
+ * Privileges:
+ *  - May require CAP_BPF or CAP_SYS_ADMIN depending on kernel
+ *    configuration, LSM policies, or lockdown state.
+ *
+ * @param fd    File descriptor of an open BPF map.
+ * @param key   Pointer to the key identifying the element to remove.
+ * @param value Pointer to caller-allocated buffer that receives the
+ *              value prior to deletion (can be NULL on kernels that
+ *              allow skipping value copy).
+ *
+ * @return 0 on success (value copied and element deleted); negative error
+ * code, otherwise (errno is also set to the error code):
+ *         - ENOENT: Key not found in the map.
+ *         - EINVAL: Invalid arguments (bad key pointer/size, unsupported map type).
+ *         - EOPNOTSUPP: Operation not supported for this map type.
+ *         - EBADF: @p fd is not a valid BPF map descriptor.
+ *         - EFAULT: key/value points to inaccessible user memory.
+ *         - EPERM / EACCES: Insufficient privileges.
+ *         - ENOMEM: Kernel failed to allocate temporary resources.
+ *
+ */
 LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
 					      void *value);
+/**
+ * @brief Atomically look up and delete an element from a BPF map with extra flags.
+ *
+ * This is a flags-capable variant of bpf_map_lookup_and_delete_elem(). It performs
+ * a single atomic kernel operation that (optionally) retrieves the value associated
+ * with the specified key and then deletes the element from the map. The additional
+ * @p flags parameter allows requesting special semantics if supported by the map
+ * type and kernel (e.g., locked access with BPF_F_LOCK when the map value embeds
+ * a bpf_spin_lock).
+ *
+ * Semantics:
+ *  - If the key exists:
+ *      * Its value is copied into the user-provided @p value buffer (if non-NULL).
+ *      * The element is removed from the map.
+ *  - If the key does not exist: fails with errno = ENOENT, no deletion performed.
+ *
+ * Atomicity:
+ *   The lookup and deletion occur as one kernel operation, eliminating race
+ *   windows that would exist if lookup and delete were performed separately.
+ *
+ * Flags (@p flags):
+ *  - 0: Perform a normal atomic lookup-and-delete.
+ *  - BPF_F_LOCK: If supported and the map value contains a bpf_spin_lock, the
+ *                kernel acquires the spin lock during value retrieval ensuring
+ *                a consistent snapshot, then releases it prior to returning.
+ *  - Other bits: Must be zero unless future kernels introduce new semantics;
+ *                unsupported flags yield -EINVAL (errno is also set to EINVAL).
+ *
+ * Memory requirements:
+ *  - @p key must point to a buffer exactly the size of the map's key.
+ *  - @p value must point to a buffer at least the size of the map's value if
+ *    non-NULL. Passing NULL skips value copy (if supported by the running kernel).
+ *
+ * Supported map types:
+ *   Only those implementing BPF_MAP_LOOKUP_AND_DELETE_ELEM (e.g., queue, stack,
+ *   certain hash variants). Unsupported types fail with errno = EOPNOTSUPP.
+ *
+ * Privileges:
+ *   May require CAP_BPF or CAP_SYS_ADMIN depending on kernel configuration,
+ *   lockdown mode, or LSM policies.
+ *
+ * Concurrency:
+ *  - The operation is atomic with respect to other concurrent updates,
+ *    lookups, or deletions of the same key.
+ *  - For per-CPU maps, semantics follow the underlying map implementation
+ *    (per-CPU hash maps remove the element for every CPU).
+ *
+ * @param fd     File descriptor of an open BPF map.
+ * @param key    Pointer to the key identifying the element to consume.
+ * @param value  Optional pointer to a buffer receiving the element's value prior
+ *               to deletion. Can be NULL to skip retrieval (subject to kernel support).
+ * @param flags  Bitmask controlling lookup/delete behavior (see above).
+ *
+ * @return 0 on success; negative error code, otherwise
+ * (errno is also set to the error code):
+ *         - ENOENT: Key not found.
+ *         - EINVAL: Bad arguments, unsupported flags, or mismatched key size.
+ *         - EOPNOTSUPP: Operation not supported for this map type.
+ *         - EBADF: Invalid map file descriptor.
+ *         - EFAULT: key/value points to inaccessible user memory.
+ *         - EPERM / EACCES: Insufficient privileges / denied by security policy.
+ *         - ENOMEM: Temporary kernel allocation failure.
+ *
+ */
 LIBBPF_API int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key,
 						    void *value, __u64 flags);
+/**
+ * @brief Delete (remove) a single element from a BPF map.
+ *
+ * Issues the BPF_MAP_DELETE_ELEM command for the map referenced by @p fd,
+ * removing the element identified by the key pointed to by @p key. This
+ * helper is the simplest deletion API and does not support any additional
+ * deletion or locking flags. For flag-capable deletion semantics (e.g.,
+ * locked delete of spin_lock-embedded values) use bpf_map_delete_elem_flags().
+ *
+ * Semantics:
+ *  - If an element with the specified key exists, it is atomically removed.
+ *  - If the key is absent, the call fails with errno = ENOENT.
+ *  - No value is returned; if you need to retrieve and consume an element,
+ *    use bpf_map_lookup_and_delete_elem() (or its flags variant).
+ *
+ * Concurrency:
+ *  - Deletion is atomic with respect to concurrent lookups and updates of
+ *    the same key.
+ *  - Ordering relative to other operations is map-type dependent; no
+ *    global ordering guarantees are provided beyond atomicity for the key.
+ *
+ * Key requirements:
+ *  - @p key must point to a buffer exactly equal in size to the map's
+ *    declared key size. The kernel always reads that many bytes, so a
+ *    shorter buffer can yield EFAULT or a delete of the wrong key.
+ *
+ * Privileges:
+ *  - May require CAP_BPF, CAP_SYS_ADMIN, or be restricted by LSM or
+ *    lockdown policies depending on system configuration and map type.
+ *
+ * Error handling (errno set on failure):
+ *  - ENOENT: Key not found in the map.
+ *  - EINVAL: Invalid arguments, or operation unsupported for map type.
+ *  - EBADF: @p fd is not a valid (open) BPF map descriptor.
+ *  - EFAULT: @p key points to unreadable user memory.
+ *  - EPERM / EACCES: Insufficient privileges or blocked by security policy.
+ *  - ENOMEM: Transient kernel memory/resource exhaustion (rare).
+ *
+ * @param fd  File descriptor of an open BPF map.
+ * @param key Pointer to a buffer containing the key to delete; must not be NULL.
+ *
+ * @return 0 on success; negative error code, otherwise
+ * (errno is also set to the error code).
+ *
+ */
 LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
+/**
+ * @brief Delete an element from a BPF map with optional flags.
+ *
+ * This is a flags-capable variant of bpf_map_delete_elem(). It issues the
+ * BPF_MAP_DELETE_ELEM command to remove the element identified by the key
+ * pointed to by @p key from the map referenced by @p fd. Unlike the plain
+ * variant, this helper allows passing delete control flags in @p flags.
+ *
+ * Typical usage mirrors bpf_map_delete_elem(), but if the map's value type
+ * embeds a bpf_spin_lock (and the kernel supports locked delete semantics),
+ * you may specify BPF_F_LOCK in @p flags to request the kernel to take the
+ * spin lock while performing the deletion, ensuring consistent removal for
+ * composite values that might otherwise require user space synchronization.
+ *
+ * Semantics:
+ *  - If the key exists, the element is removed.
+ *  - If the key does not exist, the call fails with errno = ENOENT.
+ *  - No value is returned; for consume-and-retrieve use
+ *    bpf_map_lookup_and_delete_elem() or
+ *    bpf_map_lookup_and_delete_elem_flags().
+ *
+ * Flags (@p flags):
+ *  - 0: Perform a normal deletion.
+ *  - BPF_F_LOCK: (If supported) acquire/release map value's spin lock around
+ *    delete operation. Ignored or rejected if unsupported for the map type.
+ *  - Unsupported bits cause failure with errno = EINVAL.
+ *
+ * Concurrency:
+ *  - Deletion is atomic with respect to concurrent lookups/updates of the
+ *    same key.
+ *  - For per-CPU map types, semantics follow underlying implementation
+ *    (a per-CPU hash element is removed for all CPUs).
+ *
+ * Privileges:
+ *  - May require CAP_BPF or CAP_SYS_ADMIN depending on kernel configuration,
+ *    system lockdown mode, or LSM policies.
+ *
+ * @param fd    File descriptor of an open BPF map.
+ * @param key   Pointer to a buffer containing the key to delete. Must be
+ *              exactly the size of the map's key type.
+ * @param flags Deletion control flags (see above). Use 0 for default behavior.
+ *
+ * @return 0 on success; negative error code, otherwise
+ * (errno is also set to the error code):
+ *         - ENOENT: Key not found.
+ *         - EINVAL: Invalid arguments, unsupported flags, or wrong key size.
+ *         - EBADF: @p fd is not a valid BPF map descriptor.
+ *         - EFAULT: @p key points to inaccessible user memory.
+ *         - EPERM / EACCES: Insufficient privileges or denied by security policy.
+ *         - ENOMEM: Temporary kernel resource allocation failure.
+ *
+ */
 LIBBPF_API int bpf_map_delete_elem_flags(int fd, const void *key, __u64 flags);
+/**
+ * @brief Iterate over keys in a BPF map by retrieving the key that follows a given key.
+ *
+ * This helper wraps the BPF_MAP_GET_NEXT_KEY command. It copies into @p next_key
+ * the key that follows @p key, in the map's implementation-defined iteration order,
+ * in the map referenced by @p fd. It is typically used to enumerate all keys in
+ * a map from user space.
+ *
+ * Iteration pattern:
+ *   1. Pass NULL as @p key to retrieve the first key in the map.
+ *   2. On each successful call, use the returned @p next_key as the @p key input
+ *      for the subsequent call to advance the iteration.
+ *   3. When there are no more keys, the call fails with errno = ENOENT and
+ *      iteration is complete.
+ *
+ * Concurrency:
+ *  - Enumeration order is not stable across concurrent inserts/deletes; keys
+ *    added or removed during iteration may or may not be observed, and if
+ *    @p key itself was deleted, hash maps restart from the first key.
+ *  - For hash-like maps, ordering is implementation-dependent (hash bucket
+ *    traversal). For array-like maps (ARRAY/PERCPU_ARRAY), "next" corresponds
+ *    to the next valid index.
+ *
+ * Memory requirements:
+ *  - @p key (if non-NULL) must point to a buffer exactly the size of the map's
+ *    key type.
+ *  - @p next_key must point to a writable buffer at least the size of the map's
+ *    key type.
+ *
+ * Privileges:
+ *  - Access may require CAP_BPF or CAP_SYS_ADMIN depending on system lockdown
+ *    mode, LSM policy, or map type.
+ *
+ * @param fd       File descriptor of an open BPF map.
+ * @param key      Pointer to the current key; NULL to start iteration from the first key.
+ * @param next_key Pointer to a buffer that receives the next key on success.
+ *
+ * @return 0 on success (next key stored in @p next_key); negative error code, otherwise
+ * (errno is also set to the error code):
+ *         - ENOENT: No further keys (end of iteration) or map is empty (when @p key is NULL).
+ *         - EINVAL: Invalid arguments (bad fd, wrong key size, unsupported map type).
+ *         - EBADF: @p fd is not a valid BPF map descriptor.
+ *         - EFAULT: @p key or @p next_key points to inaccessible user memory.
+ *         - EPERM / EACCES: Insufficient privileges or access denied by security policy.
+ *
+ */
 LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
+/**
+ * @brief Mark a BPF map as frozen (read-only for any future user space modifications).
+ *
+ * Invokes the kernel's BPF_MAP_FREEZE command on the map referred to by @p fd.
+ * Once a map is successfully frozen:
+ *   - User space can still perform lookups (bpf_map_lookup_elem*(), batch lookups, etc.).
+ *   - All further update, delete, and batch mutation operations from user space
+ *     will fail (typically with EPERM).
+ *   - Freezing is irreversible for the lifetime of the map.
+ *
+ * Typical use cases:
+ *   - Finalizing initialization data (e.g., config arrays or constant maps)
+ *     before exposing the map to untrusted code or other processes.
+ *   - Enforcing write-once semantics to ensure stronger safety guarantees.
+ *   - Preventing accidental or malicious runtime mutation of maps that should
+ *     remain constant after setup.
+ *
+ * Semantics & scope:
+ *   - The freeze applies system-wide to the map object, not just to the calling
+ *     process.
+ *   - BPF programs' ability to modify the map after freezing depends on kernel
+ *     semantics: for most map types, freezing blocks user space mutations only.
+ *     (Do not rely on program write restrictions unless explicitly documented
+ *     for a specific kernel/map type.)
+ *   - Freezing an already frozen map fails with -EBUSY; callers that may
+ *     freeze a map twice should treat that error as a no-op.
+ *
+ * Privileges:
+ *   - Typically requires CAP_BPF or CAP_SYS_ADMIN (depending on kernel
+ *     configuration, LSM, and lockdown state).
+ *
+ * @param fd File descriptor of an open BPF map to freeze.
+ *
+ * @return 0 on success; negative libbpf-style error code (< 0) on failure.
+ *
+ * Possible errors (returned as -errno style negatives):
+ *  - -EBADF: @p fd is not a valid file descriptor.
+ *  - -EINVAL: @p fd is not a BPF map, or map type is not freezable.
+ *  - -EPERM / -EACCES: Insufficient privileges or blocked by security policy.
+ *  - -EBUSY: Map is already frozen or has a writable memory mapping.
+ *  - -ENOMEM: Temporary resource allocation failure inside the kernel.
+ *
+ * Thread safety:
+ *   - Safe to call concurrently; only the first successful call transitions
+ *     the map into the frozen state.
+ *
+ * After freezing:
+ *   - Continue using lookup APIs to read data.
+ *   - Avoid calling mutation APIs (update/delete) unless prepared to handle
+ *     expected failures.
+ *
+ */
 LIBBPF_API int bpf_map_freeze(int fd);
 
 struct bpf_map_batch_opts {
@@ -488,6 +977,53 @@ struct bpf_prog_test_run_attr {
 };
 
 LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
+/**
+ * @brief Retrieve the next existing BPF map ID after a given starting ID.
+ *
+ * This helper enumerates system-wide BPF map IDs in ascending order. It wraps
+ * the kernel's BPF_MAP_GET_NEXT_ID command.
+ *
+ * Enumeration pattern:
+ *   1. Initialize start_id to 0 to obtain the first (lowest) existing map ID.
+ *   2. On success, *next_id is set. Use that returned value as the new start_id
+ *      for the subsequent call to advance the iteration.
+ *   3. Repeat until the function returns -ENOENT, which indicates there is no
+ *      map with ID greater than start_id (end of enumeration).
+ *
+ * Concurrency & races:
+ *   - Map creation/deletion can race with enumeration; a retrieved ID might
+ *     become invalid by the time you act on it (e.g., when calling
+ *     bpf_map_get_fd_by_id()).
+ *   - To safely interact with a map after enumeration, immediately convert the
+ *     ID to a file descriptor with bpf_map_get_fd_by_id() and handle possible
+ *     failures (e.g., -ENOENT if the map was removed).
+ *
+ * Typical usage example:
+ *     __u32 id = 0, next;
+ *     while (!bpf_map_get_next_id(id, &next)) {
+ *         int map_fd = bpf_map_get_fd_by_id(next);
+ *         if (map_fd >= 0) {
+ *             // process map_fd
+ *             close(map_fd);
+ *         }
+ *         id = next;
+ *     }
+ *     // Loop ends on any error; -ENOENT means there are no more IDs.
+ *
+ * @param start_id
+ *        Starting point for the search; the function looks for a map ID
+ *        strictly greater than start_id. Use 0 to get the first existing ID.
+ * @param next_id
+ *        Pointer to a __u32 that receives the next map ID on success.
+ *        Must not be NULL.
+ *
+ * @return
+ *      0 on success (next_id populated);
+ *      -ENOENT if there is no map ID greater than start_id (end of iteration);
+ *      -EINVAL on invalid arguments (e.g., next_id == NULL);
+ *      -EPERM / -EACCES if denied by security policy or lacking privileges;
+ *      Other negative libbpf-style errors for transient or system failures.
+ */
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
@@ -503,7 +1039,118 @@ struct bpf_get_fd_by_id_opts {
 LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_prog_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
+/**
+ * @brief Get a file descriptor for an existing BPF map given its kernel-assigned ID.
+ *
+ * This helper wraps the BPF_MAP_GET_FD_BY_ID command of the bpf(2) syscall and
+ * converts a stable (monotonically increasing) map ID into a process-local
+ * file descriptor referring to that map object. The returned descriptor grants
+ * the caller access consistent with system security policy (LSM, cgroup,
+ * namespace, capabilities) at the time of the call.
+ *
+ * Typical enumeration pattern:
+ *   __u32 id = 0, next;
+ *   while (!bpf_map_get_next_id(id, &next)) {
+ *       int map_fd = bpf_map_get_fd_by_id(next);
+ *       if (map_fd >= 0) {
+ *           // Use map_fd (query info, perform lookups, etc.)
+ *           close(map_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop ends when bpf_map_get_next_id() returns -ENOENT.
+ *
+ * Concurrency & races:
+ *   - A map may be deleted between obtaining its ID (e.g., via
+ *     bpf_map_get_next_id()) and calling this function; in that case the call
+ *     fails with -ENOENT.
+ *   - Immediately act on (and, when done, close) the returned file descriptor
+ *     to minimize race windows.
+ *
+ * Lifetime & ownership:
+ *   - On success the caller owns the returned file descriptor and must close()
+ *     it when no longer needed.
+ *   - The underlying map persists system-wide until all references (FDs and
+ *     in-kernel attachments) are gone; closing this FD alone does not destroy
+ *     the map.
+ *
+ * Privileges / access control:
+ *   - May require CAP_BPF, CAP_SYS_ADMIN, or be denied by LSM / lockdown
+ *     policies depending on system configuration.
+ *   - A successful return does not guarantee unrestricted operations on the
+ *     map; specific actions (updates, pinning, freezing) may still be gated.
+ *
+ * Error handling (negative libbpf-style return codes):
+ *   - -ENOENT: No map with the specified ID (deleted or never existed).
+ *   - -EACCES / -EPERM: Access denied by security policy or insufficient
+ *     privilege.
+ *   - -EINVAL: Invalid attributes passed to the kernel (rare; typically
+ *     indicates an out-of-date kernel/libbpf mismatch).
+ *   - -ENOMEM: Transient kernel memory/resource exhaustion.
+ *   - Other negative values: Propagated -errno from the bpf() syscall.
+ *
+ * @param id
+ *      Kernel-assigned unique ID of the target BPF map (obtained via
+ *      bpf_map_get_next_id() or from info queries). Must be > 0.
+ *
+ * @return
+ *      >= 0: File descriptor referring to the BPF map (caller must close()).
+ *      < 0 : Negative error code (libbpf-style, e.g., -ENOENT, -EPERM).
+ *
+ */
 LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
+/**
+ * @brief Obtain a file descriptor for an existing BPF map by its kernel-assigned ID,
+ *        with extended options.
+ *
+ * This is an extended variant of bpf_map_get_fd_by_id() that allows the caller
+ * to specify additional attributes (via @p opts) affecting how the kernel opens
+ * the map. It wraps the BPF_MAP_GET_FD_BY_ID command of the bpf(2) syscall.
+ *
+ * Typical usage pattern:
+ *   - Enumerate map IDs with bpf_map_get_next_id().
+ *   - For each ID, call bpf_map_get_fd_by_id_opts() to convert the ID into a
+ *     process-local file descriptor.
+ *   - Use the returned FD to query info (bpf_map_get_info_by_fd()), perform
+ *     lookups/updates, or pin the map.
+ *   - close() the FD when finished.
+ *
+ * Concurrency & races:
+ *   A map can be deleted between discovering its ID and calling this function.
+ *   In that case the call fails with -ENOENT. Always check the return value and
+ *   handle transient failures.
+ *
+ * Lifetime & ownership:
+ *   On success the caller owns the returned FD. Closing it decrements a
+ *   reference on the underlying map object but does not destroy the map if
+ *   other references (FDs or in-kernel links/programs) remain.
+ *
+ * Security / privileges:
+ *   Access can be denied by capabilities (CAP_BPF, CAP_SYS_ADMIN), LSM policies,
+ *   or lockdown mode, yielding -EPERM/-EACCES. Supplying certain @p opts values
+ *   (e.g., restrictive @c open_flags) does not bypass system security policy.
+ *
+ * @param id
+ *      Kernel-assigned unique ID of the target map (must be > 0). Typically
+ *      obtained via bpf_map_get_next_id() or from a prior info query.
+ * @param opts
+ *      Optional pointer to bpf_get_fd_by_id_opts controlling open behavior:
+ *        - .open_flags: Requested access/open semantics (kernel-specific;
+ *          pass 0 for default). Unsupported flags produce -EINVAL.
+ *        - .token_fd: FD of a BPF token (if using delegated permissions).
+ *      May be NULL for default behavior. Unrecognized or unsupported fields
+ *      should be zero-initialized for forward/backward compatibility.
+ *
+ * @return
+ *      >= 0 : File descriptor referring to the BPF map (caller must close()).
+ *      < 0  : Negative libbpf-style error code (typically -errno):
+ *          - -ENOENT : No map with @p id (deleted or never existed).
+ *          - -EPERM / -EACCES : Insufficient privileges / denied by policy.
+ *          - -EINVAL : Invalid @p id, malformed @p opts, or bad flags.
+ *          - -ENOMEM : Transient kernel resource exhaustion.
+ *          - Other negative codes propagated from bpf() syscall.
+ *
+ */
 LIBBPF_API int bpf_map_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
-- 
2.34.1

From: Jianyun Gao
To: linux-kernel@vger.kernel.org
Cc: Jianyun Gao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
    Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
    John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
    "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer,
    bpf@vger.kernel.org (open list:BPF [GENERAL] (Safe Dynamic Programs and Tools)),
    netdev@vger.kernel.org (open list:XDP (eXpress Data Path):Keyword:(?:\b|_)xdp(?:\b|_))
Subject: [PATCH v2 2/5] libbpf: Add doxygen documentation for bpf_prog_* APIs in bpf.h
Date: Fri, 31 Oct 2025 15:59:04 +0800
Message-Id: <20251031075908.1472249-3-jianyungao89@gmail.com>
In-Reply-To: <20251031075908.1472249-1-jianyungao89@gmail.com>
References: <20251031075908.1472249-1-jianyungao89@gmail.com>

Add doxygen comment blocks for all public bpf_prog_* APIs in tools/lib/bpf/bpf.h.
These doc comments are for:

-bpf_prog_load()
-bpf_prog_attach()
-bpf_prog_detach()
-bpf_prog_detach2()
-bpf_prog_get_next_id()
-bpf_prog_get_fd_by_id()
-bpf_prog_get_fd_by_id_opts()
-bpf_prog_query()
-bpf_prog_bind_map()
-bpf_prog_test_run_opts()

Signed-off-by: Jianyun Gao
---
v1->v2:
 - Fixed the non-ASCII characters in this patch.

The v1 is here:
https://lore.kernel.org/lkml/20251031032627.1414462-3-jianyungao89@gmail.com/

 tools/lib/bpf/bpf.h | 655 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 649 insertions(+), 6 deletions(-)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 35372c0790ee..cd96d7afed6b 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -167,7 +167,104 @@ struct bpf_prog_load_opts {
 	size_t :0;
 };
 #define bpf_prog_load_opts__last_field fd_array_cnt
-
+/**
+ * @brief Load (verify and register) a BPF program into the kernel.
+ *
+ * This is a high-level libbpf wrapper around the BPF_PROG_LOAD command of the
+ * bpf(2) syscall. It submits an array of eBPF instructions to the kernel
+ * verifier, optionally provides BTF metadata and attachment context, and
+ * returns a file descriptor referring to the newly loaded (but not yet
+ * attached) BPF program.
+ *
+ * Core flow:
+ *   1. The kernel verifier validates instruction safety, helper usage,
+ *      stack bounds, pointer arithmetic, and (if provided) BTF type
+ *      consistency.
+ *   2. If verification succeeds, a program FD is returned (>= 0).
+ *   3. If verification fails, a negative libbpf-style error is returned
+ *      (< 0). If logging was requested via @c opts->log_* fields, a textual
+ *      verifier log may be captured for debugging.
+ *
+ * @param prog_type
+ *      Enumerated BPF program type (enum bpf_prog_type) selecting verifier
+ *      expectations and permissible helpers (e.g. BPF_PROG_TYPE_SOCKET_FILTER,
+ *      BPF_PROG_TYPE_KPROBE, BPF_PROG_TYPE_TRACING, BPF_PROG_TYPE_XDP, etc.).
+ *
+ * @param prog_name
+ *      Optional, null-terminated human-readable name. Visible via bpftool
+ *      and in kernel introspection APIs. Can be NULL. If longer than the
+ *      kernel's max BPF object name length (typically BPF_OBJ_NAME_LEN),
+ *      it will be truncated. Use concise alphanumeric/underscore names.
+ *
+ * @param license
+ *      Null-terminated license string (e.g. "GPL", "Dual BSD/GPL"). Determines
+ *      eligibility for GPL-only helpers. Must not be NULL. Passing a license
+ *      incompatible with required GPL-only helpers yields -EACCES/-EPERM.
+ *
+ * @param insns
+ *      Pointer to an array of eBPF instructions (struct bpf_insn). Must be
+ *      non-NULL and executable by the verifier (no out-of-bounds jumps, etc.).
+ *      The kernel copies this array; caller can free/modify it after return.
+ *
+ * @param insn_cnt
+ *      Number of instructions in @p insns. Must be > 0 and within kernel
+ *      limits (historically <= ~1M instructions; exact cap is kernel-specific).
+ *      A too large value results in -E2BIG or -EINVAL.
+ *
+ * @param opts
+ *      Optional pointer to a zero-initialized struct bpf_prog_load_opts
+ *      providing extended parameters. Pass NULL for defaults. Only set
+ *      fields you understand; leaving others zero ensures fwd/back compat.
+ *
+ *      Notable fields:
+ *      - sz: Must be set to sizeof(struct bpf_prog_load_opts) for libbpf
+ *        to validate structure layout.
+ *      - attempts: Number of automatic retries if bpf() returns -EAGAIN
+ *        (transient verifier/resource contention). Default is 5 if zero.
+ *      - expected_attach_type: For some program types (tracing, LSM, etc.)
+ *        the verifier requires an attach type hint.
+ *      - prog_btf_fd: BTF describing function prototypes / types referenced
+ *        by the program (enables CO-RE relocations, enhanced validation).
+ *      - prog_flags: Bitmask of program load flags (e.g. BPF_F_STRICT_ALIGNMENT,
+ *        BPF_F_SLEEPABLE for sleepable programs; availability is kernel-dependent).
+ *      - prog_ifindex: Network interface index for certain net-specific types
+ *        (e.g., tc or XDP offload scenarios).
+ *      - kern_version: Legacy field (only checked by old kernels, e.g. for kprobes).
+ *      - attach_btf_id / attach_btf_obj_fd: Identify kernel BTF target (e.g.
+ *        function or struct) for fentry/fexit/tracing program types.
+ *      - attach_prog_fd: Attach to an existing BPF program (e.g. for extension).
+ *      - fd_array / fd_array_cnt: Supply an array of FDs (maps, progs) when the
+ *        kernel expects auxiliary references (advanced use cases).
+ *      - func_info / line_info (+ *_cnt, *_rec_size): Raw .BTF.ext sections
+ *        used for richer debugging and introspection (normally handled by
+ *        libbpf when loading from object files; rarely set manually).
+ *      - log_level / log_size / log_buf: Request verifier output. Set
+ *        log_level > 0, allocate log_buf of at least log_size bytes. After
+ *        return, log_true_size (if kernel supports) reflects actual length
+ *        (may exceed provided size if truncated).
+ *      - token_fd: BPF token for delegated permissions (non-root controlled
+ *        environments).
+ *
+ *      Unrecognized (future) fields should remain zeroed. Always update sz.
+ *
+ * @return
+ *      >= 0 : File descriptor of loaded BPF program; caller owns it and must
+ *             close() when no longer needed.
+ *      < 0  : Negative libbpf-style error code (typically -errno). Common:
+ *              - -EINVAL: Malformed instructions, bad prog_type/flags, struct
+ *                         size mismatch, missing required attach hints.
+ *              - -EACCES / -EPERM: Disallowed helpers (license/capability),
+ *                                  missing CAP_BPF/CAP_SYS_ADMIN or blocked
+ *                                  by LSM/lockdown.
+ *              - -E2BIG: Program too large or too complex (verifier limits).
+ *              - -ENOMEM: Kernel memory/resource exhaustion.
+ *              - -EFAULT: Bad user pointers (insns/log_buf).
+ *              - -EOPNOTSUPP: Unsupported program type or flag on this kernel.
+ *              - -ENOSPC: Verifier log did not fit into log_buf (log truncated).
+ *              - -EAGAIN: Transient verifier failure; libbpf may retry until
+ *                         attempts exhausted.
+ *
+ */
 LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
 			     const char *prog_name, const char *license,
 			     const struct bpf_insn *insns, size_t insn_cnt,
@@ -821,10 +918,182 @@ struct bpf_obj_get_opts {
 LIBBPF_API int bpf_obj_get(const char *pathname);
 LIBBPF_API int bpf_obj_get_opts(const char *pathname,
 				const struct bpf_obj_get_opts *opts);
-
+/**
+ * @brief Attach a loaded BPF program to a kernel hook or attach point.
+ *
+ * This is a low-level libbpf helper that wraps the bpf(BPF_PROG_ATTACH)
+ * syscall command. It establishes a relationship between an already loaded
+ * BPF program (@p prog_fd) and an attachable kernel entity represented by
+ * @p attachable_fd (or, for certain attach types, a pseudo file descriptor).
+ *
+ * Common attach targets include:
+ *   - cgroup FDs (for CGroup-related program types like BPF_PROG_TYPE_CGROUP_SKB,
+ *     BPF_PROG_TYPE_CGROUP_SOCK_ADDR, etc.).
+ *   - perf event FDs (for certain tracing or profiling program types).
+ *   - socket or socket-like FDs (for SK_MSG, SK_SKB, SOCK_OPS, etc.).
+ *   - BPF prog array FDs (when chaining programs).
+ *
+ * Prefer using newer link-based APIs (e.g., bpf_link_create()) when available,
+ * as they provide a stable lifetime model and automatic cleanup when the link
+ * FD is closed. This legacy API is still useful on older kernels or for
+ * attach types not yet covered by link abstractions.
+ *
+ * @param prog_fd
+ *      File descriptor of an already loaded BPF program obtained via
+ *      bpf_prog_load() or similar. Must be a valid BPF program FD.
+ *
+ * @param attachable_fd
+ *      File descriptor of the target attach point (e.g., cgroup FD, perf
+ *      event FD, target program array FD). For some attach types this might
+ *      be a special or pseudo FD whose semantics depend on @p type.
+ *
+ * @param type
+ *      Enumerated BPF attach type (enum bpf_attach_type) specifying how the
+ *      kernel should link the program to the target. The allowable set
+ *      depends on both the program's BPF program type and the nature of
+ *      @p attachable_fd. A mismatch typically yields -EINVAL.
+ *
+ * @param flags
+ *      Additional attach flags controlling behavior. Most attach types
+ *      require this to be 0. Some program families (e.g., cgroup) permit
+ *      flag combinations (such as replacing existing attachments) subject
+ *      to kernel version support. Unsupported flags result in -EINVAL.
+ *
+ * @return 0 on success; negative error code (< 0) on failure.
+ *
+ * Example (attaching a cgroup program):
+ *   int prog_fd = bpf_prog_load(...);
+ *   int cg_fd = open("/sys/fs/cgroup/mygroup", O_RDONLY);
+ *   if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_INGRESS, 0) < 0)
+ *       perror("bpf_prog_attach");
+ *
+ */
 LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
 			       enum bpf_attach_type type, unsigned int flags);
+/**
+ * @brief Detach (unlink) BPF program(s) from an attach point.
+ *
+ * bpf_prog_detach() is a legacy convenience wrapper around the
+ * BPF_PROG_DETACH command of the bpf(2) syscall. It removes the BPF
+ * program currently attached to the kernel object represented by
+ * attachable_fd for the specified attach @p type. This API only works
+ * for attach types that historically supported a single attached
+ * program (e.g., older cgroup program types before multi-attach was
+ * introduced).
+ *
+ * For modern multi-program attach points (e.g., cgroup with multiple
+ * programs of the same attach type), prefer bpf_prog_detach2(), which
+ * allows specifying the exact program FD to be detached. Calling
+ * bpf_prog_detach() on a multi-attach capable target typically fails
+ * with -EINVAL or -EPERM, or detaches only the "base"/single program
+ * depending on kernel version, so it should be avoided in new code.
+ *
+ * Lifetime semantics:
+ *   - On success, the link between the program and the attach point is
+ *     removed; any subsequent events at that hook will no longer invoke
+ *     the detached program.
+ *   - The program itself remains loaded; its FD is still valid and
+ *     should be closed separately when no longer needed.
+ *
+ * Concurrency & races:
+ *   - Detach operations compete with parallel attach/detach attempts.
+ *     If another program is attached between inspection and detach,
+ *     the result may differ from expectations; always check return
+ *     codes.
+ *
+ * Typical usage (legacy cgroup case):
+ *   int cg_fd = open("/sys/fs/cgroup/mygroup", O_RDONLY);
+ *   if (cg_fd < 0) { perror("open cgroup"); return -1; }
+ *   if (bpf_prog_detach(cg_fd, BPF_CGROUP_INET_INGRESS) < 0)
+ *       perror("bpf_prog_detach");
+ *
+ * @param attachable_fd
+ *      File descriptor of the attach target (e.g., cgroup FD, perf event FD,
+ *      etc.). Must refer to an object supporting the given attach type.
+ * @param type
+ *      Enumerated BPF attach type (enum bpf_attach_type) corresponding to
+ *      the hook from which to detach. Must match the original attach type
+ *      used when the program was attached.
+ *
+ * @return 0 on success;
+ *      < 0 negative libbpf-style error code (typically -errno) on failure:
+ *      - -EBADF: attachable_fd is not a valid descriptor.
+ *      - -EINVAL: Unsupported attach type for this target, no program
+ *                 of that type attached, or legacy detach disallowed
+ *                 (multi-attach scenario).
+ *      - -ENOENT: No program currently attached for the given type.
+ *      - -EPERM / -EACCES: Insufficient privileges (missing CAP_BPF /
+ *                          CAP_SYS_ADMIN) or blocked by security policy.
+ *      - -EOPNOTSUPP: Kernel lacks support for detaching this type.
+ *      - Other negative codes: Propagated syscall failures (e.g., -ENOMEM).
+ *
+ */
 LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
+/**
+ * @brief Detach a specific BPF program from an attach point that may support multiple
+ *        simultaneously attached programs.
+ *
+ * bpf_prog_detach2() is an enhanced variant of bpf_prog_detach(). While
+ * bpf_prog_detach() detaches "the" program of a given @p type from @p attachable_fd
+ * (and therefore only works reliably for legacy single-attach hooks), this function
+ * targets and detaches the exact BPF program referenced by @p prog_fd from the
+ * attach point referenced by @p attachable_fd.
+ *
+ * Typical use cases:
+ *   - Cgroup multi-attach program types (e.g., CGROUP_SKB, CGROUP_SOCK, CGROUP_SYSCTL,
+ *     CGROUP_INET_INGRESS/EGRESS, etc.), where multiple programs of the same attach
+ *     type can coexist.
+ *   - Hooks that allow program stacking/chaining and require precise removal of a
+ *     single program without disturbing others.
+ *
+ * Preferred alternatives:
+ *   - For new code that establishes long-lived attachments, consider using link-based
+ *     APIs (bpf_link_create() + bpf_link_detach()/close(link_fd)), which provide
+ *     clearer lifetime semantics. bpf_prog_detach2() is still necessary on older
+ *     kernels or when working directly with legacy cgroup/perf event style attachments.
+ *
+ * Concurrency & races:
+ *   - If another thread/process detaches the same program (or destroys either FD)
+ *     concurrently, this call can fail with -ENOENT or -EBADF.
+ *   - Immediately check the return value; success means the specified program
+ *     was detached at the time of the call. The program remains loaded and its
+ *     @p prog_fd is still valid; close() it separately when done.
+ *
+ * Privileges:
+ *   - Typically requires CAP_BPF and/or CAP_SYS_ADMIN depending on kernel
+ *     configuration, LSM policies, and lockdown mode.
+ *
+ * Error handling (negative return codes, libbpf style == -errno):
+ *   - -EBADF: @p prog_fd or @p attachable_fd is not a valid file descriptor, or
+ *             @p prog_fd does not reference a loaded BPF program.
+ *   - -EINVAL: Unsupported @p type for the given attachable_fd, mismatch between
+ *              program's type/expected attach type and @p type, or kernel doesn't
+ *              support detach2 for this combination.
+ *   - -ENOENT: The specified program is not currently attached at the given hook
+ *              (it may have been detached already or never attached there).
+ *   - -EACCES / -EPERM: Insufficient privileges or blocked by security policy.
+ *   - -EOPNOTSUPP: Kernel lacks support for multi-program detachment for this
+ *                  attach type.
+ *   - Other negative codes: Propagated from underlying syscall (e.g., -ENOMEM
+ *     for transient resource issues).
+ *
+ * Example (detaching a cgroup eBPF program):
+ *   int prog_fd = bpf_prog_load(...);
+ *   int cg_fd = open("/sys/fs/cgroup/mygroup", O_RDONLY);
+ *   // (Assume program was previously attached via bpf_prog_attach or link API)
+ *   if (bpf_prog_detach2(prog_fd, cg_fd, BPF_CGROUP_INET_INGRESS) < 0) {
+ *       perror("bpf_prog_detach2");
+ *   }
+ *
+ * @param prog_fd        File descriptor of the loaded BPF program to be detached.
+ * @param attachable_fd  File descriptor of the attach point (e.g., cgroup FD, perf
+ *                       event FD, socket-like FD, prog array FD).
+ * @param type           BPF attach type (enum bpf_attach_type) identifying the hook
+ *                       from which to detach this program. Must match the original
+ *                       attach type used when the program was attached.
+ *
+ * @return 0 on success; < 0 on failure (negative error code as described above).
+ */
 LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
 				enum bpf_attach_type type);
 
@@ -975,7 +1244,50 @@ struct bpf_prog_test_run_attr {
 	__u32 ctx_size_out; /* in: max length of ctx_out
 			     * out: length of cxt_out */
 };
-
+/**
+ * @brief Retrieve the next existing BPF program ID after a given starting ID.
+ *
+ * This helper wraps the kernel's BPF_PROG_GET_NEXT_ID command and enumerates
+ * system-wide BPF program IDs in strictly ascending order. It is typically used
+ * to iterate over all currently loaded BPF programs from user space.
+ *
+ * Enumeration pattern:
+ *   1. Initialize start_id to 0 to obtain the first (lowest) existing program ID.
+ *   2. On success, *next_id is set to the next valid ID greater than start_id.
+ *   3. Use the returned *next_id as the new start_id for the subsequent call.
+ *   4. Repeat until the function returns -ENOENT, indicating there is no program
+ *      with ID greater than start_id (end of enumeration).
+ *
+ * Concurrency & races:
+ *   - Program creation/destruction can race with enumeration. A program whose
+ *     ID you just retrieved might disappear (be unloaded) before you convert
+ *     it to a file descriptor (e.g., via bpf_prog_get_fd_by_id()). Always
+ *     handle failures when opening by ID.
+ *   - Enumeration does not provide a consistent snapshot; newly created
+ *     programs may appear after you pass their would-be predecessor ID.
+ *
+ * Lifetime considerations:
+ *   - IDs are monotonically increasing and not reused until wraparound (which
+ *     is practically unreachable in normal operation).
+ *   - Successfully retrieving an ID does not pin or otherwise prevent program
+ *     unloading; obtain an FD immediately if you need to interact with it.
+ *
+ *
+ * @param start_id
+ *      Starting point for the search. The helper finds the first program ID
+ *      strictly greater than start_id. Use 0 to begin enumeration.
+ * @param next_id
+ *      Pointer to a __u32 that receives the next program ID on success.
+ *      Must not be NULL.
+ *
+ * @return
+ *      0 on success (next_id populated);
+ *      -ENOENT if there is no program ID greater than start_id (end of iteration);
+ *      -EINVAL if next_id is NULL or invalid arguments were supplied;
+ *      -EPERM / -EACCES if denied by security policy or lacking required privileges;
+ *      Other negative libbpf-style errors (-errno) on transient or system failures.
+ *
+ */
 LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
 /**
  * @brief Retrieve the next existing BPF map ID after a given starting ID.
@@ -1035,8 +1347,88 @@ struct bpf_get_fd_by_id_opts {
 	size_t :0;
 };
 #define bpf_get_fd_by_id_opts__last_field token_fd
-
+/**
+ * @brief Convert a kernel-assigned BPF program ID into a process-local file descriptor.
+ *
+ * bpf_prog_get_fd_by_id() wraps the BPF_PROG_GET_FD_BY_ID command of the
+ * bpf(2) syscall. Given a stable, monotonically increasing program ID, it
+ * returns a new file descriptor referring to that loaded BPF program, allowing
+ * user space to inspect or further manage the program (e.g. query info, attach,
+ * pin, update links).
+ *
+ * Typical enumeration + open pattern:
+ *   __u32 id = 0, next;
+ *   while (!bpf_prog_get_next_id(id, &next)) {
+ *       int prog_fd = bpf_prog_get_fd_by_id(next);
+ *       if (prog_fd >= 0) {
+ *           // Use prog_fd (e.g. bpf_prog_get_info_by_fd(), attach, pin, etc.)
+ *           close(prog_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop ends when bpf_prog_get_next_id() returns -ENOENT.
+ *
+ *
+ * @param id Kernel-assigned unique (non-zero) BPF program ID.
+ *
+ * @return
+ *   >= 0 : File descriptor referring to the BPF program (caller must close()).
+ *   < 0  : Negative error code (libbpf-style, e.g., -ENOENT, -EPERM).
+ */
 LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
+/**
+ * @brief Obtain a file descriptor for an existing BPF program by its kernel-assigned ID,
+ *        with extended open options.
+ *
+ * This function is an extended variant of bpf_prog_get_fd_by_id(). It wraps the
+ * BPF_PROG_GET_FD_BY_ID command of the bpf(2) syscall and converts a stable BPF
+ * program ID into a process-local file descriptor, honoring optional attributes
+ * supplied via @p opts.
+ *
+ * Typical usage pattern:
+ *   1. Enumerate program IDs with bpf_prog_get_next_id().
+ *   2. For each ID, call bpf_prog_get_fd_by_id_opts() to obtain a program FD.
+ *   3. Use the FD (e.g., bpf_prog_get_info_by_fd(), attach, pin, link operations).
+ *   4. close() the FD when no longer needed.
+ *
+ * Example:
+ *   __u32 id = ...; // obtained via bpf_prog_get_next_id()
+ *   struct bpf_get_fd_by_id_opts o = {
+ *       .sz = sizeof(o),
+ *       .open_flags = 0,
+ *   };
+ *   int prog_fd = bpf_prog_get_fd_by_id_opts(id, &o);
+ *   if (prog_fd < 0) {
+ *       // handle error
+ *   } else {
+ *       // use prog_fd
+ *       close(prog_fd);
+ *   }
+ *
+ * @param id
+ *      Kernel-assigned unique (non-zero) BPF program ID, typically obtained via
+ *      bpf_prog_get_next_id() or from a prior info query. Must be > 0.
+ * @param opts
+ *      Optional pointer to a zero-initialized struct bpf_get_fd_by_id_opts controlling
+ *      open behavior. May be NULL for defaults. Fields:
+ *        - sz: Must be set to sizeof(struct bpf_get_fd_by_id_opts) for forward/backward
+ *              compatibility if @p opts is non-NULL.
+ *        - open_flags: Requested open/access flags (kernel-specific; pass 0 unless a
+ *              documented flag is needed). Unsupported flags yield -EINVAL.
+ *        - token_fd: FD of a BPF token providing delegated permissions (set to -1 or 0
+ *              if unused). If provided, enables restricted environments to open the
+ *              program without elevated global capabilities.
+ *
+ * @return
+ *   >= 0 : File descriptor referring to the BPF program (caller must close()).
+ *   < 0  : Negative libbpf-style error code (typically -errno):
+ *            - -ENOENT : No program with @p id (unloaded or never existed).
+ *            - -EPERM / -EACCES : Insufficient privileges / denied by policy.
+ *          - -EINVAL : Bad @p id, malformed @p opts, or unsupported flags.
+ *          - -ENOMEM : Transient kernel resource exhaustion.
+ *          - Other negative codes: Propagated bpf() syscall errors.
+ *
+ */
 LIBBPF_API int bpf_prog_get_fd_by_id_opts(__u32 id,
                                           const struct bpf_get_fd_by_id_opts *opts);
 /**
@@ -1272,6 +1664,83 @@ struct bpf_prog_query_opts {
  */
 LIBBPF_API int bpf_prog_query_opts(int target, enum bpf_attach_type type,
                                    struct bpf_prog_query_opts *opts);
+/**
+ * @brief Query BPF programs attached to a given target (legacy/simple interface).
+ *
+ * bpf_prog_query() wraps the BPF_PROG_QUERY command of the bpf(2) syscall and
+ * retrieves information about one or more BPF programs attached to an attach
+ * point represented by @p target_fd for a specific attach @p type. For richer
+ * queries (including link IDs and per-program attach flags) use
+ * bpf_prog_query_opts(), which supersedes this API.
+ *
+ * Typical usage pattern:
+ *   1. Set *prog_cnt to the capacity (number of elements) of the @p prog_ids
+ *      buffer.
+ *   2. Call bpf_prog_query().
+ *   3. On success:
+ *        - If @p attach_flags is non-NULL, *attach_flags contains global
+ *          attach flags for the hook (e.g., multi-attach, replace semantics).
+ *        - *prog_cnt is updated with the number of program IDs actually written.
+ *        - prog_ids[0 .. *prog_cnt-1] holds the program IDs (ascending order
+ *          is typical but not guaranteed).
+ *
+ * Concurrency & races:
+ *   - Programs may be attached or detached concurrently. The returned list is
+ *     a snapshot at the moment of the query; programs might disappear before
+ *     you turn their IDs into FDs (via bpf_prog_get_fd_by_id()).
+ *   - Always check subsequent opens for -ENOENT.
+ *
+ * Buffer management:
+ *   - On input, *prog_cnt must reflect the capacity of @p prog_ids.
+ *   - On output, *prog_cnt is set to the number of IDs returned (0 is valid).
+ *   - If @p prog_ids is NULL, the call can still populate @p attach_flags (if
+ *     provided) and report whether any programs are attached by returning
+ *     *prog_cnt == 0 (legacy kernels may return -EINVAL in this case).
+ *
+ * @param target_fd
+ *     File descriptor of the attach point (e.g., a cgroup FD, perf event FD,
+ *     or other object that supports @p type).
+ * @param type
+ *     BPF attach type (enum bpf_attach_type) describing which hook to query
+ *     (must match how programs were attached).
+ * @param query_flags
+ *     Optional refinement flags (must be 0 unless specific flags are
+ *     supported by the running kernel; unsupported flags yield -EINVAL).
+ * @param attach_flags
+ *     Optional output pointer to receive aggregate attach flags describing
+ *     the state/behavior of the attach point. Pass NULL to ignore.
+ * @param prog_ids
+ *     Caller-provided array to receive program IDs; may be NULL only if
+ *     *prog_cnt == 0 or when only @p attach_flags is of interest (kernel
+ *     version dependent).
+ * @param prog_cnt
+ *     In: capacity (number of elements) in @p prog_ids.
+ *     Out: number of program IDs actually written. Must not be NULL.
+ *
+ * @return
+ *   0 on success (results populated as described);
+ *   < 0 a negative libbpf-style error code (typically -errno):
+ *     - -EINVAL: Bad arguments (NULL prog_cnt, unsupported query/type,
+ *                invalid flags, insufficient buffer) or target_fd not a
+ *                valid attach point for @p type.
+ *     - -ENOENT: No program(s) of this @p type attached (older kernels may
+ *                use 0 + *prog_cnt == 0 instead).
+ *     - -EPERM / -EACCES: Insufficient privileges (CAP_BPF/CAP_SYS_ADMIN)
+ *                or blocked by security policy.
+ *     - -EBADF: target_fd is not a valid file descriptor.
+ *     - -EFAULT: User memory (prog_ids / attach_flags / prog_cnt) is
+ *                unreadable or unwritable.
+ *     - -ENOMEM: Transient kernel memory/resource exhaustion.
+ *     - Other negative codes: Propagated syscall failures.
+ *
+ * Post-processing:
+ *   - Convert each returned program ID to an FD with bpf_prog_get_fd_by_id()
+ *     for further introspection or management.
+ *
+ * Recommended alternative:
+ *   - Prefer bpf_prog_query_opts() for new code; it supports link enumeration,
+ *     per-program attach flags, revision checks, and future extensions.
+ */
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
                               __u32 query_flags, __u32 *attach_flags,
                               __u32 *prog_ids, __u32 *prog_cnt);
@@ -1305,7 +1774,57 @@ struct bpf_prog_bind_opts {
     __u32 flags;
 };
 #define bpf_prog_bind_opts__last_field flags
-
+/**
+ * @brief Bind (associate) an already loaded BPF program with an existing BPF map.
+ *
+ * bpf_prog_bind_map() is a low-level libbpf helper wrapping the
+ * BPF_PROG_BIND_MAP kernel command. It establishes (or updates) an
+ * association between a loaded BPF program (prog_fd) and a map (map_fd)
+ * that the program is expected to reference at run time. This allows
+ * certain late binding or rebinding scenarios (e.g., providing a map that
+ * could not be created or located at initial program load time, or
+ * updating a program's backing/global data map after load). The exact
+ * semantics and which map types are supported are kernel-version dependent;
+ * unsupported combinations will fail with an error.
+ *
+ * Typical use cases:
+ *   - Late injection of a data/config map into a program that was loaded
+ *     without direct access to that map.
+ *   - Rebinding a program to a replacement map (e.g., upgraded layout),
+ *     where the kernel permits such updates without reloading the program.
+ *   - Establishing program <-> map relationship needed for specific kernel
+ *     features (e.g., global data sections, special helper expectations,
+ *     or JIT/runtime adjustments).
+ *
+ *
+ * Recommended pattern:
+ *   struct bpf_prog_bind_opts opts = {
+ *       .sz = sizeof(opts),
+ *       .flags = 0,
+ *   };
+ *   if (bpf_prog_bind_map(prog_fd, map_fd, &opts) < 0) {
+ *       perror("bpf_prog_bind_map");
+ *       // handle failure
+ *   }
+ *
+ * @param prog_fd File descriptor of an already loaded BPF program.
+ * @param map_fd  File descriptor of the BPF map to bind to the program.
+ * @param opts    Optional pointer to bpf_prog_bind_opts (may be NULL for defaults).
+ *                Must have opts->sz set when non-NULL. opts->flags must be 0 unless
+ *                documented otherwise.
+ *
+ * @return 0 on success; negative error code (< 0) on failure.
+ *
+ * Error handling (negative libbpf-style return codes; errno set):
+ *   - -EBADF: prog_fd or map_fd is not a valid descriptor.
+ *   - -EINVAL: Invalid arguments, unsupported map/program type combination,
+ *              malformed opts, bad flags, or kernel does not support binding.
+ *   - -EPERM / -EACCES: Insufficient privileges (CAP_BPF/CAP_SYS_ADMIN) or
+ *              blocked by LSM / lockdown policy.
+ *   - -ENOENT: The referenced program or map no longer exists (race).
+ *   - -ENOMEM: Transient kernel resource exhaustion.
+ *   - Other negative codes: Propagated from underlying bpf() syscall.
+ */
 LIBBPF_API int bpf_prog_bind_map(int prog_fd, int map_fd,
                                  const struct bpf_prog_bind_opts *opts);
 
@@ -1331,7 +1850,131 @@ struct bpf_test_run_opts {
     __u32 batch_size;
 };
 #define bpf_test_run_opts__last_field batch_size
-
+/**
+ * @brief Execute a loaded BPF program in a controlled (synthetic) context and
+ *        collect its return code, output data, and timing statistics.
+ *
+ * bpf_prog_test_run_opts() is a high-level wrapper around the kernel's
+ * BPF_PROG_TEST_RUN command. It allows user space to "test run" a program
+ * without attaching it to a live hook, supplying optional input data
+ * (data_in), optional execution context (ctx_in), and retrieving any
+ * transformed output data (data_out), context (ctx_out), program return
+ * value, and average per-run duration in nanoseconds.
+ *
+ * Typical purposes:
+ *   - Unit-style testing of program logic (e.g., XDP, TC, SK_MSG) before
+ *     deployment.
+ *   - Verifying correctness of packet mangling or map access patterns.
+ *   - Microbenchmarking via repeat execution (repeat > 1).
+ *   - Exercising program behavior under different synthetic contexts.
+ *
+ * Usage pattern (minimal):
+ *   struct bpf_test_run_opts opts = {};
+ *   opts.sz = sizeof(opts);
+ *   opts.data_in = pkt;
+ *   opts.data_size_in = pkt_len;
+ *   opts.data_out = out_buf;
+ *   opts.data_size_out = out_buf_cap;
+ *   opts.repeat = 1000;
+ *   if (bpf_prog_test_run_opts(prog_fd, &opts) == 0) {
+ *       printf("prog retval=%u avg_ns=%u out_len=%u\n",
+ *              opts.retval, opts.duration, opts.data_size_out);
+ *   } else {
+ *       perror("bpf_prog_test_run_opts");
+ *   }
+ *
+ * Structure initialization notes:
+ *   - opts.sz MUST be set to sizeof(struct bpf_test_run_opts) for
+ *     forward/backward compatibility.
+ *   - All unused fields should be zeroed (memset(&opts, 0, sizeof(opts))).
+ *   - Omit (leave NULL/zero) optional buffers you don't need (e.g., ctx_out).
+ *
+ * Input fields (set by caller):
+ *   - data_in / data_size_in:
+ *       Optional raw input buffer fed to the program. For packet-oriented
+ *       types (e.g., XDP) this simulates an ingress frame. If data_in is
+ *       NULL, data_size_in must be 0.
+ *   - data_out / data_size_out:
+ *       Optional buffer receiving (potentially) modified data. On success
+ *       data_size_out is updated with actual bytes written. If data_out
+ *       is NULL, set data_size_out = 0 (no output capture).
+ *   - ctx_in / ctx_size_in:
+ *       Optional synthetic context (e.g., struct xdp_md) passed to the
+ *       program. Only meaningful for program types expecting a context
+ *       argument. If unused, leave NULL/0.
+ *   - ctx_out / ctx_size_out:
+ *       Optional buffer to retrieve (possibly altered) context. Provide
+ *       initial max size in ctx_size_out. Set ctx_out NULL if not needed.
+ *   - repeat:
+ *       Number of times to run the program back-to-back. If > 1 the kernel
+ *       accumulates total time and returns averaged per-run duration in
+ *       opts.duration. Use for stable timing. If 0 or 1, program executes
+ *       exactly once.
+ *   - flags:
+ *       Feature/control flags (must be 0 unless a supported kernel extension
+ *       is documented; unknown bits yield errors).
+ *   - cpu:
+ *       Optional CPU index hint for program types allowing per-CPU execution
+ *       binding during test runs (e.g., for percpu data semantics). If 0 and
+ *       not meaningful for the program type, ignored. If unsupported, call
+ *       may fail with -EINVAL.
+ *   - batch_size:
+ *       For program types that support batched test execution (kernel-
+ *       dependent). Each test iteration may process up to batch_size items
+ *       internally. Leave 0 unless specifically targeting a batched mode.
+ *
+ * Output fields (populated on success):
+ *   - data_size_out:
+ *       Actual number of bytes written to data_out (may be <= original
+ *       capacity; unchanged if no output).
+ *   - ctx_size_out:
+ *       Actual number of bytes written to ctx_out (if provided).
+ *   - retval:
+ *       Program's return value (semantics depend on program type; e.g.,
+ *       XDP_* action code for XDP programs).
+ *   - duration:
+ *       Average per run execution time in nanoseconds (only meaningful
+ *       when repeat > 0; may be 0 if kernel cannot measure).
+ *
+ * Concurrency & isolation:
+ *   - Test runs occur in isolation from live attachment points; no real
+ *     packets, sockets, or kernel events are consumed.
+ *   - Map interactions are real: the program can read/update maps during
+ *     test runs. Ensure maps are in a suitable state.
+ *
+ * Data & context lifetime:
+ *   - Kernel copies input data/context before executing; caller can reuse
+ *     buffers after return.
+ *   - Output buffers must be writable and sufficiently sized; truncation
+ *     occurs if too small (reported via size_out fields).
+ *
+ * Performance measurement guidance:
+ *   - Use a sufficiently large repeat count (hundreds/thousands) to
+ *     smooth timing variance.
+ *   - Avoid measuring with data_out/ctx_out unless necessary; copying
+ *     increases overhead.
+ *
+ *
+ * @param prog_fd
+ *     File descriptor of the loaded BPF program to test.
+ * @param opts
+ *     Pointer to an initialized bpf_test_run_opts describing input,
+ *     output, and execution parameters. Must not be NULL.
+ *
+ * @return 0 on success; negative error code (< 0) on failure (errno is also set).
+ *
+ * Error handling (return value < 0, errno set):
+ *   - -EINVAL: Malformed opts (missing sz), unsupported flags, invalid
+ *              buffer sizes, or program type mismatch.
+ *   - -EPERM / -EACCES: Insufficient privileges (CAP_BPF / CAP_SYS_ADMIN)
+ *              or restricted by LSM/lockdown.
+ *   - -EFAULT: Bad user pointers (data_in/out or ctx_in/out).
+ *   - -ENOMEM: Kernel resource allocation failure.
+ *   - -ENOTSUP / -EOPNOTSUPP: Test run unsupported for this program type
+ *              or kernel version.
+ *   - Other negative codes: Propagated from underlying bpf() syscall.
+ *
+ */
 LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
                                       struct bpf_test_run_opts *opts);
 
-- 
2.34.1

From nobody Sun Feb 8 13:16:58 2026
From: Jianyun Gao
To: linux-kernel@vger.kernel.org
Cc: Jianyun Gao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer,
 bpf@vger.kernel.org (open list:BPF [GENERAL] (Safe Dynamic Programs and Tools)),
 netdev@vger.kernel.org (open list:XDP (eXpress Data Path):Keyword:(?:\b|_)xdp(?:\b|_))
Subject: [PATCH v2 3/5] libbpf: Add doxygen documentation for bpf_link_* APIs in bpf.h
Date: Fri, 31 Oct 2025 15:59:05 +0800
Message-Id: <20251031075908.1472249-4-jianyungao89@gmail.com>
In-Reply-To: <20251031075908.1472249-1-jianyungao89@gmail.com>
References: <20251031075908.1472249-1-jianyungao89@gmail.com>

Add doxygen comment blocks for all public bpf_link_* APIs in
tools/lib/bpf/bpf.h. These doc comments are for:

-bpf_link_create()
-bpf_link_detach()
-bpf_link_update()
-bpf_link_get_next_id()
-bpf_link_get_fd_by_id()
-bpf_link_get_fd_by_id_opts()

Signed-off-by: Jianyun Gao
---
v1->v2:
 - Fixed the non-ASCII characters in this patch.
The v1 is here:
https://lore.kernel.org/lkml/20251031032627.1414462-4-jianyungao89@gmail.com/

 tools/lib/bpf/bpf.h | 482 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 479 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index cd96d7afed6b..9040fc891b81 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -1208,11 +1208,195 @@ struct bpf_link_create_opts {
     size_t :0;
 };
 #define bpf_link_create_opts__last_field uprobe_multi.pid
-
+/**
+ * @brief Create a persistent BPF link that attaches a loaded BPF program to a
+ *        kernel hook or target object.
+ *
+ * bpf_link_create() wraps the BPF_LINK_CREATE syscall command and establishes
+ * a first-class in-kernel "link" object representing the attachment of
+ * @p prog_fd to @p target_fd (or to a kernel entity implied by @p attach_type).
+ * The returned FD (>= 0) owns the lifetime of that attachment: closing it
+ * cleanly detaches the program without requiring a separate detach syscall.
+ *
+ * Compared to legacy bpf_prog_attach()/bpf_raw_tracepoint_open(), link-based
+ * attachment:
+ *   - Provides explicit lifetime control (close(link_fd) == detach).
+ *   - Enables richer introspection via bpf_link_get_info_by_fd().
+ *   - Avoids ambiguous detach semantics and races inherent in "implicit detach
+ *     on last program FD close" patterns.
+ *
+ * Typical usage:
+ *   struct bpf_link_create_opts opts = {
+ *       .sz = sizeof(opts),
+ *       .flags = 0,
+ *   };
+ *   int link_fd = bpf_link_create(prog_fd, target_fd, BPF_TRACE_FENTRY, &opts);
+ *   if (link_fd < 0) {
+ *       // handle error
+ *   }
+ *   // ... use link_fd; close(link_fd) to detach later.
+ *
+ * @param prog_fd
+ *     File descriptor of a previously loaded BPF program (from bpf_prog_load()
+ *     or libbpf higher-level loader). Must be valid and compatible with
+ *     @p attach_type.
+ *
+ * @param target_fd
+ *     File descriptor of the attach target, when required by @p attach_type
+ *     (e.g. a cgroup FD, perf event FD, network interface, or another BPF
+ *     object). For some attach types (e.g. certain tracing variants) this may
+ *     be -1 or ignored; passing an inappropriate FD yields -EINVAL.
+ *
+ * @param attach_type
+ *     Enumeration value (enum bpf_attach_type) describing the hook/context
+ *     at which the program should be executed (e.g. BPF_CGROUP_INET_INGRESS,
+ *     BPF_TRACE_FENTRY, BPF_PERF_EVENT, BPF_NETFILTER, etc.). The program's
+ *     bpf_prog_type and expected_attach_type must be compatible; otherwise
+ *     verification will fail or the syscall returns -EINVAL/-EOPNOTSUPP.
+ *
+ * @param opts
+ *     Optional pointer to a zero-initialized struct bpf_link_create_opts
+ *     extended options; may be NULL for defaults. Must set opts->sz to
+ *     sizeof(struct bpf_link_create_opts) when non-NULL.
+ *
+ *     Common fields:
+ *       - .flags: Link creation flags (most callers set 0; future kernels
+ *         may define bits for pinning behaviors, exclusivity, etc.).
+ *       - .target_btf_id: For BTF-enabled tracing/fentry/fexit/kprobe multi
+ *         scenarios, identifies a BTF entity (function/type) this link
+ *         targets.
+ *       - .iter_info / .iter_info_len: Provide iterator-specific metadata
+ *         for BPF iter programs.
+ *
+ *     Attach-type specific nested unions:
+ *       - .perf_event.bpf_cookie: User-defined cookie visible to program via
+ *         bpf_get_attach_cookie() for PERF_EVENT and some tracing types.
+ *       - .kprobe_multi: Batch (multi) kprobe attachment:
+ *           * flags: KPROBE_MULTI_* flags controlling semantics.
+ *           * cnt: Number of symbols/addresses.
+ *           * syms / addrs: Symbol names or raw addresses (one of them
+ *             used depending on kernel capabilities).
+ *           * cookies: Optional per-probe cookies.
+ *       - .uprobe_multi: Batch uprobes:
+ *           * path: Target binary path.
+ *           * offsets / ref_ctr_offsets: Instruction/file offsets and
+ *             optional reference counter offsets.
+ *           * pid: Target PID (0 for any or to let kernel decide).
+ *           * cookies: Per-uprobe cookies.
+ *       - .tracing.cookie: Generic tracing cookie for newer tracing types.
+ *       - .netfilter: Attaching to Netfilter with:
+ *           * pf (protocol family), hooknum, priority, flags.
+ *       - .tcx / .netkit / .cgroup: Relative attachment variants allowing
+ *         multi-attach ordering and revision consistency:
+ *           * relative_fd / relative_id: Anchor or neighbor link/program.
+ *           * expected_revision: Revision check to avoid races (fail with
+ *             -ESTALE if mismatch).
+ *
+ *     Zero any fields you do not explicitly use for forward compatibility.
+ *
+ * @return
+ *   >= 0 : Link file descriptor (attachment active).
+ *   < 0  : Negative error code (attachment failed; program not attached).
+ *
+ * Error Handling (negative libbpf-style codes; errno also set):
+ *   - -EINVAL: Invalid prog_fd/target_fd/attach_type combination, malformed
+ *              opts, bad sizes, unsupported flags, or missing required union
+ *              fields.
+ *   - -EOPNOTSUPP / -ENOTSUP: Attach type or creation mode unsupported by
+ *              running kernel.
+ *   - -EPERM / -EACCES: Insufficient privileges (CAP_BPF/CAP_SYS_ADMIN) or
+ *              blocked by LSM/lockdown.
+ *   - -ENOENT: Target object no longer exists (race) or unresolved symbol for
+ *              kprobe/uprobes multi-attach.
+ *   - -EBADF: Invalid file descriptor(s).
+ *   - -ENOMEM: Kernel memory/resource exhaustion.
+ *   - -ESTALE: Revision mismatch when using expected_revision (atomicity guard).
+ *   - Other negative codes: Propagated from underlying bpf() syscall failures.
+ *
+ * Lifetime & Ownership:
+ *   - Success returns a link FD. Caller must close() it to detach.
+ *   - Closing the original program FD does NOT detach the link; only closing
+ *     the link FD (or explicit bpf_link_detach()) does.
+ *   - Link FDs can be pinned to bpffs via bpf_obj_pin() for persistence.
+ *
+ * Concurrency & Races:
+ *   - Linking can fail if another concurrent operation changes target's state
+ *     (revision checks can mitigate using expected_revision).
+ *   - Multi-attach environments may reorder relative attachments if not using
+ *     relative_* fields; always inspect returned link state if ordering matters.
+ *
+ * Introspection:
+ *   - Use bpf_link_get_info_by_fd(link_fd, ...) to query link metadata
+ *     (program ID, attach type, target, cookies, multi-probe details).
+ *   - Enumerate existing links via bpf_link_get_next_id() then open with
+ *     bpf_link_get_fd_by_id().
+ *
+ */
 LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
                                enum bpf_attach_type attach_type,
                                const struct bpf_link_create_opts *opts);
-
+/**
+ * @brief Detach (tear down) an existing BPF link represented by a link file descriptor.
+ *
+ * bpf_link_detach() issues the BPF_LINK_DETACH command to the kernel, breaking
+ * the association between a previously created BPF link (see bpf_link_create())
+ * and its target (cgroup, tracing hook, perf event, netfilter hook, etc.). After
+ * a successful call the program will no longer be invoked at that attach point.
+ *
+ * In most cases you do not need to call bpf_link_detach() explicitly; simply
+ * closing the link FD (close(link_fd)) also detaches the link. This helper is
+ * useful when you want to explicitly detach early while keeping the FD open for
+ * introspection (e.g., querying link info after detachment) or when building
+ * higher-level lifecycle abstractions.
+ *
+ * Semantics:
+ *   - Success makes the in-kernel link inactive; subsequent events at the hook
+ *     no longer trigger the program.
+ *   - The link FD itself does NOT automatically close; you are still responsible
+ *     for close(link_fd) to release user space resources.
+ *   - Repeated calls after a successful detach will fail (not idempotent: only
+ *     the first detach succeeds).
+ *
+ * Typical usage:
+ *   int link_fd = bpf_link_create(prog_fd, target_fd, attach_type, &opts);
+ *   ...
+ *   if (bpf_link_detach(link_fd) < 0)
+ *       perror("bpf_link_detach");
+ *   close(link_fd); // optional: now just releases the FD
+ *
+ * Concurrency & races:
+ *   - Detaching can race with another thread closing or detaching the same link.
+ *     In such cases you may observe -EBADF or -ENOENT.
+ *   - Once detached, the program can be safely re-attached elsewhere if desired
+ *     (requires a new link via bpf_link_create()).
+ *
+ * Privileges:
+ *   - Usually requires CAP_BPF and/or CAP_SYS_ADMIN depending on kernel
+ *     configuration, LSM, and lockdown mode. Lack of privileges yields -EPERM
+ *     or -EACCES.
+ *
+ * Post-detach:
+ *   - The program object remains loaded; its own FD is still valid and can be
+ *     attached again.
+ *   - Maps referenced by the program are unaffected.
+ *
+ * @param link_fd File descriptor of the active BPF link to detach; must have
+ *                been obtained via bpf_link_create() or equivalent.
+ *
+ * @return 0 on success; < 0 on failure (negative error code as described above).
+ *
+ * Error handling (negative libbpf-style return codes, errno also set):
+ *   - -EBADF: link_fd is not a valid open file descriptor.
+ *   - -EINVAL: link_fd does not refer to a BPF link, or the kernel does not
+ *              support BPF_LINK_DETACH for this link type.
+ *   - -ENOENT: Link already detached or no longer exists (race with close()).
+ *   - -EPERM / -EACCES: Insufficient privileges or denied by security policy.
+ *   - -EOPNOTSUPP / -ENOTSUP: Kernel lacks support for link detachment of this
+ *              specific attach type.
+ *   - -ENOMEM: Transient kernel resource exhaustion (rare in this path).
+ *   - Other negative codes may be propagated from the underlying bpf() syscall.
+ *
+ */
 LIBBPF_API int bpf_link_detach(int link_fd);
 
 struct bpf_link_update_opts {
@@ -1222,7 +1406,89 @@ struct bpf_link_update_opts {
     __u32 old_map_fd;  /* expected old map FD */
 };
 #define bpf_link_update_opts__last_field old_map_fd
-
+/**
+ * @brief Atomically replace (update) the BPF program or map referenced by an
+ *        existing link with a new program.
+ *
+ * bpf_link_update() wraps the BPF_LINK_UPDATE command of the bpf(2) syscall.
+ * It allows retargeting an already established BPF link (identified by
+ * link_fd) to point at a different loaded BPF program (new_prog_fd) without
+ * having to tear the link down (detach) and recreate it. This is typically
+ * used for hot-swapping a program while preserving:
+ *   - Link pinning (bpffs path remains valid).
+ *   - Relative ordering in multi-attach contexts (TC/XDP/cgroup revisions).
+ *   - Existing references held by other processes.
+ *
+ * Consistency & safety:
+ *   - The update is performed atomically: events arriving at the hook will
+ *     either see the old program before the call, or the new one after the
+ *     call; no window exists with an unattached link.
+ *   - Optional expectations can be enforced via @p opts to avoid races:
+ *       * old_prog_fd: Fail with -ESTALE if the link does not currently
+ *         reference that program.
+ *       * old_map_fd: (Kernel dependent) Can be used when links encapsulate
+ *         a map association; if set and mismatched, update fails.
+ *       * flags: Future extension bits (must be 0 on current kernels).
+ *
+ * Typical usage:
+ *   struct bpf_link_update_opts u = {
+ *       .sz = sizeof(u),
+ *       .flags = 0,
+ *       .old_prog_fd = old_fd, // set to 0 to skip validation
+ *   };
+ *   if (bpf_link_update(link_fd, new_prog_fd, &u) < 0)
+ *       perror("bpf_link_update");
+ *
+ * Preconditions:
+ *   - link_fd must refer to a valid, updatable BPF link. Not all link types
+ *     support in-place program replacement; unsupported types return -EOPNOTSUPP.
+ *   - new_prog_fd must be a loaded BPF program whose type and expected attach
+ *     type are compatible with the link's attach context.
+ *   - If @p opts is non-NULL, opts->sz MUST be set to sizeof(*opts).
+ *
+ * @param link_fd
+ *     File descriptor of the existing BPF link to be updated.
+ * @param new_prog_fd
+ *     File descriptor of the newly loaded BPF program that should replace
+ *     the currently attached program.
+ * @param opts
+ *     Optional pointer to bpf_link_update_opts controlling validation:
+ *       - sz: Structure size for forward/backward compatibility.
+ *       - flags: Reserved; must be 0 (unsupported bits yield -EINVAL).
+ *       - old_prog_fd: Expected current program FD (0 to skip check).
+ *       - old_map_fd: Expected current map FD (0 to skip; kernel-specific).
+ *     Pass NULL for default (no expectation checks).
+ *
+ * @return
+ *   0 on success (link now points to new_prog_fd).
+ *   <0 negative libbpf-style error code (typically -errno):
+ *     - -EBADF: Invalid link_fd or new_prog_fd.
+ *     - -EINVAL: Malformed opts (bad sz/flags) or incompatible program type.
+ *     - -EOPNOTSUPP: Link type does not support updates.
+ *     - -EPERM / -EACCES: Insufficient privileges (CAP_BPF/CAP_SYS_ADMIN) or blocked by LSM.
+ *     - -ENOENT: Link no longer exists (race) or old_prog_fd refers to a non-existent program.
+ *     - -ESTALE: Expectation mismatch (old_prog_fd / old_map_fd differs).
+ *     - -ENOMEM: Kernel resource allocation failure.
+ *     - Other -errno codes propagated from the bpf() syscall.
+ *
+ * Postconditions:
+ *   - On success, the old program remains loaded; caller should close its FD
+ *     if no longer needed.
+ *   - Pinning status and link ID are preserved.
+ *   - Maps referenced by the new program must be valid; no automatic rebinding
+ *     occurs beyond program substitution.
+ *
+ * Caveats:
+ *   - If verifier features differ (e.g., CO-RE relocations) ensure the new
+ *     program was loaded with compatible expectations for the same hook.
+ *   - Updating to a program with strictly different attach semantics (e.g.,
+ *     sleepable vs non-sleepable) is rejected if the link type disallows it.
+ *
+ * Thread safety:
+ *   - Safe to call concurrently with other update attempts; only one succeeds.
+ *   - Consumers of the link see either old or new program; intermediate states
+ *     are not observable.
+ */
 LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
                                const struct bpf_link_update_opts *opts);
 
@@ -1338,6 +1604,72 @@ LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
  */
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
+/**
+ * @brief Retrieve the next existing BPF link ID after a given starting ID.
+ *
+ * This helper wraps the kernel's BPF_LINK_GET_NEXT_ID command and enumerates
+ * system-wide BPF link objects (each representing a persistent attachment of
+ * a BPF program) in strictly ascending order of their kernel-assigned IDs.
+ * It is typically used to iterate over all currently existing BPF links from
+ * user space.
+ *
+ * Enumeration pattern:
+ *   1. Initialize start_id to 0 to obtain the first (lowest) existing link ID.
+ *   2. On success, *next_id is set to the first link ID strictly greater than start_id.
+ *   3. Use the returned *next_id as the new start_id for the subsequent call.
+ *   4. Repeat until the function returns -ENOENT, indicating there is no link
+ *      with ID greater than start_id (end of enumeration).
+ *
+ * Concurrency & races:
+ *   - Links can be created or detached concurrently with enumeration. A link ID
+ *     you just retrieved might become invalid before you convert it to an FD
+ *     (via bpf_link_get_fd_by_id()). Always handle failures when opening by ID.
+ *   - Enumeration does not provide a consistent snapshot; links created after
+ *     you pass their predecessor ID may appear in later iterations.
+ *
+ * Lifetime considerations:
+ *   - Link IDs are monotonically increasing and not reused until wraparound
+ *     (effectively unreachable in normal operation).
+ *   - Successfully retrieving an ID does not pin or otherwise prevent link
+ *     detachment; obtain an FD immediately if you need to interact with the link.
+ *
+ * Usage example:
+ *   __u32 id = 0, next;
+ *   while (bpf_link_get_next_id(id, &next) == 0) {
+ *       int link_fd = bpf_link_get_fd_by_id(next);
+ *       if (link_fd >= 0) {
+ *           // Inspect link (e.g., bpf_link_get_info_by_fd(link_fd))
+ *           close(link_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop terminates when -ENOENT is returned.
+ *
+ * @param start_id
+ *        Starting point for the search. The helper finds the first link ID
+ *        strictly greater than start_id. Use 0 to begin enumeration.
+ * @param next_id
+ *        Pointer to a __u32 that receives the next link ID on success.
+ *        Must not be NULL.
+ *
+ * @return
+ *   0 on success (next_id populated);
+ *   -ENOENT if there is no link ID greater than start_id (end of iteration);
+ *   -EINVAL if next_id is NULL or invalid arguments were supplied;
+ *   -EPERM / -EACCES if denied by security policy or lacking required privileges;
+ *   Other negative libbpf-style errors (-errno) on transient or system failures.
+ *
+ * Error handling notes:
+ *   - Treat -ENOENT as normal termination (not an error condition).
+ *   - For other negative returns, errno will also be set to the underlying cause.
+ *
+ * After enumeration:
+ *   - Convert retrieved IDs to FDs with bpf_link_get_fd_by_id() for introspection
+ *     or detachment (via bpf_link_detach()).
+ *   - Closing the FD does not destroy the link if other references remain (e.g.,
+ *     pinned in bpffs); the link persists until explicitly detached or all
+ *     references are released.
+ */
 LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
 
 struct bpf_get_fd_by_id_opts {
@@ -1548,9 +1880,153 @@ LIBBPF_API int bpf_map_get_fd_by_id_opts(__u32 id,
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_btf_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
+/**
+ * @brief Obtain a file descriptor for an existing BPF link given its kernel-assigned ID.
+ *
+ * bpf_link_get_fd_by_id() wraps the BPF_LINK_GET_FD_BY_ID command of the bpf(2)
+ * syscall. A BPF "link" is a persistent in-kernel object representing an
+ * attachment of a BPF program to some hook (cgroup, tracing point, perf event,
+ * netfilter hook, tc/xdp chain, etc.). Each link has a unique, monotonically
+ * increasing ID. This helper converts such an ID into a process-local file
+ * descriptor, allowing user space to inspect, pin, update, or detach the link.
+ *
+ * Typical enumeration + open pattern:
+ *   __u32 id = 0, next;
+ *   while (bpf_link_get_next_id(id, &next) == 0) {
+ *       int link_fd = bpf_link_get_fd_by_id(next);
+ *       if (link_fd >= 0) {
+ *           // Use link_fd (e.g. bpf_link_get_info_by_fd(), bpf_link_detach(), pin)
+ *           close(link_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop terminates when bpf_link_get_next_id() returns -ENOENT.
+ *
+ * Concurrency & races:
+ *   - A link may be detached (or otherwise invalidated) between discovering its ID
+ *     and calling this function. In that case the call fails with -ENOENT.
+ *   - Successfully retrieving a file descriptor does not prevent later detachment
+ *     by other processes; always handle subsequent operation failures gracefully.
+ *
+ * Lifetime & ownership:
+ *   - On success, the caller owns the returned FD and must close() it when done.
+ *   - Closing the FD decreases the user space reference count; the underlying link
+ *     persists while any references (FDs or pinned bpffs path) remain.
+ *   - Detaching the link (via bpf_link_detach() or closing the last active FD)
+ *     invalidates future operations on that FD.
+ *
+ * Privileges / access control:
+ *   - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel configuration,
+ *     LSM policy, or lockdown mode. Lack of privileges yields -EPERM / -EACCES.
+ *   - Security policies can deny access even if the link ID exists.
+ *
+ * Error handling (negative libbpf-style codes; errno is also set):
+ *   - -ENOENT: No link with the specified ID (never existed or already detached).
+ *   - -EPERM / -EACCES: Insufficient privilege or blocked by security policy.
+ *   - -EINVAL: Invalid ID (e.g., 0) or kernel rejected the request (rare).
+ *   - -ENOMEM: Transient kernel resource exhaustion while creating the FD.
+ *   - -EBADF, -EFAULT, or other -errno values: Propagated from the underlying syscall.
+ *
+ * Usage notes:
+ *   - Immediately call bpf_link_get_info_by_fd() after acquiring the FD if you need
+ *     metadata (program ID, attach type, target, cookie, etc.).
+ *   - To keep a link across process restarts, pin it to bpffs via bpf_obj_pin().
+ *   - Prefer using bpf_link_get_fd_by_id_opts() if you need extended open semantics
+ *     (e.g., token-based delegated permissions) on newer kernels.
+ *
+ * @param id
+ *        Kernel-assigned unique ID of the target BPF link (must be > 0). Usually
+ *        obtained via bpf_link_get_next_id() or from a prior info query.
+ *
+ * @return
+ *   >= 0 : File descriptor referring to the BPF link (caller must close()).
+ *   < 0  : Negative error code (libbpf-style, typically -errno) on failure.
+ */
 LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
+/**
+ * @brief Obtain a file descriptor for an existing BPF link by kernel-assigned link ID
+ *        with extended open options.
+ *
+ * bpf_link_get_fd_by_id_opts() is an extended variant of bpf_link_get_fd_by_id().
+ * It wraps the BPF_LINK_GET_FD_BY_ID command of the bpf(2) syscall and converts a
+ * stable, monotonically increasing BPF link ID into a process-local file descriptor
+ * while honoring optional attributes supplied via @p opts.
+ *
+ * A BPF "link" represents a persistent attachment of a BPF program to some kernel
+ * hook (cgroup, tracing point, perf event, netfilter, tc/xdp chain, etc.). Links can
+ * be enumerated system-wide by first calling bpf_link_get_next_id().
+ *
+ * Typical enumeration + open pattern:
+ *   __u32 id = 0, next;
+ *   while (bpf_link_get_next_id(id, &next) == 0) {
+ *       struct bpf_get_fd_by_id_opts o = {
+ *           .sz = sizeof(o),
+ *           .open_flags = 0,
+ *           .token_fd = 0,
+ *       };
+ *       int link_fd = bpf_link_get_fd_by_id_opts(next, &o);
+ *       if (link_fd >= 0) {
+ *           // inspect link (e.g. bpf_link_get_info_by_fd(link_fd))
+ *           close(link_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop ends when bpf_link_get_next_id() returns -ENOENT (no more links).
+ *
+ * Concurrency & races:
+ *   - A link may detach between enumeration and opening; handle -ENOENT gracefully.
+ *   - Successfully obtaining a FD does not prevent future detachment by other processes;
+ *     subsequent operations (e.g., bpf_link_get_info_by_fd()) can still fail.
+ *
+ * Lifetime & ownership:
+ *   - The returned FD holds a user-space reference; close() decrements it.
+ *   - The underlying link persists while any references remain (FDs or bpffs pin).
+ *   - Use bpf_obj_pin() to make the link persistent across process lifetimes.
+ *
+ * Security:
+ *   - CAP_BPF and/or CAP_SYS_ADMIN may be required depending on kernel configuration.
+ *   - Token-based access (token_fd) can allow operations in sandboxed environments.
+ *
+ * Follow-up introspection:
+ *   - Call bpf_link_get_info_by_fd(link_fd, ...) to retrieve program ID, attach type,
+ *     target info, cookies, and other metadata.
+ *   - Detach via bpf_link_detach(link_fd) or simply close(link_fd).
+ *
+ * Recommended usage notes:
+ *   - Always zero-initialize the opts struct before setting fields.
+ *   - Treat -ENOENT after enumeration as normal termination, not an error condition.
+ *   - Avoid relying on stable ordering beyond ascending ID sequence; links created
+ *     during enumeration may appear after you pass their predecessor ID.
+ *
+ * @param id
+ *        Kernel-assigned unique (non-zero) BPF link ID. Usually obtained from
+ *        bpf_link_get_next_id() or from a prior info query. Must be > 0.
+ *
+ * @param opts
+ *        Optional pointer to a zero-initialized struct bpf_get_fd_by_id_opts:
+ *          - sz: MUST be set to sizeof(struct bpf_get_fd_by_id_opts) if @p opts
+ *            is non-NULL (enables fwd/backward compatibility).
+ *          - open_flags: Additional open/access flags (currently most callers set 0;
+ *            unsupported bits yield -EINVAL; semantics are kernel-specific).
+ *          - token_fd: File descriptor of a BPF token granting delegated permissions
+ *            (set 0 or -1 if unused). Allows restricted environments to
+ *            open the link without elevated global capabilities.
+ *        Pass NULL for defaults (equivalent to open_flags=0, no token).
+ *
+ * @return
+ *   >= 0 : File descriptor referencing the BPF link (caller owns it; close() when done).
+ *   < 0  : Negative libbpf-style error code (typically -errno):
+ *          - -ENOENT : Link with @p id does not exist (detached or never created).
+ *          - -EPERM / -EACCES : Insufficient privilege or blocked by LSM/lockdown.
+ *          - -EINVAL : Invalid @p id (0), malformed @p opts (bad sz / flags), or
+ *            unsupported open_flags.
+ *          - -ENOMEM : Transient kernel memory/resource exhaustion.
+ *          - Other negative codes: Propagated from underlying bpf() syscall.
+ *
+ */
 LIBBPF_API int bpf_link_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
+
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
 
 /**
-- 
2.34.1

From nobody Sun Feb 8 13:16:58 2026
From: Jianyun Gao
To: linux-kernel@vger.kernel.org
Cc: Jianyun Gao, Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
    Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
    John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
    bpf@vger.kernel.org (open list:BPF [LIBRARY] (libbpf))
Subject: [PATCH v2 4/5] libbpf: Add doxygen documentation for bpf_obj_* APIs in bpf.h
Date: Fri, 31 Oct 2025 15:59:06 +0800
Message-Id: <20251031075908.1472249-5-jianyungao89@gmail.com>
In-Reply-To: <20251031075908.1472249-1-jianyungao89@gmail.com>
References: <20251031075908.1472249-1-jianyungao89@gmail.com>

Add doxygen comment blocks for all public bpf_obj_* APIs in
tools/lib/bpf/bpf.h. These doc comments are for:

-bpf_obj_pin()
-bpf_obj_pin_opts()
-bpf_obj_get()
-bpf_obj_get_opts()
-bpf_obj_get_info_by_fd()

Signed-off-by: Jianyun Gao
---
v1->v2:
 - Fixed the non-ASCII characters in this patch.
The v1 is here:
https://lore.kernel.org/lkml/20251031032627.1414462-5-jianyungao89@gmail.com/

 tools/lib/bpf/bpf.h | 430 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 427 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 9040fc891b81..a0cebda09e16 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -900,8 +900,175 @@ struct bpf_obj_pin_opts {
 	size_t :0;
 };
 #define bpf_obj_pin_opts__last_field path_fd
-
+/**
+ * @brief Pin a BPF object (map, program, BTF, link, etc.) to a persistent
+ *        location in the BPF filesystem (bpffs).
+ *
+ * bpf_obj_pin() wraps the BPF_OBJ_PIN command and creates a bpffs file
+ * at @p pathname that permanently references the in-kernel BPF object
+ * associated with @p fd. Once pinned, the object survives process exit
+ * and can later be reopened (referenced) by other processes via
+ * bpf_obj_get()/bpf_obj_get_opts().
+ *
+ * Typical use cases:
+ *  - Share maps or programs across processes (e.g., loader + consumer).
+ *  - Preserve objects across service restarts.
+ *  - Provide stable, discoverable paths for orchestration tooling.
+ *
+ * Requirements:
+ *  - The BPF filesystem (usually mounted at /sys/fs/bpf) must be mounted.
+ *  - All parent directories in @p pathname must already exist; this helper
+ *    does NOT create intermediate directories.
+ *  - @p fd must reference a pin-able BPF object (map, program, link, BTF, etc.).
+ *
+ * Idempotency & overwriting:
+ *  - If a file already exists at @p pathname, the call fails (typically
+ *    with -EEXIST). Remove or rename the existing entry before pinning
+ *    a new object to that path.
+ *
+ * Lifetime semantics:
+ *  - Pinning increments the in-kernel object's refcount. The object will
+ *    remain alive until the pinned bpffs entry is removed and all other
+ *    references (FDs, links, attachments) are closed.
+ *  - Closing @p fd after pinning does NOT unpin the object.
+ *
+ * Security & permissions:
+ *  - Usually requires write permission to the bpffs mount and appropriate
+ *    capabilities (CAP_BPF and/or CAP_SYS_ADMIN depending on kernel/LSM).
+ *  - Path components must not traverse outside bpffs (no ".." escapes).
+ *
+ * Example:
+ *   int map_fd = bpf_map_create(...);
+ *   if (map_fd < 0)
+ *       return -1;
+ *   if (bpf_obj_pin(map_fd, "/sys/fs/bpf/myapp/session_map") < 0) {
+ *       perror("bpf_obj_pin");
+ *       // handle error (e.g., create parent dir, adjust permissions)
+ *   }
+ *
+ * Re-opening later:
+ *   int pinned_fd = bpf_obj_get("/sys/fs/bpf/myapp/session_map");
+ *   if (pinned_fd >= 0) {
+ *       // use map
+ *       close(pinned_fd);
+ *   }
+ *
+ * @param fd       File descriptor of the loaded BPF object to pin.
+ * @param pathname Absolute or relative path inside bpffs where the object
+ *                 should be pinned (e.g. "/sys/fs/bpf/my_map"). Must not be NULL.
+ *
+ * @return 0 on success; < 0 negative error code (libbpf style == -errno) on failure.
+ *
+ * Common errors (negative libbpf-style return codes == -errno):
+ *  - -EBADF: @p fd is not a valid BPF object FD.
+ *  - -EINVAL: @p fd refers to an object type that cannot be pinned, or
+ *    pathname is invalid.
+ *  - -EEXIST: A file already exists at @p pathname.
+ *  - -ENOENT: One or more parent directories in the path do not exist.
+ *  - -ENOTDIR: A path component expected to be a directory is not.
+ *  - -EPERM / -EACCES: Insufficient privileges or denied by security policy.
+ *  - -ENOMEM: Kernel failed to allocate internal metadata.
+ *  - Other -errno codes may be propagated from the underlying syscall.
+ *
+ */
 LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
+
+/**
+ * @brief Pin a BPF object (map, program, BTF, link, etc.) to bpffs with
+ *        extended options controlling filesystem open semantics.
+ *
+ * This is an extended variant of bpf_obj_pin() that allows specifying
+ * additional pinning attributes through @p opts.
+ * On success a new file (bpffs inode) at @p pathname references the
+ * in-kernel BPF object associated with @p fd, incrementing its refcount
+ * and making it persist beyond the lifetime of the creating process.
+ *
+ * Differences vs bpf_obj_pin():
+ *  - Supports optional struct bpf_obj_pin_opts for forward/backward
+ *    compatibility without breaking older kernels.
+ *  - Allows passing file creation flags (opts->file_flags) and a
+ *    directory file descriptor (opts->path_fd) for path resolution
+ *    using the underlying kernel support (e.g. enabling O_EXCL-style
+ *    semantics if/when supported).
+ *
+ * Typical usage:
+ *   struct bpf_obj_pin_opts popts = {
+ *       .sz = sizeof(popts),
+ *       .file_flags = 0,    // reserved / must be 0 unless documented
+ *       .path_fd = -1,      // optional dir FD; -1 means unused
+ *   };
+ *   if (bpf_obj_pin_opts(obj_fd, "/sys/fs/bpf/myapp/session_map", &popts) < 0) {
+ *       perror("bpf_obj_pin_opts");
+ *       // handle error (inspect errno or negative return value)
+ *   }
+ *
+ * Notes on @p pathname:
+ *  - Must reside within a mounted BPF filesystem (bpffs), typically
+ *    /sys/fs/bpf.
+ *  - All parent directories must already exist; intermediate directories
+ *    are not created automatically.
+ *  - Existing path results in -EEXIST (no overwrite).
+ *  - Avoid relative paths that could escape bpffs (no ".." traversal).
+ *
+ * opts initialization:
+ *  - If @p opts is non-NULL, opts->sz MUST be set to sizeof(*opts).
+ *  - Unused/unknown fields should be zeroed for forward compatibility.
+ *  - opts->file_flags: Currently reserved; pass 0 unless a kernel
+ *    extension explicitly documents valid bits (non-zero may yield
+ *    -EINVAL on older kernels).
+ *  - opts->path_fd: Optional directory file descriptor that serves as
+ *    the base for relative @p pathname resolution (similar to *at()
+ *    syscalls). Set to -1 or 0 to ignore and use normal absolute path
+ *    semantics. If used, ensure it refers to bpffs.
+ *
+ * Concurrency:
+ *  - Pinning is atomic with respect to path name; two simultaneous
+ *    attempts to pin to the same pathname will result in one success
+ *    and one -EEXIST failure.
+ *  - After success, closing @p fd does NOT unpin; removal of the pinned
+ *    bpffs file (unlink) plus closing all other references is required
+ *    to allow object destruction.
+ *
+ * Security / Privileges:
+ *  - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel
+ *    configuration, LSM policy, and lockdown mode.
+ *  - Filesystem permissions on bpffs apply; lack of write/execute on
+ *    parent directories yields -EACCES / -EPERM.
+ *
+ * After pinning:
+ *  - Object can be reopened via bpf_obj_get()/bpf_obj_get_opts() using
+ *    the same pathname.
+ *  - Can be safely shared across processes and persists across
+ *    restarts until explicitly unpinned (unlink).
+ *
+ * Best practices:
+ *  - Zero-initialize opts: struct bpf_obj_pin_opts popts = {};
+ *  - Always set popts.sz = sizeof(popts) when passing opts.
+ *  - Validate that bpffs is mounted (e.g., stat("/sys/fs/bpf")) before
+ *    attempting to pin.
+ *  - Use distinct subdirectories (e.g., /sys/fs/bpf//...) to avoid
+ *    naming collisions and facilitate cleanup.
+ *
+ * @param fd       File descriptor of the loaded BPF object to pin.
+ * @param pathname Absolute (recommended) or relative path inside bpffs
+ *                 identifying where to create the pin entry. Must not be NULL.
+ * @param opts     Optional pointer to a struct bpf_obj_pin_opts providing
+ *                 extended pin options; may be NULL for defaults.
+ *
+ * @return 0 on success; < 0 negative error code (libbpf style == -errno) on failure.
+ *
+ * Error handling (negative libbpf-style return codes == -errno):
+ *  - -EBADF: Invalid @p fd, or @p path_fd (if used) not a valid directory FD.
+ *  - -EINVAL: opts->sz mismatch, unsupported file_flags, invalid pathname,
+ *    or object type cannot be pinned.
+ *  - -EEXIST: A file already exists at @p pathname.
+ *  - -ENOENT: Parent directory component missing (or @p path_fd base invalid).
+ *  - -ENOTDIR: A path component expected to be a directory is not.
+ *  - -EPERM / -EACCES: Insufficient privileges or blocked by security policy.
+ *  - -ENOMEM: Kernel failed to allocate internal metadata.
+ *  - Other -errno codes may be propagated from the underlying bpf() syscall.
+ *
+ */
 LIBBPF_API int bpf_obj_pin_opts(int fd, const char *pathname,
 				const struct bpf_obj_pin_opts *opts);
 
@@ -914,8 +1081,190 @@ struct bpf_obj_get_opts {
 	size_t :0;
 };
 #define bpf_obj_get_opts__last_field path_fd
-
+/**
+ * @brief Open (re-reference) a pinned BPF object by its bpffs pathname.
+ *
+ * bpf_obj_get() wraps the BPF_OBJ_GET command of the bpf(2) syscall. It
+ * converts a persistent BPF filesystem (bpffs) entry (previously created
+ * with bpf_obj_pin()/bpf_obj_pin_opts()) back into a live file descriptor
+ * that the caller owns and can use for further operations (e.g. map
+ * lookups/updates, program introspection, link detachment, BTF queries).
+ *
+ * Supported object kinds (depending on kernel version):
+ *  - Maps
+ *  - Programs
+ *  - BTF objects
+ *  - Links
+ *  - (Future kinds may also become accessible through the same API)
+ *
+ * Typical usage:
+ *   int fd = bpf_obj_get("/sys/fs/bpf/myapp/session_map");
+ *   if (fd < 0) {
+ *       perror("bpf_obj_get");
+ *       // handle error
+ *   } else {
+ *       // use fd
+ *       close(fd);
+ *   }
+ *
+ * Path requirements:
+ *  - @p pathname must reside inside a mounted BPF filesystem (usually
+ *    /sys/fs/bpf).
+ *  - Intermediate directories must already exist.
+ *  - The path must reference a previously pinned object; regular files
+ *    or non-BPF entries yield errors.
+ *
+ * Lifetime semantics:
+ *  - Success returns a new file descriptor referencing the existing
+ *    in-kernel object; the object's lifetime is extended while this FD
+ *    (and any others) remain open or while the bpffs entry stays pinned.
+ *  - Closing the returned FD does not remove or unpin the object.
+ *  - To permanently remove the object, unlink the bpffs path and close
+ *    all remaining descriptors.
+ *
+ * Concurrency & races:
+ *  - If the pinned entry is removed (unlink) between name resolution and
+ *    the syscall, the call may fail with -ENOENT.
+ *  - Multiple opens of the same pinned path are safe and return distinct
+ *    FDs.
+ *
+ * Privileges & security:
+ *  - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel config,
+ *    LSM policies, and lockdown mode.
+ *  - Filesystem permission checks apply (read/search on parent dirs).
+ *
+ * Thread safety:
+ *  - The function itself is thread-safe; distinct threads can open the
+ *    same pinned path concurrently.
+ *
+ * Performance considerations:
+ *  - Operation cost is dominated by path lookup and a single bpf()
+ *    syscall; typically negligible compared to subsequent map/program
+ *    usage.
+ *
+ * @param pathname Absolute (recommended) or relative bpffs path of the
+ *                 pinned BPF object; must not be NULL.
+ *
+ * @return >= 0 : File descriptor referencing the object (caller must close()).
+ *         < 0  : Negative error code (libbpf style, see list above).
+ *
+ *
+ * Error handling (negative libbpf-style return codes == -errno):
+ *  - -ENOENT: Path does not exist or was unpinned.
+ *  - -ENOTDIR: A path component expected to be a directory is not.
+ *  - -EACCES / -EPERM: Insufficient privileges or denied by security policy.
+ *  - -EINVAL: Path does not refer to a valid pinned BPF object (type mismatch,
+ *    corrupted entry, or unsupported kernel feature).
+ *  - -ENOMEM: Kernel could not allocate internal resources.
+ *  - -EBADF: Rare: internal descriptor handling failed.
+ *  - Other negative codes propagated from the underlying syscall.
+ *
+ */
 LIBBPF_API int bpf_obj_get(const char *pathname);
+
+/**
+ * @brief Open (re-reference) a pinned BPF object with extended options.
+ *
+ * bpf_obj_get_opts() is an extended variant of bpf_obj_get() that wraps the
+ * BPF_OBJ_GET command of the bpf(2) syscall. It converts a bpffs pathname
+ * (previously created via bpf_obj_pin()/bpf_obj_pin_opts()) into a process-local
+ * file descriptor referencing the underlying in-kernel BPF object (map, program,
+ * BTF object, link, etc.), honoring additional lookup/open semantics supplied
+ * through @p opts.
+ *
+ * Extended capabilities vs bpf_obj_get():
+ *  - Structured forward/backward compatibility via @p opts->sz.
+ *  - Optional directory FD-relative path resolution (opts->path_fd),
+ *    similar to *at() family syscalls (openat, fstatat, etc.).
+ *  - Future room for file/open semantic modifiers (opts->file_flags).
+ *
+ * Requirements:
+ *  - The target pathname must reside inside a mounted BPF filesystem
+ *    (usually /sys/fs/bpf). Relative paths are resolved either against
+ *    the current working directory (if opts->path_fd is -1 or 0) or
+ *    against the directory represented by opts->path_fd.
+ *  - All parent directories must already exist; intermediate components
+ *    are not created automatically.
+ *  - The bpffs entry at @p pathname must refer to a pinned BPF object.
+ *
+ * Lifetime semantics:
+ *  - Success returns a new file descriptor owning a user space reference
+ *    to the object. Closing this FD does NOT unpin or destroy the object
+ *    if other references (FDs or pinned entries) remain.
+ *  - To remove the persistent reference, unlink(2) the bpffs path and
+ *    close all remaining FDs.
+ *
+ * Concurrency & races:
+ *  - If the pinned entry is unlinked concurrently, the call may fail
+ *    with -ENOENT.
+ *  - Multiple successful opens of the same path yield distinct FDs.
+ *
+ * Security / privileges:
+ *  - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel config,
+ *    LSM policies, or lockdown mode.
+ *  - Filesystem permission checks apply to path traversal and directory
+ *    components (execute/search permissions).
+ *
+ * @param pathname
+ *        Absolute or relative bpffs path of the pinned BPF object. Must
+ *        not be NULL. If relative and opts->path_fd is a valid directory
+ *        FD, resolution is performed relative to that directory; otherwise
+ *        relative to the process's current working directory.
+ * @param opts
+ *        Optional pointer to a zero-initialized bpf_obj_get_opts structure.
+ *        May be NULL for default behavior. Fields:
+ *          - sz: MUST be set to sizeof(struct bpf_obj_get_opts) when @p opts
+ *            is non-NULL; mismatch causes -EINVAL.
+ *          - file_flags: Reserved for future extensions; MUST be 0 on
+ *            current kernels or the call may fail with -EINVAL.
+ *          - path_fd: Directory file descriptor for *at()-style relative
+ *            path resolution. Set to -1 (or 0) to ignore and use normal
+ *            pathname semantics. Must reference a directory within bpffs
+ *            if used with relative @p pathname.
+ *
+ * @return
+ *   >= 0 : File descriptor referencing the pinned BPF object (caller must close()).
+ *   < 0  : Negative libbpf-style error code (== -errno):
+ *          - -ENOENT: Path does not exist or was unpinned.
+ *          - -ENOTDIR: A path component is not a directory; or opts->path_fd
+ *            is not a directory when required.
+ *          - -EACCES / -EPERM: Insufficient privileges or denied by security policy.
+ *          - -EBADF: Invalid opts->path_fd (not an open FD) or internal FD misuse.
+ *          - -EINVAL: opts->sz mismatch, unsupported file_flags, invalid pathname,
+ *            or path does not refer to a valid pinned BPF object.
+ *          - -ENOMEM: Kernel failed to allocate internal metadata/resources.
+ *          - Other -errno codes may be propagated from the underlying syscall.
+ *
+ * Usage example:
+ *   struct bpf_obj_get_opts gopts = {
+ *       .sz = sizeof(gopts),
+ *       .file_flags = 0,
+ *       .path_fd = -1,
+ *   };
+ *   int fd = bpf_obj_get_opts("/sys/fs/bpf/myapp/session_map", &gopts);
+ *   if (fd < 0) {
+ *       // handle error (inspect -fd or errno)
+ *   } else {
+ *       // use fd
+ *       close(fd);
+ *   }
+ *
+ * Best practices:
+ *  - Always zero-initialize the opts struct before setting recognized fields.
+ *  - Verify bpffs is mounted (e.g., stat("/sys/fs/bpf")) before calling.
+ *  - Avoid passing non-zero file_flags until documented by newer kernels.
+ *  - Treat -ENOENT as a normal condition if the object might have been
+ *    cleaned up asynchronously.
+ *
+ * Thread safety:
+ *  - Safe to call concurrently from multiple threads; each successful call
+ *    yields its own FD.
+ *
+ * Forward compatibility:
+ *  - Unrecognized future fields must remain zeroed to avoid -EINVAL.
+ *  - Ensure opts->sz matches the libbpf version's struct size to enable
+ *    kernel-side bounds checking and extension handling.
+ */
 LIBBPF_API int bpf_obj_get_opts(const char *pathname,
 				const struct bpf_obj_get_opts *opts);
 /**
@@ -1603,6 +1952,7 @@ LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
  *         Other negative libbpf-style errors for transient or system failures.
  */
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
+
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
 /**
  * @brief Retrieve the next existing BPF link ID after a given starting ID.
@@ -1877,7 +2227,9 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
  */
 LIBBPF_API int bpf_map_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
+
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
+
 LIBBPF_API int bpf_btf_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
 /**
@@ -2026,7 +2378,77 @@ LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
  */
 LIBBPF_API int bpf_link_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
-
+/**
+ * @brief Retrieve information about a BPF object (program, map, BTF, or link) given
+ *        its file descriptor.
+ *
+ * This is a generic libbpf wrapper around the kernel's BPF_OBJ_GET_INFO_BY_FD
+ * command. Depending on what type of BPF object @p bpf_fd refers to, the kernel
+ * expects @p info to point to an appropriately typed info structure:
+ *
+ *   - struct bpf_prog_info (for program FDs)
+ *   - struct bpf_map_info (for map FDs)
+ *   - struct bpf_btf_info (for BTF object FDs)
+ *   - struct bpf_link_info (for link FDs)
+ *
+ * You must:
+ *   1. Zero-initialize the chosen info structure (to avoid undefined padding contents).
+ *   2. Set *@p info_len to sizeof(struct ) before the call.
+ *   3. Pass a pointer to the structure as @p info.
+ *
+ * On success, the kernel fills as much of the structure as it supports/recognizes
+ * for the running kernel version and may update *@p info_len with the actual number
+ * of bytes written (libbpf preserves kernel behavior). Unrecognized future fields
+ * remain zeroed. If *@p info_len is smaller than the minimum required size for that
+ * object type, the call fails with -EINVAL.
+ *
+ * Typical usage (program example):
+ *   struct bpf_prog_info pinfo = {};
+ *   __u32 len = sizeof(pinfo);
+ *   if (bpf_obj_get_info_by_fd(prog_fd, &pinfo, &len) == 0) {
+ *       // pinfo now populated (len bytes). Inspect fields like pinfo.id, pinfo.type, ...
+ *   } else {
+ *       // handle error (errno set; negative return value also provided)
+ *   }
+ *
+ * Concurrency & races:
+ *   - The object referenced by @p bpf_fd remains valid while its FD is open, so
+ *     races are limited. However, fields referring to related kernel entities
+ *     (e.g., map IDs a program references) may change if other management operations
+ *     occur concurrently.
+ *
+ * Forward/backward compatibility:
+ *   - Always zero the entire info struct before calling; newer kernels may fill
+ *     additional fields.
+ *   - Do not assume all fields are populated; check size/version or specific
+ *     feature flags if present.
+ *
+ * Security / privileges:
+ *   - Access may require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel configuration,
+ *     LSM policy, and lockdown mode. Insufficient privilege yields -EPERM / -EACCES.
+ *
+ * @param bpf_fd   File descriptor of a loaded BPF object (program, map, BTF, or link).
+ * @param info     Pointer to a zero-initialized, type-appropriate info structure
+ *                 (see list above).
+ * @param info_len Pointer to a __u32 containing the size of *info* on input; on
+ *                 success updated to the number of bytes actually written. Must
+ *                 not be NULL.
+ *
+ * @return 0 on success;
+ *         < 0 negative error code (libbpf style == -errno) on failure:
+ *           - -EBADF: @p bpf_fd is not a valid BPF object descriptor.
+ *           - -EINVAL: Wrong object type, info_len too small, malformed arguments.
+ *           - -EFAULT: @p info or @p info_len points to inaccessible user memory.
+ *           - -EPERM / -EACCES: Insufficient privileges / blocked by security policy.
+ *           - -ENOMEM: Kernel failed to allocate internal resources.
+ *           - Other -errno values may be propagated from the underlying syscall.
+ *
+ * Error handling notes:
+ *   - Treat -EINVAL as often indicating a size mismatch; verify that sizeof(your struct)
+ *     matches what the kernel expects for your libbpf/kernel version.
+ *   - Always inspect errno (or the negative return value) for precise failure reasons.
+ *
+ */
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
 
 /**
@@ -2230,7 +2652,9 @@ struct bpf_raw_tp_opts {
 #define bpf_raw_tp_opts__last_field cookie
 
 LIBBPF_API int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts);
+
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
+
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
 				 __u64 *probe_offset, __u64 *probe_addr);
-- 
2.34.1

From nobody Sun Feb 8 13:16:58 2026
From: Jianyun Gao
To: linux-kernel@vger.kernel.org
Cc: Jianyun Gao, Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
 Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 bpf@vger.kernel.org (open list:BPF [LIBRARY] (libbpf))
Subject: [PATCH v2 5/5] libbpf: Add doxygen documentation for btf/iter etc. in bpf.h
Date: Fri, 31 Oct 2025 15:59:07 +0800
Message-Id: <20251031075908.1472249-6-jianyungao89@gmail.com>
In-Reply-To: <20251031075908.1472249-1-jianyungao89@gmail.com>
References: <20251031075908.1472249-1-jianyungao89@gmail.com>

Add doxygen comment blocks for remaining helpers (btf/iter etc.) in
tools/lib/bpf/bpf.h.
These doc comments are for:

-libbpf_set_memlock_rlim()
-bpf_btf_load()
-bpf_iter_create()
-bpf_btf_get_next_id()
-bpf_btf_get_fd_by_id()
-bpf_btf_get_fd_by_id_opts()
-bpf_raw_tracepoint_open_opts()
-bpf_raw_tracepoint_open()
-bpf_task_fd_query()

Signed-off-by: Jianyun Gao
---
v1->v2:
 - Fixed compilation error caused by embedded literal "/*" inside a
   comment (rephrased/escaped).
 - Fixed the non-ASCII characters in this patch.

The v1 is here:
https://lore.kernel.org/lkml/20251031032627.1414462-6-jianyungao89@gmail.com/

 tools/lib/bpf/bpf.h | 745 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 740 insertions(+), 5 deletions(-)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index a0cebda09e16..6ef1ea7921c4 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -34,7 +34,61 @@
 #ifdef __cplusplus
 extern "C" {
 #endif
-
+/**
+ * @brief Set the RLIMIT_MEMLOCK value that libbpf applies automatically.
+ *
+ * libbpf_set_memlock_rlim() does not call setrlimit() itself. It records
+ * memlock_bytes as the value libbpf uses for both the soft and hard
+ * RLIMIT_MEMLOCK limits when it later bumps the limit on the caller's
+ * behalf. On kernels that charge BPF memory against RLIMIT_MEMLOCK, a
+ * limit that is too low makes map/program/BTF creation fail (typically
+ * with -EPERM), which is why libbpf bumps it by default.
+ *
+ * Semantics:
+ *   - libbpf attempts the bump at most once per process, lazily, before
+ *     it first creates a BPF object; the default value is RLIM_INFINITY.
+ *   - If libbpf detects memcg-based accounting of BPF memory, it skips
+ *     the bump and the recorded value is not applied.
+ *   - A value of 0 disables automatic bumping; RLIMIT_MEMLOCK then stays
+ *     exactly as the caller left it.
+ *
+ * Typical usage (before creating any maps/programs):
+ *   size_t needed = 128ul * 1024 * 1024; // 128 MiB
+ *   if (libbpf_set_memlock_rlim(needed) < 0) {
+ *       // too late: libbpf already attempted its one-time bump
+ *   }
+ *
+ * Choosing a value:
+ *   - Sum anticipated sizes of maps (key_size + value_size) * max_entries
+ *     plus overhead. Add headroom for programs, BTF, and future growth.
+ *   - Large per-CPU maps multiply value storage by number of CPUs.
+ *   - Pass 0 if the process manages RLIMIT_MEMLOCK itself.
+ *
+ * Concurrency & scope:
+ *   - The recorded value is process-wide libbpf state. Call this early,
+ *     before any thread starts creating BPF objects.
+ *
+ * Security / privileges:
+ *   - This call itself needs no privileges. The later setrlimit() may
+ *     need CAP_SYS_RESOURCE to raise the hard limit.
+ *   - A failure of that later setrlimit() is not reported by this call.
+ *
+ * @param memlock_bytes RLIMIT_MEMLOCK value (in bytes) for libbpf to apply
+ *                      later. If zero, automatic bumping is disabled.
+ *
+ * @return 0 on success;
+ *         < 0 negative error code (libbpf style == -errno) on failure:
+ *           - -EBUSY: libbpf already attempted its one-time bump, so a
+ *             new value can no longer take effect.
+ *
+ * Failure handling:
+ *   - A failure leaves the previously recorded value in place; later BPF
+ *     map/program loads may still succeed if the current limit is
+ *     adequate.
+ *   - Check current limits manually (getrlimit) if precise sizing is
+ *     critical.
+ *
+ */
 LIBBPF_API int libbpf_set_memlock_rlim(size_t memlock_bytes);
 
 struct bpf_map_create_opts {
@@ -295,7 +349,104 @@ struct bpf_btf_load_opts {
 	size_t :0;
 };
 #define bpf_btf_load_opts__last_field token_fd
-
+/**
+ * @brief Load a BTF (BPF Type Format) blob into the kernel and obtain a BTF object FD.
+ *
+ * bpf_btf_load() wraps the BPF_BTF_LOAD command of the bpf(2) syscall. It validates
+ * and registers the BTF metadata described by @p btf_data so that subsequently loaded
+ * BPF programs and maps can reference rich type information (for CO-RE relocations,
+ * pretty printing, introspection, etc.).
+ *
+ * Typical usage:
+ *   // Prepare optional verifier/logging buffer (only if you want kernel diagnostics)
+ *   char log_buf[1 << 20] = {};
+ *   struct bpf_btf_load_opts opts = {
+ *       .sz = sizeof(opts),
+ *       .log_buf = log_buf,
+ *       .log_size = sizeof(log_buf),
+ *       .log_level = 1, // >0 to request kernel parsing/validation log
+ *   };
+ *   int btf_fd = bpf_btf_load(btf_blob_ptr, btf_blob_size, &opts);
+ *   if (btf_fd < 0) {
+ *       // Inspect errno; if opts.log_buf was provided, it may contain details.
+ *   } else {
+ *       // Use btf_fd (e.g. pass to bpf_prog_load() via prog_btf_fd, or query info).
+ *   }
+ *
+ * Input expectations:
+ *   - @p btf_data must point to a complete, well-formed BTF buffer starting with
+ *     struct btf_header followed by the type section and string section.
+ *   - @p btf_size is the total size in bytes of that buffer.
+ *   - Endianness must match the running kernel; cross-endian BTF is rejected.
+ *   - Types must obey kernel constraints (e.g., no unsupported kinds, valid string
+ *     offsets, canonical integer encodings, no dangling references).
+ *
+ * Logging (opts->log_*):
+ *   - If @p opts is non-NULL and opts->log_level > 0, the kernel may emit a textual
+ *     parse/validation log into opts->log_buf (up to opts->log_size - 1 bytes, with
+ *     trailing '\0').
+ *   - On supported kernels, opts->log_true_size is updated to reflect the full (untruncated)
+ *     length of the internal log; if larger than log_size, the log was truncated.
+ *   - If the kernel does not support returning true size, log_true_size remains equal
+ *     to the original log_size value or zero.
+ *
+ * Privileges & security:
+ *   - CAP_BPF and/or CAP_SYS_ADMIN may be required depending on kernel configuration,
+ *     LSM policy, and lockdown mode. Lack of privilege yields -EPERM / -EACCES.
+ *   - In delegated environments, opts->token_fd (if available and supported) can grant
+ *     scoped permission to load BTF without full global capabilities.
+ *
+ * Memory and lifetime:
+ *   - On success a file descriptor (>= 0) referencing the in-kernel BTF object is returned.
+ *     Close it with close() when no longer needed.
+ *   - The kernel makes its own copy of the supplied BTF blob; the caller can free or reuse
+ *     @p btf_data immediately after the call returns.
+ *   - BTF objects can be queried via bpf_btf_get_info_by_fd() and referenced by programs
+ *     (prog_btf_fd) or maps for type information.
+ *
+ * Concurrency & races:
+ *   - Loading is independent; multiple BTF objects may coexist.
+ *   - There is no automatic deduplication across separate loads (except any internal
+ *     kernel optimizations); user space manages uniqueness if desired.
+ *
+ * Validation tips:
+ *   - Use bpftool btf dump to sanity-check a blob before loading.
+ *   - Keep string table minimal; excessive strings inflate memory and may hit limits.
+ *   - Ensure all referenced type IDs exist and form a closed, acyclic graph (except
+ *     for permitted self-references in struct/union definitions).
+ *
+ * After loading:
+ *   - Pass the returned FD as prog_btf_fd when loading programs that rely on CO-RE
+ *     relocations or need BTF type validation.
+ *   - Keep the FD open for as long as the BTF object is needed; bpf_obj_pin() accepts
+ *     only program, map, and link FDs, so a BTF object cannot be pinned in bpffs.
+ *   - Query metadata (e.g., number of types, string section size) with bpf_btf_get_info_by_fd().
+ *
+ * @param btf_data Pointer to the raw in-memory BTF blob.
+ * @param btf_size Size (in bytes) of the BTF blob pointed to by @p btf_data.
+ * @param opts     Optional pointer to a bpf_btf_load_opts struct. May be NULL.
+ *                 Must set opts->sz = sizeof(*opts) when non-NULL. Fields:
+ *                   - log_buf / log_size / log_level: Request and store kernel
+ *                     validation log (see Logging).
+ *                   - log_true_size: Updated by kernel on success (if supported).
+ *                   - btf_flags: Reserved for future extensions (must be 0 unless documented).
+ *                   - token_fd: Delegated permission token (0 or -1 if unused).
+ *
+ * @return
+ *   >= 0 : File descriptor referencing the loaded BTF object.
+ *   < 0  : Negative error code (see Error handling).
+ *
+ * Error handling (negative return codes == -errno style):
+ *   - -EINVAL: Malformed BTF (bad header, section sizes, invalid type graph, bad string
+ *              offsets, unsupported features), opts->sz mismatch, bad flags.
+ *   - -EFAULT: @p btf_data or opts->log_buf points to unreadable/unwritable memory.
+ *   - -ENOMEM: Kernel failed to allocate memory for internal BTF representation.
+ *   - -EPERM / -EACCES: Insufficient privileges or blocked by security policy.
+ *   - -E2BIG: Exceeds kernel size/complexity limits (e.g., too many types or strings).
+ *   - -ENOTSUP / -EOPNOTSUPP: Kernel lacks support for a feature used in the blob (rare).
+ *   - Other negative codes may be propagated from the underlying syscall.
+ *
+ */
 LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
 			    struct bpf_btf_load_opts *opts);
 
@@ -1840,7 +1991,84 @@ struct bpf_link_update_opts {
  */
 LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
 			       const struct bpf_link_update_opts *opts);
-
+/**
+ * @brief Create a user space iterator stream FD from an existing BPF iterator link.
+ *
+ * bpf_iter_create() wraps the kernel's BPF_ITER_CREATE command. Given a BPF
+ * link FD (@p link_fd) that represents an attached BPF iterator program
+ * (i.e., a program of type BPF_PROG_TYPE_TRACING with an iterator attach
+ * type such as BPF_TRACE_ITER), this function returns a new file descriptor
+ * from which user space can sequentially read the iterator's textual or
+ * binary output.
+ *
+ * Reading the returned FD:
+ *   - Use read(), pread(), or a buffered I/O layer to consume iterator data.
+ *   - read() returns zero (EOF) once the iterator has finished producing
+ *     all elements; close the FD afterward.
+ *   - Short reads are normal; loop until EOF or error.
+ *
+ * Lifetime & ownership:
+ *   - Success returns a new FD; caller owns it and must close() when finished.
+ *   - Closing the iterator FD does NOT destroy the underlying link or program.
+ *   - You can create multiple iterator FDs from the same link concurrently;
+ *     each is an independent traversal.
+ *
+ * Typical usage:
+ *   int link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_ITER, &opts);
+ *   if (link_fd < 0) return link_fd; // or handle the error
+ *   int iter_fd = bpf_iter_create(link_fd);
+ *   if (iter_fd < 0) return iter_fd; // or handle the error
+ *   char buf[4096];
+ *   for (;;) {
+ *       ssize_t n = read(iter_fd, buf, sizeof(buf));
+ *       if (n < 0) {
+ *           if (errno == EINTR) continue;
+ *           perror("read iter");
+ *           break;
+ *       }
+ *       if (n == 0) // end of iteration
+ *           break;
+ *       fwrite(buf, 1, n, stdout);
+ *   }
+ *   close(iter_fd);
+ *
+ * Concurrency & races:
+ *   - Safe to call concurrently from multiple threads; each iterator FD
+ *     represents its own walk.
+ *   - Underlying kernel objects (maps, tasks, etc.) may change while iterating;
+ *     output is a best-effort snapshot, not a stable, atomic view.
+ *
+ * Performance considerations:
+ *   - Large buffers (e.g., 16-64 KiB) reduce syscall overhead for high-volume
+ *     iterators.
+ *   - For blocking behavior, select()/poll()/epoll() can be used; EOF is
+ *     indicated by read() returning 0.
+ *
+ * Security & privileges:
+ *   - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel configuration,
+ *     lockdown mode, and LSM policy governing the iterator target.
+ *
+ * @param link_fd File descriptor of a BPF link representing an attached iterator program.
+ *
+ * @return >= 0: Iterator stream file descriptor to read from.
+ *         < 0 : Negative error code (libbpf style, == -errno) on failure.
+ *
+ *
+ * Error handling (negative libbpf-style return value == -errno):
+ *   - -EBADF: @p link_fd is not a valid open FD.
+ *   - -EINVAL: @p link_fd does not refer to an iterator-capable BPF link, or
+ *              unsupported combination for the running kernel.
+ *   - -EPERM / -EACCES: Insufficient privileges / blocked by security policy.
+ *   - -EOPNOTSUPP / -ENOTSUP: Kernel lacks iterator creation support for this link.
+ *   - -ENOMEM: Kernel could not allocate internal data structures.
+ *   - Other -errno codes may be propagated from the underlying bpf() syscall.
+ *
+ * Robustness tips:
+ *   - Verify the program was attached with the correct iterator attach type.
+ *   - Treat a 0-length read as normal completion, not an error.
+ *   - Always handle transient read() failures (EINTR, EAGAIN if non-blocking).
+ *
+ */
 LIBBPF_API int bpf_iter_create(int link_fd);
 
 struct bpf_prog_test_run_attr {
@@ -1953,6 +2181,68 @@ LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
  */
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
 
+/**
+ * @brief Retrieve the next existing BTF object ID after a given starting ID.
+ *
+ * This helper wraps the kernel's BPF_BTF_GET_NEXT_ID command and enumerates
+ * in-kernel BTF (BPF Type Format) objects in strictly ascending order of
+ * their kernel-assigned IDs. It is typically used to iterate all currently
+ * loaded BTF objects (e.g., vmlinux BTF, module BTFs, user-loaded BTF blobs).
+ *
+ * Enumeration pattern:
+ *   1. Initialize start_id to 0 to obtain the first (lowest) existing BTF ID.
+ *   2. On success, *next_id is set to the first BTF ID strictly greater than start_id.
+ *   3. Use the returned *next_id as the new start_id in a subsequent call.
+ *   4. Repeat until the function returns -ENOENT, which signals there is no
+ *      BTF object with ID greater than start_id (end of iteration).
+ *
+ * Concurrency & races:
+ *   - BTF objects can be loaded or unloaded concurrently with enumeration.
+ *     An ID retrieved in one call may become invalid (object unloaded) before
+ *     you convert it to a file descriptor with bpf_btf_get_fd_by_id().
+ *   - Enumeration does not provide a stable snapshot. Newly loaded BTFs may
+ *     appear after you've passed their predecessor ID.
+ *
+ * Lifetime & validity:
+ *   - IDs are monotonically increasing and effectively never wrap in normal
+ *     operation.
+ *   - Successfully retrieving an ID does NOT pin the corresponding BTF object.
+ *     Obtain a file descriptor immediately if you need to interact with it.
+ *
+ * Typical usage:
+ *   __u32 id = 0, next;
+ *   while (bpf_btf_get_next_id(id, &next) == 0) {
+ *       int btf_fd = bpf_btf_get_fd_by_id(next);
+ *       if (btf_fd >= 0) {
+ *           // Inspect/query BTF (e.g. bpf_btf_get_info_by_fd()).
+ *           close(btf_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop ends when bpf_btf_get_next_id() returns -ENOENT.
+ *
+ * @param start_id
+ *   Starting point for the search. The helper finds the first BTF ID
+ *   strictly greater than start_id. Use 0 to begin enumeration.
+ * @param next_id
+ *   Pointer to a __u32 that receives the next BTF ID on success.
+ *   Must not be NULL.
+ *
+ * @return
+ *   0 on success (next_id populated);
+ *   -ENOENT if there is no BTF ID greater than start_id (normal end of iteration);
+ *   -EINVAL if next_id is NULL or arguments are otherwise invalid;
+ *   -EPERM / -EACCES if denied by security policy or lacking required privileges;
+ *   Other negative libbpf-style codes (-errno) on transient or system failures.
+ *
+ * Error handling notes:
+ *   - Treat -ENOENT as normal termination, not an exceptional error.
+ *   - For other failures, errno is set to the underlying cause.
+ *
+ * Follow-up:
+ *   - Convert retrieved IDs to FDs with bpf_btf_get_fd_by_id() to inspect
+ *     metadata or reference the BTF when loading programs.
+ */
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
 /**
  * @brief Retrieve the next existing BPF link ID after a given starting ID.
@@ -2227,9 +2517,171 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
  */
 LIBBPF_API int bpf_map_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
-
+/**
+ * @brief Obtain a file descriptor for an existing in-kernel BTF (BPF Type Format)
+ * object given its kernel-assigned ID.
+ *
+ * bpf_btf_get_fd_by_id() wraps the BPF_BTF_GET_FD_BY_ID command of the bpf(2)
+ * syscall. Each loaded BTF object (vmlinux BTF, kernel module BTF, or user-supplied
+ * BTF blob loaded via BPF_BTF_LOAD) has a monotonically increasing, unique ID.
+ * This helper converts that stable ID into a process-local file descriptor
+ * suitable for introspection (e.g., via bpf_btf_get_info_by_fd()) or for
+ * reuse when loading BPF programs/maps that reference types from this
+ * BTF.
+ *
+ * Typical enumeration + open pattern:
+ *   __u32 id = 0, next;
+ *   while (bpf_btf_get_next_id(id, &next) == 0) {
+ *       int btf_fd = bpf_btf_get_fd_by_id(next);
+ *       if (btf_fd >= 0) {
+ *           // inspect with bpf_btf_get_info_by_fd(btf_fd, ...)
+ *           close(btf_fd);
+ *       }
+ *       id = next;
+ *   }
+ *   // Loop ends when bpf_btf_get_next_id() returns -ENOENT.
+ *
+ * Concurrency & races:
+ *   - A BTF object may be unloaded (e.g., module removal) between discovering
+ *     its ID and calling this function; in that case the call fails with -ENOENT.
+ *   - The returned FD holds a reference on the BTF object, so the object
+ *     stays alive at least as long as the FD is open.
+ *
+ * Lifetime & ownership:
+ *   - On success the caller owns the returned descriptor and must close() it
+ *     when no longer needed.
+ *   - Closing the FD does not destroy the underlying BTF object if other
+ *     references (other FDs, or programs and maps using it) remain.
+ *
+ * Privileges / security:
+ *   - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel configuration,
+ *     LSM policies, or lockdown mode. Lack of privilege yields -EPERM / -EACCES.
+ *   - Access can also be restricted by namespace or cgroup-based security policies.
+ *
+ * Use cases:
+ *   - Retrieve BTF metadata (type counts, string section size, specific type
+ *     definitions) via bpf_btf_get_info_by_fd().
+ *   - Pass the FD as prog_btf_fd when loading eBPF programs needing CO-RE or
+ *     type validation.
+ *   - Enumerate loaded BTF objects together with bpf_btf_get_next_id().
+ *
+ * @param id
+ *   Kernel-assigned unique (non-zero) BTF object ID. Typically obtained via
+ *   bpf_btf_get_next_id() or from a prior info query. Must be > 0.
+ *
+ * @return
+ *   >= 0 : File descriptor referencing the BTF object (caller must close()).
+ *   < 0  : Negative libbpf-style error code (== -errno):
+ *            - -ENOENT : No BTF object with this ID (unloaded or never existed).
+ *            - -EPERM / -EACCES : Insufficient privileges / blocked by policy.
+ *            - -EINVAL : Invalid ID (e.g., 0) or kernel rejected the request.
+ *            - -ENOMEM : Kernel memory/resource exhaustion.
+ *            - Other negative values: Propagated syscall failures.
+ *
+ * Error handling notes:
+ *   - Treat -ENOENT as a normal race outcome if objects can disappear.
+ *   - Always close the returned FD to avoid resource leaks.
+ *
+ * Thread safety:
+ *   - Safe to call concurrently; each successful invocation yields an independent FD.
+ *
+ * Forward compatibility:
+ *   - ID space is monotonic; practical wraparound is not expected.
+ *   - Future kernels may add additional validation or permission gating; handle
+ *     new -errno codes conservatively.
+ */
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 
+/**
+ * @brief Obtain a file descriptor for an existing in-kernel BTF (BPF Type Format)
+ * object by its kernel-assigned ID, with extended open options.
+ *
+ * bpf_btf_get_fd_by_id_opts() is an extended variant of bpf_btf_get_fd_by_id().
+ * It wraps the BPF_BTF_GET_FD_BY_ID command of the bpf(2) syscall and converts
+ * a stable, monotonically increasing BTF object ID (@p id) into a process-local
+ * file descriptor, honoring optional attributes supplied via @p opts.
+ *
+ * A BTF object represents a loaded collection of type metadata (vmlinux BTF,
+ * kernel module BTF, or user-supplied BTF blob). Programs and maps can refer
+ * to these types for CO-RE relocations, verification, and introspection.
+ *
+ * Typical enumeration + open pattern:
+ *   __u32 cur = 0, next;
+ *   while (bpf_btf_get_next_id(cur, &next) == 0) {
+ *       struct bpf_get_fd_by_id_opts o = {
+ *           .sz = sizeof(o),
+ *           .open_flags = 0,
+ *           .token_fd = -1,
+ *       };
+ *       int btf_fd = bpf_btf_get_fd_by_id_opts(next, &o);
+ *       if (btf_fd >= 0) {
+ *           // use btf_fd (e.g. bpf_btf_get_info_by_fd())
+ *           close(btf_fd);
+ *       }
+ *       cur = next;
+ *   }
+ *   // Loop ends when bpf_btf_get_next_id() returns -ENOENT.
+ *
+ * Initialization & @p opts usage:
+ *   - @p opts may be NULL for default behavior (equivalent to zeroed fields).
+ *   - If @p opts is non-NULL, opts->sz MUST be set to sizeof(*opts); mismatch
+ *     yields -EINVAL.
+ *   - opts->open_flags:
+ *       Reserved for future kernel extensions; pass 0 unless a documented flag
+ *       is supported. Unsupported bits => -EINVAL.
+ *   - opts->token_fd:
+ *       Optional BPF token FD enabling delegated (restricted) permissions. Set
+ *       to -1 or 0 if unused. Provides a way to open BTF objects without full
+ *       CAP_BPF/CAP_SYS_ADMIN in controlled environments.
+ *
+ * Concurrency & races:
+ *   - A BTF object can be unloaded (e.g., module removal) after ID discovery
+ *     but before this call; expect -ENOENT in such races.
+ *   - The returned FD holds a reference on the BTF object, so the object
+ *     stays alive while the FD is open; only the window between ID
+ *     discovery and this call is racy.
+ *
+ * Lifetime & ownership:
+ *   - On success you own the returned FD and must close() it when done.
+ *   - Closing the FD does not destroy the BTF object if other references (other
+ *     FDs, or programs and maps using it) remain.
+ *   - BTF objects cannot be pinned in bpffs; keep an FD open instead.
+ *
+ * Security / privileges:
+ *   - May require CAP_BPF and/or CAP_SYS_ADMIN depending on kernel configuration,
+ *     LSM policy, and lockdown mode.
+ *   - Access via a token_fd is subject to token scope; insufficient rights yield
+ *     -EPERM / -EACCES.
+ *
+ * Use cases:
+ *   - Retrieve type information with bpf_btf_get_info_by_fd().
+ *   - Supply prog_btf_fd when loading eBPF programs needing CO-RE relocations.
+ *   - Enumerate and manage user-loaded or kernel-provided BTF datasets.
+ *
+ * Robustness tips:
+ *   - Treat -ENOENT as a normal race when enumerating dynamic BTF objects.
+ *   - Always zero-initialize opts before setting recognized fields:
+ *       struct bpf_get_fd_by_id_opts o = {};
+ *       o.sz = sizeof(o);
+ *   - Avoid non-zero open_flags until documented; future kernels may add semantic
+ *     modifiers (e.g., restricted viewing modes).
+ *
+ * @param id   Kernel-assigned unique BTF object ID (> 0).
+ * @param opts Optional pointer to struct bpf_get_fd_by_id_opts controlling open
+ *             behavior; may be NULL for defaults.
+ *
+ * @return >= 0: File descriptor referencing the BTF object (caller must close()).
+ *         < 0 : Negative error code (libbpf style == -errno) on failure.
+ *
+ * Error handling (negative return values are libbpf-style == -errno):
+ *   - -ENOENT: No BTF object with @p id (unloaded or never existed).
+ *   - -EINVAL: Invalid @p id (e.g., 0), malformed @p opts (bad sz), or unsupported
+ *              open_flags bits.
+ *   - -EPERM / -EACCES: Insufficient privileges or blocked by security policy.
+ *   - -ENOMEM: Kernel resource allocation failure.
+ *   - Other -errno codes may be propagated from underlying syscall failures.
+ *
+ */
 LIBBPF_API int bpf_btf_get_fd_by_id_opts(__u32 id,
 				const struct bpf_get_fd_by_id_opts *opts);
 /**
@@ -2650,11 +3102,294 @@ struct bpf_raw_tp_opts {
 	size_t :0;
 };
 #define bpf_raw_tp_opts__last_field cookie
-
+/**
+ * @brief Attach a loaded BPF program to a raw tracepoint using extended options.
+ *
+ * bpf_raw_tracepoint_open_opts() wraps the BPF_RAW_TRACEPOINT_OPEN command and
+ * creates a persistent attachment of @p prog_fd to the raw tracepoint named in
+ * @p opts->tp_name. On success it returns a file descriptor representing the
+ * attachment. Closing that FD detaches the program from the tracepoint.
+ *
+ * Compared to bpf_raw_tracepoint_open(), this variant allows passing a user
+ * cookie (opts->cookie) and provides forward/backward compatibility via the
+ * @p opts->sz field.
+ *
+ * Typical usage:
+ *   struct bpf_raw_tp_opts ropts = {
+ *       .sz = sizeof(ropts),
+ *       .tp_name = "sched_switch", // raw tracepoint name (no "tracepoint/" prefix)
+ *       .cookie = 0xdeadbeef,      // optional user cookie (visible to program)
+ *   };
+ *   int tp_fd = bpf_raw_tracepoint_open_opts(prog_fd, &ropts);
+ *   if (tp_fd < 0) {
+ *       // handle error (inspect errno or negative return value)
+ *   }
+ *   // ... use attachment; close(tp_fd) to detach when done.
+ *
+ * Tracepoint name:
+ *   - Use the raw tracepoint identifier as exposed under
+ *     /sys/kernel/debug/tracing/events/ without category prefixes. For raw
+ *     tracepoints this is typically the internal kernel name (e.g., "sched_switch").
+ *   - Passing NULL or an empty string fails with -EINVAL.
+ *
+ * Cookie:
+ *   - opts->cookie (if non-zero) becomes available to the attached program via
+ *     the bpf_get_attach_cookie() helper (where supported).
+ *   - Set to 0 if you don't need a cookie; the kernel treats it as absent.
+ *
+ * Structure initialization:
+ *   - opts MUST NOT be NULL.
+ *   - Zero-initialize the struct, then set:
+ *       opts->sz = sizeof(struct bpf_raw_tp_opts);
+ *       opts->tp_name = "";
+ *       opts->cookie = ;
+ *   - Unrecognized future fields must remain zero for compatibility.
+ *
+ * Lifetime & detachment:
+ *   - The returned FD solely controls the attachment lifetime. Closing it
+ *     detaches the program.
+ *   - The program FD @p prog_fd may be closed independently after successful
+ *     attachment; the link remains active until the tracepoint FD is closed.
+ *
+ * Concurrency:
+ *   - Multiple programs can attach to the same raw tracepoint (each gets its
+ *     own FD).
+ *   - Attaching/detaching is atomic from the program's perspective; events
+ *     arriving after success will invoke the program.
+ *
+ * Privileges:
+ *   - Typically requires CAP_BPF and/or CAP_SYS_ADMIN depending on kernel
+ *     configuration, LSM policy, and lockdown mode.
+ *
+ * Performance considerations:
+ *   - Raw tracepoints invoke programs on every event occurrence; ensure program
+ *     logic is efficient to avoid noticeable system overhead.
+ *
+ * @param prog_fd
+ *   File descriptor of a previously loaded BPF program (bpf_prog_load()) that
+ *   is compatible with raw tracepoint attachment (e.g., program type
+ *   BPF_PROG_TYPE_RAW_TRACEPOINT or suitable tracing type).
+ *
+ * @param opts
+ *   Pointer to an initialized bpf_raw_tp_opts structure describing the target
+ *   tracepoint and optional cookie. Must not be NULL. opts->sz should be set to
+ *   sizeof(struct bpf_raw_tp_opts).
+ *
+ * @return
+ *   >= 0 : File descriptor representing the attachment (close to detach).
+ *   < 0  : Negative libbpf-style error code (== -errno) on failure:
+ *            - -EINVAL : prog_fd is not a BPF program FD, malformed opts (bad sz, NULL tp_name),
+ *                        unsupported program type, or kernel lacks raw TP support.
+ *            - -EPERM/-EACCES : Insufficient privileges or blocked by security policy.
+ *            - -ENOENT : Tracepoint name not found / not supported by current kernel.
+ *            - -EBADF  : prog_fd is not an open file descriptor.
+ *            - -ENOMEM : Kernel memory/resource exhaustion.
+ *            - -EOPNOTSUPP/-ENOTSUP : Raw tracepoint attachment not supported.
+ *            - Other -errno codes may be propagated from the underlying syscall.
+ *
+ * Error handling:
+ *   - Inspect the negative return value or errno for diagnostics.
+ *   - Treat -ENOENT as "tracepoint unavailable" (kernel config or version gap).
+ *
+ * After attachment:
+ *   - Optionally pin the FD (bpf_obj_pin()) if you need persistence.
+ *   - Use bpf_obj_get_info_by_fd() to query attachment metadata if supported.
+ */
 LIBBPF_API int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts);
 
+/**
+ * @brief Attach a loaded BPF program to a raw tracepoint (legacy/simple API).
+ *
+ * bpf_raw_tracepoint_open() is a convenience wrapper that issues the
+ * BPF_RAW_TRACEPOINT_OPEN command to attach the BPF program referenced
+ * by @p prog_fd to the raw tracepoint named @p name. On success it returns
+ * a file descriptor representing the attachment; closing that FD detaches
+ * the program from the tracepoint.
+ *
+ * Compared to bpf_raw_tracepoint_open_opts(), this legacy interface
+ * provides no ability to specify an attach cookie or future extension
+ * fields.
For new code prefer bpf_raw_tracepoint_open_opts() to enable
+ * forward/backward compatible option passing.
+ *
+ * Tracepoint name:
+ *   - @p name must be a non-NULL, NUL-terminated string identifying a
+ *     raw tracepoint (e.g. "sched_switch").
+ *   - Pass the raw kernel tracepoint identifier without any category
+ *     prefix (do not include "tracepoint/" or directory components).
+ *   - If the tracepoint is not available (kernel config/version) the
+ *     call fails with -ENOENT.
+ *
+ * Program requirements:
+ *   - @p prog_fd must refer to a loaded BPF program of a type compatible
+ *     with raw tracepoint attachment (e.g., BPF_PROG_TYPE_RAW_TRACEPOINT
+ *     or an allowed tracing program type accepted by the kernel).
+ *   - @p prog_fd may be safely closed after a successful attachment;
+ *     the returned FD controls the lifetime of the link.
+ *
+ * Lifetime & detachment:
+ *   - Each successful call creates a distinct attachment with its own FD.
+ *   - Closing the returned FD detaches the program from the tracepoint,
+ *     unless the attachment has been pinned.
+ *   - The returned FD can be pinned (bpf_obj_pin()) for persistence.
+ *
+ * Concurrency:
+ *   - Multiple programs can be attached to the same raw tracepoint.
+ *   - Attach/detach operations are atomic; events after success invoke
+ *     the program until the attachment FD is closed.
+ *
+ * Privileges & security:
+ *   - Typically requires CAP_BPF and/or CAP_SYS_ADMIN depending on
+ *     kernel configuration, LSM, and lockdown mode.
+ *   - Insufficient privilege yields -EPERM / -EACCES.
+ *
+ * Performance considerations:
+ *   - Raw tracepoints can be very frequent; ensure attached program
+ *     logic is efficient to avoid noticeable overhead.
+ *
+ * @param name    NUL-terminated raw tracepoint name (e.g. "sched_switch").
+ * @param prog_fd File descriptor of a loaded, compatible BPF program.
+ *
+ * @return >= 0 : Attachment file descriptor (close to detach).
+ *         < 0  : Negative error code (libbpf style == -errno) on failure.
+ *
+ * Error handling (negative libbpf-style return value == -errno):
+ *   - -EINVAL : @p prog_fd is not a BPF program FD, NULL/empty @p name, incompatible program type.
+ *   - -ENOENT : Tracepoint not found / unsupported by current kernel.
+ *   - -EPERM/-EACCES : Insufficient privileges or blocked by security policy.
+ *   - -EBADF  : @p prog_fd is not an open file descriptor.
+ *   - -ENOMEM : Kernel memory/resource exhaustion.
+ *   - -EOPNOTSUPP/-ENOTSUP : Raw tracepoints unsupported by the kernel.
+ *   - Other negative codes may be propagated from the underlying syscall.
+ *
+ * Best practices:
+ *   - Prefer bpf_raw_tracepoint_open_opts() for new development to
+ *     gain cookie support and extensibility.
+ *   - Immediately check the return value; do not rely solely on errno.
+ *   - Pin the attachment if you need persistence across process lifetimes.
+ *
+ */
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 
+/**
+ * @brief Query metadata about a file descriptor in another task (process) that
+ *        is associated with a BPF tracing/perf event and (optionally) an
+ *        attached BPF program.
+ *
+ * This helper wraps the kernel's BPF_TASK_FD_QUERY command. It inspects the
+ * file descriptor number @p fd that belongs to the task identified by @p pid
+ * and, if that FD represents a perf event or similar tracing attachment, it
+ * returns descriptive information about:
+ *   - The attached BPF program (its kernel program ID).
+ *   - The nature/type of the FD (tracepoint, raw_tracepoint, kprobe, uprobe, etc.).
+ *   - Target symbol/address/offset data for kprobes/uprobes.
+ *   - A human-readable identifier (tracepoint name, kprobe function name,
+ *     uprobe file path), copied into @p buf when provided.
+ *
+ * Typical use cases:
+ *   - Introspecting perf event FDs opened by another process to discover
+ *     which BPF program is attached.
+ *   - Enumerating and characterizing dynamically created kprobes or uprobes
+ *     (e.g., by observability agents).
+ *   - Building higher-level tooling that correlates program IDs with their
+ *     originating probe specifications.
+ *
+ * Usage pattern:
+ *   char info[256];
+ *   __u32 info_len = sizeof(info);
+ *   __u32 prog_id = 0, fd_type = 0;
+ *   __u64 probe_off = 0, probe_addr = 0;
+ *   int err = bpf_task_fd_query(target_pid, target_fd, 0,
+ *                               info, &info_len,
+ *                               &prog_id, &fd_type,
+ *                               &probe_off, &probe_addr);
+ *   if (err == 0) {
+ *       // info[] now holds a NUL-terminated identifier (if available)
+ *       // info_len == identifier length (strlen(), excluding the '\0')
+ *       // fd_type enumerates one of BPF_FD_TYPE_* values
+ *       // prog_id is the kernel-assigned BPF program ID (0 if none)
+ *       // probe_off / probe_addr describe offsets/addresses for kprobe/uprobe
+ *   } else if (err == -ENOSPC) {
+ *       // info_len holds the full string length; retry with info_len + 1 bytes
+ *   }
+ *
+ * Buffer semantics (@p buf / @p buf_len):
+ *   - On input @p *buf_len must hold the capacity (in bytes) of @p buf.
+ *   - If @p buf is large enough, the kernel copies a NUL-terminated string
+ *     (tracepoint name, kprobe symbol, uprobe path, etc.) and updates
+ *     @p *buf_len with the string length (strlen(), excluding the NUL).
+ *   - If @p buf is too small, the call fails with -ENOSPC and sets
+ *     @p *buf_len to the full string length; retry with *buf_len + 1 bytes.
+ *   - If a textual identifier is not applicable (or unavailable), the kernel
+ *     may set @p *buf_len to 0 (@p buf then holds at most an empty string).
+ *   - @p buf may be NULL (no string is copied), but @p buf_len must be
+ *     non-NULL; @p *buf_len still reports the string length.
+ *
+ * Output parameters:
+ *   - @p prog_id: Set to the kernel BPF program ID attached to the perf event
+ *     FD (0 if no BPF program is attached).
+ *   - @p fd_type: Set to one of the BPF_FD_TYPE_* enum values describing the
+ *     FD (e.g., BPF_FD_TYPE_TRACEPOINT, BPF_FD_TYPE_KPROBE, BPF_FD_TYPE_UPROBE,
+ *     BPF_FD_TYPE_RAW_TRACEPOINT).
Use this to disambiguate interpretation of
+ *     other outputs.
+ *   - @p probe_offset: For kprobes/uprobes, the offset within the symbol or
+ *     mapped file that was requested when the probe was created.
+ *   - @p probe_addr: For kprobes, the resolved kernel address of the probed
+ *     symbol/instruction; for uprobes may be 0 or implementation-dependent.
+ *   - All output pointers (including @p buf_len) must be non-NULL; libbpf
+ *     writes every one of them unconditionally.
+ *
+ * Privileges & access control:
+ *   - Querying another task's file descriptor typically requires sufficient
+ *     permissions (ptrace-like restrictions, CAP_BPF / CAP_SYS_ADMIN, and/or
+ *     LSM allowances). Lack of privilege yields -EPERM / -EACCES.
+ *   - The target task must exist and the FD must be valid at query time.
+ *
+ * Concurrency / races:
+ *   - The target process may close or replace its FD concurrently; the query
+ *     can fail with -EBADF or -ENOENT in such races.
+ *   - Retrieved metadata is a point-in-time snapshot and can become stale
+ *     immediately after return.
+ *
+ * @param pid          PID of the target task whose file descriptor table should be queried.
+ *                     Use the numeric PID (thread group leader or specific thread PID);
+ *                     passing 0 is typically invalid (returns -EINVAL).
+ * @param fd           File descriptor number as seen from inside the task identified by @p pid.
+ * @param flags        Query modifier flags. Must be 0 on current kernels; non-zero
+ *                     (unsupported) bits return -EINVAL.
+ * @param buf          Optional user buffer to receive a NUL-terminated identifier string
+ *                     (tracepoint name, kprobe symbol, uprobe path). Can be NULL if
+ *                     @p buf_len points to 0.
+ * @param buf_len      In/out pointer to buffer length (non-NULL). On input: capacity of @p buf.
+ *                     On success: string length reported by the kernel (excluding the NUL).
+ *                     On -ENOSPC: full string length (retry with at least that plus one byte).
+ * @param prog_id      Output pointer (non-NULL) receiving the attached BPF program ID (0 if none).
+ * @param fd_type      Output pointer (non-NULL) receiving one of BPF_FD_TYPE_* constants identifying FD type.
+ * @param probe_offset Output pointer (non-NULL) receiving the probe offset (for kprobe/uprobe types).
+ * @param probe_addr   Output pointer (non-NULL) receiving resolved kernel address (kprobe) or relevant mapping address.
+ *
+ * @return 0 on success;
+ *         Negative libbpf-style error code (< 0) on failure:
+ *           - -EINVAL : Invalid arguments (bad pid/fd, unsupported flags, inconsistent buf/buf_len).
+ *           - -ENOENT : Task, file descriptor, or associated probe/program not found.
+ *           - -EBADF  : Bad file descriptor in target task at time of query.
+ *           - -ENOSPC : @p buf too small; @p *buf_len updated with the full string length.
+ *           - -EPERM / -EACCES : Insufficient privileges or access denied by security policy.
+ *           - -EFAULT : User memory (buf or buf_len or an output pointer) not accessible.
+ *           - -ENOMEM : Temporary kernel memory/resource exhaustion.
+ *           - Other -errno codes may be propagated from the underlying syscall.
+ *
+ * Best practices:
+ *   - Initialize *buf_len with the size of your buffer; handle -ENOSPC by allocating
+ *     a buffer of at least the returned length plus one byte for the NUL.
+ *   - Check @p fd_type first to interpret @p probe_offset / @p probe_addr meaningfully.
+ *   - Treat -ENOENT and -EBADF as normal race outcomes in dynamic environments.
+ *   - Avoid querying extremely frequently in production paths; this is introspective
+ *     debug/management tooling, not a fast data path primitive.
+ *
+ * Thread safety:
+ *   - This helper is thread-safe; multiple threads can query different (or the same)
+ *     tasks concurrently. Returned data structures are per-call (no shared state).
+ */
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
 				 __u64 *probe_offset, __u64 *probe_addr);
-- 
2.34.1
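
The buffer-size retry that the bpf_task_fd_query() comment describes can be sketched in isolation. The sketch below is not part of the patch and does not call libbpf: fake_task_fd_query() is a hypothetical stand-in that only mimics the -ENOSPC contract (it reports the identifier length through *buf_len and leaves a truncated, NUL-terminated copy when the buffer is too small), and query_identifier() and the query_fn type are illustrative names. The loop asks for one byte more than the reported length, so it behaves the same whether or not that length counts the terminating NUL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback shape: fills buf, updates *buf_len, returns 0 or -errno. */
typedef int (*query_fn)(char *buf, uint32_t *buf_len);

/* Stand-in for the real query (assumption: not libbpf, just the same contract). */
static int fake_task_fd_query(char *buf, uint32_t *buf_len)
{
	static const char name[] = "sched_switch";
	uint32_t cap = *buf_len;
	uint32_t len = (uint32_t)strlen(name);

	*buf_len = len;                  /* always report the string length */
	if (!buf || cap == 0)
		return 0;                /* nothing to copy into */
	if (cap < len + 1) {
		memcpy(buf, name, cap - 1);  /* truncated copy, still NUL-terminated */
		buf[cap - 1] = '\0';
		return -ENOSPC;
	}
	memcpy(buf, name, len + 1);
	return 0;
}

/* Grow-and-retry: returns a malloc'ed string on success, NULL on failure. */
static char *query_identifier(query_fn query, uint32_t initial_cap, int *err_out)
{
	uint32_t cap = initial_cap ? initial_cap : 1;
	char *buf = NULL;

	for (int attempt = 0; attempt < 4; attempt++) {
		char *tmp = realloc(buf, cap);
		uint32_t len = cap;
		int err;

		if (!tmp) {
			free(buf);
			*err_out = -ENOMEM;
			return NULL;
		}
		buf = tmp;

		err = query(buf, &len);
		if (err == 0) {
			*err_out = 0;
			return buf;
		}
		if (err != -ENOSPC) {
			free(buf);
			*err_out = err;
			return NULL;
		}
		cap = len + 1;           /* room for the reported length plus a NUL */
	}

	free(buf);
	*err_out = -ENOSPC;              /* the length kept changing; give up */
	return NULL;
}
```

With the real API, the callback would wrap bpf_task_fd_query() with the pid, fd and remaining output pointers already bound. The retry count is bounded because the identifier can change between calls.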