The page-management kfuncs exposed by BPF arena -
bpf_arena_alloc_pages(), bpf_arena_free_pages() and
bpf_arena_reserve_pages() - are part of the BPF kfunc ABI but lack
rendered documentation. Their contracts (valid argument ranges,
sleepable-only context, and the set of error returns) are today only
discoverable by reading kernel/bpf/arena.c.
Add a kernel-doc comment block above each of the three kfuncs and
render them under a new "BPF arena kfuncs" subsection in
Documentation/bpf/kfuncs.rst, alongside the existing core kfunc
subsections.
No functional change.
Signed-off-by: Dhiraj Shah <find.dhiraj@gmail.com>
---
Documentation/bpf/kfuncs.rst | 27 +++++++++++++++
kernel/bpf/arena.c | 64 ++++++++++++++++++++++++++++++++++++
2 files changed, 91 insertions(+)
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 75e6c078e0e7..fe0df1e16453 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -732,3 +732,30 @@ the verifier. bpf_cgroup_ancestor() can be used as follows:
BPF provides a set of kfuncs that can be used to query, allocate, mutate, and
destroy struct cpumask * objects. Please refer to :ref:`cpumasks-header-label`
for more details.
+
+4.4 BPF arena kfuncs
+--------------------
+
+A BPF arena (``BPF_MAP_TYPE_ARENA``) is a sparsely-populated shared memory
+region that a BPF program and a user-space process can both address. The
+following kfuncs allow a sleepable BPF program to allocate, free, and reserve
+pages within an arena:
+
+.. kernel-doc:: kernel/bpf/arena.c
+ :identifiers: bpf_arena_alloc_pages bpf_arena_free_pages bpf_arena_reserve_pages
+
+A typical pattern is to allocate one or more pages, write to them from BPF,
+and let user space observe the same memory after a page fault populates its
+VMA:
+
+.. code-block:: c
+
+ void __arena *page;
+
+ page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page)
+ return -ENOMEM;
+
+ /* ... use the page from BPF; user space sees the same bytes ... */
+
+ bpf_arena_free_pages(&arena, page, 1);
diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
index 49a8f7b1beef..b8ec2953dee6 100644
--- a/kernel/bpf/arena.c
+++ b/kernel/bpf/arena.c
@@ -870,6 +870,33 @@ static void arena_free_irq(struct irq_work *iw)
__bpf_kfunc_start_defs();
+/**
+ * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @addr__ign: Page-aligned user-space address within the arena at which to
+ * place the allocation, or %NULL to let the kernel choose. When
+ * non-NULL the address must fall inside the arena's user VMA
+ * range; otherwise the allocation fails.
+ * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
+ * than the arena's configured size in pages.
+ * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
+ * @flags: Reserved for future use; must be 0.
+ *
+ * Allocates @page_cnt physically-backed pages and inserts them into the
+ * arena's kernel VMA at the offset corresponding to @addr__ign (or at an
+ * arbitrary free offset when @addr__ign is %NULL). A subsequent user-space
+ * page fault on the matching user address populates the user VMA with the
+ * same pages, giving BPF and user space a shared view of the region.
+ *
+ * The underlying allocator may sleep, so this kfunc is only callable from
+ * sleepable BPF programs.
+ *
+ * Return:
+ * * Kernel pointer to the start of the allocated region on success.
+ * * %NULL if @p__map is not an arena, @flags is non-zero, @page_cnt is zero
+ * or exceeds the arena size, @addr__ign is misaligned or outside the
+ * arena, @node_id is invalid, or the kernel is out of memory.
+ */
__bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt,
int node_id, u64 flags)
{
@@ -893,6 +920,23 @@ void *bpf_arena_alloc_pages_non_sleepable(void *p__map, void *addr__ign, u32 pag
return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id, false);
}
+
+/**
+ * bpf_arena_free_pages() - Free a range of pages within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @ptr__ign: User-space virtual address of the first page to free, as used
+ * to address the arena from BPF and user space. Typically the
+ * same address that was previously returned (in user-space form)
+ * by bpf_arena_alloc_pages().
+ * @page_cnt: Number of pages to free.
+ *
+ * Releases the backing pages, unmapping them from the arena's kernel VMA
+ * and from any user-space VMA that previously faulted them in. May sleep,
+ * so the kfunc is callable only from sleepable BPF programs.
+ *
+ * The call is a no-op when @p__map is not an arena, when @page_cnt is zero,
+ * or when @ptr__ign is %NULL.
+ */
__bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt)
{
struct bpf_map *map = p__map;
@@ -913,6 +957,26 @@ void bpf_arena_free_pages_non_sleepable(void *p__map, void *ptr__ign, u32 page_c
arena_free_pages(arena, (long)ptr__ign, page_cnt, false);
}
+/**
+ * bpf_arena_reserve_pages() - Reserve a page range within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @ptr__ign: Page-aligned user-space virtual address of the start of the
+ * range to reserve.
+ * @page_cnt: Number of pages to reserve. Zero is permitted and is a no-op.
+ *
+ * Marks @page_cnt pages starting at @ptr__ign as reserved so that subsequent
+ * bpf_arena_alloc_pages() calls will not place allocations in that range.
+ * No physical pages are allocated by this kfunc; the range is simply
+ * excluded from the arena's free space.
+ *
+ * Return:
+ * * 0 on success, or when @page_cnt is zero.
+ * * -EINVAL if @p__map is not an arena or the requested range falls outside
+ * the arena's user VMA.
+ * * -EBUSY if any page in the requested range is already allocated, or if
+ * contention on the arena's internal spinlock prevents the operation from
+ * completing.
+ */
__bpf_kfunc int bpf_arena_reserve_pages(void *p__map, void *ptr__ign, u32 page_cnt)
{
struct bpf_map *map = p__map;
--
2.43.0
On Thu, May 21, 2026 at 6:36 AM Dhiraj Shah <find.dhiraj@gmail.com> wrote:
>
> The page-management kfuncs exposed by BPF arena -
> bpf_arena_alloc_pages(), bpf_arena_free_pages() and
> bpf_arena_reserve_pages() - are part of the BPF kfunc ABI but lack
> rendered documentation. Their contracts (valid argument ranges,
> sleepable-only context, and the set of error returns) are today only
> discoverable by reading kernel/bpf/arena.c.
>
> Add a kernel-doc comment block above each of the three kfuncs and
> render them under a new "BPF arena kfuncs" subsection in
> Documentation/bpf/kfuncs.rst, alongside the existing core kfunc
> subsections.
>
> No functional change.
>
> Signed-off-by: Dhiraj Shah <find.dhiraj@gmail.com>
> ---
> Documentation/bpf/kfuncs.rst | 27 +++++++++++++++
> kernel/bpf/arena.c | 64 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 91 insertions(+)
>
> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
> index 75e6c078e0e7..fe0df1e16453 100644
> --- a/Documentation/bpf/kfuncs.rst
> +++ b/Documentation/bpf/kfuncs.rst
> @@ -732,3 +732,30 @@ the verifier. bpf_cgroup_ancestor() can be used as follows:
> BPF provides a set of kfuncs that can be used to query, allocate, mutate, and
> destroy struct cpumask * objects. Please refer to :ref:`cpumasks-header-label`
> for more details.
> +
> +4.4 BPF arena kfuncs
> +--------------------
> +
> +A BPF arena (``BPF_MAP_TYPE_ARENA``) is a sparsely-populated shared memory
> +region that a BPF program and a user-space process can both address. The
> +following kfuncs allow a sleepable BPF program to allocate, free, and reserve
> +pages within an arena:
> +
> +.. kernel-doc:: kernel/bpf/arena.c
> + :identifiers: bpf_arena_alloc_pages bpf_arena_free_pages bpf_arena_reserve_pages
> +
> +A typical pattern is to allocate one or more pages, write to them from BPF,
> +and let user space observe the same memory after a page fault populates its
> +VMA:
> +
> +.. code-block:: c
> +
> + void __arena *page;
> +
> + page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
> + if (!page)
> + return -ENOMEM;
> +
> + /* ... use the page from BPF; user space sees the same bytes ... */
> +
> + bpf_arena_free_pages(&arena, page, 1);
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index 49a8f7b1beef..b8ec2953dee6 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -870,6 +870,33 @@ static void arena_free_irq(struct irq_work *iw)
>
> __bpf_kfunc_start_defs();
>
> +/**
> + * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @addr__ign: Page-aligned user-space address within the arena at which to
> + * place the allocation, or %NULL to let the kernel choose. When
> + * non-NULL the address must fall inside the arena's user VMA
> + * range; otherwise the allocation fails.
> + * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
> + * than the arena's configured size in pages.
> + * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
> + * @flags: Reserved for future use; must be 0.
> + *
> + * Allocates @page_cnt physically-backed pages and inserts them into the
> + * arena's kernel VMA at the offset corresponding to @addr__ign (or at an
> + * arbitrary free offset when @addr__ign is %NULL). A subsequent user-space
> + * page fault on the matching user address populates the user VMA with the
> + * same pages, giving BPF and user space a shared view of the region.
> + *
> + * The underlying allocator may sleep, so this kfunc is only callable from
> + * sleepable BPF programs.
what?
> + * Return:
> + * * Kernel pointer to the start of the allocated region on success.
what?
so much slop nowadays :(
pw-bot: cr
> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
> index 75e6c078e0e7..fe0df1e16453 100644
> --- a/Documentation/bpf/kfuncs.rst
> +++ b/Documentation/bpf/kfuncs.rst
[ ... ]
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index 49a8f7b1beef..b8ec2953dee6 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -870,6 +870,33 @@ static void arena_free_irq(struct irq_work *iw)
>
> __bpf_kfunc_start_defs();
>
> +/**
> + * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @addr__ign: Page-aligned user-space address within the arena at which to
> + * place the allocation, or %NULL to let the kernel choose. When
> + * non-NULL the address must fall inside the arena's user VMA
> + * range; otherwise the allocation fails.
> + * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
> + * than the arena's configured size in pages.
> + * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
> + * @flags: Reserved for future use; must be 0.
> + *
> + * Allocates @page_cnt physically-backed pages and inserts them into the
> + * arena's kernel VMA at the offset corresponding to @addr__ign (or at an
> + * arbitrary free offset when @addr__ign is %NULL). A subsequent user-space
> + * page fault on the matching user address populates the user VMA with the
> + * same pages, giving BPF and user space a shared view of the region.
> + *
> + * The underlying allocator may sleep, so this kfunc is only callable from
> + * sleepable BPF programs.
> + *
> + * Return:
> + * * Kernel pointer to the start of the allocated region on success.
Does the return value description match the implementation? Looking at
arena_alloc_pages() in kernel/bpf/arena.c, the function returns:
    return clear_lo32(arena->user_vm_start) + uaddr32;
which is a user-space virtual address derived from the arena's
user_vm_start. This matches the description for bpf_arena_free_pages()
later in this patch which notes the address is "in user-space form" when
previously returned by bpf_arena_alloc_pages().
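To make the address arithmetic in that return statement concrete, here is a
minimal user-space sketch. It is not the kernel code: arena_user_addr() and
arena_user_addr_example() are hypothetical names, the addresses are made up,
and clear_lo32() is assumed to simply mask off the low 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: clear the low 32 bits, keeping only the upper half. */
static uint64_t clear_lo32(uint64_t val)
{
	return val & ~(uint64_t)UINT32_MAX;
}

/*
 * Hypothetical helper mirroring the quoted return statement: combine the
 * upper 32 bits of the arena's user-space base with a 32-bit address.
 */
static uint64_t arena_user_addr(uint64_t user_vm_start, uint32_t uaddr32)
{
	return clear_lo32(user_vm_start) + uaddr32;
}

/* Example with made-up values: the result lies in user address space. */
static void arena_user_addr_example(void)
{
	uint64_t user_vm_start = 0x00007f7d26200000ULL; /* hypothetical base */
	uint32_t uaddr32 = 0x26203000U;                 /* base low bits + 3 pages */

	assert(arena_user_addr(user_vm_start, uaddr32) == 0x00007f7d26203000ULL);
}
```

The result carries the upper bits of user_vm_start, which is why it reads as a
user-space address rather than a kernel pointer.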
Should the documentation state it returns a user-space pointer rather than
"Kernel pointer"?
> + * * %NULL if @p__map is not an arena, @flags is non-zero, @page_cnt is zero
> + * or exceeds the arena size, @addr__ign is misaligned or outside the
> + * arena, @node_id is invalid, or the kernel is out of memory.
> + */
> __bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt,
> int node_id, u64 flags)
> {
[ ... ]
> @@ -913,6 +957,26 @@ void bpf_arena_free_pages_non_sleepable(void *p__map, void *ptr__ign, u32 page_c
> arena_free_pages(arena, (long)ptr__ign, page_cnt, false);
> }
>
> +/**
> + * bpf_arena_free_pages() - Free a range of pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @ptr__ign: User-space virtual address of the first page to free, as used
> + * to address the arena from BPF and user space. Typically the
> + * same address that was previously returned (in user-space form)
> + * by bpf_arena_alloc_pages().
> + * @page_cnt: Number of pages to free.
> + *
> + * Releases the backing pages, unmapping them from the arena's kernel VMA
> + * and from any user-space VMA that previously faulted them in. May sleep,
> + * so the kfunc is callable only from sleepable BPF programs.
> + *
> + * The call is a no-op when @p__map is not an arena, when @page_cnt is zero,
> + * or when @ptr__ign is %NULL.
> + */
> __bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt)
> {
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26206146239
On Thu May 21, 2026 at 12:35 AM EDT, Dhiraj Shah wrote:
> The page-management kfuncs exposed by BPF arena -
> bpf_arena_alloc_pages(), bpf_arena_free_pages() and
> bpf_arena_reserve_pages() - are part of the BPF kfunc ABI but lack
> rendered documentation. Their contracts (valid argument ranges,
> sleepable-only context, and the set of error returns) are today only
> discoverable by reading kernel/bpf/arena.c.
>
> Add a kernel-doc comment block above each of the three kfuncs and
> render them under a new "BPF arena kfuncs" subsection in
> Documentation/bpf/kfuncs.rst, alongside the existing core kfunc
> subsections.
>
> No functional change.
>
> Signed-off-by: Dhiraj Shah <find.dhiraj@gmail.com>
> ---
> Documentation/bpf/kfuncs.rst | 27 +++++++++++++++
> kernel/bpf/arena.c | 64 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 91 insertions(+)
>
> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
> index 75e6c078e0e7..fe0df1e16453 100644
> --- a/Documentation/bpf/kfuncs.rst
> +++ b/Documentation/bpf/kfuncs.rst
> @@ -732,3 +732,30 @@ the verifier. bpf_cgroup_ancestor() can be used as follows:
> BPF provides a set of kfuncs that can be used to query, allocate, mutate, and
> destroy struct cpumask * objects. Please refer to :ref:`cpumasks-header-label`
> for more details.
> +
> +4.4 BPF arena kfuncs
> +--------------------
> +
> +A BPF arena (``BPF_MAP_TYPE_ARENA``) is a sparsely-populated shared memory
> +region that a BPF program and a user-space process can both address. The
> +following kfuncs allow a sleepable BPF program to allocate, free, and reserve
> +pages within an arena:
> +
> +.. kernel-doc:: kernel/bpf/arena.c
> + :identifiers: bpf_arena_alloc_pages bpf_arena_free_pages bpf_arena_reserve_pages
> +
> +A typical pattern is to allocate one or more pages, write to them from BPF,
> +and let user space observe the same memory after a page fault populates its
> +VMA:
Maybe slight rephrase? This description is a bit dense. E.g.,
"...and let user space access the pages through a mapping in its address space."
> +
> +.. code-block:: c
> +
> + void __arena *page;
> +
> + page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
> + if (!page)
> + return -ENOMEM;
> +
> + /* ... use the page from BPF; user space sees the same bytes ... */
> +
> + bpf_arena_free_pages(&arena, page, 1);
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index 49a8f7b1beef..b8ec2953dee6 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -870,6 +870,33 @@ static void arena_free_irq(struct irq_work *iw)
>
> __bpf_kfunc_start_defs();
>
> +/**
> + * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @addr__ign: Page-aligned user-space address within the arena at which to
> + * place the allocation, or %NULL to let the kernel choose. When
> + * non-NULL the address must fall inside the arena's user VMA
> + * range; otherwise the allocation fails.
> + * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
> + * than the arena's configured size in pages.
> + * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
> + * @flags: Reserved for future use; must be 0.
> + *
> + * Allocates @page_cnt physically-backed pages and inserts them into the
> + * arena's kernel VMA at the offset corresponding to @addr__ign (or at an
> + * arbitrary free offset when @addr__ign is %NULL). A subsequent user-space
> + * page fault on the matching user address populates the user VMA with the
> + * same pages, giving BPF and user space a shared view of the region.
> + *
> + * The underlying allocator may sleep, so this kfunc is only callable from
> + * sleepable BPF programs.
I think this is only half the story: the verifier redirects the call to
the non-sleepable variant of the function when necessary. So this kfunc
is technically callable only from sleepable BPF programs, but a
non-sleepable program never ends up calling it, thanks to the verifier.
> + *
> + * Return:
> + * * Kernel pointer to the start of the allocated region on success.
> + * * %NULL if @p__map is not an arena, @flags is non-zero, @page_cnt is zero
> + * or exceeds the arena size, @addr__ign is misaligned or outside the
> + * arena, @node_id is invalid, or the kernel is out of memory.
> + */
> __bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt,
> int node_id, u64 flags)
> {
> @@ -893,6 +920,23 @@ void *bpf_arena_alloc_pages_non_sleepable(void *p__map, void *addr__ign, u32 pag
>
> return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id, false);
> }
> +
> +/**
> + * bpf_arena_free_pages() - Free a range of pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @ptr__ign: User-space virtual address of the first page to free, as used
> + * to address the arena from BPF and user space. Typically the
> + * same address that was previously returned (in user-space form)
> + * by bpf_arena_alloc_pages().
> + * @page_cnt: Number of pages to free.
> + *
> + * Releases the backing pages, unmapping them from the arena's kernel VMA
> + * and from any user-space VMA that previously faulted them in. May sleep,
> + * so the kfunc is callable only from sleepable BPF programs.
Same here.
> + *
> + * The call is a no-op when @p__map is not an arena, when @page_cnt is zero,
> + * or when @ptr__ign is %NULL.
> + */
> __bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt)
> {
> struct bpf_map *map = p__map;
> @@ -913,6 +957,26 @@ void bpf_arena_free_pages_non_sleepable(void *p__map, void *ptr__ign, u32 page_c
> arena_free_pages(arena, (long)ptr__ign, page_cnt, false);
> }
>
> +/**
> + * bpf_arena_reserve_pages() - Reserve a page range within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @ptr__ign: Page-aligned user-space virtual address of the start of the
> + * range to reserve.
> + * @page_cnt: Number of pages to reserve. Zero is permitted and is a no-op.
> + *
> + * Marks @page_cnt pages starting at @ptr__ign as reserved so that subsequent
> + * bpf_arena_alloc_pages() calls will not place allocations in that range.
> + * No physical pages are allocated by this kfunc; the range is simply
> + * excluded from the arena's free space.
> + *
> + * Return:
> + * * 0 on success, or when @page_cnt is zero.
> + * * -EINVAL if @p__map is not an arena or the requested range falls outside
> + * the arena's user VMA.
> + * * -EBUSY if any page in the requested range is already allocated, or if
> + * contention on the arena's internal spinlock prevents the operation from
> + * completing.
> + */
> __bpf_kfunc int bpf_arena_reserve_pages(void *p__map, void *ptr__ign, u32 page_cnt)
> {
> struct bpf_map *map = p__map;
The page-management kfuncs exposed by BPF arena -
bpf_arena_alloc_pages(), bpf_arena_free_pages() and
bpf_arena_reserve_pages() - are part of the BPF kfunc ABI but lack
rendered documentation. Their contracts (valid argument ranges,
sleepable-only context, and the set of error returns) are today only
discoverable by reading kernel/bpf/arena.c.
Add a kernel-doc comment block above each of the three kfuncs and
render them under a new "BPF arena kfuncs" subsection in
Documentation/bpf/kfuncs.rst, alongside the existing core kfunc
subsections.
No functional change.
Signed-off-by: Dhiraj Shah <find.dhiraj@gmail.com>
---
Changes in v2:
- Fix the return-value description for bpf_arena_alloc_pages(): the kfunc
returns a user-space virtual address (translated by the BPF JIT for
accesses from the BPF program), not a kernel pointer. Thanks to Alexei
Starovoitov, Emil Tsalapatis and the AI reviewers for catching this.
- Drop the "callable only from sleepable BPF programs" claims for
bpf_arena_alloc_pages() and bpf_arena_free_pages(): the verifier
rewrites these calls to their _non_sleepable variants when the calling
program is non-sleepable, so callers do not need to care about this
distinction. Thanks to Emil Tsalapatis.
- Tighten the prose in Documentation/bpf/kfuncs.rst accordingly.
v1: https://lore.kernel.org/bpf/20260521043553.199781-1-find.dhiraj@gmail.com/
Documentation/bpf/kfuncs.rst | 26 ++++++++++++++++
kernel/bpf/arena.c | 59 ++++++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+)
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 75e6c078e0e7..28b6b477012a 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -732,3 +732,29 @@ the verifier. bpf_cgroup_ancestor() can be used as follows:
BPF provides a set of kfuncs that can be used to query, allocate, mutate, and
destroy struct cpumask * objects. Please refer to :ref:`cpumasks-header-label`
for more details.
+
+4.4 BPF arena kfuncs
+--------------------
+
+A BPF arena (``BPF_MAP_TYPE_ARENA``) is a sparsely-populated shared memory
+region that a BPF program and a user-space process can both address. The
+following kfuncs allow a BPF program to allocate, free, and reserve pages
+within an arena:
+
+.. kernel-doc:: kernel/bpf/arena.c
+ :identifiers: bpf_arena_alloc_pages bpf_arena_free_pages bpf_arena_reserve_pages
+
+A typical pattern is to allocate one or more pages, write to them from BPF,
+and let user space access the same pages through its mapping of the arena:
+
+.. code-block:: c
+
+ void __arena *page;
+
+ page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+ if (!page)
+ return -ENOMEM;
+
+ /* ... use the page from BPF; user space sees the same bytes ... */
+
+ bpf_arena_free_pages(&arena, page, 1);
diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
index 49a8f7b1beef..948a43159106 100644
--- a/kernel/bpf/arena.c
+++ b/kernel/bpf/arena.c
@@ -870,6 +870,31 @@ static void arena_free_irq(struct irq_work *iw)
__bpf_kfunc_start_defs();
+/**
+ * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @addr__ign: Page-aligned user-space address within the arena at which to
+ * place the allocation, or %NULL to let the kernel choose. When
+ * non-NULL the address must fall inside the arena's user VMA
+ * range; otherwise the allocation fails.
+ * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
+ * than the arena's configured size in pages.
+ * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
+ * @flags: Reserved for future use; must be 0.
+ *
+ * Allocates @page_cnt pages and inserts them into the arena at the offset
+ * corresponding to @addr__ign (or at an arbitrary free offset when
+ * @addr__ign is %NULL). The pages become accessible to the BPF program
+ * immediately and to user space through the arena's mmap()ed region.
+ *
+ * Return:
+ * * The user-space virtual address of the start of the allocated region on
+ * success. The BPF JIT translates this address for accesses from the BPF
+ * program.
+ * * %NULL if @p__map is not an arena, @flags is non-zero, @page_cnt is zero
+ * or exceeds the arena size, @addr__ign is misaligned or outside the
+ * arena, @node_id is invalid, or the kernel is out of memory.
+ */
__bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt,
int node_id, u64 flags)
{
@@ -893,6 +918,20 @@ void *bpf_arena_alloc_pages_non_sleepable(void *p__map, void *addr__ign, u32 pag
return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id, false);
}
+
+/**
+ * bpf_arena_free_pages() - Free a range of pages within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @ptr__ign: User-space virtual address of the first page to free, as
+ * returned by bpf_arena_alloc_pages().
+ * @page_cnt: Number of pages to free.
+ *
+ * Releases the backing pages and unmaps them from any user-space mapping
+ * of the arena.
+ *
+ * The call is a no-op when @p__map is not an arena, when @page_cnt is zero,
+ * or when @ptr__ign is %NULL.
+ */
__bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt)
{
struct bpf_map *map = p__map;
@@ -913,6 +952,26 @@ void bpf_arena_free_pages_non_sleepable(void *p__map, void *ptr__ign, u32 page_c
arena_free_pages(arena, (long)ptr__ign, page_cnt, false);
}
+/**
+ * bpf_arena_reserve_pages() - Reserve a page range within a BPF arena.
+ * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
+ * @ptr__ign: Page-aligned user-space virtual address of the start of the
+ * range to reserve.
+ * @page_cnt: Number of pages to reserve. Zero is permitted and is a no-op.
+ *
+ * Marks @page_cnt pages starting at @ptr__ign as reserved so that subsequent
+ * bpf_arena_alloc_pages() calls will not place allocations in that range.
+ * No physical pages are allocated by this kfunc; the range is simply
+ * excluded from the arena's free space.
+ *
+ * Return:
+ * * 0 on success, or when @page_cnt is zero.
+ * * -EINVAL if @p__map is not an arena or the requested range falls outside
+ * the arena's user VMA.
+ * * -EBUSY if any page in the requested range is already allocated, or if
+ * contention on the arena's internal spinlock prevents the operation from
+ * completing.
+ */
__bpf_kfunc int bpf_arena_reserve_pages(void *p__map, void *ptr__ign, u32 page_cnt)
{
struct bpf_map *map = p__map;
--
2.43.0
> The page-management kfuncs exposed by BPF arena -
> bpf_arena_alloc_pages(), bpf_arena_free_pages() and
> bpf_arena_reserve_pages() - are part of the BPF kfunc ABI but lack
> rendered documentation. Their contracts (valid argument ranges,
> sleepable-only context, and the set of error returns) are today only
> discoverable by reading kernel/bpf/arena.c.
Is the description 'sleepable-only context' accurate here? The BPF
verifier supports calling bpf_arena_alloc_pages() and bpf_arena_free_pages()
in non-sleepable contexts by dynamically replacing the calls with their
non-sleepable variants. Looking at the BTF_ID_FLAGS definitions, they do not
include KF_SLEEPABLE. While the code documentation was fixed in v2 to reflect
this, the commit message text was not updated.
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index 49a8f7b1beef..948a43159106 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -870,6 +870,31 @@ static void arena_free_irq(struct irq_work *iw)
>
> __bpf_kfunc_start_defs();
>
> +/**
> + * bpf_arena_alloc_pages() - Allocate pages within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @addr__ign: Page-aligned user-space address within the arena at which to
> + * place the allocation, or %NULL to let the kernel choose. When
> + * non-NULL the address must fall inside the arena's user VMA
> + * range; otherwise the allocation fails.
> + * @page_cnt: Number of pages to allocate. Must be non-zero and no greater
> + * than the arena's configured size in pages.
> + * @node_id: NUMA node hint for the backing pages, or %NUMA_NO_NODE.
> + * @flags: Reserved for future use; must be 0.
> + *
> + * Allocates @page_cnt pages and inserts them into the arena at the offset
> + * corresponding to @addr__ign (or at an arbitrary free offset when
> + * @addr__ign is %NULL). The pages become accessible to the BPF program
> + * immediately and to user space through the arena's mmap()ed region.
> + *
> + * Return:
> + * * The user-space virtual address of the start of the allocated region on
> + * success. The BPF JIT translates this address for accesses from the BPF
> + * program.
> + * * %NULL if @p__map is not an arena, @flags is non-zero, @page_cnt is zero
> + * or exceeds the arena size, @addr__ign is misaligned or outside the
> + * arena, @node_id is invalid, or the kernel is out of memory.
> + */
Are there missing failure conditions in this documentation? It appears
bpf_arena_alloc_pages() can also return NULL if the requested address is
already allocated, or if there is no contiguous free space of the requested
size in the arena.
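As an illustration of those two extra failure cases, here is a toy user-space
model. It is only a sketch under stated assumptions: a fixed 8-page arena
tracked with a boolean array and a first-fit search, whereas the kernel uses a
range tree. All toy_* names are hypothetical and -1 stands in for NULL.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES 8 /* hypothetical arena size in pages */

struct toy_arena {
	bool used[TOY_PAGES];
};

/* True when pages [start, start + cnt) are inside the arena and all free. */
static bool toy_range_free(const struct toy_arena *a, size_t start, size_t cnt)
{
	if (cnt == 0 || cnt > TOY_PAGES || start > TOY_PAGES - cnt)
		return false;
	for (size_t i = start; i < start + cnt; i++)
		if (a->used[i])
			return false;
	return true;
}

/*
 * Returns the first page index of the allocation, or -1 on failure.
 * want < 0 means "let the allocator choose" (first fit).
 */
static long toy_alloc(struct toy_arena *a, long want, size_t cnt)
{
	long start = -1;

	if (cnt == 0 || cnt > TOY_PAGES)
		return -1;

	if (want >= 0) {
		/* Fixed placement fails if any page in the range is taken. */
		if (toy_range_free(a, (size_t)want, cnt))
			start = want;
	} else {
		/* Otherwise look for the first contiguous free run. */
		for (size_t s = 0; s + cnt <= TOY_PAGES; s++) {
			if (toy_range_free(a, s, cnt)) {
				start = (long)s;
				break;
			}
		}
	}
	if (start < 0)
		return -1;

	for (size_t i = 0; i < cnt; i++)
		a->used[(size_t)start + i] = true;
	return start;
}
```

In this model, a fixed-address request fails when the range is already in use,
and an unconstrained request fails when enough pages are free in total but no
contiguous run is long enough.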
[ ... ]
> @@ -913,6 +952,26 @@ void bpf_arena_free_pages_non_sleepable(void *p__map, void *ptr__ign, u32 page_c
> arena_free_pages(arena, (long)ptr__ign, page_cnt, false);
> }
>
> +/**
> + * bpf_arena_reserve_pages() - Reserve a page range within a BPF arena.
> + * @p__map: Pointer to a ``BPF_MAP_TYPE_ARENA`` map.
> + * @ptr__ign: Page-aligned user-space virtual address of the start of the
> + * range to reserve.
> + * @page_cnt: Number of pages to reserve. Zero is permitted and is a no-op.
> + *
> + * Marks @page_cnt pages starting at @ptr__ign as reserved so that subsequent
> + * bpf_arena_alloc_pages() calls will not place allocations in that range.
> + * No physical pages are allocated by this kfunc; the range is simply
> + * excluded from the arena's free space.
> + *
> + * Return:
> + * * 0 on success, or when @page_cnt is zero.
> + * * -EINVAL if @p__map is not an arena or the requested range falls outside
> + * the arena's user VMA.
> + * * -EBUSY if any page in the requested range is already allocated, or if
> + * contention on the arena's internal spinlock prevents the operation from
> + * completing.
> + */
Does this documentation correctly describe the error returns? Looking at
arena_reserve_pages() in kernel/bpf/arena.c, there's a mismatch between the
documented -EINVAL return for misaligned addresses and the actual
implementation:
    arena_reserve_pages() {
        ...
        if (uaddr & ~PAGE_MASK)
            return 0;
        ...
    }
The documentation implies that misalignment should return -EINVAL (as it's
a validation error), but the code silently succeeds with return 0.
For consistency, bpf_arena_alloc_pages() has the same behavior (returns
NULL for misalignment) and its documentation correctly states it returns
NULL for misaligned addresses. Should either the code be changed to return
-EINVAL for misalignment to match the new documentation, or should the
documentation be updated to state that misaligned addresses result in
returning 0 (no-op)?
Also, does this need to document -ENOMEM? If range_tree_clear() fails to
allocate a new node during a split operation, it seems this can return
-ENOMEM.
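For reference, the alignment test quoted above can be tried in user space with
the following minimal sketch. It assumes 4 KiB pages; the SKETCH_* macros and
is_misaligned() are hypothetical stand-ins for the kernel's PAGE_SIZE and
PAGE_MASK, not kernel code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL                   /* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1)) /* keeps page-number bits */

/* Returns 1 when uaddr has any bits set below the page boundary. */
static int is_misaligned(uint64_t uaddr)
{
	return (uaddr & ~SKETCH_PAGE_MASK) != 0;
}
```

With these assumptions, is_misaligned(0x1000) is 0 and is_misaligned(0x1001)
is 1; the second case is the one for which the quoted check returns 0 instead
of an error.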
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26360590766