[PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size

Yuyang Huang posted 2 patches 4 weeks ago
There is a newer version of this series
[PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size
Posted by Yuyang Huang 4 weeks ago
The bpf(cmd, attr, size) syscall copies up to 'size' bytes on input, but
several commands write outputs back to userspace unconditionally. If the
caller passes a short buffer, this can lead to out-of-bounds writes,
potentially overwriting adjacent userspace memory.

This series addresses this by introducing size-gating based on field type:

1) Mandatory fields (original ABI): Return -EINVAL in __sys_bpf() if the
   user-provided buffer size is smaller than the minimum size required to
   cover these fields. This hardens the syscall entry point for several
   commands.
2) Optional fields (later revisions): Skip writeback if the user-provided
   buffer size is too small to cover them. This is applied to
   'query.revision' in BPF_PROG_QUERY.
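
The two rules above can be modelled in a few lines. This is only an
illustrative sketch, not the code from the patch: the helper name
gate_writeback(), its parameters, and the offsets passed to it are
hypothetical and chosen purely to show the decision logic.

```python
import errno

# Hypothetical model of the two gating rules described above.
# Names, offsets and sizes are illustrative, not taken from the patch.
def gate_writeback(size, mandatory_end, optional_fields):
    """Return -EINVAL, or the list of optional fields safe to write back.

    size            -- attr size declared by the caller
    mandatory_end   -- end offset of the last mandatory (original ABI) field
    optional_fields -- {name: (offset, length)} for outputs added later
    """
    # Rule 1: the buffer must cover every mandatory field.
    if size < mandatory_end:
        return -errno.EINVAL
    # Rule 2: write an optional field only if it fits entirely in the buffer.
    return [name for name, (off, length) in optional_fields.items()
            if off + length <= size]

# Assumed values for illustration: a 40-byte mandatory region and an
# 8-byte optional output at offset 56.
optional = {"revision": (56, 8)}
print(gate_writeback(32, 40, optional))  # too short for mandatory fields
print(gate_writeback(40, 40, optional))  # accepted, optional field skipped
print(gate_writeback(64, 40, optional))  # accepted, optional field written
```

Under these assumed values, a caller declaring 32 bytes is rejected, a
caller declaring 40 bytes succeeds without any write past its buffer, and
a caller declaring 64 bytes receives the optional field.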

The first patch implements the plumbing and enforcement in the kernel.
The second patch adds a selftest to verify the behavior.

Yuyang Huang (2):
  bpf: align syscall writeback behavior with caller-declared size
  selftests/bpf: Add verification for BPF_PROG_QUERY attr size
    boundaries

 drivers/net/netkit.c                          |  5 +-
 include/linux/bpf-cgroup.h                    |  5 +-
 include/linux/bpf_mprog.h                     |  4 +-
 include/net/netkit.h                          |  6 +-
 include/net/tcx.h                             |  5 +-
 kernel/bpf/cgroup.c                           | 13 +--
 kernel/bpf/mprog.c                            |  5 +-
 kernel/bpf/syscall.c                          | 34 ++++++--
 kernel/bpf/tcx.c                              |  5 +-
 .../selftests/bpf/prog_tests/bpf_attr_size.c  | 84 +++++++++++++++++++
 10 files changed, 141 insertions(+), 25 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_attr_size.c

-- 
2.54.0.563.g4f69b47b94-goog
Re: [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size
Posted by Alexei Starovoitov 3 weeks, 4 days ago
On Fri, May 15, 2026 at 12:15 AM Yuyang Huang <yuyanghuang@google.com> wrote:
>
> The bpf(cmd, attr, size) syscall copies up to 'size' bytes on input, but
> several commands write outputs back to userspace unconditionally. If the
> caller passes a short buffer, this can lead to out-of-bounds writes,
> potentially overwriting adjacent userspace memory.

The whole thing sounds like a user space bug.
Please demonstrate a case where this issue is seen
by using libbpf APIs.
Re: [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size
Posted by Yuyang Huang 3 weeks, 4 days ago
> The whole thing sounds like a user space bug.

Could you please clarify why you consider this a userspace bug?

From our perspective, this is a binary backward compatibility issue.
An older binary compiled against older UAPI headers has no way of
knowing about newer fields. If it passes size = 40 (the correct size
at the time), the kernel violates the ABI contract by writing beyond
that declared size.

> Please demonstrate a case where this issue is seen by using libbpf APIs.

Our old project does not use libbpf; we interface directly with the
raw BPF syscall.

For example, our Android net_test suite uses this legacy 40-byte
Python ctypes struct layout:

  # legacy 40-byte layout (pre-revision field)
  BpfAttrProgQuery = cstruct.Struct(
    "bpf_attr_prog_query", "=IIIIQI4xQ",
    "target_fd attach_type query_flags attach_flags prog_ids_ptr "
    "prog_cnt prog_attach_flags"
  )
  # Invoked via: syscall(__NR_bpf, BPF_PROG_QUERY, &attr, 40)

As shown in the original discussion thread, without this fix a newer
kernel unconditionally writes the optional revision field at offset
56. Because copy_to_user() does not fault on adjacent mapped memory,
this silently overwrites 8 bytes of the caller's heap, starting 16
bytes past the end of the 40-byte buffer, and corrupts the Python
interpreter.
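
The offsets can be checked with Python's struct module. This is a sketch
under one assumption: that the newer layout appends three 64-bit fields
(link_ids, link_attach_flags, revision) to the legacy 40-byte layout,
which is consistent with the offset 56 mentioned above. The bytearray
only simulates the caller's memory; no syscall is made.

```python
import struct

LEGACY_FMT = "=IIIIQI4xQ"          # 40-byte layout quoted above
# Assumption: the newer layout appends three 64-bit fields
# (link_ids, link_attach_flags, revision) to the legacy one.
EXTENDED_FMT = LEGACY_FMT + "QQQ"

legacy_size = struct.calcsize(LEGACY_FMT)
revision_off = struct.calcsize(LEGACY_FMT + "QQ")  # two fields precede revision
extended_size = struct.calcsize(EXTENDED_FMT)

# Simulated memory: a zeroed 40-byte attr followed by unrelated neighbour
# bytes, marked with 0xAA so that any overwrite is visible.
memory = bytearray(b"\xaa" * extended_size)
memory[:legacy_size] = bytes(legacy_size)

# Stand-in for the unconditional writeback of an 8-byte revision value.
struct.pack_into("=Q", memory, revision_off, 1)

# Bytes outside the declared 40-byte buffer that no longer hold 0xAA.
clobbered = [i for i in range(legacy_size, extended_size) if memory[i] != 0xAA]
print(legacy_size, revision_off, extended_size)
print(clobbered)
```

In this simulation the clobbered bytes are offsets 56 through 63, all of
which lie outside the 40 bytes the caller declared.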

The size-gating pattern in this patch should be identical to the one
introduced by commit 47a71c1f9af0 in btf.c.

I am not 100% sure, but it seems that a binary statically linked with
an older libbpf version (or one using raw BPF syscalls directly) might
face a similar issue. Do we expect those binaries to break on newer
kernel versions?


On Mon, May 18, 2026 at 12:30 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, May 15, 2026 at 12:15 AM Yuyang Huang <yuyanghuang@google.com> wrote:
> >
> > The bpf(cmd, attr, size) syscall copies up to 'size' bytes on input, but
> > several commands write outputs back to userspace unconditionally. If the
> > caller passes a short buffer, this can lead to out-of-bounds writes,
> > potentially overwriting adjacent userspace memory.
>
> The whole thing sounds like a user space bug.
> Please demonstrate a case where this issue is seen
> by using libbpf APIs.