audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

[PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Frederick Lawler 3 weeks, 5 days ago

The motivation behind the change is to give BPF LSM developers the
ability to report accesses via the audit subsystem much like how LSMs
operate today.

Series:

Patch 1: Introduces bpf_audit_*() kfuncs
Patch 2: Enables bpf_audit_*() kfuns
Patch 3: Prepares audit helpers used for testing
Patch 4: Adds self tests

Documentation will be added when this becomes a versioned series.

Key features:

1. Audit logs include type=AUDIT_BPF_LSM_ACCESS, BPF program ID, and comm
that triggered the hook by default

We wanted audit log consumers to be able to track who and what created
the entry. prog-id=%d is already used for BPF LOAD/UNLOAD logs, thus
is reused here for this distinction. Though, it may be better to use
the tag instead to capture which _specific_ version of the program
made the log, since prog-id can be reused.

2. Leverages BPF KF_AQUIRE/KF_RELEASE semantics to force use of
  bpf_audit_log_end().

One side effect of this decision is that the BPF documentation states
that these flags allow the pointer to struct bpf_audit_context to be 
stored in a map, and then exchanged through bpf_kptr_xchg(). However,
there's prior work with net/netfilter/nf_conntrack_bpf.c such that the
struct is not exposed as a kptr to support that functionality nor is
that supplying a dtor function. The verifier will not allow this use case
due to not exposing the __kptr. Ideally, we don't want the pointer to
be exchanged anyway because the reporting program can become ambiguous.
I am sure there are other edge cases WRT to keeping the audit buffer in a
strange state too that I cannot think of at this moment.

3. All bpf_audit_log_*() functions are destructive

The audit subsystem allows for AUDIT_FAIL_PANIC to be set when the
subsystem can detect that missing events. Further, some call paths may
invoke a BUG_ON(). Therefore all the functions are marked destructive.

4. Functions are callable once per bpf_audit_context

The rationale for this was to prevent abuse. Logs with repeated fields
are not helpful, and may not be handled by user space audit coherently.

This is in the same vein as not providing a audit_format() wrapper.

Similarly, some functions such as bpf_audit_log_path() and
bpf_audit_log_file() report the same information, thus can be
interchangeable in use.

5. API wraps security/lsm_audit.c

lsm_audit.c functions are multiplexed and not handled by BPF verifier
very well, thus the wrapped functions are isolated to their sole
purpose for use within hooks.

Key considerations:

1. Audit field ordering

AFAIK, user space audit is particular about what fields are
present and their order. This patch series does not address ordering.

My assumption is that the first three fields: type, prog-id, pid, comm
are well known, and user space can make an assumption that other
fields after those can appear in any order.

If that is not acceptable, I would propose that we leverage the struct
common_audit_data type order to be the order--much like how the type is
used for log_once() functionality.

I am open to other ideas.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>
---
Frederick Lawler (4):
      audit: Implement bpf_audit_log_*() wrappers
      audit/security: Enable audit BPF kfuncs
      selftests/bpf: Add audit helpers for BPF tests
      selftests/bpf: Add lsm_audit_kfuncs tests

 include/linux/lsm_audit.h                          |   1 +
 include/uapi/linux/audit.h                         |   1 +
 security/Makefile                                  |   2 +
 security/lsm_audit_kfuncs.c                        | 306 +++++++++++
 tools/testing/selftests/bpf/Makefile               |   3 +-
 tools/testing/selftests/bpf/audit_helpers.c        | 281 ++++++++++
 tools/testing/selftests/bpf/audit_helpers.h        |  55 ++
 .../selftests/bpf/prog_tests/lsm_audit_kfuncs.c    | 598 +++++++++++++++++++++
 .../selftests/bpf/progs/test_lsm_audit_kfuncs.c    | 263 +++++++++
 9 files changed, 1509 insertions(+), 1 deletion(-)
---
base-commit: ca0f39a369c5f927c3d004e63a5a778b08a9df94
change-id: 20260105-bpf-auditd-send-message-4a883067aab8

Best regards,
-- 
Frederick Lawler <fred@cloudflare.com>

Re: [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Kumar Kartikeya Dwivedi 3 weeks ago

On Wed, 11 Mar 2026 at 22:31, Frederick Lawler <fred@cloudflare.com> wrote:
>
> The motivation behind the change is to give BPF LSM developers the
> ability to report accesses via the audit subsystem much like how LSMs
> operate today.
>
> Series:
>
> Patch 1: Introduces bpf_audit_*() kfuncs
> Patch 2: Enables bpf_audit_*() kfuns
> Patch 3: Prepares audit helpers used for testing
> Patch 4: Adds self tests
>
> Documentation will be added when this becomes a versioned series.
>
> Key features:
>
> 1. Audit logs include type=AUDIT_BPF_LSM_ACCESS, BPF program ID, and comm
> that triggered the hook by default
>
> We wanted audit log consumers to be able to track who and what created
> the entry. prog-id=%d is already used for BPF LOAD/UNLOAD logs, thus
> is reused here for this distinction. Though, it may be better to use
> the tag instead to capture which _specific_ version of the program
> made the log, since prog-id can be reused.

I think just id is fine. Whoever is watching the prog can also watch
for LOAD/UNLOAD events and detect any recycling.

>
> 2. Leverages BPF KF_AQUIRE/KF_RELEASE semantics to force use of
>   bpf_audit_log_end().
>
> One side effect of this decision is that the BPF documentation states
> that these flags allow the pointer to struct bpf_audit_context to be
> stored in a map, and then exchanged through bpf_kptr_xchg(). However,

Yes, it is only allowed if you give a dtor. No dtor, then no kptr_xchg.

> there's prior work with net/netfilter/nf_conntrack_bpf.c such that the
> struct is not exposed as a kptr to support that functionality nor is
> that supplying a dtor function. The verifier will not allow this use case
> due to not exposing the __kptr. Ideally, we don't want the pointer to
> be exchanged anyway because the reporting program can become ambiguous.
> I am sure there are other edge cases WRT to keeping the audit buffer in a
> strange state too that I cannot think of at this moment.
>
> 3. All bpf_audit_log_*() functions are destructive
>
> The audit subsystem allows for AUDIT_FAIL_PANIC to be set when the
> subsystem can detect that missing events. Further, some call paths may
> invoke a BUG_ON(). Therefore all the functions are marked destructive.

I think the first part makes sense (i.e., the policy simply configured
the system to panic on failure).
However, in the second case, if the program is somehow able to trigger
BUG_ON() relied upon for internal invariants, it would be considered
broken.
I tried grepping through and didn't find anything that would cause
this, hence the whole thing about BUG_ON() in the cover letter only
adds to confusion.
Please drop it or describe cases which you were concerned about.

>
> 4. Functions are callable once per bpf_audit_context
>
> The rationale for this was to prevent abuse. Logs with repeated fields
> are not helpful, and may not be handled by user space audit coherently.
>

This rationale feels weak. What abuse are we talking about?
The LSM program is already written by a trusted entity.

> This is in the same vein as not providing a audit_format() wrapper.
>

I think the format() helper would allow for more flexibility.
I would like to hear what others think, but IMO we may not even want
to hardcode any fields by default and let the program decide what to
report.
prog-id, pid, comm, everything can be logged by the program itself, in
whatever order.

> Similarly, some functions such as bpf_audit_log_path() and
> bpf_audit_log_file() report the same information, thus can be
> interchangeable in use.
>
> 5. API wraps security/lsm_audit.c
>
> lsm_audit.c functions are multiplexed and not handled by BPF verifier
> very well, thus the wrapped functions are isolated to their sole
> purpose for use within hooks.

This part makes sense. I don't think we can (or should) support the
same multiplexing.

>
> Key considerations:
>
> 1. Audit field ordering
>
> AFAIK, user space audit is particular about what fields are
> present and their order. This patch series does not address ordering.
>
> My assumption is that the first three fields: type, prog-id, pid, comm
> are well known, and user space can make an assumption that other
> fields after those can appear in any order.
>
> If that is not acceptable, I would propose that we leverage the struct
> common_audit_data type order to be the order--much like how the type is
> used for log_once() functionality.
>
> I am open to other ideas.
>
> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
> ---
> Frederick Lawler (4):
>       audit: Implement bpf_audit_log_*() wrappers
>       audit/security: Enable audit BPF kfuncs
>       selftests/bpf: Add audit helpers for BPF tests
>       selftests/bpf: Add lsm_audit_kfuncs tests
>
>  include/linux/lsm_audit.h                          |   1 +
>  include/uapi/linux/audit.h                         |   1 +
>  security/Makefile                                  |   2 +
>  security/lsm_audit_kfuncs.c                        | 306 +++++++++++
>  tools/testing/selftests/bpf/Makefile               |   3 +-
>  tools/testing/selftests/bpf/audit_helpers.c        | 281 ++++++++++
>  tools/testing/selftests/bpf/audit_helpers.h        |  55 ++
>  .../selftests/bpf/prog_tests/lsm_audit_kfuncs.c    | 598 +++++++++++++++++++++
>  .../selftests/bpf/progs/test_lsm_audit_kfuncs.c    | 263 +++++++++
>  9 files changed, 1509 insertions(+), 1 deletion(-)
> ---
> base-commit: ca0f39a369c5f927c3d004e63a5a778b08a9df94
> change-id: 20260105-bpf-auditd-send-message-4a883067aab8
>
> Best regards,
> --
> Frederick Lawler <fred@cloudflare.com>
>
>

Re: [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Frederick Lawler 2 weeks, 6 days ago

Hi Kumar,

On Tue, Mar 17, 2026 at 03:43:36AM +0100, Kumar Kartikeya Dwivedi wrote:
> On Wed, 11 Mar 2026 at 22:31, Frederick Lawler <fred@cloudflare.com> wrote:
> > 3. All bpf_audit_log_*() functions are destructive
> >
> > The audit subsystem allows for AUDIT_FAIL_PANIC to be set when the
> > subsystem can detect that missing events. Further, some call paths may
> > invoke a BUG_ON(). Therefore all the functions are marked destructive.
> 
> I think the first part makes sense (i.e., the policy simply configured
> the system to panic on failure).
> However, in the second case, if the program is somehow able to trigger
> BUG_ON() relied upon for internal invariants, it would be considered
> broken.
> I tried grepping through and didn't find anything that would cause
> this, hence the whole thing about BUG_ON() in the cover letter only
> adds to confusion.
> Please drop it or describe cases which you were concerned about.
> 

bpf_audit_log_cause()
 -> audit_log_untrustedstring()
   -> audit_log_n_untrustedstring()
     -> audit_log_n_hex()

Was the primary call chain I was referring to, and yes this relies on
internal invariant of the SKB existing. I can remove from cover.

> >
> > 4. Functions are callable once per bpf_audit_context
> >
> > The rationale for this was to prevent abuse. Logs with repeated fields
> > are not helpful, and may not be handled by user space audit coherently.
> >
> 
> This rationale feels weak. What abuse are we talking about?
> The LSM program is already written by a trusted entity.
> 

I learned through off-list discussions that there's "undocumented" or
"unspoken" rules of log formatting and how auditd expects logs. Without
knowing exactly what these rules are, I can't provide any stronger
arguments other than to reduce duplication of fields in messages, or
exclude wrappers that supply unverified data.

WRT to unverified data, the socket wrappers accepting a netinf int.
In that case, the variable is at least verified via lookup to
report the interface in which it belongs. I left out wrapping
LSM_AUDIT_DATA_ANONINODE due to the passing of arbitrary string,
LSM_AUDIT_DATA_IPC arbitrary int. Others like for infiniband,
lockdown, nlmsgtype, were left out due to specificity, and not general
purpose usage.

Some of the rationale behind this is _not_ trusting BPF LSM authors as
much due to BPF LSM flexibility and accommodating user space audit.

Best,
Fred

Re: [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Alexei Starovoitov 2 weeks, 6 days ago

On Mon, Mar 16, 2026 at 7:44 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, 11 Mar 2026 at 22:31, Frederick Lawler <fred@cloudflare.com> wrote:
> >
> > The motivation behind the change is to give BPF LSM developers the
> > ability to report accesses via the audit subsystem much like how LSMs
> > operate today.

Sure, but bpf lsm-s don't need to follow such conventions.
audit is nothing but a message passing from kernel to user space
and done in a very inefficient way by wrapping strings into skb/netlink.
bpf progs can do this message passing already via various ways:
perfbuf, ringbuf, streams.
Teach your user space to consume one of them.

Re: [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Frederick Lawler 2 weeks, 6 days ago

Hi Alexei,

On Tue, Mar 17, 2026 at 06:15:59PM -0700, Alexei Starovoitov wrote:
> On Mon, Mar 16, 2026 at 7:44 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Wed, 11 Mar 2026 at 22:31, Frederick Lawler <fred@cloudflare.com> wrote:
> > >
> > > The motivation behind the change is to give BPF LSM developers the
> > > ability to report accesses via the audit subsystem much like how LSMs
> > > operate today.
> 
> Sure, but bpf lsm-s don't need to follow such conventions.
> audit is nothing but a message passing from kernel to user space
> and done in a very inefficient way by wrapping strings into skb/netlink.
> bpf progs can do this message passing already via various ways:
> perfbuf, ringbuf, streams.
> Teach your user space to consume one of them.

Audit provides additional functionality by keeping messages close to the
syscall, and in-line with other messages for that syscall. Aside from
that, BPF is arguably too flexible. I'm sure it's already understood,
but the idea here is to reduce bespoke logging implementations and reduce
attack surface for violation reporting. Audit is already provided by the
kernel and a well leveraged pipeline.

WRT to performance characteristics, would you rather see this
implementation leverage those maps in a way to reduce that, background
workers, or something else? This is something we've considered, but left out
of RFC to collect more opinion on perf considerations.

Thanks for bring this up. I forgot to include that in the RFC cover, sorry.

Best,
Fred

Re: [PATCH RFC bpf-next 0/4] audit: Expose audit subsystem to BPF LSM programs via BPF kfuncs

Posted by Alexei Starovoitov 2 weeks, 5 days ago

On Wed, Mar 18, 2026 at 10:49 AM Frederick Lawler <fred@cloudflare.com> wrote:
>
> Hi Alexei,
>
> On Tue, Mar 17, 2026 at 06:15:59PM -0700, Alexei Starovoitov wrote:
> > On Mon, Mar 16, 2026 at 7:44 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Wed, 11 Mar 2026 at 22:31, Frederick Lawler <fred@cloudflare.com> wrote:
> > > >
> > > > The motivation behind the change is to give BPF LSM developers the
> > > > ability to report accesses via the audit subsystem much like how LSMs
> > > > operate today.
> >
> > Sure, but bpf lsm-s don't need to follow such conventions.
> > audit is nothing but a message passing from kernel to user space
> > and done in a very inefficient way by wrapping strings into skb/netlink.
> > bpf progs can do this message passing already via various ways:
> > perfbuf, ringbuf, streams.
> > Teach your user space to consume one of them.
>
> Audit provides additional functionality by keeping messages close to the
> syscall, and in-line with other messages for that syscall. Aside from
> that, BPF is arguably too flexible. I'm sure it's already understood,
> but the idea here is to reduce bespoke logging implementations and reduce
> attack surface for violation reporting. Audit is already provided by the
> kernel and a well leveraged pipeline.

Sorry, I don't buy any of these arguments.
Nack.