include/uapi/linux/bpf.h | 3 ++- tools/include/uapi/linux/bpf.h | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-)
Document that BPF_RB_NO_WAKEUP is not recommended for
bpf_ringbuf_discard().
A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
wakeup from a valid record, leaving an epoll-based userspace consumer
asleep even though data is available in the ring buffer.
Scenario:
epoll_wait(rb_fd); // blocks
rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup
This behavior is surprising in the context of bpf_ringbuf_discard()
as it seems natural not to want to wake userspace.
Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
include/uapi/linux/bpf.h | 3 ++-
tools/include/uapi/linux/bpf.h | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b7680a..c46b06d45904 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4645,7 +4645,8 @@ union bpf_attr {
* Description
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
- * of new data availability is sent.
+ * of new data availability is sent, which is not recommended as
+ * it can suppress a later adaptive wakeup from a subsequent submit.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* If **0** is specified in *flags*, an adaptive notification
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887de6..96de37c3b896 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4645,7 +4645,8 @@ union bpf_attr {
* Description
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
- * of new data availability is sent.
+ * of new data availability is sent, which is not recommended as
+ * it can suppress a later adaptive wakeup from a subsequent submit.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* If **0** is specified in *flags*, an adaptive notification
--
2.43.0
On Mon, Mar 30, 2026 at 5:18 AM Eyal Birger <eyal.birger@gmail.com> wrote:
>
> Document that BPF_RB_NO_WAKEUP is not recommended for
> bpf_ringbuf_discard().
>
> A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> wakeup from a valid record, leaving an epoll-based userspace consumer
> asleep even though data is available in the ring buffer.
>
> Scenario:
>
> epoll_wait(rb_fd); // blocks
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup
>
> This behavior is surprising in the context of bpf_ringbuf_discard()
> as it seems natural not to want to wake userspace.
why? discarded record still takes space in ringbuf and needs to be
consumed by user-space. So all the same concerns apply here whether
it's discarded or submitted ringbuf record.
>
> Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
> include/uapi/linux/bpf.h | 3 ++-
> tools/include/uapi/linux/bpf.h | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c8d400b7680a..c46b06d45904 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
> * Description
> * Discard reserved ring buffer sample, pointed to by *data*.
> * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * of new data availability is sent.
> + * of new data availability is sent, which is not recommended as
> + * it can suppress a later adaptive wakeup from a subsequent submit.
NAK, it's not "not recommended". If you are using NO_WAKEUP you
normally would use it with FORCE_WAKEUP as well. It's not a rule, but
that's the most practical combination. But if you are fancy and know
what you are doing, you are free to combine NO_WAKEUP, FORCE_WAKEUP
and "adaptive wakeup" in any form or shape you want.
If anything, instead, it might maybe help to emphasize that
bpf_ringbuf_discard() does live data in the ringbuf that needs to be
consumed by user-space code (normally transparently in libbpf or other
BPF libraries), and the only distinction between submitted and
discarded record is whether application's user code sees that record
or it's silently skipped.
> * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> * of new data availability is sent unconditionally.
> * If **0** is specified in *flags*, an adaptive notification
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 5e38b4887de6..96de37c3b896 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
> * Description
> * Discard reserved ring buffer sample, pointed to by *data*.
> * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * of new data availability is sent.
> + * of new data availability is sent, which is not recommended as
> + * it can suppress a later adaptive wakeup from a subsequent submit.
> * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> * of new data availability is sent unconditionally.
> * If **0** is specified in *flags*, an adaptive notification
> --
> 2.43.0
>
Hi,
On Mon, Mar 30, 2026 at 5:27 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Mar 30, 2026 at 5:18 AM Eyal Birger <eyal.birger@gmail.com> wrote:
> >
> > Document that BPF_RB_NO_WAKEUP is not recommended for
> > bpf_ringbuf_discard().
> >
> > A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> > wakeup from a valid record, leaving an epoll-based userspace consumer
> > asleep even though data is available in the ring buffer.
> >
> > Scenario:
> >
> > epoll_wait(rb_fd); // blocks
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup
> >
> > This behavior is surprising in the context of bpf_ringbuf_discard()
> > as it seems natural not to want to wake userspace.
>
> why? discarded record still takes space in ringbuf and needs to be
> consumed by user-space. So all the same concerns apply here whether
> it's discarded or submitted ringbuf record.
Right, but sadly this isn't obvious to someone who hasn't carefully
inspected the implementation.
>
> >
> > Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> > ---
> > include/uapi/linux/bpf.h | 3 ++-
> > tools/include/uapi/linux/bpf.h | 3 ++-
> > 2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c8d400b7680a..c46b06d45904 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> > * Description
> > * Discard reserved ring buffer sample, pointed to by *data*.
> > * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - * of new data availability is sent.
> > + * of new data availability is sent, which is not recommended as
> > + * it can suppress a later adaptive wakeup from a subsequent submit.
>
> NAK, it's not "not recommended". If you are using NO_WAKEUP you
> normally would use it with FORCE_WAKEUP as well. It's not a rule, but
> that's the most practical combination. But if you are fancy and know
> what you are doing, you are free to combine NO_WAKEUP, FORCE_WAKEUP
> and "adaptive wakeup" in any form or shape you want.
There are still possible corner cases - for example, multiple silent
discards might fill the ringbuf preventing a later placement of a
valid record - even if it's meant to use "force".
Point is, the user should be extra careful, that fact is documented
in other places, and imho should be noted here too.
>
>
> If anything, instead, it might maybe help to emphasize that
> bpf_ringbuf_discard() does live data in the ringbuf that needs to be
> consumed by user-space code (normally transparently in libbpf or other
> BPF libraries), and the only distinction between submitted and
> discarded record is whether application's user code sees that record
> or it's silently skipped.
Ack. Will send a v2 in that spirit.
Eyal.
>
> > * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> > * of new data availability is sent unconditionally.
> > * If **0** is specified in *flags*, an adaptive notification
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 5e38b4887de6..96de37c3b896 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> > * Description
> > * Discard reserved ring buffer sample, pointed to by *data*.
> > * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - * of new data availability is sent.
> > + * of new data availability is sent, which is not recommended as
> > + * it can suppress a later adaptive wakeup from a subsequent submit.
> > * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> > * of new data availability is sent unconditionally.
> > * If **0** is specified in *flags*, an adaptive notification
> > --
> > 2.43.0
> >
Eyal Birger <eyal.birger@gmail.com> writes:
> Document that BPF_RB_NO_WAKEUP is not recommended for
> bpf_ringbuf_discard().
>
> A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> wakeup from a valid record, leaving an epoll-based userspace consumer
> asleep even though data is available in the ring buffer.
>
> Scenario:
>
> epoll_wait(rb_fd); // blocks
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup
>
> This behavior is surprising in the context of bpf_ringbuf_discard()
> as it seems natural not to want to wake userspace.
It appears that once you do a submit or discard with NO_WAKEUP, you can
never go back to adaptive mode. You will need to do an explicit
BPF_RB_FORCE_WAKEUP to do a wakeup.
This looks like expected behaviour (by design) but for discard the
programmer might think: "I'm discarding and there's no data, why would I
wake anyone? Let me pass BPF_RB_NO_WAKEUP to be a good citizen."
I am not sure if we just want to change the description of the helper or
also fix the code to ignore the BPF_RB_NO_WAKEUP for discard?
Or a better approach would be to add a wakeup_needed flag to struct
bpf_ringbuf. When NO_WAKEUP suppresses a wakeup that would have fired
(cons_pos == rec_pos), it sets the flag. Any subsequent adaptive commit
sees the flag and sends the wakeup, clearing it. FORCE_WAKEUP also
clears it. But this could change how the ringbuf behaves for existing
programs.
I will let others comment on this.
> Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
> include/uapi/linux/bpf.h | 3 ++-
> tools/include/uapi/linux/bpf.h | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c8d400b7680a..c46b06d45904 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
> * Description
> * Discard reserved ring buffer sample, pointed to by *data*.
> * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * of new data availability is sent.
> + * of new data availability is sent, which is not recommended as
> + * it can suppress a later adaptive wakeup from a subsequent submit.
> * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> * of new data availability is sent unconditionally.
> * If **0** is specified in *flags*, an adaptive notification
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 5e38b4887de6..96de37c3b896 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
> * Description
> * Discard reserved ring buffer sample, pointed to by *data*.
> * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * of new data availability is sent.
> + * of new data availability is sent, which is not recommended as
> + * it can suppress a later adaptive wakeup from a subsequent submit.
> * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> * of new data availability is sent unconditionally.
> * If **0** is specified in *flags*, an adaptive notification
Hi,
On Mon, Mar 30, 2026 at 6:39 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>
> Eyal Birger <eyal.birger@gmail.com> writes:
>
> > Document that BPF_RB_NO_WAKEUP is not recommended for
> > bpf_ringbuf_discard().
> >
> > A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> > wakeup from a valid record, leaving an epoll-based userspace consumer
> > asleep even though data is available in the ring buffer.
> >
> > Scenario:
> >
> > epoll_wait(rb_fd); // blocks
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup
> >
> > This behavior is surprising in the context of bpf_ringbuf_discard()
> > as it seems natural not to want to wake userspace.
>
> It appears that once you do a submit or discard with NO_WAKEUP, you can
> never go back to adaptive mode. You will need to do an explicit
> BPF_RB_FORCE_WAKEUP to do a wakeup.
>
> This looks like expected behaviour (by design) but for discard the
> programmer might think: "I'm discarding and there's no data, why would I
> wake anyone? Let me pass BPF_RB_NO_WAKEUP to be a good citizen."
This is exactly what I've seen and it created real bugs.
>
> I am not sure if we just want to change the description of the helper or
> also fix the code to ignore the BPF_RB_NO_WAKEUP for discard?
I wanted to avoid breaking something that might be in use by advanced use-cases.
>
> Or a better approach would be to add a wakeup_needed flag to struct
> bpf_ringbuf. When NO_WAKEUP suppresses a wakeup that would have fired
> (cons_pos == rec_pos), it sets the flag. Any subsequent adaptive commit
> sees the flag and sends the wakeup, clearing it. FORCE_WAKEUP also
> clears it. But this could change how the ringbuf behaves for existing
> programs.
Considered this, but wanted to avoid a runtime penalty for this edge case.
>
> I will let others comment on this.
Thanks for your input!
Eyal.
>
> > Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> > ---
> > include/uapi/linux/bpf.h | 3 ++-
> > tools/include/uapi/linux/bpf.h | 3 ++-
> > 2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c8d400b7680a..c46b06d45904 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> > * Description
> > * Discard reserved ring buffer sample, pointed to by *data*.
> > * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - * of new data availability is sent.
> > + * of new data availability is sent, which is not recommended as
> > + * it can suppress a later adaptive wakeup from a subsequent submit.
> > * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> > * of new data availability is sent unconditionally.
> > * If **0** is specified in *flags*, an adaptive notification
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 5e38b4887de6..96de37c3b896 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> > * Description
> > * Discard reserved ring buffer sample, pointed to by *data*.
> > * If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - * of new data availability is sent.
> > + * of new data availability is sent, which is not recommended as
> > + * it can suppress a later adaptive wakeup from a subsequent submit.
> > * If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> > * of new data availability is sent unconditionally.
> > * If **0** is specified in *flags*, an adaptive notification
© 2016 - 2026 Red Hat, Inc.