[PATCH v2 0/2] Support BPF traversal of wakeup sources

Samuel Wu posted 2 patches 1 week ago
There is a newer version of this series
[PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Samuel Wu 1 week ago
This patchset adds requisite kfuncs for BPF programs to safely traverse
wakeup_sources.

Currently, a traversal of wakeup sources requires going through
/sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
sysfs are inefficient, as there can be hundreds of wakeup_sources, with each
wakeup source also having multiple attributes. debugfs is unstable and
insecure.

Adding kfuncs to lock/unlock wakeup sources allows BPF programs to safely
traverse the wakeup sources list. The head address of wakeup_sources can
safely be resolved through BPF helper functions or variable attributes.

On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
speedup (sampled 75 times in table below). For a device under load, the
speedup is greater.
+-------+----+----------+----------+
|       | n  | AVG (ms) | STD (ms) |
+-------+----+----------+----------+
| sysfs | 75 | 44.9     | 12.6     |
+-------+----+----------+----------+
| BPF   | 75 | 1.3      | 0.7      |
+-------+----+----------+----------+

The initial attempts at BPF traversal of wakeup_sources were with BPF
iterators [1]. However, BPF already allows for traversing of a simple list
with bpf_for(), and this current patchset has the added benefit of being
~2-3x more performant than BPF iterators.
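
For readers unfamiliar with the pattern, the bounded list walk described
above can be modelled in plain userspace C. This is only an illustrative
sketch, not code from the patches: struct ws_model, list_init(),
list_add_tail() and walk_sources() are hypothetical stand-ins, and the
explicit max_iter bound plays the role that a bpf_for() loop limit would
play in a real BPF program.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's intrusive list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, trimmed-down model of a wakeup source (assumption). */
struct ws_model {
	const char *name;
	unsigned long active_count;
	struct list_head entry;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Make an empty circular list: the head points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append node just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/*
 * Visit at most max_iter entries starting after head and sum their
 * active_count into *total. The loop stops early once it wraps back
 * to head. Returns the number of entries visited.
 */
static int walk_sources(const struct list_head *head, int max_iter,
			unsigned long *total)
{
	const struct list_head *pos;
	int visited = 0;

	assert(head != NULL && total != NULL);
	*total = 0;
	pos = head->next;
	for (int i = 0; i < max_iter; i++) {
		const struct ws_model *ws;

		if (pos == head)
			break;	/* wrapped around: whole list seen */
		ws = container_of(pos, struct ws_model, entry);
		*total += ws->active_count;
		visited++;
		pos = pos->next;
	}
	return visited;
}
```

In an actual BPF program the walk would sit between the lock and unlock
kfuncs, and each field would be read with CO-RE accessors rather than
dereferenced directly. If the bound is smaller than the list, the walk
simply stops early and reports how many entries it saw.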

[1]: https://lore.kernel.org/all/20260225210820.177674-1-wusamuel@google.com/

Changes in v2:
- Dropped CONFIG_PM_WAKEUP_STATS_SYSFS patch for future patchset
- Added declarations for kfuncs to .h to fix sparse and checkpatch warnings
- Added kfunc to get address of wakeup_source's head
- Added example bpf prog selftest for traversal of wakeup sources per Kumar
- Added *_fail.c selftest per Kumar
- More concise commit message in patch 1/2
- v1 link: https://lore.kernel.org/all/20260320160055.4114055-1-wusamuel@google.com/

Samuel Wu (2):
  PM: wakeup: Add kfuncs to traverse over wakeup_sources
  selftests/bpf: Add tests for wakeup_sources kfuncs

 drivers/base/power/power.h                    |   7 ++
 drivers/base/power/wakeup.c                   |  72 +++++++++++-
 tools/testing/selftests/bpf/config            |   3 +-
 .../selftests/bpf/prog_tests/wakeup_source.c  | 101 +++++++++++++++++
 .../selftests/bpf/progs/test_wakeup_source.c  | 107 ++++++++++++++++++
 .../selftests/bpf/progs/wakeup_source.h       |  22 ++++
 .../selftests/bpf/progs/wakeup_source_fail.c  |  63 +++++++++++
 7 files changed, 372 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/wakeup_source.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_wakeup_source.c
 create mode 100644 tools/testing/selftests/bpf/progs/wakeup_source.h
 create mode 100644 tools/testing/selftests/bpf/progs/wakeup_source_fail.c

-- 
2.53.0.1018.g2bb0e51243-goog
Re: [PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Puranjay Mohan 1 week ago
Samuel Wu <wusamuel@google.com> writes:

> This patchset adds requisite kfuncs for BPF programs to safely traverse
> wakeup_sources, and puts a config flag around the sysfs interface.
>
> Currently, a traversal of wakeup sources require going through
> /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
> sysfs is inefficient, as there can be hundreds of wakeup_sources, with each
> wakeup source also having multiple attributes. debugfs is unstable and
> insecure.
>
> Adding kfuncs to lock/unlock wakeup sources allows BPF program to safely
> traverse the wakeup sources list. The head address of wakeup_sources can
> safely be resolved through BPF helper functions or variable attributes.
>
> On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
> speedup (sampled 75 times in table below). For a device under load, the
> speedup is greater.
> +-------+----+----------+----------+
> |       | n  | AVG (ms) | STD (ms) |
> +-------+----+----------+----------+
> | sysfs | 75 | 44.9     | 12.6     |
> +-------+----+----------+----------+
> | BPF   | 75 | 1.3      | 0.7      |
> +-------+----+----------+----------+
>
> The initial attempts for BPF traversal of wakeup_sources was with BPF
> iterators [1]. However, BPF already allows for traversing of a simple list
> with bpf_for(), and this current patchset has the added benefit of being
> ~2-3x more performant than BPF iterators.

I left some inline comments on patch 1, but the high level concern is
that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
KF_RELEASE tracking is working against the verifier rather than with it.
Nothing actually prevents a BPF program from walking the list without
the lock, and the whole pointer encoding trick goes away if this is done
as an open-coded iterator instead.

Thanks,
Puranjay
Re: [PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Kumar Kartikeya Dwivedi 1 week ago
On Thu, 26 Mar 2026 at 13:20, Puranjay Mohan <puranjay@kernel.org> wrote:
>
> Samuel Wu <wusamuel@google.com> writes:
>
> > This patchset adds requisite kfuncs for BPF programs to safely traverse
> > wakeup_sources, and puts a config flag around the sysfs interface.
> >
> > Currently, a traversal of wakeup sources require going through
> > /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
> > sysfs is inefficient, as there can be hundreds of wakeup_sources, with each
> > wakeup source also having multiple attributes. debugfs is unstable and
> > insecure.
> >
> > Adding kfuncs to lock/unlock wakeup sources allows BPF program to safely
> > traverse the wakeup sources list. The head address of wakeup_sources can
> > safely be resolved through BPF helper functions or variable attributes.
> >
> > On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
> > speedup (sampled 75 times in table below). For a device under load, the
> > speedup is greater.
> > +-------+----+----------+----------+
> > |       | n  | AVG (ms) | STD (ms) |
> > +-------+----+----------+----------+
> > | sysfs | 75 | 44.9     | 12.6     |
> > +-------+----+----------+----------+
> > | BPF   | 75 | 1.3      | 0.7      |
> > +-------+----+----------+----------+
> >
> > The initial attempts for BPF traversal of wakeup_sources was with BPF
> > iterators [1]. However, BPF already allows for traversing of a simple list
> > with bpf_for(), and this current patchset has the added benefit of being
> > ~2-3x more performant than BPF iterators.
>
> I left some inline comments on patch 1, but the high level concern is
> that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
> KF_RELEASE tracking is working against the verifier rather than with it.
> Nothing actually prevents a BPF program from walking the list without
> the lock, and the whole pointer encoding trick goes away if this is done
> as an open-coded iterator instead.

Which is fine, the critical section is only doing CO-RE accesses, and
the SRCU lock is just to be able to read things in a valid state while
walking the list. It is all best-effort.
Open-coded iterators were already explored as an option in earlier
iterations of the series and discarded as a no-go.

>
> Thanks,
> Puranjay
Re: [PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Alexei Starovoitov 1 week ago
On Thu, Mar 26, 2026 at 7:54 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Thu, 26 Mar 2026 at 13:20, Puranjay Mohan <puranjay@kernel.org> wrote:
> >
> > Samuel Wu <wusamuel@google.com> writes:
> >
> > > This patchset adds requisite kfuncs for BPF programs to safely traverse
> > > wakeup_sources, and puts a config flag around the sysfs interface.
> > >
> > > Currently, a traversal of wakeup sources require going through
> > > /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
> > > sysfs is inefficient, as there can be hundreds of wakeup_sources, with each
> > > wakeup source also having multiple attributes. debugfs is unstable and
> > > insecure.
> > >
> > > Adding kfuncs to lock/unlock wakeup sources allows BPF program to safely
> > > traverse the wakeup sources list. The head address of wakeup_sources can
> > > safely be resolved through BPF helper functions or variable attributes.
> > >
> > > On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
> > > speedup (sampled 75 times in table below). For a device under load, the
> > > speedup is greater.
> > > +-------+----+----------+----------+
> > > |       | n  | AVG (ms) | STD (ms) |
> > > +-------+----+----------+----------+
> > > | sysfs | 75 | 44.9     | 12.6     |
> > > +-------+----+----------+----------+
> > > | BPF   | 75 | 1.3      | 0.7      |
> > > +-------+----+----------+----------+
> > >
> > > The initial attempts for BPF traversal of wakeup_sources was with BPF
> > > iterators [1]. However, BPF already allows for traversing of a simple list
> > > with bpf_for(), and this current patchset has the added benefit of being
> > > ~2-3x more performant than BPF iterators.
> >
> > I left some inline comments on patch 1, but the high level concern is
> > that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
> > KF_RELEASE tracking is working against the verifier rather than with it.
> > Nothing actually prevents a BPF program from walking the list without
> > the lock, and the whole pointer encoding trick goes away if this is done
> > as an open-coded iterator instead.
>
> Which is fine, the critical section is only doing CO-RE accesses, and
> the SRCU lock is just to be able to read things in a valid state while
> walking the list. It is all best-effort.
> Open coded iterators was already explored as an option in earlier
> iterations of the series and discarded as no-go.

kinda best-effort...
the way it's written bpf_wakeup_sources_get_head() returns
trusted list_head. It's then core-read-ed anyway.
Ideally it should be trusted only within that srcu CS
and invalidated by the verifier similar to KF_RCU_PROTECTED,
but that's a bigger task.
Instead let's make bpf_wakeup_sources_get_head() return 'void *',
so it's clearly untrusted.
Re: [PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Samuel Wu 1 week ago
On Thu, Mar 26, 2026 at 8:02 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Mar 26, 2026 at 7:54 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Thu, 26 Mar 2026 at 13:20, Puranjay Mohan <puranjay@kernel.org> wrote:
> > >
> > > Samuel Wu <wusamuel@google.com> writes:
> > >
> > > > This patchset adds requisite kfuncs for BPF programs to safely traverse
> > > > wakeup_sources, and puts a config flag around the sysfs interface.
> > > >
> > > > Currently, a traversal of wakeup sources require going through
> > > > /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
> > > > sysfs is inefficient, as there can be hundreds of wakeup_sources, with each
> > > > wakeup source also having multiple attributes. debugfs is unstable and
> > > > insecure.
> > > >
> > > > Adding kfuncs to lock/unlock wakeup sources allows BPF program to safely
> > > > traverse the wakeup sources list. The head address of wakeup_sources can
> > > > safely be resolved through BPF helper functions or variable attributes.
> > > >
> > > > On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
> > > > speedup (sampled 75 times in table below). For a device under load, the
> > > > speedup is greater.
> > > > +-------+----+----------+----------+
> > > > |       | n  | AVG (ms) | STD (ms) |
> > > > +-------+----+----------+----------+
> > > > | sysfs | 75 | 44.9     | 12.6     |
> > > > +-------+----+----------+----------+
> > > > | BPF   | 75 | 1.3      | 0.7      |
> > > > +-------+----+----------+----------+
> > > >
> > > > The initial attempts for BPF traversal of wakeup_sources was with BPF
> > > > iterators [1]. However, BPF already allows for traversing of a simple list
> > > > with bpf_for(), and this current patchset has the added benefit of being
> > > > ~2-3x more performant than BPF iterators.
> > >
> > > I left some inline comments on patch 1, but the high level concern is
> > > that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
> > > KF_RELEASE tracking is working against the verifier rather than with it.
> > > Nothing actually prevents a BPF program from walking the list without
> > > the lock, and the whole pointer encoding trick goes away if this is done
> > > as an open-coded iterator instead.
> >
> > Which is fine, the critical section is only doing CO-RE accesses, and
> > the SRCU lock is just to be able to read things in a valid state while
> > walking the list. It is all best-effort.
> > Open coded iterators was already explored as an option in earlier
> > iterations of the series and discarded as no-go.
>
> kinda best-effort...
> the way it's written bpf_wakeup_sources_get_head() returns
> trusted list_head. It's then core-read-ed anyway.
> Ideally it should be trusted only within that srcu CS
> and invalidated by the verifier similar to KF_RCU_PROTECTED,
> but that's bigger task.
> Instead let's make bpf_wakeup_sources_get_head() return 'void *',
> so it's clearly untrusted.

Thanks all for the fruitful discussion; this is more rigorous. I'll
update v3 so that `bpf_wakeup_sources_get_head()`'s return type is
`void *` and I can add a corresponding selftest that directly
dereferences the head and expects a verifier failure.
Re: [PATCH v2 0/2] Support BPF traversal of wakeup sources
Posted by Kumar Kartikeya Dwivedi 1 week ago
On Thu, 26 Mar 2026 at 17:26, Samuel Wu <wusamuel@google.com> wrote:
>
> On Thu, Mar 26, 2026 at 8:02 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Mar 26, 2026 at 7:54 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Thu, 26 Mar 2026 at 13:20, Puranjay Mohan <puranjay@kernel.org> wrote:
> > > >
> > > > Samuel Wu <wusamuel@google.com> writes:
> > > >
> > > > > This patchset adds requisite kfuncs for BPF programs to safely traverse
> > > > > wakeup_sources, and puts a config flag around the sysfs interface.
> > > > >
> > > > > Currently, a traversal of wakeup sources require going through
> > > > > /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to query
> > > > > sysfs is inefficient, as there can be hundreds of wakeup_sources, with each
> > > > > wakeup source also having multiple attributes. debugfs is unstable and
> > > > > insecure.
> > > > >
> > > > > Adding kfuncs to lock/unlock wakeup sources allows BPF program to safely
> > > > > traverse the wakeup sources list. The head address of wakeup_sources can
> > > > > safely be resolved through BPF helper functions or variable attributes.
> > > > >
> > > > > On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing ~34x
> > > > > speedup (sampled 75 times in table below). For a device under load, the
> > > > > speedup is greater.
> > > > > +-------+----+----------+----------+
> > > > > |       | n  | AVG (ms) | STD (ms) |
> > > > > +-------+----+----------+----------+
> > > > > | sysfs | 75 | 44.9     | 12.6     |
> > > > > +-------+----+----------+----------+
> > > > > | BPF   | 75 | 1.3      | 0.7      |
> > > > > +-------+----+----------+----------+
> > > > >
> > > > > The initial attempts for BPF traversal of wakeup_sources was with BPF
> > > > > iterators [1]. However, BPF already allows for traversing of a simple list
> > > > > with bpf_for(), and this current patchset has the added benefit of being
> > > > > ~2-3x more performant than BPF iterators.
> > > >
> > > > I left some inline comments on patch 1, but the high level concern is
> > > > that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
> > > > KF_RELEASE tracking is working against the verifier rather than with it.
> > > > Nothing actually prevents a BPF program from walking the list without
> > > > the lock, and the whole pointer encoding trick goes away if this is done
> > > > as an open-coded iterator instead.
> > >
> > > Which is fine, the critical section is only doing CO-RE accesses, and
> > > the SRCU lock is just to be able to read things in a valid state while
> > > walking the list. It is all best-effort.
> > > Open coded iterators was already explored as an option in earlier
> > > iterations of the series and discarded as no-go.
> >
> > kinda best-effort...
> > the way it's written bpf_wakeup_sources_get_head() returns
> > trusted list_head. It's then core-read-ed anyway.
> > Ideally it should be trusted only within that srcu CS
> > and invalidated by the verifier similar to KF_RCU_PROTECTED,
> > but that's bigger task.
> > Instead let's make bpf_wakeup_sources_get_head() return 'void *',
> > so it's clearly untrusted.
>
> Thanks all for the fruitful discussion; this is more rigorous. I'll
> update v3 so that `bpf_wakeup_sources_get_head()`'s return type is
> `void *` and I can add a corresponding selftest that directly
> dereferences the head and expects a verifier failure.

You could also use bpf_core_cast() instead of using macros to read
every field; it should be equivalent. You may still need the macros for
bitfields, but it should work otherwise.