[PATCH v4 0/3] Handle seccomp notification preemption

Sargun Dhillon posted 3 patches 4 years ago
Posted by Sargun Dhillon 4 years ago
This patchset addresses a race condition we've dealt with recently with
seccomp: programs interrupting syscalls while they're in progress. This
was exacerbated by Golang's[1] recent adoption of "Non-cooperative
goroutine preemption", in which the runtime tries to interrupt any
syscall that's been running for more than 10ms. Certain syscalls (mount,
for example) are non-trivial to handle in a reentrant manner in userspace.

This series adds a per-filter flag that makes the notifying process switch
to a "TASK_KILLABLE" wait, as opposed to returning to userspace on
non-fatal signals.
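
As a rough illustration of how a supervisor might opt in, here is a minimal
sketch. It assumes the flag is exposed to userspace as
SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV with the value (1UL << 5), and that
it is only meaningful together with SECCOMP_FILTER_FLAG_NEW_LISTENER; the
fallback defines cover older headers. The helper names notify_filter_flags()
and install_notify_filter() are hypothetical and not part of this series.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older UAPI headers (values assumed from mainline). */
#ifndef SECCOMP_FILTER_FLAG_NEW_LISTENER
#define SECCOMP_FILTER_FLAG_NEW_LISTENER (1UL << 3)
#endif
#ifndef SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV
#define SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV (1UL << 5)
#endif
#ifndef SECCOMP_RET_USER_NOTIF
#define SECCOMP_RET_USER_NOTIF 0x7fc00000U
#endif

/* Hypothetical helper: flags for a notifying filter, optionally asking
 * for the killable wait described above. */
static unsigned long notify_filter_flags(int wait_killable)
{
	unsigned long flags = SECCOMP_FILTER_FLAG_NEW_LISTENER;

	if (wait_killable)
		flags |= SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV;
	return flags;
}

/* Hypothetical helper: forward syscall `nr` to a user notifier and allow
 * everything else. Returns the listener fd, or -1 with errno set.
 * The architecture check a real filter needs is omitted for brevity. */
static int install_notify_filter(int nr, int wait_killable)
{
	struct sock_filter insns[] = {
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int)nr, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	/* Unprivileged callers must set no_new_privs before filtering. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	return (int)syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
			    notify_filter_flags(wait_killable), &prog);
}
```

A container runtime would call something like
install_notify_filter(__NR_mount, 1) in the child before exec and hand the
returned fd to the supervisor. On kernels without this series the call is
expected to fail, so a caller would likely want to retry without the
killable flag.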

Changes since v3[4]:
 * Clean up tests
   * Split out helper function (dedupe code)
   * Add some explanation about what's going on
 * Small documentation edit

Changes since v2[3]:
 * Split out addfd patches
 * Move the flag to be per-filter (as opposed to per notification)

Changes since v1[2]:
 * Fix some documentation
 * Add Rata's patches to allow for direct return from addfd

[1]: https://github.com/golang/proposal/blob/master/design/24543-non-cooperative-preemption.md
[2]: https://lore.kernel.org/lkml/20210220090502.7202-1-sargun@sargun.me/
[3]: https://lore.kernel.org/all/20210426180610.2363-1-sargun@sargun.me/
[4]: https://lore.kernel.org/lkml/20220429023113.74993-1-sargun@sargun.me/

Sargun Dhillon (3):
  seccomp: Add wait_killable semantic to seccomp user notifier
  selftests/seccomp: Refactor get_proc_stat to split out file reading
    code
  selftests/seccomp: Add test for wait killable notifier

 .../userspace-api/seccomp_filter.rst          |  10 +
 include/linux/seccomp.h                       |   3 +-
 include/uapi/linux/seccomp.h                  |   2 +
 kernel/seccomp.c                              |  42 ++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 282 +++++++++++++++++-
 5 files changed, 320 insertions(+), 19 deletions(-)

-- 
2.25.1
Re: [PATCH v4 0/3] Handle seccomp notification preemption
Posted by Kees Cook 4 years ago
On Tue, 3 May 2022 01:09:55 -0700, Sargun Dhillon wrote:
> This patchset addresses a race condition we've dealt with recently with
> seccomp. Specifically programs interrupting syscalls while they're in
> progress. This was exacerbated by Golang's[1] recent adoption of
> "Non-cooperative goroutine preemption", in which they try to interrupt any
> syscall that's been running for more than 10ms. During certain syscalls,
> it's non-trivial to write them in a reentrant manner in userspace (mount).
> 
> [...]

Applied to for-next/seccomp, thanks!

[1/3] seccomp: Add wait_killable semantic to seccomp user notifier
      https://git.kernel.org/kees/c/c2aa2dfef243
[2/3] selftests/seccomp: Refactor get_proc_stat to split out file reading code
      https://git.kernel.org/kees/c/922a1b520c5f
[3/3] selftests/seccomp: Add test for wait killable notifier
      https://git.kernel.org/kees/c/3b96a9c522b2

-- 
Kees Cook