[PATCH 00/17] exit: Making task exiting a first class concept

Eric W. Biederman posted 17 patches 4 years, 5 months ago
arch/m68k/kernel/ptrace.c    |  12 +----
fs/coredump.c                |  17 +++---
fs/exec.c                    |  12 +++--
fs/proc/array.c              |   9 +++-
include/linux/profile.h      |  26 ---------
include/linux/ptrace.h       |   5 +-
include/linux/sched.h        |   1 +
include/linux/sched/jobctl.h |   2 +
include/linux/sched/signal.h |   6 ++-
include/linux/tracehook.h    |  21 ++++----
kernel/exit.c                |  29 +++++-----
kernel/fork.c                |   2 +
kernel/profile.c             |  50 ------------------
kernel/ptrace.c              |  14 +++--
kernel/signal.c              | 122 +++++++++++++++++++++++--------------------
kernel/tsacct.c              |   7 ++-
mm/mmap.c                    |   1 -
17 files changed, 134 insertions(+), 202 deletions(-)
[PATCH 00/17] exit: Making task exiting a first class concept
Posted by Eric W. Biederman 4 years, 5 months ago
The changes below contain some cleanups and the work to make implement
first class asynchronous task exit.  Most of the cleanups are necessary
for this work but a couple of them (removing profile_task_exit and the
extra setting of PT_SEIZED in ptrace_attach) are included because I
stumbled over them and they are worth applying but they aren't
interesting enough to me to make be in their own patchset.

The core of this set of changes is the addition of
schedule_task_exit_locked.  Ptrace is cleaned up to avoid a conflict in
task->exit_code.  Then the existing task exit code is gradually moved
into the final shape of schedule_task_exit_locked.

This is the fundamental building block I need to fix alpha, m68k,
nios2 and any other architecture that does not always save all of
their registers except when entering into a ptrace context.

This is about half the work to allow coredump signals to use
short-circuit delivery.

With coredumps signals available for short-circuit delivery the
SA_IMMUTABLE hack can be replace by something clean.

The counting of the number of threads that have not been killed to
always set SIGNAL_GROUP_EXIT when a process exits and the coredump
signal short-circuit delivery is a foundation for updating the
SECCOMP_RET_KILL_THREAD implementation such that it can decide if it
should coredump without races.

I have most of those changes pretty much ready I just need to get these
changes finalized reviewed first.  At this point they are looking at
v5.18 material.

These patches are on top of:
https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git/ signal-for-v5.17

After these patches have been reviewed it is my plan to apply them to my
signal-for-v5.17 branch.  Any and all feedback is welcome.

Eric W. Biederman (17):
      exit: Remove profile_task_exit & profile_munmap
      exit: Coredumps reach do_group_exit
      exit: Fix the exit_code for wait_task_zombie
      exit: Use the correct exit_code in /proc/<pid>/stat
      taskstats: Cleanup the use of task->exit_code
      ptrace: Remove second setting of PT_SEIZED in ptrace_attach
      ptrace: Remove unused regs argument from ptrace_report_syscall
      ptrace/m68k: Stop open coding ptrace_report_syscall
      ptrace: Move setting/clearing ptrace_message into prace_stop
      ptrace: Return the signal to continue with from ptrace_stop
      ptrace: Separate task->ptrace_code out from task->exit_code
      signal: Compute the process exit_code in get_signal
      signal: Make individual tasks exiting a first class concept
      signal: Remove zap_other_threads
      signal: Add JOBCTL_WILL_EXIT to mark exiting tasks
      signal: Record the exit_code when an exit is scheduled
      signal: Always set SIGNAL_GROUP_EXIT on process exit

 arch/m68k/kernel/ptrace.c    |  12 +----
 fs/coredump.c                |  17 +++---
 fs/exec.c                    |  12 +++--
 fs/proc/array.c              |   9 +++-
 include/linux/profile.h      |  26 ---------
 include/linux/ptrace.h       |   5 +-
 include/linux/sched.h        |   1 +
 include/linux/sched/jobctl.h |   2 +
 include/linux/sched/signal.h |   6 ++-
 include/linux/tracehook.h    |  21 ++++----
 kernel/exit.c                |  29 +++++-----
 kernel/fork.c                |   2 +
 kernel/profile.c             |  50 ------------------
 kernel/ptrace.c              |  14 +++--
 kernel/signal.c              | 122 +++++++++++++++++++++++--------------------
 kernel/tsacct.c              |   7 ++-
 mm/mmap.c                    |   1 -
 17 files changed, 134 insertions(+), 202 deletions(-)

[PATCH 00/13] Removing tracehook.h
Posted by Eric W. Biederman 4 years, 3 months ago
While working on cleaning up do_exit I have been having to deal with the
code in tracehook.h.  Unfortunately the code in tracehook.h does not
make sense as organized.

This set of changes reorganizes things so that tracehook.h no longer
exists, and so that it's current contents are organized in a fashion
that is a little easier to understand.

The biggest change is that I lean into the fact that get_signal
always calls task_work_run and removes the logic that tried to
be smart and decouple task_work_run and get_signal as it has proven
to not be effective.

This is a conservative change and I am not changing the how things
like signal_pending operate (although it is probably justified).

A new header resume_user_mode.h is added to hold resume_user_mode_work
which was previously known as tracehook_notify_resume.

Eric W. Biederman (13):
      ptrace: Move ptrace_report_syscall into ptrace.h
      ptrace/arm: Rename tracehook_report_syscall report_syscall
      ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
      ptrace: Remove arch_syscall_{enter,exit}_tracehook
      ptrace: Remove tracehook_signal_handler
      task_work: Remove unnecessary include from posix_timers.h
      task_work: Introduce task_work_pending
      task_work: Call tracehook_notify_signal from get_signal on all architectures
      task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
      signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
      resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
      resume_user_mode: Move to resume_user_mode.h
      tracehook: Remove tracehook.h

 MAINTAINERS                          |   1 -
 arch/Kconfig                         |   5 +-
 arch/alpha/kernel/ptrace.c           |   5 +-
 arch/alpha/kernel/signal.c           |   4 +-
 arch/arc/kernel/ptrace.c             |   5 +-
 arch/arc/kernel/signal.c             |   4 +-
 arch/arm/kernel/ptrace.c             |  12 +-
 arch/arm/kernel/signal.c             |   4 +-
 arch/arm64/kernel/ptrace.c           |  14 +--
 arch/arm64/kernel/signal.c           |   4 +-
 arch/csky/kernel/ptrace.c            |   5 +-
 arch/csky/kernel/signal.c            |   4 +-
 arch/h8300/kernel/ptrace.c           |   5 +-
 arch/h8300/kernel/signal.c           |   4 +-
 arch/hexagon/kernel/process.c        |   4 +-
 arch/hexagon/kernel/signal.c         |   1 -
 arch/hexagon/kernel/traps.c          |   6 +-
 arch/ia64/kernel/process.c           |   4 +-
 arch/ia64/kernel/ptrace.c            |   6 +-
 arch/ia64/kernel/signal.c            |   1 -
 arch/m68k/kernel/ptrace.c            |   6 +-
 arch/m68k/kernel/signal.c            |   4 +-
 arch/microblaze/kernel/ptrace.c      |   5 +-
 arch/microblaze/kernel/signal.c      |   4 +-
 arch/mips/kernel/ptrace.c            |   5 +-
 arch/mips/kernel/signal.c            |   4 +-
 arch/nds32/include/asm/syscall.h     |   2 +-
 arch/nds32/kernel/ptrace.c           |   5 +-
 arch/nds32/kernel/signal.c           |   4 +-
 arch/nios2/kernel/ptrace.c           |   5 +-
 arch/nios2/kernel/signal.c           |   4 +-
 arch/openrisc/kernel/ptrace.c        |   5 +-
 arch/openrisc/kernel/signal.c        |   4 +-
 arch/parisc/kernel/ptrace.c          |   7 +-
 arch/parisc/kernel/signal.c          |   4 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
 arch/powerpc/kernel/signal.c         |   4 +-
 arch/riscv/kernel/ptrace.c           |   5 +-
 arch/riscv/kernel/signal.c           |   4 +-
 arch/s390/include/asm/entry-common.h |   1 -
 arch/s390/kernel/ptrace.c            |   1 -
 arch/s390/kernel/signal.c            |   5 +-
 arch/sh/kernel/ptrace_32.c           |   5 +-
 arch/sh/kernel/signal_32.c           |   4 +-
 arch/sparc/kernel/ptrace_32.c        |   5 +-
 arch/sparc/kernel/ptrace_64.c        |   5 +-
 arch/sparc/kernel/signal32.c         |   1 -
 arch/sparc/kernel/signal_32.c        |   4 +-
 arch/sparc/kernel/signal_64.c        |   4 +-
 arch/um/kernel/process.c             |   4 +-
 arch/um/kernel/ptrace.c              |   5 +-
 arch/x86/kernel/ptrace.c             |   1 -
 arch/x86/kernel/signal.c             |   5 +-
 arch/x86/mm/tlb.c                    |   1 +
 arch/xtensa/kernel/ptrace.c          |   5 +-
 arch/xtensa/kernel/signal.c          |   4 +-
 block/blk-cgroup.c                   |   2 +-
 fs/coredump.c                        |   1 -
 fs/exec.c                            |   1 -
 fs/io-wq.c                           |   6 +-
 fs/io_uring.c                        |  11 +-
 fs/proc/array.c                      |   1 -
 fs/proc/base.c                       |   1 -
 include/asm-generic/syscall.h        |   2 +-
 include/linux/entry-common.h         |  47 +-------
 include/linux/entry-kvm.h            |   2 +-
 include/linux/posix-timers.h         |   1 -
 include/linux/ptrace.h               |  78 ++++++++++++
 include/linux/resume_user_mode.h     |  64 ++++++++++
 include/linux/sched/signal.h         |  17 +++
 include/linux/task_work.h            |   5 +
 include/linux/tracehook.h            | 226 -----------------------------------
 include/uapi/linux/ptrace.h          |   2 +-
 kernel/entry/common.c                |  19 +--
 kernel/entry/kvm.c                   |   9 +-
 kernel/exit.c                        |   3 +-
 kernel/livepatch/transition.c        |   1 -
 kernel/seccomp.c                     |   1 -
 kernel/signal.c                      |  23 ++--
 kernel/task_work.c                   |   4 +-
 kernel/time/posix-cpu-timers.c       |   1 +
 mm/memcontrol.c                      |   2 +-
 security/apparmor/domain.c           |   1 -
 security/selinux/hooks.c             |   1 -
 84 files changed, 317 insertions(+), 462 deletions(-)

Eric
Re: [PATCH 00/13] Removing tracehook.h
Posted by Jens Axboe 4 years, 3 months ago
On 3/8/22 5:13 PM, Eric W. Biederman wrote:
> 
> While working on cleaning up do_exit I have been having to deal with the
> code in tracehook.h.  Unfortunately the code in tracehook.h does not
> make sense as organized.
> 
> This set of changes reorganizes things so that tracehook.h no longer
> exists, and so that it's current contents are organized in a fashion
> that is a little easier to understand.
> 
> The biggest change is that I lean into the fact that get_signal
> always calls task_work_run and removes the logic that tried to
> be smart and decouple task_work_run and get_signal as it has proven
> to not be effective.
> 
> This is a conservative change and I am not changing the how things
> like signal_pending operate (although it is probably justified).
> 
> A new header resume_user_mode.h is added to hold resume_user_mode_work
> which was previously known as tracehook_notify_resume.

This is a nice cleanup. I did bother me adding the TIF_NOTIFY_SIGNAL
bits and work hooks to something unrelated, but that's where other
things resided then. This makes it a lot better.

-- 
Jens Axboe
[PATCH 0/2] ptrace: Making the ptrace changes atomic
Posted by Eric W. Biederman 4 years, 3 months ago
While working on cleaning up the exit path it have had occasion to look
at what guarantees are provided for setting and reading the fields that
are provided in task_struct for ptraces.  Namely exit_code,
ptrace_message, and last_siginfo. It turns out as the ptrace interface
in the kernel was extended in the kernel the old existing interfaces
in the kernel were just wrapped and not properly updated to handle
the new functionality.  This lead to races and inconsistencies.

This fixes the reason for the races and inconsistencies by moving the
work of maintaining the ptrace fields into ptrace_stop.

The inconsistency that results in some ptrace_stop points continuing
with a signal while others will not I have left alone as it appears to
be part of our userspace ABI, and changing that risks breaking
userspace.

Eric W. Biederman (2):
      ptrace: Move setting/clearing ptrace_message into ptrace_stop
      ptrace: Return the signal to continue with from ptrace_stop

 include/linux/ptrace.h      | 17 +++++++----------
 include/uapi/linux/ptrace.h |  2 +-
 kernel/signal.c             | 40 ++++++++++++++++++++++++----------------
 3 files changed, 32 insertions(+), 27 deletions(-)

Eric
[GIT PULL] ptrace: Cleanups for v5.18
Posted by Eric W. Biederman 4 years, 3 months ago
Linus,

Please pull the ptrace-cleanups-for-v5.18 tag from the git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18

  HEAD: dcbc65aac28360df5f5a3b613043ccc0e81da3cf ptrace: Remove duplicated include in ptrace.c

This set of changes removes tracehook.h, moves modification of all of
the ptrace fields inside of siglock to remove races, adds a missing
permission check to ptrace.c

The removal of tracehook.h is quite significant as it has been a major
source of confusion in recent years.  Much of that confusion was
around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
making the semantics clearer).

For people who don't know tracehook.h is a vestiage of an attempt to
implement uprobes like functionality that was never fully merged, and
was later superseeded by uprobes when uprobes was merged.  For many
years now we have been removing what tracehook functionaly a little
bit at a time.  To the point where now anything left in tracehook.h is
some weird strange thing that is difficult to understand.

Eric W. Biederman (15):
      ptrace: Move ptrace_report_syscall into ptrace.h
      ptrace/arm: Rename tracehook_report_syscall report_syscall
      ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
      ptrace: Remove arch_syscall_{enter,exit}_tracehook
      ptrace: Remove tracehook_signal_handler
      task_work: Remove unnecessary include from posix_timers.h
      task_work: Introduce task_work_pending
      task_work: Call tracehook_notify_signal from get_signal on all architectures
      task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
      signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
      resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
      resume_user_mode: Move to resume_user_mode.h
      tracehook: Remove tracehook.h
      ptrace: Move setting/clearing ptrace_message into ptrace_stop
      ptrace: Return the signal to continue with from ptrace_stop

Jann Horn (1):
      ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE

Yang Li (1):
      ptrace: Remove duplicated include in ptrace.c


 MAINTAINERS                          |   1 -
 arch/Kconfig                         |   5 +-
 arch/alpha/kernel/ptrace.c           |   5 +-
 arch/alpha/kernel/signal.c           |   4 +-
 arch/arc/kernel/ptrace.c             |   5 +-
 arch/arc/kernel/signal.c             |   4 +-
 arch/arm/kernel/ptrace.c             |  12 +-
 arch/arm/kernel/signal.c             |   4 +-
 arch/arm64/kernel/ptrace.c           |  14 +--
 arch/arm64/kernel/signal.c           |   4 +-
 arch/csky/kernel/ptrace.c            |   5 +-
 arch/csky/kernel/signal.c            |   4 +-
 arch/h8300/kernel/ptrace.c           |   5 +-
 arch/h8300/kernel/signal.c           |   4 +-
 arch/hexagon/kernel/process.c        |   4 +-
 arch/hexagon/kernel/signal.c         |   1 -
 arch/hexagon/kernel/traps.c          |   6 +-
 arch/ia64/kernel/process.c           |   4 +-
 arch/ia64/kernel/ptrace.c            |   6 +-
 arch/ia64/kernel/signal.c            |   1 -
 arch/m68k/kernel/ptrace.c            |   5 +-
 arch/m68k/kernel/signal.c            |   4 +-
 arch/microblaze/kernel/ptrace.c      |   5 +-
 arch/microblaze/kernel/signal.c      |   4 +-
 arch/mips/kernel/ptrace.c            |   5 +-
 arch/mips/kernel/signal.c            |   4 +-
 arch/nds32/include/asm/syscall.h     |   2 +-
 arch/nds32/kernel/ptrace.c           |   5 +-
 arch/nds32/kernel/signal.c           |   4 +-
 arch/nios2/kernel/ptrace.c           |   5 +-
 arch/nios2/kernel/signal.c           |   4 +-
 arch/openrisc/kernel/ptrace.c        |   5 +-
 arch/openrisc/kernel/signal.c        |   4 +-
 arch/parisc/kernel/ptrace.c          |   7 +-
 arch/parisc/kernel/signal.c          |   4 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
 arch/powerpc/kernel/signal.c         |   4 +-
 arch/riscv/kernel/ptrace.c           |   5 +-
 arch/riscv/kernel/signal.c           |   4 +-
 arch/s390/include/asm/entry-common.h |   1 -
 arch/s390/kernel/ptrace.c            |   1 -
 arch/s390/kernel/signal.c            |   5 +-
 arch/sh/kernel/ptrace_32.c           |   5 +-
 arch/sh/kernel/signal_32.c           |   4 +-
 arch/sparc/kernel/ptrace_32.c        |   5 +-
 arch/sparc/kernel/ptrace_64.c        |   5 +-
 arch/sparc/kernel/signal32.c         |   1 -
 arch/sparc/kernel/signal_32.c        |   4 +-
 arch/sparc/kernel/signal_64.c        |   4 +-
 arch/um/kernel/process.c             |   4 +-
 arch/um/kernel/ptrace.c              |   5 +-
 arch/x86/kernel/ptrace.c             |   1 -
 arch/x86/kernel/signal.c             |   5 +-
 arch/x86/mm/tlb.c                    |   1 +
 arch/xtensa/kernel/ptrace.c          |   5 +-
 arch/xtensa/kernel/signal.c          |   4 +-
 block/blk-cgroup.c                   |   2 +-
 fs/coredump.c                        |   1 -
 fs/exec.c                            |   1 -
 fs/io-wq.c                           |   6 +-
 fs/io_uring.c                        |  11 +-
 fs/proc/array.c                      |   1 -
 fs/proc/base.c                       |   1 -
 include/asm-generic/syscall.h        |   2 +-
 include/linux/entry-common.h         |  47 +-------
 include/linux/entry-kvm.h            |   2 +-
 include/linux/posix-timers.h         |   1 -
 include/linux/ptrace.h               |  81 ++++++++++++-
 include/linux/resume_user_mode.h     |  64 ++++++++++
 include/linux/sched/signal.h         |  17 +++
 include/linux/task_work.h            |   5 +
 include/linux/tracehook.h            | 226 -----------------------------------
 include/uapi/linux/ptrace.h          |   2 +-
 kernel/entry/common.c                |  19 +--
 kernel/entry/kvm.c                   |   9 +-
 kernel/exit.c                        |   3 +-
 kernel/livepatch/transition.c        |   1 -
 kernel/ptrace.c                      |  47 +++++---
 kernel/seccomp.c                     |   1 -
 kernel/signal.c                      |  62 +++++-----
 kernel/task_work.c                   |   4 +-
 kernel/time/posix-cpu-timers.c       |   1 +
 mm/memcontrol.c                      |   2 +-
 security/apparmor/domain.c           |   1 -
 security/selinux/hooks.c             |   1 -
 85 files changed, 372 insertions(+), 495 deletions(-)

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Jens Axboe 4 years, 3 months ago
On 3/28/22 5:56 PM, Eric W. Biederman wrote:
> 
> Linus,
> 
> Please pull the ptrace-cleanups-for-v5.18 tag from the git tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18
> 
>   HEAD: dcbc65aac28360df5f5a3b613043ccc0e81da3cf ptrace: Remove duplicated include in ptrace.c
> 
> This set of changes removes tracehook.h, moves modification of all of
> the ptrace fields inside of siglock to remove races, adds a missing
> permission check to ptrace.c
> 
> The removal of tracehook.h is quite significant as it has been a major
> source of confusion in recent years.  Much of that confusion was
> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> making the semantics clearer).
> 
> For people who don't know tracehook.h is a vestiage of an attempt to
> implement uprobes like functionality that was never fully merged, and
> was later superseeded by uprobes when uprobes was merged.  For many
> years now we have been removing what tracehook functionaly a little
> bit at a time.  To the point where now anything left in tracehook.h is
> some weird strange thing that is difficult to understand.

FWIW, the notify/task_work/io_uring changes look good to me. Thanks for
cleaning this up, Eric.

-- 
Jens Axboe
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 3 months ago
On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> The removal of tracehook.h is quite significant as it has been a major
> source of confusion in recent years.  Much of that confusion was
> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> making the semantics clearer).

Hmm. I love removing tracehook.c, but this looks like it hasn't been
in linux-next.

The header file changes messes with other changes, and we have

  kernel/sched/fair.c:2884:9: error: implicit declaration of function
‘init_task_work’; did you mean ‘init_irq_work’?
[-Werror=implicit-function-declaration]
   2884 |         init_task_work(&p->numa_work, task_numa_work);
        |         ^~~~~~~~~~~~~~

as a result (also a few other things in that same file).

Now, this is trivial to fix - just add an include for
<linux/task_work.h> from that file - and that's the right thing to do
anyway.

But I'm a bit unhappy that this was either not tested in linux-next,
or if it was, I wasn't notified about the semantic in the pull
request.

So I've pulled this, and fixed up things in my merge, but I'm a bit
worried that there might be other situations like this where some
header file is no longer included and it was included implicitly
before through that disgusting tracehook.h header..

I *hope* it was just the scheduler header file updates that ended up
having this effect, and nothing else is affected.

Let's see if the test robots start complaining about non-x86
architecture-specific stuff that I don't build test.

              Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Stephen Rothwell 4 years, 3 months ago
Hi Linus,

On Mon, 28 Mar 2022 17:33:52 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >
> > The removal of tracehook.h is quite significant as it has been a major
> > source of confusion in recent years.  Much of that confusion was
> > around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> > making the semantics clearer).  
> 
> Hmm. I love removing tracehook.c, but this looks like it hasn't been
> in linux-next.

See https://lore.kernel.org/lkml/20220316165612.4f50faad@canb.auug.org.au/

-- 
Cheers,
Stephen Rothwell
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 3 months ago
On Mon, Mar 28, 2022 at 5:53 PM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> See https://lore.kernel.org/lkml/20220316165612.4f50faad@canb.auug.org.au/

Ok, so it was known, just not reported to me.

And the good news seems to be that at least there isn't anything
hiding elsewhere.

                Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Eric W. Biederman 4 years, 3 months ago
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> The removal of tracehook.h is quite significant as it has been a major
>> source of confusion in recent years.  Much of that confusion was
>> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
>> making the semantics clearer).
>
> Hmm. I love removing tracehook.c, but this looks like it hasn't been
> in linux-next.
>
> The header file changes messes with other changes, and we have
>
>   kernel/sched/fair.c:2884:9: error: implicit declaration of function
> ‘init_task_work’; did you mean ‘init_irq_work’?
> [-Werror=implicit-function-declaration]
>    2884 |         init_task_work(&p->numa_work, task_numa_work);
>         |         ^~~~~~~~~~~~~~
>
> as a result (also a few other things in that same file).
>
> Now, this is trivial to fix - just add an include for
> <linux/task_work.h> from that file - and that's the right thing to do
> anyway.
>
> But I'm a bit unhappy that this was either not tested in linux-next,
> or if it was, I wasn't notified about the semantic in the pull
> request.
>
> So I've pulled this, and fixed up things in my merge, but I'm a bit
> worried that there might be other situations like this where some
> header file is no longer included and it was included implicitly
> before through that disgusting tracehook.h header..
>
> I *hope* it was just the scheduler header file updates that ended up
> having this effect, and nothing else is affected.
>
> Let's see if the test robots start complaining about non-x86
> architecture-specific stuff that I don't build test.

Sorry for not mentioning that.  I had tracked it down.  It was
fundamentally in the scheduler headers changes removing an include of
task_work.h, so it didn't feel like there was anything I could do in my
tree.  I asked Ingo if he could fix his tree and unfortunately forgot
about it.

For the record there were also a couple of other pretty trivial
conflicts, the removal of nds32, some block_cgroup header where
an adjacent line was modified to what I was changing.  But it thankfully
looks like none of those caused you any problems.

Sorry about all of that I am about that.  I am running pretty weak this
last couple of days as a cold has been running through the household.


Dumb question because this seems to burning a few extra creativity
points.  Is there any way to create a signed tag and a branch with the
same name?  Or in general is there a good way to manage topic branches
and then tag them at the end before I send them?

Having a tag and a branch with the same name seems to completely confuse
git and it just tells me no I won't push anything to another git tree,
because what you are asking me to do is ambiguous.  So now I am having
to come up with two names for each topic branch, even if I only push the
tags upstream.

I feel like there is a best practice on how to manage tags and topic
branches and I just haven't seen it yet.

Eric

Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 3 months ago
On Mon, Mar 28, 2022 at 8:38 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Dumb question because this seems to burning a few extra creativity
> points.  Is there any way to create a signed tag and a branch with the
> same name?

Oh, absolutely.

But you may find it annoying, because as you noticed:

> Having a tag and a branch with the same name seems to completely confuse
> git and it just tells me no I won't push anything to another git tree,
> because what you are asking me to do is ambiguous.

Not at all.

git is not at all confused by the situation, git is in fact very aware
of it indeed.

But as git then tells you, exactly *because* it is aware that you have
picked the same name for both a branch and a tag, it will keep warning
you about the ambiguity of said name (but after warning, will
generally then preferentially use the tag of that name over the branch
of the same name).

So if you have both a branch and a tag named 'xyz', you generally need
to disambiguate them when you use them. That will make git happy,
because now it's not ambiguous any more.

(Technical detail: some git operations work on specific namespaces, so
"git branch -d xyz" should never remove a _tag_ called 'xyz', and as
such the name isn't ambiguous in the context of that git command)

And disambiguating them is simple, but I suspect you'll find it's
annoying enough that you simply don't want to use the same name for
tags and branches.

The full name of a tag 'x' is actually 'refs/tags/x', and the full
unambiguous name of a branch 'x' is 'refs/heads/x'.

So technically a tag and a branch can never _really_ have the same
name, because internally they have longer unambiguous names.

You would almost never actually use that full name, it's mostly an
internal git thing. Because even if you have ambiguous branch and tag
names, you'd then still shorten it to just 'tags/x' and 'heads/x'
respectively.

Git has other "namespaces" for these refs, too, notably
'refs/remotes/'. In fact, I guess you could make up your own namespace
too if you really wanted, although I don't think anybody really has
ever had a good reason for it.

(There is also the non-namespaced special refs like HEAD and
FETCH_HEAD and MERGE_HEAD for "current branch", "what you fetched" and
"what you are merging")

So you *can* do this:

    # create and check out the branch 'xyz'
    $ git checkout -b xyz master

    # create a tag called 'xyz', but to confuse everybody, make it
point somewhere else
    $ git tag xyz master~3

    # look at what 'xyz' means:
    $ git rev-parse xyz
    warning: refname 'xyz' is ambiguous.
    cffb2b72d3ed47f5093d128bd44d9ce136b6b5af

    # Hmm. it warned about it being ambiguous
    $ git rev-parse heads/xyz
    1930a6e739c4b4a654a69164dbe39e554d228915

    # Ok, it clearly took the tag, not the branch:
    $ git rev-parse tags/xyz
    cffb2b72d3ed47f5093d128bd44d9ce136b6b5af

so as you can see, yes, you can work with a tag and a branch that have
the same 'short name', but having to disambiguate them all the time
will likely just drive you mad.

And it's worth pointing out that the name does not imply a
relationship. So the branch called 'xyz' (ie refs/heads/xyz) has
absolutely *no* relationship to the tag called 'xyz' (ie
refs/tags/xyz) except for that ambiguous short name. So updating the
branch - by perhaps committing more to it - will in no way affect the
tag.

Also note that branches and tags are both "just refs" as far as git is
concerned, but git *does* give some semantic meaning to the
namespaces.

So the branch namespace (it those 'refs/heads/*' things) are things
that get updated automatically as you make new commits.

In contrast, the tag namespace is something you *can* update, but it's
considered odd, and if you want to overwrite an existing tag you
generally need to do something special (eg "git tag -f" to force
overwriting a new tag rather than telling you that you already have
one).

So in a very real way, to git a ref is a ref is a ref, with very
little to no real *technical* distinction. They are just ways to point
to the SHA1 hash of an object. But there are basically some common
semantic rules that are based on the namespaces, and all those git
operations that use certain namespaces by default.

            Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 3 months ago
On Mon, Mar 28, 2022 at 9:49 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So if you have both a branch and a tag named 'xyz', you generally need
> to disambiguate them when you use them. That will make git happy,
> because now it's not ambiguous any more.

On a similar but very different issue: this is not the only kind of
naming ambiguity you can have.

For example, try this insane thing:

    git tag Makefile

that creates a tag called 'Makefile' (pointing to your current HEAD, of course).

Now, guess what happens when you then do

    git log Makefile

that's right - git once again notices that you are doing something ambiguous.

In fact, git will be *so* unhappy about this kind of ambiguity that
(unlike the 'tag vs branch' case) it will not prefer one version of
reality over the other, and simply consider this to be a fatal error,
and say

    fatal: ambiguous argument 'Makefile': both revision and filename
    Use '--' to separate paths from revisions, like this:
    'git <command> [<revision>...] -- [<file>...]'

and as a result you will hopefully go "Oh, I didn't mean to have that
tag at all" and just remove the bogus tagname.

Because you probably made it by mistake.

But you *can* choose to say

   git log Makefile --

or

   git log refs/tags/Makefile

to make it clear that no, 'Makefile' is not the pathname in the
repository, you really mean the tag 'Makefile'.

Or use

  git log -- Makefile

or

    git log ./Makefile

to say "yeah, I meant the _pathname_ 'Makefile', not the tag".

Or, if you just like playing games, do

    git log Makefile -- Makefile

if you want to track the history of the path 'Makefile' starting at
the tag 'Makefile'.

But don't actually do this for real. There's a reason git notices these things.

Using ambiguous branch names (whether ambiguous with tag-names or with
filenames) is a pain that is not worth it in practice. Git will
notice, and git will allow you to work around it, but it's just a
BadIdea(tm) despite that.

But you probably want to be aware of issues like this if you script
things around git, though.

That "--" form in particular is generally the one you want to use for
scripting, if you want to allow people to do anything they damn well
please. It's the "Give them rope" syntax.

Of course, a much more common reason for "--" for pathname separation,
and one you may already be familiar with, is that you want to see the
history of a pathname that is not *currently* a pathname, but was one
at some point in the past.

So in my current kernel tree, doing

    $ git log arch/nds32/

will actually cause an error, because of "unknown revision or path not
in the working tree".

But if you really want to see the history of that now-deleted
architecture, you do

    $ git log -- arch/nds32/

and it's all good.

This concludes today's sermon on git,

             Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by pr-tracker-bot@kernel.org 4 years, 3 months ago
The pull request you sent on Mon, 28 Mar 2022 18:56:46 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1930a6e739c4b4a654a69164dbe39e554d228915

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago

Today ptrace_message is easy to overlook as it not a core part of
ptrace_stop.  It has been overlooked so much that there are places
that set ptrace_message and don't clear it, and places that never set
it.  So if you get an unlucky sequence of events the ptracer may be
able to read a ptrace_message that does not apply to the current
ptrace stop.

Move setting of ptrace_message into ptrace_stop so that it always gets
set before the stop, and always gets cleared after the stop.  This
prevents non-sense from being reported to userspace and makes
ptrace_message more visible in the ptrace helper functions so that
kernel developers can see it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/ptrace.h      |  9 +++------
 include/uapi/linux/ptrace.h |  2 +-
 kernel/signal.c             | 21 ++++++++++++---------
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 5310f43e4762..3e6b46e2b7be 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
 extern void ptrace_disable(struct task_struct *);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
-extern void ptrace_notify(int exit_code);
+extern void ptrace_notify(int exit_code, unsigned long message);
 extern void __ptrace_link(struct task_struct *child,
 			  struct task_struct *new_parent,
 			  const struct cred *ptracer_cred);
@@ -155,8 +155,7 @@ static inline bool ptrace_event_enabled(struct task_struct *task, int event)
 static inline void ptrace_event(int event, unsigned long message)
 {
 	if (unlikely(ptrace_event_enabled(current, event))) {
-		current->ptrace_message = message;
-		ptrace_notify((event << 8) | SIGTRAP);
+		ptrace_notify((event << 8) | SIGTRAP, message);
 	} else if (event == PTRACE_EVENT_EXEC) {
 		/* legacy EXEC report via SIGTRAP */
 		if ((current->ptrace & (PT_PTRACED|PT_SEIZED)) == PT_PTRACED)
@@ -424,8 +423,7 @@ static inline int ptrace_report_syscall(unsigned long message)
 	if (!(ptrace & PT_PTRACED))
 		return 0;
 
-	current->ptrace_message = message;
-	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
+	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0), message);
 
 	/*
 	 * this isn't the same as continuing with a signal, but it will do
@@ -437,7 +435,6 @@ static inline int ptrace_report_syscall(unsigned long message)
 		current->exit_code = 0;
 	}
 
-	current->ptrace_message = 0;
 	return fatal_signal_pending(current);
 }
 
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index b7af92e07d1f..195ae64a8c87 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -114,7 +114,7 @@ struct ptrace_rseq_configuration {
 
 /*
  * These values are stored in task->ptrace_message
- * by ptrace_report_syscall_* to describe the current syscall-stop.
+ * by ptrace_stop to describe the current syscall-stop.
  */
 #define PTRACE_EVENTMSG_SYSCALL_ENTRY	1
 #define PTRACE_EVENTMSG_SYSCALL_EXIT	2
diff --git a/kernel/signal.c b/kernel/signal.c
index c2dee5420567..a49ac7149256 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2191,7 +2191,8 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  * If we actually decide not to stop at all because the tracer
  * is gone, we keep current->exit_code unless clear_code.
  */
-static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t *info)
+static void ptrace_stop(int exit_code, int why, int clear_code,
+			unsigned long message, kernel_siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
@@ -2237,6 +2238,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	 */
 	smp_wmb();
 
+	current->ptrace_message = message;
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
 
@@ -2315,6 +2317,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	 */
 	spin_lock_irq(&current->sighand->siglock);
 	current->last_siginfo = NULL;
+	current->ptrace_message = 0;
 
 	/* LISTENING can be set only during STOP traps, clear it */
 	current->jobctl &= ~JOBCTL_LISTENING;
@@ -2327,7 +2330,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	recalc_sigpending_tsk(current);
 }
 
-static void ptrace_do_notify(int signr, int exit_code, int why)
+static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
 {
 	kernel_siginfo_t info;
 
@@ -2338,17 +2341,17 @@ static void ptrace_do_notify(int signr, int exit_code, int why)
 	info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
 
 	/* Let the debugger run.  */
-	ptrace_stop(exit_code, why, 1, &info);
+	ptrace_stop(exit_code, why, 1, message, &info);
 }
 
-void ptrace_notify(int exit_code)
+void ptrace_notify(int exit_code, unsigned long message)
 {
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
+	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
@@ -2504,10 +2507,10 @@ static void do_jobctl_trap(void)
 			signr = SIGTRAP;
 		WARN_ON_ONCE(!signr);
 		ptrace_do_notify(signr, signr | (PTRACE_EVENT_STOP << 8),
-				 CLD_STOPPED);
+				 CLD_STOPPED, 0);
 	} else {
 		WARN_ON_ONCE(!signr);
-		ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+		ptrace_stop(signr, CLD_STOPPED, 0, 0, NULL);
 		current->exit_code = 0;
 	}
 }
@@ -2561,7 +2564,7 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	 * comment in dequeue_signal().
 	 */
 	current->jobctl |= JOBCTL_STOP_DEQUEUED;
-	ptrace_stop(signr, CLD_TRAPPED, 0, info);
+	ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
 	signr = current->exit_code;
@@ -2891,7 +2894,7 @@ static void signal_delivered(struct ksignal *ksig, int stepping)
 	if (current->sas_ss_flags & SS_AUTODISARM)
 		sas_ss_reset(current);
 	if (stepping)
-		ptrace_notify(SIGTRAP);
+		ptrace_notify(SIGTRAP, 0);
 }
 
 void signal_setup_done(int failed, struct ksignal *ksig, int stepping)
-- 
2.29.2
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
On 03/15, Eric W. Biederman wrote:
>
> there are places
> that set ptrace_message and don't clear it, and places that never set
> it.

Yes, I too never understood this.

So I obviously like this change. The only problem (as usual) is that we
can never know if something depends on this old (and strange) behaviour.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
> 
> Today ptrace_message is easy to overlook as it not a core part of
> ptrace_stop.  It has been overlooked so much that there are places
> that set ptrace_message and don't clear it, and places that never set
> it.  So if you get an unlucky sequence of events the ptracer may be
> able to read a ptrace_message that does not apply to the current
> ptrace stop.
> 
> Move setting of ptrace_message into ptrace_stop so that it always gets
> set before the stop, and always gets cleared after the stop.  This
> prevents non-sense from being reported to userspace and makes
> ptrace_message more visible in the ptrace helper functions so that
> kernel developers can see it.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

This looks good to me. Did you happen to run the seccomp selftests
before/after these changes?

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Kees Cook <keescook@chromium.org> writes:

> On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
>> 
>> Today ptrace_message is easy to overlook as it not a core part of
>> ptrace_stop.  It has been overlooked so much that there are places
>> that set ptrace_message and don't clear it, and places that never set
>> it.  So if you get an unlucky sequence of events the ptracer may be
>> able to read a ptrace_message that does not apply to the current
>> ptrace stop.
>> 
>> Move setting of ptrace_message into ptrace_stop so that it always gets
>> set before the stop, and always gets cleared after the stop.  This
>> prevents non-sense from being reported to userspace and makes
>> ptrace_message more visible in the ptrace helper functions so that
>> kernel developers can see it.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> This looks good to me. Did you happen to run the seccomp selftests
> before/after these changes?

I did not.  This is a pure ptrace change.  Do you see a way that seccomp
could be involved?

Eric
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Fri, Mar 18, 2022 at 09:44:30AM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
> >> 
> >> Today ptrace_message is easy to overlook as it not a core part of
> >> ptrace_stop.  It has been overlooked so much that there are places
> >> that set ptrace_message and don't clear it, and places that never set
> >> it.  So if you get an unlucky sequence of events the ptracer may be
> >> able to read a ptrace_message that does not apply to the current
> >> ptrace stop.
> >> 
> >> Move setting of ptrace_message into ptrace_stop so that it always gets
> >> set before the stop, and always gets cleared after the stop.  This
> >> prevents non-sense from being reported to userspace and makes
> >> ptrace_message more visible in the ptrace helper functions so that
> >> kernel developers can see it.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >
> > This looks good to me. Did you happen to run the seccomp selftests
> > before/after these changes?
> 
> I did not.  This is a pure ptrace change.  Do you see a way that seccomp
> could be involved?

Sorry, that wasn't clear: seccomp includes a number of ptrace tests as
well, especially involving handling process death, messages, and
signals. I'll give it a spin; so far it seems fine.

-- 
Kees Cook
[PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago

The signal a task should continue with after a ptrace stop is
inconsistently read, cleared, and sent.  Solve this by reading and
clearing the signal to be sent in ptrace_stop.

In an ideal world everything except ptrace_signal would share a common
implementation of continuing with the signal, so ptracers could count
on the signal they ask to continue with actually being delivered.  For
now retain bug compatibility and just return with the signal number
the ptracer requested the code continue with.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/ptrace.h | 12 ++++++------
 kernel/signal.c        | 31 ++++++++++++++++++-------------
 2 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 3e6b46e2b7be..15b3d176b6b4 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
 extern void ptrace_disable(struct task_struct *);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
-extern void ptrace_notify(int exit_code, unsigned long message);
+extern int ptrace_notify(int exit_code, unsigned long message);
 extern void __ptrace_link(struct task_struct *child,
 			  struct task_struct *new_parent,
 			  const struct cred *ptracer_cred);
@@ -419,21 +419,21 @@ extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oa
 static inline int ptrace_report_syscall(unsigned long message)
 {
 	int ptrace = current->ptrace;
+	int signr;
 
 	if (!(ptrace & PT_PTRACED))
 		return 0;
 
-	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0), message);
+	signr = ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0),
+			      message);
 
 	/*
 	 * this isn't the same as continuing with a signal, but it will do
 	 * for normal use.  strace only continues with a signal if the
 	 * stopping signal is not SIGTRAP.  -brl
 	 */
-	if (current->exit_code) {
-		send_sig(current->exit_code, current, 1);
-		current->exit_code = 0;
-	}
+	if (signr)
+		send_sig(signr, current, 1);
 
 	return fatal_signal_pending(current);
 }
diff --git a/kernel/signal.c b/kernel/signal.c
index a49ac7149256..e53ee84b9021 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2188,15 +2188,17 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  * That makes it a way to test a stopped process for
  * being ptrace-stopped vs being job-control-stopped.
  *
- * If we actually decide not to stop at all because the tracer
- * is gone, we keep current->exit_code unless clear_code.
+ * Returns the signal the ptracer requested the code resume
+ * with.  If the code did not stop because the tracer is gone,
+ * the stop signal remains unchanged unless clear_code.
  */
-static void ptrace_stop(int exit_code, int why, int clear_code,
+static int ptrace_stop(int exit_code, int why, int clear_code,
 			unsigned long message, kernel_siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
 	bool gstop_done = false;
+	bool read_code = true;
 
 	if (arch_ptrace_stop_needed()) {
 		/*
@@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 
 		/* tasklist protects us from ptrace_freeze_traced() */
 		__set_current_state(TASK_RUNNING);
+		read_code = false;
 		if (clear_code)
-			current->exit_code = 0;
+			exit_code = 0;
 		read_unlock(&tasklist_lock);
 	}
 
@@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 	 * any signal-sending on another CPU that wants to examine it.
 	 */
 	spin_lock_irq(&current->sighand->siglock);
+	if (read_code) exit_code = current->exit_code;
 	current->last_siginfo = NULL;
 	current->ptrace_message = 0;
+	current->exit_code = 0;
 
 	/* LISTENING can be set only during STOP traps, clear it */
 	current->jobctl &= ~JOBCTL_LISTENING;
@@ -2328,9 +2333,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 	 * This sets TIF_SIGPENDING, but never clears it.
 	 */
 	recalc_sigpending_tsk(current);
+	return exit_code;
 }
 
-static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
+static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
 {
 	kernel_siginfo_t info;
 
@@ -2341,18 +2347,21 @@ static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long me
 	info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
 
 	/* Let the debugger run.  */
-	ptrace_stop(exit_code, why, 1, message, &info);
+	return ptrace_stop(exit_code, why, 1, message, &info);
 }
 
-void ptrace_notify(int exit_code, unsigned long message)
+int ptrace_notify(int exit_code, unsigned long message)
 {
+	int signr;
+
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
+	signr = ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
 	spin_unlock_irq(&current->sighand->siglock);
+	return signr;
 }
 
 /**
@@ -2511,7 +2520,6 @@ static void do_jobctl_trap(void)
 	} else {
 		WARN_ON_ONCE(!signr);
 		ptrace_stop(signr, CLD_STOPPED, 0, 0, NULL);
-		current->exit_code = 0;
 	}
 }
 
@@ -2564,15 +2572,12 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	 * comment in dequeue_signal().
 	 */
 	current->jobctl |= JOBCTL_STOP_DEQUEUED;
-	ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
+	signr = ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
-	signr = current->exit_code;
 	if (signr == 0)
 		return signr;
 
-	current->exit_code = 0;
-
 	/*
 	 * Update the siginfo structure if the signal has
 	 * changed.  If the debugger wanted something
-- 
2.29.2
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
Not sure I understand this patch, I can't apply it. I guess I need to
clone your tree first, will do later.

Just one question right now,

On 03/15, Eric W. Biederman wrote:
>
> +static int ptrace_stop(int exit_code, int why, int clear_code,
>  			unsigned long message, kernel_siginfo_t *info)
>  	__releases(&current->sighand->siglock)
>  	__acquires(&current->sighand->siglock)
>  {
>  	bool gstop_done = false;
> +	bool read_code = true;
>  
>  	if (arch_ptrace_stop_needed()) {
>  		/*
> @@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>  
>  		/* tasklist protects us from ptrace_freeze_traced() */
>  		__set_current_state(TASK_RUNNING);
> +		read_code = false;
>  		if (clear_code)
> -			current->exit_code = 0;
> +			exit_code = 0;
>  		read_unlock(&tasklist_lock);
>  	}
>  
> @@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>  	 * any signal-sending on another CPU that wants to examine it.
>  	 */
>  	spin_lock_irq(&current->sighand->siglock);
> +	if (read_code) exit_code = current->exit_code;

style ;)

>  	current->last_siginfo = NULL;
>  	current->ptrace_message = 0;
> +	current->exit_code = 0;

OK, I like it... but can't we remove the ugly "int clear_code" arg?

Oleg.
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> Not sure I understand this patch, I can't apply it. I guess I need to
> clone your tree first, will do later.
>
> Just one question right now,
>
> On 03/15, Eric W. Biederman wrote:
>>
>> +static int ptrace_stop(int exit_code, int why, int clear_code,
>>  			unsigned long message, kernel_siginfo_t *info)
>>  	__releases(&current->sighand->siglock)
>>  	__acquires(&current->sighand->siglock)
>>  {
>>  	bool gstop_done = false;
>> +	bool read_code = true;
>>  
>>  	if (arch_ptrace_stop_needed()) {
>>  		/*
>> @@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>>  
>>  		/* tasklist protects us from ptrace_freeze_traced() */
>>  		__set_current_state(TASK_RUNNING);
>> +		read_code = false;
>>  		if (clear_code)
>> -			current->exit_code = 0;
>> +			exit_code = 0;
>>  		read_unlock(&tasklist_lock);
>>  	}
>>  
>> @@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>>  	 * any signal-sending on another CPU that wants to examine it.
>>  	 */
>>  	spin_lock_irq(&current->sighand->siglock);
>> +	if (read_code) exit_code = current->exit_code;
>
> style ;)
>
>>  	current->last_siginfo = NULL;
>>  	current->ptrace_message = 0;
>> +	current->exit_code = 0;
>
> OK, I like it... but can't we remove the ugly "int clear_code" arg?

The flag clear_code controls what happens if a ptrace_stop does not
stop.  In particular clear_code means do not continue with
a signal if we can not stop.

For do_jobctl_trap calling ptrace_stop it clearly does not matter.

For ptrace_signal it would be a change in behavior, that would
cause the signal not to be delivered.

For ptrace_do_notify we don't have a signal to deliver unless userspace
gives us one so clear_code makes sense at that point.

Which is a long way of saying that no we can not remove clear_stop
because the behavior it is used to implement makes sense and userspace
can reasonably depend upon it.

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
On 03/17, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > OK, I like it... but can't we remove the ugly "int clear_code" arg?
>
> The flag clear_code controls what happens if a ptrace_stop does not
> stop.  In particular clear_code means do not continue with
> a signal if we can not stop.
>
> For do_jobctl_trap calling ptrace_stop it clearly does not matter.
>
> For ptrace_signal it would be a change in behavior, that would
> cause the signal not to be delivered.

Well I meant that "clear_code" should be false, iirc only
ptrace_report_syscall() should be updated to void the spurious
send_sig() if debugger exits... Nevermind, pleasee forget, this is
not as trivial as I thought.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> Not sure I understand this patch, I can't apply it. I guess I need to
> clone your tree first, will do later.

Yes this series is on top of:
  https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git kill-tracehook-for-v5.18

Nothing particularly interesting happens in my kill tracehook series
but I do get rid of tracehook.h and so everything gets moved and
renamed.

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
> 
> The signal a task should continue with after a ptrace stop is
> inconsistently read, cleared, and sent.  Solve this by reading and
> clearing the signal to be sent in ptrace_stop.
> 
> In an ideal world everything except ptrace_signal would share a common
> implementation of continuing with the signal, so ptracers could count
> on the signal they ask to continue with actually being delivered.  For
> now retain bug compatibility and just return with the signal number
> the ptracer requested the code continue with.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/ptrace.h | 12 ++++++------
>  kernel/signal.c        | 31 ++++++++++++++++++-------------
>  2 files changed, 24 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 3e6b46e2b7be..15b3d176b6b4 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
>  extern void ptrace_disable(struct task_struct *);
>  extern int ptrace_request(struct task_struct *child, long request,
>  			  unsigned long addr, unsigned long data);
> -extern void ptrace_notify(int exit_code, unsigned long message);
> +extern int ptrace_notify(int exit_code, unsigned long message);
> [...]
> -static void ptrace_stop(int exit_code, int why, int clear_code,
> +static int ptrace_stop(int exit_code, int why, int clear_code,
>  			unsigned long message, kernel_siginfo_t *info)
> [...]
> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> [...]
> -void ptrace_notify(int exit_code, unsigned long message)
> +int ptrace_notify(int exit_code, unsigned long message)

Just for robustness, how about marking the functions that have switched
from void to int return as __must_check (or at least just ptrace_notify)?

With that and the style nit Oleg already mentioned, yeah, this looks
good too.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Kees Cook <keescook@chromium.org> writes:

> On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
>> 
>> The signal a task should continue with after a ptrace stop is
>> inconsistently read, cleared, and sent.  Solve this by reading and
>> clearing the signal to be sent in ptrace_stop.
>> 
>> In an ideal world everything except ptrace_signal would share a common
>> implementation of continuing with the signal, so ptracers could count
>> on the signal they ask to continue with actually being delivered.  For
>> now retain bug compatibility and just return with the signal number
>> the ptracer requested the code continue with.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  include/linux/ptrace.h | 12 ++++++------
>>  kernel/signal.c        | 31 ++++++++++++++++++-------------
>>  2 files changed, 24 insertions(+), 19 deletions(-)
>> 
>> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
>> index 3e6b46e2b7be..15b3d176b6b4 100644
>> --- a/include/linux/ptrace.h
>> +++ b/include/linux/ptrace.h
>> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
>>  extern void ptrace_disable(struct task_struct *);
>>  extern int ptrace_request(struct task_struct *child, long request,
>>  			  unsigned long addr, unsigned long data);
>> -extern void ptrace_notify(int exit_code, unsigned long message);
>> +extern int ptrace_notify(int exit_code, unsigned long message);
>> [...]
>> -static void ptrace_stop(int exit_code, int why, int clear_code,
>> +static int ptrace_stop(int exit_code, int why, int clear_code,
>>  			unsigned long message, kernel_siginfo_t *info)
>> [...]
>> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
>> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
>> [...]
>> -void ptrace_notify(int exit_code, unsigned long message)
>> +int ptrace_notify(int exit_code, unsigned long message)
>
> Just for robustness, how about marking the functions that have switched
> from void to int return as __must_check (or at least just ptrace_notify)?

We can't.  There are historical cases that simply don't check if a
signal should be sent after the function, and they exist for every
function that is modified.

If we can modify the code so that everyone is checking the return value
than certainly, but that just doesn't happen to reflect how this
ptrace helper is being used today.

> With that and the style nit Oleg already mentioned, yeah, this looks
> good too.

Alright style nit fixed.

> Reviewed-by: Kees Cook <keescook@chromium.org>

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Fri, Mar 18, 2022 at 09:52:46AM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
> >> 
> >> The signal a task should continue with after a ptrace stop is
> >> inconsistently read, cleared, and sent.  Solve this by reading and
> >> clearing the signal to be sent in ptrace_stop.
> >> 
> >> In an ideal world everything except ptrace_signal would share a common
> >> implementation of continuing with the signal, so ptracers could count
> >> on the signal they ask to continue with actually being delivered.  For
> >> now retain bug compatibility and just return with the signal number
> >> the ptracer requested the code continue with.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >> ---
> >>  include/linux/ptrace.h | 12 ++++++------
> >>  kernel/signal.c        | 31 ++++++++++++++++++-------------
> >>  2 files changed, 24 insertions(+), 19 deletions(-)
> >> 
> >> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> >> index 3e6b46e2b7be..15b3d176b6b4 100644
> >> --- a/include/linux/ptrace.h
> >> +++ b/include/linux/ptrace.h
> >> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
> >>  extern void ptrace_disable(struct task_struct *);
> >>  extern int ptrace_request(struct task_struct *child, long request,
> >>  			  unsigned long addr, unsigned long data);
> >> -extern void ptrace_notify(int exit_code, unsigned long message);
> >> +extern int ptrace_notify(int exit_code, unsigned long message);
> >> [...]
> >> -static void ptrace_stop(int exit_code, int why, int clear_code,
> >> +static int ptrace_stop(int exit_code, int why, int clear_code,
> >>  			unsigned long message, kernel_siginfo_t *info)
> >> [...]
> >> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> >> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> >> [...]
> >> -void ptrace_notify(int exit_code, unsigned long message)
> >> +int ptrace_notify(int exit_code, unsigned long message)
> >
> > Just for robustness, how about marking the functions that have switched
> > from void to int return as __must_check (or at least just ptrace_notify)?
> 
> We can't.  There are historical cases that simply don't check if a
> signal should be sent after the function, and they exist for every
> function that is modified.

This seems at least worth documenting with a comment, otherwise we're
just trading one kind of "weirdness" (setting/clearing
current->exit_code) with another (ignoring the signal returned by
ptrace_notify()).

I see only two cases that would need comments:

static inline void ptrace_event(int event, unsigned long message)
{
        if (unlikely(ptrace_event_enabled(current, event))) {
                ptrace_notify((event << 8) | SIGTRAP, message);
        } else if (event == PTRACE_EVENT_EXEC) {
                /* legacy EXEC report via SIGTRAP */
                if ((current->ptrace & (PT_PTRACED|PT_SEIZED)) == PT_PTRACED)
                        send_sig(SIGTRAP, current, 0);
        }
}

static void signal_delivered(struct ksignal *ksig, int stepping)
{
	...
        if (stepping)
                ptrace_notify(SIGTRAP, 0);
}


-- 
Kees Cook
[PATCH 01/13] ptrace: Move ptrace_report_syscall into ptrace.h
Posted by Eric W. Biederman 4 years, 3 months ago
Move ptrace_report_syscall from tracehook.h into ptrace.h where it
belongs.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/ptrace.h    | 27 +++++++++++++++++++++++++++
 include/linux/tracehook.h | 26 --------------------------
 2 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 8aee2945ff08..91b1074edb4c 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -413,4 +413,31 @@ static inline void user_single_step_report(struct pt_regs *regs)
 extern int task_current_syscall(struct task_struct *target, struct syscall_info *info);
 
 extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact);
+
+/*
+ * ptrace report for syscall entry and exit looks identical.
+ */
+static inline int ptrace_report_syscall(unsigned long message)
+{
+	int ptrace = current->ptrace;
+
+	if (!(ptrace & PT_PTRACED))
+		return 0;
+
+	current->ptrace_message = message;
+	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
+
+	/*
+	 * this isn't the same as continuing with a signal, but it will do
+	 * for normal use.  strace only continues with a signal if the
+	 * stopping signal is not SIGTRAP.  -brl
+	 */
+	if (current->exit_code) {
+		send_sig(current->exit_code, current, 1);
+		current->exit_code = 0;
+	}
+
+	current->ptrace_message = 0;
+	return fatal_signal_pending(current);
+}
 #endif
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 88c007ab5ebc..998bc3863559 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -51,32 +51,6 @@
 #include <linux/blk-cgroup.h>
 struct linux_binprm;
 
-/*
- * ptrace report for syscall entry and exit looks identical.
- */
-static inline int ptrace_report_syscall(unsigned long message)
-{
-	int ptrace = current->ptrace;
-
-	if (!(ptrace & PT_PTRACED))
-		return 0;
-
-	current->ptrace_message = message;
-	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
-
-	/*
-	 * this isn't the same as continuing with a signal, but it will do
-	 * for normal use.  strace only continues with a signal if the
-	 * stopping signal is not SIGTRAP.  -brl
-	 */
-	if (current->exit_code) {
-		send_sig(current->exit_code, current, 1);
-		current->exit_code = 0;
-	}
-
-	current->ptrace_message = 0;
-	return fatal_signal_pending(current);
-}
 
 /**
  * tracehook_report_syscall_entry - task is about to attempt a system call
-- 
2.29.2
Re: [PATCH 01/13] ptrace: Move ptrace_report_syscall into ptrace.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:42AM -0600, Eric W. Biederman wrote:
> Move ptrace_report_syscall from tracehook.h into ptrace.h where it
> belongs.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Yes, as others have said, fixing the naming confusion alone is reason
enough. :)

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 02/13] ptrace/arm: Rename tracehook_report_syscall report_syscall
Posted by Eric W. Biederman 4 years, 3 months ago
Make the arm and arm64 code more concise and less confusing by
renaming the architecture specific tracehook_report_syscall to
report_syscall.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/arm/kernel/ptrace.c   | 7 +++----
 arch/arm64/kernel/ptrace.c | 7 +++----
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 43b963ea4a0e..e5aa3237853d 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -831,8 +831,7 @@ enum ptrace_syscall_dir {
 	PTRACE_SYSCALL_EXIT,
 };
 
-static void tracehook_report_syscall(struct pt_regs *regs,
-				    enum ptrace_syscall_dir dir)
+static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
 {
 	unsigned long ip;
 
@@ -856,7 +855,7 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 	int scno;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
+		report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
 	/* Do seccomp after ptrace; syscall may have changed. */
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
@@ -897,5 +896,5 @@ asmlinkage void syscall_trace_exit(struct pt_regs *regs)
 		trace_sys_exit(regs, regs_return_value(regs));
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
+		report_syscall(regs, PTRACE_SYSCALL_EXIT);
 }
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 39dbdfdc38d3..b7845575f86f 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1792,8 +1792,7 @@ enum ptrace_syscall_dir {
 	PTRACE_SYSCALL_EXIT,
 };
 
-static void tracehook_report_syscall(struct pt_regs *regs,
-				     enum ptrace_syscall_dir dir)
+static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
 {
 	int regno;
 	unsigned long saved_reg;
@@ -1842,7 +1841,7 @@ int syscall_trace_enter(struct pt_regs *regs)
 	unsigned long flags = read_thread_flags();
 
 	if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
-		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
+		report_syscall(regs, PTRACE_SYSCALL_ENTER);
 		if (flags & _TIF_SYSCALL_EMU)
 			return NO_SYSCALL;
 	}
@@ -1870,7 +1869,7 @@ void syscall_trace_exit(struct pt_regs *regs)
 		trace_sys_exit(regs, syscall_get_return_value(current, regs));
 
 	if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
-		tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
+		report_syscall(regs, PTRACE_SYSCALL_EXIT);
 
 	rseq_syscall(regs);
 }
-- 
2.29.2
Re: [PATCH 02/13] ptrace/arm: Rename tracehook_report_syscall report_syscall
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:43AM -0600, Eric W. Biederman wrote:
> Make the arm and arm64 code more concise and less confusing by
> renaming the architecture specific tracehook_report_syscall to
> report_syscall.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

As a person who is repeatedly stumped trying to finding this function,
yes please.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 03/13] ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
Posted by Eric W. Biederman 4 years, 3 months ago
Rename tracehook_report_syscall_{entry,exit} to
ptrace_report_syscall_{entry,exit} and place them in ptrace.h

There is no longer any generic tracehook infractructure so make
these ptrace specific functions ptrace specific.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/Kconfig                        |  2 +-
 arch/alpha/kernel/ptrace.c          |  5 ++-
 arch/arc/kernel/ptrace.c            |  5 ++-
 arch/arm/kernel/ptrace.c            |  5 ++-
 arch/arm64/kernel/ptrace.c          |  7 ++--
 arch/csky/kernel/ptrace.c           |  5 ++-
 arch/h8300/kernel/ptrace.c          |  5 ++-
 arch/hexagon/kernel/traps.c         |  6 ++--
 arch/ia64/kernel/ptrace.c           |  4 +--
 arch/m68k/kernel/ptrace.c           |  6 ++--
 arch/microblaze/kernel/ptrace.c     |  5 ++-
 arch/mips/kernel/ptrace.c           |  5 ++-
 arch/nds32/include/asm/syscall.h    |  2 +-
 arch/nds32/kernel/ptrace.c          |  5 ++-
 arch/nios2/kernel/ptrace.c          |  5 ++-
 arch/openrisc/kernel/ptrace.c       |  5 ++-
 arch/parisc/kernel/ptrace.c         |  7 ++--
 arch/powerpc/kernel/ptrace/ptrace.c |  8 ++---
 arch/riscv/kernel/ptrace.c          |  5 ++-
 arch/sh/kernel/ptrace_32.c          |  5 ++-
 arch/sparc/kernel/ptrace_32.c       |  5 ++-
 arch/sparc/kernel/ptrace_64.c       |  5 ++-
 arch/um/kernel/ptrace.c             |  5 ++-
 arch/xtensa/kernel/ptrace.c         |  5 ++-
 include/asm-generic/syscall.h       |  2 +-
 include/linux/entry-common.h        |  6 ++--
 include/linux/ptrace.h              | 51 +++++++++++++++++++++++++++++
 include/linux/tracehook.h           | 51 -----------------------------
 include/uapi/linux/ptrace.h         |  2 +-
 kernel/entry/common.c               |  1 +
 30 files changed, 109 insertions(+), 126 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 678a80713b21..a517a949eb1d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -217,7 +217,7 @@ config TRACE_IRQFLAGS_SUPPORT
 #	asm/syscall.h		supplying asm-generic/syscall.h interface
 #	linux/regset.h		user_regset interfaces
 #	CORE_DUMP_USE_REGSET	#define'd in linux/elf.h
-#	TIF_SYSCALL_TRACE	calls tracehook_report_syscall_{entry,exit}
+#	TIF_SYSCALL_TRACE	calls ptrace_report_syscall_{entry,exit}
 #	TIF_NOTIFY_RESUME	calls tracehook_notify_resume()
 #	signal delivery		calls tracehook_signal_handler()
 #
diff --git a/arch/alpha/kernel/ptrace.c b/arch/alpha/kernel/ptrace.c
index 8c43212ae38e..a1a239ea002d 100644
--- a/arch/alpha/kernel/ptrace.c
+++ b/arch/alpha/kernel/ptrace.c
@@ -15,7 +15,6 @@
 #include <linux/user.h>
 #include <linux/security.h>
 #include <linux/signal.h>
-#include <linux/tracehook.h>
 #include <linux/audit.h>
 
 #include <linux/uaccess.h>
@@ -323,7 +322,7 @@ asmlinkage unsigned long syscall_trace_enter(void)
 	unsigned long ret = 0;
 	struct pt_regs *regs = current_pt_regs();
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(current_pt_regs()))
+	    ptrace_report_syscall_entry(current_pt_regs()))
 		ret = -1UL;
 	audit_syscall_entry(regs->r0, regs->r16, regs->r17, regs->r18, regs->r19);
 	return ret ?: current_pt_regs()->r0;
@@ -334,5 +333,5 @@ syscall_trace_leave(void)
 {
 	audit_syscall_exit(current_pt_regs());
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(current_pt_regs(), 0);
+		ptrace_report_syscall_exit(current_pt_regs(), 0);
 }
diff --git a/arch/arc/kernel/ptrace.c b/arch/arc/kernel/ptrace.c
index 883391977fdf..54b419ac8bda 100644
--- a/arch/arc/kernel/ptrace.c
+++ b/arch/arc/kernel/ptrace.c
@@ -4,7 +4,6 @@
  */
 
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/sched/task_stack.h>
 #include <linux/regset.h>
 #include <linux/unistd.h>
@@ -258,7 +257,7 @@ long arch_ptrace(struct task_struct *child, long request,
 
 asmlinkage int syscall_trace_entry(struct pt_regs *regs)
 {
-	if (tracehook_report_syscall_entry(regs))
+	if (ptrace_report_syscall_entry(regs))
 		return ULONG_MAX;
 
 	return regs->r8;
@@ -266,5 +265,5 @@ asmlinkage int syscall_trace_entry(struct pt_regs *regs)
 
 asmlinkage void syscall_trace_exit(struct pt_regs *regs)
 {
-	tracehook_report_syscall_exit(regs, 0);
+	ptrace_report_syscall_exit(regs, 0);
 }
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index e5aa3237853d..bfe88c6e60d5 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -22,7 +22,6 @@
 #include <linux/hw_breakpoint.h>
 #include <linux/regset.h>
 #include <linux/audit.h>
-#include <linux/tracehook.h>
 #include <linux/unistd.h>
 
 #include <asm/syscall.h>
@@ -843,8 +842,8 @@ static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
 	regs->ARM_ip = dir;
 
 	if (dir == PTRACE_SYSCALL_EXIT)
-		tracehook_report_syscall_exit(regs, 0);
-	else if (tracehook_report_syscall_entry(regs))
+		ptrace_report_syscall_exit(regs, 0);
+	else if (ptrace_report_syscall_entry(regs))
 		current_thread_info()->abi_syscall = -1;
 
 	regs->ARM_ip = ip;
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index b7845575f86f..230a47b9189e 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -27,7 +27,6 @@
 #include <linux/perf_event.h>
 #include <linux/hw_breakpoint.h>
 #include <linux/regset.h>
-#include <linux/tracehook.h>
 #include <linux/elf.h>
 
 #include <asm/compat.h>
@@ -1818,11 +1817,11 @@ static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
 	regs->regs[regno] = dir;
 
 	if (dir == PTRACE_SYSCALL_ENTER) {
-		if (tracehook_report_syscall_entry(regs))
+		if (ptrace_report_syscall_entry(regs))
 			forget_syscall(regs);
 		regs->regs[regno] = saved_reg;
 	} else if (!test_thread_flag(TIF_SINGLESTEP)) {
-		tracehook_report_syscall_exit(regs, 0);
+		ptrace_report_syscall_exit(regs, 0);
 		regs->regs[regno] = saved_reg;
 	} else {
 		regs->regs[regno] = saved_reg;
@@ -1832,7 +1831,7 @@ static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
 		 * tracer modifications to the registers may have rewound the
 		 * state machine.
 		 */
-		tracehook_report_syscall_exit(regs, 1);
+		ptrace_report_syscall_exit(regs, 1);
 	}
 }
 
diff --git a/arch/csky/kernel/ptrace.c b/arch/csky/kernel/ptrace.c
index 1a5f54e0d272..0f7e7b653c72 100644
--- a/arch/csky/kernel/ptrace.c
+++ b/arch/csky/kernel/ptrace.c
@@ -12,7 +12,6 @@
 #include <linux/sched/task_stack.h>
 #include <linux/signal.h>
 #include <linux/smp.h>
-#include <linux/tracehook.h>
 #include <linux/uaccess.h>
 #include <linux/user.h>
 
@@ -321,7 +320,7 @@ long arch_ptrace(struct task_struct *child, long request,
 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		if (tracehook_report_syscall_entry(regs))
+		if (ptrace_report_syscall_entry(regs))
 			return -1;
 
 	if (secure_computing() == -1)
@@ -339,7 +338,7 @@ asmlinkage void syscall_trace_exit(struct pt_regs *regs)
 	audit_syscall_exit(regs);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, 0);
+		ptrace_report_syscall_exit(regs, 0);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
 		trace_sys_exit(regs, syscall_get_return_value(current, regs));
diff --git a/arch/h8300/kernel/ptrace.c b/arch/h8300/kernel/ptrace.c
index a11db009d0ea..a9898b27b756 100644
--- a/arch/h8300/kernel/ptrace.c
+++ b/arch/h8300/kernel/ptrace.c
@@ -12,7 +12,6 @@
 #include <linux/errno.h>
 #include <linux/ptrace.h>
 #include <linux/audit.h>
-#include <linux/tracehook.h>
 #include <linux/regset.h>
 #include <linux/elf.h>
 
@@ -174,7 +173,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 	long ret = 0;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs))
+	    ptrace_report_syscall_entry(regs))
 		/*
 		 * Tracing decided this syscall should not happen.
 		 * We'll return a bogus call number to get an ENOSYS
@@ -196,5 +195,5 @@ asmlinkage void do_syscall_trace_leave(struct pt_regs *regs)
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
diff --git a/arch/hexagon/kernel/traps.c b/arch/hexagon/kernel/traps.c
index 1240f038cce0..6447763ce5a9 100644
--- a/arch/hexagon/kernel/traps.c
+++ b/arch/hexagon/kernel/traps.c
@@ -14,7 +14,7 @@
 #include <linux/kdebug.h>
 #include <linux/syscalls.h>
 #include <linux/signal.h>
-#include <linux/tracehook.h>
+#include <linux/ptrace.h>
 #include <asm/traps.h>
 #include <asm/vm_fault.h>
 #include <asm/syscall.h>
@@ -348,7 +348,7 @@ void do_trap0(struct pt_regs *regs)
 
 		/* allow strace to catch syscall args  */
 		if (unlikely(test_thread_flag(TIF_SYSCALL_TRACE) &&
-			tracehook_report_syscall_entry(regs)))
+			ptrace_report_syscall_entry(regs)))
 			return;  /*  return -ENOSYS somewhere?  */
 
 		/* Interrupts should be re-enabled for syscall processing */
@@ -386,7 +386,7 @@ void do_trap0(struct pt_regs *regs)
 
 		/* allow strace to get the syscall return state  */
 		if (unlikely(test_thread_flag(TIF_SYSCALL_TRACE)))
-			tracehook_report_syscall_exit(regs, 0);
+			ptrace_report_syscall_exit(regs, 0);
 
 		break;
 	case TRAP_DEBUG:
diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
index 6a1439eaa050..6af64aae087d 100644
--- a/arch/ia64/kernel/ptrace.c
+++ b/arch/ia64/kernel/ptrace.c
@@ -1217,7 +1217,7 @@ syscall_trace_enter (long arg0, long arg1, long arg2, long arg3,
 		     struct pt_regs regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		if (tracehook_report_syscall_entry(&regs))
+		if (ptrace_report_syscall_entry(&regs))
 			return -ENOSYS;
 
 	/* copy user rbs to kernel rbs */
@@ -1243,7 +1243,7 @@ syscall_trace_leave (long arg0, long arg1, long arg2, long arg3,
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(&regs, step);
+		ptrace_report_syscall_exit(&regs, step);
 
 	/* copy user rbs to kernel rbs */
 	if (test_thread_flag(TIF_RESTORE_RSE))
diff --git a/arch/m68k/kernel/ptrace.c b/arch/m68k/kernel/ptrace.c
index aa3a0b8d07e9..a0c99fe3118e 100644
--- a/arch/m68k/kernel/ptrace.c
+++ b/arch/m68k/kernel/ptrace.c
@@ -19,7 +19,7 @@
 #include <linux/ptrace.h>
 #include <linux/user.h>
 #include <linux/signal.h>
-#include <linux/tracehook.h>
+#include <linux/ptrace.h>
 
 #include <linux/uaccess.h>
 #include <asm/page.h>
@@ -282,13 +282,13 @@ asmlinkage int syscall_trace_enter(void)
 	int ret = 0;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		ret = tracehook_report_syscall_entry(task_pt_regs(current));
+		ret = ptrace_report_syscall_entry(task_pt_regs(current));
 	return ret;
 }
 
 asmlinkage void syscall_trace_leave(void)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(task_pt_regs(current), 0);
+		ptrace_report_syscall_exit(task_pt_regs(current), 0);
 }
 #endif /* CONFIG_COLDFIRE */
diff --git a/arch/microblaze/kernel/ptrace.c b/arch/microblaze/kernel/ptrace.c
index badd286882ae..5234d0c1dcaa 100644
--- a/arch/microblaze/kernel/ptrace.c
+++ b/arch/microblaze/kernel/ptrace.c
@@ -33,7 +33,6 @@
 #include <linux/elf.h>
 #include <linux/audit.h>
 #include <linux/seccomp.h>
-#include <linux/tracehook.h>
 
 #include <linux/errno.h>
 #include <asm/processor.h>
@@ -140,7 +139,7 @@ asmlinkage unsigned long do_syscall_trace_enter(struct pt_regs *regs)
 	secure_computing_strict(regs->r12);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs))
+	    ptrace_report_syscall_entry(regs))
 		/*
 		 * Tracing decided this syscall should not happen.
 		 * We'll return a bogus call number to get an ENOSYS
@@ -161,7 +160,7 @@ asmlinkage void do_syscall_trace_leave(struct pt_regs *regs)
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
 
 void ptrace_disable(struct task_struct *child)
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index db7c5be1d4a3..567aec4abac0 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -27,7 +27,6 @@
 #include <linux/smp.h>
 #include <linux/security.h>
 #include <linux/stddef.h>
-#include <linux/tracehook.h>
 #include <linux/audit.h>
 #include <linux/seccomp.h>
 #include <linux/ftrace.h>
@@ -1317,7 +1316,7 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
 	current_thread_info()->syscall = syscall;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
-		if (tracehook_report_syscall_entry(regs))
+		if (ptrace_report_syscall_entry(regs))
 			return -1;
 		syscall = current_thread_info()->syscall;
 	}
@@ -1376,7 +1375,7 @@ asmlinkage void syscall_trace_leave(struct pt_regs *regs)
 		trace_sys_exit(regs, regs_return_value(regs));
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, 0);
+		ptrace_report_syscall_exit(regs, 0);
 
 	user_enter();
 }
diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index 90aa56c94af1..04d55ce18d50 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -39,7 +39,7 @@ syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
  *
  * It's only valid to call this when @task is stopped for system
  * call exit tracing (due to TIF_SYSCALL_TRACE or TIF_SYSCALL_AUDIT),
- * after tracehook_report_syscall_entry() returned nonzero to prevent
+ * after ptrace_report_syscall_entry() returned nonzero to prevent
  * the system call from taking place.
  *
  * This rolls back the register state in @regs so it's as if the
diff --git a/arch/nds32/kernel/ptrace.c b/arch/nds32/kernel/ptrace.c
index d0eda870fbc2..6a6988cf689d 100644
--- a/arch/nds32/kernel/ptrace.c
+++ b/arch/nds32/kernel/ptrace.c
@@ -3,7 +3,6 @@
 
 #include <linux/ptrace.h>
 #include <linux/regset.h>
-#include <linux/tracehook.h>
 #include <linux/elf.h>
 #include <linux/sched/task_stack.h>
 
@@ -103,7 +102,7 @@ void user_disable_single_step(struct task_struct *child)
 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
-		if (tracehook_report_syscall_entry(regs))
+		if (ptrace_report_syscall_entry(regs))
 			forget_syscall(regs);
 	}
 	return regs->syscallno;
@@ -113,6 +112,6 @@ asmlinkage void syscall_trace_leave(struct pt_regs *regs)
 {
 	int step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 
 }
diff --git a/arch/nios2/kernel/ptrace.c b/arch/nios2/kernel/ptrace.c
index a6ea9e1b4f61..cd62f310778b 100644
--- a/arch/nios2/kernel/ptrace.c
+++ b/arch/nios2/kernel/ptrace.c
@@ -15,7 +15,6 @@
 #include <linux/regset.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
-#include <linux/tracehook.h>
 #include <linux/uaccess.h>
 #include <linux/user.h>
 
@@ -134,7 +133,7 @@ asmlinkage int do_syscall_trace_enter(void)
 	int ret = 0;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		ret = tracehook_report_syscall_entry(task_pt_regs(current));
+		ret = ptrace_report_syscall_entry(task_pt_regs(current));
 
 	return ret;
 }
@@ -142,5 +141,5 @@ asmlinkage int do_syscall_trace_enter(void)
 asmlinkage void do_syscall_trace_exit(void)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(task_pt_regs(current), 0);
+		ptrace_report_syscall_exit(task_pt_regs(current), 0);
 }
diff --git a/arch/openrisc/kernel/ptrace.c b/arch/openrisc/kernel/ptrace.c
index 4d60ae2a12fa..b971740fc2aa 100644
--- a/arch/openrisc/kernel/ptrace.c
+++ b/arch/openrisc/kernel/ptrace.c
@@ -22,7 +22,6 @@
 #include <linux/ptrace.h>
 #include <linux/audit.h>
 #include <linux/regset.h>
-#include <linux/tracehook.h>
 #include <linux/elf.h>
 
 #include <asm/thread_info.h>
@@ -159,7 +158,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 	long ret = 0;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs))
+	    ptrace_report_syscall_entry(regs))
 		/*
 		 * Tracing decided this syscall should not happen.
 		 * We'll return a bogus call number to get an ENOSYS
@@ -181,5 +180,5 @@ asmlinkage void do_syscall_trace_leave(struct pt_regs *regs)
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
diff --git a/arch/parisc/kernel/ptrace.c b/arch/parisc/kernel/ptrace.c
index 65de6c4c9354..96ef6a6b66e5 100644
--- a/arch/parisc/kernel/ptrace.c
+++ b/arch/parisc/kernel/ptrace.c
@@ -15,7 +15,6 @@
 #include <linux/elf.h>
 #include <linux/errno.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/user.h>
 #include <linux/personality.h>
 #include <linux/regset.h>
@@ -316,7 +315,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
 long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
-		int rc = tracehook_report_syscall_entry(regs);
+		int rc = ptrace_report_syscall_entry(regs);
 
 		/*
 		 * As tracesys_next does not set %r28 to -ENOSYS
@@ -327,7 +326,7 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 		if (rc) {
 			/*
 			 * A nonzero return code from
-			 * tracehook_report_syscall_entry() tells us
+			 * ptrace_report_syscall_entry() tells us
 			 * to prevent the syscall execution.  Skip
 			 * the syscall call and the syscall restart handling.
 			 *
@@ -381,7 +380,7 @@ void do_syscall_trace_exit(struct pt_regs *regs)
 #endif
 
 	if (stepping || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, stepping);
+		ptrace_report_syscall_exit(regs, stepping);
 }
 
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index c43f77e2ac31..f394b0d6473f 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -16,7 +16,7 @@
  */
 
 #include <linux/regset.h>
-#include <linux/tracehook.h>
+#include <linux/ptrace.h>
 #include <linux/audit.h>
 #include <linux/context_tracking.h>
 #include <linux/syscalls.h>
@@ -263,12 +263,12 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 	flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
 
 	if (flags) {
-		int rc = tracehook_report_syscall_entry(regs);
+		int rc = ptrace_report_syscall_entry(regs);
 
 		if (unlikely(flags & _TIF_SYSCALL_EMU)) {
 			/*
 			 * A nonzero return code from
-			 * tracehook_report_syscall_entry() tells us to prevent
+			 * ptrace_report_syscall_entry() tells us to prevent
 			 * the syscall execution, but we are not going to
 			 * execute it anyway.
 			 *
@@ -334,7 +334,7 @@ void do_syscall_trace_leave(struct pt_regs *regs)
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
 
 void __init pt_regs_check(void);
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index a89243730153..793c7da0554b 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -17,7 +17,6 @@
 #include <linux/regset.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
-#include <linux/tracehook.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
@@ -241,7 +240,7 @@ long arch_ptrace(struct task_struct *child, long request,
 __visible int do_syscall_trace_enter(struct pt_regs *regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		if (tracehook_report_syscall_entry(regs))
+		if (ptrace_report_syscall_entry(regs))
 			return -1;
 
 	/*
@@ -266,7 +265,7 @@ __visible void do_syscall_trace_exit(struct pt_regs *regs)
 	audit_syscall_exit(regs);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, 0);
+		ptrace_report_syscall_exit(regs, 0);
 
 #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
diff --git a/arch/sh/kernel/ptrace_32.c b/arch/sh/kernel/ptrace_32.c
index 5281685f6ad1..d417988d9770 100644
--- a/arch/sh/kernel/ptrace_32.c
+++ b/arch/sh/kernel/ptrace_32.c
@@ -20,7 +20,6 @@
 #include <linux/io.h>
 #include <linux/audit.h>
 #include <linux/seccomp.h>
-#include <linux/tracehook.h>
 #include <linux/elf.h>
 #include <linux/regset.h>
 #include <linux/hw_breakpoint.h>
@@ -456,7 +455,7 @@ long arch_ptrace(struct task_struct *child, long request,
 asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs)) {
+	    ptrace_report_syscall_entry(regs)) {
 		regs->regs[0] = -ENOSYS;
 		return -1;
 	}
@@ -484,5 +483,5 @@ asmlinkage void do_syscall_trace_leave(struct pt_regs *regs)
 
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
diff --git a/arch/sparc/kernel/ptrace_32.c b/arch/sparc/kernel/ptrace_32.c
index 5318174a0268..e7db48acb838 100644
--- a/arch/sparc/kernel/ptrace_32.c
+++ b/arch/sparc/kernel/ptrace_32.c
@@ -21,7 +21,6 @@
 #include <linux/signal.h>
 #include <linux/regset.h>
 #include <linux/elf.h>
-#include <linux/tracehook.h>
 
 #include <linux/uaccess.h>
 #include <asm/cacheflush.h>
@@ -439,9 +438,9 @@ asmlinkage int syscall_trace(struct pt_regs *regs, int syscall_exit_p)
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
 		if (syscall_exit_p)
-			tracehook_report_syscall_exit(regs, 0);
+			ptrace_report_syscall_exit(regs, 0);
 		else
-			ret = tracehook_report_syscall_entry(regs);
+			ret = ptrace_report_syscall_entry(regs);
 	}
 
 	return ret;
diff --git a/arch/sparc/kernel/ptrace_64.c b/arch/sparc/kernel/ptrace_64.c
index 2b92155db8a5..86a7eb5c27ba 100644
--- a/arch/sparc/kernel/ptrace_64.c
+++ b/arch/sparc/kernel/ptrace_64.c
@@ -25,7 +25,6 @@
 #include <linux/audit.h>
 #include <linux/signal.h>
 #include <linux/regset.h>
-#include <linux/tracehook.h>
 #include <trace/syscall.h>
 #include <linux/compat.h>
 #include <linux/elf.h>
@@ -1095,7 +1094,7 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 		user_exit();
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		ret = tracehook_report_syscall_entry(regs);
+		ret = ptrace_report_syscall_entry(regs);
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
 		trace_sys_enter(regs, regs->u_regs[UREG_G1]);
@@ -1118,7 +1117,7 @@ asmlinkage void syscall_trace_leave(struct pt_regs *regs)
 		trace_sys_exit(regs, regs->u_regs[UREG_I0]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, 0);
+		ptrace_report_syscall_exit(regs, 0);
 
 	if (test_thread_flag(TIF_NOHZ))
 		user_enter();
diff --git a/arch/um/kernel/ptrace.c b/arch/um/kernel/ptrace.c
index b425f47bddbb..bfaf6ab1ac03 100644
--- a/arch/um/kernel/ptrace.c
+++ b/arch/um/kernel/ptrace.c
@@ -6,7 +6,6 @@
 #include <linux/audit.h>
 #include <linux/ptrace.h>
 #include <linux/sched.h>
-#include <linux/tracehook.h>
 #include <linux/uaccess.h>
 #include <asm/ptrace-abi.h>
 
@@ -135,7 +134,7 @@ int syscall_trace_enter(struct pt_regs *regs)
 	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		return 0;
 
-	return tracehook_report_syscall_entry(regs);
+	return ptrace_report_syscall_entry(regs);
 }
 
 void syscall_trace_leave(struct pt_regs *regs)
@@ -151,7 +150,7 @@ void syscall_trace_leave(struct pt_regs *regs)
 	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		return;
 
-	tracehook_report_syscall_exit(regs, 0);
+	ptrace_report_syscall_exit(regs, 0);
 	/* force do_signal() --> is_syscall() */
 	if (ptraced & PT_PTRACED)
 		set_thread_flag(TIF_SIGPENDING);
diff --git a/arch/xtensa/kernel/ptrace.c b/arch/xtensa/kernel/ptrace.c
index bb3f4797d212..323c678a691f 100644
--- a/arch/xtensa/kernel/ptrace.c
+++ b/arch/xtensa/kernel/ptrace.c
@@ -26,7 +26,6 @@
 #include <linux/security.h>
 #include <linux/signal.h>
 #include <linux/smp.h>
-#include <linux/tracehook.h>
 #include <linux/uaccess.h>
 
 #define CREATE_TRACE_POINTS
@@ -550,7 +549,7 @@ int do_syscall_trace_enter(struct pt_regs *regs)
 		regs->areg[2] = -ENOSYS;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs)) {
+	    ptrace_report_syscall_entry(regs)) {
 		regs->areg[2] = -ENOSYS;
 		regs->syscall = NO_SYSCALL;
 		return 0;
@@ -583,5 +582,5 @@ void do_syscall_trace_leave(struct pt_regs *regs)
 	step = test_thread_flag(TIF_SINGLESTEP);
 
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
diff --git a/include/asm-generic/syscall.h b/include/asm-generic/syscall.h
index 81695eb02a12..5a80fe728dc8 100644
--- a/include/asm-generic/syscall.h
+++ b/include/asm-generic/syscall.h
@@ -44,7 +44,7 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs *regs);
  *
  * It's only valid to call this when @task is stopped for system
  * call exit tracing (due to %SYSCALL_WORK_SYSCALL_TRACE or
- * %SYSCALL_WORK_SYSCALL_AUDIT), after tracehook_report_syscall_entry()
+ * %SYSCALL_WORK_SYSCALL_AUDIT), after ptrace_report_syscall_entry()
  * returned nonzero to prevent the system call from taking place.
  *
  * This rolls back the register state in @regs so it's as if the
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 2e2b8d6140ed..a670e9fba7a9 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -3,7 +3,7 @@
 #define __LINUX_ENTRYCOMMON_H
 
 #include <linux/static_call_types.h>
-#include <linux/tracehook.h>
+#include <linux/ptrace.h>
 #include <linux/syscalls.h>
 #include <linux/seccomp.h>
 #include <linux/sched.h>
@@ -95,7 +95,7 @@ static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs
 #ifndef arch_syscall_enter_tracehook
 static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
 {
-	return tracehook_report_syscall_entry(regs);
+	return ptrace_report_syscall_entry(regs);
 }
 #endif
 
@@ -294,7 +294,7 @@ static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step);
 #ifndef arch_syscall_exit_tracehook
 static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step)
 {
-	tracehook_report_syscall_exit(regs, step);
+	ptrace_report_syscall_exit(regs, step);
 }
 #endif
 
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 91b1074edb4c..5310f43e4762 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -440,4 +440,55 @@ static inline int ptrace_report_syscall(unsigned long message)
 	current->ptrace_message = 0;
 	return fatal_signal_pending(current);
 }
+
+/**
+ * ptrace_report_syscall_entry - task is about to attempt a system call
+ * @regs:		user register state of current task
+ *
+ * This will be called if %SYSCALL_WORK_SYSCALL_TRACE or
+ * %SYSCALL_WORK_SYSCALL_EMU have been set, when the current task has just
+ * entered the kernel for a system call.  Full user register state is
+ * available here.  Changing the values in @regs can affect the system
+ * call number and arguments to be tried.  It is safe to block here,
+ * preventing the system call from beginning.
+ *
+ * Returns zero normally, or nonzero if the calling arch code should abort
+ * the system call.  That must prevent normal entry so no system call is
+ * made.  If @task ever returns to user mode after this, its register state
+ * is unspecified, but should be something harmless like an %ENOSYS error
+ * return.  It should preserve enough information so that syscall_rollback()
+ * can work (see asm-generic/syscall.h).
+ *
+ * Called without locks, just after entering kernel mode.
+ */
+static inline __must_check int ptrace_report_syscall_entry(
+	struct pt_regs *regs)
+{
+	return ptrace_report_syscall(PTRACE_EVENTMSG_SYSCALL_ENTRY);
+}
+
+/**
+ * ptrace_report_syscall_exit - task has just finished a system call
+ * @regs:		user register state of current task
+ * @step:		nonzero if simulating single-step or block-step
+ *
+ * This will be called if %SYSCALL_WORK_SYSCALL_TRACE has been set, when
+ * the current task has just finished an attempted system call.  Full
+ * user register state is available here.  It is safe to block here,
+ * preventing signals from being processed.
+ *
+ * If @step is nonzero, this report is also in lieu of the normal
+ * trap that would follow the system call instruction because
+ * user_enable_block_step() or user_enable_single_step() was used.
+ * In this case, %SYSCALL_WORK_SYSCALL_TRACE might not be set.
+ *
+ * Called without locks, just before checking for pending signals.
+ */
+static inline void ptrace_report_syscall_exit(struct pt_regs *regs, int step)
+{
+	if (step)
+		user_single_step_report(regs);
+	else
+		ptrace_report_syscall(PTRACE_EVENTMSG_SYSCALL_EXIT);
+}
 #endif
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 998bc3863559..819e82ac09bd 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -52,57 +52,6 @@
 struct linux_binprm;
 
 
-/**
- * tracehook_report_syscall_entry - task is about to attempt a system call
- * @regs:		user register state of current task
- *
- * This will be called if %SYSCALL_WORK_SYSCALL_TRACE or
- * %SYSCALL_WORK_SYSCALL_EMU have been set, when the current task has just
- * entered the kernel for a system call.  Full user register state is
- * available here.  Changing the values in @regs can affect the system
- * call number and arguments to be tried.  It is safe to block here,
- * preventing the system call from beginning.
- *
- * Returns zero normally, or nonzero if the calling arch code should abort
- * the system call.  That must prevent normal entry so no system call is
- * made.  If @task ever returns to user mode after this, its register state
- * is unspecified, but should be something harmless like an %ENOSYS error
- * return.  It should preserve enough information so that syscall_rollback()
- * can work (see asm-generic/syscall.h).
- *
- * Called without locks, just after entering kernel mode.
- */
-static inline __must_check int tracehook_report_syscall_entry(
-	struct pt_regs *regs)
-{
-	return ptrace_report_syscall(PTRACE_EVENTMSG_SYSCALL_ENTRY);
-}
-
-/**
- * tracehook_report_syscall_exit - task has just finished a system call
- * @regs:		user register state of current task
- * @step:		nonzero if simulating single-step or block-step
- *
- * This will be called if %SYSCALL_WORK_SYSCALL_TRACE has been set, when
- * the current task has just finished an attempted system call.  Full
- * user register state is available here.  It is safe to block here,
- * preventing signals from being processed.
- *
- * If @step is nonzero, this report is also in lieu of the normal
- * trap that would follow the system call instruction because
- * user_enable_block_step() or user_enable_single_step() was used.
- * In this case, %SYSCALL_WORK_SYSCALL_TRACE might not be set.
- *
- * Called without locks, just before checking for pending signals.
- */
-static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
-{
-	if (step)
-		user_single_step_report(regs);
-	else
-		ptrace_report_syscall(PTRACE_EVENTMSG_SYSCALL_EXIT);
-}
-
 /**
  * tracehook_signal_handler - signal handler setup is complete
  * @stepping:		nonzero if debugger single-step or block-step in use
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 3747bf816f9a..b7af92e07d1f 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -114,7 +114,7 @@ struct ptrace_rseq_configuration {
 
 /*
  * These values are stored in task->ptrace_message
- * by tracehook_report_syscall_* to describe the current syscall-stop.
+ * by ptrace_report_syscall_* to describe the current syscall-stop.
  */
 #define PTRACE_EVENTMSG_SYSCALL_ENTRY	1
 #define PTRACE_EVENTMSG_SYSCALL_EXIT	2
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index bad713684c2e..f52e57c4d6d8 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -2,6 +2,7 @@
 
 #include <linux/context_tracking.h>
 #include <linux/entry-common.h>
+#include <linux/tracehook.h>
 #include <linux/highmem.h>
 #include <linux/livepatch.h>
 #include <linux/audit.h>
-- 
2.29.2
Re: [PATCH 03/13] ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:44AM -0600, Eric W. Biederman wrote:
> Rename tracehook_report_syscall_{entry,exit} to
> ptrace_report_syscall_{entry,exit} and place them in ptrace.h
> 
> There is no longer any generic tracehook infractructure so make
> these ptrace specific functions ptrace specific.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 04/13] ptrace: Remove arch_syscall_{enter,exit}_tracehook
Posted by Eric W. Biederman 4 years, 3 months ago
These functions are alwasy one-to-one wrappers around
ptrace_report_syscall_entry and ptrace_report_syscall_exit.
So directly call the functions they are wrapping instead.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/entry-common.h | 43 ++----------------------------------
 kernel/entry/common.c        |  4 ++--
 2 files changed, 4 insertions(+), 43 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index a670e9fba7a9..9efbdda61f7a 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -79,26 +79,6 @@ static __always_inline void arch_check_user_regs(struct pt_regs *regs);
 static __always_inline void arch_check_user_regs(struct pt_regs *regs) {}
 #endif
 
-/**
- * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry()
- * @regs:	Pointer to currents pt_regs
- *
- * Returns: 0 on success or an error code to skip the syscall.
- *
- * Defaults to tracehook_report_syscall_entry(). Can be replaced by
- * architecture specific code.
- *
- * Invoked from syscall_enter_from_user_mode()
- */
-static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs);
-
-#ifndef arch_syscall_enter_tracehook
-static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
-{
-	return ptrace_report_syscall_entry(regs);
-}
-#endif
-
 /**
  * enter_from_user_mode - Establish state when coming from user mode
  *
@@ -157,7 +137,7 @@ void syscall_enter_from_user_mode_prepare(struct pt_regs *regs);
  * It handles the following work items:
  *
  *  1) syscall_work flag dependent invocations of
- *     arch_syscall_enter_tracehook(), __secure_computing(), trace_sys_enter()
+ *     ptrace_report_syscall_entry(), __secure_computing(), trace_sys_enter()
  *  2) Invocation of audit_syscall_entry()
  */
 long syscall_enter_from_user_mode_work(struct pt_regs *regs, long syscall);
@@ -279,25 +259,6 @@ static __always_inline void arch_exit_to_user_mode(void) { }
  */
 void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
 
-/**
- * arch_syscall_exit_tracehook - Wrapper around tracehook_report_syscall_exit()
- * @regs:	Pointer to currents pt_regs
- * @step:	Indicator for single step
- *
- * Defaults to tracehook_report_syscall_exit(). Can be replaced by
- * architecture specific code.
- *
- * Invoked from syscall_exit_to_user_mode()
- */
-static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step);
-
-#ifndef arch_syscall_exit_tracehook
-static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step)
-{
-	ptrace_report_syscall_exit(regs, step);
-}
-#endif
-
 /**
  * exit_to_user_mode - Fixup state when exiting to user mode
  *
@@ -347,7 +308,7 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs);
  *	- rseq syscall exit
  *      - audit
  *	- syscall tracing
- *	- tracehook (single stepping)
+ *	- ptrace (single stepping)
  *
  *  2) Preparatory work
  *	- Exit to user mode loop (common TIF handling). Invokes
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index f52e57c4d6d8..f0b1daa1e8da 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -59,7 +59,7 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
 
 	/* Handle ptrace */
 	if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
-		ret = arch_syscall_enter_tracehook(regs);
+		ret = ptrace_report_syscall_entry(regs);
 		if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
 			return -1L;
 	}
@@ -253,7 +253,7 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
 
 	step = report_single_step(work);
 	if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
-		arch_syscall_exit_tracehook(regs, step);
+		ptrace_report_syscall_exit(regs, step);
 }
 
 /*
-- 
2.29.2
Re: [PATCH 04/13] ptrace: Remove arch_syscall_{enter,exit}_tracehook
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:45AM -0600, Eric W. Biederman wrote:
> These functions are alwasy one-to-one wrappers around
> ptrace_report_syscall_entry and ptrace_report_syscall_exit.
> So directly call the functions they are wrapping instead.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/entry-common.h | 43 ++----------------------------------
>  kernel/entry/common.c        |  4 ++--
>  2 files changed, 4 insertions(+), 43 deletions(-)

nit: This should maybe talk about how this is just removing needless
indirection in the common entry code?

Regardless:

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 05/13] ptrace: Remove tracehook_signal_handler
Posted by Eric W. Biederman 4 years, 3 months ago
The two line function tracehook_signal_handler is only called from
signal_delivered.  Expand it inline in signal_delivered and remove it.
Just to make it easier to understand what is going on.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/Kconfig              |  1 -
 include/linux/tracehook.h | 17 -----------------
 kernel/signal.c           |  3 ++-
 3 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a517a949eb1d..6382520ef0a5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -219,7 +219,6 @@ config TRACE_IRQFLAGS_SUPPORT
 #	CORE_DUMP_USE_REGSET	#define'd in linux/elf.h
 #	TIF_SYSCALL_TRACE	calls ptrace_report_syscall_{entry,exit}
 #	TIF_NOTIFY_RESUME	calls tracehook_notify_resume()
-#	signal delivery		calls tracehook_signal_handler()
 #
 config HAVE_ARCH_TRACEHOOK
 	bool
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 819e82ac09bd..b77bf4917196 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -52,23 +52,6 @@
 struct linux_binprm;
 
 
-/**
- * tracehook_signal_handler - signal handler setup is complete
- * @stepping:		nonzero if debugger single-step or block-step in use
- *
- * Called by the arch code after a signal handler has been set up.
- * Register and stack state reflects the user handler about to run.
- * Signal mask changes have already been made.
- *
- * Called without locks, shortly before returning to user mode
- * (or handling more signals).
- */
-static inline void tracehook_signal_handler(int stepping)
-{
-	if (stepping)
-		ptrace_notify(SIGTRAP);
-}
-
 /**
  * set_notify_resume - cause tracehook_notify_resume() to be called
  * @task:		task that will call tracehook_notify_resume()
diff --git a/kernel/signal.c b/kernel/signal.c
index 38602738866e..0e0bd1c1068b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2898,7 +2898,8 @@ static void signal_delivered(struct ksignal *ksig, int stepping)
 	set_current_blocked(&blocked);
 	if (current->sas_ss_flags & SS_AUTODISARM)
 		sas_ss_reset(current);
-	tracehook_signal_handler(stepping);
+	if (stepping)
+		ptrace_notify(SIGTRAP);
 }
 
 void signal_setup_done(int failed, struct ksignal *ksig, int stepping)
-- 
2.29.2
Re: [PATCH 05/13] ptrace: Remove tracehook_signal_handler
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:46AM -0600, Eric W. Biederman wrote:
> The two line function tracehook_signal_handler is only called from
> signal_delivered.  Expand it inline in signal_delivered and remove it.
> Just to make it easier to understand what is going on.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 06/13] task_work: Remove unnecessary include from posix_timers.h
Posted by Eric W. Biederman 4 years, 3 months ago
Break a header file circular dependency by removing the unnecessary
include of task_work.h from posix_timers.h.

sched.h -> posix-timers.h
posix-timers.h -> task_work.h
task_work.h -> sched.h

Add missing includes of task_work.h to:
arch/x86/mm/tlb.c
kernel/time/posix-cpu-timers.c

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/x86/mm/tlb.c              | 1 +
 include/linux/posix-timers.h   | 1 -
 kernel/time/posix-cpu-timers.c | 1 +
 3 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a6cf56a14939..6eb4d91d5365 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -9,6 +9,7 @@
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
 #include <linux/sched/smt.h>
+#include <linux/task_work.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 5bbcd280bfd2..83539bb2f023 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -6,7 +6,6 @@
 #include <linux/list.h>
 #include <linux/alarmtimer.h>
 #include <linux/timerqueue.h>
-#include <linux/task_work.h>
 
 struct kernel_siginfo;
 struct task_struct;
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 96b4e7810426..9190d9eb236d 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -15,6 +15,7 @@
 #include <linux/workqueue.h>
 #include <linux/compat.h>
 #include <linux/sched/deadline.h>
+#include <linux/task_work.h>
 
 #include "posix-timers.h"
 
-- 
2.29.2
Re: [PATCH 06/13] task_work: Remove unnecessary include from posix_timers.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:47AM -0600, Eric W. Biederman wrote:
> Break a header file circular dependency by removing the unnecessary
> include of task_work.h from posix_timers.h.
> 
> sched.h -> posix-timers.h
> posix-timers.h -> task_work.h
> task_work.h -> sched.h
> 
> Add missing includes of task_work.h to:
> arch/x86/mm/tlb.c
> kernel/time/posix-cpu-timers.c
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 07/13] task_work: Introduce task_work_pending
Posted by Eric W. Biederman 4 years, 3 months ago
Wrap the test of task->task_works in a helper function to make
it clear what is being tested.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/io_uring.c             | 6 +++---
 include/linux/task_work.h | 5 +++++
 include/linux/tracehook.h | 4 ++--
 kernel/signal.c           | 4 ++--
 kernel/task_work.c        | 2 +-
 5 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e54c4127422e..e85261079a78 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2590,7 +2590,7 @@ static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
 
 static inline bool io_run_task_work(void)
 {
-	if (test_thread_flag(TIF_NOTIFY_SIGNAL) || current->task_works) {
+	if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
 		__set_current_state(TASK_RUNNING);
 		tracehook_notify_signal();
 		return true;
@@ -7602,7 +7602,7 @@ static int io_sq_thread(void *data)
 		}
 
 		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
-		if (!io_sqd_events_pending(sqd) && !current->task_works) {
+		if (!io_sqd_events_pending(sqd) && !task_work_pending(current)) {
 			bool needs_sched = true;
 
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
@@ -10321,7 +10321,7 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
 
 		hlist_for_each_entry(req, list, hash_node)
 			seq_printf(m, "  op=%d, task_works=%d\n", req->opcode,
-					req->task->task_works != NULL);
+					task_work_pending(req->task));
 	}
 
 	seq_puts(m, "CqOverflowList:\n");
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 5b8a93f288bb..897494b597ba 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -19,6 +19,11 @@ enum task_work_notify_mode {
 	TWA_SIGNAL,
 };
 
+static inline bool task_work_pending(struct task_struct *task)
+{
+	return READ_ONCE(task->task_works);
+}
+
 int task_work_add(struct task_struct *task, struct callback_head *twork,
 			enum task_work_notify_mode mode);
 
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index b77bf4917196..fa834a22e86e 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -90,7 +90,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
 	 * hlist_add_head(task->task_works);
 	 */
 	smp_mb__after_atomic();
-	if (unlikely(current->task_works))
+	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 #ifdef CONFIG_KEYS_REQUEST_CACHE
@@ -115,7 +115,7 @@ static inline void tracehook_notify_signal(void)
 {
 	clear_thread_flag(TIF_NOTIFY_SIGNAL);
 	smp_mb__after_atomic();
-	if (current->task_works)
+	if (task_work_pending(current))
 		task_work_run();
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 0e0bd1c1068b..3b4cf25fb9b3 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2344,7 +2344,7 @@ static void ptrace_do_notify(int signr, int exit_code, int why)
 void ptrace_notify(int exit_code)
 {
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
-	if (unlikely(current->task_works))
+	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	spin_lock_irq(&current->sighand->siglock);
@@ -2626,7 +2626,7 @@ bool get_signal(struct ksignal *ksig)
 	struct signal_struct *signal = current->signal;
 	int signr;
 
-	if (unlikely(current->task_works))
+	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	/*
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 1698fbe6f0e1..cc6fccb0e24d 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -78,7 +78,7 @@ task_work_cancel_match(struct task_struct *task,
 	struct callback_head *work;
 	unsigned long flags;
 
-	if (likely(!task->task_works))
+	if (likely(!task_work_pending(task)))
 		return NULL;
 	/*
 	 * If cmpxchg() fails we continue without updating pprev.
-- 
2.29.2
Re: [PATCH 07/13] task_work: Introduce task_work_pending
Posted by Jens Axboe 4 years, 3 months ago
On 3/9/22 9:24 AM, Eric W. Biederman wrote:
> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
> index 5b8a93f288bb..897494b597ba 100644
> --- a/include/linux/task_work.h
> +++ b/include/linux/task_work.h
> @@ -19,6 +19,11 @@ enum task_work_notify_mode {
>  	TWA_SIGNAL,
>  };
>  
> +static inline bool task_work_pending(struct task_struct *task)
> +{
> +	return READ_ONCE(task->task_works);
> +}
> +

Most of the checks for this is current, do we need READ_ONCE here?

-- 
Jens Axboe
Re: [PATCH 07/13] task_work: Introduce task_work_pending
Posted by Eric W. Biederman 4 years, 3 months ago
Jens Axboe <axboe@kernel.dk> writes:

> On 3/9/22 9:24 AM, Eric W. Biederman wrote:
>> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
>> index 5b8a93f288bb..897494b597ba 100644
>> --- a/include/linux/task_work.h
>> +++ b/include/linux/task_work.h
>> @@ -19,6 +19,11 @@ enum task_work_notify_mode {
>>  	TWA_SIGNAL,
>>  };
>>  
>> +static inline bool task_work_pending(struct task_struct *task)
>> +{
>> +	return READ_ONCE(task->task_works);
>> +}
>> +
>
> Most of the checks for this is current, do we need READ_ONCE here?

There is a non-current use in fs/io_uring in __io_uring_show_fdinfo
and another in task_work_cancel_match.

Beyond that there are quite a few writes that are not at all from
current so even on current task->task_works can change if you look
twice.

So if only to keep it from making unwarranted assumptions I think
READ_ONCE makes sense.

Given that READ_ONCE is practically free I don't see where there is
any harm in using it to document the kind of code we expect the compiler
to generate.

Looking a second time I see all of the other reads of task->task_works
are already READ_ONCE in kernel/task_work.c, so really I think if we
don't want READ_ONCE we need a big fat comment about why it is safe
in a check like task_work_pending and while it is needed everywhere
else.  At the moment I am not smart enough to write that comment.

I will see about adding this bit of discussion in the commit comment to
make it a little clearer why I am introducing READ_ONCE.

Eric
Re: [PATCH 07/13] task_work: Introduce task_work_pending
Posted by Jens Axboe 4 years, 3 months ago
On 3/9/22 4:24 PM, Eric W. Biederman wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On 3/9/22 9:24 AM, Eric W. Biederman wrote:
>>> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
>>> index 5b8a93f288bb..897494b597ba 100644
>>> --- a/include/linux/task_work.h
>>> +++ b/include/linux/task_work.h
>>> @@ -19,6 +19,11 @@ enum task_work_notify_mode {
>>>  	TWA_SIGNAL,
>>>  };
>>>  
>>> +static inline bool task_work_pending(struct task_struct *task)
>>> +{
>>> +	return READ_ONCE(task->task_works);
>>> +}
>>> +
>>
>> Most of the checks for this is current, do we need READ_ONCE here?
> 
> There is a non-current use in fs/io_uring in __io_uring_show_fdinfo
> and another in task_work_cancel_match.
> 
> Beyond that there are quite a few writes that are not at all from
> current so even on current task->task_works can change if you look
> twice.
> 
> So if only to keep it from making unwarranted assumptions I think
> READ_ONCE makes sense.
> 
> Given that READ_ONCE is practically free I don't see where there is
> any harm in using it to document the kind of code we expect the compiler
> to generate.
> 
> Looking a second time I see all of the other reads of task->task_works
> are already READ_ONCE in kernel/task_work.c, so really I think if we
> don't want READ_ONCE we need a big fat comment about why it is safe
> in a check like task_work_pending and while it is needed everywhere
> else.  At the moment I am not smart enough to write that comment.
> 
> I will see about adding this bit of discussion in the commit comment to
> make it a little clearer why I am introducing READ_ONCE.

Fair enough, and doesn't warrant a current_tw_pending() helper in that
case either.

-- 
Jens Axboe
Re: [PATCH 07/13] task_work: Introduce task_work_pending
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:48AM -0600, Eric W. Biederman wrote:
> Wrap the test of task->task_works in a helper function to make
> it clear what is being tested.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 08/13] task_work: Call tracehook_notify_signal from get_signal on all architectures
Posted by Eric W. Biederman 4 years, 3 months ago
Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3b2
("task_work: unconditionally run task_work from get_signal()") always
calling task_wofffffffrk_run all of the work of tracehook_notify_signal is
already happening except clearing TIF_NOTIFY_SIGNAL.

Factor clear_notify_signal out of tracehook_notify_signal and use it in
get_signal so that get_signal only needs one call of trask_work_run.

To keep the semantics in sync update xfer_to_guest_mode_work (which
does not call get_signal) to call tracehook_notify_signal if either
_TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/s390/kernel/signal.c    |  4 ++--
 arch/x86/kernel/signal.c     |  4 ++--
 include/linux/entry-common.h |  2 +-
 include/linux/tracehook.h    |  9 +++++++--
 kernel/entry/common.c        | 12 ++----------
 kernel/entry/kvm.c           |  2 +-
 kernel/signal.c              | 14 +++-----------
 7 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
index 307f5d99514d..ea9e5e8182cd 100644
--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -453,7 +453,7 @@ static void handle_signal(struct ksignal *ksig, sigset_t *oldset,
  * stack-frames in one go after that.
  */
 
-void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
+void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 	sigset_t *oldset = sigmask_to_save();
@@ -466,7 +466,7 @@ void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
 	current->thread.system_call =
 		test_pt_regs_flag(regs, PIF_SYSCALL) ? regs->int_code : 0;
 
-	if (has_signal && get_signal(&ksig)) {
+	if (get_signal(&ksig)) {
 		/* Whee!  Actually deliver the signal.  */
 		if (current->thread.system_call) {
 			regs->int_code = current->thread.system_call;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ec71e06ae364..de3d5b5724d8 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -861,11 +861,11 @@ static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
  */
-void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
+void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 
-	if (has_signal && get_signal(&ksig)) {
+	if (get_signal(&ksig)) {
 		/* Whee! Actually deliver the signal.  */
 		handle_signal(&ksig, regs);
 		return;
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 9efbdda61f7a..3537fd25f14e 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -257,7 +257,7 @@ static __always_inline void arch_exit_to_user_mode(void) { }
  *
  * Invoked from exit_to_user_mode_loop().
  */
-void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
+void arch_do_signal_or_restart(struct pt_regs *regs);
 
 /**
  * exit_to_user_mode - Fixup state when exiting to user mode
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index fa834a22e86e..b44a7820c468 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -106,6 +106,12 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
 	rseq_handle_notify_resume(NULL, regs);
 }
 
+static inline void clear_notify_signal(void)
+{
+	clear_thread_flag(TIF_NOTIFY_SIGNAL);
+	smp_mb__after_atomic();
+}
+
 /*
  * called by exit_to_user_mode_loop() if ti_work & _TIF_NOTIFY_SIGNAL. This
  * is currently used by TWA_SIGNAL based task_work, which requires breaking
@@ -113,8 +119,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
  */
 static inline void tracehook_notify_signal(void)
 {
-	clear_thread_flag(TIF_NOTIFY_SIGNAL);
-	smp_mb__after_atomic();
+	clear_notify_signal();
 	if (task_work_pending(current))
 		task_work_run();
 }
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index f0b1daa1e8da..79eaf9b4b10d 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -139,15 +139,7 @@ void noinstr exit_to_user_mode(void)
 }
 
 /* Workaround to allow gradual conversion of architecture code */
-void __weak arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal) { }
-
-static void handle_signal_work(struct pt_regs *regs, unsigned long ti_work)
-{
-	if (ti_work & _TIF_NOTIFY_SIGNAL)
-		tracehook_notify_signal();
-
-	arch_do_signal_or_restart(regs, ti_work & _TIF_SIGPENDING);
-}
+void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
 
 static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 					    unsigned long ti_work)
@@ -170,7 +162,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 			klp_update_patch_state(current);
 
 		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
-			handle_signal_work(regs, ti_work);
+			arch_do_signal_or_restart(regs);
 
 		if (ti_work & _TIF_NOTIFY_RESUME)
 			tracehook_notify_resume(regs);
diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
index 96d476e06c77..cabf36a489e4 100644
--- a/kernel/entry/kvm.c
+++ b/kernel/entry/kvm.c
@@ -8,7 +8,7 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
 	do {
 		int ret;
 
-		if (ti_work & _TIF_NOTIFY_SIGNAL)
+		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
 			tracehook_notify_signal();
 
 		if (ti_work & _TIF_SIGPENDING) {
diff --git a/kernel/signal.c b/kernel/signal.c
index 3b4cf25fb9b3..8632b88982c9 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2626,20 +2626,12 @@ bool get_signal(struct ksignal *ksig)
 	struct signal_struct *signal = current->signal;
 	int signr;
 
+	clear_notify_signal();
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
-	/*
-	 * For non-generic architectures, check for TIF_NOTIFY_SIGNAL so
-	 * that the arch handlers don't all have to do it. If we get here
-	 * without TIF_SIGPENDING, just exit after running signal work.
-	 */
-	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
-		if (test_thread_flag(TIF_NOTIFY_SIGNAL))
-			tracehook_notify_signal();
-		if (!task_sigpending(current))
-			return false;
-	}
+	if (!task_sigpending(current))
+		return false;
 
 	if (unlikely(uprobe_deny_signal()))
 		return false;
-- 
2.29.2
Re: [PATCH 08/13] task_work: Call tracehook_notify_signal from get_signal on all architectures
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:49AM -0600, Eric W. Biederman wrote:
> Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3b2
> ("task_work: unconditionally run task_work from get_signal()") always
> calling task_wofffffffrk_run all of the work of tracehook_notify_signal is

typo: cat on keyboard

> already happening except clearing TIF_NOTIFY_SIGNAL.
> 
> Factor clear_notify_signal out of tracehook_notify_signal and use it in
> get_signal so that get_signal only needs one call of trask_work_run.

typo: trask -> task

> 
> To keep the semantics in sync update xfer_to_guest_mode_work (which
> does not call get_signal) to call tracehook_notify_signal if either
> _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.

I see three logical changes in this patch, I think?

- creation and use of clear_notify_signal()
- removal of handle_signal_work() and removal of
  arch_do_signal_or_restart() has_signal arg
- something with get_signal() I don't understand yet:
  - why is clear_notify_signal() added?
  - why is tracehook_notify_signal() removed?

> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  arch/s390/kernel/signal.c    |  4 ++--
>  arch/x86/kernel/signal.c     |  4 ++--
>  include/linux/entry-common.h |  2 +-
>  include/linux/tracehook.h    |  9 +++++++--
>  kernel/entry/common.c        | 12 ++----------
>  kernel/entry/kvm.c           |  2 +-
>  kernel/signal.c              | 14 +++-----------
>  7 files changed, 18 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
> index 307f5d99514d..ea9e5e8182cd 100644
> --- a/arch/s390/kernel/signal.c
> +++ b/arch/s390/kernel/signal.c
> @@ -453,7 +453,7 @@ static void handle_signal(struct ksignal *ksig, sigset_t *oldset,
>   * stack-frames in one go after that.
>   */
>  
> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
> +void arch_do_signal_or_restart(struct pt_regs *regs)
>  {
>  	struct ksignal ksig;
>  	sigset_t *oldset = sigmask_to_save();
> @@ -466,7 +466,7 @@ void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>  	current->thread.system_call =
>  		test_pt_regs_flag(regs, PIF_SYSCALL) ? regs->int_code : 0;
>  
> -	if (has_signal && get_signal(&ksig)) {

Right, the only caller of arch_do_signal_or_restart(),
handle_signal_work(), only happens after its caller has already checked
_TIF_SIGPENDING.

> +	if (get_signal(&ksig)) {
>  		/* Whee!  Actually deliver the signal.  */
>  		if (current->thread.system_call) {
>  			regs->int_code = current->thread.system_call;
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index ec71e06ae364..de3d5b5724d8 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -861,11 +861,11 @@ static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
>   * want to handle. Thus you cannot kill init even with a SIGKILL even by
>   * mistake.
>   */
> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
> +void arch_do_signal_or_restart(struct pt_regs *regs)
>  {
>  	struct ksignal ksig;
>  
> -	if (has_signal && get_signal(&ksig)) {
> +	if (get_signal(&ksig)) {
>  		/* Whee! Actually deliver the signal.  */
>  		handle_signal(&ksig, regs);
>  		return;
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index 9efbdda61f7a..3537fd25f14e 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -257,7 +257,7 @@ static __always_inline void arch_exit_to_user_mode(void) { }
>   *
>   * Invoked from exit_to_user_mode_loop().
>   */
> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
> +void arch_do_signal_or_restart(struct pt_regs *regs);
>  
>  /**
>   * exit_to_user_mode - Fixup state when exiting to user mode
> diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
> index fa834a22e86e..b44a7820c468 100644
> --- a/include/linux/tracehook.h
> +++ b/include/linux/tracehook.h
> @@ -106,6 +106,12 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>  	rseq_handle_notify_resume(NULL, regs);
>  }
>  
> +static inline void clear_notify_signal(void)
> +{
> +	clear_thread_flag(TIF_NOTIFY_SIGNAL);
> +	smp_mb__after_atomic();
> +}
> +
>  /*
>   * called by exit_to_user_mode_loop() if ti_work & _TIF_NOTIFY_SIGNAL. This
>   * is currently used by TWA_SIGNAL based task_work, which requires breaking
> @@ -113,8 +119,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>   */
>  static inline void tracehook_notify_signal(void)
>  {
> -	clear_thread_flag(TIF_NOTIFY_SIGNAL);
> -	smp_mb__after_atomic();
> +	clear_notify_signal();
>  	if (task_work_pending(current))
>  		task_work_run();
>  }
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index f0b1daa1e8da..79eaf9b4b10d 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -139,15 +139,7 @@ void noinstr exit_to_user_mode(void)
>  }
>  
>  /* Workaround to allow gradual conversion of architecture code */
> -void __weak arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal) { }
> -
> -static void handle_signal_work(struct pt_regs *regs, unsigned long ti_work)
> -{
> -	if (ti_work & _TIF_NOTIFY_SIGNAL)
> -		tracehook_notify_signal();
> -
> -	arch_do_signal_or_restart(regs, ti_work & _TIF_SIGPENDING);
> -}
> +void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
>  
>  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>  					    unsigned long ti_work)
> @@ -170,7 +162,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>  			klp_update_patch_state(current);
>  
>  		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
> -			handle_signal_work(regs, ti_work);
> +			arch_do_signal_or_restart(regs);
>  
>  		if (ti_work & _TIF_NOTIFY_RESUME)
>  			tracehook_notify_resume(regs);
> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> index 96d476e06c77..cabf36a489e4 100644
> --- a/kernel/entry/kvm.c
> +++ b/kernel/entry/kvm.c
> @@ -8,7 +8,7 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
>  	do {
>  		int ret;
>  
> -		if (ti_work & _TIF_NOTIFY_SIGNAL)
> +		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
>  			tracehook_notify_signal();
>  
>  		if (ti_work & _TIF_SIGPENDING) {
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 3b4cf25fb9b3..8632b88982c9 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2626,20 +2626,12 @@ bool get_signal(struct ksignal *ksig)
>  	struct signal_struct *signal = current->signal;
>  	int signr;
>  
> +	clear_notify_signal();

Why is this added?

>  	if (unlikely(task_work_pending(current)))
>  		task_work_run();
>  
> -	/*
> -	 * For non-generic architectures, check for TIF_NOTIFY_SIGNAL so
> -	 * that the arch handlers don't all have to do it. If we get here
> -	 * without TIF_SIGPENDING, just exit after running signal work.
> -	 */
> -	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
> -		if (test_thread_flag(TIF_NOTIFY_SIGNAL))
> -			tracehook_notify_signal();

I don't see why this gets removed?

> -		if (!task_sigpending(current))
> -			return false;
> -	}
> +	if (!task_sigpending(current))
> +		return false;
>  
>  	if (unlikely(uprobe_deny_signal()))
>  		return false;
> -- 
> 2.29.2
> 

-- 
Kees Cook
Re: [PATCH 08/13] task_work: Call tracehook_notify_signal from get_signal on all architectures
Posted by Eric W. Biederman 4 years, 3 months ago
Kees Cook <keescook@chromium.org> writes:

> On Wed, Mar 09, 2022 at 10:24:49AM -0600, Eric W. Biederman wrote:
>> Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3b2
>> ("task_work: unconditionally run task_work from get_signal()") always
>> calling task_wofffffffrk_run all of the work of tracehook_notify_signal is
>
> typo: cat on keyboard
>
>> already happening except clearing TIF_NOTIFY_SIGNAL.
>> 
>> Factor clear_notify_signal out of tracehook_notify_signal and use it in
>> get_signal so that get_signal only needs one call of trask_work_run.
>
> typo: trask -> task
>
>> 
>> To keep the semantics in sync update xfer_to_guest_mode_work (which
>> does not call get_signal) to call tracehook_notify_signal if either
>> _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.

First let me say thanks for the close look at this work.

> I see three logical changes in this patch, I think?
>
> - creation and use of clear_notify_signal()
> - removal of handle_signal_work() and removal of
>   arch_do_signal_or_restart() has_signal arg
> - something with get_signal() I don't understand yet:
>   - why is clear_notify_signal() added?
>   - why is tracehook_notify_signal() removed?


The spoiler is the change to get_signal is the logical change.
The rest of the changes follow from that change.  Please see below.

The inline expansion of tracehook_notify_signal in get_signal and
in it's other two callers in the next change is the only real kernel
internal api change in this series of changes.

The optimization that was tried with TIF_NOTIFY_SIGNAL and being able to
only call task_work_run() when TIF_NOTIFY_SIGNAL was set instead of when
get_signal was called failed, and caused a regression.  The removal of
calling task_work_run from get_signal has been reverted but the rest
of the change had not been.  So this change just removes the rest of
the failed optimization.

Please see below for my detailed description of the get_signal change.

I hope this helps.

>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  arch/s390/kernel/signal.c    |  4 ++--
>>  arch/x86/kernel/signal.c     |  4 ++--
>>  include/linux/entry-common.h |  2 +-
>>  include/linux/tracehook.h    |  9 +++++++--
>>  kernel/entry/common.c        | 12 ++----------
>>  kernel/entry/kvm.c           |  2 +-
>>  kernel/signal.c              | 14 +++-----------
>>  7 files changed, 18 insertions(+), 29 deletions(-)
>> 
>> diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
>> index 307f5d99514d..ea9e5e8182cd 100644
>> --- a/arch/s390/kernel/signal.c
>> +++ b/arch/s390/kernel/signal.c
>> @@ -453,7 +453,7 @@ static void handle_signal(struct ksignal *ksig, sigset_t *oldset,
>>   * stack-frames in one go after that.
>>   */
>>  
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>> +void arch_do_signal_or_restart(struct pt_regs *regs)
>>  {
>>  	struct ksignal ksig;
>>  	sigset_t *oldset = sigmask_to_save();
>> @@ -466,7 +466,7 @@ void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>>  	current->thread.system_call =
>>  		test_pt_regs_flag(regs, PIF_SYSCALL) ? regs->int_code : 0;
>>  
>> -	if (has_signal && get_signal(&ksig)) {
>
> Right, the only caller of arch_do_signal_or_restart(),
> handle_signal_work(), only happens after its caller has already checked
> _TIF_SIGPENDING.

It could be TIF_SIGPENDING or TIF_NOTIFY_SIGNAL.  The work for
TIF_NOTIFY_SIGNAL has been moved unconditionally into get_signal.
So it no longer makes sense to care which flag has been set.

>> +	if (get_signal(&ksig)) {
>>  		/* Whee!  Actually deliver the signal.  */
>>  		if (current->thread.system_call) {
>>  			regs->int_code = current->thread.system_call;
>> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
>> index ec71e06ae364..de3d5b5724d8 100644
>> --- a/arch/x86/kernel/signal.c
>> +++ b/arch/x86/kernel/signal.c
>> @@ -861,11 +861,11 @@ static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
>>   * want to handle. Thus you cannot kill init even with a SIGKILL even by
>>   * mistake.
>>   */
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>> +void arch_do_signal_or_restart(struct pt_regs *regs)
>>  {
>>  	struct ksignal ksig;
>>  
>> -	if (has_signal && get_signal(&ksig)) {
>> +	if (get_signal(&ksig)) {
>>  		/* Whee! Actually deliver the signal.  */
>>  		handle_signal(&ksig, regs);
>>  		return;
>> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
>> index 9efbdda61f7a..3537fd25f14e 100644
>> --- a/include/linux/entry-common.h
>> +++ b/include/linux/entry-common.h
>> @@ -257,7 +257,7 @@ static __always_inline void arch_exit_to_user_mode(void) { }
>>   *
>>   * Invoked from exit_to_user_mode_loop().
>>   */
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
>> +void arch_do_signal_or_restart(struct pt_regs *regs);
>>  
>>  /**
>>   * exit_to_user_mode - Fixup state when exiting to user mode
>> diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
>> index fa834a22e86e..b44a7820c468 100644
>> --- a/include/linux/tracehook.h
>> +++ b/include/linux/tracehook.h
>> @@ -106,6 +106,12 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>>  	rseq_handle_notify_resume(NULL, regs);
>>  }
>>  
>> +static inline void clear_notify_signal(void)
>> +{
>> +	clear_thread_flag(TIF_NOTIFY_SIGNAL);
>> +	smp_mb__after_atomic();
>> +}
>> +
>>  /*
>>   * called by exit_to_user_mode_loop() if ti_work & _TIF_NOTIFY_SIGNAL. This
>>   * is currently used by TWA_SIGNAL based task_work, which requires breaking
>> @@ -113,8 +119,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>>   */
>>  static inline void tracehook_notify_signal(void)
>>  {
>> -	clear_thread_flag(TIF_NOTIFY_SIGNAL);
>> -	smp_mb__after_atomic();
>> +	clear_notify_signal();
>>  	if (task_work_pending(current))
>>  		task_work_run();
>>  }
>> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
>> index f0b1daa1e8da..79eaf9b4b10d 100644
>> --- a/kernel/entry/common.c
>> +++ b/kernel/entry/common.c
>> @@ -139,15 +139,7 @@ void noinstr exit_to_user_mode(void)
>>  }
>>  
>>  /* Workaround to allow gradual conversion of architecture code */
>> -void __weak arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal) { }
>> -
>> -static void handle_signal_work(struct pt_regs *regs, unsigned long ti_work)
>> -{
>> -	if (ti_work & _TIF_NOTIFY_SIGNAL)
>> -		tracehook_notify_signal();
>> -
>> -	arch_do_signal_or_restart(regs, ti_work & _TIF_SIGPENDING);
>> -}

With the work of tracehook_notify_signal happening in get_signal (called
from arch_do_signal_or_restart) there is no longer a reason to call
tracehook_notify_signal on it's own, or to remember if it was
TIF_NOTIFY_SIGNAL or TIF_SIGPENDING which was set.

>> +void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
>>  
>>  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>>  					    unsigned long ti_work)
>> @@ -170,7 +162,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>>  			klp_update_patch_state(current);
>>  
>>  		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
>> -			handle_signal_work(regs, ti_work);
>> +			arch_do_signal_or_restart(regs);
>>  
>>  		if (ti_work & _TIF_NOTIFY_RESUME)
>>  			tracehook_notify_resume(regs);
>> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
>> index 96d476e06c77..cabf36a489e4 100644
>> --- a/kernel/entry/kvm.c
>> +++ b/kernel/entry/kvm.c
>> @@ -8,7 +8,7 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
>>  	do {
>>  		int ret;
>>  
>> -		if (ti_work & _TIF_NOTIFY_SIGNAL)
>> +		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
>>  			tracehook_notify_signal();
>>  
>>  		if (ti_work & _TIF_SIGPENDING) {
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index 3b4cf25fb9b3..8632b88982c9 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -2626,20 +2626,12 @@ bool get_signal(struct ksignal *ksig)
>>  	struct signal_struct *signal = current->signal;
>>  	int signr;
>>  
>> +	clear_notify_signal();
>
> Why is this added?

See below.
>
>>  	if (unlikely(task_work_pending(current)))
>>  		task_work_run();
>>  
>> -	/*
>> -	 * For non-generic architectures, check for TIF_NOTIFY_SIGNAL so
>> -	 * that the arch handlers don't all have to do it. If we get here
>> -	 * without TIF_SIGPENDING, just exit after running signal work.
>> -	 */
>> -	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>> -		if (test_thread_flag(TIF_NOTIFY_SIGNAL))
>> -			tracehook_notify_signal();
>
> I don't see why this gets removed?

This is the core change of this change, the rest
follows from this change.

The definition of tracehook_notify_signal is:

  static inline void tracehook_notify_signal(void)
  {
  	clear_notify_signal();
  	if (task_work_pending(current))
  		task_work_run();
  }


Which means the only difference between:
	if (unlikely(task_work_pending(current)))
		task_work_run()

and tracehook_notify_signal is the work done by clear_notify_signal.
So I added the missing work and remove the fancy conditional.

There are only two architectures that define CONFIG_GENERIC_ENTRY
x86 and s390.  At some point the task_work_run was moved completely
out of get_signal (on x86 and s390) and it was assumed it was enough
to call tracehook_notify_signal when TIF_NOTIFY_SIGNAL was set.

That turned out to be false and kernel regressions were encountered.  So
the call to task_work_run was added back to get_signal.


Which is where this change comes in.  There is an unconditional call to
task_work_run() in get_signal, and a funny conditional call to
task_work_run().

So this change is just changing the kernel to unconditionally call
task_work_run() in get_signal(), as well as clearing TIF_NOTIFY_SIGNAL.

The result is that on the common path tracehook_notify_signal no longer
needs to be called.

My next change removes tracehook_notify_signal entirely.  Which is
the other reason I add clear_notify_signal.

>
>> -		if (!task_sigpending(current))
>> -			return false;
>> -	}
>> +	if (!task_sigpending(current))
>> +		return false;
>>  
>>  	if (unlikely(uprobe_deny_signal()))
>>  		return false;
>> -- 
>> 2.29.2
>> 

Eric
Re: [PATCH 08/13] task_work: Call tracehook_notify_signal from get_signal on all architectures
Posted by Kees Cook 4 years, 3 months ago
On Thu, Mar 10, 2022 at 01:04:52PM -0600, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Wed, Mar 09, 2022 at 10:24:49AM -0600, Eric W. Biederman wrote:
> >> Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3b2
> >> ("task_work: unconditionally run task_work from get_signal()") always
> >> calling task_wofffffffrk_run all of the work of tracehook_notify_signal is
> >
> > typo: cat on keyboard
> >
> >> already happening except clearing TIF_NOTIFY_SIGNAL.
> >> 
> >> Factor clear_notify_signal out of tracehook_notify_signal and use it in
> >> get_signal so that get_signal only needs one call of trask_work_run.
> >
> > typo: trask -> task
> >
> >> 
> >> To keep the semantics in sync update xfer_to_guest_mode_work (which
> >> does not call get_signal) to call tracehook_notify_signal if either
> >> _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.
> 
> First let me say thanks for the close look at this work.
> 
> > I see three logical changes in this patch, I think?
> >
> > - creation and use of clear_notify_signal()
> > - removal of handle_signal_work() and removal of
> >   arch_do_signal_or_restart() has_signal arg
> > - something with get_signal() I don't understand yet:
> >   - why is clear_notify_signal() added?
> >   - why is tracehook_notify_signal() removed?
> 
> 
> The spoiler is the change to get_signal is the logical change.
> The rest of the changes follow from that change.  Please see below.
> 
> The inline expansion of tracehook_notify_signal in get_signal and
> in it's other two callers in the next change is the only real kernel
> internal api change in this series of changes.
> 
> The optimization that was tried with TIF_NOTIFY_SIGNAL and being able to
> only call task_work_run() when TIF_NOTIFY_SIGNAL was set instead of when
> get_signal was called failed, and caused a regression.  The removal of
> calling task_work_run from get_signal has been reverted but the rest
> of the change had not been.  So this change just removes the rest of
> the failed optimization.
> 
> Please see below for my detailed description of the get_signal change.
> 
> I hope this helps.

It does! Thanks very much for the additional details.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 09/13] task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
Posted by Eric W. Biederman 4 years, 3 months ago
There are a small handful of reasons besides pending signals that the
kernel might want to break out of interruptible sleeps.  The flag
TIF_NOTIFY_SIGNAL and the helpers that set and clear TIF_NOTIFY_SIGNAL
provide that the infrastructure for breaking out of interruptible
sleeps and entering the return to user space slow path for those
cases.

Expand tracehook_notify_signal inline in it's callers and remove it,
which makes clear that TIF_NOTIFY_SIGNAL and task_work are separate
concepts.

Update the comment on set_notify_signal to more accurately describe
it's purpose.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/io-wq.c                |  4 +++-
 fs/io_uring.c             |  4 +++-
 include/linux/tracehook.h | 15 ++-------------
 kernel/entry/kvm.c        |  7 +++++--
 4 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index bb7f161bb19c..8b9147873c2c 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -515,7 +515,9 @@ static bool io_flush_signals(void)
 {
 	if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) {
 		__set_current_state(TASK_RUNNING);
-		tracehook_notify_signal();
+		clear_notify_signal();
+		if (task_work_pending(current))
+			task_work_run();
 		return true;
 	}
 	return false;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e85261079a78..d5fbae1030f9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2592,7 +2592,9 @@ static inline bool io_run_task_work(void)
 {
 	if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
 		__set_current_state(TASK_RUNNING);
-		tracehook_notify_signal();
+		clear_notify_signal();
+		if (task_work_pending(current))
+			task_work_run();
 		return true;
 	}
 
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index b44a7820c468..e5d676e841e3 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -113,19 +113,8 @@ static inline void clear_notify_signal(void)
 }
 
 /*
- * called by exit_to_user_mode_loop() if ti_work & _TIF_NOTIFY_SIGNAL. This
- * is currently used by TWA_SIGNAL based task_work, which requires breaking
- * wait loops to ensure that task_work is noticed and run.
- */
-static inline void tracehook_notify_signal(void)
-{
-	clear_notify_signal();
-	if (task_work_pending(current))
-		task_work_run();
-}
-
-/*
- * Called when we have work to process from exit_to_user_mode_loop()
+ * Called to break out of interruptible wait loops, and enter the
+ * exit_to_user_mode_loop().
  */
 static inline void set_notify_signal(struct task_struct *task)
 {
diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
index cabf36a489e4..3ab5f98988c3 100644
--- a/kernel/entry/kvm.c
+++ b/kernel/entry/kvm.c
@@ -8,8 +8,11 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
 	do {
 		int ret;
 
-		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
-			tracehook_notify_signal();
+		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) {
+			clear_notify_signal();
+			if (task_work_pending(current))
+				task_work_run();
+		}
 
 		if (ti_work & _TIF_SIGPENDING) {
 			kvm_handle_signal_exit(vcpu);
-- 
2.29.2
Re: [PATCH 09/13] task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:50AM -0600, Eric W. Biederman wrote:
> There are a small handful of reasons besides pending signals that the
> kernel might want to break out of interruptible sleeps.  The flag
> TIF_NOTIFY_SIGNAL and the helpers that set and clear TIF_NOTIFY_SIGNAL
> provide that the infrastructure for breaking out of interruptible
> sleeps and entering the return to user space slow path for those
> cases.
> 
> Expand tracehook_notify_signal inline in it's callers and remove it,
> which makes clear that TIF_NOTIFY_SIGNAL and task_work are separate
> concepts.
> 
> Update the comment on set_notify_signal to more accurately describe
> it's purpose.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 10/13] signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
Posted by Eric W. Biederman 4 years, 3 months ago
The header tracehook.h is no place for code to live.  The functions
set_notify_signal and clear_notify_signal are not about signals.  They
are about interruptions that act like signals.  The fundamental signal
primitives wind up calling set_notify_signal and clear_notify_signal.
Which means they need to be maintained with the signal code.

Since set_notify_signal and clear_notify_signal must be maintained
with the signal subsystem move them into sched/signal.h and claim them
as part of the signal subsystem.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/sched/signal.h | 17 +++++++++++++++++
 include/linux/tracehook.h    | 17 -----------------
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index b6ecb9fc4cd2..3c8b34876744 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -349,6 +349,23 @@ extern void sigqueue_free(struct sigqueue *);
 extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type);
 extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
 
+static inline void clear_notify_signal(void)
+{
+	clear_thread_flag(TIF_NOTIFY_SIGNAL);
+	smp_mb__after_atomic();
+}
+
+/*
+ * Called to break out of interruptible wait loops, and enter the
+ * exit_to_user_mode_loop().
+ */
+static inline void set_notify_signal(struct task_struct *task)
+{
+	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_SIGNAL) &&
+	    !wake_up_state(task, TASK_INTERRUPTIBLE))
+		kick_process(task);
+}
+
 static inline int restart_syscall(void)
 {
 	set_tsk_thread_flag(current, TIF_SIGPENDING);
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index e5d676e841e3..1b7365aef8da 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -106,21 +106,4 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
 	rseq_handle_notify_resume(NULL, regs);
 }
 
-static inline void clear_notify_signal(void)
-{
-	clear_thread_flag(TIF_NOTIFY_SIGNAL);
-	smp_mb__after_atomic();
-}
-
-/*
- * Called to break out of interruptible wait loops, and enter the
- * exit_to_user_mode_loop().
- */
-static inline void set_notify_signal(struct task_struct *task)
-{
-	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_SIGNAL) &&
-	    !wake_up_state(task, TASK_INTERRUPTIBLE))
-		kick_process(task);
-}
-
 #endif	/* <linux/tracehook.h> */
-- 
2.29.2
Re: [PATCH 10/13] signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:51AM -0600, Eric W. Biederman wrote:
> The header tracehook.h is no place for code to live.  The functions
> set_notify_signal and clear_notify_signal are not about signals.  They
> are about interruptions that act like signals.  The fundamental signal
> primitives wind up calling set_notify_signal and clear_notify_signal.
> Which means they need to be maintained with the signal code.
> 
> Since set_notify_signal and clear_notify_signal must be maintained
> with the signal subsystem move them into sched/signal.h and claim them
> as part of the signal subsystem.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 11/13] resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
Posted by Eric W. Biederman 4 years, 3 months ago
Every architecture defines TIF_NOTIFY_RESUME so remove the unnecessary
ifdef.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/tracehook.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 1b7365aef8da..946404ebe10b 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -63,10 +63,8 @@ struct linux_binprm;
  */
 static inline void set_notify_resume(struct task_struct *task)
 {
-#ifdef TIF_NOTIFY_RESUME
 	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
 		kick_process(task);
-#endif
 }
 
 /**
-- 
2.29.2
Re: [PATCH 11/13] resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:52AM -0600, Eric W. Biederman wrote:
> Every architecture defines TIF_NOTIFY_RESUME so remove the unnecessary
> ifdef.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 12/13] resume_user_mode: Move to resume_user_mode.h
Posted by Eric W. Biederman 4 years, 3 months ago
Move set_notify_resume and tracehook_notify_resume into resume_user_mode.h.
While doing that rename tracehook_notify_resume to resume_user_mode_work.

Update all of the places that included tracehook.h for these functions to
include resume_user_mode.h instead.

Update all of the callers of tracehook_notify_resume to call
resume_user_mode_work.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/Kconfig                     |  2 +-
 arch/alpha/kernel/signal.c       |  4 +-
 arch/arc/kernel/signal.c         |  4 +-
 arch/arm/kernel/signal.c         |  4 +-
 arch/arm64/kernel/signal.c       |  4 +-
 arch/csky/kernel/signal.c        |  4 +-
 arch/h8300/kernel/signal.c       |  4 +-
 arch/hexagon/kernel/process.c    |  4 +-
 arch/hexagon/kernel/signal.c     |  1 -
 arch/ia64/kernel/process.c       |  4 +-
 arch/ia64/kernel/ptrace.c        |  2 +-
 arch/ia64/kernel/signal.c        |  1 -
 arch/m68k/kernel/signal.c        |  4 +-
 arch/microblaze/kernel/signal.c  |  4 +-
 arch/mips/kernel/signal.c        |  4 +-
 arch/nds32/kernel/signal.c       |  4 +-
 arch/nios2/kernel/signal.c       |  4 +-
 arch/openrisc/kernel/signal.c    |  4 +-
 arch/parisc/kernel/signal.c      |  4 +-
 arch/powerpc/kernel/signal.c     |  4 +-
 arch/riscv/kernel/signal.c       |  4 +-
 arch/sh/kernel/signal_32.c       |  4 +-
 arch/sparc/kernel/signal32.c     |  1 -
 arch/sparc/kernel/signal_32.c    |  4 +-
 arch/sparc/kernel/signal_64.c    |  4 +-
 arch/um/kernel/process.c         |  4 +-
 arch/xtensa/kernel/signal.c      |  4 +-
 block/blk-cgroup.c               |  2 +-
 include/linux/entry-kvm.h        |  2 +-
 include/linux/resume_user_mode.h | 64 ++++++++++++++++++++++++++++++++
 include/linux/tracehook.h        | 51 -------------------------
 kernel/entry/common.c            |  4 +-
 kernel/entry/kvm.c               |  2 +-
 kernel/task_work.c               |  2 +-
 mm/memcontrol.c                  |  2 +-
 35 files changed, 117 insertions(+), 107 deletions(-)
 create mode 100644 include/linux/resume_user_mode.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 6382520ef0a5..2e3979c3d66d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -218,7 +218,7 @@ config TRACE_IRQFLAGS_SUPPORT
 #	linux/regset.h		user_regset interfaces
 #	CORE_DUMP_USE_REGSET	#define'd in linux/elf.h
 #	TIF_SYSCALL_TRACE	calls ptrace_report_syscall_{entry,exit}
-#	TIF_NOTIFY_RESUME	calls tracehook_notify_resume()
+#	TIF_NOTIFY_RESUME	calls resume_user_mode_work()
 #
 config HAVE_ARCH_TRACEHOOK
 	bool
diff --git a/arch/alpha/kernel/signal.c b/arch/alpha/kernel/signal.c
index d8ed71d5bed3..6f47f256fe80 100644
--- a/arch/alpha/kernel/signal.c
+++ b/arch/alpha/kernel/signal.c
@@ -22,7 +22,7 @@
 #include <linux/binfmts.h>
 #include <linux/bitops.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <linux/uaccess.h>
 #include <asm/sigcontext.h>
@@ -531,7 +531,7 @@ do_work_pending(struct pt_regs *regs, unsigned long thread_flags,
 				do_signal(regs, r0, r19);
 				r0 = 0;
 			} else {
-				tracehook_notify_resume(regs);
+				resume_user_mode_work(regs);
 			}
 		}
 		local_irq_disable();
diff --git a/arch/arc/kernel/signal.c b/arch/arc/kernel/signal.c
index cb2f88502baf..f748483628f2 100644
--- a/arch/arc/kernel/signal.c
+++ b/arch/arc/kernel/signal.c
@@ -49,7 +49,7 @@
 #include <linux/personality.h>
 #include <linux/uaccess.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/sched/task_stack.h>
 
 #include <asm/ucontext.h>
@@ -438,5 +438,5 @@ void do_notify_resume(struct pt_regs *regs)
 	 * user mode
 	 */
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index c532a6041066..459abc5d1819 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -9,7 +9,7 @@
 #include <linux/signal.h>
 #include <linux/personality.h>
 #include <linux/uaccess.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/uprobes.h>
 #include <linux/syscalls.h>
 
@@ -627,7 +627,7 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 			} else if (thread_flags & _TIF_UPROBE) {
 				uprobe_notify_resume(regs);
 			} else {
-				tracehook_notify_resume(regs);
+				resume_user_mode_work(regs);
 			}
 		}
 		local_irq_disable();
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index d8aaf4b6f432..413c51de9d10 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -17,7 +17,7 @@
 #include <linux/uaccess.h>
 #include <linux/sizes.h>
 #include <linux/string.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/ratelimit.h>
 #include <linux/syscalls.h>
 
@@ -941,7 +941,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
 				do_signal(regs);
 
 			if (thread_flags & _TIF_NOTIFY_RESUME)
-				tracehook_notify_resume(regs);
+				resume_user_mode_work(regs);
 
 			if (thread_flags & _TIF_FOREIGN_FPSTATE)
 				fpsimd_restore_current_state();
diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c
index c7b763d2f526..7a3149a27e4d 100644
--- a/arch/csky/kernel/signal.c
+++ b/arch/csky/kernel/signal.c
@@ -3,7 +3,7 @@
 #include <linux/signal.h>
 #include <linux/uaccess.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/traps.h>
 #include <asm/ucontext.h>
@@ -265,5 +265,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 		do_signal(regs);
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/h8300/kernel/signal.c b/arch/h8300/kernel/signal.c
index 75a1c36b105a..0716fc8a8ce2 100644
--- a/arch/h8300/kernel/signal.c
+++ b/arch/h8300/kernel/signal.c
@@ -39,7 +39,7 @@
 #include <linux/personality.h>
 #include <linux/tty.h>
 #include <linux/binfmts.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/setup.h>
 #include <linux/uaccess.h>
@@ -283,5 +283,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, u32 thread_info_flags)
 		do_signal(regs);
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
index 232dfd8956aa..ae3f728eeca0 100644
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -14,7 +14,7 @@
 #include <linux/tick.h>
 #include <linux/uaccess.h>
 #include <linux/slab.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 /*
  * Program thread launch.  Often defined as a macro in processor.h,
@@ -178,7 +178,7 @@ int do_work_pending(struct pt_regs *regs, u32 thread_info_flags)
 	}
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 		return 1;
 	}
 
diff --git a/arch/hexagon/kernel/signal.c b/arch/hexagon/kernel/signal.c
index 94cc7ff52dce..bcba31e9e0ae 100644
--- a/arch/hexagon/kernel/signal.c
+++ b/arch/hexagon/kernel/signal.c
@@ -7,7 +7,6 @@
 
 #include <linux/linkage.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
 #include <linux/sched/task_stack.h>
 
 #include <asm/registers.h>
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index 834df24a88f1..d7a256bd9d6b 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -32,7 +32,7 @@
 #include <linux/delay.h>
 #include <linux/kdebug.h>
 #include <linux/utsname.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/rcupdate.h>
 
 #include <asm/cpu.h>
@@ -179,7 +179,7 @@ do_notify_resume_user(sigset_t *unused, struct sigscratch *scr, long in_syscall)
 
 	if (test_thread_flag(TIF_NOTIFY_RESUME)) {
 		local_irq_enable();	/* force interrupt enable */
-		tracehook_notify_resume(&scr->pt);
+		resume_user_mode_work(&scr->pt);
 	}
 
 	/* copy user rbs to kernel rbs */
diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
index 6af64aae087d..a19acd9f5e1f 100644
--- a/arch/ia64/kernel/ptrace.c
+++ b/arch/ia64/kernel/ptrace.c
@@ -23,7 +23,7 @@
 #include <linux/signal.h>
 #include <linux/regset.h>
 #include <linux/elf.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/processor.h>
 #include <asm/ptrace_offsets.h>
diff --git a/arch/ia64/kernel/signal.c b/arch/ia64/kernel/signal.c
index c1b299760bf7..51cf6a7ec158 100644
--- a/arch/ia64/kernel/signal.c
+++ b/arch/ia64/kernel/signal.c
@@ -12,7 +12,6 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/sched.h>
 #include <linux/signal.h>
 #include <linux/smp.h>
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index 338817d0cb3f..49533f65958a 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -43,7 +43,7 @@
 #include <linux/tty.h>
 #include <linux/binfmts.h>
 #include <linux/extable.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/setup.h>
 #include <linux/uaccess.h>
@@ -1109,5 +1109,5 @@ void do_notify_resume(struct pt_regs *regs)
 		do_signal(regs);
 
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c
index 23e8a9336a29..561eb82d7af6 100644
--- a/arch/microblaze/kernel/signal.c
+++ b/arch/microblaze/kernel/signal.c
@@ -31,7 +31,7 @@
 #include <linux/personality.h>
 #include <linux/percpu.h>
 #include <linux/linkage.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <asm/entry.h>
 #include <asm/ucontext.h>
 #include <linux/uaccess.h>
@@ -311,5 +311,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, int in_syscall)
 		do_signal(regs, in_syscall);
 
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/mips/kernel/signal.c b/arch/mips/kernel/signal.c
index 5bce782e694c..1a99f26bf99f 100644
--- a/arch/mips/kernel/signal.c
+++ b/arch/mips/kernel/signal.c
@@ -25,7 +25,7 @@
 #include <linux/compiler.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/abi.h>
 #include <asm/asm.h>
@@ -916,7 +916,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, void *unused,
 		do_signal(regs);
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 
 	user_enter();
 }
diff --git a/arch/nds32/kernel/signal.c b/arch/nds32/kernel/signal.c
index 7e3ca430a223..551caef595cb 100644
--- a/arch/nds32/kernel/signal.c
+++ b/arch/nds32/kernel/signal.c
@@ -6,7 +6,7 @@
 #include <linux/ptrace.h>
 #include <linux/personality.h>
 #include <linux/freezer.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/uaccess.h>
 
 #include <asm/cacheflush.h>
@@ -380,5 +380,5 @@ do_notify_resume(struct pt_regs *regs, unsigned int thread_flags)
 		do_signal(regs);
 
 	if (thread_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
index 2009ae2d3c3b..530b60c99545 100644
--- a/arch/nios2/kernel/signal.c
+++ b/arch/nios2/kernel/signal.c
@@ -15,7 +15,7 @@
 #include <linux/uaccess.h>
 #include <linux/unistd.h>
 #include <linux/personality.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/ucontext.h>
 #include <asm/cacheflush.h>
@@ -319,7 +319,7 @@ asmlinkage int do_notify_resume(struct pt_regs *regs)
 			return restart;
 		}
 	} else if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 
 	return 0;
 }
diff --git a/arch/openrisc/kernel/signal.c b/arch/openrisc/kernel/signal.c
index 92c5b70740f5..80f69740c731 100644
--- a/arch/openrisc/kernel/signal.c
+++ b/arch/openrisc/kernel/signal.c
@@ -21,7 +21,7 @@
 #include <linux/ptrace.h>
 #include <linux/unistd.h>
 #include <linux/stddef.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <asm/processor.h>
 #include <asm/syscall.h>
@@ -309,7 +309,7 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 				}
 				syscall = 0;
 			} else {
-				tracehook_notify_resume(regs);
+				resume_user_mode_work(regs);
 			}
 		}
 		local_irq_disable();
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 46b1050640b8..2f7ebe9add20 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -22,7 +22,7 @@
 #include <linux/errno.h>
 #include <linux/wait.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/unistd.h>
 #include <linux/stddef.h>
 #include <linux/compat.h>
@@ -602,5 +602,5 @@ void do_notify_resume(struct pt_regs *regs, long in_syscall)
 		do_signal(regs, in_syscall);
 
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index b93b87df499d..f7f8620663c7 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -9,7 +9,7 @@
  * this archive for more details.
  */
 
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/signal.h>
 #include <linux/uprobes.h>
 #include <linux/key.h>
@@ -294,7 +294,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 	}
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
 
 static unsigned long get_tm_stackpointer(struct task_struct *tsk)
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index c2d5ecbe5526..d80bf5896c6f 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -9,7 +9,7 @@
 #include <linux/signal.h>
 #include <linux/uaccess.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/linkage.h>
 
 #include <asm/ucontext.h>
@@ -317,5 +317,5 @@ asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
 		do_signal(regs);
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/sh/kernel/signal_32.c b/arch/sh/kernel/signal_32.c
index dd3092911efa..90f495d35db2 100644
--- a/arch/sh/kernel/signal_32.c
+++ b/arch/sh/kernel/signal_32.c
@@ -25,7 +25,7 @@
 #include <linux/personality.h>
 #include <linux/binfmts.h>
 #include <linux/io.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <asm/ucontext.h>
 #include <linux/uaccess.h>
 #include <asm/cacheflush.h>
@@ -503,5 +503,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, unsigned int save_r0,
 		do_signal(regs, save_r0);
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c
index 6cc124a3bb98..f9fe502b81c6 100644
--- a/arch/sparc/kernel/signal32.c
+++ b/arch/sparc/kernel/signal32.c
@@ -20,7 +20,6 @@
 #include <linux/binfmts.h>
 #include <linux/compat.h>
 #include <linux/bitops.h>
-#include <linux/tracehook.h>
 
 #include <linux/uaccess.h>
 #include <asm/ptrace.h>
diff --git a/arch/sparc/kernel/signal_32.c b/arch/sparc/kernel/signal_32.c
index ffab16369bea..80c89b362d8b 100644
--- a/arch/sparc/kernel/signal_32.c
+++ b/arch/sparc/kernel/signal_32.c
@@ -19,7 +19,7 @@
 #include <linux/smp.h>
 #include <linux/binfmts.h>	/* do_coredum */
 #include <linux/bitops.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 #include <linux/uaccess.h>
 #include <asm/ptrace.h>
@@ -524,7 +524,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0,
 	if (thread_info_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
 		do_signal(regs, orig_i0);
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
 
 asmlinkage int do_sys_sigstack(struct sigstack __user *ssptr,
diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c
index 2a78d2af1265..8b9fc76cd3e0 100644
--- a/arch/sparc/kernel/signal_64.c
+++ b/arch/sparc/kernel/signal_64.c
@@ -15,7 +15,7 @@
 #include <linux/errno.h>
 #include <linux/wait.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/unistd.h>
 #include <linux/mm.h>
 #include <linux/tty.h>
@@ -552,7 +552,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0, unsigned long
 	if (thread_info_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
 		do_signal(regs, orig_i0);
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 	user_enter();
 }
 
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 4a420778ed87..80504680be08 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -23,7 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/tick.h>
 #include <linux/threads.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <asm/current.h>
 #include <asm/mmu_context.h>
 #include <linux/uaccess.h>
@@ -104,7 +104,7 @@ void interrupt_end(void)
 	    test_thread_flag(TIF_NOTIFY_SIGNAL))
 		do_signal(regs);
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
 
 int get_current_pid(void)
diff --git a/arch/xtensa/kernel/signal.c b/arch/xtensa/kernel/signal.c
index f6c949895b3e..6f68649e86ba 100644
--- a/arch/xtensa/kernel/signal.c
+++ b/arch/xtensa/kernel/signal.c
@@ -19,7 +19,7 @@
 #include <linux/errno.h>
 #include <linux/ptrace.h>
 #include <linux/personality.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/sched/task_stack.h>
 
 #include <asm/ucontext.h>
@@ -511,5 +511,5 @@ void do_notify_resume(struct pt_regs *regs)
 		do_signal(regs);
 
 	if (test_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(regs);
+		resume_user_mode_work(regs);
 }
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 650f7e27989f..4d8be1634bc6 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -28,7 +28,7 @@
 #include <linux/atomic.h>
 #include <linux/ctype.h>
 #include <linux/blk-cgroup.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/psi.h>
 #include <linux/part_stat.h>
 #include "blk.h"
diff --git a/include/linux/entry-kvm.h b/include/linux/entry-kvm.h
index 07c878d6e323..6813171afccb 100644
--- a/include/linux/entry-kvm.h
+++ b/include/linux/entry-kvm.h
@@ -3,7 +3,7 @@
 #define __LINUX_ENTRYKVM_H
 
 #include <linux/static_call_types.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/syscalls.h>
 #include <linux/seccomp.h>
 #include <linux/sched.h>
diff --git a/include/linux/resume_user_mode.h b/include/linux/resume_user_mode.h
new file mode 100644
index 000000000000..285189454449
--- /dev/null
+++ b/include/linux/resume_user_mode.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef LINUX_RESUME_USER_MODE_H
+#define LINUX_RESUME_USER_MODE_H
+
+#include <linux/sched.h>
+#include <linux/task_work.h>
+#include <linux/memcontrol.h>
+#include <linux/blk-cgroup.h>
+
+/**
+ * set_notify_resume - cause resume_user_mode_work() to be called
+ * @task:		task that will call resume_user_mode_work()
+ *
+ * Calling this arranges that @task will call resume_user_mode_work()
+ * before returning to user mode.  If it's already running in user mode,
+ * it will enter the kernel and call resume_user_mode_work() soon.
+ * If it's blocked, it will not be woken.
+ */
+static inline void set_notify_resume(struct task_struct *task)
+{
+	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
+		kick_process(task);
+}
+
+
+/**
+ * resume_user_mode_work - Perform work before returning to user mode
+ * @regs:		user-mode registers of @current task
+ *
+ * This is called when %TIF_NOTIFY_RESUME has been set.  Now we are
+ * about to return to user mode, and the user state in @regs can be
+ * inspected or adjusted.  The caller in arch code has cleared
+ * %TIF_NOTIFY_RESUME before the call.  If the flag gets set again
+ * asynchronously, this will be called again before we return to
+ * user mode.
+ *
+ * Called without locks.
+ */
+static inline void resume_user_mode_work(struct pt_regs *regs)
+{
+	clear_thread_flag(TIF_NOTIFY_RESUME);
+	/*
+	 * This barrier pairs with task_work_add()->set_notify_resume() after
+	 * hlist_add_head(task->task_works);
+	 */
+	smp_mb__after_atomic();
+	if (unlikely(task_work_pending(current)))
+		task_work_run();
+
+#ifdef CONFIG_KEYS_REQUEST_CACHE
+	if (unlikely(current->cached_requested_key)) {
+		key_put(current->cached_requested_key);
+		current->cached_requested_key = NULL;
+	}
+#endif
+
+	mem_cgroup_handle_over_high();
+	blkcg_maybe_throttle_current();
+
+	rseq_handle_notify_resume(NULL, regs);
+}
+
+#endif /* LINUX_RESUME_USER_MODE_H */
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 946404ebe10b..9f6b3fd1880a 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -52,56 +52,5 @@
 struct linux_binprm;
 
 
-/**
- * set_notify_resume - cause tracehook_notify_resume() to be called
- * @task:		task that will call tracehook_notify_resume()
- *
- * Calling this arranges that @task will call tracehook_notify_resume()
- * before returning to user mode.  If it's already running in user mode,
- * it will enter the kernel and call tracehook_notify_resume() soon.
- * If it's blocked, it will not be woken.
- */
-static inline void set_notify_resume(struct task_struct *task)
-{
-	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
-		kick_process(task);
-}
-
-/**
- * tracehook_notify_resume - report when about to return to user mode
- * @regs:		user-mode registers of @current task
- *
- * This is called when %TIF_NOTIFY_RESUME has been set.  Now we are
- * about to return to user mode, and the user state in @regs can be
- * inspected or adjusted.  The caller in arch code has cleared
- * %TIF_NOTIFY_RESUME before the call.  If the flag gets set again
- * asynchronously, this will be called again before we return to
- * user mode.
- *
- * Called without locks.
- */
-static inline void tracehook_notify_resume(struct pt_regs *regs)
-{
-	clear_thread_flag(TIF_NOTIFY_RESUME);
-	/*
-	 * This barrier pairs with task_work_add()->set_notify_resume() after
-	 * hlist_add_head(task->task_works);
-	 */
-	smp_mb__after_atomic();
-	if (unlikely(task_work_pending(current)))
-		task_work_run();
-
-#ifdef CONFIG_KEYS_REQUEST_CACHE
-	if (unlikely(current->cached_requested_key)) {
-		key_put(current->cached_requested_key);
-		current->cached_requested_key = NULL;
-	}
-#endif
-
-	mem_cgroup_handle_over_high();
-	blkcg_maybe_throttle_current();
-
-	rseq_handle_notify_resume(NULL, regs);
-}
 
 #endif	/* <linux/tracehook.h> */
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 79eaf9b4b10d..a86823cad853 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -2,7 +2,7 @@
 
 #include <linux/context_tracking.h>
 #include <linux/entry-common.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/highmem.h>
 #include <linux/livepatch.h>
 #include <linux/audit.h>
@@ -165,7 +165,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 			arch_do_signal_or_restart(regs);
 
 		if (ti_work & _TIF_NOTIFY_RESUME)
-			tracehook_notify_resume(regs);
+			resume_user_mode_work(regs);
 
 		/* Architecture specific TIF work */
 		arch_exit_to_user_mode_work(regs, ti_work);
diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
index 3ab5f98988c3..9d09f489b60e 100644
--- a/kernel/entry/kvm.c
+++ b/kernel/entry/kvm.c
@@ -23,7 +23,7 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
 			schedule();
 
 		if (ti_work & _TIF_NOTIFY_RESUME)
-			tracehook_notify_resume(NULL);
+			resume_user_mode_work(NULL);
 
 		ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work);
 		if (ret)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index cc6fccb0e24d..c59e1a49bc40 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/spinlock.h>
 #include <linux/task_work.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 
 static struct callback_head work_exited; /* all we need is ->next == NULL */
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09d342c7cbd0..2aaa400f34d6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -59,7 +59,7 @@
 #include <linux/oom.h>
 #include <linux/lockdep.h>
 #include <linux/file.h>
-#include <linux/tracehook.h>
+#include <linux/resume_user_mode.h>
 #include <linux/psi.h>
 #include <linux/seq_buf.h>
 #include "internal.h"
-- 
2.29.2
Re: [PATCH 12/13] resume_user_mode: Move to resume_user_mode.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:53AM -0600, Eric W. Biederman wrote:
> Move set_notify_resume and tracehook_notify_resume into resume_user_mode.h.
> While doing that rename tracehook_notify_resume to resume_user_mode_work.
> 
> Update all of the places that included tracehook.h for these functions to
> include resume_user_mode.h instead.
> 
> Update all of the callers of tracehook_notify_resume to call
> resume_user_mode_work.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 13/13] tracehook: Remove tracehook.h
Posted by Eric W. Biederman 4 years, 3 months ago
Now that all of the definitions have moved out of tracehook.h into
ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in
tracehook.h so remove it.

Update the few files that were depending upon tracehook.h to bring in
definitions to use the headers they need directly.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 MAINTAINERS                          |  1 -
 arch/s390/include/asm/entry-common.h |  1 -
 arch/s390/kernel/ptrace.c            |  1 -
 arch/s390/kernel/signal.c            |  1 -
 arch/x86/kernel/ptrace.c             |  1 -
 arch/x86/kernel/signal.c             |  1 -
 fs/coredump.c                        |  1 -
 fs/exec.c                            |  1 -
 fs/io-wq.c                           |  2 +-
 fs/io_uring.c                        |  1 -
 fs/proc/array.c                      |  1 -
 fs/proc/base.c                       |  1 -
 include/linux/tracehook.h            | 56 ----------------------------
 kernel/exit.c                        |  3 +-
 kernel/livepatch/transition.c        |  1 -
 kernel/seccomp.c                     |  1 -
 kernel/signal.c                      |  2 +-
 security/apparmor/domain.c           |  1 -
 security/selinux/hooks.c             |  1 -
 19 files changed, 4 insertions(+), 74 deletions(-)
 delete mode 100644 include/linux/tracehook.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ea3e6c914384..2f16a23a26a2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15623,7 +15623,6 @@ F:	arch/*/ptrace*.c
 F:	include/asm-generic/syscall.h
 F:	include/linux/ptrace.h
 F:	include/linux/regset.h
-F:	include/linux/tracehook.h
 F:	include/uapi/linux/ptrace.h
 F:	include/uapi/linux/ptrace.h
 F:	kernel/ptrace.c
diff --git a/arch/s390/include/asm/entry-common.h b/arch/s390/include/asm/entry-common.h
index 17aead80aadb..eabab24b71dd 100644
--- a/arch/s390/include/asm/entry-common.h
+++ b/arch/s390/include/asm/entry-common.h
@@ -5,7 +5,6 @@
 #include <linux/sched.h>
 #include <linux/audit.h>
 #include <linux/randomize_kstack.h>
-#include <linux/tracehook.h>
 #include <linux/processor.h>
 #include <linux/uaccess.h>
 #include <asm/timex.h>
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
index 0ea3d02b378d..641fa36f6101 100644
--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -21,7 +21,6 @@
 #include <linux/signal.h>
 #include <linux/elf.h>
 #include <linux/regset.h>
-#include <linux/tracehook.h>
 #include <linux/seccomp.h>
 #include <linux/compat.h>
 #include <trace/syscall.h>
diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
index ea9e5e8182cd..8b7b5f80f722 100644
--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -25,7 +25,6 @@
 #include <linux/tty.h>
 #include <linux/personality.h>
 #include <linux/binfmts.h>
-#include <linux/tracehook.h>
 #include <linux/syscalls.h>
 #include <linux/compat.h>
 #include <asm/ucontext.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 6d2244c94799..419768d7605e 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -13,7 +13,6 @@
 #include <linux/errno.h>
 #include <linux/slab.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/user.h>
 #include <linux/elf.h>
 #include <linux/security.h>
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index de3d5b5724d8..e439eb14325f 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -18,7 +18,6 @@
 #include <linux/kstrtox.h>
 #include <linux/errno.h>
 #include <linux/wait.h>
-#include <linux/tracehook.h>
 #include <linux/unistd.h>
 #include <linux/stddef.h>
 #include <linux/personality.h>
diff --git a/fs/coredump.c b/fs/coredump.c
index 1c060c0a2d72..f54c5e316df3 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -31,7 +31,6 @@
 #include <linux/tsacct_kern.h>
 #include <linux/cn_proc.h>
 #include <linux/audit.h>
-#include <linux/tracehook.h>
 #include <linux/kmod.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..e23e2d430485 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -56,7 +56,6 @@
 #include <linux/tsacct_kern.h>
 #include <linux/cn_proc.h>
 #include <linux/audit.h>
-#include <linux/tracehook.h>
 #include <linux/kmod.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 8b9147873c2c..cb3cb1833ef6 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -13,7 +13,7 @@
 #include <linux/slab.h>
 #include <linux/rculist_nulls.h>
 #include <linux/cpu.h>
-#include <linux/tracehook.h>
+#include <linux/task_work.h>
 #include <linux/audit.h>
 #include <uapi/linux/io_uring.h>
 
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d5fbae1030f9..6c7eacc0ebd6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -78,7 +78,6 @@
 #include <linux/task_work.h>
 #include <linux/pagemap.h>
 #include <linux/io_uring.h>
-#include <linux/tracehook.h>
 #include <linux/audit.h>
 #include <linux/security.h>
 
diff --git a/fs/proc/array.c b/fs/proc/array.c
index fd8b0c12b2cb..eb815759842c 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -88,7 +88,6 @@
 #include <linux/pid_namespace.h>
 #include <linux/prctl.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/string_helpers.h>
 #include <linux/user_namespace.h>
 #include <linux/fs_struct.h>
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d654ce7150fd..01fb37ecc89f 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -74,7 +74,6 @@
 #include <linux/mount.h>
 #include <linux/security.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
 #include <linux/printk.h>
 #include <linux/cache.h>
 #include <linux/cgroup.h>
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
deleted file mode 100644
index 9f6b3fd1880a..000000000000
--- a/include/linux/tracehook.h
+++ /dev/null
@@ -1,56 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Tracing hooks
- *
- * Copyright (C) 2008-2009 Red Hat, Inc.  All rights reserved.
- *
- * This file defines hook entry points called by core code where
- * user tracing/debugging support might need to do something.  These
- * entry points are called tracehook_*().  Each hook declared below
- * has a detailed kerneldoc comment giving the context (locking et
- * al) from which it is called, and the meaning of its return value.
- *
- * Each function here typically has only one call site, so it is ok
- * to have some nontrivial tracehook_*() inlines.  In all cases, the
- * fast path when no tracing is enabled should be very short.
- *
- * The purpose of this file and the tracehook_* layer is to consolidate
- * the interface that the kernel core and arch code uses to enable any
- * user debugging or tracing facility (such as ptrace).  The interfaces
- * here are carefully documented so that maintainers of core and arch
- * code do not need to think about the implementation details of the
- * tracing facilities.  Likewise, maintainers of the tracing code do not
- * need to understand all the calling core or arch code in detail, just
- * documented circumstances of each call, such as locking conditions.
- *
- * If the calling core code changes so that locking is different, then
- * it is ok to change the interface documented here.  The maintainer of
- * core code changing should notify the maintainers of the tracing code
- * that they need to work out the change.
- *
- * Some tracehook_*() inlines take arguments that the current tracing
- * implementations might not necessarily use.  These function signatures
- * are chosen to pass in all the information that is on hand in the
- * caller and might conceivably be relevant to a tracer, so that the
- * core code won't have to be updated when tracing adds more features.
- * If a call site changes so that some of those parameters are no longer
- * already on hand without extra work, then the tracehook_* interface
- * can change so there is no make-work burden on the core code.  The
- * maintainer of core code changing should notify the maintainers of the
- * tracing code that they need to work out the change.
- */
-
-#ifndef _LINUX_TRACEHOOK_H
-#define _LINUX_TRACEHOOK_H	1
-
-#include <linux/sched.h>
-#include <linux/ptrace.h>
-#include <linux/security.h>
-#include <linux/task_work.h>
-#include <linux/memcontrol.h>
-#include <linux/blk-cgroup.h>
-struct linux_binprm;
-
-
-
-#endif	/* <linux/tracehook.h> */
diff --git a/kernel/exit.c b/kernel/exit.c
index b00a25bb4ab9..9326d1f97fc7 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -49,7 +49,8 @@
 #include <linux/audit.h> /* for audit_free() */
 #include <linux/resource.h>
 #include <linux/task_io_accounting_ops.h>
-#include <linux/tracehook.h>
+#include <linux/blkdev.h>
+#include <linux/task_work.h>
 #include <linux/fs_struct.h>
 #include <linux/init_task.h>
 #include <linux/perf_event.h>
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 5683ac0d2566..df808d97d84f 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -9,7 +9,6 @@
 
 #include <linux/cpu.h>
 #include <linux/stacktrace.h>
-#include <linux/tracehook.h>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 4d8f44a17727..63198086ee83 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -39,7 +39,6 @@
 #include <linux/pid.h>
 #include <linux/ptrace.h>
 #include <linux/capability.h>
-#include <linux/tracehook.h>
 #include <linux/uaccess.h>
 #include <linux/anon_inodes.h>
 #include <linux/lockdep.h>
diff --git a/kernel/signal.c b/kernel/signal.c
index 8632b88982c9..c2dee5420567 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -32,7 +32,7 @@
 #include <linux/signal.h>
 #include <linux/signalfd.h>
 #include <linux/ratelimit.h>
-#include <linux/tracehook.h>
+#include <linux/task_work.h>
 #include <linux/capability.h>
 #include <linux/freezer.h>
 #include <linux/pid_namespace.h>
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 583680f6cd81..a29e69d2c300 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -14,7 +14,6 @@
 #include <linux/file.h>
 #include <linux/mount.h>
 #include <linux/syscalls.h>
-#include <linux/tracehook.h>
 #include <linux/personality.h>
 #include <linux/xattr.h>
 #include <linux/user_namespace.h>
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 5b6895e4fc29..4d2cd6b9f6fc 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -25,7 +25,6 @@
 #include <linux/kd.h>
 #include <linux/kernel.h>
 #include <linux/kernel_read_file.h>
-#include <linux/tracehook.h>
 #include <linux/errno.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/task.h>
-- 
2.29.2
Re: [PATCH 13/13] tracehook: Remove tracehook.h
Posted by Kees Cook 4 years, 3 months ago
On Wed, Mar 09, 2022 at 10:24:54AM -0600, Eric W. Biederman wrote:
> Now that all of the definitions have moved out of tracehook.h into
> ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in
> tracehook.h so remove it.
> 
> Update the few files that were depending upon tracehook.h to bring in
> definitions to use the headers they need directly.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
[PATCH 00/13] Removing tracehook.h
Posted by Eric W. Biederman 4 years, 3 months ago
While working on cleaning up do_exit I have been having to deal with the
code in tracehook.h.  Unfortunately the code in tracehook.h does not
make sense as organized.

This set of changes reorganizes things so that tracehook.h no longer
exists, and so that it's current contents are organized in a fashion
that is a little easier to understand.

The biggest change is that I lean into the fact that get_signal
always calls task_work_run and removes the logic that tried to
be smart and decouple task_work_run and get_signal as it has proven
to not be effective.

This is a conservative change and I am not changing the how things
like signal_pending operate (although it is probably justified).

A new header resume_user_mode.h is added to hold resume_user_mode_work
which was previously known as tracehook_notify_resume.

Eric W. Biederman (13):
      ptrace: Move ptrace_report_syscall into ptrace.h
      ptrace/arm: Rename tracehook_report_syscall report_syscall
      ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
      ptrace: Remove arch_syscall_{enter,exit}_tracehook
      ptrace: Remove tracehook_signal_handler
      task_work: Remove unnecessary include from posix_timers.h
      task_work: Introduce task_work_pending
      task_work: Call tracehook_notify_signal from get_signal on all architectures
      task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
      signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
      resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
      resume_user_mode: Move to resume_user_mode.h
      tracehook: Remove tracehook.h

 MAINTAINERS                          |   1 -
 arch/Kconfig                         |   5 +-
 arch/alpha/kernel/ptrace.c           |   5 +-
 arch/alpha/kernel/signal.c           |   4 +-
 arch/arc/kernel/ptrace.c             |   5 +-
 arch/arc/kernel/signal.c             |   4 +-
 arch/arm/kernel/ptrace.c             |  12 +-
 arch/arm/kernel/signal.c             |   4 +-
 arch/arm64/kernel/ptrace.c           |  14 +--
 arch/arm64/kernel/signal.c           |   4 +-
 arch/csky/kernel/ptrace.c            |   5 +-
 arch/csky/kernel/signal.c            |   4 +-
 arch/h8300/kernel/ptrace.c           |   5 +-
 arch/h8300/kernel/signal.c           |   4 +-
 arch/hexagon/kernel/process.c        |   4 +-
 arch/hexagon/kernel/signal.c         |   1 -
 arch/hexagon/kernel/traps.c          |   6 +-
 arch/ia64/kernel/process.c           |   4 +-
 arch/ia64/kernel/ptrace.c            |   6 +-
 arch/ia64/kernel/signal.c            |   1 -
 arch/m68k/kernel/ptrace.c            |   6 +-
 arch/m68k/kernel/signal.c            |   4 +-
 arch/microblaze/kernel/ptrace.c      |   5 +-
 arch/microblaze/kernel/signal.c      |   4 +-
 arch/mips/kernel/ptrace.c            |   5 +-
 arch/mips/kernel/signal.c            |   4 +-
 arch/nds32/include/asm/syscall.h     |   2 +-
 arch/nds32/kernel/ptrace.c           |   5 +-
 arch/nds32/kernel/signal.c           |   4 +-
 arch/nios2/kernel/ptrace.c           |   5 +-
 arch/nios2/kernel/signal.c           |   4 +-
 arch/openrisc/kernel/ptrace.c        |   5 +-
 arch/openrisc/kernel/signal.c        |   4 +-
 arch/parisc/kernel/ptrace.c          |   7 +-
 arch/parisc/kernel/signal.c          |   4 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
 arch/powerpc/kernel/signal.c         |   4 +-
 arch/riscv/kernel/ptrace.c           |   5 +-
 arch/riscv/kernel/signal.c           |   4 +-
 arch/s390/include/asm/entry-common.h |   1 -
 arch/s390/kernel/ptrace.c            |   1 -
 arch/s390/kernel/signal.c            |   5 +-
 arch/sh/kernel/ptrace_32.c           |   5 +-
 arch/sh/kernel/signal_32.c           |   4 +-
 arch/sparc/kernel/ptrace_32.c        |   5 +-
 arch/sparc/kernel/ptrace_64.c        |   5 +-
 arch/sparc/kernel/signal32.c         |   1 -
 arch/sparc/kernel/signal_32.c        |   4 +-
 arch/sparc/kernel/signal_64.c        |   4 +-
 arch/um/kernel/process.c             |   4 +-
 arch/um/kernel/ptrace.c              |   5 +-
 arch/x86/kernel/ptrace.c             |   1 -
 arch/x86/kernel/signal.c             |   5 +-
 arch/x86/mm/tlb.c                    |   1 +
 arch/xtensa/kernel/ptrace.c          |   5 +-
 arch/xtensa/kernel/signal.c          |   4 +-
 block/blk-cgroup.c                   |   2 +-
 fs/coredump.c                        |   1 -
 fs/exec.c                            |   1 -
 fs/io-wq.c                           |   6 +-
 fs/io_uring.c                        |  11 +-
 fs/proc/array.c                      |   1 -
 fs/proc/base.c                       |   1 -
 include/asm-generic/syscall.h        |   2 +-
 include/linux/entry-common.h         |  47 +-------
 include/linux/entry-kvm.h            |   2 +-
 include/linux/posix-timers.h         |   1 -
 include/linux/ptrace.h               |  78 ++++++++++++
 include/linux/resume_user_mode.h     |  64 ++++++++++
 include/linux/sched/signal.h         |  17 +++
 include/linux/task_work.h            |   5 +
 include/linux/tracehook.h            | 226 -----------------------------------
 include/uapi/linux/ptrace.h          |   2 +-
 kernel/entry/common.c                |  19 +--
 kernel/entry/kvm.c                   |   9 +-
 kernel/exit.c                        |   3 +-
 kernel/livepatch/transition.c        |   1 -
 kernel/seccomp.c                     |   1 -
 kernel/signal.c                      |  23 ++--
 kernel/task_work.c                   |   4 +-
 kernel/time/posix-cpu-timers.c       |   1 +
 mm/memcontrol.c                      |   2 +-
 security/apparmor/domain.c           |   1 -
 security/selinux/hooks.c             |   1 -
 84 files changed, 317 insertions(+), 462 deletions(-)

Eric
Re: [PATCH 00/13] Removing tracehook.h
Posted by Linus Torvalds 4 years, 3 months ago
On Tue, Mar 8, 2022 at 4:16 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> While working on cleaning up do_exit I have been having to deal with the
> code in tracehook.h.  Unfortunately the code in tracehook.h does not
> make sense as organized. [...]

Thanks, that odd naming has tripped me up several times, this looks
like an improvement.

                  Linus