[PATCH 00/13] Removing tracehook.h

Eric W. Biederman posted 13 patches 4 years, 3 months ago
There is a newer version of this series
MAINTAINERS                          |   1 -
arch/Kconfig                         |   5 +-
arch/alpha/kernel/ptrace.c           |   5 +-
arch/alpha/kernel/signal.c           |   4 +-
arch/arc/kernel/ptrace.c             |   5 +-
arch/arc/kernel/signal.c             |   4 +-
arch/arm/kernel/ptrace.c             |  12 +-
arch/arm/kernel/signal.c             |   4 +-
arch/arm64/kernel/ptrace.c           |  14 +--
arch/arm64/kernel/signal.c           |   4 +-
arch/csky/kernel/ptrace.c            |   5 +-
arch/csky/kernel/signal.c            |   4 +-
arch/h8300/kernel/ptrace.c           |   5 +-
arch/h8300/kernel/signal.c           |   4 +-
arch/hexagon/kernel/process.c        |   4 +-
arch/hexagon/kernel/signal.c         |   1 -
arch/hexagon/kernel/traps.c          |   6 +-
arch/ia64/kernel/process.c           |   4 +-
arch/ia64/kernel/ptrace.c            |   6 +-
arch/ia64/kernel/signal.c            |   1 -
arch/m68k/kernel/ptrace.c            |   6 +-
arch/m68k/kernel/signal.c            |   4 +-
arch/microblaze/kernel/ptrace.c      |   5 +-
arch/microblaze/kernel/signal.c      |   4 +-
arch/mips/kernel/ptrace.c            |   5 +-
arch/mips/kernel/signal.c            |   4 +-
arch/nds32/include/asm/syscall.h     |   2 +-
arch/nds32/kernel/ptrace.c           |   5 +-
arch/nds32/kernel/signal.c           |   4 +-
arch/nios2/kernel/ptrace.c           |   5 +-
arch/nios2/kernel/signal.c           |   4 +-
arch/openrisc/kernel/ptrace.c        |   5 +-
arch/openrisc/kernel/signal.c        |   4 +-
arch/parisc/kernel/ptrace.c          |   7 +-
arch/parisc/kernel/signal.c          |   4 +-
arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
arch/powerpc/kernel/signal.c         |   4 +-
arch/riscv/kernel/ptrace.c           |   5 +-
arch/riscv/kernel/signal.c           |   4 +-
arch/s390/include/asm/entry-common.h |   1 -
arch/s390/kernel/ptrace.c            |   1 -
arch/s390/kernel/signal.c            |   5 +-
arch/sh/kernel/ptrace_32.c           |   5 +-
arch/sh/kernel/signal_32.c           |   4 +-
arch/sparc/kernel/ptrace_32.c        |   5 +-
arch/sparc/kernel/ptrace_64.c        |   5 +-
arch/sparc/kernel/signal32.c         |   1 -
arch/sparc/kernel/signal_32.c        |   4 +-
arch/sparc/kernel/signal_64.c        |   4 +-
arch/um/kernel/process.c             |   4 +-
arch/um/kernel/ptrace.c              |   5 +-
arch/x86/kernel/ptrace.c             |   1 -
arch/x86/kernel/signal.c             |   5 +-
arch/x86/mm/tlb.c                    |   1 +
arch/xtensa/kernel/ptrace.c          |   5 +-
arch/xtensa/kernel/signal.c          |   4 +-
block/blk-cgroup.c                   |   2 +-
fs/coredump.c                        |   1 -
fs/exec.c                            |   1 -
fs/io-wq.c                           |   6 +-
fs/io_uring.c                        |  11 +-
fs/proc/array.c                      |   1 -
fs/proc/base.c                       |   1 -
include/asm-generic/syscall.h        |   2 +-
include/linux/entry-common.h         |  47 +-------
include/linux/entry-kvm.h            |   2 +-
include/linux/posix-timers.h         |   1 -
include/linux/ptrace.h               |  78 ++++++++++++
include/linux/resume_user_mode.h     |  64 ++++++++++
include/linux/sched/signal.h         |  17 +++
include/linux/task_work.h            |   5 +
include/linux/tracehook.h            | 226 -----------------------------------
include/uapi/linux/ptrace.h          |   2 +-
kernel/entry/common.c                |  19 +--
kernel/entry/kvm.c                   |   9 +-
kernel/exit.c                        |   3 +-
kernel/livepatch/transition.c        |   1 -
kernel/seccomp.c                     |   1 -
kernel/signal.c                      |  23 ++--
kernel/task_work.c                   |   4 +-
kernel/time/posix-cpu-timers.c       |   1 +
mm/memcontrol.c                      |   2 +-
security/apparmor/domain.c           |   1 -
security/selinux/hooks.c             |   1 -
84 files changed, 317 insertions(+), 462 deletions(-)
[PATCH 00/13] Removing tracehook.h
Posted by Eric W. Biederman 4 years, 3 months ago
While working on cleaning up do_exit I have been having to deal with the
code in tracehook.h.  Unfortunately the code in tracehook.h does not
make sense as organized.

This set of changes reorganizes things so that tracehook.h no longer
exists, and so that it's current contents are organized in a fashion
that is a little easier to understand.

The biggest change is that I lean into the fact that get_signal
always calls task_work_run and removes the logic that tried to
be smart and decouple task_work_run and get_signal as it has proven
to not be effective.

This is a conservative change and I am not changing the how things
like signal_pending operate (although it is probably justified).

A new header resume_user_mode.h is added to hold resume_user_mode_work
which was previously known as tracehook_notify_resume.

Eric W. Biederman (13):
      ptrace: Move ptrace_report_syscall into ptrace.h
      ptrace/arm: Rename tracehook_report_syscall report_syscall
      ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
      ptrace: Remove arch_syscall_{enter,exit}_tracehook
      ptrace: Remove tracehook_signal_handler
      task_work: Remove unnecessary include from posix_timers.h
      task_work: Introduce task_work_pending
      task_work: Call tracehook_notify_signal from get_signal on all architectures
      task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
      signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
      resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
      resume_user_mode: Move to resume_user_mode.h
      tracehook: Remove tracehook.h

 MAINTAINERS                          |   1 -
 arch/Kconfig                         |   5 +-
 arch/alpha/kernel/ptrace.c           |   5 +-
 arch/alpha/kernel/signal.c           |   4 +-
 arch/arc/kernel/ptrace.c             |   5 +-
 arch/arc/kernel/signal.c             |   4 +-
 arch/arm/kernel/ptrace.c             |  12 +-
 arch/arm/kernel/signal.c             |   4 +-
 arch/arm64/kernel/ptrace.c           |  14 +--
 arch/arm64/kernel/signal.c           |   4 +-
 arch/csky/kernel/ptrace.c            |   5 +-
 arch/csky/kernel/signal.c            |   4 +-
 arch/h8300/kernel/ptrace.c           |   5 +-
 arch/h8300/kernel/signal.c           |   4 +-
 arch/hexagon/kernel/process.c        |   4 +-
 arch/hexagon/kernel/signal.c         |   1 -
 arch/hexagon/kernel/traps.c          |   6 +-
 arch/ia64/kernel/process.c           |   4 +-
 arch/ia64/kernel/ptrace.c            |   6 +-
 arch/ia64/kernel/signal.c            |   1 -
 arch/m68k/kernel/ptrace.c            |   6 +-
 arch/m68k/kernel/signal.c            |   4 +-
 arch/microblaze/kernel/ptrace.c      |   5 +-
 arch/microblaze/kernel/signal.c      |   4 +-
 arch/mips/kernel/ptrace.c            |   5 +-
 arch/mips/kernel/signal.c            |   4 +-
 arch/nds32/include/asm/syscall.h     |   2 +-
 arch/nds32/kernel/ptrace.c           |   5 +-
 arch/nds32/kernel/signal.c           |   4 +-
 arch/nios2/kernel/ptrace.c           |   5 +-
 arch/nios2/kernel/signal.c           |   4 +-
 arch/openrisc/kernel/ptrace.c        |   5 +-
 arch/openrisc/kernel/signal.c        |   4 +-
 arch/parisc/kernel/ptrace.c          |   7 +-
 arch/parisc/kernel/signal.c          |   4 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
 arch/powerpc/kernel/signal.c         |   4 +-
 arch/riscv/kernel/ptrace.c           |   5 +-
 arch/riscv/kernel/signal.c           |   4 +-
 arch/s390/include/asm/entry-common.h |   1 -
 arch/s390/kernel/ptrace.c            |   1 -
 arch/s390/kernel/signal.c            |   5 +-
 arch/sh/kernel/ptrace_32.c           |   5 +-
 arch/sh/kernel/signal_32.c           |   4 +-
 arch/sparc/kernel/ptrace_32.c        |   5 +-
 arch/sparc/kernel/ptrace_64.c        |   5 +-
 arch/sparc/kernel/signal32.c         |   1 -
 arch/sparc/kernel/signal_32.c        |   4 +-
 arch/sparc/kernel/signal_64.c        |   4 +-
 arch/um/kernel/process.c             |   4 +-
 arch/um/kernel/ptrace.c              |   5 +-
 arch/x86/kernel/ptrace.c             |   1 -
 arch/x86/kernel/signal.c             |   5 +-
 arch/x86/mm/tlb.c                    |   1 +
 arch/xtensa/kernel/ptrace.c          |   5 +-
 arch/xtensa/kernel/signal.c          |   4 +-
 block/blk-cgroup.c                   |   2 +-
 fs/coredump.c                        |   1 -
 fs/exec.c                            |   1 -
 fs/io-wq.c                           |   6 +-
 fs/io_uring.c                        |  11 +-
 fs/proc/array.c                      |   1 -
 fs/proc/base.c                       |   1 -
 include/asm-generic/syscall.h        |   2 +-
 include/linux/entry-common.h         |  47 +-------
 include/linux/entry-kvm.h            |   2 +-
 include/linux/posix-timers.h         |   1 -
 include/linux/ptrace.h               |  78 ++++++++++++
 include/linux/resume_user_mode.h     |  64 ++++++++++
 include/linux/sched/signal.h         |  17 +++
 include/linux/task_work.h            |   5 +
 include/linux/tracehook.h            | 226 -----------------------------------
 include/uapi/linux/ptrace.h          |   2 +-
 kernel/entry/common.c                |  19 +--
 kernel/entry/kvm.c                   |   9 +-
 kernel/exit.c                        |   3 +-
 kernel/livepatch/transition.c        |   1 -
 kernel/seccomp.c                     |   1 -
 kernel/signal.c                      |  23 ++--
 kernel/task_work.c                   |   4 +-
 kernel/time/posix-cpu-timers.c       |   1 +
 mm/memcontrol.c                      |   2 +-
 security/apparmor/domain.c           |   1 -
 security/selinux/hooks.c             |   1 -
 84 files changed, 317 insertions(+), 462 deletions(-)

Eric
Re: [PATCH 00/13] Removing tracehook.h
Posted by Jens Axboe 4 years, 3 months ago
On 3/8/22 5:13 PM, Eric W. Biederman wrote:
> 
> While working on cleaning up do_exit I have been having to deal with the
> code in tracehook.h.  Unfortunately the code in tracehook.h does not
> make sense as organized.
> 
> This set of changes reorganizes things so that tracehook.h no longer
> exists, and so that it's current contents are organized in a fashion
> that is a little easier to understand.
> 
> The biggest change is that I lean into the fact that get_signal
> always calls task_work_run and removes the logic that tried to
> be smart and decouple task_work_run and get_signal as it has proven
> to not be effective.
> 
> This is a conservative change and I am not changing the how things
> like signal_pending operate (although it is probably justified).
> 
> A new header resume_user_mode.h is added to hold resume_user_mode_work
> which was previously known as tracehook_notify_resume.

This is a nice cleanup. I did bother me adding the TIF_NOTIFY_SIGNAL
bits and work hooks to something unrelated, but that's where other
things resided then. This makes it a lot better.

-- 
Jens Axboe
[PATCH 0/2] ptrace: Making the ptrace changes atomic
Posted by Eric W. Biederman 4 years, 3 months ago
While working on cleaning up the exit path it have had occasion to look
at what guarantees are provided for setting and reading the fields that
are provided in task_struct for ptraces.  Namely exit_code,
ptrace_message, and last_siginfo. It turns out as the ptrace interface
in the kernel was extended in the kernel the old existing interfaces
in the kernel were just wrapped and not properly updated to handle
the new functionality.  This lead to races and inconsistencies.

This fixes the reason for the races and inconsistencies by moving the
work of maintaining the ptrace fields into ptrace_stop.

The inconsistency that results in some ptrace_stop points continuing
with a signal while others will not I have left alone as it appears to
be part of our userspace ABI, and changing that risks breaking
userspace.

Eric W. Biederman (2):
      ptrace: Move setting/clearing ptrace_message into ptrace_stop
      ptrace: Return the signal to continue with from ptrace_stop

 include/linux/ptrace.h      | 17 +++++++----------
 include/uapi/linux/ptrace.h |  2 +-
 kernel/signal.c             | 40 ++++++++++++++++++++++++----------------
 3 files changed, 32 insertions(+), 27 deletions(-)

Eric
[GIT PULL] ptrace: Cleanups for v5.18
Posted by Eric W. Biederman 4 years, 2 months ago
Linus,

Please pull the ptrace-cleanups-for-v5.18 tag from the git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18

  HEAD: dcbc65aac28360df5f5a3b613043ccc0e81da3cf ptrace: Remove duplicated include in ptrace.c

This set of changes removes tracehook.h, moves modification of all of
the ptrace fields inside of siglock to remove races, adds a missing
permission check to ptrace.c

The removal of tracehook.h is quite significant as it has been a major
source of confusion in recent years.  Much of that confusion was
around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
making the semantics clearer).

For people who don't know tracehook.h is a vestiage of an attempt to
implement uprobes like functionality that was never fully merged, and
was later superseeded by uprobes when uprobes was merged.  For many
years now we have been removing what tracehook functionaly a little
bit at a time.  To the point where now anything left in tracehook.h is
some weird strange thing that is difficult to understand.

Eric W. Biederman (15):
      ptrace: Move ptrace_report_syscall into ptrace.h
      ptrace/arm: Rename tracehook_report_syscall report_syscall
      ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
      ptrace: Remove arch_syscall_{enter,exit}_tracehook
      ptrace: Remove tracehook_signal_handler
      task_work: Remove unnecessary include from posix_timers.h
      task_work: Introduce task_work_pending
      task_work: Call tracehook_notify_signal from get_signal on all architectures
      task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
      signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
      resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
      resume_user_mode: Move to resume_user_mode.h
      tracehook: Remove tracehook.h
      ptrace: Move setting/clearing ptrace_message into ptrace_stop
      ptrace: Return the signal to continue with from ptrace_stop

Jann Horn (1):
      ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE

Yang Li (1):
      ptrace: Remove duplicated include in ptrace.c


 MAINTAINERS                          |   1 -
 arch/Kconfig                         |   5 +-
 arch/alpha/kernel/ptrace.c           |   5 +-
 arch/alpha/kernel/signal.c           |   4 +-
 arch/arc/kernel/ptrace.c             |   5 +-
 arch/arc/kernel/signal.c             |   4 +-
 arch/arm/kernel/ptrace.c             |  12 +-
 arch/arm/kernel/signal.c             |   4 +-
 arch/arm64/kernel/ptrace.c           |  14 +--
 arch/arm64/kernel/signal.c           |   4 +-
 arch/csky/kernel/ptrace.c            |   5 +-
 arch/csky/kernel/signal.c            |   4 +-
 arch/h8300/kernel/ptrace.c           |   5 +-
 arch/h8300/kernel/signal.c           |   4 +-
 arch/hexagon/kernel/process.c        |   4 +-
 arch/hexagon/kernel/signal.c         |   1 -
 arch/hexagon/kernel/traps.c          |   6 +-
 arch/ia64/kernel/process.c           |   4 +-
 arch/ia64/kernel/ptrace.c            |   6 +-
 arch/ia64/kernel/signal.c            |   1 -
 arch/m68k/kernel/ptrace.c            |   5 +-
 arch/m68k/kernel/signal.c            |   4 +-
 arch/microblaze/kernel/ptrace.c      |   5 +-
 arch/microblaze/kernel/signal.c      |   4 +-
 arch/mips/kernel/ptrace.c            |   5 +-
 arch/mips/kernel/signal.c            |   4 +-
 arch/nds32/include/asm/syscall.h     |   2 +-
 arch/nds32/kernel/ptrace.c           |   5 +-
 arch/nds32/kernel/signal.c           |   4 +-
 arch/nios2/kernel/ptrace.c           |   5 +-
 arch/nios2/kernel/signal.c           |   4 +-
 arch/openrisc/kernel/ptrace.c        |   5 +-
 arch/openrisc/kernel/signal.c        |   4 +-
 arch/parisc/kernel/ptrace.c          |   7 +-
 arch/parisc/kernel/signal.c          |   4 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   8 +-
 arch/powerpc/kernel/signal.c         |   4 +-
 arch/riscv/kernel/ptrace.c           |   5 +-
 arch/riscv/kernel/signal.c           |   4 +-
 arch/s390/include/asm/entry-common.h |   1 -
 arch/s390/kernel/ptrace.c            |   1 -
 arch/s390/kernel/signal.c            |   5 +-
 arch/sh/kernel/ptrace_32.c           |   5 +-
 arch/sh/kernel/signal_32.c           |   4 +-
 arch/sparc/kernel/ptrace_32.c        |   5 +-
 arch/sparc/kernel/ptrace_64.c        |   5 +-
 arch/sparc/kernel/signal32.c         |   1 -
 arch/sparc/kernel/signal_32.c        |   4 +-
 arch/sparc/kernel/signal_64.c        |   4 +-
 arch/um/kernel/process.c             |   4 +-
 arch/um/kernel/ptrace.c              |   5 +-
 arch/x86/kernel/ptrace.c             |   1 -
 arch/x86/kernel/signal.c             |   5 +-
 arch/x86/mm/tlb.c                    |   1 +
 arch/xtensa/kernel/ptrace.c          |   5 +-
 arch/xtensa/kernel/signal.c          |   4 +-
 block/blk-cgroup.c                   |   2 +-
 fs/coredump.c                        |   1 -
 fs/exec.c                            |   1 -
 fs/io-wq.c                           |   6 +-
 fs/io_uring.c                        |  11 +-
 fs/proc/array.c                      |   1 -
 fs/proc/base.c                       |   1 -
 include/asm-generic/syscall.h        |   2 +-
 include/linux/entry-common.h         |  47 +-------
 include/linux/entry-kvm.h            |   2 +-
 include/linux/posix-timers.h         |   1 -
 include/linux/ptrace.h               |  81 ++++++++++++-
 include/linux/resume_user_mode.h     |  64 ++++++++++
 include/linux/sched/signal.h         |  17 +++
 include/linux/task_work.h            |   5 +
 include/linux/tracehook.h            | 226 -----------------------------------
 include/uapi/linux/ptrace.h          |   2 +-
 kernel/entry/common.c                |  19 +--
 kernel/entry/kvm.c                   |   9 +-
 kernel/exit.c                        |   3 +-
 kernel/livepatch/transition.c        |   1 -
 kernel/ptrace.c                      |  47 +++++---
 kernel/seccomp.c                     |   1 -
 kernel/signal.c                      |  62 +++++-----
 kernel/task_work.c                   |   4 +-
 kernel/time/posix-cpu-timers.c       |   1 +
 mm/memcontrol.c                      |   2 +-
 security/apparmor/domain.c           |   1 -
 security/selinux/hooks.c             |   1 -
 85 files changed, 372 insertions(+), 495 deletions(-)

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Jens Axboe 4 years, 2 months ago
On 3/28/22 5:56 PM, Eric W. Biederman wrote:
> 
> Linus,
> 
> Please pull the ptrace-cleanups-for-v5.18 tag from the git tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18
> 
>   HEAD: dcbc65aac28360df5f5a3b613043ccc0e81da3cf ptrace: Remove duplicated include in ptrace.c
> 
> This set of changes removes tracehook.h, moves modification of all of
> the ptrace fields inside of siglock to remove races, adds a missing
> permission check to ptrace.c
> 
> The removal of tracehook.h is quite significant as it has been a major
> source of confusion in recent years.  Much of that confusion was
> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> making the semantics clearer).
> 
> For people who don't know tracehook.h is a vestiage of an attempt to
> implement uprobes like functionality that was never fully merged, and
> was later superseeded by uprobes when uprobes was merged.  For many
> years now we have been removing what tracehook functionaly a little
> bit at a time.  To the point where now anything left in tracehook.h is
> some weird strange thing that is difficult to understand.

FWIW, the notify/task_work/io_uring changes look good to me. Thanks for
cleaning this up, Eric.

-- 
Jens Axboe
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 2 months ago
On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> The removal of tracehook.h is quite significant as it has been a major
> source of confusion in recent years.  Much of that confusion was
> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> making the semantics clearer).

Hmm. I love removing tracehook.c, but this looks like it hasn't been
in linux-next.

The header file changes messes with other changes, and we have

  kernel/sched/fair.c:2884:9: error: implicit declaration of function
‘init_task_work’; did you mean ‘init_irq_work’?
[-Werror=implicit-function-declaration]
   2884 |         init_task_work(&p->numa_work, task_numa_work);
        |         ^~~~~~~~~~~~~~

as a result (also a few other things in that same file).

Now, this is trivial to fix - just add an include for
<linux/task_work.h> from that file - and that's the right thing to do
anyway.

But I'm a bit unhappy that this was either not tested in linux-next,
or if it was, I wasn't notified about the semantic in the pull
request.

So I've pulled this, and fixed up things in my merge, but I'm a bit
worried that there might be other situations like this where some
header file is no longer included and it was included implicitly
before through that disgusting tracehook.h header..

I *hope* it was just the scheduler header file updates that ended up
having this effect, and nothing else is affected.

Let's see if the test robots start complaining about non-x86
architecture-specific stuff that I don't build test.

              Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Stephen Rothwell 4 years, 2 months ago
Hi Linus,

On Mon, 28 Mar 2022 17:33:52 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >
> > The removal of tracehook.h is quite significant as it has been a major
> > source of confusion in recent years.  Much of that confusion was
> > around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
> > making the semantics clearer).  
> 
> Hmm. I love removing tracehook.c, but this looks like it hasn't been
> in linux-next.

See https://lore.kernel.org/lkml/20220316165612.4f50faad@canb.auug.org.au/

-- 
Cheers,
Stephen Rothwell
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 2 months ago
On Mon, Mar 28, 2022 at 5:53 PM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> See https://lore.kernel.org/lkml/20220316165612.4f50faad@canb.auug.org.au/

Ok, so it was known, just not reported to me.

And the good news seems to be that at least there isn't anything
hiding elsewhere.

                Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Eric W. Biederman 4 years, 2 months ago
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Mar 28, 2022 at 4:56 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> The removal of tracehook.h is quite significant as it has been a major
>> source of confusion in recent years.  Much of that confusion was
>> around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled
>> making the semantics clearer).
>
> Hmm. I love removing tracehook.c, but this looks like it hasn't been
> in linux-next.
>
> The header file changes messes with other changes, and we have
>
>   kernel/sched/fair.c:2884:9: error: implicit declaration of function
> ‘init_task_work’; did you mean ‘init_irq_work’?
> [-Werror=implicit-function-declaration]
>    2884 |         init_task_work(&p->numa_work, task_numa_work);
>         |         ^~~~~~~~~~~~~~
>
> as a result (also a few other things in that same file).
>
> Now, this is trivial to fix - just add an include for
> <linux/task_work.h> from that file - and that's the right thing to do
> anyway.
>
> But I'm a bit unhappy that this was either not tested in linux-next,
> or if it was, I wasn't notified about the semantic in the pull
> request.
>
> So I've pulled this, and fixed up things in my merge, but I'm a bit
> worried that there might be other situations like this where some
> header file is no longer included and it was included implicitly
> before through that disgusting tracehook.h header..
>
> I *hope* it was just the scheduler header file updates that ended up
> having this effect, and nothing else is affected.
>
> Let's see if the test robots start complaining about non-x86
> architecture-specific stuff that I don't build test.

Sorry for not mentioning that.  I had tracked it down.  It was
fundamentally in the scheduler headers changes removing an include of
task_work.h, so it didn't feel like there was anything I could do in my
tree.  I asked Ingo if he could fix his tree and unfortunately forgot
about it.

For the record there were also a couple of other pretty trivial
conflicts, the removal of nds32, some block_cgroup header where
an adjacent line was modified to what I was changing.  But it thankfully
looks like none of those caused you any problems.

Sorry about all of that I am about that.  I am running pretty weak this
last couple of days as a cold has been running through the household.


Dumb question because this seems to burning a few extra creativity
points.  Is there any way to create a signed tag and a branch with the
same name?  Or in general is there a good way to manage topic branches
and then tag them at the end before I send them?

Having a tag and a branch with the same name seems to completely confuse
git and it just tells me no I won't push anything to another git tree,
because what you are asking me to do is ambiguous.  So now I am having
to come up with two names for each topic branch, even if I only push the
tags upstream.

I feel like there is a best practice on how to manage tags and topic
branches and I just haven't seen it yet.

Eric

Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 2 months ago
On Mon, Mar 28, 2022 at 8:38 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Dumb question because this seems to burning a few extra creativity
> points.  Is there any way to create a signed tag and a branch with the
> same name?

Oh, absolutely.

But you may find it annoying, because as you noticed:

> Having a tag and a branch with the same name seems to completely confuse
> git and it just tells me no I won't push anything to another git tree,
> because what you are asking me to do is ambiguous.

Not at all.

git is not at all confused by the situation, git is in fact very aware
of it indeed.

But as git then tells you, exactly *because* it is aware that you have
picked the same name for both a branch and a tag, it will keep warning
you about the ambiguity of said name (but after warning, will
generally then preferentially use the tag of that name over the branch
of the same name).

So if you have both a branch and a tag named 'xyz', you generally need
to disambiguate them when you use them. That will make git happy,
because now it's not ambiguous any more.

(Technical detail: some git operations work on specific namespaces, so
"git branch -d xyz" should never remove a _tag_ called 'xyz', and as
such the name isn't ambiguous in the context of that git command)

And disambiguating them is simple, but I suspect you'll find it's
annoying enough that you simply don't want to use the same name for
tags and branches.

The full name of a tag 'x' is actually 'refs/tags/x', and the full
unambiguous name of a branch 'x' is 'refs/heads/x'.

So technically a tag and a branch can never _really_ have the same
name, because internally they have longer unambiguous names.

You would almost never actually use that full name, it's mostly an
internal git thing. Because even if you have ambiguous branch and tag
names, you'd then still shorten it to just 'tags/x' and 'heads/x'
respectively.

Git has other "namespaces" for these refs, too, notably
'refs/remotes/'. In fact, I guess you could make up your own namespace
too if you really wanted, although I don't think anybody really has
ever had a good reason for it.

(There is also the non-namespaced special refs like HEAD and
FETCH_HEAD and MERGE_HEAD for "current branch", "what you fetched" and
"what you are merging")

So you *can* do this:

    # create and check out the branch 'xyz'
    $ git checkout -b xyz master

    # create a tag called 'xyz', but to confuse everybody, make it
point somewhere else
    $ git tag xyz master~3

    # look at what 'xyz' means:
    $ git rev-parse xyz
    warning: refname 'xyz' is ambiguous.
    cffb2b72d3ed47f5093d128bd44d9ce136b6b5af

    # Hmm. it warned about it being ambiguous
    $ git rev-parse heads/xyz
    1930a6e739c4b4a654a69164dbe39e554d228915

    # Ok, it clearly took the tag, not the branch:
    $ git rev-parse tags/xyz
    cffb2b72d3ed47f5093d128bd44d9ce136b6b5af

so as you can see, yes, you can work with a tag and a branch that have
the same 'short name', but having to disambiguate them all the time
will likely just drive you mad.

And it's worth pointing out that the name does not imply a
relationship. So the branch called 'xyz' (ie refs/heads/xyz) has
absolutely *no* relationship to the tag called 'xyz' (ie
refs/tags/xyz) except for that ambiguous short name. So updating the
branch - by perhaps committing more to it - will in no way affect the
tag.

Also note that branches and tags are both "just refs" as far as git is
concerned, but git *does* give some semantic meaning to the
namespaces.

So the branch namespace (it those 'refs/heads/*' things) are things
that get updated automatically as you make new commits.

In contrast, the tag namespace is something you *can* update, but it's
considered odd, and if you want to overwrite an existing tag you
generally need to do something special (eg "git tag -f" to force
overwriting a new tag rather than telling you that you already have
one).

So in a very real way, to git a ref is a ref is a ref, with very
little to no real *technical* distinction. They are just ways to point
to the SHA1 hash of an object. But there are basically some common
semantic rules that are based on the namespaces, and all those git
operations that use certain namespaces by default.

            Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by Linus Torvalds 4 years, 2 months ago
On Mon, Mar 28, 2022 at 9:49 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So if you have both a branch and a tag named 'xyz', you generally need
> to disambiguate them when you use them. That will make git happy,
> because now it's not ambiguous any more.

On a similar but very different issue: this is not the only kind of
naming ambiguity you can have.

For example, try this insane thing:

    git tag Makefile

that creates a tag called 'Makefile' (pointing to your current HEAD, of course).

Now, guess what happens when you then do

    git log Makefile

that's right - git once again notices that you are doing something ambiguous.

In fact, git will be *so* unhappy about this kind of ambiguity that
(unlike the 'tag vs branch' case) it will not prefer one version of
reality over the other, and simply consider this to be a fatal error,
and say

    fatal: ambiguous argument 'Makefile': both revision and filename
    Use '--' to separate paths from revisions, like this:
    'git <command> [<revision>...] -- [<file>...]'

and as a result you will hopefully go "Oh, I didn't mean to have that
tag at all" and just remove the bogus tagname.

Because you probably made it by mistake.

But you *can* choose to say

   git log Makefile --

or

   git log refs/tags/Makefile

to make it clear that no, 'Makefile' is not the pathname in the
repository, you really mean the tag 'Makefile'.

Or use

  git log -- Makefile

or

    git log ./Makefile

to say "yeah, I meant the _pathname_ 'Makefile', not the tag".

Or, if you just like playing games, do

    git log Makefile -- Makefile

if you want to track the history of the path 'Makefile' starting at
the tag 'Makefile'.

But don't actually do this for real. There's a reason git notices these things.

Using ambiguous branch names (whether ambiguous with tag-names or with
filenames) is a pain that is not worth it in practice. Git will
notice, and git will allow you to work around it, but it's just a
BadIdea(tm) despite that.

But you probably want to be aware of issues like this if you script
things around git, though.

That "--" form in particular is generally the one you want to use for
scripting, if you want to allow people to do anything they damn well
please. It's the "Give them rope" syntax.

Of course, a much more common reason for "--" for pathname separation,
and one you may already be familiar with, is that you want to see the
history of a pathname that is not *currently* a pathname, but was one
at some point in the past.

So in my current kernel tree, doing

    $ git log arch/nds32/

will actually cause an error, because of "unknown revision or path not
in the working tree".

But if you really want to see the history of that now-deleted
architecture, you do

    $ git log -- arch/nds32/

and it's all good.

This concludes today's sermon on git,

             Linus
Re: [GIT PULL] ptrace: Cleanups for v5.18
Posted by pr-tracker-bot@kernel.org 4 years, 2 months ago
The pull request you sent on Mon, 28 Mar 2022 18:56:46 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git ptrace-cleanups-for-v5.18

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1930a6e739c4b4a654a69164dbe39e554d228915

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago

Today ptrace_message is easy to overlook as it not a core part of
ptrace_stop.  It has been overlooked so much that there are places
that set ptrace_message and don't clear it, and places that never set
it.  So if you get an unlucky sequence of events the ptracer may be
able to read a ptrace_message that does not apply to the current
ptrace stop.

Move setting of ptrace_message into ptrace_stop so that it always gets
set before the stop, and always gets cleared after the stop.  This
prevents non-sense from being reported to userspace and makes
ptrace_message more visible in the ptrace helper functions so that
kernel developers can see it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/ptrace.h      |  9 +++------
 include/uapi/linux/ptrace.h |  2 +-
 kernel/signal.c             | 21 ++++++++++++---------
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 5310f43e4762..3e6b46e2b7be 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
 extern void ptrace_disable(struct task_struct *);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
-extern void ptrace_notify(int exit_code);
+extern void ptrace_notify(int exit_code, unsigned long message);
 extern void __ptrace_link(struct task_struct *child,
 			  struct task_struct *new_parent,
 			  const struct cred *ptracer_cred);
@@ -155,8 +155,7 @@ static inline bool ptrace_event_enabled(struct task_struct *task, int event)
 static inline void ptrace_event(int event, unsigned long message)
 {
 	if (unlikely(ptrace_event_enabled(current, event))) {
-		current->ptrace_message = message;
-		ptrace_notify((event << 8) | SIGTRAP);
+		ptrace_notify((event << 8) | SIGTRAP, message);
 	} else if (event == PTRACE_EVENT_EXEC) {
 		/* legacy EXEC report via SIGTRAP */
 		if ((current->ptrace & (PT_PTRACED|PT_SEIZED)) == PT_PTRACED)
@@ -424,8 +423,7 @@ static inline int ptrace_report_syscall(unsigned long message)
 	if (!(ptrace & PT_PTRACED))
 		return 0;
 
-	current->ptrace_message = message;
-	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
+	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0), message);
 
 	/*
 	 * this isn't the same as continuing with a signal, but it will do
@@ -437,7 +435,6 @@ static inline int ptrace_report_syscall(unsigned long message)
 		current->exit_code = 0;
 	}
 
-	current->ptrace_message = 0;
 	return fatal_signal_pending(current);
 }
 
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index b7af92e07d1f..195ae64a8c87 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -114,7 +114,7 @@ struct ptrace_rseq_configuration {
 
 /*
  * These values are stored in task->ptrace_message
- * by ptrace_report_syscall_* to describe the current syscall-stop.
+ * by ptrace_stop to describe the current syscall-stop.
  */
 #define PTRACE_EVENTMSG_SYSCALL_ENTRY	1
 #define PTRACE_EVENTMSG_SYSCALL_EXIT	2
diff --git a/kernel/signal.c b/kernel/signal.c
index c2dee5420567..a49ac7149256 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2191,7 +2191,8 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  * If we actually decide not to stop at all because the tracer
  * is gone, we keep current->exit_code unless clear_code.
  */
-static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t *info)
+static void ptrace_stop(int exit_code, int why, int clear_code,
+			unsigned long message, kernel_siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
@@ -2237,6 +2238,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	 */
 	smp_wmb();
 
+	current->ptrace_message = message;
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
 
@@ -2315,6 +2317,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	 */
 	spin_lock_irq(&current->sighand->siglock);
 	current->last_siginfo = NULL;
+	current->ptrace_message = 0;
 
 	/* LISTENING can be set only during STOP traps, clear it */
 	current->jobctl &= ~JOBCTL_LISTENING;
@@ -2327,7 +2330,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 	recalc_sigpending_tsk(current);
 }
 
-static void ptrace_do_notify(int signr, int exit_code, int why)
+static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
 {
 	kernel_siginfo_t info;
 
@@ -2338,17 +2341,17 @@ static void ptrace_do_notify(int signr, int exit_code, int why)
 	info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
 
 	/* Let the debugger run.  */
-	ptrace_stop(exit_code, why, 1, &info);
+	ptrace_stop(exit_code, why, 1, message, &info);
 }
 
-void ptrace_notify(int exit_code)
+void ptrace_notify(int exit_code, unsigned long message)
 {
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
+	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
@@ -2504,10 +2507,10 @@ static void do_jobctl_trap(void)
 			signr = SIGTRAP;
 		WARN_ON_ONCE(!signr);
 		ptrace_do_notify(signr, signr | (PTRACE_EVENT_STOP << 8),
-				 CLD_STOPPED);
+				 CLD_STOPPED, 0);
 	} else {
 		WARN_ON_ONCE(!signr);
-		ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+		ptrace_stop(signr, CLD_STOPPED, 0, 0, NULL);
 		current->exit_code = 0;
 	}
 }
@@ -2561,7 +2564,7 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	 * comment in dequeue_signal().
 	 */
 	current->jobctl |= JOBCTL_STOP_DEQUEUED;
-	ptrace_stop(signr, CLD_TRAPPED, 0, info);
+	ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
 	signr = current->exit_code;
@@ -2891,7 +2894,7 @@ static void signal_delivered(struct ksignal *ksig, int stepping)
 	if (current->sas_ss_flags & SS_AUTODISARM)
 		sas_ss_reset(current);
 	if (stepping)
-		ptrace_notify(SIGTRAP);
+		ptrace_notify(SIGTRAP, 0);
 }
 
 void signal_setup_done(int failed, struct ksignal *ksig, int stepping)
-- 
2.29.2
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
On 03/15, Eric W. Biederman wrote:
>
> there are places
> that set ptrace_message and don't clear it, and places that never set
> it.

Yes, I too never understood this.

So I obviously like this change. The only problem (as usual) is that we
can never know if something depends on this old (and strange) behaviour.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
> 
> Today ptrace_message is easy to overlook as it not a core part of
> ptrace_stop.  It has been overlooked so much that there are places
> that set ptrace_message and don't clear it, and places that never set
> it.  So if you get an unlucky sequence of events the ptracer may be
> able to read a ptrace_message that does not apply to the current
> ptrace stop.
> 
> Move setting of ptrace_message into ptrace_stop so that it always gets
> set before the stop, and always gets cleared after the stop.  This
> prevents non-sense from being reported to userspace and makes
> ptrace_message more visible in the ptrace helper functions so that
> kernel developers can see it.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

This looks good to me. Did you happen to run the seccomp selftests
before/after these changes?

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Kees Cook <keescook@chromium.org> writes:

> On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
>> 
>> Today ptrace_message is easy to overlook as it not a core part of
>> ptrace_stop.  It has been overlooked so much that there are places
>> that set ptrace_message and don't clear it, and places that never set
>> it.  So if you get an unlucky sequence of events the ptracer may be
>> able to read a ptrace_message that does not apply to the current
>> ptrace stop.
>> 
>> Move setting of ptrace_message into ptrace_stop so that it always gets
>> set before the stop, and always gets cleared after the stop.  This
>> prevents non-sense from being reported to userspace and makes
>> ptrace_message more visible in the ptrace helper functions so that
>> kernel developers can see it.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> This looks good to me. Did you happen to run the seccomp selftests
> before/after these changes?

I did not.  This is a pure ptrace change.  Do you see a way that seccomp
could be involved?

Eric
Re: [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Fri, Mar 18, 2022 at 09:44:30AM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, Mar 15, 2022 at 06:21:08PM -0500, Eric W. Biederman wrote:
> >> 
> >> Today ptrace_message is easy to overlook as it not a core part of
> >> ptrace_stop.  It has been overlooked so much that there are places
> >> that set ptrace_message and don't clear it, and places that never set
> >> it.  So if you get an unlucky sequence of events the ptracer may be
> >> able to read a ptrace_message that does not apply to the current
> >> ptrace stop.
> >> 
> >> Move setting of ptrace_message into ptrace_stop so that it always gets
> >> set before the stop, and always gets cleared after the stop.  This
> >> prevents non-sense from being reported to userspace and makes
> >> ptrace_message more visible in the ptrace helper functions so that
> >> kernel developers can see it.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >
> > This looks good to me. Did you happen to run the seccomp selftests
> > before/after these changes?
> 
> I did not.  This is a pure ptrace change.  Do you see a way that seccomp
> could be involved?

Sorry, that wasn't clear: seccomp includes a number of ptrace tests as
well, especially involving handling process death, messages, and
signals. I'll give it a spin; so far it seems fine.

-- 
Kees Cook
[PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago

The signal a task should continue with after a ptrace stop is
inconsistently read, cleared, and sent.  Solve this by reading and
clearing the signal to be sent in ptrace_stop.

In an ideal world everything except ptrace_signal would share a common
implementation of continuing with the signal, so ptracers could count
on the signal they ask to continue with actually being delivered.  For
now retain bug compatibility and just return with the signal number
the ptracer requested the code continue with.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/ptrace.h | 12 ++++++------
 kernel/signal.c        | 31 ++++++++++++++++++-------------
 2 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 3e6b46e2b7be..15b3d176b6b4 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
 extern void ptrace_disable(struct task_struct *);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
-extern void ptrace_notify(int exit_code, unsigned long message);
+extern int ptrace_notify(int exit_code, unsigned long message);
 extern void __ptrace_link(struct task_struct *child,
 			  struct task_struct *new_parent,
 			  const struct cred *ptracer_cred);
@@ -419,21 +419,21 @@ extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oa
 static inline int ptrace_report_syscall(unsigned long message)
 {
 	int ptrace = current->ptrace;
+	int signr;
 
 	if (!(ptrace & PT_PTRACED))
 		return 0;
 
-	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0), message);
+	signr = ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0),
+			      message);
 
 	/*
 	 * this isn't the same as continuing with a signal, but it will do
 	 * for normal use.  strace only continues with a signal if the
 	 * stopping signal is not SIGTRAP.  -brl
 	 */
-	if (current->exit_code) {
-		send_sig(current->exit_code, current, 1);
-		current->exit_code = 0;
-	}
+	if (signr)
+		send_sig(signr, current, 1);
 
 	return fatal_signal_pending(current);
 }
diff --git a/kernel/signal.c b/kernel/signal.c
index a49ac7149256..e53ee84b9021 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2188,15 +2188,17 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  * That makes it a way to test a stopped process for
  * being ptrace-stopped vs being job-control-stopped.
  *
- * If we actually decide not to stop at all because the tracer
- * is gone, we keep current->exit_code unless clear_code.
+ * Returns the signal the ptracer requested the code resume
+ * with.  If the code did not stop because the tracer is gone,
+ * the stop signal remains unchanged unless clear_code.
  */
-static void ptrace_stop(int exit_code, int why, int clear_code,
+static int ptrace_stop(int exit_code, int why, int clear_code,
 			unsigned long message, kernel_siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
 	bool gstop_done = false;
+	bool read_code = true;
 
 	if (arch_ptrace_stop_needed()) {
 		/*
@@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 
 		/* tasklist protects us from ptrace_freeze_traced() */
 		__set_current_state(TASK_RUNNING);
+		read_code = false;
 		if (clear_code)
-			current->exit_code = 0;
+			exit_code = 0;
 		read_unlock(&tasklist_lock);
 	}
 
@@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 	 * any signal-sending on another CPU that wants to examine it.
 	 */
 	spin_lock_irq(&current->sighand->siglock);
+	if (read_code) exit_code = current->exit_code;
 	current->last_siginfo = NULL;
 	current->ptrace_message = 0;
+	current->exit_code = 0;
 
 	/* LISTENING can be set only during STOP traps, clear it */
 	current->jobctl &= ~JOBCTL_LISTENING;
@@ -2328,9 +2333,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
 	 * This sets TIF_SIGPENDING, but never clears it.
 	 */
 	recalc_sigpending_tsk(current);
+	return exit_code;
 }
 
-static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
+static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
 {
 	kernel_siginfo_t info;
 
@@ -2341,18 +2347,21 @@ static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long me
 	info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
 
 	/* Let the debugger run.  */
-	ptrace_stop(exit_code, why, 1, message, &info);
+	return ptrace_stop(exit_code, why, 1, message, &info);
 }
 
-void ptrace_notify(int exit_code, unsigned long message)
+int ptrace_notify(int exit_code, unsigned long message)
 {
+	int signr;
+
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
 
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
+	signr = ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED, message);
 	spin_unlock_irq(&current->sighand->siglock);
+	return signr;
 }
 
 /**
@@ -2511,7 +2520,6 @@ static void do_jobctl_trap(void)
 	} else {
 		WARN_ON_ONCE(!signr);
 		ptrace_stop(signr, CLD_STOPPED, 0, 0, NULL);
-		current->exit_code = 0;
 	}
 }
 
@@ -2564,15 +2572,12 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	 * comment in dequeue_signal().
 	 */
 	current->jobctl |= JOBCTL_STOP_DEQUEUED;
-	ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
+	signr = ptrace_stop(signr, CLD_TRAPPED, 0, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
-	signr = current->exit_code;
 	if (signr == 0)
 		return signr;
 
-	current->exit_code = 0;
-
 	/*
 	 * Update the siginfo structure if the signal has
 	 * changed.  If the debugger wanted something
-- 
2.29.2
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
Not sure I understand this patch, I can't apply it. I guess I need to
clone your tree first, will do later.

Just one question right now,

On 03/15, Eric W. Biederman wrote:
>
> +static int ptrace_stop(int exit_code, int why, int clear_code,
>  			unsigned long message, kernel_siginfo_t *info)
>  	__releases(&current->sighand->siglock)
>  	__acquires(&current->sighand->siglock)
>  {
>  	bool gstop_done = false;
> +	bool read_code = true;
>  
>  	if (arch_ptrace_stop_needed()) {
>  		/*
> @@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>  
>  		/* tasklist protects us from ptrace_freeze_traced() */
>  		__set_current_state(TASK_RUNNING);
> +		read_code = false;
>  		if (clear_code)
> -			current->exit_code = 0;
> +			exit_code = 0;
>  		read_unlock(&tasklist_lock);
>  	}
>  
> @@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>  	 * any signal-sending on another CPU that wants to examine it.
>  	 */
>  	spin_lock_irq(&current->sighand->siglock);
> +	if (read_code) exit_code = current->exit_code;

style ;)

>  	current->last_siginfo = NULL;
>  	current->ptrace_message = 0;
> +	current->exit_code = 0;

OK, I like it... but can't we remove the ugly "int clear_code" arg?

Oleg.
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> Not sure I understand this patch, I can't apply it. I guess I need to
> clone your tree first, will do later.
>
> Just one question right now,
>
> On 03/15, Eric W. Biederman wrote:
>>
>> +static int ptrace_stop(int exit_code, int why, int clear_code,
>>  			unsigned long message, kernel_siginfo_t *info)
>>  	__releases(&current->sighand->siglock)
>>  	__acquires(&current->sighand->siglock)
>>  {
>>  	bool gstop_done = false;
>> +	bool read_code = true;
>>  
>>  	if (arch_ptrace_stop_needed()) {
>>  		/*
>> @@ -2305,8 +2307,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>>  
>>  		/* tasklist protects us from ptrace_freeze_traced() */
>>  		__set_current_state(TASK_RUNNING);
>> +		read_code = false;
>>  		if (clear_code)
>> -			current->exit_code = 0;
>> +			exit_code = 0;
>>  		read_unlock(&tasklist_lock);
>>  	}
>>  
>> @@ -2316,8 +2319,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code,
>>  	 * any signal-sending on another CPU that wants to examine it.
>>  	 */
>>  	spin_lock_irq(&current->sighand->siglock);
>> +	if (read_code) exit_code = current->exit_code;
>
> style ;)
>
>>  	current->last_siginfo = NULL;
>>  	current->ptrace_message = 0;
>> +	current->exit_code = 0;
>
> OK, I like it... but can't we remove the ugly "int clear_code" arg?

The flag clear_code controls what happens if a ptrace_stop does not
stop.  In particular clear_code means do not continue with
a signal if we can not stop.

For do_jobctl_trap calling ptrace_stop it clearly does not matter.

For ptrace_signal it would be a change in behavior, that would
cause the signal not to be delivered.

For ptrace_do_notify we don't have a signal to deliver unless userspace
gives us one so clear_code makes sense at that point.

Which is a long way of saying that no we can not remove clear_stop
because the behavior it is used to implement makes sense and userspace
can reasonably depend upon it.

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Oleg Nesterov 4 years, 3 months ago
On 03/17, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > OK, I like it... but can't we remove the ugly "int clear_code" arg?
>
> The flag clear_code controls what happens if a ptrace_stop does not
> stop.  In particular clear_code means do not continue with
> a signal if we can not stop.
>
> For do_jobctl_trap calling ptrace_stop it clearly does not matter.
>
> For ptrace_signal it would be a change in behavior, that would
> cause the signal not to be delivered.

Well I meant that "clear_code" should be false, iirc only
ptrace_report_syscall() should be updated to void the spurious
send_sig() if debugger exits... Nevermind, pleasee forget, this is
not as trivial as I thought.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> Not sure I understand this patch, I can't apply it. I guess I need to
> clone your tree first, will do later.

Yes this series is on top of:
  https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git kill-tracehook-for-v5.18

Nothing particularly interesting happens in my kill tracehook series
but I do get rid of tracehook.h and so everything gets moved and
renamed.

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
> 
> The signal a task should continue with after a ptrace stop is
> inconsistently read, cleared, and sent.  Solve this by reading and
> clearing the signal to be sent in ptrace_stop.
> 
> In an ideal world everything except ptrace_signal would share a common
> implementation of continuing with the signal, so ptracers could count
> on the signal they ask to continue with actually being delivered.  For
> now retain bug compatibility and just return with the signal number
> the ptracer requested the code continue with.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/ptrace.h | 12 ++++++------
>  kernel/signal.c        | 31 ++++++++++++++++++-------------
>  2 files changed, 24 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 3e6b46e2b7be..15b3d176b6b4 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
>  extern void ptrace_disable(struct task_struct *);
>  extern int ptrace_request(struct task_struct *child, long request,
>  			  unsigned long addr, unsigned long data);
> -extern void ptrace_notify(int exit_code, unsigned long message);
> +extern int ptrace_notify(int exit_code, unsigned long message);
> [...]
> -static void ptrace_stop(int exit_code, int why, int clear_code,
> +static int ptrace_stop(int exit_code, int why, int clear_code,
>  			unsigned long message, kernel_siginfo_t *info)
> [...]
> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> [...]
> -void ptrace_notify(int exit_code, unsigned long message)
> +int ptrace_notify(int exit_code, unsigned long message)

Just for robustness, how about marking the functions that have switched
from void to int return as __must_check (or at least just ptrace_notify)?

With that and the style nit Oleg already mentioned, yeah, this looks
good too.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Eric W. Biederman 4 years, 3 months ago
Kees Cook <keescook@chromium.org> writes:

> On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
>> 
>> The signal a task should continue with after a ptrace stop is
>> inconsistently read, cleared, and sent.  Solve this by reading and
>> clearing the signal to be sent in ptrace_stop.
>> 
>> In an ideal world everything except ptrace_signal would share a common
>> implementation of continuing with the signal, so ptracers could count
>> on the signal they ask to continue with actually being delivered.  For
>> now retain bug compatibility and just return with the signal number
>> the ptracer requested the code continue with.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  include/linux/ptrace.h | 12 ++++++------
>>  kernel/signal.c        | 31 ++++++++++++++++++-------------
>>  2 files changed, 24 insertions(+), 19 deletions(-)
>> 
>> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
>> index 3e6b46e2b7be..15b3d176b6b4 100644
>> --- a/include/linux/ptrace.h
>> +++ b/include/linux/ptrace.h
>> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
>>  extern void ptrace_disable(struct task_struct *);
>>  extern int ptrace_request(struct task_struct *child, long request,
>>  			  unsigned long addr, unsigned long data);
>> -extern void ptrace_notify(int exit_code, unsigned long message);
>> +extern int ptrace_notify(int exit_code, unsigned long message);
>> [...]
>> -static void ptrace_stop(int exit_code, int why, int clear_code,
>> +static int ptrace_stop(int exit_code, int why, int clear_code,
>>  			unsigned long message, kernel_siginfo_t *info)
>> [...]
>> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
>> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
>> [...]
>> -void ptrace_notify(int exit_code, unsigned long message)
>> +int ptrace_notify(int exit_code, unsigned long message)
>
> Just for robustness, how about marking the functions that have switched
> from void to int return as __must_check (or at least just ptrace_notify)?

We can't.  There are historical cases that simply don't check if a
signal should be sent after the function, and they exist for every
function that is modified.

If we can modify the code so that everyone is checking the return value
than certainly, but that just doesn't happen to reflect how this
ptrace helper is being used today.

> With that and the style nit Oleg already mentioned, yeah, this looks
> good too.

Alright style nit fixed.

> Reviewed-by: Kees Cook <keescook@chromium.org>

Eric
Re: [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop
Posted by Kees Cook 4 years, 3 months ago
On Fri, Mar 18, 2022 at 09:52:46AM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, Mar 15, 2022 at 06:22:26PM -0500, Eric W. Biederman wrote:
> >> 
> >> The signal a task should continue with after a ptrace stop is
> >> inconsistently read, cleared, and sent.  Solve this by reading and
> >> clearing the signal to be sent in ptrace_stop.
> >> 
> >> In an ideal world everything except ptrace_signal would share a common
> >> implementation of continuing with the signal, so ptracers could count
> >> on the signal they ask to continue with actually being delivered.  For
> >> now retain bug compatibility and just return with the signal number
> >> the ptracer requested the code continue with.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >> ---
> >>  include/linux/ptrace.h | 12 ++++++------
> >>  kernel/signal.c        | 31 ++++++++++++++++++-------------
> >>  2 files changed, 24 insertions(+), 19 deletions(-)
> >> 
> >> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> >> index 3e6b46e2b7be..15b3d176b6b4 100644
> >> --- a/include/linux/ptrace.h
> >> +++ b/include/linux/ptrace.h
> >> @@ -60,7 +60,7 @@ extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned
> >>  extern void ptrace_disable(struct task_struct *);
> >>  extern int ptrace_request(struct task_struct *child, long request,
> >>  			  unsigned long addr, unsigned long data);
> >> -extern void ptrace_notify(int exit_code, unsigned long message);
> >> +extern int ptrace_notify(int exit_code, unsigned long message);
> >> [...]
> >> -static void ptrace_stop(int exit_code, int why, int clear_code,
> >> +static int ptrace_stop(int exit_code, int why, int clear_code,
> >>  			unsigned long message, kernel_siginfo_t *info)
> >> [...]
> >> -static void ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> >> +static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
> >> [...]
> >> -void ptrace_notify(int exit_code, unsigned long message)
> >> +int ptrace_notify(int exit_code, unsigned long message)
> >
> > Just for robustness, how about marking the functions that have switched
> > from void to int return as __must_check (or at least just ptrace_notify)?
> 
> We can't.  There are historical cases that simply don't check if a
> signal should be sent after the function, and they exist for every
> function that is modified.

This seems at least worth documenting with a comment, otherwise we're
just trading one kind of "weirdness" (setting/clearing
current->exit_code) with another (ignoring the signal returned by
ptrace_notify()).

I see only two cases that would need comments:

static inline void ptrace_event(int event, unsigned long message)
{
        if (unlikely(ptrace_event_enabled(current, event))) {
                ptrace_notify((event << 8) | SIGTRAP, message);
        } else if (event == PTRACE_EVENT_EXEC) {
                /* legacy EXEC report via SIGTRAP */
                if ((current->ptrace & (PT_PTRACED|PT_SEIZED)) == PT_PTRACED)
                        send_sig(SIGTRAP, current, 0);
        }
}

static void signal_delivered(struct ksignal *ksig, int stepping)
{
	...
        if (stepping)
                ptrace_notify(SIGTRAP, 0);
}


-- 
Kees Cook