[PATCH v4 0/12] ptrace: cleaning up ptrace_stop

Eric W. Biederman posted 12 patches 4 years ago
arch/ia64/include/asm/ptrace.h    |   4 --
arch/ia64/kernel/ptrace.c         |  57 ----------------
arch/um/include/asm/thread_info.h |   2 +
arch/um/kernel/exec.c             |   2 +-
arch/um/kernel/process.c          |   2 +-
arch/um/kernel/ptrace.c           |   8 +--
arch/um/kernel/signal.c           |   4 +-
arch/x86/kernel/step.c            |   3 +-
arch/xtensa/kernel/ptrace.c       |   4 +-
arch/xtensa/kernel/signal.c       |   4 +-
drivers/tty/tty_jobctrl.c         |   4 +-
include/linux/ptrace.h            |   7 --
include/linux/sched.h             |  10 ++-
include/linux/sched/jobctl.h      |   8 +++
include/linux/sched/signal.h      |  20 ++++--
include/linux/signal.h            |   3 +-
kernel/ptrace.c                   |  87 ++++++++---------------
kernel/sched/core.c               |   5 +-
kernel/signal.c                   | 140 +++++++++++++++++---------------------
kernel/time/posix-cpu-timers.c    |   6 +-
20 files changed, 140 insertions(+), 240 deletions(-)
[PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 4 years ago
The states TASK_STOPPED and TASK_TRACE are special in they can not
handle spurious wake-ups.  This plus actively depending upon and
changing the value of tsk->__state causes problems for PREEMPT_RT and
Peter's freezer rewrite.

There are a lot of details we have to get right to sort out the
technical challenges and this is my parred back version of the changes
that contains just those problems I see good solutions to that I believe
are ready.

A couple of issues have been pointed but I think this parred back set of
changes is still on the right track.  The biggest change in v4 is the
split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
two patches because the dependency I thought exited between two
different changes did not exist.  The rest of the changes are minor
tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
removing an always true branch, and adding an early  test to see if the
ptracer had gone, before TASK_TRAPPING was set.

This set of changes should support Peter's freezer rewrite, and with the
addition of changing wait_task_inactive(TASK_TRACED) to be
wait_task_inactive(0) in ptrace_check_attach I don't think there are any
races or issues to be concerned about from the ptrace side.

More work is needed to support PREEMPT_RT, but these changes get things
closer.

This set of changes continues to look like it will provide a firm
foundation for solving the PREEMPT_RT and freezer challenges.

Eric W. Biederman (11):
      signal: Rename send_signal send_signal_locked
      signal: Replace __group_send_sig_info with send_signal_locked
      ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
      ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
      ptrace: Remove arch_ptrace_attach
      signal: Use lockdep_assert_held instead of assert_spin_locked
      ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
      ptrace: Document that wait_task_inactive can't fail
      ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
      ptrace: Don't change __state
      ptrace: Always take siglock in ptrace_resume

Peter Zijlstra (1):
      sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

 arch/ia64/include/asm/ptrace.h    |   4 --
 arch/ia64/kernel/ptrace.c         |  57 ----------------
 arch/um/include/asm/thread_info.h |   2 +
 arch/um/kernel/exec.c             |   2 +-
 arch/um/kernel/process.c          |   2 +-
 arch/um/kernel/ptrace.c           |   8 +--
 arch/um/kernel/signal.c           |   4 +-
 arch/x86/kernel/step.c            |   3 +-
 arch/xtensa/kernel/ptrace.c       |   4 +-
 arch/xtensa/kernel/signal.c       |   4 +-
 drivers/tty/tty_jobctrl.c         |   4 +-
 include/linux/ptrace.h            |   7 --
 include/linux/sched.h             |  10 ++-
 include/linux/sched/jobctl.h      |   8 +++
 include/linux/sched/signal.h      |  20 ++++--
 include/linux/signal.h            |   3 +-
 kernel/ptrace.c                   |  87 ++++++++---------------
 kernel/sched/core.c               |   5 +-
 kernel/signal.c                   | 140 +++++++++++++++++---------------------
 kernel/time/posix-cpu-timers.c    |   6 +-
 20 files changed, 140 insertions(+), 240 deletions(-)

Eric
[PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Eric W. Biederman 3 years, 10 months ago
Recently I had a conversation where it was pointed out to me that
SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
difficult for a tracer to handle.

Keeping SIGKILL working for anything after the process has been killed
is also a real pain from an implementation point of view.

So I am attempting to remove this wart in the userspace API and see
if anyone cares.

Eric W. Biederman (3):
      signal: Ensure SIGNAL_GROUP_EXIT gets set in do_group_exit
      signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
      signal: Drop signals received after a fatal signal has been processed

 fs/coredump.c                |  2 +-
 include/linux/sched/signal.h |  1 +
 kernel/exit.c                | 20 +++++++++++++++++++-
 kernel/fork.c                |  2 ++
 kernel/signal.c              |  3 ++-
 5 files changed, 25 insertions(+), 3 deletions(-)

Eric
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Eric W. Biederman 3 years, 10 months ago
"Eric W. Biederman" <ebiederm@xmission.com> writes:

> Recently I had a conversation where it was pointed out to me that
> SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
> difficult for a tracer to handle.
>
> Keeping SIGKILL working for anything after the process has been killed
> is also a real pain from an implementation point of view.
>
> So I am attempting to remove this wart in the userspace API and see
> if anyone cares.
>
> Eric W. Biederman (3):
>       signal: Ensure SIGNAL_GROUP_EXIT gets set in do_group_exit
>       signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
>       signal: Drop signals received after a fatal signal has been processed
>
>  fs/coredump.c                |  2 +-
>  include/linux/sched/signal.h |  1 +
>  kernel/exit.c                | 20 +++++++++++++++++++-
>  kernel/fork.c                |  2 ++
>  kernel/signal.c              |  3 ++-
>  5 files changed, 25 insertions(+), 3 deletions(-)

RR folks any comments?

Did I properly understand what Keno Fischer was asking for when we
talked in person?

Eric
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Keno Fischer 3 years, 10 months ago
Hi Eric,

On Fri, Jul 8, 2022 at 6:25 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> > Recently I had a conversation where it was pointed out to me that
> > SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
> > difficult for a tracer to handle.
> >
>
> RR folks any comments?
>
> Did I properly understand what Keno Fischer was asking for when we
> talked in person?

Yes, this is indeed what I had in mind. I have not yet had the opportunity
to try out your patch series (sorry), but from visual inspection, it does indeed
do what I wanted, which is to make sure that a tracee stays in
PTRACE_EVENT_EXIT for the tracer to inspect, even if there is another
SIGKILL incoming simultaneously (since otherwise it may be impossible
for the tracer to observe the PTRACE_EVENT_EXIT if two SIGKILLs
come in rapid succession). I will try to take this series for a proper spin
shortly.

Keno
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Eric W. Biederman 3 years, 9 months ago
Keno Fischer <keno@juliacomputing.com> writes:

> Hi Eric,
>
> On Fri, Jul 8, 2022 at 6:25 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> > Recently I had a conversation where it was pointed out to me that
>> > SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
>> > difficult for a tracer to handle.
>> >
>>
>> RR folks any comments?
>>
>> Did I properly understand what Keno Fischer was asking for when we
>> talked in person?
>
> Yes, this is indeed what I had in mind. I have not yet had the opportunity
> to try out your patch series (sorry), but from visual inspection, it does indeed
> do what I wanted, which is to make sure that a tracee stays in
> PTRACE_EVENT_EXIT for the tracer to inspect, even if there is another
> SIGKILL incoming simultaneously (since otherwise it may be impossible
> for the tracer to observe the PTRACE_EVENT_EXIT if two SIGKILLs
> come in rapid succession). I will try to take this series for a proper spin
> shortly.

Thanks,

I haven't yet figured out how to get the rr test suite to run
successfully.  Something about my test machine and lack of perf counters
seems to be causing problems.  So if you can perform the testing on your
side that would be fantastic.

Eric
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Eric W. Biederman 3 years, 9 months ago
"Eric W. Biederman" <ebiederm@xmission.com> writes:

> Keno Fischer <keno@juliacomputing.com> writes:
>
>> Hi Eric,
>>
>> On Fri, Jul 8, 2022 at 6:25 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> > Recently I had a conversation where it was pointed out to me that
>>> > SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
>>> > difficult for a tracer to handle.
>>> >
>>>
>>> RR folks any comments?
>>>
>>> Did I properly understand what Keno Fischer was asking for when we
>>> talked in person?
>>
>> Yes, this is indeed what I had in mind. I have not yet had the opportunity
>> to try out your patch series (sorry), but from visual inspection, it does indeed
>> do what I wanted, which is to make sure that a tracee stays in
>> PTRACE_EVENT_EXIT for the tracer to inspect, even if there is another
>> SIGKILL incoming simultaneously (since otherwise it may be impossible
>> for the tracer to observe the PTRACE_EVENT_EXIT if two SIGKILLs
>> come in rapid succession). I will try to take this series for a proper spin
>> shortly.
>
> Thanks,
>
> I haven't yet figured out how to get the rr test suite to run
> successfully.  Something about my test machine and lack of perf counters
> seems to be causing problems.  So if you can perform the testing on your
> side that would be fantastic.

Ok.  I finally found a machine where I can run rr and the rr test suite.

It looks like there are a couple of the rr 5.5.0 test that fail on
Linus's lastest kernel simply because of changes in kernel behavior.  In
particular clone_cleartid_coredump, and fcntl_rw_hints.  The
clone_cleartid_coredump appears to fail because SIGSEGV no longer kills
all processes that share an mm.  Which was a deliberate change.

With the lastest development version of rr, only detach_sigkill appears
to be failing on Linus's latest.  That failure appears to be independent
of the patches in question as well.  When run manually the
detach_sigkill test succeeds so I am not quite certain what is going on,
any thoughts?

As for my patchset it looks like it does not cause any new test failures
for rr so I will plan on getting it into linux-next shortly.

Eric
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Kyle Huey 3 years, 9 months ago
On Sat, Jul 16, 2022 at 2:31 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> "Eric W. Biederman" <ebiederm@xmission.com> writes:
>
> > Keno Fischer <keno@juliacomputing.com> writes:
> >
> >> Hi Eric,
> >>
> >> On Fri, Jul 8, 2022 at 6:25 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>> > Recently I had a conversation where it was pointed out to me that
> >>> > SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
> >>> > difficult for a tracer to handle.
> >>> >
> >>>
> >>> RR folks any comments?
> >>>
> >>> Did I properly understand what Keno Fischer was asking for when we
> >>> talked in person?
> >>
> >> Yes, this is indeed what I had in mind. I have not yet had the opportunity
> >> to try out your patch series (sorry), but from visual inspection, it does indeed
> >> do what I wanted, which is to make sure that a tracee stays in
> >> PTRACE_EVENT_EXIT for the tracer to inspect, even if there is another
> >> SIGKILL incoming simultaneously (since otherwise it may be impossible
> >> for the tracer to observe the PTRACE_EVENT_EXIT if two SIGKILLs
> >> come in rapid succession). I will try to take this series for a proper spin
> >> shortly.
> >
> > Thanks,
> >
> > I haven't yet figured out how to get the rr test suite to run
> > successfully.  Something about my test machine and lack of perf counters
> > seems to be causing problems.  So if you can perform the testing on your
> > side that would be fantastic.
>
> Ok.  I finally found a machine where I can run rr and the rr test suite.
>
> It looks like there are a couple of the rr 5.5.0 test that fail on
> Linus's lastest kernel simply because of changes in kernel behavior.  In
> particular clone_cleartid_coredump, and fcntl_rw_hints.  The
> clone_cleartid_coredump appears to fail because SIGSEGV no longer kills
> all processes that share an mm.  Which was a deliberate change.

Yeah, we changed to handle this in
https://github.com/rr-debugger/rr/commit/04bbacdbaba1cc496e92060014442bd1fd26b41d
and https://github.com/rr-debugger/rr/commit/1a3b389c2956e1844c0d07bf4297398bb6c561ea.

> With the lastest development version of rr, only detach_sigkill appears
> to be failing on Linus's latest.  That failure appears to be independent
> of the patches in question as well.  When run manually the
> detach_sigkill test succeeds so I am not quite certain what is going on,
> any thoughts?

If it fails before your changes I wouldn't worry about it too much,
there's been some other failures in that test lately.

- Kyle

> As for my patchset it looks like it does not cause any new test failures
> for rr so I will plan on getting it into linux-next shortly.
>
> Eric
>
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Alexander Gordeev 3 years, 10 months ago
On Wed, Jun 22, 2022 at 11:43:37AM -0500, Eric W. Biederman wrote:
> Recently I had a conversation where it was pointed out to me that
> SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
> difficult for a tracer to handle.
> 
> Keeping SIGKILL working for anything after the process has been killed
> is also a real pain from an implementation point of view.
> 
> So I am attempting to remove this wart in the userspace API and see
> if anyone cares.

Hi Eric,

With this series s390 hits the warning exactly same way. Is that expected?

Thanks!
Re: [PATCH 0/3] ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Posted by Eric W. Biederman 3 years, 10 months ago
Alexander Gordeev <agordeev@linux.ibm.com> writes:

> On Wed, Jun 22, 2022 at 11:43:37AM -0500, Eric W. Biederman wrote:
>> Recently I had a conversation where it was pointed out to me that
>> SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
>> difficult for a tracer to handle.
>> 
>> Keeping SIGKILL working for anything after the process has been killed
>> is also a real pain from an implementation point of view.
>> 
>> So I am attempting to remove this wart in the userspace API and see
>> if anyone cares.
>
> Hi Eric,
>
> With this series s390 hits the warning exactly same way. Is that expected?

Yes.  I was working on this before I got your mysterious bug report.  I
included you because I am including everyone I know who deals with the
userspace side of this since I am very deliberately changing the user
visible behavior of PTRACE_EVENT_EXIT.

I am going to start seeing if I can find any possible explanation for
your regression report.  Since I don't have much to go on I expect I
will have to revert the last change in my ptrace_stop series that
apparently triggers the WARN_ON you reported.  I really would have
expected the WARN_ON to be triggered in the patch in which it was
introduced, not the final patch in the series.


To the best of my knowledge changing PTRACE_EVENT_EXIT is both desirable
from a userspace semantics standpoint and from a kernel implementation
standpoint.  If someone knows any differently and depends upon sending
SIGKILL to processes in PTRACE_EVENT_EXIT to steal the process away from
the tracer I would love to hear about that case.

Eric
[PATCH 1/3] signal: Ensure SIGNAL_GROUP_EXIT gets set in do_group_exit
Posted by Eric W. Biederman 3 years, 10 months ago

The function do_group_exit has an optimization that avoids taking
siglock and doing the work to find other threads in the signal group
and shutting them down.

It is very desirable for SIGNAL_GROUP_EXIT to always been set whenever
it is decided for the process to exit.  That ensures only a single
place needs to be tested, and a single bit of state needs to be looked
at.  This makes the optimization in do_group_exit counter productive.

Make the code and maintenance simpler by removing this unnecessary
option.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..96e4b12edea8 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -905,7 +905,7 @@ do_group_exit(int exit_code)
 		exit_code = sig->group_exit_code;
 	else if (sig->group_exec_task)
 		exit_code = 0;
-	else if (!thread_group_empty(current)) {
+	else {
 		struct sighand_struct *const sighand = current->sighand;
 
 		spin_lock_irq(&sighand->siglock);
-- 
2.35.3
[PATCH 2/3] signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
Posted by Eric W. Biederman 3 years, 10 months ago

Track how many threads have not started exiting and when the last
thread starts exiting set SIGNAL_GROUP_EXIT.

This guarantees that SIGNAL_GROUP_EXIT will get set when a process
exits.  In practice this achieves nothing as glibc's implementation of
_exit calls sys_group_exit then sys_exit.  While glibc's implemenation
of pthread_exit calls exit (which cleansup and calls _exit) if it is
the last thread and sys_exit if it is the last thread.

This means the only way the kernel might observe a process that does
not set call exit_group is if the language runtime does not use glibc.

With more cleanups I hope to move the decrement of quick_threads
earlier.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/sched/signal.h |  1 +
 kernel/exit.c                | 18 ++++++++++++++++++
 kernel/fork.c                |  2 ++
 3 files changed, 21 insertions(+)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index cafbe03eed01..20099268fa25 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -94,6 +94,7 @@ struct signal_struct {
 	refcount_t		sigcnt;
 	atomic_t		live;
 	int			nr_threads;
+	int			quick_threads;
 	struct list_head	thread_head;
 
 	wait_queue_head_t	wait_chldexit;	/* for wait4() */
diff --git a/kernel/exit.c b/kernel/exit.c
index 96e4b12edea8..beaedb867bd3 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -733,11 +733,29 @@ static void check_stack_usage(void)
 static inline void check_stack_usage(void) {}
 #endif
 
+static void synchronize_group_exit(struct task_struct *tsk, long code)
+{
+	struct sighand_struct *sighand = tsk->sighand;
+	struct signal_struct *signal = tsk->signal;
+
+	spin_lock_irq(&sighand->siglock);
+	signal->quick_threads--;
+	if ((signal->quick_threads == 0) &&
+	    !(signal->flags & SIGNAL_GROUP_EXIT)) {
+		signal->flags = SIGNAL_GROUP_EXIT;
+		signal->group_exit_code = code;
+		signal->group_stop_count = 0;
+	}
+	spin_unlock_irq(&sighand->siglock);
+}
+
 void __noreturn do_exit(long code)
 {
 	struct task_struct *tsk = current;
 	int group_dead;
 
+	synchronize_group_exit(tsk, code);
+
 	WARN_ON(tsk->plug);
 
 	kcov_task_exit(tsk);
diff --git a/kernel/fork.c b/kernel/fork.c
index 9d44f2d46c69..67813b25a567 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1692,6 +1692,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 		return -ENOMEM;
 
 	sig->nr_threads = 1;
+	sig->quick_threads = 1;
 	atomic_set(&sig->live, 1);
 	refcount_set(&sig->sigcnt, 1);
 
@@ -2444,6 +2445,7 @@ static __latent_entropy struct task_struct *copy_process(
 			__this_cpu_inc(process_counts);
 		} else {
 			current->signal->nr_threads++;
+			current->signal->quick_threads++;
 			atomic_inc(&current->signal->live);
 			refcount_inc(&current->signal->sigcnt);
 			task_join_group_stop(p);
-- 
2.35.3
Re: [PATCH 2/3] signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
Posted by kernel test robot 3 years, 10 months ago
Hi "Eric,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.19-rc3 next-20220622]
[cannot apply to kees/for-next/pstore]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Eric-W-Biederman/signal-Ensure-SIGNAL_GROUP_EXIT-gets-set-in-do_group_exit/20220623-014543
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 3abc3ae553c7ed73365b385b9a4cffc5176aae45
config: nios2-randconfig-s032-20220622 (https://download.01.org/0day-ci/archive/20220623/202206231547.B89O2N6f-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 11.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-31-g4880bd19-dirty
        # https://github.com/intel-lab-lkp/linux/commit/0e1cfa4e4efe9c701f3ef5665c22310ac7aa7e7e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Eric-W-Biederman/signal-Ensure-SIGNAL_GROUP_EXIT-gets-set-in-do_group_exit/20220623-014543
        git checkout 0e1cfa4e4efe9c701f3ef5665c22310ac7aa7e7e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=nios2 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   kernel/exit.c:281:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *tsk @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:281:37: sparse:     expected struct task_struct *tsk
   kernel/exit.c:281:37: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/exit.c:284:32: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *task @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:284:32: sparse:     expected struct task_struct *task
   kernel/exit.c:284:32: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/exit.c:285:35: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *task @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:285:35: sparse:     expected struct task_struct *task
   kernel/exit.c:285:35: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/exit.c:330:24: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:330:24: sparse:     expected struct task_struct *parent
   kernel/exit.c:330:24: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/exit.c:357:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:357:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:357:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:360:29: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:360:29: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:360:29: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:583:29: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *reaper @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:583:29: sparse:     expected struct task_struct *reaper
   kernel/exit.c:583:29: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/exit.c:585:29: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *reaper @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/exit.c:585:29: sparse:     expected struct task_struct *reaper
   kernel/exit.c:585:29: sparse:     got struct task_struct [noderef] __rcu *real_parent
>> kernel/exit.c:738:45: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/exit.c:738:45: sparse:     expected struct sighand_struct *sighand
   kernel/exit.c:738:45: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/exit.c:927:63: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *const sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/exit.c:927:63: sparse:     expected struct sighand_struct *const sighand
   kernel/exit.c:927:63: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/exit.c:1082:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1082:39: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1082:39: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1107:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1107:41: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1107:41: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1196:25: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1196:25: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1196:25: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1211:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1211:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1211:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1262:25: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1262:25: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1262:25: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1265:35: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1265:35: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1265:35: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1271:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/exit.c:1271:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/exit.c:1271:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/exit.c:1452:59: sparse: sparse: incompatible types in comparison expression (different base types):
   kernel/exit.c:1452:59: sparse:    void *
   kernel/exit.c:1452:59: sparse:    struct task_struct [noderef] __rcu *
   kernel/exit.c:1468:25: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu * @@
   kernel/exit.c:1468:25: sparse:     expected struct task_struct *parent
   kernel/exit.c:1468:25: sparse:     got struct task_struct [noderef] __rcu *
   kernel/exit.c: note: in included file (through arch/nios2/include/uapi/asm/elf.h, arch/nios2/include/asm/elf.h, include/linux/elf.h, ...):
   include/linux/ptrace.h:92:40: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *p1 @@     got struct task_struct [noderef] __rcu *real_parent @@
   include/linux/ptrace.h:92:40: sparse:     expected struct task_struct *p1
   include/linux/ptrace.h:92:40: sparse:     got struct task_struct [noderef] __rcu *real_parent
   include/linux/ptrace.h:92:60: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p2 @@     got struct task_struct [noderef] __rcu *parent @@
   include/linux/ptrace.h:92:60: sparse:     expected struct task_struct *p2
   include/linux/ptrace.h:92:60: sparse:     got struct task_struct [noderef] __rcu *parent
   include/linux/ptrace.h:92:40: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *p1 @@     got struct task_struct [noderef] __rcu *real_parent @@
   include/linux/ptrace.h:92:40: sparse:     expected struct task_struct *p1
   include/linux/ptrace.h:92:40: sparse:     got struct task_struct [noderef] __rcu *real_parent
   include/linux/ptrace.h:92:60: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p2 @@     got struct task_struct [noderef] __rcu *parent @@
   include/linux/ptrace.h:92:60: sparse:     expected struct task_struct *p2
   include/linux/ptrace.h:92:60: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/exit.c: note: in included file (through include/linux/sched/signal.h, include/linux/rcuwait.h, include/linux/percpu-rwsem.h, ...):
   include/linux/sched/task.h:110:21: sparse: sparse: context imbalance in 'wait_task_zombie' - unexpected unlock
   include/linux/sched/task.h:110:21: sparse: sparse: context imbalance in 'wait_task_stopped' - unexpected unlock
   include/linux/sched/task.h:110:21: sparse: sparse: context imbalance in 'wait_task_continued' - unexpected unlock
   kernel/exit.c: note: in included file (through arch/nios2/include/uapi/asm/elf.h, arch/nios2/include/asm/elf.h, include/linux/elf.h, ...):
   include/linux/ptrace.h:92:40: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *p1 @@     got struct task_struct [noderef] __rcu *real_parent @@
   include/linux/ptrace.h:92:40: sparse:     expected struct task_struct *p1
   include/linux/ptrace.h:92:40: sparse:     got struct task_struct [noderef] __rcu *real_parent
   include/linux/ptrace.h:92:60: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p2 @@     got struct task_struct [noderef] __rcu *parent @@
   include/linux/ptrace.h:92:60: sparse:     expected struct task_struct *p2
   include/linux/ptrace.h:92:60: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/exit.c: note: in included file (through include/linux/thread_info.h, include/asm-generic/preempt.h, arch/nios2/include/generated/asm/preempt.h, ...):
   arch/nios2/include/asm/thread_info.h:62:9: sparse: sparse: context imbalance in 'do_wait' - wrong count at exit

vim +738 kernel/exit.c

   735	
   736	static void synchronize_group_exit(struct task_struct *tsk, long code)
   737	{
 > 738		struct sighand_struct *sighand = tsk->sighand;
   739		struct signal_struct *signal = tsk->signal;
   740	
   741		spin_lock_irq(&sighand->siglock);
   742		signal->quick_threads--;
   743		if ((signal->quick_threads == 0) &&
   744		    !(signal->flags & SIGNAL_GROUP_EXIT)) {
   745			signal->flags = SIGNAL_GROUP_EXIT;
   746			signal->group_exit_code = code;
   747			signal->group_stop_count = 0;
   748		}
   749		spin_unlock_irq(&sighand->siglock);
   750	}
   751	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp
[PATCH 3/3] signal: Drop signals received after a fatal signal has been processed
Posted by Eric W. Biederman 3 years, 10 months ago

In 403bad72b67d ("coredump: only SIGKILL should interrupt the
coredumping task") Oleg modified the kernel to drop all signals that
come in during a coredump except SIGKILL, and suggested that it might
be a good idea to generalize that to other cases after the process has
received a fatal signal.

Semantically it does not make sense to perform any signal delivery
after the process has already been killed.

When a signal is sent while a process is dying today the signal is
placed in the signal queue by __send_signal and a single task of the
process is woken up with signal_wake_up, if there are any tasks that
have not set PF_EXITING.

Take things one step farther and have prepare_signal report that all
signals that come after a process has been killed should be ignored.
While retaining the historical exception of allowing SIGKILL to
interrupt coredumps.

Update the comment in fs/coredump.c to make it clear coredumps are
special in being able to receive SIGKILL.

This changes things so that a process stopped in PTRACE_EVENT_EXIT can
not be made to escape it's ptracer and finish exiting by sending it
SIGKILL.  That a process can be made to leave PTRACE_EVENT_EXIT and
escape it's tracer by sending the process a SIGKILL has been
complicating tracer's for no apparent advantage.  If the process needs
to be made to leave PTRACE_EVENT_EXIT all that needs to happen is to
kill the proceses's tracer.  This differs from the coredump code where
there is no other mechanism besides honoring SIGKILL to expedite the
end of coredumping.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/coredump.c   | 2 +-
 kernel/signal.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index ebc43f960b64..b836948c9543 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -354,7 +354,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 	struct task_struct *t;
 	int nr = 0;
 
-	/* ignore all signals except SIGKILL, see prepare_signal() */
+	/* Allow SIGKILL, see prepare_signal() */
 	start->signal->flags = SIGNAL_GROUP_EXIT;
 	start->signal->group_exit_code = exit_code;
 	start->signal->group_stop_count = 0;
diff --git a/kernel/signal.c b/kernel/signal.c
index edb1dc9b00dc..369d65b06025 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -913,8 +913,9 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 		if (signal->core_state)
 			return sig == SIGKILL;
 		/*
-		 * The process is in the middle of dying, nothing to do.
+		 * The process is in the middle of dying, drop the signal.
 		 */
+		return false;
 	} else if (sig_kernel_stop(sig)) {
 		/*
 		 * This is a stop signal.  Remove SIGCONT from all queues.
-- 
2.35.3
[PATCH 00/16] ptrace: cleanups and calling do_cldstop with only siglock
Posted by Eric W. Biederman 3 years, 11 months ago
For ptrace_stop to work on PREEMT_RT no spinlocks can be taken once
ptrace_freeze_traced has completed successfully.  Which fundamentally
means the lock dance of dropping siglock and grabbing tasklist_lock does
not work on PREEMPT_RT.  So I have worked through what is necessary so
that tasklist_lock does not need to be grabbed in ptrace_stop after
siglock is dropped.

I have explored several alternate ways of getting there and along the
way I found a lot of small bug fixes/cleanups that don't necessarily
contribute to the final result but that or worthwhile on their own.  So
I have included those changes in this set of changes just so they don't
get lost.

In addition I had a conversation with Thomas Gleixner recently that
emphasized for me the need to reduce the hold times of tasklist_lock,
and that made me realize that in principle it is possible.
https://lkml.kernel.org/r/87mtfmhap2.fsf@email.froward.int.ebiederm.org

Which is a long way of saying that not taking tasklist_lock in
ptrace_stop is good not just for PREMPT_RT but also for improving the
scalability of the kernel in general.

After this set of changes only cgroup_enter_frozen should remain a
stumbling block for PREEMPT_RT in the ptrace_stop path.

Eric W. Biederman (16):
      signal/alpha: Remove unused definition of TASK_REAL_PARENT
      signal/ia64: Remove unused definition of IA64_TASK_REAL_PARENT_OFFSET
      kdb: Use real_parent when displaying a list of processes
      powerpc/xmon:  Use real_parent when displaying a list of processes
      ptrace: Remove dead code from __ptrace_detach
      ptrace: Remove unnecessary locking in ptrace_(get|set)siginfo
      signal: Wake up the designated parent
      ptrace: Only populate last_siginfo from ptrace
      ptrace: In ptrace_setsiginfo deal with invalid si_signo
      ptrace: In ptrace_signal look at what the debugger did with siginfo
      ptrace: Use si_sino as the signal number to resume with
      ptrace: Stop protecting ptrace_set_signr with tasklist_lock
      ptrace: Document why ptrace_setoptions does not need a lock
      signal: Protect parent child relationships by childs siglock
      ptrace: Use siglock instead of tasklist_lock in ptrace_check_attach
      signal: Always call do_notify_parent_cldstop with siglock held

 arch/alpha/kernel/asm-offsets.c |   1 -
 arch/ia64/kernel/asm-offsets.c  |   1 -
 arch/powerpc/xmon/xmon.c        |   2 +-
 kernel/debug/kdb/kdb_main.c     |   2 +-
 kernel/exit.c                   |  23 +++-
 kernel/fork.c                   |  12 +-
 kernel/ptrace.c                 | 132 ++++++++----------
 kernel/signal.c                 | 296 ++++++++++++++++++++++++++--------------
 8 files changed, 279 insertions(+), 190 deletions(-)

Eric
Re: [PATCH 00/16] ptrace: cleanups and calling do_cldstop with only siglock
Posted by Sebastian Andrzej Siewior 3 years, 11 months ago
On 2022-05-18 17:49:50 [-0500], Eric W. Biederman wrote:
> After this set of changes only cgroup_enter_frozen should remain a
> stumbling block for PREEMPT_RT in the ptrace_stop path.

Yes, I can confirm that. I have no systemd-less system at hand which
means I can't boot a kernel without CGROUP support. But after removing
cgroup_{enter|leave}_frozen() in ptrace_stop() I don't see the problems
I saw earlier.

Sebastian
Re: [PATCH 00/16] ptrace: cleanups and calling do_cldstop with only siglock
Posted by Sebastian Andrzej Siewior 3 years, 11 months ago
On 2022-05-18 17:49:50 [-0500], Eric W. Biederman wrote:
> 
> For ptrace_stop to work on PREEMT_RT no spinlocks can be taken once
> ptrace_freeze_traced has completed successfully.  Which fundamentally
> means the lock dance of dropping siglock and grabbing tasklist_lock does
> not work on PREEMPT_RT.  So I have worked through what is necessary so
> that tasklist_lock does not need to be grabbed in ptrace_stop after
> siglock is dropped.
…
It took me a while to realise that this is a follow-up I somehow assumed
that you added a few patches on top. Might have been the yesterday's
heat. b4 also refused to download this series because the v4 in this
thread looked newer… Anyway. Both series applied:

| =============================
| WARNING: suspicious RCU usage
| 5.18.0-rc7+ #16 Not tainted
| -----------------------------
| include/linux/ptrace.h:120 suspicious rcu_dereference_check() usage!
|
| other info that might help us debug this:
|
| rcu_scheduler_active = 2, debug_locks = 1
| 2 locks held by ssdd/1734:
|  #0: ffff88800eaa6918 (&sighand->siglock){....}-{2:2}, at: lock_parents_siglocks+0xf0/0x3b0
|  #1: ffff88800eaa71d8 (&sighand->siglock/2){....}-{2:2}, at: lock_parents_siglocks+0x115/0x3b0
|
| stack backtrace:
| CPU: 2 PID: 1734 Comm: ssdd Not tainted 5.18.0-rc7+ #16
| Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
| Call Trace:
|  <TASK>
|  dump_stack_lvl+0x45/0x5a
|  unlock_parents_siglocks+0xb6/0xc0
|  ptrace_stop+0xb9/0x390
|  get_signal+0x51c/0x8d0
|  arch_do_signal_or_restart+0x31/0x750
|  exit_to_user_mode_prepare+0x157/0x220
|  irqentry_exit_to_user_mode+0x5/0x50
|  asm_sysvec_apic_timer_interrupt+0x12/0x20

That is ptrace_parent() in unlock_parents_siglocks().

Sebastian
Re: [PATCH 00/16] ptrace: cleanups and calling do_cldstop with only siglock
Posted by Eric W. Biederman 3 years, 11 months ago
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2022-05-18 17:49:50 [-0500], Eric W. Biederman wrote:
>> 
>> For ptrace_stop to work on PREEMT_RT no spinlocks can be taken once
>> ptrace_freeze_traced has completed successfully.  Which fundamentally
>> means the lock dance of dropping siglock and grabbing tasklist_lock does
>> not work on PREEMPT_RT.  So I have worked through what is necessary so
>> that tasklist_lock does not need to be grabbed in ptrace_stop after
>> siglock is dropped.
> …
> It took me a while to realise that this is a follow-up I somehow assumed
> that you added a few patches on top. Might have been the yesterday's
> heat. b4 also refused to download this series because the v4 in this
> thread looked newer… Anyway. Both series applied:
>
> | =============================
> | WARNING: suspicious RCU usage
> | 5.18.0-rc7+ #16 Not tainted
> | -----------------------------
> | include/linux/ptrace.h:120 suspicious rcu_dereference_check() usage!
> |
> | other info that might help us debug this:
> |
> | rcu_scheduler_active = 2, debug_locks = 1
> | 2 locks held by ssdd/1734:
> |  #0: ffff88800eaa6918 (&sighand->siglock){....}-{2:2}, at: lock_parents_siglocks+0xf0/0x3b0
> |  #1: ffff88800eaa71d8 (&sighand->siglock/2){....}-{2:2}, at: lock_parents_siglocks+0x115/0x3b0
> |
> | stack backtrace:
> | CPU: 2 PID: 1734 Comm: ssdd Not tainted 5.18.0-rc7+ #16
> | Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> | Call Trace:
> |  <TASK>
> |  dump_stack_lvl+0x45/0x5a
> |  unlock_parents_siglocks+0xb6/0xc0
> |  ptrace_stop+0xb9/0x390
> |  get_signal+0x51c/0x8d0
> |  arch_do_signal_or_restart+0x31/0x750
> |  exit_to_user_mode_prepare+0x157/0x220
> |  irqentry_exit_to_user_mode+0x5/0x50
> |  asm_sysvec_apic_timer_interrupt+0x12/0x20
>
> That is ptrace_parent() in unlock_parents_siglocks().

How odd.  I thought I had the appropriate lockdep config options enabled
in my test build to catch things like this.  I guess not.

Now I am trying to think how to tell it that holding the appropriate
iglock makes this ok.

Eric
Re: [PATCH 00/16] ptrace: cleanups and calling do_cldstop with only siglock
Posted by Peter Zijlstra 3 years, 11 months ago
On Fri, May 20, 2022 at 02:32:24PM -0500, Eric W. Biederman wrote:
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> 
> > On 2022-05-18 17:49:50 [-0500], Eric W. Biederman wrote:
> >> 
> >> For ptrace_stop to work on PREEMT_RT no spinlocks can be taken once
> >> ptrace_freeze_traced has completed successfully.  Which fundamentally
> >> means the lock dance of dropping siglock and grabbing tasklist_lock does
> >> not work on PREEMPT_RT.  So I have worked through what is necessary so
> >> that tasklist_lock does not need to be grabbed in ptrace_stop after
> >> siglock is dropped.
> > …
> > It took me a while to realise that this is a follow-up I somehow assumed
> > that you added a few patches on top. Might have been the yesterday's
> > heat. b4 also refused to download this series because the v4 in this
> > thread looked newer… Anyway. Both series applied:
> >
> > | =============================
> > | WARNING: suspicious RCU usage
> > | 5.18.0-rc7+ #16 Not tainted
> > | -----------------------------
> > | include/linux/ptrace.h:120 suspicious rcu_dereference_check() usage!
> > |
> > | other info that might help us debug this:
> > |
> > | rcu_scheduler_active = 2, debug_locks = 1
> > | 2 locks held by ssdd/1734:
> > |  #0: ffff88800eaa6918 (&sighand->siglock){....}-{2:2}, at: lock_parents_siglocks+0xf0/0x3b0
> > |  #1: ffff88800eaa71d8 (&sighand->siglock/2){....}-{2:2}, at: lock_parents_siglocks+0x115/0x3b0
> > |
> > | stack backtrace:
> > | CPU: 2 PID: 1734 Comm: ssdd Not tainted 5.18.0-rc7+ #16
> > | Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> > | Call Trace:
> > |  <TASK>
> > |  dump_stack_lvl+0x45/0x5a
> > |  unlock_parents_siglocks+0xb6/0xc0
> > |  ptrace_stop+0xb9/0x390
> > |  get_signal+0x51c/0x8d0
> > |  arch_do_signal_or_restart+0x31/0x750
> > |  exit_to_user_mode_prepare+0x157/0x220
> > |  irqentry_exit_to_user_mode+0x5/0x50
> > |  asm_sysvec_apic_timer_interrupt+0x12/0x20
> >
> > That is ptrace_parent() in unlock_parents_siglocks().
> 
> How odd.  I thought I had the appropriate lockdep config options enabled
> in my test build to catch things like this.  I guess not.
> 
> Now I am trying to think how to tell it that holding the appropriate
> iglock makes this ok.

The typical annotation is something like:

	rcu_dereference_protected(foo, lockdep_is_held(&bar))

Except in this case I think the problem is that bar depends on foo in
non-trivial ways. That is, foo is 'task->parent' and bar is
'task->parent->sighand->siglock' or something.

The other option is to use rcu_dereference_raw() in this one instance
and have a comment that explains the situation.
[PATCH 01/16] signal/alpha: Remove unused definition of TASK_REAL_PARENT
Posted by Eric W. Biederman 3 years, 11 months ago
Rather than update this defition when I move tsk->real_parent into
signal_struct remove it now.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/alpha/kernel/asm-offsets.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/alpha/kernel/asm-offsets.c b/arch/alpha/kernel/asm-offsets.c
index 2e125e5c1508..0fca99dc5757 100644
--- a/arch/alpha/kernel/asm-offsets.c
+++ b/arch/alpha/kernel/asm-offsets.c
@@ -21,7 +21,6 @@ void foo(void)
 
         DEFINE(TASK_BLOCKED, offsetof(struct task_struct, blocked));
         DEFINE(TASK_CRED, offsetof(struct task_struct, cred));
-        DEFINE(TASK_REAL_PARENT, offsetof(struct task_struct, real_parent));
         DEFINE(TASK_GROUP_LEADER, offsetof(struct task_struct, group_leader));
         DEFINE(TASK_TGID, offsetof(struct task_struct, tgid));
         BLANK();
-- 
2.35.3
[PATCH 02/16] signal/ia64: Remove unused definition of IA64_TASK_REAL_PARENT_OFFSET
Posted by Eric W. Biederman 3 years, 11 months ago
Rather than update the unused definition of IA64_TASK_REAL_PARENT_OFFSENT
when I move tsk->real_parent into signal_struct remove it now.

Cc: linux-ia64@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/ia64/kernel/asm-offsets.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/ia64/kernel/asm-offsets.c b/arch/ia64/kernel/asm-offsets.c
index be3b90fef2e9..245c4333ea30 100644
--- a/arch/ia64/kernel/asm-offsets.c
+++ b/arch/ia64/kernel/asm-offsets.c
@@ -55,7 +55,6 @@ void foo(void)
 	DEFINE(IA64_PID_UPID_OFFSET, offsetof (struct pid, numbers[0]));
 	DEFINE(IA64_TASK_PENDING_OFFSET,offsetof (struct task_struct, pending));
 	DEFINE(IA64_TASK_PID_OFFSET, offsetof (struct task_struct, pid));
-	DEFINE(IA64_TASK_REAL_PARENT_OFFSET, offsetof (struct task_struct, real_parent));
 	DEFINE(IA64_TASK_SIGNAL_OFFSET,offsetof (struct task_struct, signal));
 	DEFINE(IA64_TASK_TGID_OFFSET, offsetof (struct task_struct, tgid));
 	DEFINE(IA64_TASK_THREAD_KSP_OFFSET, offsetof (struct task_struct, thread.ksp));
-- 
2.35.3
[PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Eric W. Biederman 3 years, 11 months ago
kdb has a bug that when using the ps command to display a list of
processes, if a process is being debugged the debugger as the parent
process.

This is silly, and I expect it never comes up in ptractice.  As there
is very little point in using gdb and kdb simultaneously.  Update the
code to use real_parent so that it is clear kdb does not want to
display a debugger as the parent of a process.

Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Fixes: 5d5314d6795f ("kdb: core for kgdb back end (1 of 2)"
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/debug/kdb/kdb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 0852a537dad4..db49f1026eaa 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -2306,7 +2306,7 @@ void kdb_ps1(const struct task_struct *p)
 
 	cpu = kdb_process_cpu(p);
 	kdb_printf("0x%px %8d %8d  %d %4d   %c  0x%px %c%s\n",
-		   (void *)p, p->pid, p->parent->pid,
+		   (void *)p, p->pid, p->real_parent->pid,
 		   kdb_task_has_cpu(p), kdb_process_cpu(p),
 		   kdb_task_state_char(p),
 		   (void *)(&p->thread),
-- 
2.35.3
Re: [PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Doug Anderson 3 years, 11 months ago
Hi,

On Wed, May 18, 2022 at 3:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> kdb has a bug that when using the ps command to display a list of
> processes, if a process is being debugged the debugger as the parent
> process.
>
> This is silly, and I expect it never comes up in ptractice.  As there
> is very little point in using gdb and kdb simultaneously.  Update the
> code to use real_parent so that it is clear kdb does not want to
> display a debugger as the parent of a process.

So I would tend to defer to Daniel, but I'm not convinced that the
behavior you describe for kdb today _is_ actually silly.

If I was in kdb and I was listing processes, I might actually want to
see that a process's parent was set to gdb. Presumably that would tell
me extra information that might be relevant to my debug session.

Personally, I'd rather add an extra piece of information into the list
showing the real parent if it's not the same as the parent. Then
you're not throwing away information.

-Doug
Re: [PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Eric W. Biederman 3 years, 11 months ago
Doug Anderson <dianders@chromium.org> writes:

> Hi,
>
> On Wed, May 18, 2022 at 3:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> kdb has a bug that when using the ps command to display a list of
>> processes, if a process is being debugged the debugger as the parent
>> process.
>>
>> This is silly, and I expect it never comes up in ptractice.  As there
>> is very little point in using gdb and kdb simultaneously.  Update the
>> code to use real_parent so that it is clear kdb does not want to
>> display a debugger as the parent of a process.
>
> So I would tend to defer to Daniel, but I'm not convinced that the
> behavior you describe for kdb today _is_ actually silly.
>
> If I was in kdb and I was listing processes, I might actually want to
> see that a process's parent was set to gdb. Presumably that would tell
> me extra information that might be relevant to my debug session.
>
> Personally, I'd rather add an extra piece of information into the list
> showing the real parent if it's not the same as the parent. Then
> you're not throwing away information.

The name of the field is confusing for anyone who isn't intimate with
the implementation details.  The function getppid returns
tsk->real_parent->tgid.

If kdb wants information of what the tracer is that is fine, but I
recommend putting that information in another field.

Given that the original description says give the information that ps
gives my sense is that kdb is currently wrong.  Especially as it does
not give you the actual parentage anywhere.

I can certainly be convinced, but I do want some clarity.  It looks very
attractive to rename task->parent to task->ptracer and leave the field
NULL when there is no tracer.

Eric
Re: [PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Doug Anderson 3 years, 11 months ago
Hi,

On Thu, May 19, 2022 at 4:49 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Doug Anderson <dianders@chromium.org> writes:
>
> > Hi,
> >
> > On Wed, May 18, 2022 at 3:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> kdb has a bug that when using the ps command to display a list of
> >> processes, if a process is being debugged the debugger as the parent
> >> process.
> >>
> >> This is silly, and I expect it never comes up in ptractice.  As there
> >> is very little point in using gdb and kdb simultaneously.  Update the
> >> code to use real_parent so that it is clear kdb does not want to
> >> display a debugger as the parent of a process.
> >
> > So I would tend to defer to Daniel, but I'm not convinced that the
> > behavior you describe for kdb today _is_ actually silly.
> >
> > If I was in kdb and I was listing processes, I might actually want to
> > see that a process's parent was set to gdb. Presumably that would tell
> > me extra information that might be relevant to my debug session.
> >
> > Personally, I'd rather add an extra piece of information into the list
> > showing the real parent if it's not the same as the parent. Then
> > you're not throwing away information.
>
> The name of the field is confusing for anyone who isn't intimate with
> the implementation details.  The function getppid returns
> tsk->real_parent->tgid.
>
> If kdb wants information of what the tracer is that is fine, but I
> recommend putting that information in another field.
>
> Given that the original description says give the information that ps
> gives my sense is that kdb is currently wrong.  Especially as it does
> not give you the actual parentage anywhere.
>
> I can certainly be convinced, but I do want some clarity.  It looks very
> attractive to rename task->parent to task->ptracer and leave the field
> NULL when there is no tracer.

Fair enough. You can consider my objection rescinded.

Presumably, though, you're hoping for an Ack for your patch and you
plan to take it with the rest of the series. That's going to need to
come from Daniel anyway as he is the actual maintainer. I'm just the
peanut gallery. ;-)

-Doug
Re: [PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Peter Zijlstra 3 years, 11 months ago
On Wed, May 18, 2022 at 05:53:42PM -0500, Eric W. Biederman wrote:
> kdb has a bug that when using the ps command to display a list of
> processes, if a process is being debugged the debugger as the parent
> process.
> 
> This is silly, and I expect it never comes up in ptractice.  As there
                                                   ^^^^^^^^^

Lol, love the new word :-)
Re: [PATCH 03/16] kdb: Use real_parent when displaying a list of processes
Posted by Eric W. Biederman 3 years, 11 months ago
Peter Zijlstra <peterz@infradead.org> writes:

> On Wed, May 18, 2022 at 05:53:42PM -0500, Eric W. Biederman wrote:
>> kdb has a bug that when using the ps command to display a list of
>> processes, if a process is being debugged the debugger as the parent
>> process.
>> 
>> This is silly, and I expect it never comes up in ptractice.  As there
>                                                    ^^^^^^^^^
>
> Lol, love the new word :-)

It wasn't intentional but now I just might have to keep it.

Eric
[PATCH 04/16] powerpc/xmon: Use real_parent when displaying a list of processes
Posted by Eric W. Biederman 3 years, 11 months ago
xmon has a bug (copied from kdb) that when showing a list of processes
the debugger is listed as the parent, if a processes is being debugged.

This is silly, and I expect it is rare enough no has noticed in
practice.  Update the code to use real_parent so that it is clear xmon
does not want to display a debugger as the parent of a process.

Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Fixes: 6dfb54049f9a ("powerpc/xmon: Add xmon command to dump process/task similar to ps(1)")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/powerpc/xmon/xmon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index fd72753e8ad5..b308ef9ce604 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3282,7 +3282,7 @@ static void show_task(struct task_struct *volatile tsk)
 
 	printf("%16px %16lx %16px %6d %6d %c %2d %s\n", tsk,
 		tsk->thread.ksp, tsk->thread.regs,
-		tsk->pid, rcu_dereference(tsk->parent)->pid,
+		tsk->pid, rcu_dereference(tsk->real_parent)->pid,
 		state, task_cpu(tsk),
 		tsk->comm);
 }
-- 
2.35.3
[PATCH 05/16] ptrace: Remove dead code from __ptrace_detach
Posted by Eric W. Biederman 3 years, 11 months ago
Ever since commit 28d838cc4dfe ("Fix ptrace self-attach rule") it has
been impossible to attach another thread in the same thread group.

Remove the code from __ptrace_detach that was trying to support
detaching from a thread in the same thread group.  The code is
dead and I can not make sense of what it is trying to do.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 24 +++---------------------
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 328a34a99124..ca0e47691229 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -526,19 +526,6 @@ static int ptrace_traceme(void)
 	return ret;
 }
 
-/*
- * Called with irqs disabled, returns true if childs should reap themselves.
- */
-static int ignoring_children(struct sighand_struct *sigh)
-{
-	int ret;
-	spin_lock(&sigh->siglock);
-	ret = (sigh->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) ||
-	      (sigh->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT);
-	spin_unlock(&sigh->siglock);
-	return ret;
-}
-
 /*
  * Called with tasklist_lock held for writing.
  * Unlink a traced task, and clean it up if it was a traced zombie.
@@ -565,14 +552,9 @@ static bool __ptrace_detach(struct task_struct *tracer, struct task_struct *p)
 
 	dead = !thread_group_leader(p);
 
-	if (!dead && thread_group_empty(p)) {
-		if (!same_thread_group(p->real_parent, tracer))
-			dead = do_notify_parent(p, p->exit_signal);
-		else if (ignoring_children(tracer->sighand)) {
-			__wake_up_parent(p, tracer);
-			dead = true;
-		}
-	}
+	if (!dead && thread_group_empty(p))
+		dead = do_notify_parent(p, p->exit_signal);
+
 	/* Mark it as in the process of being reaped. */
 	if (dead)
 		p->exit_state = EXIT_DEAD;
-- 
2.35.3
Re: [PATCH 05/16] ptrace: Remove dead code from __ptrace_detach
Posted by Oleg Nesterov 3 years, 11 months ago
Sorry for delay.

On 05/18, Eric W. Biederman wrote:
>
> Ever since commit 28d838cc4dfe ("Fix ptrace self-attach rule") it has
> been impossible to attach another thread in the same thread group.
>
> Remove the code from __ptrace_detach that was trying to support
> detaching from a thread in the same thread group.

may be I am totally confused, but I think you misunderstood this code
and thus this patch is very wrong.

The same_thread_group() check does NOT try to check if debugger and
tracee is in the same thread group, this is indeed impossible.

We need this check to know if the tracee was ptrace_reparented() before
__ptrace_unlink() or not.


> -static int ignoring_children(struct sighand_struct *sigh)
> -{
> -	int ret;
> -	spin_lock(&sigh->siglock);
> -	ret = (sigh->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) ||
> -	      (sigh->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT);
> -	spin_unlock(&sigh->siglock);
> -	return ret;
> -}

...

> @@ -565,14 +552,9 @@ static bool __ptrace_detach(struct task_struct *tracer, struct task_struct *p)
>
>  	dead = !thread_group_leader(p);
>
> -	if (!dead && thread_group_empty(p)) {
> -		if (!same_thread_group(p->real_parent, tracer))
> -			dead = do_notify_parent(p, p->exit_signal);
> -		else if (ignoring_children(tracer->sighand)) {
> -			__wake_up_parent(p, tracer);
> -			dead = true;
> -		}
> -	}

So the code above does:

	- if !same_thread_group(p->real_parent, tracer), then the tracee was
	  ptrace_reparented(), and now we need to notify its natural parent
	  to let it know it has a zombie child.

	- otherwise, the tracee is our natural child, and it is actually dead.
	  however, since we are going to reap this task, we need to wake up our
	  sub-threads possibly sleeping on ->wait_chldexit wait_queue_head_t.

See?

> +	if (!dead && thread_group_empty(p))
> +		dead = do_notify_parent(p, p->exit_signal);

No, this looks wrong. Or I missed something?

Oleg.
Re: [PATCH 05/16] ptrace: Remove dead code from __ptrace_detach
Posted by Oleg Nesterov 3 years, 11 months ago
On 05/24, Oleg Nesterov wrote:
>
> Sorry for delay.
>
> On 05/18, Eric W. Biederman wrote:
> >
> > Ever since commit 28d838cc4dfe ("Fix ptrace self-attach rule") it has
> > been impossible to attach another thread in the same thread group.
> >
> > Remove the code from __ptrace_detach that was trying to support
> > detaching from a thread in the same thread group.
>
> may be I am totally confused, but I think you misunderstood this code
> and thus this patch is very wrong.
>
> The same_thread_group() check does NOT try to check if debugger and
> tracee is in the same thread group, this is indeed impossible.
>
> We need this check to know if the tracee was ptrace_reparented() before
> __ptrace_unlink() or not.
>
>
> > -static int ignoring_children(struct sighand_struct *sigh)
> > -{
> > -	int ret;
> > -	spin_lock(&sigh->siglock);
> > -	ret = (sigh->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) ||
> > -	      (sigh->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT);
> > -	spin_unlock(&sigh->siglock);
> > -	return ret;
> > -}
>
> ...
>
> > @@ -565,14 +552,9 @@ static bool __ptrace_detach(struct task_struct *tracer, struct task_struct *p)
> >
> >  	dead = !thread_group_leader(p);
> >
> > -	if (!dead && thread_group_empty(p)) {
> > -		if (!same_thread_group(p->real_parent, tracer))
> > -			dead = do_notify_parent(p, p->exit_signal);
> > -		else if (ignoring_children(tracer->sighand)) {
> > -			__wake_up_parent(p, tracer);
> > -			dead = true;
> > -		}
> > -	}
>
> So the code above does:
>
> 	- if !same_thread_group(p->real_parent, tracer), then the tracee was
> 	  ptrace_reparented(), and now we need to notify its natural parent
> 	  to let it know it has a zombie child.
>
> 	- otherwise, the tracee is our natural child, and it is actually dead.
> 	  however, since we are going to reap this task, we need to wake up our
> 	  sub-threads possibly sleeping on ->wait_chldexit wait_queue_head_t.
>
> See?
>
> > +	if (!dead && thread_group_empty(p))
> > +		dead = do_notify_parent(p, p->exit_signal);
>
> No, this looks wrong. Or I missed something?

Yes, but...

That said, it seems that we do not need __wake_up_parent() if it was our
natural child?

I'll recheck. Eric, I'll continue to read this series tomorrow, can't
concentrate on ptrace today.

Oleg.
Re: [PATCH 05/16] ptrace: Remove dead code from __ptrace_detach
Posted by Eric W. Biederman 3 years, 11 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/24, Oleg Nesterov wrote:
>>
>> Sorry for delay.
>>
>> On 05/18, Eric W. Biederman wrote:
>> >
>> > Ever since commit 28d838cc4dfe ("Fix ptrace self-attach rule") it has
>> > been impossible to attach another thread in the same thread group.
>> >
>> > Remove the code from __ptrace_detach that was trying to support
>> > detaching from a thread in the same thread group.
>>
>> may be I am totally confused, but I think you misunderstood this code
>> and thus this patch is very wrong.
>>
>> The same_thread_group() check does NOT try to check if debugger and
>> tracee is in the same thread group, this is indeed impossible.
>>
>> We need this check to know if the tracee was ptrace_reparented() before
>> __ptrace_unlink() or not.
>>
>>
>> > -static int ignoring_children(struct sighand_struct *sigh)
>> > -{
>> > -	int ret;
>> > -	spin_lock(&sigh->siglock);
>> > -	ret = (sigh->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) ||
>> > -	      (sigh->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT);
>> > -	spin_unlock(&sigh->siglock);
>> > -	return ret;
>> > -}
>>
>> ...
>>
>> > @@ -565,14 +552,9 @@ static bool __ptrace_detach(struct task_struct *tracer, struct task_struct *p)
>> >
>> >  	dead = !thread_group_leader(p);
>> >
>> > -	if (!dead && thread_group_empty(p)) {
>> > -		if (!same_thread_group(p->real_parent, tracer))
>> > -			dead = do_notify_parent(p, p->exit_signal);
>> > -		else if (ignoring_children(tracer->sighand)) {
>> > -			__wake_up_parent(p, tracer);
>> > -			dead = true;
>> > -		}
>> > -	}
>>
>> So the code above does:
>>
>> 	- if !same_thread_group(p->real_parent, tracer), then the tracee was
>> 	  ptrace_reparented(), and now we need to notify its natural parent
>> 	  to let it know it has a zombie child.
>>
>> 	- otherwise, the tracee is our natural child, and it is actually dead.
>> 	  however, since we are going to reap this task, we need to wake up our
>> 	  sub-threads possibly sleeping on ->wait_chldexit wait_queue_head_t.
>>
>> See?
>>
>> > +	if (!dead && thread_group_empty(p))
>> > +		dead = do_notify_parent(p, p->exit_signal);
>>
>> No, this looks wrong. Or I missed something?
>
> Yes, but...
>
> That said, it seems that we do not need __wake_up_parent() if it was our
> natural child?

Agreed on both counts.

Hmm.  I see where the logic comes from.  The ignoring_children test and
the __wake_up_parent are what do_notify_parent does when the parent
ignores children.  Hmm.  I even see all of this document in the comment
above __ptrace_detach.

So I am just going to drop this change.

> I'll recheck. Eric, I'll continue to read this series tomorrow, can't
> concentrate on ptrace today.

No worries.  This was entirely too close to the merge window so I
dropped it all until today.

Eric
[PATCH 06/16] ptrace: Remove unnecessary locking in ptrace_(get|set)siginfo
Posted by Eric W. Biederman 3 years, 11 months ago
Since commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request
can never race with SIGKILL") it has been unnecessary for
ptrace_getsiginfo and ptrace_setsiginfo to use lock_task_sighand.

Having the code taking an unnecessary lock is confusing
as it suggests that other parts of the code need to take
the unnecessary lock as well.

So remove the unnecessary lock to make the code more
efficient, simpler, and less confusing.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 30 ++++++++----------------------
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index ca0e47691229..15e93eafa6f0 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -677,34 +677,20 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
 
 static int ptrace_getsiginfo(struct task_struct *child, kernel_siginfo_t *info)
 {
-	unsigned long flags;
-	int error = -ESRCH;
+	if (unlikely(!child->last_siginfo))
+		return -EINVAL;
 
-	if (lock_task_sighand(child, &flags)) {
-		error = -EINVAL;
-		if (likely(child->last_siginfo != NULL)) {
-			copy_siginfo(info, child->last_siginfo);
-			error = 0;
-		}
-		unlock_task_sighand(child, &flags);
-	}
-	return error;
+	copy_siginfo(info, child->last_siginfo);
+	return 0;
 }
 
 static int ptrace_setsiginfo(struct task_struct *child, const kernel_siginfo_t *info)
 {
-	unsigned long flags;
-	int error = -ESRCH;
+	if (unlikely(!child->last_siginfo))
+		return -EINVAL;
 
-	if (lock_task_sighand(child, &flags)) {
-		error = -EINVAL;
-		if (likely(child->last_siginfo != NULL)) {
-			copy_siginfo(child->last_siginfo, info);
-			error = 0;
-		}
-		unlock_task_sighand(child, &flags);
-	}
-	return error;
+	copy_siginfo(child->last_siginfo, info);
+	return 0;
 }
 
 static int ptrace_peek_siginfo(struct task_struct *child,
-- 
2.35.3
Re: [PATCH 06/16] ptrace: Remove unnecessary locking in ptrace_(get|set)siginfo
Posted by Oleg Nesterov 3 years, 11 months ago
On 05/18, Eric W. Biederman wrote:
>
> Since commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request
> can never race with SIGKILL") it has been unnecessary for
> ptrace_getsiginfo and ptrace_setsiginfo to use lock_task_sighand.

ACK
[PATCH 07/16] signal: Wake up the designated parent
Posted by Eric W. Biederman 3 years, 11 months ago
Today if a process is ptraced only the ptracer will ever be woken up in
wait, if the parent is waiting with __WNOTHREAD.  Update the code
so that the real_parent can also be woken up with __WNOTHREAD even
when the code is ptraced.

Fixes: 75b95953a569 ("job control: Add @for_ptrace to do_notify_parent_cldstop()")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/exit.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..0e26f73c49ac 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1421,26 +1421,35 @@ static int ptrace_do_wait(struct wait_opts *wo, struct task_struct *tsk)
 	return 0;
 }
 
+struct child_wait_info {
+	struct task_struct *p;
+	struct task_struct *parent;
+};
+
 static int child_wait_callback(wait_queue_entry_t *wait, unsigned mode,
 				int sync, void *key)
 {
 	struct wait_opts *wo = container_of(wait, struct wait_opts,
 						child_wait);
-	struct task_struct *p = key;
+	struct child_wait_info *info = key;
 
-	if (!eligible_pid(wo, p))
+	if (!eligible_pid(wo, info->p))
 		return 0;
 
-	if ((wo->wo_flags & __WNOTHREAD) && wait->private != p->parent)
-		return 0;
+	if ((wo->wo_flags & __WNOTHREAD) && (wait->private != info->parent))
+			return 0;
 
 	return default_wake_function(wait, mode, sync, key);
 }
 
 void __wake_up_parent(struct task_struct *p, struct task_struct *parent)
 {
+	struct child_wait_info info = {
+		.p = p,
+		.parent = parent,
+	};
 	__wake_up_sync_key(&parent->signal->wait_chldexit,
-			   TASK_INTERRUPTIBLE, p);
+			   TASK_INTERRUPTIBLE, &info);
 }
 
 static bool is_effectively_child(struct wait_opts *wo, bool ptrace,
-- 
2.35.3
Re: [PATCH 07/16] signal: Wake up the designated parent
Posted by Oleg Nesterov 3 years, 11 months ago
I fail to understand this patch...

On 05/18, Eric W. Biederman wrote:
>
> Today if a process is ptraced only the ptracer will ever be woken up in
> wait

and why is this wrong?

> Fixes: 75b95953a569 ("job control: Add @for_ptrace to do_notify_parent_cldstop()")

how does this change fix 75b95953a569?

>  static int child_wait_callback(wait_queue_entry_t *wait, unsigned mode,
>  				int sync, void *key)
>  {
>  	struct wait_opts *wo = container_of(wait, struct wait_opts,
>  						child_wait);
> -	struct task_struct *p = key;
> +	struct child_wait_info *info = key;
>
> -	if (!eligible_pid(wo, p))
> +	if (!eligible_pid(wo, info->p))
>  		return 0;
>
> -	if ((wo->wo_flags & __WNOTHREAD) && wait->private != p->parent)
> -		return 0;
> +	if ((wo->wo_flags & __WNOTHREAD) && (wait->private != info->parent))
> +			return 0;

So. wait->private is the task T which sleeping on wait_chldexit.

Before the patch the logic is clear. T called do_wait(__WNOTHREAD) and
we do not need to wake it up if it is not the "actual" parent of p.

After the patch we check it T is actual to the "parent" arg passed to
__wake_up_parent(). Why??? This arg is only used to find the
->signal->wait_chldexit wait_queue_head, and this is fine.

As I said, I don't understand this patch. But at least this change is
wrong in case when __wake_up_parent() is calles by __ptrace_detach().
(you removed it in 5/16 but this looks wrong too). Sure, we can change
ptrace_detach() to use __wake_up_parent(p, p->parent), but for what?

I must have missed something.

Oleg.
Re: [PATCH 07/16] signal: Wake up the designated parent
Posted by Oleg Nesterov 3 years, 11 months ago
On 05/24, Oleg Nesterov wrote:
>
> I fail to understand this patch...
>
> On 05/18, Eric W. Biederman wrote:
> >
> > Today if a process is ptraced only the ptracer will ever be woken up in
> > wait
>
> and why is this wrong?
>
> > Fixes: 75b95953a569 ("job control: Add @for_ptrace to do_notify_parent_cldstop()")
>
> how does this change fix 75b95953a569?

OK, I guess you mean the 2nd do_notify_parent_cldstop() in ptrace_stop(),
the problematic case is current->ptrace == T. Right?

I dislike this patch anyway, but let me think more about it.

Oleg.
Re: [PATCH 07/16] signal: Wake up the designated parent
Posted by Oleg Nesterov 3 years, 11 months ago
On 05/24, Oleg Nesterov wrote:
>
> On 05/24, Oleg Nesterov wrote:
> >
> > I fail to understand this patch...
> >
> > On 05/18, Eric W. Biederman wrote:
> > >
> > > Today if a process is ptraced only the ptracer will ever be woken up in
> > > wait
> >
> > and why is this wrong?
> >
> > > Fixes: 75b95953a569 ("job control: Add @for_ptrace to do_notify_parent_cldstop()")
> >
> > how does this change fix 75b95953a569?
>
> OK, I guess you mean the 2nd do_notify_parent_cldstop() in ptrace_stop(),
> the problematic case is current->ptrace == T. Right?
>
> I dislike this patch anyway, but let me think more about it.

OK, now that I understand the problem, the patch doesn't look bad to me,
although I'd ask to make the changelog more clear.

After this change __wake_up_parent() can't accept any "parent" from
p->parent thread group, but all callers look fine except ptrace_detach().

Oleg.
Re: [PATCH 07/16] signal: Wake up the designated parent
Posted by Eric W. Biederman 3 years, 11 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/24, Oleg Nesterov wrote:
>>
>> On 05/24, Oleg Nesterov wrote:
>> >
>> > I fail to understand this patch...
>> >
>> > On 05/18, Eric W. Biederman wrote:
>> > >
>> > > Today if a process is ptraced only the ptracer will ever be woken up in
>> > > wait
>> >
>> > and why is this wrong?
>> >
>> > > Fixes: 75b95953a569 ("job control: Add @for_ptrace to do_notify_parent_cldstop()")
>> >
>> > how does this change fix 75b95953a569?
>>
>> OK, I guess you mean the 2nd do_notify_parent_cldstop() in ptrace_stop(),
>> the problematic case is current->ptrace == T. Right?
>>
>> I dislike this patch anyway, but let me think more about it.
>
> OK, now that I understand the problem, the patch doesn't look bad to me,
> although I'd ask to make the changelog more clear.

I will see what I can do.

> After this change __wake_up_parent() can't accept any "parent" from
> p->parent thread group, but all callers look fine except
> ptrace_detach().

Having looked at it a little more I think the change was too
restrictive.  For the !ptrace_reparented case there are possibly
two threads of the parent process that wait_consider_task will
allow to wait even with __WNOTHREAD specified.  It is desirable
to wake them both up.

Which if I have had enough sleep reduces this patch to just:

diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..c8156366b722 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1431,8 +1431,10 @@ static int child_wait_callback(wait_queue_entry_t *wait, unsigned mode,
        if (!eligible_pid(wo, p))
                return 0;
 
-       if ((wo->wo_flags & __WNOTHREAD) && wait->private != p->parent)
-               return 0;
+       if ((wo->wo_flags & __WNOTHREAD) &&
+           (wait->private != p->parent) &&
+           (wait->private != p->real_parent))
+                       return 0;
 
        return default_wake_function(wait, mode, sync, key);
 }


I think that solves the issue without missing wake-ups without adding
any more.

For the same set of reasons it looks like the __wake_up_parent in
__ptrace_detach is just simply dead code.  I don't think there is a case
where when !ptrace_reparented the thread that is the real_parent can
sleep in do_wait when the thread that was calling ptrace could not.

That needs a very close look to confirm. 

Eric
Re: [PATCH 07/16] signal: Wake up the designated parent
Posted by Oleg Nesterov 3 years, 11 months ago
On 06/06, Eric W. Biederman wrote:
>
> Which if I have had enough sleep reduces this patch to just:
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index f072959fcab7..c8156366b722 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1431,8 +1431,10 @@ static int child_wait_callback(wait_queue_entry_t *wait, unsigned mode,
>         if (!eligible_pid(wo, p))
>                 return 0;
>
> -       if ((wo->wo_flags & __WNOTHREAD) && wait->private != p->parent)
> -               return 0;
> +       if ((wo->wo_flags & __WNOTHREAD) &&
> +           (wait->private != p->parent) &&
> +           (wait->private != p->real_parent))
> +                       return 0;
>
>         return default_wake_function(wait, mode, sync, key);
>  }
>
>
> I think that solves the issue without missing wake-ups without adding
> any more.

Agreed, and looks much simpler.

> For the same set of reasons it looks like the __wake_up_parent in
> __ptrace_detach is just simply dead code.  I don't think there is a case
> where when !ptrace_reparented the thread that is the real_parent can
> sleep in do_wait when the thread that was calling ptrace could not.

Yes... this doesn't really differ from the case when one thread reaps
a natural child and another thread sleep in do_wait().

> That needs a very close look to confirm.

Yes.

Oleg.
[PATCH 08/16] ptrace: Only populate last_siginfo from ptrace
Posted by Eric W. Biederman 3 years, 11 months ago
The code in ptrace_signal to populate siginfo if the signal number
changed is buggy.  If the tracer contined the tracee using
ptrace_detach it is guaranteed to use the real_parent (or possibly a
new tracer) but definitely not the origional tracer to populate si_pid
and si_uid.

Fix this bug by only updating siginfo from the tracer so that the
tracers pid and the tracers uid are always used.

If it happens that ptrace_resume or ptrace_detach don't have
a signal to continue with clear siginfo.

This is a very old bug that has been fixable since commit 1669ce53e2ff
("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO") when last_siginfo was
introduced and the tracer could change siginfo.

Fixes: v2.1.68
History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 31 +++++++++++++++++++++++++++++--
 kernel/signal.c | 18 ------------------
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 15e93eafa6f0..a24eed725cec 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -526,6 +526,33 @@ static int ptrace_traceme(void)
 	return ret;
 }
 
+static void ptrace_set_signr(struct task_struct *child, unsigned int signr)
+{
+	struct kernel_siginfo *info = child->last_siginfo;
+
+	child->exit_code = signr;
+	/*
+	 * Update the siginfo structure if the signal has
+	 * changed.  If the debugger wanted something
+	 * specific in the siginfo structure then it should
+	 * have updated *info via PTRACE_SETSIGINFO.
+	 */
+	if (info && (info->si_signo != signr)) {
+		clear_siginfo(info);
+
+		if (signr != 0) {
+			info->si_signo = signr;
+			info->si_errno = 0;
+			info->si_code = SI_USER;
+			rcu_read_lock();
+			info->si_pid = task_pid_nr_ns(current, task_active_pid_ns(child));
+			info->si_uid = from_kuid_munged(task_cred_xxx(child, user_ns),
+						current_uid());
+			rcu_read_unlock();
+		}
+	}
+}
+
 /*
  * Called with tasklist_lock held for writing.
  * Unlink a traced task, and clean it up if it was a traced zombie.
@@ -579,7 +606,7 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
 	 * tasklist_lock avoids the race with wait_task_stopped(), see
 	 * the comment in ptrace_resume().
 	 */
-	child->exit_code = data;
+	ptrace_set_signr(child, data);
 	__ptrace_detach(current, child);
 	write_unlock_irq(&tasklist_lock);
 
@@ -851,7 +878,7 @@ static int ptrace_resume(struct task_struct *child, long request,
 	 * wait_task_stopped() after resume.
 	 */
 	spin_lock_irq(&child->sighand->siglock);
-	child->exit_code = data;
+	ptrace_set_signr(child, data);
 	child->jobctl &= ~JOBCTL_TRACED;
 	wake_up_state(child, __TASK_TRACED);
 	spin_unlock_irq(&child->sighand->siglock);
diff --git a/kernel/signal.c b/kernel/signal.c
index e782c2611b64..ff4a52352390 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2562,24 +2562,6 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	if (signr == 0)
 		return signr;
 
-	/*
-	 * Update the siginfo structure if the signal has
-	 * changed.  If the debugger wanted something
-	 * specific in the siginfo structure then it should
-	 * have updated *info via PTRACE_SETSIGINFO.
-	 */
-	if (signr != info->si_signo) {
-		clear_siginfo(info);
-		info->si_signo = signr;
-		info->si_errno = 0;
-		info->si_code = SI_USER;
-		rcu_read_lock();
-		info->si_pid = task_pid_vnr(current->parent);
-		info->si_uid = from_kuid_munged(current_user_ns(),
-						task_uid(current->parent));
-		rcu_read_unlock();
-	}
-
 	/* If the (new) signal is now blocked, requeue it.  */
 	if (sigismember(&current->blocked, signr) ||
 	    fatal_signal_pending(current)) {
-- 
2.35.3
Re: [PATCH 08/16] ptrace: Only populate last_siginfo from ptrace
Posted by Oleg Nesterov 3 years, 11 months ago
On 05/18, Eric W. Biederman wrote:
>
> The code in ptrace_signal to populate siginfo if the signal number
> changed is buggy.  If the tracer contined the tracee using
> ptrace_detach it is guaranteed to use the real_parent (or possibly a
> new tracer) but definitely not the origional tracer to populate si_pid
> and si_uid.

I guess nobody cares. As the comment says

	 If the debugger wanted something
	 specific in the siginfo structure then it should
	 have updated *info via PTRACE_SETSIGINFO.

otherwise I don't think si_pid/si_uid have any value.

However the patch looks fine to me, just the word "buggy" looks a bit
too strong imo.

Oleg.
Re: [PATCH 08/16] ptrace: Only populate last_siginfo from ptrace
Posted by Eric W. Biederman 3 years, 11 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/18, Eric W. Biederman wrote:
>>
>> The code in ptrace_signal to populate siginfo if the signal number
>> changed is buggy.  If the tracer contined the tracee using
>> ptrace_detach it is guaranteed to use the real_parent (or possibly a
>> new tracer) but definitely not the origional tracer to populate si_pid
>> and si_uid.
>
> I guess nobody cares. As the comment says
>
> 	 If the debugger wanted something
> 	 specific in the siginfo structure then it should
> 	 have updated *info via PTRACE_SETSIGINFO.
>
> otherwise I don't think si_pid/si_uid have any value.

No one has complained so it is clearly no one cares.  So it is
definitely not a regression.  Or even anything that needs to be
backported.

However si_pid and si_uid are defined with SI_USER are defined
to be whomever sent the signal.  So I would argue by definition
those values are wrong.

> However the patch looks fine to me, just the word "buggy" looks a bit
> too strong imo.

I guess I am in general agreement.  Perhaps I can just say they values
are wrong by definition?

Eric
Re: [PATCH 08/16] ptrace: Only populate last_siginfo from ptrace
Posted by Oleg Nesterov 3 years, 11 months ago
On 06/06, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > However the patch looks fine to me, just the word "buggy" looks a bit
> > too strong imo.
>
> I guess I am in general agreement.  Perhaps I can just say they values
> are wrong by definition?

Up to you. I won't really argue with "buggy".

Oleg.
[PATCH 09/16] ptrace: In ptrace_setsiginfo deal with invalid si_signo
Posted by Eric W. Biederman 3 years, 11 months ago
If the tracer calls PTRACE_SETSIGINFO it only has an effect if the
tracee is stopped in ptrace_signal.

When one of PTRACE_DETACH, PTRACE_SINGLESTEP, PTRACE_SINGLEBLOCK,
PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP, PTRACE_SYSCALL, or
PTRACE_CONT pass in a signel number to continue with the kernel
validates that signal number and the ptrace_signal verifies the signal
number matches the si_signo, before the siginfo is used.

As the signal number to continue with is verified to be a valid signal
number the signal number in si_signo must be a valid signal number.

Make this obvious and avoid needing checks later by immediately
clearing siginfo if si_signo is not valid.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index a24eed725cec..a0a07d140751 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -716,7 +716,9 @@ static int ptrace_setsiginfo(struct task_struct *child, const kernel_siginfo_t *
 	if (unlikely(!child->last_siginfo))
 		return -EINVAL;
 
-	copy_siginfo(child->last_siginfo, info);
+	clear_siginfo(child->last_siginfo);
+	if (valid_signal(info->si_signo))
+		copy_siginfo(child->last_siginfo, info);
 	return 0;
 }
 
-- 
2.35.3
[PATCH 10/16] ptrace: In ptrace_signal look at what the debugger did with siginfo
Posted by Eric W. Biederman 3 years, 11 months ago
Now that siginfo is only modified by the tracer and that siginfo is
cleared with the signal is canceled have ptrace_signal directly examine
siginfo.

This makes the code a little simpler and handles the case when
the tracer exits without calling ptrace_detach.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/signal.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index ff4a52352390..3d955c23b13d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2556,9 +2556,10 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
 	 * comment in dequeue_signal().
 	 */
 	current->jobctl |= JOBCTL_STOP_DEQUEUED;
-	signr = ptrace_stop(signr, CLD_TRAPPED, 0, info);
+	ptrace_stop(signr, CLD_TRAPPED, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
+	signr = info->si_signo;
 	if (signr == 0)
 		return signr;
 
-- 
2.35.3
[PATCH 11/16] ptrace: Use si_sino as the signal number to resume with
Posted by Eric W. Biederman 3 years, 11 months ago
The signal number to resume with is already in si_signo.  So instead
of placing an extra copy in tsk->exit_code and later reading the extra
copy from tsk->exit_code just read si_signo.

Read si_signo in ptrace_do_notify where it is easy as the siginfo is a
local variable.  Only ptrace_report_syscall cares about the signal to
resume with from ptrace_stop and it calls ptrace_notify which calls
ptrace_do_notify so moving the actual work into ptrace_do_notify where
it is easier is not a problem.

With ptrace_stop not being involved in returning the signal to tracer
asked the tracee to resume with remove the comment and the return
code from ptrace_stop.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c |  1 -
 kernel/signal.c | 13 ++++---------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index a0a07d140751..e0ecb1536dfc 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -530,7 +530,6 @@ static void ptrace_set_signr(struct task_struct *child, unsigned int signr)
 {
 	struct kernel_siginfo *info = child->last_siginfo;
 
-	child->exit_code = signr;
 	/*
 	 * Update the siginfo structure if the signal has
 	 * changed.  If the debugger wanted something
diff --git a/kernel/signal.c b/kernel/signal.c
index 3d955c23b13d..2cc45e8448e2 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2186,12 +2186,8 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  * We always set current->last_siginfo while stopped here.
  * That makes it a way to test a stopped process for
  * being ptrace-stopped vs being job-control-stopped.
- *
- * Returns the signal the ptracer requested the code resume
- * with.  If the code did not stop because the tracer is gone,
- * the stop signal remains unchanged unless clear_code.
  */
-static int ptrace_stop(int exit_code, int why, unsigned long message,
+static void ptrace_stop(int exit_code, int why, unsigned long message,
 		       kernel_siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
@@ -2219,7 +2215,7 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * signals here to prevent ptrace_stop sleeping in schedule.
 	 */
 	if (!current->ptrace || __fatal_signal_pending(current))
-		return exit_code;
+		return;
 
 	set_special_state(TASK_TRACED);
 	current->jobctl |= JOBCTL_TRACED;
@@ -2302,7 +2298,6 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * any signal-sending on another CPU that wants to examine it.
 	 */
 	spin_lock_irq(&current->sighand->siglock);
-	exit_code = current->exit_code;
 	current->last_siginfo = NULL;
 	current->ptrace_message = 0;
 	current->exit_code = 0;
@@ -2316,7 +2311,6 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * This sets TIF_SIGPENDING, but never clears it.
 	 */
 	recalc_sigpending_tsk(current);
-	return exit_code;
 }
 
 static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long message)
@@ -2330,7 +2324,8 @@ static int ptrace_do_notify(int signr, int exit_code, int why, unsigned long mes
 	info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
 
 	/* Let the debugger run.  */
-	return ptrace_stop(exit_code, why, message, &info);
+	ptrace_stop(exit_code, why, message, &info);
+	return info.si_signo;
 }
 
 int ptrace_notify(int exit_code, unsigned long message)
-- 
2.35.3
[PATCH 12/16] ptrace: Stop protecting ptrace_set_signr with tasklist_lock
Posted by Eric W. Biederman 3 years, 11 months ago
Now that ptrace_set_signr no longer sets task->exit_code the race
documented in commit b72c186999e6 ("ptrace: fix race between
ptrace_resume() and wait_task_stopped()") is no longer possible, as
task->exit_code is only updated by wait during a ptrace_stop.

As there is no possibilty of a race and ptrace_freeze_traced is
all of the protection ptrace_set_signr needs to operate without
contention move ptrace_set_signr outside of tasklist_lock
and remove the documentation about the race that is no more.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index e0ecb1536dfc..d0527b6e2b29 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -595,17 +595,14 @@ static int ptrace_detach(struct task_struct *child, unsigned int data)
 	/* Architecture-specific hardware disable .. */
 	ptrace_disable(child);
 
+	ptrace_set_signr(child, data);
+
 	write_lock_irq(&tasklist_lock);
 	/*
 	 * We rely on ptrace_freeze_traced(). It can't be killed and
 	 * untraced by another thread, it can't be a zombie.
 	 */
 	WARN_ON(!child->ptrace || child->exit_state);
-	/*
-	 * tasklist_lock avoids the race with wait_task_stopped(), see
-	 * the comment in ptrace_resume().
-	 */
-	ptrace_set_signr(child, data);
 	__ptrace_detach(current, child);
 	write_unlock_irq(&tasklist_lock);
 
@@ -869,17 +866,9 @@ static int ptrace_resume(struct task_struct *child, long request,
 		user_disable_single_step(child);
 	}
 
-	/*
-	 * Change ->exit_code and ->state under siglock to avoid the race
-	 * with wait_task_stopped() in between; a non-zero ->exit_code will
-	 * wrongly look like another report from tracee.
-	 *
-	 * Note that we need siglock even if ->exit_code == data and/or this
-	 * status was not reported yet, the new status must not be cleared by
-	 * wait_task_stopped() after resume.
-	 */
-	spin_lock_irq(&child->sighand->siglock);
 	ptrace_set_signr(child, data);
+
+	spin_lock_irq(&child->sighand->siglock);
 	child->jobctl &= ~JOBCTL_TRACED;
 	wake_up_state(child, __TASK_TRACED);
 	spin_unlock_irq(&child->sighand->siglock);
-- 
2.35.3
[PATCH 13/16] ptrace: Document why ptrace_setoptions does not need a lock
Posted by Eric W. Biederman 3 years, 11 months ago
The functions that change ->ptrace are: ptrace_attach, ptrace_traceme,
ptrace_init_task, __ptrace_unlink, ptrace_setoptions.

Except for ptrace_setoptions all of the places where ->ptrace is
modified hold tasklist_lock for write, and either the tracee or the
tracer is modifies ->ptrace.

When ptrace_setoptions is called the tracee has been frozen with
ptrace_freeze_traced, and most be explicitly unfrozen by the tracer
before it can do anything.  As ptrace_setoption is run in the tracer
there can be no contention by the simple fact that the tracee can't
run.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d0527b6e2b29..fbadd2f21f09 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -689,7 +689,10 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
 	if (ret)
 		return ret;
 
-	/* Avoid intermediate state when all opts are cleared */
+	/*
+	 * With a frozen tracee, only the tracer modifies ->ptrace.
+	 * Avoid intermediate state when all opts are cleared.
+	 */
 	flags = child->ptrace;
 	flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
 	flags |= (data << PT_OPT_FLAG_SHIFT);
-- 
2.35.3
[PATCH 14/16] signal: Protect parent child relationships by childs siglock
Posted by Eric W. Biederman 3 years, 11 months ago
The functions ptrace_stop and do_signal_stop have to drop siglock
and grab tasklist_lock because the parent/child relation ship
is guarded by tasklist_lock and not siglock.

Simplify things by additionally guarding the parent/child relationship
with siglock.  This just requires a little bit of code motion.

After this change tsk->parent, tsk->real_parent, tsk->ptracer_cred
are all protected by tsk->siglock.

The fields tsk->sibling and tsk->ptrace_entry are mostly protected by
tsk->siglock.  The field tsk->ptrace_entry is not protected by siglock
when tsk->ptrace_entry is reused as the dead task list.  The field
tsk->sibling is not protected by siglock when children are reparented
because their original parent dies.

The field tsk->ptrace is protected by siglock except for the options
which may change without siglock being held.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/exit.c   |  4 ++++
 kernel/fork.c   | 12 ++++++------
 kernel/ptrace.c |  9 +++++----
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 0e26f73c49ac..bad434b23c48 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -643,11 +643,15 @@ static void forget_original_parent(struct task_struct *father,
 
 	reaper = find_new_reaper(father, reaper);
 	list_for_each_entry(p, &father->children, sibling) {
+		spin_lock(&p->sighand->siglock);
 		for_each_thread(p, t) {
 			RCU_INIT_POINTER(t->real_parent, reaper);
 			BUG_ON((!t->ptrace) != (rcu_access_pointer(t->parent) == father));
 			if (likely(!t->ptrace))
 				t->parent = t->real_parent;
+		}
+		spin_unlock(&p->sighand->siglock);
+		for_each_thread(p, t) {
 			if (t->pdeath_signal)
 				group_send_sig_info(t->pdeath_signal,
 						    SEND_SIG_NOINFO, t,
diff --git a/kernel/fork.c b/kernel/fork.c
index 9796897560ab..841021da69f3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2367,6 +2367,12 @@ static __latent_entropy struct task_struct *copy_process(
 	 */
 	write_lock_irq(&tasklist_lock);
 
+	klp_copy_process(p);
+
+	sched_core_fork(p);
+
+	spin_lock(&current->sighand->siglock);
+
 	/* CLONE_PARENT re-uses the old parent */
 	if (clone_flags & (CLONE_PARENT|CLONE_THREAD)) {
 		p->real_parent = current->real_parent;
@@ -2381,12 +2387,6 @@ static __latent_entropy struct task_struct *copy_process(
 		p->exit_signal = args->exit_signal;
 	}
 
-	klp_copy_process(p);
-
-	sched_core_fork(p);
-
-	spin_lock(&current->sighand->siglock);
-
 	/*
 	 * Copy seccomp details explicitly here, in case they were changed
 	 * before holding sighand lock.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index fbadd2f21f09..77dfdb3d1ced 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -123,13 +123,12 @@ void __ptrace_unlink(struct task_struct *child)
 	clear_task_syscall_work(child, SYSCALL_EMU);
 #endif
 
+	spin_lock(&child->sighand->siglock);
 	child->parent = child->real_parent;
 	list_del_init(&child->ptrace_entry);
 	old_cred = child->ptracer_cred;
 	child->ptracer_cred = NULL;
 	put_cred(old_cred);
-
-	spin_lock(&child->sighand->siglock);
 	child->ptrace = 0;
 	/*
 	 * Clear all pending traps and TRAPPING.  TRAPPING should be
@@ -441,15 +440,15 @@ static int ptrace_attach(struct task_struct *task, long request,
 	if (task->ptrace)
 		goto unlock_tasklist;
 
+	spin_lock(&task->sighand->siglock);
 	task->ptrace = flags;
 
 	ptrace_link(task, current);
 
 	/* SEIZE doesn't trap tracee on attach */
 	if (!seize)
-		send_sig_info(SIGSTOP, SEND_SIG_PRIV, task);
+		send_signal_locked(SIGSTOP, SEND_SIG_PRIV, task, PIDTYPE_PID);
 
-	spin_lock(&task->sighand->siglock);
 
 	/*
 	 * If the task is already STOPPED, set JOBCTL_TRAP_STOP and
@@ -517,8 +516,10 @@ static int ptrace_traceme(void)
 		 * pretend ->real_parent untraces us right after return.
 		 */
 		if (!ret && !(current->real_parent->flags & PF_EXITING)) {
+			spin_lock(&current->sighand->siglock);
 			current->ptrace = PT_PTRACED;
 			ptrace_link(current, current->real_parent);
+			spin_unlock(&current->sighand->siglock);
 		}
 	}
 	write_unlock_irq(&tasklist_lock);
-- 
2.35.3
[PATCH 15/16] ptrace: Use siglock instead of tasklist_lock in ptrace_check_attach
Posted by Eric W. Biederman 3 years, 11 months ago
Now that siglock protects tsk->parent and tsk->ptrace there is no need
to grab tasklist_lock in ptrace_check_attach.  The siglock can handle
all of the locking needs of ptrace_check_attach.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/ptrace.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 77dfdb3d1ced..fa65841bbdbe 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -194,17 +194,14 @@ static bool ptrace_freeze_traced(struct task_struct *task)
 {
 	bool ret = false;
 
-	/* Lockless, nobody but us can set this flag */
 	if (task->jobctl & JOBCTL_LISTENING)
 		return ret;
 
-	spin_lock_irq(&task->sighand->siglock);
 	if (task_is_traced(task) && !looks_like_a_spurious_pid(task) &&
 	    !__fatal_signal_pending(task)) {
 		task->jobctl |= JOBCTL_PTRACE_FROZEN;
 		ret = true;
 	}
-	spin_unlock_irq(&task->sighand->siglock);
 
 	return ret;
 }
@@ -240,32 +237,30 @@ static void ptrace_unfreeze_traced(struct task_struct *task)
  * state.
  *
  * CONTEXT:
- * Grabs and releases tasklist_lock and @child->sighand->siglock.
+ * Grabs and releases @child->sighand->siglock.
  *
  * RETURNS:
  * 0 on success, -ESRCH if %child is not ready.
  */
 static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 {
+	unsigned long flags;
 	int ret = -ESRCH;
 
 	/*
-	 * We take the read lock around doing both checks to close a
+	 * We take the siglock around doing both checks to close a
 	 * possible race where someone else was tracing our child and
 	 * detached between these two checks.  After this locked check,
 	 * we are sure that this is our traced child and that can only
 	 * be changed by us so it's not changing right after this.
 	 */
-	read_lock(&tasklist_lock);
-	if (child->ptrace && child->parent == current) {
-		/*
-		 * child->sighand can't be NULL, release_task()
-		 * does ptrace_unlink() before __exit_signal().
-		 */
-		if (ignore_state || ptrace_freeze_traced(child))
-			ret = 0;
+	if (lock_task_sighand(child, &flags)) {
+		if (child->ptrace && child->parent == current) {
+			if (ignore_state || ptrace_freeze_traced(child))
+				ret = 0;
+		}
+		unlock_task_sighand(child, &flags);
 	}
-	read_unlock(&tasklist_lock);
 
 	if (!ret && !ignore_state &&
 	    WARN_ON_ONCE(!wait_task_inactive(child, __TASK_TRACED)))
-- 
2.35.3
[PATCH 16/16] signal: Always call do_notify_parent_cldstop with siglock held
Posted by Eric W. Biederman 3 years, 11 months ago
Now that siglock keeps tsk->parent and tsk->real_parent constant
require that do_notify_parent_cldstop is called with tsk->siglock held
instead of the tasklist_lock.

As all of the callers of do_notify_parent_cldstop had to drop the
siglock and take tasklist_lock this simplifies all of it's callers.

This removes one reason for taking tasklist_lock.

This makes ptrace_stop so that it should reliably work correctly and
reliably with PREEMPT_RT enabled and CONFIG_CGROUPS disabled.  The
remaining challenge is that cgroup_enter_frozen takes spin_lock after
__state has been set to TASK_TRACED.  Which on PREEMPT_RT means the
code can sleep and change __state.  Not only that but it means that
wait_task_inactive could potentially detect the code scheduling away
at that point and fail, causing ptrace_check_attach to fail.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/signal.c | 262 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 189 insertions(+), 73 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 2cc45e8448e2..d4956be51939 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1994,6 +1994,129 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
 	return ret;
 }
 
+/**
+ * lock_parents_siglocks - Take current, real_parent, and parent's siglock
+ * @lock_tracer: The tracers siglock is needed.
+ *
+ * There is no natural ordering to these locks so they must be sorted
+ * before being taken.
+ *
+ * There are two complicating factors here:
+ * - The locks live in sighand and sighand can be arbitrarily shared
+ * - parent and real_parent can change when current's siglock is unlocked.
+ *
+ * To deal with this first the all of the sighand pointers are
+ * gathered under current's siglock, and the sighand pointers are
+ * sorted.  As siglock lives inside of sighand this also sorts the
+ * siglock's by address.
+ *
+ * Then the siglocks are taken in order dropping current's siglock if
+ * necessary.
+ *
+ * Finally if parent and real_parent have not changed return.
+ * If they either parent has changed drop their locks and try again.
+ *
+ * Changing sighand is an infrequent and somewhat expensive operation
+ * (unshare or exec) and so even in the worst case this loop
+ * should not loop too many times before all of the proper locks are
+ * taken in order.
+ *
+ * CONTEXT:
+ * Must be called with @current->sighand->siglock held
+ *
+ * RETURNS:
+ * current's, real_parent's, and parent's siglock held.
+ */
+static void lock_parents_siglocks(bool lock_tracer)
+	__releases(&current->sighand->siglock)
+	__acquires(&current->sighand->siglock)
+	__acquires(&current->real_parent->sighand->siglock)
+	__acquires(&current->parent->sighand->siglock)
+{
+	struct task_struct *me = current;
+	struct sighand_struct *m_sighand = me->sighand;
+
+	lockdep_assert_held(&m_sighand->siglock);
+
+	rcu_read_lock();
+	for (;;) {
+		struct task_struct *parent, *tracer;
+		struct sighand_struct *p_sighand, *t_sighand, *s1, *s2, *s3;
+
+		parent = me->real_parent;
+		tracer = ptrace_parent(me);
+		if (!tracer || !lock_tracer)
+			tracer = parent;
+
+		p_sighand = rcu_dereference(parent->sighand);
+		t_sighand = rcu_dereference(tracer->sighand);
+
+		/* Sort the sighands so that s1 >= s2 >= s3 */
+		s1 = m_sighand;
+		s2 = p_sighand;
+		s3 = t_sighand;
+		if (s1 > s2)
+			swap(s1, s2);
+		if (s1 > s3)
+			swap(s1, s3);
+		if (s2 > s3)
+			swap(s2, s3);
+
+		/* Take the locks in order */
+		if (s1 != m_sighand) {
+			spin_unlock(&m_sighand->siglock);
+			spin_lock(&s1->siglock);
+		}
+		if (s1 != s2)
+			spin_lock_nested(&s2->siglock, 1);
+		if (s2 != s3)
+			spin_lock_nested(&s3->siglock, 2);
+
+		/* Verify the proper locks are held */
+		if (likely((s1 == m_sighand) ||
+			   ((me->real_parent == parent) &&
+			    (me->parent == tracer) &&
+			    (parent->sighand == p_sighand) &&
+			    (tracer->sighand == t_sighand)))) {
+			break;
+		}
+
+		/* Drop all but current's siglock */
+		if (p_sighand != m_sighand)
+			spin_unlock(&p_sighand->siglock);
+		if (t_sighand != p_sighand)
+			spin_unlock(&t_sighand->siglock);
+
+		/*
+		 * Since [pt]_sighand will likely change if we go
+		 * around, and m_sighand is the only one held, make sure
+		 * it is subclass-0, since the above 's1 != m_sighand'
+		 * clause very much relies on that.
+		 */
+		lock_set_subclass(&m_sighand->siglock.dep_map, 0, _RET_IP_);
+	}
+	rcu_read_unlock();
+}
+
+static void unlock_parents_siglocks(bool unlock_tracer)
+	__releases(&current->real_parent->sighand->siglock)
+	__releases(&current->parent->sighand->siglock)
+{
+	struct task_struct *me = current;
+	struct task_struct *parent = me->real_parent;
+	struct task_struct *tracer = ptrace_parent(me);
+	struct sighand_struct *m_sighand = me->sighand;
+	struct sighand_struct *p_sighand = parent->sighand;
+
+	if (p_sighand != m_sighand)
+		spin_unlock(&p_sighand->siglock);
+	if (tracer && unlock_tracer) {
+		struct sighand_struct *t_sighand = tracer->sighand;
+		if (t_sighand != p_sighand)
+			spin_unlock(&t_sighand->siglock);
+	}
+}
+
 static void do_notify_pidfd(struct task_struct *task)
 {
 	struct pid *pid;
@@ -2125,11 +2248,12 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 				     bool for_ptracer, int why)
 {
 	struct kernel_siginfo info;
-	unsigned long flags;
 	struct task_struct *parent;
 	struct sighand_struct *sighand;
 	u64 utime, stime;
 
+	lockdep_assert_held(&tsk->sighand->siglock);
+
 	if (for_ptracer) {
 		parent = tsk->parent;
 	} else {
@@ -2137,6 +2261,8 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 		parent = tsk->real_parent;
 	}
 
+	lockdep_assert_held(&parent->sighand->siglock);
+
 	clear_siginfo(&info);
 	info.si_signo = SIGCHLD;
 	info.si_errno = 0;
@@ -2168,7 +2294,6 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  	}
 
 	sighand = parent->sighand;
-	spin_lock_irqsave(&sighand->siglock, flags);
 	if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN &&
 	    !(sighand->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
 		send_signal_locked(SIGCHLD, &info, parent, PIDTYPE_TGID);
@@ -2176,7 +2301,6 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 	 * Even if SIGCHLD is not generated, we must wake up wait4 calls.
 	 */
 	__wake_up_parent(tsk, parent);
-	spin_unlock_irqrestore(&sighand->siglock, flags);
 }
 
 /*
@@ -2208,14 +2332,18 @@ static void ptrace_stop(int exit_code, int why, unsigned long message,
 		spin_lock_irq(&current->sighand->siglock);
 	}
 
+	lock_parents_siglocks(true);
 	/*
 	 * After this point ptrace_signal_wake_up or signal_wake_up
 	 * will clear TASK_TRACED if ptrace_unlink happens or a fatal
 	 * signal comes in.  Handle previous ptrace_unlinks and fatal
 	 * signals here to prevent ptrace_stop sleeping in schedule.
 	 */
-	if (!current->ptrace || __fatal_signal_pending(current))
+
+	if (!current->ptrace || __fatal_signal_pending(current)) {
+		unlock_parents_siglocks(true);
 		return;
+	}
 
 	set_special_state(TASK_TRACED);
 	current->jobctl |= JOBCTL_TRACED;
@@ -2254,16 +2382,6 @@ static void ptrace_stop(int exit_code, int why, unsigned long message,
 	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
-	/* any trap clears pending STOP trap, STOP trap clears NOTIFY */
-	task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
-	if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
-		task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
-
-	/* entering a trap, clear TRAPPING */
-	task_clear_jobctl_trapping(current);
-
-	spin_unlock_irq(&current->sighand->siglock);
-	read_lock(&tasklist_lock);
 	/*
 	 * Notify parents of the stop.
 	 *
@@ -2279,14 +2397,25 @@ static void ptrace_stop(int exit_code, int why, unsigned long message,
 	if (gstop_done && (!current->ptrace || ptrace_reparented(current)))
 		do_notify_parent_cldstop(current, false, why);
 
+	unlock_parents_siglocks(true);
+
+	/* any trap clears pending STOP trap, STOP trap clears NOTIFY */
+	task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
+	if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
+		task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
+
+	/* entering a trap, clear TRAPPING */
+	task_clear_jobctl_trapping(current);
+
 	/*
 	 * Don't want to allow preemption here, because
 	 * sys_ptrace() needs this task to be inactive.
 	 *
-	 * XXX: implement read_unlock_no_resched().
+	 * XXX: implement spin_unlock_no_resched().
 	 */
 	preempt_disable();
-	read_unlock(&tasklist_lock);
+	spin_unlock_irq(&current->sighand->siglock);
+
 	cgroup_enter_frozen();
 	preempt_enable_no_resched();
 	freezable_schedule();
@@ -2361,8 +2490,8 @@ int ptrace_notify(int exit_code, unsigned long message)
  * on %true return.
  *
  * RETURNS:
- * %false if group stop is already cancelled or ptrace trap is scheduled.
- * %true if participated in group stop.
+ * %false if group stop is already cancelled.
+ * %true otherwise (as lock_parents_siglocks may have dropped siglock).
  */
 static bool do_signal_stop(int signr)
 	__releases(&current->sighand->siglock)
@@ -2425,36 +2554,24 @@ static bool do_signal_stop(int signr)
 		}
 	}
 
+	lock_parents_siglocks(false);
+	/* Recheck JOBCTL_STOP_PENDING after unlock+lock of siglock */
+	if (unlikely(!(current->jobctl & JOBCTL_STOP_PENDING)))
+		goto out;
 	if (likely(!current->ptrace)) {
-		int notify = 0;
-
 		/*
 		 * If there are no other threads in the group, or if there
 		 * is a group stop in progress and we are the last to stop,
-		 * report to the parent.
+		 * report to the real_parent.
 		 */
 		if (task_participate_group_stop(current))
-			notify = CLD_STOPPED;
+			do_notify_parent_cldstop(current, false, CLD_STOPPED);
+		unlock_parents_siglocks(false);
 
 		current->jobctl |= JOBCTL_STOPPED;
 		set_special_state(TASK_STOPPED);
 		spin_unlock_irq(&current->sighand->siglock);
 
-		/*
-		 * Notify the parent of the group stop completion.  Because
-		 * we're not holding either the siglock or tasklist_lock
-		 * here, ptracer may attach inbetween; however, this is for
-		 * group stop and should always be delivered to the real
-		 * parent of the group leader.  The new ptracer will get
-		 * its notification when this task transitions into
-		 * TASK_TRACED.
-		 */
-		if (notify) {
-			read_lock(&tasklist_lock);
-			do_notify_parent_cldstop(current, false, notify);
-			read_unlock(&tasklist_lock);
-		}
-
 		/* Now we don't run again until woken by SIGCONT or SIGKILL */
 		cgroup_enter_frozen();
 		freezable_schedule();
@@ -2465,8 +2582,11 @@ static bool do_signal_stop(int signr)
 		 * Schedule it and let the caller deal with it.
 		 */
 		task_set_jobctl_pending(current, JOBCTL_TRAP_STOP);
-		return false;
 	}
+out:
+	unlock_parents_siglocks(false);
+	spin_unlock_irq(&current->sighand->siglock);
+	return true;
 }
 
 /**
@@ -2624,32 +2744,30 @@ bool get_signal(struct ksignal *ksig)
 	if (unlikely(signal->flags & SIGNAL_CLD_MASK)) {
 		int why;
 
-		if (signal->flags & SIGNAL_CLD_CONTINUED)
-			why = CLD_CONTINUED;
-		else
-			why = CLD_STOPPED;
+		lock_parents_siglocks(true);
+		/* Recheck signal->flags after unlock+lock of siglock */
+		if (likely(signal->flags & SIGNAL_CLD_MASK)) {
+			if (signal->flags & SIGNAL_CLD_CONTINUED)
+				why = CLD_CONTINUED;
+			else
+				why = CLD_STOPPED;
 
-		signal->flags &= ~SIGNAL_CLD_MASK;
+			signal->flags &= ~SIGNAL_CLD_MASK;
 
-		spin_unlock_irq(&sighand->siglock);
-
-		/*
-		 * Notify the parent that we're continuing.  This event is
-		 * always per-process and doesn't make whole lot of sense
-		 * for ptracers, who shouldn't consume the state via
-		 * wait(2) either, but, for backward compatibility, notify
-		 * the ptracer of the group leader too unless it's gonna be
-		 * a duplicate.
-		 */
-		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(current, false, why);
-
-		if (ptrace_reparented(current->group_leader))
-			do_notify_parent_cldstop(current->group_leader,
-						true, why);
-		read_unlock(&tasklist_lock);
-
-		goto relock;
+			/*
+			 * Notify the parent that we're continuing.  This event is
+			 * always per-process and doesn't make whole lot of sense
+			 * for ptracers, who shouldn't consume the state via
+			 * wait(2) either, but, for backward compatibility, notify
+			 * the ptracer of the group leader too unless it's gonna be
+			 * a duplicate.
+			 */
+			do_notify_parent_cldstop(current, false, why);
+			if (ptrace_reparented(current->group_leader))
+				do_notify_parent_cldstop(current->group_leader,
+							 true, why);
+		}
+		unlock_parents_siglocks(true);
 	}
 
 	for (;;) {
@@ -2906,7 +3024,6 @@ static void retarget_shared_pending(struct task_struct *tsk, sigset_t *which)
 
 void exit_signals(struct task_struct *tsk)
 {
-	int group_stop = 0;
 	sigset_t unblocked;
 
 	/*
@@ -2937,21 +3054,20 @@ void exit_signals(struct task_struct *tsk)
 	signotset(&unblocked);
 	retarget_shared_pending(tsk, &unblocked);
 
-	if (unlikely(tsk->jobctl & JOBCTL_STOP_PENDING) &&
-	    task_participate_group_stop(tsk))
-		group_stop = CLD_STOPPED;
-out:
-	spin_unlock_irq(&tsk->sighand->siglock);
-
 	/*
 	 * If group stop has completed, deliver the notification.  This
 	 * should always go to the real parent of the group leader.
 	 */
-	if (unlikely(group_stop)) {
-		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(tsk, false, group_stop);
-		read_unlock(&tasklist_lock);
+	if (unlikely(tsk->jobctl & JOBCTL_STOP_PENDING)) {
+		lock_parents_siglocks(false);
+		/* Recheck JOBCTL_STOP_PENDING after unlock+lock of siglock */
+		if ((tsk->jobctl & JOBCTL_STOP_PENDING) &&
+		    task_participate_group_stop(tsk))
+			do_notify_parent_cldstop(tsk, false, CLD_STOPPED);
+		unlock_parents_siglocks(false);
 	}
+out:
+	spin_unlock_irq(&tsk->sighand->siglock);
 }
 
 /*
-- 
2.35.3
Re: [PATCH 16/16] signal: Always call do_notify_parent_cldstop with siglock held
Posted by kernel test robot 3 years, 11 months ago
Hi "Eric,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on next-20220518]
[cannot apply to linux/master powerpc/next wireless-next/main wireless/main linus/master v5.18-rc7 v5.18-rc6 v5.18-rc5 v5.18-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Eric-W-Biederman/signal-alpha-Remove-unused-definition-of-TASK_REAL_PARENT/20220519-065947
base:    736ee37e2e8eed7fe48d0a37ee5a709514d478b3
config: parisc-randconfig-s032-20220519 (https://download.01.org/0day-ci/archive/20220521/202205210010.E4Hyn2kD-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 11.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-dirty
        # https://github.com/intel-lab-lkp/linux/commit/4b66a617bf6d095d33fe43e9dbcfdf2e0de9fb29
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Eric-W-Biederman/signal-alpha-Remove-unused-definition-of-TASK_REAL_PARENT/20220519-065947
        git checkout 4b66a617bf6d095d33fe43e9dbcfdf2e0de9fb29
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=parisc SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   kernel/signal.c: note: in included file (through arch/parisc/include/uapi/asm/signal.h, arch/parisc/include/asm/signal.h, include/uapi/linux/signal.h, ...):
   include/uapi/asm-generic/signal-defs.h:83:29: sparse: sparse: multiple address spaces given
   kernel/signal.c:195:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:195:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:195:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:198:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:198:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:198:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:480:9: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:480:9: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:480:9: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:484:34: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:484:34: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:484:34: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:542:53: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct k_sigaction *ka @@     got struct k_sigaction [noderef] __rcu * @@
   kernel/signal.c:542:53: sparse:     expected struct k_sigaction *ka
   kernel/signal.c:542:53: sparse:     got struct k_sigaction [noderef] __rcu *
   include/uapi/asm-generic/signal-defs.h:83:29: sparse: sparse: multiple address spaces given
   kernel/signal.c:1261:9: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: cannot dereference this type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: cannot dereference this type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: cannot dereference this type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: cannot dereference this type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: cannot dereference this type
   kernel/signal.c:1267:29: sparse: sparse: no member 'ip' in struct pt_regs
   kernel/signal.c:1267:29: sparse: sparse: cast from unknown type
   kernel/signal.c:1267:29: sparse: sparse: incompatible types for 'case' statement
   kernel/signal.c:1267:29: sparse: sparse: incompatible types for 'case' statement
   kernel/signal.c:1267:29: sparse: sparse: incompatible types for 'case' statement
   kernel/signal.c:1267:29: sparse: sparse: incompatible types for 'case' statement
   kernel/signal.c:1328:9: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:1328:9: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:1328:9: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:1329:16: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct k_sigaction *action @@     got struct k_sigaction [noderef] __rcu * @@
   kernel/signal.c:1329:16: sparse:     expected struct k_sigaction *action
   kernel/signal.c:1329:16: sparse:     got struct k_sigaction [noderef] __rcu *
   kernel/signal.c:1349:34: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:1349:34: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:1349:34: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:1938:36: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:1938:36: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:1938:36: sparse:     got struct spinlock [noderef] __rcu *
>> kernel/signal.c:2048:46: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *m_sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2048:46: sparse:     expected struct sighand_struct *m_sighand
   kernel/signal.c:2048:46: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/signal.c:2057:24: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/signal.c:2057:24: sparse:     expected struct task_struct *parent
   kernel/signal.c:2057:24: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/signal.c:2087:21: sparse: sparse: incompatible types in comparison expression (different address spaces):
>> kernel/signal.c:2087:21: sparse:    struct task_struct [noderef] __rcu *
>> kernel/signal.c:2087:21: sparse:    struct task_struct *
>> kernel/signal.c:2117:40: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/signal.c:2117:40: sparse:     expected struct task_struct *parent
   kernel/signal.c:2117:40: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/signal.c:2119:46: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *m_sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2119:46: sparse:     expected struct sighand_struct *m_sighand
   kernel/signal.c:2119:46: sparse:     got struct sighand_struct [noderef] __rcu *sighand
>> kernel/signal.c:2120:50: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *p_sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2120:50: sparse:     expected struct sighand_struct *p_sighand
   kernel/signal.c:2120:50: sparse:     got struct sighand_struct [noderef] __rcu *sighand
>> kernel/signal.c:2125:58: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *t_sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2125:58: sparse:     expected struct sighand_struct *t_sighand
   kernel/signal.c:2125:58: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/signal.c:2171:44: sparse: sparse: cast removes address space '__rcu' of expression
   kernel/signal.c:2190:65: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *tsk @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/signal.c:2190:65: sparse:     expected struct task_struct *tsk
   kernel/signal.c:2190:65: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/signal.c:2191:40: sparse: sparse: cast removes address space '__rcu' of expression
   kernel/signal.c:2209:14: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sighand_struct *psig @@     got struct sighand_struct [noderef] __rcu *[noderef] __rcu sighand @@
   kernel/signal.c:2209:14: sparse:     expected struct sighand_struct *psig
   kernel/signal.c:2209:14: sparse:     got struct sighand_struct [noderef] __rcu *[noderef] __rcu sighand
   kernel/signal.c:2238:53: sparse: sparse: incorrect type in argument 3 (different address spaces) @@     expected struct task_struct *t @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/signal.c:2238:53: sparse:     expected struct task_struct *t
   kernel/signal.c:2238:53: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/signal.c:2239:34: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/signal.c:2239:34: sparse:     expected struct task_struct *parent
   kernel/signal.c:2239:34: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/signal.c:2269:24: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/signal.c:2269:24: sparse:     expected struct task_struct *parent
   kernel/signal.c:2269:24: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/signal.c:2272:24: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/signal.c:2272:24: sparse:     expected struct task_struct *parent
   kernel/signal.c:2272:24: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/signal.c:2307:17: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sighand_struct *sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2307:17: sparse:     expected struct sighand_struct *sighand
   kernel/signal.c:2307:17: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/signal.c:2341:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2341:41: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2341:41: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2343:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2343:39: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2343:39: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2428:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2428:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2428:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2440:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2440:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2440:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2479:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2479:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2479:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2481:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2481:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2481:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2584:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2584:41: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2584:41: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2599:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2599:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2599:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2656:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2656:41: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2656:41: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2668:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:2668:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:2668:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:2726:49: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sighand_struct *sighand @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/signal.c:2726:49: sparse:     expected struct sighand_struct *sighand
   kernel/signal.c:2726:49: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/signal.c:3052:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3052:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3052:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3081:29: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3081:29: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3081:29: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3138:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3138:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3138:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3140:29: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3140:29: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3140:29: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3291:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3291:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3291:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3294:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3294:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3294:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3683:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3683:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3683:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3695:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3695:37: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3695:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3700:35: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3700:35: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3700:35: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:3705:29: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:3705:29: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:3705:29: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:4159:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:4159:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:4159:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:4171:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:4171:33: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:4171:33: sparse:     got struct spinlock [noderef] __rcu *
   kernel/signal.c:4189:11: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct k_sigaction *k @@     got struct k_sigaction [noderef] __rcu * @@
   kernel/signal.c:4189:11: sparse:     expected struct k_sigaction *k
   kernel/signal.c:4189:11: sparse:     got struct k_sigaction [noderef] __rcu *
   kernel/signal.c:4191:25: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/signal.c:4191:25: sparse:     expected struct spinlock [usertype] *lock
   kernel/signal.c:4191:25: sparse:     got struct spinlock [noderef] __rcu *

vim +2048 kernel/signal.c

  1934	
  1935	void sigqueue_free(struct sigqueue *q)
  1936	{
  1937		unsigned long flags;
> 1938		spinlock_t *lock = &current->sighand->siglock;
  1939	
  1940		BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
  1941		/*
  1942		 * We must hold ->siglock while testing q->list
  1943		 * to serialize with collect_signal() or with
  1944		 * __exit_signal()->flush_sigqueue().
  1945		 */
  1946		spin_lock_irqsave(lock, flags);
  1947		q->flags &= ~SIGQUEUE_PREALLOC;
  1948		/*
  1949		 * If it is queued it will be freed when dequeued,
  1950		 * like the "regular" sigqueue.
  1951		 */
  1952		if (!list_empty(&q->list))
  1953			q = NULL;
  1954		spin_unlock_irqrestore(lock, flags);
  1955	
  1956		if (q)
  1957			__sigqueue_free(q);
  1958	}
  1959	
  1960	int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
  1961	{
  1962		int sig = q->info.si_signo;
  1963		struct sigpending *pending;
  1964		struct task_struct *t;
  1965		unsigned long flags;
  1966		int ret, result;
  1967	
  1968		BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
  1969	
  1970		ret = -1;
  1971		rcu_read_lock();
  1972		t = pid_task(pid, type);
  1973		if (!t || !likely(lock_task_sighand(t, &flags)))
  1974			goto ret;
  1975	
  1976		ret = 1; /* the signal is ignored */
  1977		result = TRACE_SIGNAL_IGNORED;
  1978		if (!prepare_signal(sig, t, false))
  1979			goto out;
  1980	
  1981		ret = 0;
  1982		if (unlikely(!list_empty(&q->list))) {
  1983			/*
  1984			 * If an SI_TIMER entry is already queue just increment
  1985			 * the overrun count.
  1986			 */
  1987			BUG_ON(q->info.si_code != SI_TIMER);
  1988			q->info.si_overrun++;
  1989			result = TRACE_SIGNAL_ALREADY_PENDING;
  1990			goto out;
  1991		}
  1992		q->info.si_overrun = 0;
  1993	
  1994		signalfd_notify(t, sig);
  1995		pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
  1996		list_add_tail(&q->list, &pending->list);
  1997		sigaddset(&pending->signal, sig);
  1998		complete_signal(sig, t, type);
  1999		result = TRACE_SIGNAL_DELIVERED;
  2000	out:
  2001		trace_signal_generate(sig, &q->info, t, type != PIDTYPE_PID, result);
  2002		unlock_task_sighand(t, &flags);
  2003	ret:
  2004		rcu_read_unlock();
  2005		return ret;
  2006	}
  2007	
  2008	/**
  2009	 * lock_parents_siglocks - Take current, real_parent, and parent's siglock
  2010	 * @lock_tracer: The tracers siglock is needed.
  2011	 *
  2012	 * There is no natural ordering to these locks so they must be sorted
  2013	 * before being taken.
  2014	 *
  2015	 * There are two complicating factors here:
  2016	 * - The locks live in sighand and sighand can be arbitrarily shared
  2017	 * - parent and real_parent can change when current's siglock is unlocked.
  2018	 *
  2019	 * To deal with this first the all of the sighand pointers are
  2020	 * gathered under current's siglock, and the sighand pointers are
  2021	 * sorted.  As siglock lives inside of sighand this also sorts the
  2022	 * siglock's by address.
  2023	 *
  2024	 * Then the siglocks are taken in order dropping current's siglock if
  2025	 * necessary.
  2026	 *
  2027	 * Finally if parent and real_parent have not changed return.
  2028	 * If they either parent has changed drop their locks and try again.
  2029	 *
  2030	 * Changing sighand is an infrequent and somewhat expensive operation
  2031	 * (unshare or exec) and so even in the worst case this loop
  2032	 * should not loop too many times before all of the proper locks are
  2033	 * taken in order.
  2034	 *
  2035	 * CONTEXT:
  2036	 * Must be called with @current->sighand->siglock held
  2037	 *
  2038	 * RETURNS:
  2039	 * current's, real_parent's, and parent's siglock held.
  2040	 */
  2041	static void lock_parents_siglocks(bool lock_tracer)
  2042		__releases(&current->sighand->siglock)
  2043		__acquires(&current->sighand->siglock)
  2044		__acquires(&current->real_parent->sighand->siglock)
  2045		__acquires(&current->parent->sighand->siglock)
  2046	{
  2047		struct task_struct *me = current;
> 2048		struct sighand_struct *m_sighand = me->sighand;
  2049	
  2050		lockdep_assert_held(&m_sighand->siglock);
  2051	
  2052		rcu_read_lock();
  2053		for (;;) {
  2054			struct task_struct *parent, *tracer;
  2055			struct sighand_struct *p_sighand, *t_sighand, *s1, *s2, *s3;
  2056	
  2057			parent = me->real_parent;
  2058			tracer = ptrace_parent(me);
  2059			if (!tracer || !lock_tracer)
  2060				tracer = parent;
  2061	
  2062			p_sighand = rcu_dereference(parent->sighand);
  2063			t_sighand = rcu_dereference(tracer->sighand);
  2064	
  2065			/* Sort the sighands so that s1 >= s2 >= s3 */
  2066			s1 = m_sighand;
  2067			s2 = p_sighand;
  2068			s3 = t_sighand;
  2069			if (s1 > s2)
  2070				swap(s1, s2);
  2071			if (s1 > s3)
  2072				swap(s1, s3);
  2073			if (s2 > s3)
  2074				swap(s2, s3);
  2075	
  2076			/* Take the locks in order */
  2077			if (s1 != m_sighand) {
  2078				spin_unlock(&m_sighand->siglock);
  2079				spin_lock(&s1->siglock);
  2080			}
  2081			if (s1 != s2)
  2082				spin_lock_nested(&s2->siglock, 1);
  2083			if (s2 != s3)
  2084				spin_lock_nested(&s3->siglock, 2);
  2085	
  2086			/* Verify the proper locks are held */
> 2087			if (likely((s1 == m_sighand) ||
  2088				   ((me->real_parent == parent) &&
  2089				    (me->parent == tracer) &&
  2090				    (parent->sighand == p_sighand) &&
  2091				    (tracer->sighand == t_sighand)))) {
  2092				break;
  2093			}
  2094	
  2095			/* Drop all but current's siglock */
  2096			if (p_sighand != m_sighand)
  2097				spin_unlock(&p_sighand->siglock);
  2098			if (t_sighand != p_sighand)
  2099				spin_unlock(&t_sighand->siglock);
  2100	
  2101			/*
  2102			 * Since [pt]_sighand will likely change if we go
  2103			 * around, and m_sighand is the only one held, make sure
  2104			 * it is subclass-0, since the above 's1 != m_sighand'
  2105			 * clause very much relies on that.
  2106			 */
  2107			lock_set_subclass(&m_sighand->siglock.dep_map, 0, _RET_IP_);
  2108		}
  2109		rcu_read_unlock();
  2110	}
  2111	
  2112	static void unlock_parents_siglocks(bool unlock_tracer)
  2113		__releases(&current->real_parent->sighand->siglock)
  2114		__releases(&current->parent->sighand->siglock)
  2115	{
  2116		struct task_struct *me = current;
> 2117		struct task_struct *parent = me->real_parent;
  2118		struct task_struct *tracer = ptrace_parent(me);
  2119		struct sighand_struct *m_sighand = me->sighand;
> 2120		struct sighand_struct *p_sighand = parent->sighand;
  2121	
  2122		if (p_sighand != m_sighand)
  2123			spin_unlock(&p_sighand->siglock);
  2124		if (tracer && unlock_tracer) {
> 2125			struct sighand_struct *t_sighand = tracer->sighand;
  2126			if (t_sighand != p_sighand)
  2127				spin_unlock(&t_sighand->siglock);
  2128		}
  2129	}
  2130	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Oleg Nesterov 3 years, 12 months ago
On 05/05, Eric W. Biederman wrote:
>
> Eric W. Biederman (11):
>       signal: Rename send_signal send_signal_locked
>       signal: Replace __group_send_sig_info with send_signal_locked
>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>       ptrace: Remove arch_ptrace_attach
>       signal: Use lockdep_assert_held instead of assert_spin_locked
>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>       ptrace: Document that wait_task_inactive can't fail
>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>       ptrace: Don't change __state
>       ptrace: Always take siglock in ptrace_resume
>
> Peter Zijlstra (1):
>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

OK, lgtm.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>


I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
find a good argument against it ;) and yes, this is subjective.

Oleg.
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 3 years, 12 months ago
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/05, Eric W. Biederman wrote:
>>
>> Eric W. Biederman (11):
>>       signal: Rename send_signal send_signal_locked
>>       signal: Replace __group_send_sig_info with send_signal_locked
>>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>>       ptrace: Remove arch_ptrace_attach
>>       signal: Use lockdep_assert_held instead of assert_spin_locked
>>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>>       ptrace: Document that wait_task_inactive can't fail
>>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>>       ptrace: Don't change __state
>>       ptrace: Always take siglock in ptrace_resume
>>
>> Peter Zijlstra (1):
>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>
> OK, lgtm.
>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
>
>
> I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
> find a good argument against it ;) and yes, this is subjective.

Does anyone else have any comments on this patchset?

If not I am going to apply this to a branch and get it into linux-next.

Eric
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 3 years, 12 months ago
"Eric W. Biederman" <ebiederm@xmission.com> writes:

> Oleg Nesterov <oleg@redhat.com> writes:
>
>> On 05/05, Eric W. Biederman wrote:
>>>
>>> Eric W. Biederman (11):
>>>       signal: Rename send_signal send_signal_locked
>>>       signal: Replace __group_send_sig_info with send_signal_locked
>>>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>>>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>>>       ptrace: Remove arch_ptrace_attach
>>>       signal: Use lockdep_assert_held instead of assert_spin_locked
>>>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>>>       ptrace: Document that wait_task_inactive can't fail
>>>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>>>       ptrace: Don't change __state
>>>       ptrace: Always take siglock in ptrace_resume
>>>
>>> Peter Zijlstra (1):
>>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>>
>> OK, lgtm.
>>
>> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
>>
>>
>> I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
>> find a good argument against it ;) and yes, this is subjective.
>
> Does anyone else have any comments on this patchset?
>
> If not I am going to apply this to a branch and get it into linux-next.

Thank you all.

I have pushed this to my ptrace_stop-cleanup-for-v5.19 branch
and placed the branch in linux-next.

Eric
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Sebastian Andrzej Siewior 3 years, 12 months ago
On 2022-05-10 09:26:36 [-0500], Eric W. Biederman wrote:
> Does anyone else have any comments on this patchset?
> 
> If not I am going to apply this to a branch and get it into linux-next.

Looks good I guess.
Be aware that there will be clash due to
   https://lore.kernel.org/all/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cn/

which sits currently in -akpm.

> Eric

Sebastian
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 3 years, 12 months ago
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2022-05-10 09:26:36 [-0500], Eric W. Biederman wrote:
>> Does anyone else have any comments on this patchset?
>> 
>> If not I am going to apply this to a branch and get it into linux-next.
>
> Looks good I guess.
> Be aware that there will be clash due to
>    https://lore.kernel.org/all/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cn/
>
> which sits currently in -akpm.

Thanks for the heads up.  That looks like the best kind of conflict.
One where code just disappears.

Eric
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Kees Cook 4 years ago
On Thu, May 05, 2022 at 01:25:57PM -0500, Eric W. Biederman wrote:
> The states TASK_STOPPED and TASK_TRACE are special in they can not
> handle spurious wake-ups.  This plus actively depending upon and
> changing the value of tsk->__state causes problems for PREEMPT_RT and
> Peter's freezer rewrite.
> 
> There are a lot of details we have to get right to sort out the
> technical challenges and this is my parred back version of the changes
> that contains just those problems I see good solutions to that I believe
> are ready.
> 
> A couple of issues have been pointed but I think this parred back set of
> changes is still on the right track.  The biggest change in v4 is the
> split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
> two patches because the dependency I thought exited between two
> different changes did not exist.  The rest of the changes are minor
> tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
> removing an always true branch, and adding an early  test to see if the
> ptracer had gone, before TASK_TRAPPING was set.
> 
> This set of changes should support Peter's freezer rewrite, and with the
> addition of changing wait_task_inactive(TASK_TRACED) to be
> wait_task_inactive(0) in ptrace_check_attach I don't think there are any
> races or issues to be concerned about from the ptrace side.
> 
> More work is needed to support PREEMPT_RT, but these changes get things
> closer.
> 
> This set of changes continues to look like it will provide a firm
> foundation for solving the PREEMPT_RT and freezer challenges.

One of the more sensitive projects to changes around ptrace is rr
(Robert and Kyle added to CC). I ran rr's selftests before/after this
series and saw no changes. My failures remained the same; I assume
they're due to missing CPU features (pkeys) or build configs (bpf), etc:

99% tests passed, 19 tests failed out of 2777

Total Test time (real) = 773.40 sec

The following tests FAILED:
         42 - bpf_map (Failed)
         43 - bpf_map-no-syscallbuf (Failed)
        414 - netfilter (Failed)
        415 - netfilter-no-syscallbuf (Failed)
        454 - x86/pkeys (Failed)
        455 - x86/pkeys-no-syscallbuf (Failed)
        1152 - ttyname (Failed)
        1153 - ttyname-no-syscallbuf (Failed)
        1430 - bpf_map-32 (Failed)
        1431 - bpf_map-32-no-syscallbuf (Failed)
        1502 - detach_sigkill-32 (Failed)
        1802 - netfilter-32 (Failed)
        1803 - netfilter-32-no-syscallbuf (Failed)
        1842 - x86/pkeys-32 (Failed)
        1843 - x86/pkeys-32-no-syscallbuf (Failed)
        2316 - crash_in_function-32 (Failed)
        2317 - crash_in_function-32-no-syscallbuf (Failed)
        2540 - ttyname-32 (Failed)
        2541 - ttyname-32-no-syscallbuf (Failed)

So, I guess:

Tested-by: Kees Cook <keescook@chromium.org>

:)

-- 
Kees Cook
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 4 years ago
Kees Cook <keescook@chromium.org> writes:

> On Thu, May 05, 2022 at 01:25:57PM -0500, Eric W. Biederman wrote:
>> The states TASK_STOPPED and TASK_TRACE are special in they can not
>> handle spurious wake-ups.  This plus actively depending upon and
>> changing the value of tsk->__state causes problems for PREEMPT_RT and
>> Peter's freezer rewrite.
>> 
>> There are a lot of details we have to get right to sort out the
>> technical challenges and this is my parred back version of the changes
>> that contains just those problems I see good solutions to that I believe
>> are ready.
>> 
>> A couple of issues have been pointed but I think this parred back set of
>> changes is still on the right track.  The biggest change in v4 is the
>> split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
>> two patches because the dependency I thought exited between two
>> different changes did not exist.  The rest of the changes are minor
>> tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
>> removing an always true branch, and adding an early  test to see if the
>> ptracer had gone, before TASK_TRAPPING was set.
>> 
>> This set of changes should support Peter's freezer rewrite, and with the
>> addition of changing wait_task_inactive(TASK_TRACED) to be
>> wait_task_inactive(0) in ptrace_check_attach I don't think there are any
>> races or issues to be concerned about from the ptrace side.
>> 
>> More work is needed to support PREEMPT_RT, but these changes get things
>> closer.
>> 
>> This set of changes continues to look like it will provide a firm
>> foundation for solving the PREEMPT_RT and freezer challenges.
>
> One of the more sensitive projects to changes around ptrace is rr
> (Robert and Kyle added to CC). I ran rr's selftests before/after this
> series and saw no changes. My failures remained the same; I assume
> they're due to missing CPU features (pkeys) or build configs (bpf), etc:
>
> 99% tests passed, 19 tests failed out of 2777
>
> Total Test time (real) = 773.40 sec
>
> The following tests FAILED:
>          42 - bpf_map (Failed)
>          43 - bpf_map-no-syscallbuf (Failed)
>         414 - netfilter (Failed)
>         415 - netfilter-no-syscallbuf (Failed)
>         454 - x86/pkeys (Failed)
>         455 - x86/pkeys-no-syscallbuf (Failed)
>         1152 - ttyname (Failed)
>         1153 - ttyname-no-syscallbuf (Failed)
>         1430 - bpf_map-32 (Failed)
>         1431 - bpf_map-32-no-syscallbuf (Failed)
>         1502 - detach_sigkill-32 (Failed)
>         1802 - netfilter-32 (Failed)
>         1803 - netfilter-32-no-syscallbuf (Failed)
>         1842 - x86/pkeys-32 (Failed)
>         1843 - x86/pkeys-32-no-syscallbuf (Failed)
>         2316 - crash_in_function-32 (Failed)
>         2317 - crash_in_function-32-no-syscallbuf (Failed)
>         2540 - ttyname-32 (Failed)
>         2541 - ttyname-32-no-syscallbuf (Failed)
>
> So, I guess:
>
> Tested-by: Kees Cook <keescook@chromium.org>
>
> :)

Thank you.  I was thinking it would be good to add the rr folks to the
discussion.

Eric
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Oleg Nesterov 4 years ago
On 05/05, Eric W. Biederman wrote:
>
> Eric W. Biederman (11):
>       signal: Rename send_signal send_signal_locked
>       signal: Replace __group_send_sig_info with send_signal_locked
>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>       ptrace: Remove arch_ptrace_attach
>       signal: Use lockdep_assert_held instead of assert_spin_locked
>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>       ptrace: Document that wait_task_inactive can't fail
>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>       ptrace: Don't change __state
>       ptrace: Always take siglock in ptrace_resume
>
> Peter Zijlstra (1):
>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

I can't comment 5/12. to be honest I didn't even try to look into
arch/ia64/.

But other than that I see no problems in this version. However, I'd
like to actually apply the whole series and read the changed code
carefully, but sorry, I don't think I can do this before Monday.

Oleg.
Re: [PATCH v4 0/12] ptrace: cleaning up ptrace_stop
Posted by Eric W. Biederman 4 years ago
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/05, Eric W. Biederman wrote:
>>
>> Eric W. Biederman (11): signal: Rename send_signal send_signal_locked
>> signal: Replace __group_send_sig_info with send_signal_locked
>> ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP ptrace/xtensa:
>> Replace PT_SINGLESTEP with TIF_SINGLESTEP ptrace: Remove
>> arch_ptrace_attach signal: Use lockdep_assert_held instead of
>> assert_spin_locked ptrace: Reimplement PTRACE_KILL by always sending
>> SIGKILL ptrace: Document that wait_task_inactive can't fail ptrace:
>> Admit ptrace_stop can generate spuriuos SIGTRAPs ptrace: Don't change
>> __state ptrace: Always take siglock in ptrace_resume
>>
>> Peter Zijlstra (1):
>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>
> I can't comment 5/12. to be honest I didn't even try to look into
> arch/ia64/.

I just looked at arch_ptrace_attach again and I spotted what looks like
a fairly easy analysis that is mostly arch-generic code that shows this
is dead code on ia64.

On ia64 arch_ptrace_attach is ptrace_attach_sync_user_rbs, and does
nothing if __state is not TASK_STOPPED.

When arch_ptrace_attach is called after ptrace_traceme __state is
TASK_RUNNING pretty much by definition as we are running in the
child.  Therefore ptrace_attach_sync_user_rbs does nothing in that case.

When arch_ptrace_attach is called after ptrace_attach __state there
are two possibilities.  If the tracee was already in TASK_STOPPED
before the ptrace_attach, the tracee will be in TASK_TRACED.
Otherwise the tracee will be in TASK_TRACED or on it's way to stopping
in TASK_TRACED.

Unless I totally misread ptrace_attach.  There is no way that after
a successful ptrace_attach for the tracee to be in TASK_STOPPED.
This makes ptrace_attach_sync_user_rbs a big noop, AKA dead code.
So it can be removed.

> But other than that I see no problems in this version. However, I'd
> like to actually apply the whole series and read the changed code
> carefully, but sorry, I don't think I can do this before Monday.

No rush.  I don't expect the merge window will open for a while yet.

Eric