vhost: Fix freezer/ps regressions

[PATCH 1/3] signal: Don't always put SIGKILL in shared_pending

Posted by Mike Christie 2 years, 8 months ago

When get_pending detects the task has been marked to be killed we try to
clean up the SIGKILL by doing a sigdelset and recalc_sigpending, but we
still leave it in shared_pending. If the signal is being short-circuit
delivered there is no need to put it in shared_pending, so this adds a
check in complete_signal.

This patch is a modified version of the original patch by Eric Biederman
<ebiederm@xmission.com>.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 kernel/signal.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index 8f6330f0e9ca..3dc99b9aec7f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1052,6 +1052,14 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
 			signal->flags = SIGNAL_GROUP_EXIT;
 			signal->group_exit_code = sig;
 			signal->group_stop_count = 0;
+
+			/*
+			 * The signal is being short circuit delivered so
+			 * don't set pending.
+			 */
+			if (type != PIDTYPE_PID)
+				sigdelset(&signal->shared_pending.signal, sig);
+
 			t = p;
 			do {
 				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
-- 
2.25.1

Re: [PATCH 1/3] signal: Don't always put SIGKILL in shared_pending

Posted by Eric W. Biederman 2 years, 8 months ago

Mike Christie <michael.christie@oracle.com> writes:

> When get_pending detects the task has been marked to be killed we try to
       ^^^^^^^^^^^ get_signal
> clean up the SIGKILL by doing a sigdelset and recalc_sigpending, but we
> still leave it in shared_pending. If the signal is being short-circuit
> delivered there is no need to put it in shared_pending, so this adds a
> check in complete_signal.
>
> This patch is a modified version of the original patch by Eric Biederman
> <ebiederm@xmission.com>.
>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  kernel/signal.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 8f6330f0e9ca..3dc99b9aec7f 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1052,6 +1052,14 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
>  			signal->flags = SIGNAL_GROUP_EXIT;
>  			signal->group_exit_code = sig;
>  			signal->group_stop_count = 0;
> +
> +			/*
> +			 * The signal is being short circuit delivered so
> +			 * don't set pending.
> +			 */
> +			if (type != PIDTYPE_PID)
> +				sigdelset(&signal->shared_pending.signal, sig);
> +
>  			t = p;
>  			do {
>  				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);

Oleg Nesterov <oleg@redhat.com> writes:
>
> Eric, sorry. I fail to understand this patch.
>
> How can it help? And whom?

You were looking at why recalc_sigpending was resulting in
TIF_SIGPENDING being set.

The big bug was that get_signal was getting called by the thread after
the thread had realized it was part of a group exit.

The minor bug is that SIGKILL was stuck in shared_pending and causing
recalc_sigpending to set TIF_SIGPENDING after get_signal removed the
per thread flag that asks the thread to exit.

The fact is that fatal signals (that pass all of the checks) are
delivered right there in complete_signal so it does not make sense from
a data structure consistency standpoint to leave the fatal signal (like
SIGKILL) in shared_pending.
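
To make the consistency argument concrete, here is a minimal user-space
sketch. It is not kernel code: struct model_task and every model_*
function are hypothetical stand-ins, the blocked mask is ignored, and the
signal is assumed to be group-directed (the type != PIDTYPE_PID case, where
the send path queues into shared_pending). It only models the interaction
described above: the fatal signal is queued, short-circuit delivered by
marking the thread with SIGKILL, and later cleaned up with a per-thread
sigdelset followed by a recalc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SIGKILL 9

/* Hypothetical, heavily simplified stand-in for the kernel's task state. */
struct model_task {
	uint64_t pending;         /* per-thread pending set */
	uint64_t *shared_pending; /* process-wide pending set */
	bool tif_sigpending;      /* models TIF_SIGPENDING */
};

static uint64_t sig_bit(int sig)
{
	assert(sig >= 1 && sig <= 64);
	return UINT64_C(1) << (sig - 1);
}

/* Models recalc_sigpending(): set the flag if either set is non-empty. */
static void model_recalc_sigpending(struct model_task *t)
{
	t->tif_sigpending = (t->pending | *t->shared_pending) != 0;
}

/*
 * Models a group-directed fatal signal: it is queued in shared_pending,
 * then short-circuit delivered by adding SIGKILL to the thread's own set.
 * drop_shared models the check added by this patch.
 */
static void model_send_fatal(struct model_task *t, int sig, bool drop_shared)
{
	*t->shared_pending |= sig_bit(sig);
	if (drop_shared)
		*t->shared_pending &= ~sig_bit(sig);
	t->pending |= sig_bit(MODEL_SIGKILL);
	t->tif_sigpending = true;
}

/* Models the cleanup: sigdelset on the per-thread set, then recalc. */
static void model_get_signal_cleanup(struct model_task *t)
{
	t->pending &= ~sig_bit(MODEL_SIGKILL);
	model_recalc_sigpending(t);
}

/* Returns whether the pending flag is still set after the cleanup. */
static bool model_pending_after_kill(bool drop_shared)
{
	uint64_t shared = 0;
	struct model_task t = { 0, &shared, false };

	model_send_fatal(&t, MODEL_SIGKILL, drop_shared);
	model_get_signal_cleanup(&t);
	return t.tif_sigpending;
}
```

In this model, model_pending_after_kill(false) returns true because the
bit left in the shared set makes the recalc set the flag again, while
model_pending_after_kill(true) returns false.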

Outside of this case it will only affect coredumps and other analyzers
that run at process exit.

One thing I am looking at is that the vhost code shares a common problem
with the code that writes coredumps to pipes.  Both have code that tests
signal_pending() and acts on the result after signal processing has
completed.

Fixing the data structure to be consistent seems like one way to handle
that situation.

Eric

[PATCH 1/3] signal: Don't always put SIGKILL in shared_pending
[PATCH 2/3] signal: Don't exit for PF_USER_WORKER tasks
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression