kernel/livepatch/transition.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
v4: address changelog comments by Josh (thank you)
---8<---
When a KLP fails to apply, klp_reverse_transition will clear the
TIF_PATCH_PENDING flag on all tasks, except for newly created tasks
which are not on the task list yet. A similar race is possible
for normal (forward) transitions, where TIF_PATCH_PENDING gets
copied to the child, then later cleared in the parent.
Meanwhile, fork will copy over the TIF_PATCH_PENDING flag from the
parent to the child early on, in dup_task_struct -> setup_thread_stack.
Much later, klp_copy_process will set child->patch_state to match
that of the parent.
However, the parent's patch_state may have been changed by KLP loading
or unloading since it was initially copied over into the child.
This results in the KLP code occasionally hitting this warning in
klp_complete_transition:
	for_each_process_thread(g, task) {
		WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
		task->patch_state = KLP_UNDEFINED;
	}
Set, or clear, the TIF_PATCH_PENDING flag in the child task
depending on whether or not it is needed at the time
klp_copy_process is called, at a point in copy_process where the
tasklist_lock is held exclusively, preventing races with the KLP
code.
The KLP code does have a few places where the state is changed
without the tasklist_lock held, but those should not cause
problems:
 - klp_update_patch_state(current) cannot be called while the
   current task is in the middle of fork;
 - klp_check_and_switch_task() is called under the pi_lock,
   which prevents rescheduling;
 - the patch state of idle tasks is manipulated directly, but
   idle tasks do not fork.
This should prevent this warning from triggering again in the
future, and close the race for both normal and reverse transitions.
Signed-off-by: Rik van Riel <riel@surriel.com>
Reported-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
Cc: stable@kernel.org
---
kernel/livepatch/transition.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 5d03a2ad1066..30187b1d8275 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -610,9 +610,23 @@ void klp_reverse_transition(void)
 /* Called from copy_process() during fork */
 void klp_copy_process(struct task_struct *child)
 {
-	child->patch_state = current->patch_state;
 
-	/* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
+	/*
+	 * The parent process may have gone through a KLP transition since
+	 * the thread flag was copied in setup_thread_stack earlier. Bring
+	 * the task flag up to date with the parent here.
+	 *
+	 * The operation is serialized against all klp_*_transition()
+	 * operations by the tasklist_lock. The only exception is
+	 * klp_update_patch_state(current), but we cannot race with
+	 * that because we are current.
+	 */
+	if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
+		set_tsk_thread_flag(child, TIF_PATCH_PENDING);
+	else
+		clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
+
+	child->patch_state = current->patch_state;
 }
 
 /*
--
2.35.1
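
For readers who want to poke at the logic outside the kernel, here is a
minimal userspace sketch of what the patched klp_copy_process() does. It
is not kernel code: struct model_task, enum model_patch_state and
model_copy_process() are hypothetical stand-ins for task_struct, the
KLP_* states and klp_copy_process(), and the tasklist_lock that
serializes the real function is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's KLP_* patch states. */
enum model_patch_state {
	MODEL_UNDEFINED = -1,
	MODEL_UNPATCHED = 0,
	MODEL_PATCHED = 1,
};

/* Hypothetical reduction of task_struct to the two fields that matter. */
struct model_task {
	bool patch_pending;			/* models TIF_PATCH_PENDING */
	enum model_patch_state patch_state;	/* models task->patch_state */
};

/*
 * Models the patched klp_copy_process(): the flag and the state are both
 * taken from the parent at the same point, so whatever flag value was
 * copied into the child earlier in fork is overwritten.
 */
static void model_copy_process(const struct model_task *parent,
			       struct model_task *child)
{
	child->patch_pending = parent->patch_pending;
	child->patch_state = parent->patch_state;
}
```

Calling model_copy_process() on a child that still carries a stale
pending flag leaves the child with the parent's current flag and state,
which is the invariant the patch is after.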
On Wed 2022-07-27 10:24:37, Rik van Riel wrote:
> v4: address changelog comments by Josh (thank you)
>
> ---8<---
> When a KLP fails to apply, klp_reverse_transition will clear the
> TIF_PATCH_PENDING flag on all tasks, except for newly created tasks
> which are not on the task list yet.

This is actually not true. klp_reverse_transition() clears
TIF_PATCH_PENDING only temporarily, while it waits until all processes
leave the ftrace handler. It then sets TIF_PATCH_PENDING again for all
tasks by calling klp_start_transition().

The difference is important. The WARN_ON_ONCE() in
klp_complete_transition() will be printed when fork() copied
TIF_PATCH_PENDING before it was set again.

Anyway, the important thing is that TIF_PATCH_PENDING and
task->patch_state might be incompatible because fork() copies them
at different times.

klp_copy_process() must make sure that they are in sync. And
it must be done under tasklist_lock when the child is added
to the global task list.

Best Regards,
Petr
On Thu, 2022-07-28 at 17:37 +0200, Petr Mladek wrote:
> On Wed 2022-07-27 10:24:37, Rik van Riel wrote:
> > v4: address changelog comments by Josh (thank you)
> >
> > ---8<---
> > When a KLP fails to apply, klp_reverse_transition will clear the
> > TIF_PATCH_PENDING flag on all tasks, except for newly created tasks
> > which are not on the task list yet.
>
> This is actually not true. klp_reverse_transition() clears
> TIF_PATCH_PENDING only temporarily, while it waits until all processes
> leave the ftrace handler. It then sets TIF_PATCH_PENDING again for all
> tasks by calling klp_start_transition().
>
> The difference is important. The WARN_ON_ONCE() in
> klp_complete_transition() will be printed when fork() copied
> TIF_PATCH_PENDING before it was set again.
>
> Anyway, the important thing is that TIF_PATCH_PENDING and
> task->patch_state might be incompatible because fork() copies them
> at different times.
>
> klp_copy_process() must make sure that they are in sync. And
> it must be done under tasklist_lock when the child is added
> to the global task list.

Hmmm, how should this be addressed in the changelog?

Should I just remove most of that paragraph and leave it
at "there can be a race"?

-- 
All Rights Reversed.
On Tue 2022-08-02 16:07:08, Rik van Riel wrote:
> On Thu, 2022-07-28 at 17:37 +0200, Petr Mladek wrote:
> > On Wed 2022-07-27 10:24:37, Rik van Riel wrote:
> > > v4: address changelog comments by Josh (thank you)
> > >
> > > ---8<---
> > > When a KLP fails to apply, klp_reverse_transition will clear the
> > > TIF_PATCH_PENDING flag on all tasks, except for newly created tasks
> > > which are not on the task list yet.
> >
> > This is actually not true. klp_reverse_transition() clears
> > TIF_PATCH_PENDING only temporarily, while it waits until all processes
> > leave the ftrace handler. It then sets TIF_PATCH_PENDING again for all
> > tasks by calling klp_start_transition().
> >
> > The difference is important. The WARN_ON_ONCE() in
> > klp_complete_transition() will be printed when fork() copied
> > TIF_PATCH_PENDING before it was set again.
> >
> > Anyway, the important thing is that TIF_PATCH_PENDING and
> > task->patch_state might be incompatible because fork() copies them
> > at different times.
> >
> > klp_copy_process() must make sure that they are in sync. And
> > it must be done under tasklist_lock when the child is added
> > to the global task list.
>
> Hmmm, how should this be addressed in the changelog?
>
> Should I just remove most of that paragraph and leave it
> at "there can be a race"?

It would be nice to somehow summarize what I wrote. I mean, to explain
why the problem is easier to see with a reverse transition than with
a forward transition.

It is because TIF_PATCH_PENDING might stay cleared in the child even
after it was set again in the parent by klp_reverse_transition().
As a result, the child will never get transitioned back to the
reverted state.

The problem is hard to hit during the forward transition because
the child might still have TIF_PATCH_PENDING set even when it later
copies an already migrated task->patch_state, which happens when the
parent gets migrated in the race window. In this case,
TIF_PATCH_PENDING will get cleared when the child returns from fork
and all will be good.

In any case, the inconsistent state is there even during the forward
transition. But it would be caught only when the entire transition
finishes during the rather small race window.

The patch should fix the race in either direction.

I could provide an even better description after I am back from
vacation on Aug 22.

Best Regards,
Petr
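
The forward and reverse cases discussed in this thread can be replayed
in a small userspace model. This is only a sketch under simplifying
assumptions: all model_* names are hypothetical, the "race" is replayed
as a fixed sequence of steps rather than with real concurrency, and
model_update_patch_state() only mirrors the test-and-clear behaviour of
klp_update_patch_state() as described above.

```c
#include <assert.h>
#include <stdbool.h>

enum model_patch_state { MODEL_UNPATCHED = 0, MODEL_PATCHED = 1 };

struct model_task {
	bool patch_pending;			/* models TIF_PATCH_PENDING */
	enum model_patch_state patch_state;	/* models task->patch_state */
};

/* Early fork step: models setup_thread_stack() copying the thread flag. */
static void model_fork_early(const struct model_task *parent,
			     struct model_task *child)
{
	child->patch_pending = parent->patch_pending;
}

/* Late fork step: before the patch only patch_state is copied. */
static void model_fork_late(const struct model_task *parent,
			    struct model_task *child, bool fixed)
{
	if (fixed)
		child->patch_pending = parent->patch_pending;
	child->patch_state = parent->patch_state;
}

/* Migrate a task to the target state only if its pending flag is set. */
static void model_update_patch_state(struct model_task *task,
				     enum model_patch_state target)
{
	if (task->patch_pending) {
		task->patch_pending = false;
		task->patch_state = target;
	}
}

/* Reverse case: returns true if the child reaches the reverted state. */
static bool model_reverse_race(bool fixed)
{
	/* Parent was already patched; its flag is temporarily cleared. */
	struct model_task parent = { .patch_pending = false,
				     .patch_state = MODEL_PATCHED };
	struct model_task child = { .patch_pending = false,
				    .patch_state = MODEL_UNPATCHED };

	model_fork_early(&parent, &child);	/* child copies cleared flag */
	parent.patch_pending = true;		/* flag set again in parent */
	model_fork_late(&parent, &child, fixed);
	model_update_patch_state(&child, MODEL_UNPATCHED); /* return from fork */
	return child.patch_state == MODEL_UNPATCHED;
}

/* Forward case: returns true if the child ends up patched, flag clear. */
static bool model_forward_race(bool fixed)
{
	struct model_task parent = { .patch_pending = true,
				     .patch_state = MODEL_UNPATCHED };
	struct model_task child = { .patch_pending = false,
				    .patch_state = MODEL_UNPATCHED };

	model_fork_early(&parent, &child);	/* child copies set flag */
	model_update_patch_state(&parent, MODEL_PATCHED); /* parent migrates */
	model_fork_late(&parent, &child, fixed);
	model_update_patch_state(&child, MODEL_PATCHED); /* return from fork */
	return child.patch_state == MODEL_PATCHED && !child.patch_pending;
}
```

In this model the reverse case leaves the child stuck in the patched
state unless the late fork step also resynchronizes the flag, while the
forward case recovers on its own in both variants, matching the
asymmetry described above.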