[PATCH v2] LoongArch: Fix unreliable stack for live patching

Tiezhu Yang posted 1 patch 2 weeks, 2 days ago
When testing kernel live patching with "modprobe livepatch-sample",
it takes more than 15 seconds to get from "starting patching
transition" to "patching complete". With debug messages enabled,
dmesg shows "unreliable stack" for user tasks. Here is one of the
messages:

  livepatch: klp_try_switch_task: bash:1193 has an unreliable stack

The stack is reported as unreliable because the unwinder cannot unwind
from do_syscall() to its previous frame, handle_syscall(). Because of
the secondary stack in do_syscall(), fp must be used to find the
original stack top. However, some other functions do not use fp, so
fp cannot be restored by the next frame of do_syscall(). It is
therefore necessary to save fp when the task is not current, in order
to get the stack top of do_syscall().
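
The change can be modeled in user space as follows. This is only a
sketch: mock_regs, mock_task and seed_unwind_regs() are hypothetical
stand-ins rather than kernel definitions, and the current task's frame
and return address are passed in as plain parameters. It shows the one
behavioral difference: for a task that is not current, register 22 (fp)
is seeded from the task's saved value instead of being zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the register snapshot handed to the unwinder. */
struct mock_regs {
	unsigned long regs[32];	/* regs[1] = ra, regs[3] = sp, regs[22] = fp */
	unsigned long csr_era;	/* address the unwind starts from */
};

/* Hypothetical stand-in for the per-task saved context. */
struct mock_task {
	unsigned long saved_frame;	/* models thread_saved_fp(task) */
	unsigned long saved_ra;		/* models thread_saved_ra(task) */
	unsigned long reg22;		/* models task->thread.reg22 */
};

/*
 * Seed the starting registers for an unwind, mirroring the patched
 * logic: fp is zeroed only when walking the current task.
 */
void seed_unwind_regs(struct mock_regs *regs, const struct mock_task *task,
		      bool is_current, unsigned long cur_frame,
		      unsigned long cur_ra)
{
	assert(regs != NULL && task != NULL);

	if (is_current) {
		regs->regs[3] = cur_frame;
		regs->csr_era = cur_ra;
		regs->regs[22] = 0;
	} else {
		regs->regs[3] = task->saved_frame;
		regs->csr_era = task->saved_ra;
		regs->regs[22] = task->reg22;	/* previously always 0 */
	}
	regs->regs[1] = 0;
}
```

With the old behavior the last assignment in the else branch would have
been 0, leaving the unwinder without a usable fp for a sleeping task.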

Here is the call chain:

  klp_enable_patch()
    klp_try_complete_transition()
      klp_try_switch_task()
        klp_check_and_switch_task()
          klp_check_stack()
            stack_trace_save_tsk_reliable()
              arch_stack_walk_reliable()
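
A toy model may help explain why one failing stack check turns into a
delay of this size. It assumes, and these are assumptions rather than
facts taken from this patch, that the transition is retried about once
per second and that remaining tasks are signaled after 15 failed
retries, after which they switch without needing the stack check.
simulate_transition() and both constants are hypothetical names.

```c
#include <stdbool.h>

#define RETRY_INTERVAL_SEC	1	/* assumed retry period */
#define SIGNAL_AFTER_RETRIES	15	/* assumed signaling threshold */

struct sim_result {
	int seconds;	/* simulated time until the transition completes */
	bool signaled;	/* whether remaining tasks had to be signaled */
};

/*
 * Simulate a transition with a single task whose reliable stack walk
 * either always succeeds or always fails.
 */
struct sim_result simulate_transition(bool stack_reliable)
{
	struct sim_result res = { .seconds = 0, .signaled = false };
	int retries = 0;

	for (;;) {
		/* The task switches if its stack checks out or it was signaled. */
		if (stack_reliable || res.signaled)
			return res;

		/* Still blocked: signal once the retry threshold is reached. */
		if (retries > 0 && retries % SIGNAL_AFTER_RETRIES == 0)
			res.signaled = true;

		retries++;
		res.seconds += RETRY_INTERVAL_SEC;
	}
}
```

In this model an always-failing stack check completes after 16
simulated seconds with a signal sent at second 15, while a reliable
stack completes on the first pass.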

A similar issue exists when executing "rmmod livepatch-sample". With
this patch, both patching and unpatching complete within about one
second.

Before:

  # modprobe livepatch-sample
  # dmesg -T | tail -3
  [Sat Sep  6 11:00:20 2025] livepatch: 'livepatch_sample': starting patching transition
  [Sat Sep  6 11:00:35 2025] livepatch: signaling remaining tasks
  [Sat Sep  6 11:00:36 2025] livepatch: 'livepatch_sample': patching complete

  # echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled
  # rmmod livepatch_sample
  rmmod: ERROR: Module livepatch_sample is in use
  # rmmod livepatch_sample
  # dmesg -T | tail -3
  [Sat Sep  6 11:06:05 2025] livepatch: 'livepatch_sample': starting unpatching transition
  [Sat Sep  6 11:06:20 2025] livepatch: signaling remaining tasks
  [Sat Sep  6 11:06:21 2025] livepatch: 'livepatch_sample': unpatching complete

After:

  # modprobe livepatch-sample
  # dmesg -T | tail -2
  [Tue Sep 16 16:19:30 2025] livepatch: 'livepatch_sample': starting patching transition
  [Tue Sep 16 16:19:31 2025] livepatch: 'livepatch_sample': patching complete

  # echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled
  # rmmod livepatch_sample
  # dmesg -T | tail -2
  [Tue Sep 16 16:19:36 2025] livepatch: 'livepatch_sample': starting unpatching transition
  [Tue Sep 16 16:19:37 2025] livepatch: 'livepatch_sample': unpatching complete

Cc: stable@vger.kernel.org # v6.9+
Fixes: 199cc14cb4f1 ("LoongArch: Add kernel livepatching support")
Reported-by: Xi Zhang <zhangxi@kylinos.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 arch/loongarch/kernel/stacktrace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
index 9a038d1070d7..387dc4d3c486 100644
--- a/arch/loongarch/kernel/stacktrace.c
+++ b/arch/loongarch/kernel/stacktrace.c
@@ -51,12 +51,13 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 	if (task == current) {
 		regs->regs[3] = (unsigned long)__builtin_frame_address(0);
 		regs->csr_era = (unsigned long)__builtin_return_address(0);
+		regs->regs[22] = 0;
 	} else {
 		regs->regs[3] = thread_saved_fp(task);
 		regs->csr_era = thread_saved_ra(task);
+		regs->regs[22] = task->thread.reg22;
 	}
 	regs->regs[1] = 0;
-	regs->regs[22] = 0;
 
 	for (unwind_start(&state, task, regs);
 	     !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {
-- 
2.42.0
Re: [PATCH v2] LoongArch: Fix unreliable stack for live patching
Posted by Huacai Chen 2 weeks ago
Applied, thanks.

Huacai

On Tue, Sep 16, 2025 at 5:35 PM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> [...]