[PATCH] trace/osnoise: Fix possible recursive locking in stop_per_cpu_kthreads

Nico Pache posted 1 patch 3 years, 6 months ago
kernel/trace/trace_osnoise.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
[PATCH] trace/osnoise: Fix possible recursive locking in stop_per_cpu_kthreads
Posted by Nico Pache 3 years, 6 months ago
There is a recursive lock on the cpu_hotplug_lock.

In kernel/trace/trace_osnoise.c:<start/stop>_per_cpu_kthreads:
    - start_per_cpu_kthreads calls cpus_read_lock() and if
	start_kthreads returns a error it will call stop_per_cpu_kthreads.
    - stop_per_cpu_kthreads then calls cpus_read_lock() again causing
      deadlock.

Fix this by calling cpus_read_unlock() before calling
stop_per_cpu_kthreads. This behavior can also be seen in commit
f46b16520a08 ("trace/hwlat: Implement the per-cpu mode").

This error was noticed during the LTP ftrace-stress-test:

WARNING: possible recursive locking detected
--------------------------------------------
sh/275006 is trying to acquire lock:
ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: stop_per_cpu_kthreads

but task is already holding lock:
ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads

other info that might help us debug this:
 Possible unsafe locking scenario:

      CPU0
      ----
 lock(cpu_hotplug_lock);
 lock(cpu_hotplug_lock);

 *** DEADLOCK ***

May be due to missing lock nesting notation

3 locks held by sh/275006:
 #0: ffff8881023f0470 (sb_writers#24){.+.+}-{0:0}, at: ksys_write
 #1: ffffffffb084f430 (trace_types_lock){+.+.}-{3:3}, at: rb_simple_write
 #2: ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads

Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Nico Pache <npache@redhat.com>
---
 kernel/trace/trace_osnoise.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 313439920a8c..78d536d3ff3d 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1786,8 +1786,9 @@ static int start_per_cpu_kthreads(void)
 	for_each_cpu(cpu, current_mask) {
 		retval = start_kthread(cpu);
 		if (retval) {
+			cpus_read_unlock();
 			stop_per_cpu_kthreads();
-			break;
+			return retval;
 		}
 	}
 
-- 
2.37.2
Re: [PATCH] trace/osnoise: Fix possible recursive locking in stop_per_cpu_kthreads
Posted by Daniel Bristot de Oliveira 3 years, 6 months ago
Hi Nico!

On 9/19/22 16:49, Nico Pache wrote:
> There is a recursive lock on the cpu_hotplug_lock.
> 
> In kernel/trace/trace_osnoise.c:<start/stop>_per_cpu_kthreads:
>     - start_per_cpu_kthreads calls cpus_read_lock() and if
> 	start_kthreads returns a error it will call stop_per_cpu_kthreads.
>     - stop_per_cpu_kthreads then calls cpus_read_lock() again causing
>       deadlock.

Yep, I see the problem.

> Fix this by calling cpus_read_unlock() before calling
> stop_per_cpu_kthreads. This behavior can also be seen in commit
> f46b16520a08 ("trace/hwlat: Implement the per-cpu mode").

Correct, we have a similar function that *without* this problem.

> This error was noticed during the LTP ftrace-stress-test:
> 
> WARNING: possible recursive locking detected
> --------------------------------------------
> sh/275006 is trying to acquire lock:
> ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: stop_per_cpu_kthreads
> 
> but task is already holding lock:
> ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>       CPU0
>       ----
>  lock(cpu_hotplug_lock);
>  lock(cpu_hotplug_lock);
> 
>  *** DEADLOCK ***
> 
> May be due to missing lock nesting notation
> 
> 3 locks held by sh/275006:
>  #0: ffff8881023f0470 (sb_writers#24){.+.+}-{0:0}, at: ksys_write
>  #1: ffffffffb084f430 (trace_types_lock){+.+.}-{3:3}, at: rb_simple_write
>  #2: ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
> 
> Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
> Signed-off-by: Nico Pache <npache@redhat.com>

Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>

Thanks!
-- Daniel