[PATCH] tracing/osnoise: fix potential deadlock in cpu hotplug

hu.shengming@zte.com.cn posted 1 patch 1 week, 3 days ago
From: luohaiyang10243395 <luo.haiyang@zte.com.cn>

The following sequence may lead to a deadlock during CPU hotplug:

  CPU0                        |  CPU1
                              |  schedule_work_on
                              |
  _cpu_down//set CPU1 offline |
  cpus_write_lock             |
                              |  osnoise_hotplug_workfn
                              |    mutex_lock(&interface_lock);
                              |    cpus_read_lock();  //wait cpu_hotplug_lock
                              |
                              |  cpuhp/1
                              |    osnoise_cpu_die
                              |      kthread_stop
                              |        wait_for_completion //wait osnoise/1 exit
                              |
                              |  osnoise/1
                              |    osnoise_sleep
                              |      mutex_lock(&interface_lock); //deadlock

Fix this by swapping the order of cpus_read_lock() and mutex_lock(&interface_lock).

Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
---
 kernel/trace/trace_osnoise.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index dee610e465b9..be6cf0bb3c03 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2073,8 +2073,8 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
 	if (!osnoise_has_registered_instances())
 		return;

-	guard(mutex)(&interface_lock);
 	guard(cpus_read_lock)();
+	guard(mutex)(&interface_lock);

 	if (!cpu_online(cpu))
 		return;
@@ -2237,11 +2237,11 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
 	if (running)
 		stop_per_cpu_kthreads();

-	mutex_lock(&interface_lock);
 	/*
 	 * avoid CPU hotplug operations that might read options.
 	 */
 	cpus_read_lock();
+	mutex_lock(&interface_lock);

 	retval = cnt;

@@ -2257,8 +2257,8 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
 			clear_bit(option, &osnoise_options);
 	}

-	cpus_read_unlock();
 	mutex_unlock(&interface_lock);
+	cpus_read_unlock();

 	if (running)
 		start_per_cpu_kthreads();
@@ -2345,16 +2345,16 @@ osnoise_cpus_write(struct file *filp, const char __user *ubuf, size_t count,
 	if (running)
 		stop_per_cpu_kthreads();

-	mutex_lock(&interface_lock);
 	/*
 	 * osnoise_cpumask is read by CPU hotplug operations.
 	 */
 	cpus_read_lock();
+	mutex_lock(&interface_lock);

 	cpumask_copy(&osnoise_cpumask, osnoise_cpumask_new);

-	cpus_read_unlock();
 	mutex_unlock(&interface_lock);
+	cpus_read_unlock();

 	if (running)
 		start_per_cpu_kthreads();
-- 
2.25.1
Re: [PATCH] tracing/osnoise: fix potential deadlock in cpu hotplug
Posted by Steven Rostedt 1 week, 2 days ago
On Tue, 24 Mar 2026 15:06:16 +0800 (CST)
<hu.shengming@zte.com.cn> wrote:

> From: luohaiyang10243395 <luo.haiyang@zte.com.cn>
> 
> The following sequence may leads deadlock in cpu hotplug:
> 
>   CPU0                        |  CPU1
>                               |  schedule_work_on
>                               |
>   _cpu_down//set CPU1 offline |
>   cpus_write_lock             |
>                               |  osnoise_hotplug_workfn
>                               |    mutex_lock(&interface_lock);
>                               |    cpus_read_lock();  //wait cpu_hotplug_lock
>                               |
>                               |  cpuhp/1
>                               |    osnoise_cpu_die
>                               |      kthread_stop
>                               |        wait_for_completion //wait osnoise/1 exit
>                               |
>                               |  osnoise/1
>                               |    osnoise_sleep
>                               |      mutex_lock(&interface_lock); //deadlock
> 
> Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).

So the deadlock is due to the "wait_for_completion"?

How did you find this bug? Inspection, AI, triggered?

Thanks,

-- Steve


> 
> Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
> ---
>  kernel/trace/trace_osnoise.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index dee610e465b9..be6cf0bb3c03 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -2073,8 +2073,8 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
>  	if (!osnoise_has_registered_instances())
>  		return;
> 
> -	guard(mutex)(&interface_lock);
>  	guard(cpus_read_lock)();
> +	guard(mutex)(&interface_lock);
> 
>  	if (!cpu_online(cpu))
>  		return;
> @@ -2237,11 +2237,11 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
>  	if (running)
>  		stop_per_cpu_kthreads();
> 
> -	mutex_lock(&interface_lock);
>  	/*
>  	 * avoid CPU hotplug operations that might read options.
>  	 */
>  	cpus_read_lock();
> +	mutex_lock(&interface_lock);
> 
>  	retval = cnt;
> 
> @@ -2257,8 +2257,8 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
>  			clear_bit(option, &osnoise_options);
>  	}
> 
> -	cpus_read_unlock();
>  	mutex_unlock(&interface_lock);
> +	cpus_read_unlock();
> 
>  	if (running)
>  		start_per_cpu_kthreads();
> @@ -2345,16 +2345,16 @@ osnoise_cpus_write(struct file *filp, const char __user *ubuf, size_t count,
>  	if (running)
>  		stop_per_cpu_kthreads();
> 
> -	mutex_lock(&interface_lock);
>  	/*
>  	 * osnoise_cpumask is read by CPU hotplug operations.
>  	 */
>  	cpus_read_lock();
> +	mutex_lock(&interface_lock);
> 
>  	cpumask_copy(&osnoise_cpumask, osnoise_cpumask_new);
> 
> -	cpus_read_unlock();
>  	mutex_unlock(&interface_lock);
> +	cpus_read_unlock();
> 
>  	if (running)
>  		start_per_cpu_kthreads();
Re: [PATCH] tracing/osnoise: fix potential deadlock in cpu hotplug
Posted by hu.shengming@zte.com.cn 1 week, 2 days ago
>On Tue, 24 Mar 2026 15:06:16 +0800 (CST)
><hu.shengming@zte.com.cn> wrote:
>
>> From: luohaiyang10243395 <luo.haiyang@zte.com.cn>
>> 
>> The following sequence may leads deadlock in cpu hotplug:
>> 
>>   CPU0                        |  CPU1
>>                               |  schedule_work_on
>>                               |
>>   _cpu_down//set CPU1 offline |
>>   cpus_write_lock             |
>>                               |  osnoise_hotplug_workfn
>>                               |    mutex_lock(&interface_lock);
>>                               |    cpus_read_lock();  //wait cpu_hotplug_lock
>>                               |
>>                               |  cpuhp/1
>>                               |    osnoise_cpu_die
>>                               |      kthread_stop
>>                               |        wait_for_completion //wait osnoise/1 exit
>>                               |
>>                               |  osnoise/1
>>                               |    osnoise_sleep
>>                               |      mutex_lock(&interface_lock); //deadlock
>> 
>> Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).
>
>So the deadlock is due to the "wait_for_completion"?

The osnoise_cpu_init callback returns immediately, which may allow another
CPU-offline operation to run. The offline path holds cpu_hotplug_lock while
waiting for the osnoise kthread to exit, but that kthread is blocked on
interface_lock, which osnoise_hotplug_workfn acquired first; the workfn in
turn is waiting for cpu_hotplug_lock. This is an ABBA deadlock.

>How did you find this bug? Inspection, AI, triggered?
>
>Thanks,
>
>-- Steve

We ran autotests on kernel 6.6, which reported the following hung-task
warning; we believe the same issue exists in linux-stable.
 [39401.476843] INFO: task cpuhp/7:47 blocked for more than 120 seconds.
 [39401.483196]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39401.490581] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39401.498398] task:cpuhp/7         state:D stack:0     pid:47    ppid:2      flags:0x00000208
 [39401.506739] Call trace:
 [39401.509175]  __switch_to+0x138/0x180
 [39401.512743]  __schedule+0x250/0x5e8
 [39401.516220]  schedule+0x60/0x100
 [39401.519437]  schedule_timeout+0x1a0/0x1c0
 [39401.523437]  wait_for_completion+0xbc/0x190
 [39401.527609]  kthread_stop+0x7c/0x268
 [39401.531175]  stop_kthread+0x8c/0x178
 [39401.534740]  osnoise_cpu_die+0xc/0x18
 [39401.538391]  cpuhp_invoke_callback+0x148/0x580
 [39401.542822]  cpuhp_thread_fun+0xc8/0x1a0
 [39401.546733]  smpboot_thread_fn+0x224/0x250
 [39401.550817]  kthread+0xf8/0x110
 [39401.553947]  ret_from_fork+0x10/0x20
 [39401.557545] INFO: task sh:28856 blocked for more than 120 seconds.
 [39401.563713]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39401.571095] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39401.578912] task:sh              state:D stack:0     pid:28856 ppid:1      flags:0x00800004
 [39401.587251] Call trace:
 [39401.589685]  __switch_to+0x138/0x180
 [39401.593250]  __schedule+0x250/0x5e8
 [39401.596725]  schedule+0x60/0x100
 [39401.599941]  schedule_timeout+0x1a0/0x1c0
 [39401.603940]  wait_for_completion+0xbc/0x190
 [39401.608113]  __flush_work+0x5c/0xa8
 [39401.611590]  work_on_cpu_key+0x88/0xc0
 [39401.615331]  cpu_down_maps_locked+0xd0/0xe8
 [39401.619503]  cpu_device_down+0x38/0x60
 [39401.623240]  cpu_subsys_offline+0x14/0x28
 [39401.627238]  device_offline+0xb8/0x130
 [39401.630976]  online_store+0x64/0xe0
 [39401.634453]  dev_attr_store+0x1c/0x38
 [39401.638104]  sysfs_kf_write+0x48/0x60
 [39401.641756]  kernfs_fop_write_iter+0x118/0x1e8
 [39401.646188]  vfs_write+0x1a4/0x2f8
 [39401.649580]  ksys_write+0x70/0x108
 [39401.652970]  __arm64_sys_write+0x20/0x30
 [39401.656880]  el0_svc_common.constprop.0+0x60/0x138
 [39401.661660]  do_el0_svc+0x20/0x30
 [39401.664964]  el0_svc+0x44/0x1f8
 [39401.668093]  el0t_64_sync_handler+0xf8/0x128
 [39401.672352]  el0t_64_sync+0x17c/0x180
 [39401.875086] INFO: task kworker/7:2:2314252 blocked for more than 121 seconds.
 [39401.882208]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39401.889590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39401.897406] task:kworker/7:2     state:D stack:0     pid:2314252 ppid:2      flags:0x00000008
 [39401.905917] Workqueue: events osnoise_hotplug_workfn
 [39401.910871] Call trace:
 [39401.913306]  __switch_to+0x138/0x180
 [39401.916870]  __schedule+0x250/0x5e8
 [39401.920345]  schedule+0x60/0x100
 [39401.923561]  percpu_rwsem_wait+0xfc/0x128
 [39401.927559]  __percpu_down_read+0x60/0x198
 [39401.931644]  percpu_down_read.constprop.0+0xac/0xb8
 [39401.936510]  cpus_read_lock+0x14/0x20
 [39401.940160]  osnoise_hotplug_workfn+0x54/0xb0
 [39401.944506]  process_one_work+0x184/0x420
 [39401.948503]  worker_thread+0x2b4/0x3d8
 [39401.952241]  kthread+0xf8/0x110
 [39401.955370]  ret_from_fork+0x10/0x20
 [39402.125508] INFO: task osnoise/0:2356235 blocked for more than 121 seconds.
 [39402.132458]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39402.139840] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39402.147656] task:osnoise/0       state:D stack:0     pid:2356235 ppid:2      flags:0x00000008
 [39402.156168] Call trace:
 [39402.158602]  __switch_to+0x138/0x180
 [39402.162166]  __schedule+0x250/0x5e8
 [39402.165643]  schedule+0x60/0x100
 [39402.168860]  schedule_preempt_disabled+0x28/0x48
 [39402.173466]  __mutex_lock.constprop.0+0x324/0x5f8
 [39402.178158]  __mutex_lock_slowpath+0x18/0x28
 [39402.182416]  mutex_lock+0x64/0x78
 [39402.185720]  osnoise_sleep+0x30/0x130
 [39402.189371]  osnoise_main+0x164/0x190
 [39402.193021]  kthread+0xf8/0x110
 [39402.196149]  ret_from_fork+0x10/0x20
 [39402.199713] INFO: task osnoise/1:2356236 blocked for more than 121 seconds.
 [39402.206661]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39402.214044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39402.221860] task:osnoise/1       state:D stack:0     pid:2356236 ppid:2      flags:0x00000008
 [39402.230372] Call trace:
 [39402.232804]  __switch_to+0x138/0x180
 [39402.236368]  __schedule+0x250/0x5e8
 [39402.239845]  schedule+0x60/0x100
 [39402.243061]  schedule_preempt_disabled+0x28/0x48
 [39402.247666]  __mutex_lock.constprop.0+0x324/0x5f8
 [39402.252359]  __mutex_lock_slowpath+0x18/0x28
 [39402.256618]  mutex_lock+0x64/0x78
 [39402.259921]  osnoise_sleep+0x30/0x130
 [39402.263572]  osnoise_main+0x164/0x190
 [39402.267223]  kthread+0xf8/0x110
 [39402.270352]  ret_from_fork+0x10/0x20
 [39402.273916] INFO: task osnoise/2:2356237 blocked for more than 121 seconds.
 [39402.280865]       Tainted: G            E      6.6.102-5.2.1.an23.103.aarch64 #1
 [39402.288247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [39402.296064] task:osnoise/2       state:D stack:0     pid:2356237 ppid:2      flags:0x00000008
 [39402.304575] Call trace:
 [39402.307010]  __switch_to+0x138/0x180
 [39402.310574]  __schedule+0x250/0x5e8
 [39402.314051]  schedule+0x60/0x100
 [39402.317268]  schedule_preempt_disabled+0x28/0x48
 [39402.321873]  __mutex_lock.constprop.0+0x324/0x5f8
 [39402.326566]  __mutex_lock_slowpath+0x18/0x28
 [39402.330824]  mutex_lock+0x64/0x78
 [39402.334128]  osnoise_sleep+0x30/0x130
 [39402.337778]  osnoise_main+0x164/0x190
 [39402.341429]  kthread+0xf8/0x110
 [39402.344556]  ret_from_fork+0x10/0x20
 [39402.348120] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
 [39402.356295] Kernel panic - not syncing: hung_task: blocked tasks 

Thanks,
Haiyang

>> 
>> Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
>> ---
>>  kernel/trace/trace_osnoise.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
>> index dee610e465b9..be6cf0bb3c03 100644
>> --- a/kernel/trace/trace_osnoise.c
>> +++ b/kernel/trace/trace_osnoise.c
>> @@ -2073,8 +2073,8 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
>>      if (!osnoise_has_registered_instances())
>>          return;
>> 
>> -    guard(mutex)(&interface_lock);
>>      guard(cpus_read_lock)();
>> +    guard(mutex)(&interface_lock);
>> 
>>      if (!cpu_online(cpu))
>>          return;
>> @@ -2237,11 +2237,11 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
>>      if (running)
>>          stop_per_cpu_kthreads();
>> 
>> -    mutex_lock(&interface_lock);
>>      /*
>>       * avoid CPU hotplug operations that might read options.
>>       */
>>      cpus_read_lock();
>> +    mutex_lock(&interface_lock);
>> 
>>      retval = cnt;
>> 
>> @@ -2257,8 +2257,8 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
>>              clear_bit(option, &osnoise_options);
>>      }
>> 
>> -    cpus_read_unlock();
>>      mutex_unlock(&interface_lock);
>> +    cpus_read_unlock();
>> 
>>      if (running)
>>          start_per_cpu_kthreads();
>> @@ -2345,16 +2345,16 @@ osnoise_cpus_write(struct file *filp, const char __user *ubuf, size_t count,
>>      if (running)
>>          stop_per_cpu_kthreads();
>> 
>> -    mutex_lock(&interface_lock);
>>      /*
>>       * osnoise_cpumask is read by CPU hotplug operations.
>>       */
>>      cpus_read_lock();
>> +    mutex_lock(&interface_lock);
>> 
>>      cpumask_copy(&osnoise_cpumask, osnoise_cpumask_new);
>> 
>> -    cpus_read_unlock();
>>      mutex_unlock(&interface_lock);
>> +    cpus_read_unlock();
>> 
>>      if (running)
>>          start_per_cpu_kthreads();
Re: [PATCH] tracing/osnoise: fix potential deadlock in cpu hotplug
Posted by Steven Rostedt 1 week, 1 day ago
On Wed, 25 Mar 2026 10:25:42 +0800 (CST)
<hu.shengming@zte.com.cn> wrote:

> >On Tue, 24 Mar 2026 15:06:16 +0800 (CST)
> ><hu.shengming@zte.com.cn> wrote:
> >  
> >> From: luohaiyang10243395 <luo.haiyang@zte.com.cn>
> >> 
> >> The following sequence may leads deadlock in cpu hotplug:
> >> 
> >>   CPU0                        |  CPU1
> >>                               |  schedule_work_on
> >>                               |
> >>   _cpu_down//set CPU1 offline |
> >>   cpus_write_lock             |
> >>                               |  osnoise_hotplug_workfn
> >>                               |    mutex_lock(&interface_lock);
> >>                               |    cpus_read_lock();  //wait cpu_hotplug_lock
> >>                               |
> >>                               |  cpuhp/1
> >>                               |    osnoise_cpu_die
> >>                               |      kthread_stop
> >>                               |        wait_for_completion //wait osnoise/1 exit
> >>                               |
> >>                               |  osnoise/1
> >>                               |    osnoise_sleep
> >>                               |      mutex_lock(&interface_lock); //deadlock
> >> 
> >> Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).  
> >
> >So the deadlock is due to the "wait_for_completion"?  
> 
> The osnoise_cpu_init callback returns directly, which may allow another CPU offline task to run, 
> the offline task holds the cpu_hotplug_lock while waiting for the osnoise task to exit. 
> osnoise_hotplug_workfn may acquire interface_lock first, causing the offline task to be blocked. 
> This is an ABBA deadlock.

Right, as I said, it is due to the "wait_for_completion" and not due to two
different locks. One is waiting for the osnoise task to exit (the
"wait_for_completion") but the osnoise task is blocked on the interface_lock.

Better to show it as:


    task1		task2		task3
    -----		-----		-----

 mutex_lock(&interface_lock)

		    [CPU GOING OFFLINE]

		    cpus_write_lock();
		    osnoise_cpu_die();
		      kthread_stop(task3);
		        wait_for_completion();

				      osnoise_sleep();
				        mutex_lock(&interface_lock);

 cpus_read_lock();

 [DEAD LOCK]

> 
> >How did you find this bug? Inspection, AI, triggered?
> >
> >Thanks,
> >
> >-- Steve  
> 
> We run autotests on kernel-6.6, report following hung task warning, and we think the same issue exists
> in linux-stable.

Thanks. It's usually good to state how a bug was discovered when fixing it.

Could you send a v2 with an updated change log?

-- Steve