cpufreq: Fix hotplug-suspend race during reboot

[PATCH v2] cpufreq: Fix hotplug-suspend race during reboot

Posted by Tianxiang Chen 2 months, 1 week ago

During system reboot, cpufreq_suspend() is called via the
kernel_restart() -> device_shutdown() -> pm_notifier_call_chain()
path. Unlike the normal system suspend path, the reboot path does not
call freeze_processes(), so userspace processes and kernel threads
remain active.

This allows CPU hotplug operations to run concurrently with
cpufreq_suspend(). The original code has no synchronization with CPU
hotplug, leading to a race condition where governor_data can be freed
by the hotplug path while cpufreq_suspend() is still accessing it,
resulting in a null pointer dereference:

  Unable to handle kernel NULL pointer dereference
  Call Trace:
   do_kernel_fault+0x28/0x3c
   cpufreq_suspend+0xdc/0x160
   device_shutdown+0x18/0x200
   kernel_restart+0x40/0x80
   arm64_sys_reboot+0x1b0/0x200

Fix this by adding cpus_read_lock()/cpus_read_unlock() to
cpufreq_suspend() to block CPU hotplug operations while suspend is in
progress.

Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
---
v2:
- Update changelog to explicitly mention reboot scenario
- Add observed crash trace
---
 drivers/cpufreq/cpufreq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f794524a1d9..6f1d264c378b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
 	if (!cpufreq_driver)
 		return;
 
+	cpus_read_lock();
 	if (!has_target() && !cpufreq_driver->suspend)
 		goto suspend;
 
@@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
 
 suspend:
 	cpufreq_suspended = true;
+	cpus_read_unlock();
 }
 
 /**
-- 
2.34.1

Re: [PATCH v2] cpufreq: Fix hotplug-suspend race during reboot

Posted by Zhongqiu Han 2 months ago

On 4/8/2026 10:19 PM, Tianxiang Chen wrote:
> During system reboot, cpufreq_suspend() is called via the
> kernel_restart() -> device_shutdown() -> pm_notifier_call_chain()
> path. Unlike the normal system suspend path, the reboot path does not
> call freeze_processes(), so userspace processes and kernel threads
> remain active.
> 
> This allows CPU hotplug operations to run concurrently with
> cpufreq_suspend(). The original code has no synchronization with CPU
> hotplug, leading to a race condition where governor_data can be freed
> by the hotplug path while cpufreq_suspend() is still accessing it,
> resulting in a null pointer dereference:
> 
>    Unable to handle kernel NULL pointer dereference
>    Call Trace:
>     do_kernel_fault+0x28/0x3c
>     cpufreq_suspend+0xdc/0x160
>     device_shutdown+0x18/0x200
>     kernel_restart+0x40/0x80
>     arm64_sys_reboot+0x1b0/0x200
> 
> Fix this by adding cpus_read_lock()/cpus_read_unlock() to
> cpufreq_suspend() to block CPU hotplug operations while suspend is in
> progress.
> 
> Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
> ---
> v2:
> - Update changelog to explicitly mention reboot scenario
> - Add observed crash trace
> ---
>   drivers/cpufreq/cpufreq.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 1f794524a1d9..6f1d264c378b 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
>   	if (!cpufreq_driver)
>   		return;
>   
> +	cpus_read_lock();
>   	if (!has_target() && !cpufreq_driver->suspend)
>   		goto suspend;
>   
> @@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
>   
>   suspend:
>   	cpufreq_suspended = true;
> +	cpus_read_unlock();
>   }
>   
>   /**

Hi Tianxiang,

May I know did you test this with lockdep enabled? Specifically, does
the new cpus_read_lock() → policy->rwsem ordering in cpufreq_suspend()
trigger any lockdep warnings? Thanks



-- 
Thx and BRs,
Zhongqiu Han

Re: [PATCH v2] cpufreq: Fix hotplug-suspend race during reboot

Posted by Tianxiang Chen 1 month ago

On 4/14/2026 10:44 PM, Zhongqiu Han wrote:
> May I know did you test this with lockdep enabled? Specifically, does
> the new cpus_read_lock() -> policy->rwsem ordering in cpufreq_suspend()
> trigger any lockdep warnings? Thanks

Hi Zhongqiu,

Thanks for the review.

I did test v2 with lockdep enabled and was NOT able to reproduce any
warning. 

No circular-locking splat or "possible deadlock" report was observed
in dmesg across multiple runs.

My reasoning for why the new order should be safe:

  * The patch establishes  cpus_read_lock() -> policy->rwsem.
  * The hotplug path already holds cpu_hotplug_lock (write side,
    via the hotplug core) before taking policy->rwsem inside
    cpufreq_offline()/cpufreq_online(), i.e. the same direction.
  * I grep'd cpufreq and did not find an existing path that takes
    policy->rwsem first and then acquires cpus_read_lock()
    underneath. If I missed one, please point me at it.
  * cpus_read_lock() is a percpu-rwsem read side and is re-entrant,
    so even if an outer caller already holds it (e.g. via a pm
    notifier running inside a hotplug callback) this is safe.

May I ask whether you have actually observed a lockdep splat on this
change on any downstream tree, or is this a precautionary question?
If you have a specific call chain in mind, I would like to add
targeted coverage before v3 so we can nail it down definitively.

Thanks,
Tianxiang

Re: [PATCH v2] cpufreq: Fix hotplug-suspend race during reboot

Posted by Zhongqiu Han 1 month ago

On 5/11/2026 7:14 PM, Tianxiang Chen wrote:
> On 4/14/2026 10:44 PM, Zhongqiu Han wrote:
>> May I know did you test this with lockdep enabled? Specifically, does
>> the new cpus_read_lock() -> policy->rwsem ordering in cpufreq_suspend()
>> trigger any lockdep warnings? Thanks
> 
> Hi Zhongqiu,
> 
> Thanks for the review.
> 
> I did test v2 with lockdep enabled and was NOT able to reproduce any
> warning.
> 
> No circular-locking splat or "possible deadlock" report was observed
> in dmesg across multiple runs.


Thanks for the confirmation.

> 
> My reasoning for why the new order should be safe:
> 
>    * The patch establishes  cpus_read_lock() -> policy->rwsem.
>    * The hotplug path already holds cpu_hotplug_lock (write side,
>      via the hotplug core) before taking policy->rwsem inside
>      cpufreq_offline()/cpufreq_online(), i.e. the same direction.
>    * I grep'd cpufreq and did not find an existing path that takes
>      policy->rwsem first and then acquires cpus_read_lock()
>      underneath. If I missed one, please point me at it.
>    * cpus_read_lock() is a percpu-rwsem read side and is re-entrant,
>      so even if an outer caller already holds it (e.g. via a pm
>      notifier running inside a hotplug callback) this is safe.
> 
> May I ask whether you have actually observed a lockdep splat on this
> change on any downstream tree, or is this a precautionary question?
> If you have a specific call chain in mind, I would like to add
> targeted coverage before v3 so we can nail it down definitively.


This was mainly a precautionary question on my side, to make sure there
aren't any hidden locking concerns. In my experience, running new
locking changes under lockdep can be a useful sanity check, so I wanted
to double‑check.


For context, none of the existing cpufreq driver .suspend callbacks
currently invoke cpus_read_lock() so this does not affect any existing
suspend paths.


Looks okay to me.

Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>


> 
> Thanks,
> Tianxiang


-- 
Thx and BRs,
Zhongqiu Han

Re: [PATCH v2] cpufreq: Fix hotplug-suspend race during reboot

Posted by Rafael J. Wysocki 3 weeks, 3 days ago

On Mon, May 11, 2026 at 3:54 PM Zhongqiu Han
<zhongqiu.han@oss.qualcomm.com> wrote:
>
> On 5/11/2026 7:14 PM, Tianxiang Chen wrote:
> > On 4/14/2026 10:44 PM, Zhongqiu Han wrote:
> >> May I know did you test this with lockdep enabled? Specifically, does
> >> the new cpus_read_lock() -> policy->rwsem ordering in cpufreq_suspend()
> >> trigger any lockdep warnings? Thanks
> >
> > Hi Zhongqiu,
> >
> > Thanks for the review.
> >
> > I did test v2 with lockdep enabled and was NOT able to reproduce any
> > warning.
> >
> > No circular-locking splat or "possible deadlock" report was observed
> > in dmesg across multiple runs.
>
>
> Thanks for the confirmation.
>
> >
> > My reasoning for why the new order should be safe:
> >
> >    * The patch establishes  cpus_read_lock() -> policy->rwsem.
> >    * The hotplug path already holds cpu_hotplug_lock (write side,
> >      via the hotplug core) before taking policy->rwsem inside
> >      cpufreq_offline()/cpufreq_online(), i.e. the same direction.
> >    * I grep'd cpufreq and did not find an existing path that takes
> >      policy->rwsem first and then acquires cpus_read_lock()
> >      underneath. If I missed one, please point me at it.
> >    * cpus_read_lock() is a percpu-rwsem read side and is re-entrant,
> >      so even if an outer caller already holds it (e.g. via a pm
> >      notifier running inside a hotplug callback) this is safe.
> >
> > May I ask whether you have actually observed a lockdep splat on this
> > change on any downstream tree, or is this a precautionary question?
> > If you have a specific call chain in mind, I would like to add
> > targeted coverage before v3 so we can nail it down definitively.
>
>
> This was mainly a precautionary question on my side, to make sure there
> aren't any hidden locking concerns. In my experience, running new
> locking changes under lockdep can be a useful sanity check, so I wanted
> to double‑check.
>
>
> For context, none of the existing cpufreq driver .suspend callbacks
> currently invoke cpus_read_lock() so this does not affect any existing
> suspend paths.
>
>
> Looks okay to me.
>
> Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>

Applied as 7.1-rc material, thanks!