[PATCH v2] cpufreq: fix locking order in store_local_boost to prevent deadlock

Posted by Seyediman Seyedarab 9 months, 2 weeks ago

Lockdep reports a possible circular locking dependency[1] when
writing to /sys/devices/system/cpu/cpufreq/policyN/boost,
triggered by power-profiles-daemon at boot.

store_local_boost() acquires cpu_hotplug_lock *AFTER* policy->rwsem
has already been taken by the store() handler. However, the expected
locking hierarchy is to acquire cpu_hotplug_lock before policy->rwsem.
This inverted lock order creates a *theoretical* deadlock possibility.

Acquire cpu_hotplug_lock in the store() handler, *only* for the
local_boost attribute, before taking policy->rwsem with down_write(),
and remove the cpus_read_lock()/cpus_read_unlock() calls from
store_local_boost().

 [1]
 ======================================================
 WARNING: possible circular locking dependency detected
 6.15.0-rc3-debug #28 Not tainted
 ------------------------------------------------------
 power-profiles-/596 is trying to acquire lock:
 ffffffffb147e910 (cpu_hotplug_lock){++++}-{0:0}, at: store_local_boost+0x6a/0xd0

 but task is already holding lock:
 ffff9eaa48377b80 (&policy->rwsem){++++}-{4:4}, at: store+0x37/0x90

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&policy->rwsem){++++}-{4:4}:
        down_write+0x29/0xb0
        cpufreq_online+0x841/0xa00
        cpufreq_add_dev+0x71/0x80
        subsys_interface_register+0x14b/0x170
        cpufreq_register_driver+0x154/0x250
        amd_pstate_register_driver+0x36/0x70
        amd_pstate_init+0x1e7/0x270
        do_one_initcall+0x67/0x2c0
        kernel_init_freeable+0x230/0x270
        kernel_init+0x15/0x130
        ret_from_fork+0x2c/0x50
        ret_from_fork_asm+0x11/0x20

 -> #1 (subsys mutex#3){+.+.}-{4:4}:
        __mutex_lock+0xc2/0x930
        subsys_interface_register+0x83/0x170
        cpufreq_register_driver+0x154/0x250
        amd_pstate_register_driver+0x36/0x70
        amd_pstate_init+0x1e7/0x270
        do_one_initcall+0x67/0x2c0
        kernel_init_freeable+0x230/0x270
        kernel_init+0x15/0x130
        ret_from_fork+0x2c/0x50
        ret_from_fork_asm+0x11/0x20

 -> #0 (cpu_hotplug_lock){++++}-{0:0}:
        __lock_acquire+0x1087/0x17e0
        lock_acquire.part.0+0x66/0x1b0
        cpus_read_lock+0x2a/0xc0
        store_local_boost+0x6a/0xd0
        store+0x50/0x90
        kernfs_fop_write_iter+0x135/0x200
        vfs_write+0x2ab/0x540
        ksys_write+0x6c/0xe0
        do_syscall_64+0xbb/0x1d0
        entry_SYSCALL_64_after_hwframe+0x56/0x5e

Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
---
Changes in v2:
 - Restrict cpu_hotplug_lock acquisition to only
   the local_boost attribute in store() handler.

Regards,
Seyediman

 drivers/cpufreq/cpufreq.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index acf19b004..5464c8487 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -653,10 +653,7 @@ static ssize_t store_local_boost(struct cpufreq_policy *policy,
 
 	policy->boost_enabled = enable;
 
-	cpus_read_lock();
 	ret = cpufreq_driver->set_boost(policy, enable);
-	cpus_read_unlock();
-
 	if (ret) {
 		policy->boost_enabled = !policy->boost_enabled;
 		return ret;
@@ -1045,11 +1042,17 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
 	if (!fattr->store)
 		return -EIO;
 
+	if (fattr == &local_boost)
+		cpus_read_lock();
+
 	down_write(&policy->rwsem);
 	if (likely(!policy_is_inactive(policy)))
 		ret = fattr->store(policy, buf, count);
 	up_write(&policy->rwsem);
 
+	if (fattr == &local_boost)
+		cpus_read_unlock();
+
 	return ret;
 }
 
-- 
2.49.0

Re: [PATCH v2] cpufreq: fix locking order in store_local_boost to prevent deadlock

Posted by Viresh Kumar 9 months, 2 weeks ago

On 28-04-25, 13:19, Seyediman Seyedarab wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index acf19b004..5464c8487 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -653,10 +653,7 @@ static ssize_t store_local_boost(struct cpufreq_policy *policy,
>  
>  	policy->boost_enabled = enable;
>  
> -	cpus_read_lock();
>  	ret = cpufreq_driver->set_boost(policy, enable);
> -	cpus_read_unlock();
> -
>  	if (ret) {
>  		policy->boost_enabled = !policy->boost_enabled;
>  		return ret;
> @@ -1045,11 +1042,17 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
>  	if (!fattr->store)
>  		return -EIO;
>  
> +	if (fattr == &local_boost)
> +		cpus_read_lock();
> +

Please add a comment as well to explain why this is required here.

>  	down_write(&policy->rwsem);

And please rebase over PM tree's linux-next branch, your patch won't
apply anymore due to recent changes.

>  	if (likely(!policy_is_inactive(policy)))
>  		ret = fattr->store(policy, buf, count);
>  	up_write(&policy->rwsem);
>  
> +	if (fattr == &local_boost)
> +		cpus_read_unlock();
> +
>  	return ret;
>  }

-- 
viresh

Re: [PATCH v2] cpufreq: fix locking order in store_local_boost to prevent deadlock

Posted by Seyediman Seyedarab 9 months, 2 weeks ago

On 25/04/29 10:54AM, Viresh Kumar wrote:
> On 28-04-25, 13:19, Seyediman Seyedarab wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index acf19b004..5464c8487 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1045,11 +1042,17 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
> >  	if (!fattr->store)
> >  		return -EIO;
> >  
> > +	if (fattr == &local_boost)
> > +		cpus_read_lock();
> > +
> >  	down_write(&policy->rwsem);
 
> And please rebase over PM tree's linux-next branch, your patch won't
> apply anymore due to recent changes.
 
Since the new code uses the guard() macro to lock the policy,
I see two possible ways to address this:
 1. Switch to scoped_guard() and manually wrap the local boost case
    in an if block, calling cpus_read_lock() and cpus_read_unlock()
    around the scoped_guard(cpufreq_policy_write, policy) block.

 2. Use DEFINE_GUARD() to create a cpu_hotplug_read guard, but this
    requires introducing an empty struct to satisfy the macro. That feels
    semantically awkward, since there's no actual object being guarded.

Option 1 looks more reasonable to me, because it's more direct and avoids
misusing the guard pattern for something that's not resource-based.

Let me know if you'd prefer the second approach for consistency.

Regards,
Seyediman