drivers/base/power/runtime.c | 54 ++++++++++++ drivers/clk/clk.c | 204 +++++++++++++++++++++++++++++++------------ include/linux/pm_runtime.h | 2 + 3 files changed, 204 insertions(+), 56 deletions(-)
As explained in the following thread, there is a known ABBA locking
dependency between clk and runtime PM.
Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/
The problem is that the clk subsystem uses a mutex to protect concurrent
accesses to its tree structure, and so do other subsystems such as
generic power domains. While it holds its own mutex, the clk subsystem
performs runtime PM calls which end up executing callbacks from other
subsystems (again, gen PD is in the loop). But typically power domains
may also need to perform clock related operations, and thus the
following two situations may happen:
mutex_lock(clk);
mutex_lock(genpd);
or
mutex_lock(genpd);
mutex_lock(clk);
As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
are complex enough to face this kind of issues.
There's been a first workaround to "silence" lockdep with the most
obvious case triggering the warning: making sure all clocks are RPM
enabled before running the clk_disable_unused() work, but this is just
addressing one situation among many other potentially problematic
situations. In the past, both Laurent Pinchart and Marek Vasut have
experienced these issues when enabling HDMI and audio support,
respectively.
Following a discussion we had at last Plumbers with Steven, I am
proposing to decouple both locks by changing a bit the clk approach:
let's always runtime resume all clocks that we *might* need before
taking the clock lock. But how do we know the list? Well, depending on
the situation we may either need to wake up:
- the upper part of the tree during prepare/unprepare operations.
- the lower part of the tree during (read) rate operations.
- the upper part and the lower part of the tree otherwise (especially
during rate changes which may involve reparenting).
Luckily, we do not need to do that by hand, are more importantly we do
not need to use the clock tree for that because thanks to the work from
Saravana, we already have device links describing exhaustively the
consumer/supplier relationships. The clock topology (from a runtime PM
perspective) is reflected in these links. In practice, we do not care
about all consumers, but the few clock operations that will actually
trigger runtime PM operations are probably not impacting enough to
justify something more complex.
So here it is: every patch in this series decouples the two locks in
various places of the clock subsystem, until we reach a point where all
needed clocks will always be resumed before acquiring the core lock. It
obviously requires a few new helpers in the RPM core which may probably
be enhanced, I've tried to keep them as simple and straightforward as
possible.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
Miquel Raynal (10):
PM: runtime: Add helpers to resume consumers
clk: Improve comments with usual punctuation
clk: Avoid non needed runtime PM calls
clk: Avoid open coded-logic when a there is a helper available
clk: Move runtime PM calls out of the prepare_lock in clk_init()
clk: Move runtime PM calls out of the prepare_lock in clk_prepare()
clk: Ensure all RPM enabled clocks are enabled before reparenting orphans
clk: Move runtime PM calls out of the prepare_lock in clk_unregister()
clk: Make sure clock parents and children are resumed when necessary
clk: Fix the ABBA locking issue with runtime PM subcalls
drivers/base/power/runtime.c | 54 ++++++++++++
drivers/clk/clk.c | 204 +++++++++++++++++++++++++++++++------------
include/linux/pm_runtime.h | 2 +
3 files changed, 204 insertions(+), 56 deletions(-)
---
base-commit: ab6df33805e6e6e4ac1a519cfcade3f7f19f6ff1
change-id: 20250307-cross-lock-dep-e65f285ce8e1
Best regards,
--
Miquel Raynal <miquel.raynal@bootlin.com>
Quoting Miquel Raynal (2025-03-26 11:26:15)
> As explained in the following thread, there is a known ABBA locking
> dependency between clk and runtime PM.
> Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/
>
> The problem is that the clk subsystem uses a mutex to protect concurrent
> accesses to its tree structure, and so do other subsystems such as
> generic power domains. While it holds its own mutex, the clk subsystem
> performs runtime PM calls which end up executing callbacks from other
> subsystems (again, gen PD is in the loop). But typically power domains
> may also need to perform clock related operations, and thus the
> following two situations may happen:
>
> mutex_lock(clk);
> mutex_lock(genpd);
>
> or
>
> mutex_lock(genpd);
> mutex_lock(clk);
>
> As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
> are complex enough to face this kind of issues.
>
> There's been a first workaround to "silence" lockdep with the most
> obvious case triggering the warning: making sure all clocks are RPM
> enabled before running the clk_disable_unused() work, but this is just
> addressing one situation among many other potentially problematic
> situations. In the past, both Laurent Pinchart and Marek Vasut have
> experienced these issues when enabling HDMI and audio support,
> respectively.
>
> Following a discussion we had at last Plumbers with Steven, I am
> proposing to decouple both locks by changing a bit the clk approach:
> let's always runtime resume all clocks that we *might* need before
> taking the clock lock. But how do we know the list? Well, depending on
> the situation we may either need to wake up:
> - the upper part of the tree during prepare/unprepare operations.
> - the lower part of the tree during (read) rate operations.
> - the upper part and the lower part of the tree otherwise (especially
> during rate changes which may involve reparenting).
Thanks for taking on this work. This problem is coming up more and more
often.
>
> Luckily, we do not need to do that by hand, are more importantly we do
> not need to use the clock tree for that because thanks to the work from
> Saravana, we already have device links describing exhaustively the
> consumer/supplier relationships. The clock topology (from a runtime PM
> perspective) is reflected in these links. In practice, we do not care
> about all consumers, but the few clock operations that will actually
> trigger runtime PM operations are probably not impacting enough to
> justify something more complex.
This won't always work, for a couple reasons. First because clk drivers
aren't required to describe their parent clks that are outside the clk
controller by using DT with a 'clocks' property in the clk controller
node. Second because there can be a many to one relationship between a
struct device and struct device_node. We're trying to push drivers to be
written in a way that the binding has the 'clocks' property, but that
isn't always the case, so we still need a solution that works in all
cases so as to not regress old (legacy?) implementations or ones that
divide a platform device into some number of auxiliary devices and
drivers.
One idea to do that would be to implement the device links between clk
controller devices based on all possible parents of the clk. We support
lazily registering clks though, meaning a parent could be registered at
any time, so we would have to explore the clk tree each time a clk is
registered to see if any new clks need to be found and device links made
between devices. The general algorithm is probably something like:
clk_register()
make_links_for_node()
if device node missing 'clocks' property
for each parent string
pclk = find parent clk
pdev = pclk->dev
link pdev to dev node
else
for each clk in clocks property
pclk = find parent clk
pdev = pclk->dev
link pdev to dev node
We have to get the parent clk in all cases because we don't know which
device it may be registered for (the platform device or auxiliary
device). If we optimize here, I'd prefer we optimize for the case where
the 'clocks' property is present to encourage migration. Maybe what we
can do is make some structure for a clk controller and have a linked
list of those that we look up when a new clk is registered. We actually
have that already with 'struct clock_provider' so maybe we need to
extend that.
Stash the device pointer in there and some variable sized array of the
clk_core pointers to the external clks. In the 'clocks' DT property
case, we can size this immediately and map the array index to the
property but in the non-property case we'll have to grow this array each
time a new clk is found to be a parent of the device. Maybe for that we
should just have some other sort of linked list of clk_core pointers
that we continue to stack clks onto.
struct clock_provider {
void (*clk_init_cb)(struct device_node *);
struct device *dev;
struct device_node *np;
struct list_head node;
struct clk_core *legacy_clks; // Or struct list_head legacy?
size_t len_clocks;
struct clk_core clocks_property[];
};
Hello Stephen, all,
On Mon, 14 Apr 2025 18:00:15 -0700
Stephen Boyd <sboyd@kernel.org> wrote:
> Quoting Miquel Raynal (2025-03-26 11:26:15)
> > As explained in the following thread, there is a known ABBA locking
> > dependency between clk and runtime PM.
> > Link: https://lore.kernel.org/linux-clk/20240527181928.4fc6b5f0@xps-13/
> >
> > The problem is that the clk subsystem uses a mutex to protect concurrent
> > accesses to its tree structure, and so do other subsystems such as
> > generic power domains. While it holds its own mutex, the clk subsystem
> > performs runtime PM calls which end up executing callbacks from other
> > subsystems (again, gen PD is in the loop). But typically power domains
> > may also need to perform clock related operations, and thus the
> > following two situations may happen:
> >
> > mutex_lock(clk);
> > mutex_lock(genpd);
> >
> > or
> >
> > mutex_lock(genpd);
> > mutex_lock(clk);
> >
> > As of today I know that at least NXP i.MX8MP and MediaTek MT8183 SoCs
> > are complex enough to face this kind of issues.
> >
> > There's been a first workaround to "silence" lockdep with the most
> > obvious case triggering the warning: making sure all clocks are RPM
> > enabled before running the clk_disable_unused() work, but this is just
> > addressing one situation among many other potentially problematic
> > situations. In the past, both Laurent Pinchart and Marek Vasut have
> > experienced these issues when enabling HDMI and audio support,
> > respectively.
> >
> > Following a discussion we had at last Plumbers with Steven, I am
> > proposing to decouple both locks by changing a bit the clk approach:
> > let's always runtime resume all clocks that we *might* need before
> > taking the clock lock. But how do we know the list? Well, depending on
> > the situation we may either need to wake up:
> > - the upper part of the tree during prepare/unprepare operations.
> > - the lower part of the tree during (read) rate operations.
> > - the upper part and the lower part of the tree otherwise (especially
> > during rate changes which may involve reparenting).
>
> Thanks for taking on this work. This problem is coming up more and more
> often.
Reviving this thread after today I had a very rare occurrence of
apparently this same issue:
WARNING: possible circular locking dependency detected
It happened on imx8mp, on a board and with a setup that I'm using since
many months to do unrelated development (mostly DRM). It was a very rare
occurrence because I always have clk_ignore_unused in my kernel cmdline.
On my setup that warning appeared exactly once in thousands of boots
I've done in several months. Just rebooting without changing anything
and it didn't show up again.
Here's the full warning message:
[ 5.077473] ======================================================
[ 5.083658] WARNING: possible circular locking dependency detected
[ 5.089845] 6.17.0-rc4+ #2 Tainted: G T
[ 5.095164] ------------------------------------------------------
[ 5.101346] kworker/u16:4/52 is trying to acquire lock:
[ 5.106576] ffff0000016ae740 (&genpd->mlock){+.+.}-{4:4}, at: genpd_lock_mtx+0x20/0x38
[ 5.114533]
[ 5.114533] but task is already holding lock:
[ 5.120368] ffff800084eb5258 (prepare_lock){+.+.}-{4:4}, at: clk_prepare_lock+0x38/0xc0
[ 5.128404]
[ 5.128404] which lock already depends on the new lock.
[ 5.128404]
[ 5.136583]
[ 5.136583] the existing dependency chain (in reverse order) is:
[ 5.144070]
[ 5.144070] -> #1 (prepare_lock){+.+.}-{4:4}:
[ 5.149924] __mutex_lock+0xb8/0x7f0
[ 5.154034] mutex_lock_nested+0x2c/0x40
[ 5.158487] clk_prepare_lock+0x58/0xc0
[ 5.162849] clk_prepare+0x28/0x58
[ 5.166780] clk_bulk_prepare+0x54/0xe8
[ 5.171141] imx_pgc_power_up+0x80/0x378
[ 5.175592] _genpd_power_on+0xa0/0x168
[ 5.179955] genpd_power_on+0xd8/0x248
[ 5.184234] genpd_runtime_resume+0x12c/0x298
[ 5.189121] __rpm_callback+0x50/0x200
[ 5.193400] rpm_callback+0x7c/0x90
[ 5.197414] rpm_resume+0x534/0x718
[ 5.201432] __pm_runtime_resume+0x58/0xa8
[ 5.206056] pm_runtime_get_suppliers+0x6c/0xa0
[ 5.211117] __driver_probe_device+0x50/0x140
[ 5.216002] driver_probe_device+0xe0/0x170
[ 5.220710] __driver_attach+0xa0/0x1c0
[ 5.225074] bus_for_each_dev+0x90/0xf8
[ 5.229436] driver_attach+0x2c/0x40
[ 5.233538] bus_add_driver+0xec/0x218
[ 5.237816] driver_register+0x64/0x138
[ 5.242178] __platform_driver_register+0x2c/0x40
[ 5.247411] hotplug_bridge_dynconn_get_modes+0x28/0x48 [hotplug_bridge]
[ 5.254654] do_one_initcall+0x84/0x358
[ 5.259020] do_init_module+0x60/0x268
[ 5.263298] load_module+0x1fc4/0x2108
[ 5.267574] init_module_from_file+0x90/0xe0
[ 5.272372] idempotent_init_module+0x1f8/0x300
[ 5.277432] __arm64_sys_finit_module+0x6c/0xb8
[ 5.282491] invoke_syscall+0x50/0x120
[ 5.286771] el0_svc_common.constprop.0+0x48/0xf0
[ 5.292004] do_el0_svc+0x24/0x38
[ 5.295848] el0_svc+0x4c/0x160
[ 5.299519] el0t_64_sync_handler+0xa0/0xe8
[ 5.304229] el0t_64_sync+0x198/0x1a0
[ 5.308420]
[ 5.308420] -> #0 (&genpd->mlock){+.+.}-{4:4}:
[ 5.314359] __lock_acquire+0x1338/0x1f50
[ 5.318897] lock_acquire+0x1c4/0x350
[ 5.323089] __mutex_lock+0xb8/0x7f0
[ 5.327194] mutex_lock_nested+0x2c/0x40
[ 5.331645] genpd_lock_mtx+0x20/0x38
[ 5.335834] genpd_runtime_resume+0x118/0x298
[ 5.340721] __rpm_callback+0x50/0x200
[ 5.344997] rpm_callback+0x7c/0x90
[ 5.349013] rpm_resume+0x534/0x718
[ 5.353029] __pm_runtime_resume+0x58/0xa8
[ 5.357653] clk_pm_runtime_get.part.0.isra.0+0x24/0x98
[ 5.363408] __clk_register+0x51c/0x970
[ 5.367771] devm_clk_hw_register+0x64/0xe8
[ 5.372481] imx8mp_hsio_blk_ctrl_probe+0xa0/0xf8
[ 5.377712] imx8mp_blk_ctrl_probe+0x358/0x568
[ 5.382684] platform_probe+0x64/0xa8
[ 5.386875] really_probe+0xc4/0x2b8
[ 5.390976] __driver_probe_device+0x80/0x140
[ 5.395860] driver_probe_device+0xe0/0x170
[ 5.400571] __device_attach_driver+0xc0/0x148
[ 5.405542] bus_for_each_drv+0x9c/0x108
[ 5.409991] __device_attach+0xa8/0x1a0
[ 5.414354] device_initial_probe+0x1c/0x30
[ 5.419065] bus_probe_device+0xb4/0xc0
[ 5.423428] deferred_probe_work_func+0x90/0xd8
[ 5.428485] process_one_work+0x214/0x618
[ 5.433027] worker_thread+0x1b4/0x368
[ 5.437305] kthread+0x150/0x238
[ 5.441062] ret_from_fork+0x10/0x20
[ 5.445165]
[ 5.445165] other info that might help us debug this:
[ 5.445165]
[ 5.453171] Possible unsafe locking scenario:
[ 5.453171]
[ 5.459091] CPU0 CPU1
[ 5.463622] ---- ----
[ 5.468151] lock(prepare_lock);
[ 5.471476] lock(&genpd->mlock);
[ 5.477405] lock(prepare_lock);
[ 5.483248] lock(&genpd->mlock);
[ 5.486655]
[ 5.486655] *** DEADLOCK ***
[ 5.486655]
[ 5.492577] 4 locks held by kworker/u16:4/52:
[ 5.496937] #0: ffff000000030948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x198/0x618
[ 5.507051] #1: ffff800086e3bd70 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1c0/0x618
[ 5.516299] #2: ffff000000c608f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x44/0x1a0
[ 5.524680] #3: ffff800084eb5258 (prepare_lock){+.+.}-{4:4}, at: clk_prepare_lock+0x38/0xc0
[ 5.533150]
[ 5.533150] stack backtrace:
[ 5.537512] CPU: 2 UID: 0 PID: 52 Comm: kworker/u16:4 Tainted: G T 6.17.0-rc4+ #2 PREEMPT
[ 5.547262] Tainted: [T]=RANDSTRUCT
[ 5.550752] Hardware name: ...
[ 5.557284] Workqueue: events_unbound deferred_probe_work_func
[ 5.563128] Call trace:
[ 5.565578] show_stack+0x20/0x38 (C)
[ 5.569251] dump_stack_lvl+0x8c/0xd0
[ 5.572919] dump_stack+0x18/0x28
[ 5.576239] print_circular_bug+0x28c/0x370
[ 5.580431] check_noncircular+0x178/0x190
[ 5.584537] __lock_acquire+0x1338/0x1f50
[ 5.588558] lock_acquire+0x1c4/0x350
[ 5.592231] __mutex_lock+0xb8/0x7f0
[ 5.595815] mutex_lock_nested+0x2c/0x40
[ 5.599750] genpd_lock_mtx+0x20/0x38
[ 5.603418] genpd_runtime_resume+0x118/0x298
[ 5.607786] __rpm_callback+0x50/0x200
[ 5.611543] rpm_callback+0x7c/0x90
[ 5.615041] rpm_resume+0x534/0x718
[ 5.618539] __pm_runtime_resume+0x58/0xa8
[ 5.622644] clk_pm_runtime_get.part.0.isra.0+0x24/0x98
[ 5.627876] __clk_register+0x51c/0x970
[ 5.631717] devm_clk_hw_register+0x64/0xe8
[ 5.635909] imx8mp_hsio_blk_ctrl_probe+0xa0/0xf8
[ 5.640622] imx8mp_blk_ctrl_probe+0x358/0x568
[ 5.645071] platform_probe+0x64/0xa8
[ 5.648743] really_probe+0xc4/0x2b8
[ 5.652326] __driver_probe_device+0x80/0x140
[ 5.656693] driver_probe_device+0xe0/0x170
[ 5.660886] __device_attach_driver+0xc0/0x148
[ 5.665339] bus_for_each_drv+0x9c/0x108
[ 5.669269] __device_attach+0xa8/0x1a0
[ 5.673115] device_initial_probe+0x1c/0x30
[ 5.677308] bus_probe_device+0xb4/0xc0
[ 5.681152] deferred_probe_work_func+0x90/0xd8
[ 5.685691] process_one_work+0x214/0x618
[ 5.689713] worker_thread+0x1b4/0x368
[ 5.693471] kthread+0x150/0x238
[ 5.696709] ret_from_fork+0x10/0x20
You're welcome to ask for more info, even though I'm afraid I might be
unable to provide them given how rare this event is.
Best regards,
Luca
--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
© 2016 - 2025 Red Hat, Inc.