drivers/phy/phy-core.c | 5 ++++- include/linux/phy/phy.h | 2 ++ 2 files changed, 6 insertions(+), 1 deletion(-)
If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
repeaters are represented as PHYs), then it would trigger the following
lockdep splat because all PHYs use a single static lockdep key and thus
lockdep can not identify whether there is a dependency or not and
reports a false positive.
Make PHY subsystem use dynamic lockdep keys, assigning each driver a
separate key. This way lockdep can correctly identify dependency graph
between mutexes.
============================================
WARNING: possible recursive locking detected
6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 Not tainted
--------------------------------------------
kworker/u51:0/78 is trying to acquire lock:
ffff0008116554f0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
but task is already holding lock:
ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&phy->mutex);
lock(&phy->mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u51:0/78:
#0: ffff000800010948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x18c/0x5ec
#1: ffff80008036bdb0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x5ec
#2: ffff0008094ac8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x188
#3: ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
stack backtrace:
CPU: 0 UID: 0 PID: 78 Comm: kworker/u51:0 Not tainted 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 PREEMPT
Hardware name: Qualcomm CRD, BIOS 6.0.240904.BOOT.MXF.2.4-00528.1-HAMOA-1 09/ 4/2024
Workqueue: events_unbound deferred_probe_work_func
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x24
print_deadlock_bug+0x258/0x348
__lock_acquire+0x10fc/0x1f84
lock_acquire+0x1c8/0x338
__mutex_lock+0xb8/0x59c
mutex_lock_nested+0x24/0x30
phy_init+0x4c/0x12c
snps_eusb2_hsphy_init+0x54/0x1a0
phy_init+0xe0/0x12c
dwc3_core_init+0x450/0x10b4
dwc3_core_probe+0xce4/0x15fc
dwc3_probe+0x64/0xb0
platform_probe+0x68/0xc4
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x3c/0x160
__device_attach_driver+0xb8/0x138
bus_for_each_drv+0x84/0xe0
__device_attach+0x9c/0x188
device_initial_probe+0x14/0x20
bus_probe_device+0xac/0xb0
deferred_probe_work_func+0x8c/0xc8
process_one_work+0x208/0x5ec
worker_thread+0x1c0/0x368
kthread+0x14c/0x20c
ret_from_fork+0x10/0x20
Fixes: 3584f6392f09 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
---
Note: I've used a Fixes tag pointing to the commit which actually
started using nested PHYs. If you think that it's incorrect, I'm fine
with dropping it.
Note2: I've tried using mutex_lock_nested, however that didn't play
well. We can not store nest level in the struct phy (as it can be used
by different drivers), so using mutex_lock_nested() would require us to
change and wrap all PHY APIs which take a lock internally. Using dynamic
lockdep keys looks like a more ellegant solution, especially granted
that there is no extra impact if lockdep is disabled.
---
Changes in v2:
- Fix lamsm ML address
- Link to v1: https://lore.kernel.org/r/20250529-phy-subinit-v1-1-b74cadf366b8@oss.qualcomm.com
---
drivers/phy/phy-core.c | 5 ++++-
include/linux/phy/phy.h | 2 ++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
index 8e2daea81666bf8a76d9c936c1a16d6318105c91..04a5a34e7a950ae94fae915673c25d476fc071c1 100644
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -994,7 +994,8 @@ struct phy *phy_create(struct device *dev, struct device_node *node,
}
device_initialize(&phy->dev);
- mutex_init(&phy->mutex);
+ lockdep_register_key(&phy->lockdep_key);
+ mutex_init_with_key(&phy->mutex, &phy->lockdep_key);
phy->dev.class = &phy_class;
phy->dev.parent = dev;
@@ -1259,6 +1260,8 @@ static void phy_release(struct device *dev)
dev_vdbg(dev, "releasing '%s'\n", dev_name(dev));
debugfs_remove_recursive(phy->debugfs);
regulator_put(phy->pwr);
+ mutex_destroy(&phy->mutex);
+ lockdep_unregister_key(&phy->lockdep_key);
ida_free(&phy_ida, phy->id);
kfree(phy);
}
diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index 437769e061b7030105c9ea4e9b0da9d32b6fa158..13add0c2c40721fe9ca3f0350d13c035cd25af45 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -154,6 +154,7 @@ struct phy_attrs {
* @id: id of the phy device
* @ops: function pointers for performing phy operations
* @mutex: mutex to protect phy_ops
+ * @lockdep_key: lockdep information for this mutex
* @init_count: used to protect when the PHY is used by multiple consumers
* @power_count: used to protect when the PHY is used by multiple consumers
* @attrs: used to specify PHY specific attributes
@@ -165,6 +166,7 @@ struct phy {
int id;
const struct phy_ops *ops;
struct mutex mutex;
+ struct lock_class_key lockdep_key;
int init_count;
int power_count;
struct phy_attrs attrs;
---
base-commit: 460178e842c7a1e48a06df684c66eb5fd630bcf7
change-id: 20250528-phy-subinit-42c988a12b6c
Best regards,
--
Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
On Fri, May 30, 2025 at 07:08:28PM +0300, Dmitry Baryshkov wrote:
> If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
> repeaters are represented as PHYs), then it would trigger the following
> lockdep splat because all PHYs use a single static lockdep key and thus
> lockdep can not identify whether there is a dependency or not and
> reports a false positive.
>
> Make PHY subsystem use dynamic lockdep keys, assigning each driver a
> separate key. This way lockdep can correctly identify dependency graph
> between mutexes.
>
> ============================================
> WARNING: possible recursive locking detected
> 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 Not tainted
> --------------------------------------------
> kworker/u51:0/78 is trying to acquire lock:
> ffff0008116554f0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> but task is already holding lock:
> ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&phy->mutex);
> lock(&phy->mutex);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u51:0/78:
> #0: ffff000800010948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x18c/0x5ec
> #1: ffff80008036bdb0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x5ec
> #2: ffff0008094ac8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x188
> #3: ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 78 Comm: kworker/u51:0 Not tainted 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 PREEMPT
> Hardware name: Qualcomm CRD, BIOS 6.0.240904.BOOT.MXF.2.4-00528.1-HAMOA-1 09/ 4/2024
> Workqueue: events_unbound deferred_probe_work_func
> Call trace:
> show_stack+0x18/0x24 (C)
> dump_stack_lvl+0x90/0xd0
> dump_stack+0x18/0x24
> print_deadlock_bug+0x258/0x348
> __lock_acquire+0x10fc/0x1f84
> lock_acquire+0x1c8/0x338
> __mutex_lock+0xb8/0x59c
> mutex_lock_nested+0x24/0x30
> phy_init+0x4c/0x12c
> snps_eusb2_hsphy_init+0x54/0x1a0
> phy_init+0xe0/0x12c
> dwc3_core_init+0x450/0x10b4
> dwc3_core_probe+0xce4/0x15fc
> dwc3_probe+0x64/0xb0
> platform_probe+0x68/0xc4
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x138
> bus_for_each_drv+0x84/0xe0
> __device_attach+0x9c/0x188
> device_initial_probe+0x14/0x20
> bus_probe_device+0xac/0xb0
> deferred_probe_work_func+0x8c/0xc8
> process_one_work+0x208/0x5ec
> worker_thread+0x1c0/0x368
> kthread+0x14c/0x20c
> ret_from_fork+0x10/0x20
Nit: This last bit of the stack trace adds little value and can be
dropped.
> Fixes: 3584f6392f09 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> ---
> Note: I've used a Fixes tag pointing to the commit which actually
> started using nested PHYs. If you think that it's incorrect, I'm fine
> with dropping it.
I think it's warranted. And if there were further users before this one
as Neil suggested you could just list them all as each has been
introducing a new splat.
> Note2: I've tried using mutex_lock_nested, however that didn't play
> well. We can not store nest level in the struct phy (as it can be used
> by different drivers), so using mutex_lock_nested() would require us to
> change and wrap all PHY APIs which take a lock internally. Using dynamic
> lockdep keys looks like a more ellegant solution, especially granted
> that there is no extra impact if lockdep is disabled.
Thanks for fixing this. I've been using a local hack based on
mutex_lock_nested() too but dynamic keys looks like the right way to go.
Perhaps you can add:
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/lkml/ZnpoAVGJMG4Zu-Jw@hovoldconsulting.com/
Works fine on the T14s:
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Johan
On 30/05/2025 18:08, Dmitry Baryshkov wrote:
> If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
> repeaters are represented as PHYs), then it would trigger the following
> lockdep splat because all PHYs use a single static lockdep key and thus
> lockdep can not identify whether there is a dependency or not and
> reports a false positive.
>
> Make PHY subsystem use dynamic lockdep keys, assigning each driver a
> separate key. This way lockdep can correctly identify dependency graph
> between mutexes.
>
> ============================================
> WARNING: possible recursive locking detected
> 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 Not tainted
> --------------------------------------------
> kworker/u51:0/78 is trying to acquire lock:
> ffff0008116554f0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> but task is already holding lock:
> ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&phy->mutex);
> lock(&phy->mutex);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u51:0/78:
> #0: ffff000800010948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x18c/0x5ec
> #1: ffff80008036bdb0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x5ec
> #2: ffff0008094ac8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x188
> #3: ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 78 Comm: kworker/u51:0 Not tainted 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 PREEMPT
> Hardware name: Qualcomm CRD, BIOS 6.0.240904.BOOT.MXF.2.4-00528.1-HAMOA-1 09/ 4/2024
> Workqueue: events_unbound deferred_probe_work_func
> Call trace:
> show_stack+0x18/0x24 (C)
> dump_stack_lvl+0x90/0xd0
> dump_stack+0x18/0x24
> print_deadlock_bug+0x258/0x348
> __lock_acquire+0x10fc/0x1f84
> lock_acquire+0x1c8/0x338
> __mutex_lock+0xb8/0x59c
> mutex_lock_nested+0x24/0x30
> phy_init+0x4c/0x12c
> snps_eusb2_hsphy_init+0x54/0x1a0
> phy_init+0xe0/0x12c
> dwc3_core_init+0x450/0x10b4
> dwc3_core_probe+0xce4/0x15fc
> dwc3_probe+0x64/0xb0
> platform_probe+0x68/0xc4
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x138
> bus_for_each_drv+0x84/0xe0
> __device_attach+0x9c/0x188
> device_initial_probe+0x14/0x20
> bus_probe_device+0xac/0xb0
> deferred_probe_work_func+0x8c/0xc8
> process_one_work+0x208/0x5ec
> worker_thread+0x1c0/0x368
> kthread+0x14c/0x20c
> ret_from_fork+0x10/0x20
>
> Fixes: 3584f6392f09 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> ---
Commented the wrong version:
In fact, other drivers were using sub-phys before:
./amlogic/phy-meson-axg-pcie.c
./amlogic/phy-meson-axg-mipi-dphy.c
(yeah I know, my bad !)
so it should be:
Fixes: e2463559ff1d ("phy: amlogic: Add Amlogic AXG PCIE PHY Driver")
With the Fixes changed:
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
On 25-05-30 19:08:28, Dmitry Baryshkov wrote:
> If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
> repeaters are represented as PHYs), then it would trigger the following
> lockdep splat because all PHYs use a single static lockdep key and thus
> lockdep can not identify whether there is a dependency or not and
> reports a false positive.
>
> Make PHY subsystem use dynamic lockdep keys, assigning each driver a
> separate key. This way lockdep can correctly identify dependency graph
> between mutexes.
>
[...]
>
> Fixes: 3584f6392f09 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
I'm OK with this as a temporary workaround, at least until we figure out
a way to have chained PHYs in the generic framework. I think long-term,
the PHY framework should be the one to call the ops of the child PHY on
behalf of the parent.
For now, this LGTM:
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
© 2016 - 2025 Red Hat, Inc.