drivers/net/phy/phy_device.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
If a PHY has no driver, the genphy driver is probed/removed directly in
phy_attach/detach. If the PHY's ofnode has an "leds" subnode, then the
LEDs will be (un)registered when probing/removing the genphy driver.
This could occur if the leds are for a non-generic driver that isn't
loaded for whatever reason. Synchronously removing the PHY device in
phy_detach leads to the following deadlock:
rtnl_lock()
ndo_close()
...
phy_detach()
phy_remove()
phy_leds_unregister()
led_classdev_unregister()
led_trigger_set()
netdev_trigger_deactivate()
unregister_netdevice_notifier()
rtnl_lock()
There is a corresponding deadlock on the open/register side of things
(and that one is reported by lockdep), but it requires a race while this
one is deterministic.
Generic PHYs do not support LEDs anyway, so don't bother registering
them.
Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---
drivers/net/phy/phy_device.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 73f9cb2e2844..f76ee8489504 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3416,7 +3416,8 @@ static int phy_probe(struct device *dev)
/* Get the LEDs from the device tree, and instantiate standard
* LEDs for them.
*/
- if (IS_ENABLED(CONFIG_PHYLIB_LEDS))
+ if (IS_ENABLED(CONFIG_PHYLIB_LEDS) && !phy_driver_is_genphy(phydev) &&
+ !phy_driver_is_genphy_10g(phydev))
err = of_phy_leds(phydev);
out:
@@ -3433,7 +3434,8 @@ static int phy_remove(struct device *dev)
cancel_delayed_work_sync(&phydev->state_queue);
- if (IS_ENABLED(CONFIG_PHYLIB_LEDS))
+ if (IS_ENABLED(CONFIG_PHYLIB_LEDS) && !phy_driver_is_genphy(phydev) &&
+ !phy_driver_is_genphy_10g(phydev))
phy_leds_unregister(phydev);
phydev->state = PHY_DOWN;
--
2.35.1.1320.gc452695387.dirty
On Mon, Jul 07, 2025 at 03:58:03PM -0400, Sean Anderson wrote: > If a PHY has no driver, the genphy driver is probed/removed directly in > phy_attach/detach. If the PHY's ofnode has an "leds" subnode, then the > LEDs will be (un)registered when probing/removing the genphy driver. Maybe checking whether the PHY driver supports LEDs would be more sensible than checking whether it's one of the genphy drivers? > This could occur if the leds are for a non-generic driver that isn't > loaded for whatever reason. Synchronously removing the PHY device in > phy_detach leads to the following deadlock: > > rtnl_lock() > ndo_close() > ... > phy_detach() > phy_remove() > phy_leds_unregister() > led_classdev_unregister() > led_trigger_set() > netdev_trigger_deactivate() > unregister_netdevice_notifier() > rtnl_lock() > > There is a corresponding deadlock on the open/register side of things > (and that one is reported by lockdep), but it requires a race while this > one is deterministic. Doesn't this deadlock exist irrespective of whether the genphy driver(s) are being used, and whether or not the PHY driver supports LEDs? -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On 7/8/25 04:58, Russell King (Oracle) wrote: > On Mon, Jul 07, 2025 at 03:58:03PM -0400, Sean Anderson wrote: >> If a PHY has no driver, the genphy driver is probed/removed directly in >> phy_attach/detach. If the PHY's ofnode has an "leds" subnode, then the >> LEDs will be (un)registered when probing/removing the genphy driver. > > Maybe checking whether the PHY driver supports LEDs would be more > sensible than checking whether it's one of the genphy drivers? The genphy driver is special, since it is probed synchronously from phy_attach. All other drivers are probed asynchronously and don't have this problem. >> This could occur if the leds are for a non-generic driver that isn't >> loaded for whatever reason. Synchronously removing the PHY device in >> phy_detach leads to the following deadlock: >> >> rtnl_lock() >> ndo_close() >> ... >> phy_detach() >> phy_remove() >> phy_leds_unregister() >> led_classdev_unregister() >> led_trigger_set() >> netdev_trigger_deactivate() >> unregister_netdevice_notifier() >> rtnl_lock() >> >> There is a corresponding deadlock on the open/register side of things >> (and that one is reported by lockdep), but it requires a race while this >> one is deterministic. > > Doesn't this deadlock exist irrespective of whether the genphy driver(s) > are being used, and whether or not the PHY driver supports LEDs? Nope. --Sean
On Mon, 7 Jul 2025 15:58:03 -0400 Sean Anderson wrote:
> - if (IS_ENABLED(CONFIG_PHYLIB_LEDS))
> + if (IS_ENABLED(CONFIG_PHYLIB_LEDS) && !phy_driver_is_genphy(phydev) &&
> + !phy_driver_is_genphy_10g(phydev))
Breaks build for smaller configs:
drivers/net/phy/phy_device.c: In function ‘phy_probe’:
drivers/net/phy/phy_device.c:3506:14: error: implicit declaration of function ‘phy_driver_is_genphy_10g’; did you mean ‘phy_driver_is_genphy’? [-Werror=implicit-function-declaration]
3506 | !phy_driver_is_genphy_10g(phydev))
| ^~~~~~~~~~~~~~~~~~~~~~~~
| phy_driver_is_genphy
--
pw-bot: cr
On 7/7/25 19:32, Jakub Kicinski wrote: > On Mon, 7 Jul 2025 15:58:03 -0400 Sean Anderson wrote: >> - if (IS_ENABLED(CONFIG_PHYLIB_LEDS)) >> + if (IS_ENABLED(CONFIG_PHYLIB_LEDS) && !phy_driver_is_genphy(phydev) && >> + !phy_driver_is_genphy_10g(phydev)) > > Breaks build for smaller configs: > > drivers/net/phy/phy_device.c: In function ‘phy_probe’: > drivers/net/phy/phy_device.c:3506:14: error: implicit declaration of function ‘phy_driver_is_genphy_10g’; did you mean ‘phy_driver_is_genphy’? [-Werror=implicit-function-declaration] > 3506 | !phy_driver_is_genphy_10g(phydev)) > | ^~~~~~~~~~~~~~~~~~~~~~~~ > | phy_driver_is_genphy This is due to https://github.com/linux-netdev/testing/commit/42ed7f7e94da01391d3519ffb5747698d2be0a67 which is not in net/main yet. --Sean
Hi Jakub, On 7/8/25 11:52, Sean Anderson wrote: > On 7/7/25 19:32, Jakub Kicinski wrote: >> On Mon, 7 Jul 2025 15:58:03 -0400 Sean Anderson wrote: >>> - if (IS_ENABLED(CONFIG_PHYLIB_LEDS)) >>> + if (IS_ENABLED(CONFIG_PHYLIB_LEDS) && !phy_driver_is_genphy(phydev) && >>> + !phy_driver_is_genphy_10g(phydev)) >> >> Breaks build for smaller configs: >> >> drivers/net/phy/phy_device.c: In function ‘phy_probe’: >> drivers/net/phy/phy_device.c:3506:14: error: implicit declaration of function ‘phy_driver_is_genphy_10g’; did you mean ‘phy_driver_is_genphy’? [-Werror=implicit-function-declaration] >> 3506 | !phy_driver_is_genphy_10g(phydev)) >> | ^~~~~~~~~~~~~~~~~~~~~~~~ >> | phy_driver_is_genphy > > This is due to > https://github.com/linux-netdev/testing/commit/42ed7f7e94da01391d3519ffb5747698d2be0a67 > which is not in net/main yet. > > --Sean I see this is marked "Changes Requested" in patchwork. However, I don't believe that I need to change anything until the above commit is merged into net/main. Will you be merging that commit? Or should I just resend without changes? --Sean
On Thu, 10 Jul 2025 13:40:33 -0400 Sean Anderson wrote: > I see this is marked "Changes Requested" in patchwork. However, I don't > believe that I need to change anything until the above commit is merged > into net/main. Will you be merging that commit? Or should I just resend > without changes? The patch must build when posted. If it didn't you need to repost.
On 7/10/25 13:52, Jakub Kicinski wrote: > On Thu, 10 Jul 2025 13:40:33 -0400 Sean Anderson wrote: >> I see this is marked "Changes Requested" in patchwork. However, I don't >> believe that I need to change anything until the above commit is merged >> into net/main. Will you be merging that commit? Or should I just resend >> without changes? > > The patch must build when posted. If it didn't you need to repost. It builds on net/main. Which is what I posted for. The CI applied it to net-next/main. --Sean
On Thu, 10 Jul 2025 14:17:18 -0400 Sean Anderson wrote: > On 7/10/25 13:52, Jakub Kicinski wrote: > > On Thu, 10 Jul 2025 13:40:33 -0400 Sean Anderson wrote: > >> I see this is marked "Changes Requested" in patchwork. However, I don't > >> believe that I need to change anything until the above commit is merged > >> into net/main. Will you be merging that commit? Or should I just resend > >> without changes? > > > > The patch must build when posted. If it didn't you need to repost. > > It builds on net/main. Which is what I posted for. The CI applied it to net-next/main. Damn, I see your point now, sorry :/ So in net-next we'll have to drop the phy_driver_is_genphy_10g() ? I think it may be best if we turn this into an explicit merge conflict, IOW if you could post both net and net-next version I will merge them at the same time. Upstream trees like CI or linux-next will have easier time handling a git conflict than a build failure. Does that make sense? For the net-next version describe it from the perspective of the net patch already being merged and you're writing the "-next" version of the patch. I'll edit in the git hash of the net commit when applying.
On 7/10/25 14:49, Jakub Kicinski wrote:
> On Thu, 10 Jul 2025 14:17:18 -0400 Sean Anderson wrote:
>> On 7/10/25 13:52, Jakub Kicinski wrote:
>> > On Thu, 10 Jul 2025 13:40:33 -0400 Sean Anderson wrote:
>> >> I see this is marked "Changes Requested" in patchwork. However, I don't
>> >> believe that I need to change anything until the above commit is merged
>> >> into net/main. Will you be merging that commit? Or should I just resend
>> >> without changes?
>> >
>> > The patch must build when posted. If it didn't you need to repost.
>>
>> It builds on net/main. Which is what I posted for. The CI applied it to net-next/main.
>
> Damn, I see your point now, sorry :/
> So in net-next we'll have to drop the phy_driver_is_genphy_10g() ?
Yes. I believe phy_driver_is_genphy() is sufficient in net-next.
> I think it may be best if we turn this into an explicit merge
> conflict, IOW if you could post both net and net-next version
> I will merge them at the same time. Upstream trees like CI or
> linux-next will have easier time handling a git conflict than
> a build failure. Does that make sense? For the net-next version
> describe it from the perspective of the net patch already being
> merged and you're writing the "-next" version of the patch.
> I'll edit in the git hash of the net commit when applying.
OK, so if A is this patch and B is the conflict above, you'd like me to
post an A' like:
B---merge---A' net-next
/ /
base---A net
? Or did you have something in mind more like
B---A' net-next
/
base---A net
--Sean
On Thu, 10 Jul 2025 14:57:48 -0400 Sean Anderson wrote:
> OK, so if A is this patch and B is the conflict above, you'd like me to
> post an A' like:
>
> B---merge---A' net-next
> / /
> base---A net
>
> ? Or did you have something in mind more like
>
> B---A' net-next
> /
> base---A net
The latter, so we will end up with:
B---A' net-next --M
/ /
base---A net -----
where M is a merge with a conflict, to resolve will basically pick
the net-next version.
© 2016 - 2026 Red Hat, Inc.