[PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend

Youwan Wang posted 1 patch 1 year, 5 months ago
There is a newer version of this series
drivers/net/phy/phy_device.c | 9 +++++++++
1 file changed, 9 insertions(+)
[PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Youwan Wang 1 year, 5 months ago
If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
we cannot suspend the PHY. Although the WOL status has been
checked in phy_suspend(), returning -EBUSY(-16) would cause
the Power Management (PM) to fail to suspend. Since
phy_suspend() is an exported symbol (EXPORT_SYMBOL),
timely error reporting is needed. Therefore, an additional
check is performed here. If the PHY of the mido bus is enabled
with WOL, we skip calling phy_suspend() to avoid PM failure.

log:
[  322.631362] OOM killer disabled.
[  322.631364] Freezing remaining freezable tasks
[  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
[  322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
[  322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
PM: failed to suspend: error -16
[  322.669699] PM: Some devices failed to suspend, or early wake event detected
[  322.669949] OOM killer enabled.
[  322.669951] Restarting tasks ... done.
[  322.671008] random: crng reseeded on system resumption
[  322.671014] PM: suspend exit

If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
flag would cause the resume failure.

log:
[  260.814763] YT8521 Gigabit Ethernet stmmac-0:01:
PM: dpm_run_callback():mdio_bus_phy_resume+0x0/0x160 [libphy] returns -95
[  260.814782] YT8521 Gigabit Ethernet stmmac-0:01:
PM: failed to resume: error -95

Signed-off-by: Youwan Wang <youwan@nfschina.com>
---
 drivers/net/phy/phy_device.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 2ce74593d6e4..c766130e2c41 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -270,6 +270,7 @@ static DEFINE_MUTEX(phy_fixup_lock);
 
 static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 {
+	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
 	struct device_driver *drv = phydev->mdio.dev.driver;
 	struct phy_driver *phydrv = to_phy_driver(drv);
 	struct net_device *netdev = phydev->attached_dev;
@@ -277,6 +278,14 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	if (!drv || !phydrv->suspend)
 		return false;
 
+	/* If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
+	 * we cannot suspend the PHY.
+	 */
+	phy_ethtool_get_wol(phydev, &wol);
+	phydev->wol_enabled = !!(wol.wolopts);
+	if (phydev->wol_enabled && !(phydrv->flags & PHY_ALWAYS_CALL_SUSPEND))
+		return false;
+
 	/* PHY not attached? May suspend if the PHY has not already been
 	 * suspended as part of a prior call to phy_disconnect() ->
 	 * phy_detach() -> phy_suspend() because the parent netdev might be the
-- 
2.25.1
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Russell King (Oracle) 1 year, 5 months ago
On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
> If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
> we cannot suspend the PHY. Although the WOL status has been
> checked in phy_suspend(), returning -EBUSY(-16) would cause
> the Power Management (PM) to fail to suspend. Since
> phy_suspend() is an exported symbol (EXPORT_SYMBOL),
> timely error reporting is needed. Therefore, an additional
> check is performed here. If the PHY of the mido bus is enabled
> with WOL, we skip calling phy_suspend() to avoid PM failure.
> 
> log:
> [  322.631362] OOM killer disabled.
> [  322.631364] Freezing remaining freezable tasks
> [  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
> [  322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
> PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
> [  322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
> PM: failed to suspend: error -16
> [  322.669699] PM: Some devices failed to suspend, or early wake event detected
> [  322.669949] OOM killer enabled.
> [  322.669951] Restarting tasks ... done.
> [  322.671008] random: crng reseeded on system resumption
> [  322.671014] PM: suspend exit
> 
> If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
> WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
> flag would cause the resume failure.

I think the reason this is happening is because the PHY has WoL enabled
on it without the kernel/netdev driver being aware that WoL is enabled.
Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
happen, but then we find unexpectedly that WoL is enabled on the PHY.

However, whenever a user configures WoL, netdev->wol_enabled will be
set when _any_ WoL mode is enabled and cleared only if all WoL modes
are disabled.

Thus, what we have is a de-sync between the kernel state and hardware
state, leading to the suspend failing.

I don't see anything in the motorcomm driver that requires suspend
if WoL is enabled - yt8521_suspend() first checks to see whether WoL
is enabled, and exits if it is.

Andrew - how do you feel about reading the WoL state from the PHY and
setting netdev->wol_enabled if any WoL is enabled on the PHY? That
would mean that the netdev's WoL state is consistent with the PHY
whether or not the user has configured WoL.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Florian Fainelli 1 year, 5 months ago

On 6/28/2024 9:17 AM, Russell King (Oracle) wrote:
> On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
>> If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
>> we cannot suspend the PHY. Although the WOL status has been
>> checked in phy_suspend(), returning -EBUSY(-16) would cause
>> the Power Management (PM) to fail to suspend. Since
>> phy_suspend() is an exported symbol (EXPORT_SYMBOL),
>> timely error reporting is needed. Therefore, an additional
>> check is performed here. If the PHY of the mido bus is enabled
>> with WOL, we skip calling phy_suspend() to avoid PM failure.
>>
>> log:
>> [  322.631362] OOM killer disabled.
>> [  322.631364] Freezing remaining freezable tasks
>> [  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>> [  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
>> [  322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
>> PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
>> [  322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
>> PM: failed to suspend: error -16
>> [  322.669699] PM: Some devices failed to suspend, or early wake event detected
>> [  322.669949] OOM killer enabled.
>> [  322.669951] Restarting tasks ... done.
>> [  322.671008] random: crng reseeded on system resumption
>> [  322.671014] PM: suspend exit
>>
>> If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
>> WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
>> flag would cause the resume failure.

Did you mean to write that if the YT8521 PHY driver entry set the 
PHY_ALWAYS_CALL_SUSPEND flag, then it would cause an error during 
resume? If so, why is that?

> 
> I think the reason this is happening is because the PHY has WoL enabled
> on it without the kernel/netdev driver being aware that WoL is enabled.
> Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
> happen, but then we find unexpectedly that WoL is enabled on the PHY.
> 
> However, whenever a user configures WoL, netdev->wol_enabled will be
> set when _any_ WoL mode is enabled and cleared only if all WoL modes
> are disabled.
> 
> Thus, what we have is a de-sync between the kernel state and hardware
> state, leading to the suspend failing.
> 
> I don't see anything in the motorcomm driver that requires suspend
> if WoL is enabled - yt8521_suspend() first checks to see whether WoL
> is enabled, and exits if it is.
> 
> Andrew - how do you feel about reading the WoL state from the PHY and
> setting netdev->wol_enabled if any WoL is enabled on the PHY? That
> would mean that the netdev's WoL state is consistent with the PHY
> whether or not the user has configured WoL.

Would not the situation described here be solved by having the Motorcomm 
PHY driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking 
whether WoL is enabled or not and will just return then.
-- 
Florian
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Russell King (Oracle) 1 year, 5 months ago
On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
> Would not the situation described here be solved by having the Motorcomm PHY
> driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking whether WoL
> is enabled or not and will just return then.

Let's also look at PHY_ALWAYS_CALL_SUSPEND. There are currently two
drivers that make use of it - realtek and broadcom.

Looking at realtek, it is used with driver instances that call
	rtl821x_suspend
	rtl821x_resume

rtl821x_suspend() does nothing if phydev->wol_enabled is true.
rtl821x_resume() only re-enabled the clock if phydev->wol_enabled
was false (in other words, rtl821x has disabled the clock.) However,
it always calls genphy_resume() - presumably this is the reason for
the flag.

The realtek driver instances do not populate .set_wol nor .get_wol,
so the PHY itself does not support WoL configuration. This means
that the phy_ethtool_get_wol() call in phy_suspend() will also fail,
and since wol.wolopts is initialised to zero, phydev->wol_enabled
will only be true if netdev->wol_enabled is true.

Thus, for phydev->wol_enabled to be true, netdev->wol_enabled must
be true, and we won't get here via mdio_bus_phy_suspend() as 
mdio_bus_phy_may_suspend() will return false in this case.


Looking at broadcom, it's used with only one driver instance for
BCM54210E which calls:
	bcm54xx_suspend
	bcm54xx_resume

Other driver instances also call these two functions but do not
set this flag - BCM54612E, BCM54810, BCM54811, BCM50610, and
BCM50610M. Moreover, none of these implement the .get_wol and
.set_wol methods which means the behaviour is as I describe for
Realtek above that also doesn't implement these methods.

The only case where this is different is BCM54210E which does
populate these methods.

bcm54xx_suspend() stops ptp, and if WoL is enabled, configures the
wakeup IRQ. bcm54xx_resume() deconfigures the wakeup IRQ.

This could lead us into a case where the PHY has WoL enabled, the
phy_ethtool_get_wol() call returns that, causing phydev->wol_enabled
to be set, and netdev->wol_enabled is not set.

However, in this case, it would not be a problem because the driver
has set PHY_ALWAYS_CALL_SUSPEND, so we won't return -EBUSY.


Now, looking again at motorcomm, yt8521_resume() disables auto-sleep
before checking whether WoL is enabled. So, the driver is probably
missing PHY_ALWAYS_CALL_SUSPEND if auto-sleep needs to be disabled
each and every resume whether WoL is enabled or not.

However, if we look at yt8521_config_init(), this will also disable
auto sleep. This will be called from phy_init_hw(), and in the
mdio_bus_phy_resume() path, immediately before phy_resume(). Likewise
when we attach the PHY.

Then we have some net drivers that call phy_resume() directly...

drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
	(we already have a workaround merged for
	PHY-not-providing-clock)

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
	A suspend/resume cycle of the PHY is done when entering loopback mode.

drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
	No idea on this one - it resumes the PHY before enabling
	loopback mode, and enters suspend when disabling loopback
	mode!

drivers/net/ethernet/broadcom/genet/bcmgenet.c
	bcmgenet_resume() calls phy_init_hw() before phy_resume().

drivers/net/ethernet/broadcom/bcmsysport.c
	bcm_sysport_resume() *doesn't* appear to call phy_init_hw()
	before phy_resume(), so I wonder whether the config is
	properly restored on resume?

drivers/net/ethernet/realtek/r8169_main.c
	rtl8169_up() calls phy_init_hw() before phy_resume().

drivers/net/usb/ax88172a.c
	This doesn't actually call phy_resume(), but calls
	genphy_resume() directly from ax88172a_reset() immediately
	after phy_connect(). However, connecting to a PHY will
	call phy_resume()... confused here.

So I'm left wondering whether yt8521_resume() should be fiddling with
this auto-sleep mode, especially as yt8521_config_init() will do that
if the appropriate DT property is set... and whether this should be
done unconditionally.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Florian Fainelli 1 year, 5 months ago

On 6/28/2024 10:24 AM, Russell King (Oracle) wrote:
> On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
>> Would not the situation described here be solved by having the Motorcomm PHY
>> driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking whether WoL
>> is enabled or not and will just return then.
> 
> Let's also look at PHY_ALWAYS_CALL_SUSPEND. There are currently two
> drivers that make use of it - realtek and broadcom.
> 
> Looking at realtek, it is used with driver instances that call
> 	rtl821x_suspend
> 	rtl821x_resume
> 
> rtl821x_suspend() does nothing if phydev->wol_enabled is true.
> rtl821x_resume() only re-enabled the clock if phydev->wol_enabled
> was false (in other words, rtl821x has disabled the clock.) However,
> it always calls genphy_resume() - presumably this is the reason for
> the flag.
> 
> The realtek driver instances do not populate .set_wol nor .get_wol,
> so the PHY itself does not support WoL configuration. This means
> that the phy_ethtool_get_wol() call in phy_suspend() will also fail,
> and since wol.wolopts is initialised to zero, phydev->wol_enabled
> will only be true if netdev->wol_enabled is true.
> 
> Thus, for phydev->wol_enabled to be true, netdev->wol_enabled must
> be true, and we won't get here via mdio_bus_phy_suspend() as
> mdio_bus_phy_may_suspend() will return false in this case.
> 
> 
> Looking at broadcom, it's used with only one driver instance for
> BCM54210E which calls:
> 	bcm54xx_suspend
> 	bcm54xx_resume
> 
> Other driver instances also call these two functions but do not
> set this flag - BCM54612E, BCM54810, BCM54811, BCM50610, and
> BCM50610M. Moreover, none of these implement the .get_wol and
> .set_wol methods which means the behaviour is as I describe for
> Realtek above that also doesn't implement these methods.
> 
> The only case where this is different is BCM54210E which does
> populate these methods.
> 
> bcm54xx_suspend() stops ptp, and if WoL is enabled, configures the
> wakeup IRQ. bcm54xx_resume() deconfigures the wakeup IRQ.
> 
> This could lead us into a case where the PHY has WoL enabled, the
> phy_ethtool_get_wol() call returns that, causing phydev->wol_enabled
> to be set, and netdev->wol_enabled is not set.
> 
> However, in this case, it would not be a problem because the driver
> has set PHY_ALWAYS_CALL_SUSPEND, so we won't return -EBUSY.
> 
> 
> Now, looking again at motorcomm, yt8521_resume() disables auto-sleep
> before checking whether WoL is enabled. So, the driver is probably
> missing PHY_ALWAYS_CALL_SUSPEND if auto-sleep needs to be disabled
> each and every resume whether WoL is enabled or not.
> 
> However, if we look at yt8521_config_init(), this will also disable
> auto sleep. This will be called from phy_init_hw(), and in the
> mdio_bus_phy_resume() path, immediately before phy_resume(). Likewise
> when we attach the PHY.
> 
> Then we have some net drivers that call phy_resume() directly...
> 
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
> 	(we already have a workaround merged for
> 	PHY-not-providing-clock)
> 
> drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
> 	A suspend/resume cycle of the PHY is done when entering loopback mode.
> 
> drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
> 	No idea on this one - it resumes the PHY before enabling
> 	loopback mode, and enters suspend when disabling loopback
> 	mode!
> 
> drivers/net/ethernet/broadcom/genet/bcmgenet.c
> 	bcmgenet_resume() calls phy_init_hw() before phy_resume().

Yes, there was a reason for that, that had to do with a finicky PHY 
IIRC, should be documented properly in the commit message since this 
came from my colleague Doug.

> 
> drivers/net/ethernet/broadcom/bcmsysport.c
> 	bcm_sysport_resume() *doesn't* appear to call phy_init_hw()
> 	before phy_resume(), so I wonder whether the config is
> 	properly restored on resume?

This driver is using the fixed PHY emulation so it does not really 
matter, it also sets mac_managed_pm to true.

> 
> drivers/net/ethernet/realtek/r8169_main.c
> 	rtl8169_up() calls phy_init_hw() before phy_resume().
> 
> drivers/net/usb/ax88172a.c
> 	This doesn't actually call phy_resume(), but calls
> 	genphy_resume() directly from ax88172a_reset() immediately
> 	after phy_connect(). However, connecting to a PHY will
> 	call phy_resume()... confused here.
> 
> So I'm left wondering whether yt8521_resume() should be fiddling with
> this auto-sleep mode, especially as yt8521_config_init() will do that
> if the appropriate DT property is set... and whether this should be
> done unconditionally.
> 

-- 
Florian
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Russell King (Oracle) 1 year, 5 months ago
On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
> 
> 
> On 6/28/2024 9:17 AM, Russell King (Oracle) wrote:
> > On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
> > > If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
> > > we cannot suspend the PHY. Although the WOL status has been
> > > checked in phy_suspend(), returning -EBUSY(-16) would cause
> > > the Power Management (PM) to fail to suspend. Since
> > > phy_suspend() is an exported symbol (EXPORT_SYMBOL),
> > > timely error reporting is needed. Therefore, an additional
> > > check is performed here. If the PHY of the mido bus is enabled
> > > with WOL, we skip calling phy_suspend() to avoid PM failure.
> > > 
> > > log:
> > > [  322.631362] OOM killer disabled.
> > > [  322.631364] Freezing remaining freezable tasks
> > > [  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> > > [  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
> > > [  322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
> > > PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
> > > [  322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
> > > PM: failed to suspend: error -16
> > > [  322.669699] PM: Some devices failed to suspend, or early wake event detected
> > > [  322.669949] OOM killer enabled.
> > > [  322.669951] Restarting tasks ... done.
> > > [  322.671008] random: crng reseeded on system resumption
> > > [  322.671014] PM: suspend exit
> > > 
> > > If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
> > > WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
> > > flag would cause the resume failure.
> 
> Did you mean to write that if the YT8521 PHY driver entry set the
> PHY_ALWAYS_CALL_SUSPEND flag, then it would cause an error during resume? If
> so, why is that?

It doesn't appear to do that - at least not in net-next, and not in
mainline.

> > I think the reason this is happening is because the PHY has WoL enabled
> > on it without the kernel/netdev driver being aware that WoL is enabled.
> > Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
> > happen, but then we find unexpectedly that WoL is enabled on the PHY.
> > 
> > However, whenever a user configures WoL, netdev->wol_enabled will be
> > set when _any_ WoL mode is enabled and cleared only if all WoL modes
> > are disabled.
> > 
> > Thus, what we have is a de-sync between the kernel state and hardware
> > state, leading to the suspend failing.
> > 
> > I don't see anything in the motorcomm driver that requires suspend
> > if WoL is enabled - yt8521_suspend() first checks to see whether WoL
> > is enabled, and exits if it is.
> > 
> > Andrew - how do you feel about reading the WoL state from the PHY and
> > setting netdev->wol_enabled if any WoL is enabled on the PHY? That
> > would mean that the netdev's WoL state is consistent with the PHY
> > whether or not the user has configured WoL.
> 
> Would not the situation described here be solved by having the Motorcomm PHY
> driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking whether WoL
> is enabled or not and will just return then.

Is there a reason that netdev->wol_enabled shouldn't reflect the
hardware configuration?

If netdev->wol_enabled is appropriately set, then it seems to me
that there's little reason for motorcomm to be checking whether
WoL is enabled in its suspend function - which means less driver
specific code and driver specific behaviour.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to suspend
Posted by Florian Fainelli 1 year, 5 months ago

On 6/28/2024 9:38 AM, Russell King (Oracle) wrote:
> On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
>>
>>
>> On 6/28/2024 9:17 AM, Russell King (Oracle) wrote:
>>> On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
>>>> If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
>>>> we cannot suspend the PHY. Although the WOL status has been
>>>> checked in phy_suspend(), returning -EBUSY(-16) would cause
>>>> the Power Management (PM) to fail to suspend. Since
>>>> phy_suspend() is an exported symbol (EXPORT_SYMBOL),
>>>> timely error reporting is needed. Therefore, an additional
>>>> check is performed here. If the PHY of the mido bus is enabled
>>>> with WOL, we skip calling phy_suspend() to avoid PM failure.
>>>>
>>>> log:
>>>> [  322.631362] OOM killer disabled.
>>>> [  322.631364] Freezing remaining freezable tasks
>>>> [  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>>>> [  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
>>>> [  322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
>>>> PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
>>>> [  322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
>>>> PM: failed to suspend: error -16
>>>> [  322.669699] PM: Some devices failed to suspend, or early wake event detected
>>>> [  322.669949] OOM killer enabled.
>>>> [  322.669951] Restarting tasks ... done.
>>>> [  322.671008] random: crng reseeded on system resumption
>>>> [  322.671014] PM: suspend exit
>>>>
>>>> If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
>>>> WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
>>>> flag would cause the resume failure.
>>
>> Did you mean to write that if the YT8521 PHY driver entry set the
>> PHY_ALWAYS_CALL_SUSPEND flag, then it would cause an error during resume? If
>> so, why is that?
> 
> It doesn't appear to do that - at least not in net-next, and not in
> mainline.
> 
>>> I think the reason this is happening is because the PHY has WoL enabled
>>> on it without the kernel/netdev driver being aware that WoL is enabled.
>>> Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
>>> happen, but then we find unexpectedly that WoL is enabled on the PHY.
>>>
>>> However, whenever a user configures WoL, netdev->wol_enabled will be
>>> set when _any_ WoL mode is enabled and cleared only if all WoL modes
>>> are disabled.
>>>
>>> Thus, what we have is a de-sync between the kernel state and hardware
>>> state, leading to the suspend failing.
>>>
>>> I don't see anything in the motorcomm driver that requires suspend
>>> if WoL is enabled - yt8521_suspend() first checks to see whether WoL
>>> is enabled, and exits if it is.
>>>
>>> Andrew - how do you feel about reading the WoL state from the PHY and
>>> setting netdev->wol_enabled if any WoL is enabled on the PHY? That
>>> would mean that the netdev's WoL state is consistent with the PHY
>>> whether or not the user has configured WoL.
>>
>> Would not the situation described here be solved by having the Motorcomm PHY
>> driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking whether WoL
>> is enabled or not and will just return then.
> 
> Is there a reason that netdev->wol_enabled shouldn't reflect the
> hardware configuration?

Unless there is some sort of Ethernet MAC driver bug that we are not 
being made aware of, the only thing that I can of is happening here, is 
that a SW/FW agent other than Linux would have enabled the PHY for 
Wake-on-LAN, and there was no configuration done by the user via the 
Ethernet MAC driver that attempted to enable Wake-on-LAN. In that case 
only can I think of a disconnect between the HW and SW states?

> 
> If netdev->wol_enabled is appropriately set, then it seems to me
> that there's little reason for motorcomm to be checking whether
> WoL is enabled in its suspend function - which means less driver
> specific code and driver specific behaviour.
> 

That should work, too.
-- 
Florian