[PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset

Dimitri Fedrau posted 2 patches 10 months ago
There is a newer version of this series
[PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
Posted by Dimitri Fedrau 10 months ago
If the PHYs reset is asserted it returns 0xffff for any read operation.
This might happen if the user admins down the interface and wants to read
the temperature. Prevent reading the temperature in this case and return
with an network is down error. Write operations are ignored by the device
when reset is asserted, still return a network is down error in this
case to make the user aware of the operation gone wrong.

Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
---
 drivers/net/phy/marvell-88q2xxx.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/phy/marvell-88q2xxx.c b/drivers/net/phy/marvell-88q2xxx.c
index 342a909a12a2785ad579656eb369c69acaace9d1..ea9a2a923146bf432a33ff46b606c08debb69a4f 100644
--- a/drivers/net/phy/marvell-88q2xxx.c
+++ b/drivers/net/phy/marvell-88q2xxx.c
@@ -698,6 +698,12 @@ static int mv88q2xxx_hwmon_read(struct device *dev,
 	struct phy_device *phydev = dev_get_drvdata(dev);
 	int ret;
 
+	/* If the PHYs reset is asserted it returns 0xffff for any read
+	 * operation. Return with an network is down error in this case.
+	 */
+	if (phydev->mdio.reset_state == 1)
+		return -ENETDOWN;
+
 	switch (attr) {
 	case hwmon_temp_input:
 		ret = phy_read_mmd(phydev, MDIO_MMD_PCS,
@@ -737,6 +743,14 @@ static int mv88q2xxx_hwmon_write(struct device *dev,
 {
 	struct phy_device *phydev = dev_get_drvdata(dev);
 
+	/* If the PHYs reset is asserted it ignores any write operation, return
+	 * with an network is down error in this case. Without returning an
+	 * error the user would not know that writing the temperature threshold
+	 * has gone wrong.
+	 */
+	if (phydev->mdio.reset_state == 1)
+		return -ENETDOWN;
+
 	switch (attr) {
 	case hwmon_temp_max:
 		clamp_val(val, -75000, 180000);

-- 
2.39.5
Re: [PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
Posted by Russell King (Oracle) 10 months ago
On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote:
> If the PHYs reset is asserted it returns 0xffff for any read operation.
> This might happen if the user admins down the interface and wants to read
> the temperature. Prevent reading the temperature in this case and return
> with an network is down error. Write operations are ignored by the device
> when reset is asserted, still return a network is down error in this
> case to make the user aware of the operation gone wrong.

If we look at where mdio_device_reset() is called from:

1. mdio_device_register() -> mdiobus_register_device() asserts reset
   before adding the device to the device layer (which will then
   cause the driver to be searched for and bound.)

2. mdio_probe(), deasserts the reset signal before calling the MDIO
   driver's ->probe method, which will be phy_probe().

3. after a probe failure to re-assert the reset signal.

4. after ->remove has been called.

That is the sum total. So, while the driver is bound to the device,
phydev->mdio.reset_state is guaranteed to be zero.

Therefore, is this patch fixing a real observed problem with the
current driver?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
Posted by Dimitri Fedrau 10 months ago
Hi Russell,

Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle):
> On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote:
> > If the PHYs reset is asserted it returns 0xffff for any read operation.
> > This might happen if the user admins down the interface and wants to read
> > the temperature. Prevent reading the temperature in this case and return
> > with an network is down error. Write operations are ignored by the device
> > when reset is asserted, still return a network is down error in this
> > case to make the user aware of the operation gone wrong.
> 
> If we look at where mdio_device_reset() is called from:
> 
> 1. mdio_device_register() -> mdiobus_register_device() asserts reset
>    before adding the device to the device layer (which will then
>    cause the driver to be searched for and bound.)
> 
> 2. mdio_probe(), deasserts the reset signal before calling the MDIO
>    driver's ->probe method, which will be phy_probe().
> 
> 3. after a probe failure to re-assert the reset signal.
> 
> 4. after ->remove has been called.
> 

There is also phy_device_reset that calls mdio_device_reset.

> That is the sum total. So, while the driver is bound to the device,
> phydev->mdio.reset_state is guaranteed to be zero.
> 
> Therefore, is this patch fixing a real observed problem with the
> current driver?
>
Yes, when I admin up and afterwards down the network device then the PHYs
reset is asserted. In this case phy_detach is called which calls
phy_device_reset(phydev, 1), ...

Best regards,
Dimitri Fedrau
Re: [PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
Posted by Russell King (Oracle) 10 months ago
On Thu, Feb 20, 2025 at 04:22:14PM +0100, Dimitri Fedrau wrote:
> Hi Russell,
> 
> Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle):
> > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote:
> > > If the PHYs reset is asserted it returns 0xffff for any read operation.
> > > This might happen if the user admins down the interface and wants to read
> > > the temperature. Prevent reading the temperature in this case and return
> > > with an network is down error. Write operations are ignored by the device
> > > when reset is asserted, still return a network is down error in this
> > > case to make the user aware of the operation gone wrong.
> > 
> > If we look at where mdio_device_reset() is called from:
> > 
> > 1. mdio_device_register() -> mdiobus_register_device() asserts reset
> >    before adding the device to the device layer (which will then
> >    cause the driver to be searched for and bound.)
> > 
> > 2. mdio_probe(), deasserts the reset signal before calling the MDIO
> >    driver's ->probe method, which will be phy_probe().
> > 
> > 3. after a probe failure to re-assert the reset signal.
> > 
> > 4. after ->remove has been called.
> > 
> 
> There is also phy_device_reset that calls mdio_device_reset.

Ok, thanks for pointing that out.

> > That is the sum total. So, while the driver is bound to the device,
> > phydev->mdio.reset_state is guaranteed to be zero.
> > 
> > Therefore, is this patch fixing a real observed problem with the
> > current driver?
> >
> Yes, when I admin up and afterwards down the network device then the PHYs
> reset is asserted. In this case phy_detach is called which calls
> phy_device_reset(phydev, 1), ...

I'm still concerned that this solution is basically racy - the
netdev can come up/down while hwmon is accessing the device. I'm
also unconvinced that ENETDOWN is a good idea here.

While I get the "describe the hardware" is there a real benefit to
describing this?

What I'm wondering is whether manipulating the reset signal in this
case provides more pain than gain.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net-next v2 2/2] net: phy: marvell-88q2xxx: Prevent hwmon access with asserted reset
Posted by Dimitri Fedrau 10 months ago
Am Thu, Feb 20, 2025 at 03:29:10PM +0000 schrieb Russell King (Oracle):
> On Thu, Feb 20, 2025 at 04:22:14PM +0100, Dimitri Fedrau wrote:
> > Hi Russell,
> > 
> > Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle):
> > > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote:
> > > > If the PHYs reset is asserted it returns 0xffff for any read operation.
> > > > This might happen if the user admins down the interface and wants to read
> > > > the temperature. Prevent reading the temperature in this case and return
> > > > with an network is down error. Write operations are ignored by the device
> > > > when reset is asserted, still return a network is down error in this
> > > > case to make the user aware of the operation gone wrong.
> > > 
> > > If we look at where mdio_device_reset() is called from:
> > > 
> > > 1. mdio_device_register() -> mdiobus_register_device() asserts reset
> > >    before adding the device to the device layer (which will then
> > >    cause the driver to be searched for and bound.)
> > > 
> > > 2. mdio_probe(), deasserts the reset signal before calling the MDIO
> > >    driver's ->probe method, which will be phy_probe().
> > > 
> > > 3. after a probe failure to re-assert the reset signal.
> > > 
> > > 4. after ->remove has been called.
> > > 
> > 
> > There is also phy_device_reset that calls mdio_device_reset.
> 
> Ok, thanks for pointing that out.
> 
> > > That is the sum total. So, while the driver is bound to the device,
> > > phydev->mdio.reset_state is guaranteed to be zero.
> > > 
> > > Therefore, is this patch fixing a real observed problem with the
> > > current driver?
> > >
> > Yes, when I admin up and afterwards down the network device then the PHYs
> > reset is asserted. In this case phy_detach is called which calls
> > phy_device_reset(phydev, 1), ...
> 
> I'm still concerned that this solution is basically racy - the
> netdev can come up/down while hwmon is accessing the device. I'm
> also unconvinced that ENETDOWN is a good idea here.
>
Yes it is racy. A solution would be to set a flag or whatever in
mdio_device_reset right before changes to the reset line happens and
clear the flag right after the changes have been done. Add a function
that return the state of the flag.
Better go back to EIO ? At first I thought it was a good idea because
the user gets at least a hint what the cause of the error is...

> While I get the "describe the hardware" is there a real benefit to
> describing this?
> 
I can't follow.

> What I'm wondering is whether manipulating the reset signal in this
> case provides more pain than gain.
>
It seems so. I wasn't really aware of it :-)

Best regards,
Dimitri Fedrau