[PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit

Daniel Golle posted 1 patch 2 weeks ago
There is a newer version of this series
drivers/net/dsa/lantiq/mxl-gsw1xx.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
[PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Daniel Golle 2 weeks ago
Despite being documented as self-clearing, the RANEG bit sometimes
remains set, preventing auto-negotiation from happening.

Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
delayed_work emulating the asynchronous self-clearing behavior.

Fixes: 22335939ec90 ("net: dsa: add driver for MaxLinear GSW1xx switch family")
Reported-by: Rasmus Villemoes <ravi@prevas.dk>
Suggested-by: "Benny (Ying-Tsan) Weng" <yweng@maxlinear.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
 drivers/net/dsa/lantiq/mxl-gsw1xx.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/dsa/lantiq/mxl-gsw1xx.c b/drivers/net/dsa/lantiq/mxl-gsw1xx.c
index cf33a16fd183b..91304d3c1ca9b 100644
--- a/drivers/net/dsa/lantiq/mxl-gsw1xx.c
+++ b/drivers/net/dsa/lantiq/mxl-gsw1xx.c
@@ -11,10 +11,12 @@
 
 #include <linux/bits.h>
 #include <linux/delay.h>
+#include <linux/jiffies.h>
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_mdio.h>
 #include <linux/regmap.h>
+#include <linux/workqueue.h>
 #include <net/dsa.h>
 
 #include "lantiq_gswip.h"
@@ -29,6 +31,7 @@ struct gsw1xx_priv {
 	struct			regmap *clk;
 	struct			regmap *shell;
 	struct			phylink_pcs pcs;
+	struct delayed_work	clear_raneg;
 	phy_interface_t		tbi_interface;
 	struct gswip_priv	gswip;
 };
@@ -145,6 +148,8 @@ static void gsw1xx_pcs_disable(struct phylink_pcs *pcs)
 {
 	struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs);
 
+	cancel_delayed_work_sync(&priv->clear_raneg);
+
 	/* Assert SGMII shell reset */
 	regmap_set_bits(priv->shell, GSW1XX_SHELL_RST_REQ,
 			GSW1XX_RST_REQ_SGMII_SHELL);
@@ -428,12 +433,27 @@ static int gsw1xx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 	return 0;
 }
 
+static void gsw1xx_pcs_clear_raneg(struct work_struct *work)
+{
+	struct gsw1xx_priv *priv =
+		container_of(work, struct gsw1xx_priv, clear_raneg.work);
+
+	regmap_clear_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL,
+			  GSW1XX_SGMII_TBI_ANEGCTL_RANEG);
+}
+
 static void gsw1xx_pcs_an_restart(struct phylink_pcs *pcs)
 {
 	struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs);
 
 	regmap_set_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL,
 			GSW1XX_SGMII_TBI_ANEGCTL_RANEG);
+
+	/* despite being documented as self-clearing, the RANEG bit
+	 * sometimes remains set, preventing auto-negotiation from happening.
+	 * MaxLinear advises to manually clear the bit after 10ms.
+	 */
+	schedule_delayed_work(&priv->clear_raneg, msecs_to_jiffies(10));
 }
 
 static void gsw1xx_pcs_link_up(struct phylink_pcs *pcs,
@@ -636,6 +656,8 @@ static int gsw1xx_probe(struct mdio_device *mdiodev)
 	if (ret)
 		return ret;
 
+	INIT_DELAYED_WORK(&priv->clear_raneg, gsw1xx_pcs_clear_raneg);
+
 	ret = gswip_probe_common(&priv->gswip, version);
 	if (ret)
 		return ret;
-- 
2.52.0
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Andrew Lunn 2 weeks ago
On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> Despite being documented as self-clearing, the RANEG bit sometimes
> remains set, preventing auto-negotiation from happening.
> 
> Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> delayed_work emulating the asynchronous self-clearing behavior.

Maybe add some text why the complexity of delayed work is used, rather
than just a msleep(10)?

Calling regmap_read_poll_timeout() to see if it clears itself could
optimise this, and still be simpler.

	Andrew
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Daniel Golle 2 weeks ago
On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > Despite being documented as self-clearing, the RANEG bit sometimes
> > remains set, preventing auto-negotiation from happening.
> > 
> > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > delayed_work emulating the asynchronous self-clearing behavior.
> 
> Maybe add some text why the complexity of delayed work is used, rather
> than just a msleep(10)?
> 
> Calling regmap_read_poll_timeout() to see if it clears itself could
> optimise this, and still be simpler.

Is the restart_an() operation allowed to sleep? Looking at other
drivers I only ever see that it sets a self-clearing AN RESTART bit,
never waiting for that bit to clear. Hence I wanted to immitate
that behavior by clearing the bit asynchronously. If that's not needed
and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
be much easier, of course.
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Russell King (Oracle) 1 week, 6 days ago
On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > remains set, preventing auto-negotiation from happening.
> > > 
> > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > delayed_work emulating the asynchronous self-clearing behavior.
> > 
> > Maybe add some text why the complexity of delayed work is used, rather
> > than just a msleep(10)?
> > 
> > Calling regmap_read_poll_timeout() to see if it clears itself could
> > optimise this, and still be simpler.
> 
> Is the restart_an() operation allowed to sleep? Looking at other
> drivers I only ever see that it sets a self-clearing AN RESTART bit,
> never waiting for that bit to clear. Hence I wanted to immitate
> that behavior by clearing the bit asynchronously. If that's not needed
> and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
> be much easier, of course.

Sleeping is permitted in this code path, but bear in mind that it
will be called from ethtool ops, and thus the RTNL will be held,
please keep sleep durations to a minimum.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Daniel Golle 1 week, 6 days ago
On Fri, Dec 05, 2025 at 03:17:37PM +0000, Russell King (Oracle) wrote:
> On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> > On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > > remains set, preventing auto-negotiation from happening.
> > > > 
> > > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > > delayed_work emulating the asynchronous self-clearing behavior.
> > > 
> > > Maybe add some text why the complexity of delayed work is used, rather
> > > than just a msleep(10)?
> > > 
> > > Calling regmap_read_poll_timeout() to see if it clears itself could
> > > optimise this, and still be simpler.
> > 
> > Is the restart_an() operation allowed to sleep? Looking at other
> > drivers I only ever see that it sets a self-clearing AN RESTART bit,
> > never waiting for that bit to clear. Hence I wanted to immitate
> > that behavior by clearing the bit asynchronously. If that's not needed
> > and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
> > be much easier, of course.
> 
> Sleeping is permitted in this code path, but bear in mind that it
> will be called from ethtool ops, and thus the RTNL will be held,
> please keep sleep durations to a minimum.

In that sense 10ms (on top of the MDIO operation) is not that little.
Maybe it is worth to use delayed_work to clear the bit after all...
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Andrew Lunn 1 week, 6 days ago
On Fri, Dec 05, 2025 at 03:40:35PM +0000, Daniel Golle wrote:
> On Fri, Dec 05, 2025 at 03:17:37PM +0000, Russell King (Oracle) wrote:
> > On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> > > On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > > > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > > > remains set, preventing auto-negotiation from happening.
> > > > > 
> > > > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > > > delayed_work emulating the asynchronous self-clearing behavior.
> > > > 
> > > > Maybe add some text why the complexity of delayed work is used, rather
> > > > than just a msleep(10)?
> > > > 
> > > > Calling regmap_read_poll_timeout() to see if it clears itself could
> > > > optimise this, and still be simpler.
> > > 
> > > Is the restart_an() operation allowed to sleep? Looking at other
> > > drivers I only ever see that it sets a self-clearing AN RESTART bit,
> > > never waiting for that bit to clear. Hence I wanted to immitate
> > > that behavior by clearing the bit asynchronously. If that's not needed
> > > and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
> > > be much easier, of course.
> > 
> > Sleeping is permitted in this code path, but bear in mind that it
> > will be called from ethtool ops, and thus the RTNL will be held,
> > please keep sleep durations to a minimum.
> 
> In that sense 10ms (on top of the MDIO operation) is not that little.
> Maybe it is worth to use delayed_work to clear the bit after all...

Do you have any idea how often it does not self clear? And if it does
self clear, how fast does it clear?

If it self clears after 1ms with 99% probability,
regmap_read_poll_timeout() with a 1ms poll interval will mostly just
cost 1ms, which is not too bad. Its just the 1% when you need the full
10ms.

	Andrew
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Russell King (Oracle) 1 week, 6 days ago
On Fri, Dec 05, 2025 at 03:40:35PM +0000, Daniel Golle wrote:
> On Fri, Dec 05, 2025 at 03:17:37PM +0000, Russell King (Oracle) wrote:
> > On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> > > On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > > > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > > > remains set, preventing auto-negotiation from happening.
> > > > > 
> > > > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > > > delayed_work emulating the asynchronous self-clearing behavior.
> > > > 
> > > > Maybe add some text why the complexity of delayed work is used, rather
> > > > than just a msleep(10)?
> > > > 
> > > > Calling regmap_read_poll_timeout() to see if it clears itself could
> > > > optimise this, and still be simpler.
> > > 
> > > Is the restart_an() operation allowed to sleep? Looking at other
> > > drivers I only ever see that it sets a self-clearing AN RESTART bit,
> > > never waiting for that bit to clear. Hence I wanted to immitate
> > > that behavior by clearing the bit asynchronously. If that's not needed
> > > and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
> > > be much easier, of course.
> > 
> > Sleeping is permitted in this code path, but bear in mind that it
> > will be called from ethtool ops, and thus the RTNL will be held,
> > please keep sleep durations to a minimum.
> 
> In that sense 10ms (on top of the MDIO operation) is not that little.
> Maybe it is worth to use delayed_work to clear the bit after all...

... in which case I think you need to do a better job.

The cancel_delayed_work() in the pcs_disable() method means if we
stopp using this PCS briefly while the AN restart bit is set, there's
nothing that will clear it.

There are other implementations that have this problem. mvneta has
the same problem, but there we can write the register to set the
MVNETA_GMAC_INBAND_RESTART_AN, and then immediately write it again
without delay to clear this bit. The bit is documented as self
clearing, but practical observation indicates it never does.

Are you sure you need to wait 10ms ? What happens if you set the
bit and then immediately clear it, like we do for mvneta?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Daniel Golle 1 week, 6 days ago
On Fri, Dec 05, 2025 at 03:52:49PM +0000, Russell King (Oracle) wrote:
> On Fri, Dec 05, 2025 at 03:40:35PM +0000, Daniel Golle wrote:
> > On Fri, Dec 05, 2025 at 03:17:37PM +0000, Russell King (Oracle) wrote:
> > > On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> > > > On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > > > > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > > > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > > > > remains set, preventing auto-negotiation from happening.
> > > > > > 
> > > > > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > > > > delayed_work emulating the asynchronous self-clearing behavior.
> > > > > 
> > > > > Maybe add some text why the complexity of delayed work is used, rather
> > > > > than just a msleep(10)?
> > > > > 
> > > > > Calling regmap_read_poll_timeout() to see if it clears itself could
> > > > > optimise this, and still be simpler.
> > > > 
> > > > Is the restart_an() operation allowed to sleep? Looking at other
> > > > drivers I only ever see that it sets a self-clearing AN RESTART bit,
> > > > never waiting for that bit to clear. Hence I wanted to immitate
> > > > that behavior by clearing the bit asynchronously. If that's not needed
> > > > and msleep(10) or usleep_range(10000, 20000) can be used instead that'd
> > > > be much easier, of course.
> > > 
> > > Sleeping is permitted in this code path, but bear in mind that it
> > > will be called from ethtool ops, and thus the RTNL will be held,
> > > please keep sleep durations to a minimum.
> > 
> > In that sense 10ms (on top of the MDIO operation) is not that little.
> > Maybe it is worth to use delayed_work to clear the bit after all...
> 
> ... in which case I think you need to do a better job.
> 
> The cancel_delayed_work() in the pcs_disable() method means if we
> stopp using this PCS briefly while the AN restart bit is set, there's
> nothing that will clear it.

The reset of the whole PCS unit (which is asserted in .pcs_disable and
deasserted in .pcs_enable) also resets the AN restart bit.

> 
> There are other implementations that have this problem. mvneta has
> the same problem, but there we can write the register to set the
> MVNETA_GMAC_INBAND_RESTART_AN, and then immediately write it again
> without delay to clear this bit. The bit is documented as self
> clearing, but practical observation indicates it never does.
> 
> Are you sure you need to wait 10ms ? What happens if you set the
> bit and then immediately clear it, like we do for mvneta?

MaxLinear engineer Benny Weng has told me (quote):

"I did a check on the REANEG bit, [...] and it needs to clear the bit
manually and please give 10ms delay in between."

In my observation the bit *does* clear itself *most* of the time,
but aparently when being set in the wrong moment it doesn't. This
makes testing rather difficult and I'd just follow their advise.
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Andrew Lunn 2 weeks ago
On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > remains set, preventing auto-negotiation from happening.
> > > 
> > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > delayed_work emulating the asynchronous self-clearing behavior.
> > 
> > Maybe add some text why the complexity of delayed work is used, rather
> > than just a msleep(10)?
> > 
> > Calling regmap_read_poll_timeout() to see if it clears itself could
> > optimise this, and still be simpler.
> 
> Is the restart_an() operation allowed to sleep?

This would typically be an MDIO operations, since PCS are often on an
MDIO bus. And MDIO is expected to sleep. Also, regmap is using a lock
to prevent parallel access, and i expect that is a sleeping lock, not
a spinlock.

If you want to be sure, put in a might_sleep() call, and build the
kernel will sleep in atomic debug enabled. You will get a splat if i'm
wrong.

	Andrew
Re: [PATCH net] net: dsa: mxl-gsw1xx: manually clear RANEG bit
Posted by Vladimir Oltean 2 weeks ago
On Fri, Dec 05, 2025 at 01:56:39PM +0000, Daniel Golle wrote:
> On Fri, Dec 05, 2025 at 02:45:35PM +0100, Andrew Lunn wrote:
> > On Fri, Dec 05, 2025 at 01:32:20AM +0000, Daniel Golle wrote:
> > > Despite being documented as self-clearing, the RANEG bit sometimes
> > > remains set, preventing auto-negotiation from happening.
> > > 
> > > Manually clear the RANEG bit after 10ms as advised by MaxLinear, using
> > > delayed_work emulating the asynchronous self-clearing behavior.
> > 
> > Maybe add some text why the complexity of delayed work is used, rather
> > than just a msleep(10)?
> > 
> > Calling regmap_read_poll_timeout() to see if it clears itself could
> > optimise this, and still be simpler.
> 
> Is the restart_an() operation allowed to sleep?

Isn't regmap_set_bits() already sleeping? Your gsw1xx_regmap_bus
accesses __mdiobus_write() and __mdiobus_read() which are sleepable
operations and there's no problem with that.