Also set bit 12 which disabled the RX FIDO clear function when setting up
MAC MCR, as MediaTek SDK did the same change stating:
"If without this patch, kernel might receive invalid packets that are
corrupted by GMAC."[1]
This fixes issues with <= 1G speed where we could previously observe
about 30% packet loss while the bad packet counter was increasing.
[1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
drivers/net/ethernet/mediatek/mtk_eth_soc.c | 3 ++-
drivers/net/ethernet/mediatek/mtk_eth_soc.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 030d87c42bd42..ed32a511adc30 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -616,7 +616,8 @@ static int mtk_mac_finish(struct phylink_config *config, unsigned int mode,
mcr_cur = mtk_r32(mac->hw, MTK_MAC_MCR(mac->id));
mcr_new = mcr_cur;
mcr_new |= MAC_MCR_IPG_CFG | MAC_MCR_FORCE_MODE |
- MAC_MCR_BACKOFF_EN | MAC_MCR_BACKPR_EN | MAC_MCR_FORCE_LINK;
+ MAC_MCR_BACKOFF_EN | MAC_MCR_BACKPR_EN | MAC_MCR_FORCE_LINK |
+ MAC_MCR_RX_FIFO_CLR_DIS;
/* Only update control register when needed! */
if (mcr_new != mcr_cur)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 142def8629c82..529c95c481b73 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -404,6 +404,7 @@
#define MAC_MCR_FORCE_MODE BIT(15)
#define MAC_MCR_TX_EN BIT(14)
#define MAC_MCR_RX_EN BIT(13)
+#define MAC_MCR_RX_FIFO_CLR_DIS BIT(12)
#define MAC_MCR_BACKOFF_EN BIT(9)
#define MAC_MCR_BACKPR_EN BIT(8)
#define MAC_MCR_FORCE_RX_FC BIT(5)
--
2.39.2
On Wed, Mar 01, 2023 at 07:55:05PM +0000, Daniel Golle wrote: > Also set bit 12 which disabled the RX FIDO clear function when setting up > MAC MCR, as MediaTek SDK did the same change stating: > "If without this patch, kernel might receive invalid packets that are > corrupted by GMAC."[1] > This fixes issues with <= 1G speed where we could previously observe > about 30% packet loss while the bad packet counter was increasing. > > [1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63 > Tested-by: Bjørn Mork <bjorn@mork.no> > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > --- Should this patch be submitted separately from the series, to the net.git tree, to be backported to stable kernels?
On Thu, Mar 02, 2023 at 01:31:21AM +0200, Vladimir Oltean wrote: > On Wed, Mar 01, 2023 at 07:55:05PM +0000, Daniel Golle wrote: > > Also set bit 12 which disabled the RX FIDO clear function when setting up > > MAC MCR, as MediaTek SDK did the same change stating: > > "If without this patch, kernel might receive invalid packets that are > > corrupted by GMAC."[1] > > This fixes issues with <= 1G speed where we could previously observe > > about 30% packet loss while the bad packet counter was increasing. > > > > [1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63 > > Tested-by: Bjørn Mork <bjorn@mork.no> > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > > --- > > Should this patch be submitted separately from the series, to the > net.git tree, to be backported to stable kernels? Maybe yes, as this issue may affect e.g. the BPi-R3 board when used with 1G SFP modules. Previously this has just never been a problem as all practically all boards with MediaTek SoCs using SGMII also use the MediaTek MT7531 switch connecting in 2500Base-X mode. Should the Fixes:-tag hence reference the commit adding support for the BPi-R3?
On Thu, Mar 02, 2023 at 12:03:45AM +0000, Daniel Golle wrote: > On Thu, Mar 02, 2023 at 01:31:21AM +0200, Vladimir Oltean wrote: > > On Wed, Mar 01, 2023 at 07:55:05PM +0000, Daniel Golle wrote: > > > Also set bit 12 which disabled the RX FIDO clear function when setting up > > > MAC MCR, as MediaTek SDK did the same change stating: > > > "If without this patch, kernel might receive invalid packets that are > > > corrupted by GMAC."[1] > > > This fixes issues with <= 1G speed where we could previously observe > > > about 30% packet loss while the bad packet counter was increasing. > > > > > > [1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63 > > > Tested-by: Bjørn Mork <bjorn@mork.no> > > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > > > --- > > > > Should this patch be submitted separately from the series, to the > > net.git tree, to be backported to stable kernels? > > Maybe yes, as this issue may affect e.g. the BPi-R3 board when used > with 1G SFP modules. Previously this has just never been a problem as > all practically all boards with MediaTek SoCs using SGMII also use the > MediaTek MT7531 switch connecting in 2500Base-X mode. > > Should the Fixes:-tag hence reference the commit adding support for the > BPi-R3? If it's not an issue that affects existing setups, there is no need to backport the patch. But it needs to be clearly described as such in the commit message. You mention <= 1G speeds, but then only talk about 1G SFP modules. I see that the mtk_eth_soc driver also sets "gmii" and "rgmii" in phylink's supported_interfaces. Those are also <= 1G speeds. There could also be SGMII on-board PHYs. Does the RX FIFO clearing issue not affect those?
On Thu, Mar 02, 2023 at 12:00:22PM +0200, Vladimir Oltean wrote: > On Thu, Mar 02, 2023 at 12:03:45AM +0000, Daniel Golle wrote: > > On Thu, Mar 02, 2023 at 01:31:21AM +0200, Vladimir Oltean wrote: > > > On Wed, Mar 01, 2023 at 07:55:05PM +0000, Daniel Golle wrote: > > > > Also set bit 12 which disabled the RX FIDO clear function when setting up > > > > MAC MCR, as MediaTek SDK did the same change stating: > > > > "If without this patch, kernel might receive invalid packets that are > > > > corrupted by GMAC."[1] > > > > This fixes issues with <= 1G speed where we could previously observe > > > > about 30% packet loss while the bad packet counter was increasing. > > > > > > > > [1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63 > > > > Tested-by: Bjørn Mork <bjorn@mork.no> > > > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > > > > --- > > > > > > Should this patch be submitted separately from the series, to the > > > net.git tree, to be backported to stable kernels? > > > > Maybe yes, as this issue may affect e.g. the BPi-R3 board when used > > with 1G SFP modules. Previously this has just never been a problem as > > all practically all boards with MediaTek SoCs using SGMII also use the > > MediaTek MT7531 switch connecting in 2500Base-X mode. > > > > Should the Fixes:-tag hence reference the commit adding support for the > > BPi-R3? > > If it's not an issue that affects existing setups, there is no need to > backport the patch. But it needs to be clearly described as such in the > commit message. > > You mention <= 1G speeds, but then only talk about 1G SFP modules. > I see that the mtk_eth_soc driver also sets "gmii" and "rgmii" in > phylink's supported_interfaces. Those are also <= 1G speeds. There could > also be SGMII on-board PHYs. Does the RX FIFO clearing issue not affect > those? The issues affects PHYs (and potentially switch PHY ICs) connected via SGMII operating at 1.25Mbaud. The only officially supported board affected by this is the BPi-R3 where it affects the SFP cages -- the on-board MT7531 switch which is also used on all other boards using these SoCs is connected with 2500Base-X. The issue does **not** affect RGMII or GMII on the MT7623 SoC, but I don't have any way to try RGMII or GMII on more recent SoCs as I lack hardware making use of that to connect a PHY.
On Thu, Mar 02, 2023 at 11:56:56AM +0000, Daniel Golle wrote: > The issues affects PHYs (and potentially switch PHY ICs) connected via > SGMII operating at 1.25Mbaud. > > The only officially supported board affected by this is the BPi-R3 where > it affects the SFP cages -- the on-board MT7531 switch which is also > used on all other boards using these SoCs is connected with 2500Base-X. > > The issue does **not** affect RGMII or GMII on the MT7623 SoC, but I > don't have any way to try RGMII or GMII on more recent SoCs as I lack > hardware making use of that to connect a PHY. I don't believe that board vendors need to have their device tree "officially supported" in Linux in order to claim that a bug affecting them should be fixed on stable kernels. As long as the PHY interface type is generally supported by the driver and is expected to work, it should work with any board (the exception being if there are other configuration steps specific to that board required). In any case, a good commit message for a bug fix explains what is the user impact of the bug being fixed, the configurations which are affected, how it was noticed, an adequate Fixes: tag, and if necessary, why it is being fixed the way it is. In other words, it must be able to respond to normal questions that a reader might have when he/she stumbles upon it, for various reasons (it introduces a regression, they are debugging an issue and want to assess whether backporting this patch would help them, etc). The reader might be in the not so close future, when you might not be able to provide clarifications personally, so the commit message should contain all that you know which is relevant to the topic. Most of these clarifications were provided by you as replies to the patch, but they should be present in the commit message instead.
© 2016 - 2024 Red Hat, Inc.