The Radxa E24C has a Realtek RTL8367RB-VB switch with four usable ports
and is connected using a fixed-link to GMAC1 on the RK3528 SoC.
Add an ethernet-switch node to describe the RTL8367RB-VB switch.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction
and only around ~1-2 Mbits/sec in the other direction.
The RK3528 hardware design guide recommends that timing between TXCLK
and data is controlled by MAC, and timing between RXCLK and data is
controlled by PHY.
Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not
seem to resolve this speed issue, however dropping snps,tso seems to fix
that issue.
Unsure what is best here, should MAC or switch add the delays? Here I
just followed DT from vendor downstream tree and added rx/tx internal
delay to switch.
Vendor downstream DT also adds 'pause' to the fixed-link nodes, and this
may be something that should be added here. However, during testing flow
control always ended up being disabled so I skipped 'pause' here.
Schematics: https://dl.radxa.com/e/e24c/docs/radxa_e24c_v1200_schematic.pdf
---
.../boot/dts/rockchip/rk3528-radxa-e24c.dts | 55 +++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3528-radxa-e24c.dts b/arch/arm64/boot/dts/rockchip/rk3528-radxa-e24c.dts
index 225f2b0c5339..26754ff7f4ef 100644
--- a/arch/arm64/boot/dts/rockchip/rk3528-radxa-e24c.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3528-radxa-e24c.dts
@@ -196,6 +196,7 @@ &cpu3 {
};
&gmac1 {
+ /delete-property/ snps,tso;
clock_in_out = "output";
phy-mode = "rgmii-id";
phy-supply = <&avdd_rtl8367rb>;
@@ -368,6 +369,60 @@ &mdio1 {
reset-delay-us = <25000>;
reset-gpios = <&gpio4 RK_PC2 GPIO_ACTIVE_LOW>;
reset-post-delay-us = <100000>;
+
+ ethernet-switch@1d {
+ compatible = "realtek,rtl8365mb";
+ reg = <0x1d>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&rtl8367rb_eint>;
+
+ ethernet-ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethernet-port@0 {
+ reg = <0>;
+ label = "wan";
+ };
+
+ ethernet-port@1 {
+ reg = <1>;
+ label = "lan1";
+ };
+
+ ethernet-port@2 {
+ reg = <2>;
+ label = "lan2";
+ };
+
+ ethernet-port@3 {
+ reg = <3>;
+ label = "lan3";
+ };
+
+ ethernet-port@6 {
+ reg = <6>;
+ ethernet = <&gmac1>;
+ label = "cpu";
+ phy-mode = "rgmii-id";
+ rx-internal-delay-ps = <2000>;
+ tx-internal-delay-ps = <2000>;
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+ };
+
+ interrupt-controller {
+ interrupt-parent = <&gpio1>;
+ interrupts = <RK_PC2 IRQ_TYPE_LEVEL_LOW>;
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <1>;
+ };
+ };
};
&pinctrl {
--
2.50.1
Hi, > Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction > and only around ~1-2 Mbits/sec in the other direction. > Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not > seem to resolve this speed issue, however dropping snps,tso seems to fix > that issue. Have you tried setting phy-mode to rgmii? (just for testing) Usually this problem is caused by incorrect rx/tx delay. > + ethernet-switch@1d { > + compatible = "realtek,rtl8365mb"; > + reg = <0x1d>; > + pinctrl-names = "default"; > + pinctrl-0 = <&rtl8367rb_eint>; Shouldn't this pinctrl be written in interrupts? > + ethernet-port@6 { > + reg = <6>; > + ethernet = <&gmac1>; > + label = "cpu"; No need for label = "cpu": https://github.com/torvalds/linux/commit/567f38317054e66647fd59cfa4e261219a2a21db > This series relaxes the realtek dsa drivers requirement of having a mdio > child OF node to probe and instead have it register a user_mii_bus to > make it function when a mdio child OF node is missing. This is weird, the switch is connected to the gmac via mdio. Can you try the following and see if it works? I tried it on a rk3568 + rtl8367s board and it worked: ``` &mdio1 { switch@29 { compatible = "realtek,rtl8365mb"; reg = <29>; reset-gpios = ... switch_intc: interrupt-controller { interrupt-parent = ... interrupts = ... interrupt-controller; #address-cells = <0>; #interrupt-cells = <1>; }; mdio { #address-cells = <1>; #size-cells = <0>; phy0: ethernet-phy@0 { reg = <0>; interrupt-parent = <&switch_intc>; interrupts = <0>; }; phy1: ethernet-phy@1 { reg = <1>; interrupt-parent = <&switch_intc>; interrupts = <1>; }; phy2: ethernet-phy@2 { reg = <2>; interrupt-parent = <&switch_intc>; interrupts = <2>; }; phy3: ethernet-phy@3 { reg = <3>; interrupt-parent = <&switch_intc>; interrupts = <3>; }; }; ports { #address-cells = <1>; #size-cells = <0>; port@0 { reg = <0>; label = "wan"; phy-handle = <&phy0>; }; port@1 { reg = <1>; label = "lan1"; phy-handle = <&phy1>; }; port@2 { reg = <2>; label = "lan2"; phy-handle = <&phy2>; }; port@3 { reg = <3>; label = "lan3"; phy-handle = <&phy3>; }; port@x { reg = <x>; ethernet = <&gmac1>; phy-mode = "rgmii"; fixed-link { speed = <1000>; full-duplex; }; }; }; }; }; ``` Thanks, Chukun -- 2.25.1
Hi Chukun, On 7/28/2025 4:30 PM, Chukun Pan wrote: > Hi, > >> Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction >> and only around ~1-2 Mbits/sec in the other direction. > >> Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not >> seem to resolve this speed issue, however dropping snps,tso seems to fix >> that issue. > > Have you tried setting phy-mode to rgmii? (just for testing) > Usually this problem is caused by incorrect rx/tx delay. The issue is with TSO and RX checksum offload, with those disabled on the conduit interface (gmac1/eth0) my issues disappeared. Use of rgmii-id "RX and TX delays are not provided by the PCB." as defined by the dt-bindings seem to most correctly describe the HW. Describing switches is new to me, so I could be wrong :-) > >> + ethernet-switch@1d { >> + compatible = "realtek,rtl8365mb"; >> + reg = <0x1d>; >> + pinctrl-names = "default"; >> + pinctrl-0 = <&rtl8367rb_eint>; > > Shouldn't this pinctrl be written in interrupts? Not necessarily, in my mind the pinctrl is applied for the switch interface to the SoC, not the internal workings of the switch. > >> + ethernet-port@6 { >> + reg = <6>; >> + ethernet = <&gmac1>; >> + label = "cpu"; > > No need for label = "cpu": > https://github.com/torvalds/linux/commit/567f38317054e66647fd59cfa4e261219a2a21db Thanks, will drop in v2. > >> This series relaxes the realtek dsa drivers requirement of having a mdio >> child OF node to probe and instead have it register a user_mii_bus to >> make it function when a mdio child OF node is missing. > > This is weird, the switch is connected to the gmac via mdio. > Can you try the following and see if it works? I tried it on > a rk3568 + rtl8367s board and it worked: With a 'mdio' child node 'make CHECK_DTBS=y' report something like: rockchip/rk3528-radxa-e24c.dtb: ethernet-switch@1d (realtek,rtl8365mb): mdio: False schema does not allow { [snip] } from schema $id: http://devicetree.org/schemas/net/dsa/realtek.yaml# With a mdio node the driver is happy and dtschema is sad, and without the mdio node it was the other way around. The plan is to drop this patch and instead modify the dt-binding to allow describing a mdio node when the switch node has a reg prop in v2. Regards, Jonas > > ``` > &mdio1 { > switch@29 { > compatible = "realtek,rtl8365mb"; > reg = <29>; > reset-gpios = ... > > switch_intc: interrupt-controller { > interrupt-parent = ... > interrupts = ... > interrupt-controller; > #address-cells = <0>; > #interrupt-cells = <1>; > }; > > mdio { > #address-cells = <1>; > #size-cells = <0>; > > phy0: ethernet-phy@0 { > reg = <0>; > interrupt-parent = <&switch_intc>; > interrupts = <0>; > }; > > phy1: ethernet-phy@1 { > reg = <1>; > interrupt-parent = <&switch_intc>; > interrupts = <1>; > }; > > phy2: ethernet-phy@2 { > reg = <2>; > interrupt-parent = <&switch_intc>; > interrupts = <2>; > }; > > phy3: ethernet-phy@3 { > reg = <3>; > interrupt-parent = <&switch_intc>; > interrupts = <3>; > }; > }; > > ports { > #address-cells = <1>; > #size-cells = <0>; > > port@0 { > reg = <0>; > label = "wan"; > phy-handle = <&phy0>; > }; > > port@1 { > reg = <1>; > label = "lan1"; > phy-handle = <&phy1>; > }; > > port@2 { > reg = <2>; > label = "lan2"; > phy-handle = <&phy2>; > }; > > port@3 { > reg = <3>; > label = "lan3"; > phy-handle = <&phy3>; > }; > > port@x { > reg = <x>; > ethernet = <&gmac1>; > phy-mode = "rgmii"; > > fixed-link { > speed = <1000>; > full-duplex; > }; > }; > }; > }; > }; > ``` > > Thanks, > Chukun > > -- > 2.25.1 > >
Hi, > The issue is with TSO and RX checksum offload, with those disabled on > the conduit interface (gmac1/eth0) my issues disappeared. I did a test today and the same problem occurred when running the new kernel on my rk3568 + rtl8367s board. This problem does not exist on older kernels (6.1 and 6.6). Not sure where the problem is. > With a 'mdio' child node 'make CHECK_DTBS=y' report something like: > > rockchip/rk3528-radxa-e24c.dtb: ethernet-switch@1d (realtek,rtl8365mb): mdio: False schema does not allow { [snip] } > from schema $id: http://devicetree.org/schemas/net/dsa/realtek.yaml# > > With a mdio node the driver is happy and dtschema is sad, and without > the mdio node it was the other way around. On older kernels (6.1/6.6) only realtek-smi requires mdio child OF node. Commit bba140a566ed ("net: dsa: realtek: use the same mii bus driver for both interfaces") changed this behavior, both MDIO interface and SMI interface need it (rtl83xx_setup_user_mdio), but the dt-bindings has not been updated. I think this needs a Fixes tag. Thanks, Chukun -- 2.25.1
Hi Chukun, On 7/29/2025 1:50 PM, Chukun Pan wrote: > Hi, > >> The issue is with TSO and RX checksum offload, with those disabled on >> the conduit interface (gmac1/eth0) my issues disappeared. > > I did a test today and the same problem occurred when running the new > kernel on my rk3568 + rtl8367s board. This problem does not exist on > older kernels (6.1 and 6.6). Not sure where the problem is. I had only tested on a next-20250722 based kernel and on a vendor 6.1 based kernel. And similar to your findings, on 6.1 based kernel there was no issue only on the newer kernel. I will probably drop the use of "/delete-property/ snps,tso" and include a note in commit message about the TSO and RX checksum issue for v2. > >> With a 'mdio' child node 'make CHECK_DTBS=y' report something like: >> >> rockchip/rk3528-radxa-e24c.dtb: ethernet-switch@1d (realtek,rtl8365mb): mdio: False schema does not allow { [snip] } >> from schema $id: http://devicetree.org/schemas/net/dsa/realtek.yaml# >> >> With a mdio node the driver is happy and dtschema is sad, and without >> the mdio node it was the other way around. > > On older kernels (6.1/6.6) only realtek-smi requires mdio child OF node. > Commit bba140a566ed ("net: dsa: realtek: use the same mii bus driver for both interfaces") > changed this behavior, both MDIO interface and SMI interface need it > (rtl83xx_setup_user_mdio), but the dt-bindings has not been updated. > I think this needs a Fixes tag. Thanks for finding this, and yes I can see that commit bba140a566ed changed the behavior of the driver and probably broke out-of-tree users. My current plan for a v2 is to: - include a new dt-bindings patch to allow use of a mdio node - include a mdio node in the switch node - add a Fixes tag to the driver patch Then leave up to maintainers to decide if they want to accept this patch or not. Regards, Jonas > > Thanks, > Chukun > > -- > 2.25.1 > >
Hi, > I had only tested on a next-20250722 based kernel and on a vendor 6.1 > based kernel. And similar to your findings, on 6.1 based kernel there > was no issue only on the newer kernel. > > I will probably drop the use of "/delete-property/ snps,tso" and include > a note in commit message about the TSO and RX checksum issue for v2. After my test, this problem is caused by commit 041cc86 ("net: stmmac: Enable TSO on VLANs") https://github.com/torvalds/linux/commit/041cc86b3653cbcdf6ab96c2f2ae34f3d0a99b0a It seems that this commit just exposed the TSO problem (with VLANs). Thanks, Chukun -- 2.25.1
On Sun, Aug 10, 2025 at 10:01:15PM +0800, Chukun Pan wrote: > Hi, > > > I had only tested on a next-20250722 based kernel and on a vendor 6.1 > > based kernel. And similar to your findings, on 6.1 based kernel there > > was no issue only on the newer kernel. > > > > I will probably drop the use of "/delete-property/ snps,tso" and include > > a note in commit message about the TSO and RX checksum issue for v2. > > After my test, this problem is caused by commit 041cc86 ("net: stmmac: Enable TSO on VLANs") > https://github.com/torvalds/linux/commit/041cc86b3653cbcdf6ab96c2f2ae34f3d0a99b0a > > It seems that this commit just exposed the TSO problem (with VLANs). I'm not sure that is correct. What this patch does is enable TSO for VLANs by adding the VLAN header to the packet in software before transmitting it, rather than asking the hardware to insert the VLAN header as it transmits. What i don't understand yet, is what has VLANs got to do with DSA? Does the DSA tagger being used not actually insert a switch specific header, but is using VLAN overlays? Why is the VLAN path in the stmmac transmit function being used? Just a guess, but maybe it is a DSA tagger bug? Maybe the user frame is a VLAN frame. The tagger is placing the VLAN tag into the DSA header, so in effect, the frame is no longer a VLAN frame. But it is not calling __vlan_hwaccel_clear_tag() to indicate the skbuf no longer needs VLAN processing? Andrew
On Sun, Aug 10, 2025 at 05:15:59PM +0200, Andrew Lunn wrote: > Just a guess, but maybe it is a DSA tagger bug? Maybe the user frame > is a VLAN frame. The tagger is placing the VLAN tag into the DSA > header, so in effect, the frame is no longer a VLAN frame. But it is > not calling __vlan_hwaccel_clear_tag() to indicate the skbuf no longer > needs VLAN processing? For the original skb to have had a VLAN hwaccel tag, validate_xmit_vlan() would have had to not push it inside, so vlan_hw_offload_capable() must have been true for DSA user ports. But we advertise neither the NETIF_F_HW_VLAN_CTAG_TX nor the NETIF_F_HW_VLAN_STAG_TX netdev feature. So the VLAN tags in skbs transmitted through DSA user ports should all be in the skb head.
> > I did a test today and the same problem occurred when running the new > > kernel on my rk3568 + rtl8367s board. This problem does not exist on > > older kernels (6.1 and 6.6). Not sure where the problem is. > > I had only tested on a next-20250722 based kernel and on a vendor 6.1 > based kernel. And similar to your findings, on 6.1 based kernel there > was no issue only on the newer kernel. > > I will probably drop the use of "/delete-property/ snps,tso" and include > a note in commit message about the TSO and RX checksum issue for v2. You are submitting a patch for todays kernel, not a historic kernel. If todays kernel needs this property to work, please include it. You can always remove it when you have done a git bisect and find what changed, and submit a fix. Andrew
On Sun, Jul 27, 2025 at 06:03:00PM +0000, Jonas Karlman wrote: > The Radxa E24C has a Realtek RTL8367RB-VB switch with four usable ports > and is connected using a fixed-link to GMAC1 on the RK3528 SoC. > > Add an ethernet-switch node to describe the RTL8367RB-VB switch. > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se> > --- > Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction > and only around ~1-2 Mbits/sec in the other direction. > > The RK3528 hardware design guide recommends that timing between TXCLK > and data is controlled by MAC, and timing between RXCLK and data is > controlled by PHY. Assuming RK3528 is the MAC side, then that makes sense - it's basically suggesting that the _producer_ of the signals should appropriately skew them. > Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not > seem to resolve this speed issue, however dropping snps,tso seems to fix > that issue. > > Unsure what is best here, should MAC or switch add the delays? Here I > just followed DT from vendor downstream tree and added rx/tx internal > delay to switch. Okay. Heres'a an in-depth explanation, because I think many people need this for MAC-to-MAC RGMII links. A MAC to PHY link: MAC1 ------- PHY1 The PHY mode in the MAC1 description controls the application of delays at PHY1. This is relatively well undersood. Now, for a MAC to MAC link: MAC1 ------- MAC2 in a PHY Let's say MAC2 is part of a PHY. Okay, so this is quite simple because it's a PHY on the other end, and thus the situation above applies. In both these cases, the MAC driver will pass the PHY interface to phylib, which will in turn pass it to the PHY driver, which is expected to configure the PHY appropriately. There is a side-case, where a MAC driver is allowed to configure the delays at its end _provided_ it then passes PHY_INTERFACE_MODE_RGMII to phylib (telling the PHY not to add its own delays.) Now let's look at something that isn't a PHY: MAC1 ------- MAC2 in a switch In this case, MAC2 isn't in a PHY or part of a PHY that is driven by phylib, so we don't have a way in the kernel of passing the PHY mode from MAC1 to MAC2 in order for MAC2 to configure itself. It's tempting to say that which RGMII mode is used doesn't matter, but consider the side-case above - if we're talking about a MAC driver that interprets the PHY mode and adds its own delays, then it *does* very much matter. It also matters for MAC2. This could be a switch port that can be used as a CPU facing port, or a switch port that is used as a PHY. In the latter case, it becomes exactly as the first two cases above. Let's take a theoretical case of two ethernet MACs on the same system connected with a RGMII link: MAC1 ------- MAC2 They both use the same driver. What RGMII interface mode should be used for each? Would it be correct to state "rgmii-id" for both MACs? Or "rgmii" for one and "rgmii-id" for the other. You may notice I'm not providing a solution - this is a thought experiment, to provoke some thought into the proper use of the phy-mode property at each end of a MAC to MAC link, and hopefully gain some realisation that (probably) using "rgmii-id" for both MACs probably isn't correct given the model that phy-mode _generally_ states how the opposite end of the RGMII link to the MAC should be configured. However, currently it doesn't matter as long as we don't end up with two MACs that are back to back and the MAC drivers decide to insert the RGMII delay (the side-case I mention above.) Personally, I don't like that we have this side-case as a possibility because it causes problems (if you go through the thought experiment above, you'll trip over the problem if you consider the combinations of MAC1 and MAC2 that do/do not use the side-case!) So, I would expect a MAC to MAC link to have "rgmii" at one end, and "rgmii-id" at the other end, rather than both having the same RGMII mode. > Vendor downstream DT also adds 'pause' to the fixed-link nodes, and this > may be something that should be added here. However, during testing flow > control always ended up being disabled so I skipped 'pause' here. Does it get disabled at the switch, or the stmmac end? -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Sun, Jul 27, 2025 at 06:03:00PM +0000, Jonas Karlman wrote: > The Radxa E24C has a Realtek RTL8367RB-VB switch with four usable ports > and is connected using a fixed-link to GMAC1 on the RK3528 SoC. > > Add an ethernet-switch node to describe the RTL8367RB-VB switch. > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se> > --- > Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction > and only around ~1-2 Mbits/sec in the other direction. > > The RK3528 hardware design guide recommends that timing between TXCLK > and data is controlled by MAC, and timing between RXCLK and data is > controlled by PHY. > > Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not > seem to resolve this speed issue, however dropping snps,tso seems to fix > that issue. It could well be that the Synopsis TSO code does not understand the DSA headers. When it takes a big block to TCP data and segments it, you need to have the DSA header on each segment. If it does not do that, only the first segment has the DSA header, the switch is going to be dropping all the other segments, causes TCP to do a lot of retries. > Unsure what is best here, should MAC or switch add the delays? It should not matter. 2ns is 2ns... Andrew
Hi Andrew, On 7/27/2025 9:16 PM, Andrew Lunn wrote: > On Sun, Jul 27, 2025 at 06:03:00PM +0000, Jonas Karlman wrote: >> The Radxa E24C has a Realtek RTL8367RB-VB switch with four usable ports >> and is connected using a fixed-link to GMAC1 on the RK3528 SoC. >> >> Add an ethernet-switch node to describe the RTL8367RB-VB switch. >> >> Signed-off-by: Jonas Karlman <jonas@kwiboo.se> >> --- >> Initial testing with iperf3 showed ~930-940 Mbits/sec in one direction >> and only around ~1-2 Mbits/sec in the other direction. >> >> The RK3528 hardware design guide recommends that timing between TXCLK >> and data is controlled by MAC, and timing between RXCLK and data is >> controlled by PHY. >> >> Any mix of MAC (rx/tx delay) and switch (rx/tx internal delay) did not >> seem to resolve this speed issue, however dropping snps,tso seems to fix >> that issue. > > It could well be that the Synopsis TSO code does not understand the > DSA headers. When it takes a big block to TCP data and segments it, > you need to have the DSA header on each segment. If it does not do > that, only the first segment has the DSA header, the switch is going > to be dropping all the other segments, causes TCP to do a lot of > retries. Thanks for your insights! I can confirm that disable of TSO and RX checksum offload on the conduit interface help fix any TCP speed issue and reduced UDP packet loss to a minimum. Regards, Jonas > >> Unsure what is best here, should MAC or switch add the delays? > > It should not matter. 2ns is 2ns... > > Andrew
© 2016 - 2025 Red Hat, Inc.