From: Alexei Lazar <alazar@nvidia.com>
Xon/Xoff sizes are derived from calculations that include
the port speed.
These settings need to be updated and applied whenever the
port speed is changed.
The port speed is typically set after the physical link goes down
and is negotiated as part of the link-up process between the two
connected interfaces.
Update the Xon/Xoff parameters at the point where the new
negotiated speed is established.
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 15eded36b872..e680673ffb72 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -139,6 +139,8 @@ void mlx5e_update_carrier(struct mlx5e_priv *priv)
 	if (up) {
 		netdev_info(priv->netdev, "Link up\n");
 		netif_carrier_on(priv->netdev);
+		mlx5e_port_manual_buffer_config(priv, 0, priv->netdev->mtu,
+						NULL, NULL, NULL);
 	} else {
 		netdev_info(priv->netdev, "Link down\n");
 		netif_carrier_off(priv->netdev);
--
2.34.1
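
The commit message says the Xon/Xoff sizes are derived from calculations that include the port speed. A minimal userspace sketch of that dependency follows, assuming a threshold made of a speed-proportional term plus an MTU-proportional term. The function name, the 40G floor, and all constants are illustrative assumptions for this example and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical floor in Mb/s: this sketch treats any slower link as 40G
 * so the threshold never becomes too small. */
#define SKETCH_MIN_SPEED_MBPS 40000u

/*
 * Illustrative Xoff threshold in bytes (assumed formula, not the driver's
 * authoritative one).
 *   speed_mbps:  negotiated port speed in Mb/s
 *   cable_len_m: cable length in metres
 *   mtu:         netdev MTU in bytes
 */
static inline uint32_t sketch_xoff_bytes(uint32_t speed_mbps,
					 uint32_t cable_len_m,
					 uint32_t mtu)
{
	if (speed_mbps < SKETCH_MIN_SPEED_MBPS)
		speed_mbps = SKETCH_MIN_SPEED_MBPS;

	/* Speed-proportional term (data still in flight after a pause is
	 * requested) plus an MTU-proportional term. */
	return (301 + 216 * cable_len_m / 100) * speed_mbps / 1000 +
	       272 * mtu / 100;
}
```

With a 5 m cable and an MTU of 1500, this sketch yields 35180 bytes at 100G and 128480 bytes at 400G, which is why a threshold computed for the old speed can be wrong after renegotiation.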

On Mon, 25 Aug 2025 17:34:33 +0300 Mark Bloch wrote:
> Xon/Xoff sizes are derived from calculations that include
> the port speed.
> These settings need to be updated and applied whenever the
> port speed is changed.
> [...]

Hi, this is breaking dual host CX7 w/ 28.45.1300 (but I think most
older FW versions, too). Looks like the host is not receiving any
mcast (ping within a subnet doesn't work because the host receives
no ndisc), and most traffic slows down to a trickle.
Lots of rx_prio0_buf_discard increments.

Please TAL ASAP, this change went to LTS last week.

On Wed, 10 Sep 2025 17:00:11 -0700 Jakub Kicinski wrote:
> Hi, this is breaking dual host CX7 w/ 28.45.1300 (but I think most
> older FW versions, too). [...]
>
> Please TAL ASAP, this change went to LTS last week.

Any news on this? I heard that it also breaks DCB/QoS configuration
on 6.12.45 LTS.

On 11/09/2025 16:47, Jakub Kicinski wrote:
> Any news on this? I heard that it also breaks DCB/QoS configuration
> on 6.12.45 LTS.

Hi Jakub,

We are looking into this; once we have anything, I'll update.
Just to make sure, reverting this one commit solves the
issue you are seeing?

Mark

On Thu, 11 Sep 2025 17:25:22 +0300 Mark Bloch wrote:
> We are looking into this; once we have anything, I'll update.
> Just to make sure, reverting this one commit solves the
> issue you are seeing?

It did for me, but Daniel (who is working on the PSP series)
mentioned that he had reverted all three to get net-next working:

  net/mlx5e: Set local Xoff after FW update
  net/mlx5e: Update and set Xon/Xoff upon port speed set
  net/mlx5e: Update and set Xon/Xoff upon MTU set

On 11/09/2025 17:36, Jakub Kicinski wrote:
> It did for me, but Daniel (who is working on the PSP series)
> mentioned that he had reverted all three to get net-next working:
>
>   net/mlx5e: Set local Xoff after FW update
>   net/mlx5e: Update and set Xon/Xoff upon port speed set
>   net/mlx5e: Update and set Xon/Xoff upon MTU set

Hi Jakub,

Thanks for reporting.
We're investigating and will update soon.

Regards,
Tariq

On 15/09/2025 10:38, Tariq Toukan wrote:
> Thanks for reporting.
> We're investigating and will update soon.

Hi,

We prefer reverting the single patch [1] for now. We'll submit a fixed
version later.

Regarding the other two patches [2], initial testing showed no issues.
Can you/Daniel share more info? What issues do you see, and what are
the repro steps?

Thanks,
Tariq

[1]
net/mlx5e: Update and set Xon/Xoff upon port speed set

[2]
net/mlx5e: Set local Xoff after FW update
net/mlx5e: Update and set Xon/Xoff upon MTU set

On 9/17/25 6:39 AM, Tariq Toukan wrote:
> We prefer reverting the single patch [1] for now. We'll submit a fixed
> version later.
>
> Regarding the other two patches [2], initial testing showed no issues.
> Can you/Daniel share more info? What issues you see, and the repro steps.
>
> [1]
> net/mlx5e: Update and set Xon/Xoff upon port speed set
>
> [2]
> net/mlx5e: Set local Xoff after FW update
> net/mlx5e: Update and set Xon/Xoff upon MTU set

Hello Tariq,

My notes for the situation were that I was running a vanilla net-next
kernel on a dual host, CX7 system, with the 28.45.1300 FW at commit:

  deb105f49879 net: phy: marvell: Fix 88e1510 downshift counter errata

and I was having the issues that Jakub described. No ping working in a
subnet. Extremely slow bandwidth on a large transfer. My notes say that
reverting just [1] (from your message) did not fix the problem, but then
reverting [2] and [3] restored normal behavior.

However, I did attempt to reproduce again on the same system this
morning, and now I'm seeing that reverting just [1] is sufficient to fix
the issues.

On 17/09/2025 16:00, Daniel Zahka wrote:
> My notes say that
> reverting just [1] (from your message) did not fix the problem, but then
> reverting [2] and [3] restored normal behavior.
>
> However, I did attempt to reproduce again on the same system this
> morning, and now I'm seeing that reverting just [1] is sufficient to fix
> the issues.

I see.
For now, I'll submit a revert only for [1].
Let us know of any related issue you still hit after the revert.

Thanks.