[PATCH net-next V2 7/7] net/mlx5e: Defer channels closure to reduce interface down time

Tariq Toukan posted 7 patches 3 months, 1 week ago
Posted by Tariq Toukan 3 months, 1 week ago
Cap bit tis_tir_td_order=1 indicates that an old firmware requirement /
limitation no longer exists. When the bit is unset, the latency of
several firmware commands increases significantly in the presence of a
high number of coexisting channels (both the old and new sets). Hence,
we used to close unneeded old channels before invoking those firmware
commands.

On capable devices, this is no longer necessary. Minimize the interface
down time by deferring the closure of the old channels until after the
new ones are activated.

Perf numbers:
Measured the number of packets dropped in a simple ping flood test
during a configuration change that switches the number of channels
from 247 to 248.

Before: 71 packets lost
After:  15 packets lost, ~80% saving.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4edf64e1572a..9650fa6c6a63 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3384,7 +3384,8 @@ static int mlx5e_switch_priv_channels(struct mlx5e_priv *priv,
 		}
 	}
 
-	mlx5e_close_channels(old_chs);
+	if (!MLX5_CAP_GEN(priv->mdev, tis_tir_td_order))
+		mlx5e_close_channels(old_chs);
 	priv->profile->update_rx(priv);
 
 	mlx5e_selq_apply(&priv->selq);
@@ -3432,6 +3433,9 @@ int mlx5e_safe_switch_params(struct mlx5e_priv *priv,
 	if (err)
 		goto err_close;
 
+	if (MLX5_CAP_GEN(priv->mdev, tis_tir_td_order))
+		mlx5e_close_channels(old_chs);
+
 	kfree(new_chs);
 	kfree(old_chs);
 	return 0;
-- 
2.31.1
Re: [PATCH net-next V2 7/7] net/mlx5e: Defer channels closure to reduce interface down time
Posted by Simon Horman 3 months ago
On Thu, Oct 30, 2025 at 03:32:39PM +0200, Tariq Toukan wrote:
> Cap bit tis_tir_td_order=1 indicates that an old firmware requirement /
> limitation no longer exists. When the bit is unset, the latency of
> several firmware commands increases significantly in the presence of a
> high number of coexisting channels (both the old and new sets). Hence,
> we used to close unneeded old channels before invoking those firmware
> commands.
> 
> On capable devices, this is no longer necessary. Minimize the interface
> down time by deferring the closure of the old channels until after the
> new ones are activated.
> 
> Perf numbers:
> Measured the number of packets dropped in a simple ping flood test
> during a configuration change that switches the number of channels
> from 247 to 248.
> 
> Before: 71 packets lost
> After:  15 packets lost, ~80% saving.
> 
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Carolina Jubran <cjubran@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>
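
For readers following the thread, the reordering described in the commit
message can be modelled in user space. This is only a sketch, not driver
code: the step names, struct switch_trace and model_switch_channels() are
hypothetical, and the boolean argument stands in for
MLX5_CAP_GEN(priv->mdev, tis_tir_td_order). It records the order in which
the old channels are closed relative to the firmware-facing update and
the activation of the new channels.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step names for the model; these are not driver identifiers. */
enum step {
	STEP_CLOSE_OLD,     /* stands in for mlx5e_close_channels(old_chs) */
	STEP_UPDATE_RX,     /* stands in for the firmware-facing update step */
	STEP_ACTIVATE_NEW,  /* stands in for activating the new channel set */
};

#define MAX_STEPS 3

struct switch_trace {
	enum step steps[MAX_STEPS];
	size_t n;
};

static void record(struct switch_trace *t, enum step s)
{
	if (t->n < MAX_STEPS)
		t->steps[t->n++] = s;
}

/*
 * Record the order of operations for one channel switch.
 * cap_td_order is an assumption standing in for the tis_tir_td_order
 * capability bit: when false, old channels are closed first (legacy
 * ordering); when true, they are closed only after the new ones are active.
 */
static void model_switch_channels(bool cap_td_order, struct switch_trace *t)
{
	t->n = 0;

	if (!cap_td_order)
		record(t, STEP_CLOSE_OLD);

	record(t, STEP_UPDATE_RX);
	record(t, STEP_ACTIVATE_NEW);

	if (cap_td_order)
		record(t, STEP_CLOSE_OLD);
}
```

Calling model_switch_channels(false, &t) yields the legacy order (close
old, update, activate new), while model_switch_channels(true, &t) moves
the close to the end, which is the window the patch shortens.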