[PATCH net-next 4/4] net/mlx5: Lag, add net namespace support

Tariq Toukan posted 4 patches 3 weeks ago
There is a newer version of this series
From: Shay Drory <shayd@nvidia.com>

Update the LAG implementation to support net namespace isolation.

With recent changes to the devcom framework allowing namespace-aware
matching, the LAG layer is updated to register devcom clients with the
associated net namespace. This ensures that LAG formation only occurs
between mlx5 interfaces that reside in the same namespace.

This change ensures that devices in different namespaces do not interfere
with each other's LAG setup and behavior. For example, if two PCI PFs are
in the same namespace, they are eligible to form a hardware LAG.

In addition, reload behavior for LAG is adjusted to handle namespace
contexts appropriately.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c |  5 -----
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h |  1 +
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index a0b68321355a..bfa44414be82 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 		return 0;
 	}
 
-	if (mlx5_lag_is_active(dev)) {
-		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
-		return -EOPNOTSUPP;
-	}
-
 	if (mlx5_core_is_mp_slave(dev)) {
 		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
 		return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index ccb22ed13f84..59c00c911275 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -35,6 +35,7 @@
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/eswitch.h>
 #include <linux/mlx5/vport.h>
+#include "lib/mlx5.h"
 #include "lib/devcom.h"
 #include "mlx5_core.h"
 #include "eswitch.h"
@@ -231,9 +232,13 @@ static void mlx5_do_bond_work(struct work_struct *work);
 static void mlx5_ldev_free(struct kref *ref)
 {
 	struct mlx5_lag *ldev = container_of(ref, struct mlx5_lag, ref);
+	struct net *net;
+
+	if (ldev->nb.notifier_call) {
+		net = read_pnet(&ldev->net);
+		unregister_netdevice_notifier_net(net, &ldev->nb);
+	}
 
-	if (ldev->nb.notifier_call)
-		unregister_netdevice_notifier_net(&init_net, &ldev->nb);
 	mlx5_lag_mp_cleanup(ldev);
 	cancel_delayed_work_sync(&ldev->bond_work);
 	destroy_workqueue(ldev->wq);
@@ -271,7 +276,8 @@ static struct mlx5_lag *mlx5_lag_dev_alloc(struct mlx5_core_dev *dev)
 	INIT_DELAYED_WORK(&ldev->bond_work, mlx5_do_bond_work);
 
 	ldev->nb.notifier_call = mlx5_lag_netdev_event;
-	if (register_netdevice_notifier_net(&init_net, &ldev->nb)) {
+	write_pnet(&ldev->net, mlx5_core_net(dev));
+	if (register_netdevice_notifier_net(read_pnet(&ldev->net), &ldev->nb)) {
 		ldev->nb.notifier_call = NULL;
 		mlx5_core_err(dev, "Failed to register LAG netdev notifier\n");
 	}
@@ -1413,6 +1419,8 @@ static int mlx5_lag_register_hca_devcom_comp(struct mlx5_core_dev *dev)
 {
 	struct mlx5_devcom_match_attr attr = {
 		.key.val = mlx5_query_nic_system_image_guid(dev),
+		.flags = MLX5_DEVCOM_MATCH_FLAGS_NS,
+		.net = mlx5_core_net(dev),
 	};
 
 	/* This component is use to sync adding core_dev to lag_dev and to sync
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index c2f256bb2bc2..4918eee2b3da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -67,6 +67,7 @@ struct mlx5_lag {
 	struct workqueue_struct   *wq;
 	struct delayed_work       bond_work;
 	struct notifier_block     nb;
+	possible_net_t net;
 	struct lag_mp             lag_mp;
 	struct mlx5_lag_port_sel  port_sel;
 	/* Protect lag fields/state changes */
-- 
2.31.1
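
The namespace-aware matching described in the commit message can be
modelled outside the kernel. The sketch below is a userspace
approximation only: every `sketch_` name is hypothetical, `struct
sketch_net` stands in for `struct net`, and the comparison rule (same key,
and the same namespace whenever the NS flag is set) is an assumption about
what `MLX5_DEVCOM_MATCH_FLAGS_NS` requests, not a copy of the devcom code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct net; only its address matters here. */
struct sketch_net {
	int id;
};

/* Hypothetical flag, modelled on MLX5_DEVCOM_MATCH_FLAGS_NS. */
#define SKETCH_MATCH_FLAGS_NS (1u << 0)

/* Hypothetical analogue of the match attributes filled in by the patch. */
struct sketch_match_attr {
	uint64_t key;                 /* e.g. the system image GUID */
	uint32_t flags;               /* SKETCH_MATCH_FLAGS_NS or 0 */
	const struct sketch_net *net; /* namespace the device lives in */
};

/*
 * Two devices are peers when their keys are equal and, if either side
 * asks for namespace matching, they also sit in the same namespace.
 * This rule is an assumption made for illustration.
 */
static bool sketch_attr_match(const struct sketch_match_attr *a,
			      const struct sketch_match_attr *b)
{
	assert(a && b);

	if (a->key != b->key)
		return false;

	if ((a->flags | b->flags) & SKETCH_MATCH_FLAGS_NS)
		return a->net == b->net;

	return true;
}
```

Under this model, two PFs that share a GUID and a namespace are eligible
peers, while moving one of them to another namespace stops them from
matching, which is the isolation the commit message describes.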
Re: [PATCH net-next 4/4] net/mlx5: Lag, add net namespace support
Posted by Simon Horman 2 weeks, 6 days ago
On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
> From: Shay Drory <shayd@nvidia.com>
> 
> Update the LAG implementation to support net namespace isolation.
> 
> With recent changes to the devcom framework allowing namespace-aware
> matching, the LAG layer is updated to register devcom clients with the
> associated net namespace. This ensures that LAG formation only occurs
> between mlx5 interfaces that reside in the same namespace.
> 
> This change ensures that devices in different namespaces do not interfere
> with each other's LAG setup and behavior. For example, if two PCI PFs are
> in the same namespace, they are eligible to form a hardware LAG.
> 
> In addition, reload behavior for LAG is adjusted to handle namespace
> contexts appropriately.
> 
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/devlink.c |  5 -----
>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h |  1 +
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> index a0b68321355a..bfa44414be82 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
>  		return 0;
>  	}
>  
> -	if (mlx5_lag_is_active(dev)) {
> -		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
> -		return -EOPNOTSUPP;
> -	}
> -

Maybe I'm missing something obvious. But I think this could do with
some further commentary in the commit message. Or perhaps being a separate
patch.

>  	if (mlx5_core_is_mp_slave(dev)) {
>  		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
>  		return -EOPNOTSUPP;

...

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> index c2f256bb2bc2..4918eee2b3da 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> @@ -67,6 +67,7 @@ struct mlx5_lag {
>  	struct workqueue_struct   *wq;
>  	struct delayed_work       bond_work;
>  	struct notifier_block     nb;
> +	possible_net_t net;

nit: inconsistent indentation.

>  	struct lag_mp             lag_mp;
>  	struct mlx5_lag_port_sel  port_sel;
>  	/* Protect lag fields/state changes */
> -- 
> 2.31.1
> 
>
Re: [PATCH net-next 4/4] net/mlx5: Lag, add net namespace support
Posted by Mark Bloch 2 weeks, 5 days ago

On 12/09/2025 17:09, Simon Horman wrote:
> On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
>> From: Shay Drory <shayd@nvidia.com>
>>
>> Update the LAG implementation to support net namespace isolation.
>>
>> With recent changes to the devcom framework allowing namespace-aware
>> matching, the LAG layer is updated to register devcom clients with the
>> associated net namespace. This ensures that LAG formation only occurs
>> between mlx5 interfaces that reside in the same namespace.
>>
>> This change ensures that devices in different namespaces do not interfere
>> with each other's LAG setup and behavior. For example, if two PCI PFs are
>> in the same namespace, they are eligible to form a hardware LAG.
>>
>> In addition, reload behavior for LAG is adjusted to handle namespace
>> contexts appropriately.
>>
>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx5/core/devlink.c |  5 -----
>>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
>>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h |  1 +
>>  3 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> index a0b68321355a..bfa44414be82 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
>>  		return 0;
>>  	}
>>  
>> -	if (mlx5_lag_is_active(dev)) {
>> -		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
>> -		return -EOPNOTSUPP;
>> -	}
>> -
> 
> Maybe I'm missing something obvious. But I think this could do with
> some further commentary in the commit message. Or perhaps being a separate
> patch.

While one could split this into two patches, first enabling LAG creation
within a namespace and then separately removing the devlink reload
restriction, that separation feels artificial.

Both changes are required to deliver complete support for LAG in namespaces.
Since removing the reload restriction is a trivial change, it is better
to deliver the entire feature in this single patch.

Will clarify and add this justification to the commit message.

> 
>>  	if (mlx5_core_is_mp_slave(dev)) {
>>  		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
>>  		return -EOPNOTSUPP;
> 
> ...
> 
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> index c2f256bb2bc2..4918eee2b3da 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> @@ -67,6 +67,7 @@ struct mlx5_lag {
>>  	struct workqueue_struct   *wq;
>>  	struct delayed_work       bond_work;
>>  	struct notifier_block     nb;
>> +	possible_net_t net;
> 
> nit: inconsistent indentation.

I prefer to avoid vertical alignment. I find that style
is often brittle and makes the code harder to maintain
over time.

Thanks for the review!
Mark

> 
>>  	struct lag_mp             lag_mp;
>>  	struct mlx5_lag_port_sel  port_sel;
>>  	/* Protect lag fields/state changes */
>> -- 
>> 2.31.1
>>
>>
Re: [PATCH net-next 4/4] net/mlx5: Lag, add net namespace support
Posted by Simon Horman 2 weeks, 3 days ago
On Sat, Sep 13, 2025 at 04:52:17AM +0300, Mark Bloch wrote:
> 
> 
> On 12/09/2025 17:09, Simon Horman wrote:
> > On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
> >> From: Shay Drory <shayd@nvidia.com>
> >>
> >> Update the LAG implementation to support net namespace isolation.
> >>
> >> With recent changes to the devcom framework allowing namespace-aware
> >> matching, the LAG layer is updated to register devcom clients with the
> >> associated net namespace. This ensures that LAG formation only occurs
> >> between mlx5 interfaces that reside in the same namespace.
> >>
> >> This change ensures that devices in different namespaces do not interfere
> >> with each other's LAG setup and behavior. For example, if two PCI PFs are
> >> in the same namespace, they are eligible to form a hardware LAG.
> >>
> >> In addition, reload behavior for LAG is adjusted to handle namespace
> >> contexts appropriately.
> >>
> >> Signed-off-by: Shay Drory <shayd@nvidia.com>
> >> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> >> ---
> >>  drivers/net/ethernet/mellanox/mlx5/core/devlink.c |  5 -----
> >>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
> >>  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h |  1 +
> >>  3 files changed, 12 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> index a0b68321355a..bfa44414be82 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
> >>  		return 0;
> >>  	}
> >>  
> >> -	if (mlx5_lag_is_active(dev)) {
> >> -		NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
> >> -		return -EOPNOTSUPP;
> >> -	}
> >> -
> > 
> > Maybe I'm missing something obvious. But I think this could do with
> > some further commentary in the commit message. Or perhaps being a separate
> > patch.
> 
> While one could split this into two patches, first enabling LAG creation
> within a namespace and then separately removing the devlink reload
> restriction, that separation feels artificial.
> 
> Both changes are required to deliver complete support for LAG in namespaces.
> Since removing the reload restriction is a trivial change, it is better
> to deliver the entire feature in this single patch.
> 
> Will clarify and add this justification to the commit message.

Thanks, much appreciated.