From: Shay Drory <shayd@nvidia.com>

Update the LAG implementation to support net namespace isolation.

With recent changes to the devcom framework allowing namespace-aware
matching, the LAG layer is updated to register devcom clients with the
associated net namespace. This ensures that LAG formation only occurs
between mlx5 interfaces that reside in the same namespace.

This change ensures that devices in different namespaces do not interfere
with each other's LAG setup and behavior. For example, if two PCI PFs are
in the same namespace, they are eligible to form a hardware LAG.

In addition, reload behavior for LAG is adjusted to handle namespace
contexts appropriately.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 5 -----
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
3 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index a0b68321355a..bfa44414be82 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
return 0;
}
- if (mlx5_lag_is_active(dev)) {
- NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
- return -EOPNOTSUPP;
- }
-
if (mlx5_core_is_mp_slave(dev)) {
NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index ccb22ed13f84..59c00c911275 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -35,6 +35,7 @@
#include <linux/mlx5/driver.h>
#include <linux/mlx5/eswitch.h>
#include <linux/mlx5/vport.h>
+#include "lib/mlx5.h"
#include "lib/devcom.h"
#include "mlx5_core.h"
#include "eswitch.h"
@@ -231,9 +232,13 @@ static void mlx5_do_bond_work(struct work_struct *work);
static void mlx5_ldev_free(struct kref *ref)
{
struct mlx5_lag *ldev = container_of(ref, struct mlx5_lag, ref);
+ struct net *net;
+
+ if (ldev->nb.notifier_call) {
+ net = read_pnet(&ldev->net);
+ unregister_netdevice_notifier_net(net, &ldev->nb);
+ }
- if (ldev->nb.notifier_call)
- unregister_netdevice_notifier_net(&init_net, &ldev->nb);
mlx5_lag_mp_cleanup(ldev);
cancel_delayed_work_sync(&ldev->bond_work);
destroy_workqueue(ldev->wq);
@@ -271,7 +276,8 @@ static struct mlx5_lag *mlx5_lag_dev_alloc(struct mlx5_core_dev *dev)
INIT_DELAYED_WORK(&ldev->bond_work, mlx5_do_bond_work);
ldev->nb.notifier_call = mlx5_lag_netdev_event;
- if (register_netdevice_notifier_net(&init_net, &ldev->nb)) {
+ write_pnet(&ldev->net, mlx5_core_net(dev));
+ if (register_netdevice_notifier_net(read_pnet(&ldev->net), &ldev->nb)) {
ldev->nb.notifier_call = NULL;
mlx5_core_err(dev, "Failed to register LAG netdev notifier\n");
}
@@ -1413,6 +1419,8 @@ static int mlx5_lag_register_hca_devcom_comp(struct mlx5_core_dev *dev)
{
struct mlx5_devcom_match_attr attr = {
.key.val = mlx5_query_nic_system_image_guid(dev),
+ .flags = MLX5_DEVCOM_MATCH_FLAGS_NS,
+ .net = mlx5_core_net(dev),
};
/* This component is use to sync adding core_dev to lag_dev and to sync
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index c2f256bb2bc2..4918eee2b3da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -67,6 +67,7 @@ struct mlx5_lag {
struct workqueue_struct *wq;
struct delayed_work bond_work;
struct notifier_block nb;
+ possible_net_t net;
struct lag_mp lag_mp;
struct mlx5_lag_port_sel port_sel;
/* Protect lag fields/state changes */
--
2.31.1
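
The namespace-aware matching that the commit message describes can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the devcom implementation: `struct toy_net`, `struct toy_match_attr`, `TOY_MATCH_FLAGS_NS`, `toy_attr_match()` and `toy_count_peers()` are hypothetical names invented for illustration, and the matching rule (equal key, equal flags, and the same namespace when the flag is set) is an assumption about how such a match would plausibly behave.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace (not the kernel's struct net). */
struct toy_net {
	const char *name;
};

/* Hypothetical flag modelling MLX5_DEVCOM_MATCH_FLAGS_NS from the patch. */
#define TOY_MATCH_FLAGS_NS (1u << 0)

/* Toy counterpart of the match attributes a client registers with. */
struct toy_match_attr {
	uint64_t key;              /* models .key.val, e.g. the system image GUID */
	unsigned int flags;        /* TOY_MATCH_FLAGS_NS or 0 */
	const struct toy_net *net; /* consulted only when the NS flag is set */
};

/*
 * Assumed rule: two clients pair up only if key and flags agree and,
 * when namespace matching is requested, both point at the same namespace.
 */
bool toy_attr_match(const struct toy_match_attr *a,
		    const struct toy_match_attr *b)
{
	if (a->key != b->key || a->flags != b->flags)
		return false;
	if ((a->flags & TOY_MATCH_FLAGS_NS) && a->net != b->net)
		return false;
	return true;
}

/* Count how many registered clients the candidate would be grouped with. */
size_t toy_count_peers(const struct toy_match_attr *clients, size_t n,
		       const struct toy_match_attr *candidate)
{
	size_t peers = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_attr_match(&clients[i], candidate))
			peers++;
	return peers;
}
```

In this model, two PFs that share a key and sit in the same `toy_net` match and would be eligible to form a LAG, while a PF with the same key that was moved to another namespace does not match, which mirrors the isolation the patch aims for.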

On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> Update the LAG implementation to support net namespace isolation.
>
> With recent changes to the devcom framework allowing namespace-aware
> matching, the LAG layer is updated to register devcom clients with the
> associated net namespace. This ensures that LAG formation only occurs
> between mlx5 interfaces that reside in the same namespace.
>
> This change ensures that devices in different namespaces do not interfere
> with each other's LAG setup and behavior. For example, if two PCI PFs are
> in the same namespace, they are eligible to form a hardware LAG.
>
> In addition, reload behavior for LAG is adjusted to handle namespace
> contexts appropriately.
>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 5 -----
> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
> 3 files changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> index a0b68321355a..bfa44414be82 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
> return 0;
> }
>
> - if (mlx5_lag_is_active(dev)) {
> - NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
> - return -EOPNOTSUPP;
> - }
> -

Maybe I'm missing something obvious. But I think this could do with
some further commentary in the commit message. Or perhaps being a separate
patch.

> if (mlx5_core_is_mp_slave(dev)) {
> NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
> return -EOPNOTSUPP;

...

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> index c2f256bb2bc2..4918eee2b3da 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
> @@ -67,6 +67,7 @@ struct mlx5_lag {
> struct workqueue_struct *wq;
> struct delayed_work bond_work;
> struct notifier_block nb;
> + possible_net_t net;

nit: inconsistent indentation.

> struct lag_mp lag_mp;
> struct mlx5_lag_port_sel port_sel;
> /* Protect lag fields/state changes */
> --
> 2.31.1
>
>

On 12/09/2025 17:09, Simon Horman wrote:
> On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
>> From: Shay Drory <shayd@nvidia.com>
>>
>> Update the LAG implementation to support net namespace isolation.
>>
>> With recent changes to the devcom framework allowing namespace-aware
>> matching, the LAG layer is updated to register devcom clients with the
>> associated net namespace. This ensures that LAG formation only occurs
>> between mlx5 interfaces that reside in the same namespace.
>>
>> This change ensures that devices in different namespaces do not interfere
>> with each other's LAG setup and behavior. For example, if two PCI PFs are
>> in the same namespace, they are eligible to form a hardware LAG.
>>
>> In addition, reload behavior for LAG is adjusted to handle namespace
>> contexts appropriately.
>>
>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 5 -----
>> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
>> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
>> 3 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> index a0b68321355a..bfa44414be82 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
>> return 0;
>> }
>>
>> - if (mlx5_lag_is_active(dev)) {
>> - NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
>> - return -EOPNOTSUPP;
>> - }
>> -
>
> Maybe I'm missing something obvious. But I think this could do with
> some further commentary in the commit message. Or perhaps being a separate
> patch.

While one could split this into two patches, first enabling LAG creation
within a namespace, then separately removing the devlink reload restriction,
that separation feels artificial.

Both changes are required to deliver complete support for LAG in namespaces.
Since removing the reload restriction is a trivial change, it is better
to deliver the entire feature in this single patch.

Will clarify and add this justification to the commit message.

>
>> if (mlx5_core_is_mp_slave(dev)) {
>> NL_SET_ERR_MSG_MOD(extack, "reload is unsupported for multi port slave");
>> return -EOPNOTSUPP;
>
> ...
>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> index c2f256bb2bc2..4918eee2b3da 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
>> @@ -67,6 +67,7 @@ struct mlx5_lag {
>> struct workqueue_struct *wq;
>> struct delayed_work bond_work;
>> struct notifier_block nb;
>> + possible_net_t net;
>
> nit: inconsistent indentation.

I prefer to avoid vertical alignment. I find that style is often
brittle and makes the code harder to maintain over time.

Thanks for the review!

Mark

>
>> struct lag_mp lag_mp;
>> struct mlx5_lag_port_sel port_sel;
>> /* Protect lag fields/state changes */
>> --
>> 2.31.1
>>
>>

On Sat, Sep 13, 2025 at 04:52:17AM +0300, Mark Bloch wrote:
>
>
> On 12/09/2025 17:09, Simon Horman wrote:
> > On Thu, Sep 11, 2025 at 09:31:07AM +0300, Tariq Toukan wrote:
> >> From: Shay Drory <shayd@nvidia.com>
> >>
> >> Update the LAG implementation to support net namespace isolation.
> >>
> >> With recent changes to the devcom framework allowing namespace-aware
> >> matching, the LAG layer is updated to register devcom clients with the
> >> associated net namespace. This ensures that LAG formation only occurs
> >> between mlx5 interfaces that reside in the same namespace.
> >>
> >> This change ensures that devices in different namespaces do not interfere
> >> with each other's LAG setup and behavior. For example, if two PCI PFs are
> >> in the same namespace, they are eligible to form a hardware LAG.
> >>
> >> In addition, reload behavior for LAG is adjusted to handle namespace
> >> contexts appropriately.
> >>
> >> Signed-off-by: Shay Drory <shayd@nvidia.com>
> >> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> >> ---
> >> drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 5 -----
> >> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 14 +++++++++++---
> >> drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
> >> 3 files changed, 12 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> index a0b68321355a..bfa44414be82 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
> >> @@ -204,11 +204,6 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
> >> return 0;
> >> }
> >>
> >> - if (mlx5_lag_is_active(dev)) {
> >> - NL_SET_ERR_MSG_MOD(extack, "reload is unsupported in Lag mode");
> >> - return -EOPNOTSUPP;
> >> - }
> >> -
> >
> > Maybe I'm missing something obvious. But I think this could do with
> > some further commentary in the commit message. Or perhaps being a separate
> > patch.
>
> While one could split this into two patches, first enabling LAG creation
> within a namespace, then separately removing the devlink reload restriction,
> that separation feels artificial.
>
> Both changes are required to deliver complete support for LAG in namespaces.
> Since removing the reload restriction is a trivial change, it is better
> to deliver the entire feature in this single patch.
>
> Will clarify and add this justification to the commit message.

Thanks, much appreciated.