For zero-copy (devmem, io_uring), the netdev DMA device used
is the parent device of the net device. However that is not
always accurate for mlx5 devices:
- SFs: The parent device is an auxdev.
- Multi-PF netdevs: The DMA device should be determined by
the queue.
This change implements the DMA device queue API that returns the DMA
device appropriately for all cases.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 21bb88c5d3dc..0e48065a46eb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq,
return 0;
}
+static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev,
+ int queue_index)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5e_channels *channels;
+ struct device *pdev = NULL;
+ struct mlx5e_channel *ch;
+
+ channels = &priv->channels;
+
+ mutex_lock(&priv->state_lock);
+
+ if (queue_index >= channels->num)
+ goto out;
+
+ ch = channels->c[queue_index];
+ pdev = ch->pdev;
+out:
+ mutex_unlock(&priv->state_lock);
+
+ return pdev;
+}
+
static const struct netdev_queue_mgmt_ops mlx5e_queue_mgmt_ops = {
.ndo_queue_mem_size = sizeof(struct mlx5_qmgmt_data),
.ndo_queue_mem_alloc = mlx5e_queue_mem_alloc,
.ndo_queue_mem_free = mlx5e_queue_mem_free,
.ndo_queue_start = mlx5e_queue_start,
.ndo_queue_stop = mlx5e_queue_stop,
+ .ndo_queue_get_dma_dev = mlx5e_queue_get_dma_dev,
};
static void mlx5e_build_nic_netdev(struct net_device *netdev)
--
2.50.1
On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > For zero-copy (devmem, io_uring), the netdev DMA device used > is the parent device of the net device. However that is not > always accurate for mlx5 devices: > - SFs: The parent device is an auxdev. > - Multi-PF netdevs: The DMA device should be determined by > the queue. > > This change implements the DMA device queue API that returns the DMA > device appropriately for all cases. > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > --- > .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++ > 1 file changed, 24 insertions(+) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > index 21bb88c5d3dc..0e48065a46eb 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq, > return 0; > } > > +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev, > + int queue_index) > +{ > + struct mlx5e_priv *priv = netdev_priv(dev); > + struct mlx5e_channels *channels; > + struct device *pdev = NULL; > + struct mlx5e_channel *ch; > + > + channels = &priv->channels; > + > + mutex_lock(&priv->state_lock); > + > + if (queue_index >= channels->num) > + goto out; > + > + ch = channels->c[queue_index]; > + pdev = ch->pdev; This code assumes priv is initialized, and probably that the device is up/running/registered. At first I thought that was fine, but now that I look at the code more closely, netdev_nl_bind_rx_doit checks if the device is present but doesn't seem to check that the device is registered. I wonder if we should have a generic check in netdev_nl_bind_rx_doit for NETDEV_REGISTERED, and if not, does this code handle unregistered netdev correctly (like netdev_priv and priv->channels are valid even for unregistered mlx5 devices)? -- Thanks, Mina
On Fri, Aug 15, 2025 at 10:37:15AM -0700, Mina Almasry wrote: > On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > For zero-copy (devmem, io_uring), the netdev DMA device used > > is the parent device of the net device. However that is not > > always accurate for mlx5 devices: > > - SFs: The parent device is an auxdev. > > - Multi-PF netdevs: The DMA device should be determined by > > the queue. > > > > This change implements the DMA device queue API that returns the DMA > > device appropriately for all cases. > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > --- > > .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++ > > 1 file changed, 24 insertions(+) > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > index 21bb88c5d3dc..0e48065a46eb 100644 > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq, > > return 0; > > } > > > > +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev, > > + int queue_index) > > +{ > > + struct mlx5e_priv *priv = netdev_priv(dev); > > + struct mlx5e_channels *channels; > > + struct device *pdev = NULL; > > + struct mlx5e_channel *ch; > > + > > + channels = &priv->channels; > > + > > + mutex_lock(&priv->state_lock); > > + > > + if (queue_index >= channels->num) > > + goto out; > > + > > + ch = channels->c[queue_index]; > > + pdev = ch->pdev; > > This code assumes priv is initialized, and probably that the device is > up/running/registered. At first I thought that was fine, but now that > I look at the code more closely, netdev_nl_bind_rx_doit checks if the > device is present but doesn't seem to check that the device is > registered. > > I wonder if we should have a generic check in netdev_nl_bind_rx_doit > for NETDEV_REGISTERED, and if not, does this code handle unregistered > netdev correctly (like netdev_priv and priv->channels are valid even > for unregistered mlx5 devices)? > netdev_get_by_index_lock() returns non-NULL only when the device is in state NETDEV_REGISTERED or NETREG_UNINITIALIZED. So I think that this check should suffice. Thanks, Dragos
On Mon, Aug 18, 2025 at 10:40 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > On Fri, Aug 15, 2025 at 10:37:15AM -0700, Mina Almasry wrote: > > On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > For zero-copy (devmem, io_uring), the netdev DMA device used > > > is the parent device of the net device. However that is not > > > always accurate for mlx5 devices: > > > - SFs: The parent device is an auxdev. > > > - Multi-PF netdevs: The DMA device should be determined by > > > the queue. > > > > > > This change implements the DMA device queue API that returns the DMA > > > device appropriately for all cases. > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > > --- > > > .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++ > > > 1 file changed, 24 insertions(+) > > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > > index 21bb88c5d3dc..0e48065a46eb 100644 > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > > > @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq, > > > return 0; > > > } > > > > > > +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev, > > > + int queue_index) > > > +{ > > > + struct mlx5e_priv *priv = netdev_priv(dev); > > > + struct mlx5e_channels *channels; > > > + struct device *pdev = NULL; > > > + struct mlx5e_channel *ch; > > > + > > > + channels = &priv->channels; > > > + > > > + mutex_lock(&priv->state_lock); > > > + > > > + if (queue_index >= channels->num) > > > + goto out; > > > + > > > + ch = channels->c[queue_index]; > > > + pdev = ch->pdev; > > > > This code assumes priv is initialized, and probably that the device is > > up/running/registered. At first I thought that was fine, but now that > > I look at the code more closely, netdev_nl_bind_rx_doit checks if the > > device is present but doesn't seem to check that the device is > > registered. > > > > I wonder if we should have a generic check in netdev_nl_bind_rx_doit > > for NETDEV_REGISTERED, and if not, does this code handle unregistered > > netdev correctly (like netdev_priv and priv->channels are valid even > > for unregistered mlx5 devices)? > > > netdev_get_by_index_lock() returns non-NULL only when the device is in > state NETDEV_REGISTERED or NETREG_UNINITIALIZED. So I think that this > check should suffice. > Ack, thanks for checking. For the patch: Reviewed-by: Mina Almasry <almasrymina@google.com> -- Thanks, Mina
© 2016 - 2025 Red Hat, Inc.