For zero-copy (devmem, io_uring), the netdev DMA device used
is the parent device of the net device. However that is not
always accurate for mlx5 devices:
- SFs: The parent device is an auxdev.
- Multi-PF netdevs: The DMA device should be determined by
the queue.
This change implements the DMA device queue API that returns the DMA
device appropriately for all cases.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 21bb88c5d3dc..0e48065a46eb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq,
return 0;
}
+static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev,
+ int queue_index)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5e_channels *channels;
+ struct device *pdev = NULL;
+ struct mlx5e_channel *ch;
+
+ channels = &priv->channels;
+
+ mutex_lock(&priv->state_lock);
+
+ if (queue_index >= channels->num)
+ goto out;
+
+ ch = channels->c[queue_index];
+ pdev = ch->pdev;
+out:
+ mutex_unlock(&priv->state_lock);
+
+ return pdev;
+}
+
static const struct netdev_queue_mgmt_ops mlx5e_queue_mgmt_ops = {
.ndo_queue_mem_size = sizeof(struct mlx5_qmgmt_data),
.ndo_queue_mem_alloc = mlx5e_queue_mem_alloc,
.ndo_queue_mem_free = mlx5e_queue_mem_free,
.ndo_queue_start = mlx5e_queue_start,
.ndo_queue_stop = mlx5e_queue_stop,
+ .ndo_queue_get_dma_dev = mlx5e_queue_get_dma_dev,
};
static void mlx5e_build_nic_netdev(struct net_device *netdev)
--
2.50.1
On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> For zero-copy (devmem, io_uring), the netdev DMA device used
> is the parent device of the net device. However that is not
> always accurate for mlx5 devices:
> - SFs: The parent device is an auxdev.
> - Multi-PF netdevs: The DMA device should be determined by
> the queue.
>
> This change implements the DMA device queue API that returns the DMA
> device appropriately for all cases.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> ---
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 21bb88c5d3dc..0e48065a46eb 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq,
> return 0;
> }
>
> +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev,
> + int queue_index)
> +{
> + struct mlx5e_priv *priv = netdev_priv(dev);
> + struct mlx5e_channels *channels;
> + struct device *pdev = NULL;
> + struct mlx5e_channel *ch;
> +
> + channels = &priv->channels;
> +
> + mutex_lock(&priv->state_lock);
> +
> + if (queue_index >= channels->num)
> + goto out;
> +
> + ch = channels->c[queue_index];
> + pdev = ch->pdev;
This code assumes priv is initialized, and probably that the device is
up/running/registered. At first I thought that was fine, but now that
I look at the code more closely, netdev_nl_bind_rx_doit checks if the
device is present but doesn't seem to check that the device is
registered.
I wonder if we should have a generic check in netdev_nl_bind_rx_doit
for NETDEV_REGISTERED, and if not, does this code handle unregistered
netdev correctly (like netdev_priv and priv->channels are valid even
for unregistered mlx5 devices)?
--
Thanks,
Mina
On Fri, Aug 15, 2025 at 10:37:15AM -0700, Mina Almasry wrote:
> On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> >
> > For zero-copy (devmem, io_uring), the netdev DMA device used
> > is the parent device of the net device. However that is not
> > always accurate for mlx5 devices:
> > - SFs: The parent device is an auxdev.
> > - Multi-PF netdevs: The DMA device should be determined by
> > the queue.
> >
> > This change implements the DMA device queue API that returns the DMA
> > device appropriately for all cases.
> >
> > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > ---
> > .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index 21bb88c5d3dc..0e48065a46eb 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq,
> > return 0;
> > }
> >
> > +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev,
> > + int queue_index)
> > +{
> > + struct mlx5e_priv *priv = netdev_priv(dev);
> > + struct mlx5e_channels *channels;
> > + struct device *pdev = NULL;
> > + struct mlx5e_channel *ch;
> > +
> > + channels = &priv->channels;
> > +
> > + mutex_lock(&priv->state_lock);
> > +
> > + if (queue_index >= channels->num)
> > + goto out;
> > +
> > + ch = channels->c[queue_index];
> > + pdev = ch->pdev;
>
> This code assumes priv is initialized, and probably that the device is
> up/running/registered. At first I thought that was fine, but now that
> I look at the code more closely, netdev_nl_bind_rx_doit checks if the
> device is present but doesn't seem to check that the device is
> registered.
>
> I wonder if we should have a generic check in netdev_nl_bind_rx_doit
> for NETDEV_REGISTERED, and if not, does this code handle unregistered
> netdev correctly (like netdev_priv and priv->channels are valid even
> for unregistered mlx5 devices)?
>
netdev_get_by_index_lock() returns non-NULL only when the device is in
state NETDEV_REGISTERED or NETREG_UNINITIALIZED. So I think that this
check should suffice.
Thanks,
Dragos
On Mon, Aug 18, 2025 at 10:40 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Fri, Aug 15, 2025 at 10:37:15AM -0700, Mina Almasry wrote:
> > On Fri, Aug 15, 2025 at 4:07 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > For zero-copy (devmem, io_uring), the netdev DMA device used
> > > is the parent device of the net device. However that is not
> > > always accurate for mlx5 devices:
> > > - SFs: The parent device is an auxdev.
> > > - Multi-PF netdevs: The DMA device should be determined by
> > > the queue.
> > >
> > > This change implements the DMA device queue API that returns the DMA
> > > device appropriately for all cases.
> > >
> > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > ---
> > > .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++++++++
> > > 1 file changed, 24 insertions(+)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > index 21bb88c5d3dc..0e48065a46eb 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > > @@ -5625,12 +5625,36 @@ static int mlx5e_queue_start(struct net_device *dev, void *newq,
> > > return 0;
> > > }
> > >
> > > +static struct device *mlx5e_queue_get_dma_dev(struct net_device *dev,
> > > + int queue_index)
> > > +{
> > > + struct mlx5e_priv *priv = netdev_priv(dev);
> > > + struct mlx5e_channels *channels;
> > > + struct device *pdev = NULL;
> > > + struct mlx5e_channel *ch;
> > > +
> > > + channels = &priv->channels;
> > > +
> > > + mutex_lock(&priv->state_lock);
> > > +
> > > + if (queue_index >= channels->num)
> > > + goto out;
> > > +
> > > + ch = channels->c[queue_index];
> > > + pdev = ch->pdev;
> >
> > This code assumes priv is initialized, and probably that the device is
> > up/running/registered. At first I thought that was fine, but now that
> > I look at the code more closely, netdev_nl_bind_rx_doit checks if the
> > device is present but doesn't seem to check that the device is
> > registered.
> >
> > I wonder if we should have a generic check in netdev_nl_bind_rx_doit
> > for NETDEV_REGISTERED, and if not, does this code handle unregistered
> > netdev correctly (like netdev_priv and priv->channels are valid even
> > for unregistered mlx5 devices)?
> >
> netdev_get_by_index_lock() returns non-NULL only when the device is in
> state NETDEV_REGISTERED or NETREG_UNINITIALIZED. So I think that this
> check should suffice.
>
Ack, thanks for checking. For the patch:
Reviewed-by: Mina Almasry <almasrymina@google.com>
--
Thanks,
Mina
© 2016 - 2026 Red Hat, Inc.