For zerocopy (io_uring, devmem), there is an assumption that the
parent device can do DMA. However that is not always the case:
- Scalable Function netdevs [1] have the DMA device in the grandparent.
- For Multi-PF netdevs [2] queues can be associated to different DMA
devices.
This patch introduces a queue-based interface that allows drivers
to expose a different DMA device for zerocopy.
[1] Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst
[2] Documentation/networking/multi-pf-netdev.rst
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
---
include/net/netdev_queues.h | 8 ++++++++
net/core/Makefile | 1 +
net/core/netdev_queues.c | 25 +++++++++++++++++++++++++
3 files changed, 34 insertions(+)
create mode 100644 net/core/netdev_queues.c
diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
index 6e835972abd1..f648de4ca03e 100644
--- a/include/net/netdev_queues.h
+++ b/include/net/netdev_queues.h
@@ -127,6 +127,10 @@ void netdev_stat_queue_sum(struct net_device *netdev,
* @ndo_queue_stop: Stop the RX queue at the specified index. The stopped
* queue's memory is written at the specified address.
*
+ * @ndo_queue_get_dma_dev: Get dma device for zero-copy operations to be used
+ * for this queue. When such device is not available,
+ * the function will return NULL.
+ *
* Note that @ndo_queue_mem_alloc and @ndo_queue_mem_free may be called while
* the interface is closed. @ndo_queue_start and @ndo_queue_stop will only
* be called for an interface which is open.
@@ -144,6 +148,8 @@ struct netdev_queue_mgmt_ops {
int (*ndo_queue_stop)(struct net_device *dev,
void *per_queue_mem,
int idx);
+ struct device * (*ndo_queue_get_dma_dev)(struct net_device *dev,
+ int idx);
};
/**
@@ -321,4 +327,6 @@ static inline void netif_subqueue_sent(const struct net_device *dev,
get_desc, start_thrs); \
})
+struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx);
+
#endif
diff --git a/net/core/Makefile b/net/core/Makefile
index b2a76ce33932..9ef2099c5426 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o
obj-y += net-sysfs.o
obj-y += hotdata.o
obj-y += netdev_rx_queue.o
+obj-y += netdev_queues.o
obj-$(CONFIG_PAGE_POOL) += page_pool.o page_pool_user.o
obj-$(CONFIG_PROC_FS) += net-procfs.o
obj-$(CONFIG_NET_PKTGEN) += pktgen.o
diff --git a/net/core/netdev_queues.c b/net/core/netdev_queues.c
new file mode 100644
index 000000000000..d00f071e0edf
--- /dev/null
+++ b/net/core/netdev_queues.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <net/netdev_queues.h>
+
+/**
+ * netdev_queue_get_dma_dev() - get dma device for zero-copy operations
+ * @dev: net_device
+ * @idx: queue index
+ *
+ * Get dma device for zero-copy operations to be used for this queue.
+ * When such device is not available or valid, the function will return NULL.
+ */
+struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx)
+{
+ const struct netdev_queue_mgmt_ops *queue_ops = dev->queue_mgmt_ops;
+ struct device *dma_dev;
+
+ if (queue_ops && queue_ops->ndo_queue_get_dma_dev)
+ dma_dev = queue_ops->ndo_queue_get_dma_dev(dev, idx);
+ else
+ dma_dev = dev->dev.parent;
+
+ return dma_dev && dma_dev->dma_mask ? dma_dev : NULL;
+}
+EXPORT_SYMBOL(netdev_queue_get_dma_dev);
\ No newline at end of file
--
2.50.1
On Wed, 20 Aug 2025 20:11:52 +0300 Dragos Tatulea wrote:
> + * @ndo_queue_get_dma_dev: Get dma device for zero-copy operations to be used
> + * for this queue. When such device is not available,
> + * the function will return NULL.

nit: I think you're using a different tense/grammar than the doc for
other callbacks (which is admittedly somewhat unusual :$)
Also should we indicate that "not available" is an error? Maybe just:

    Get dma device for zero-copy operations to be used
    for this queue. Return NULL on error.

> * Note that @ndo_queue_mem_alloc and @ndo_queue_mem_free may be called while
> * the interface is closed. @ndo_queue_start and @ndo_queue_stop will only
> * be called for an interface which is open.

> +/**
> + * netdev_queue_get_dma_dev() - get dma device for zero-copy operations
> + * @dev: net_device
> + * @idx: queue index
> + *
> + * Get dma device for zero-copy operations to be used for this queue.
> + * When such device is not available or valid, the function will return NULL.

Unfortunately kdoc really wants us to add Return: statements to all
functions...

> + */
> +struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx)
> +{
> + const struct netdev_queue_mgmt_ops *queue_ops = dev->queue_mgmt_ops;
> + struct device *dma_dev;
> +
> + if (queue_ops && queue_ops->ndo_queue_get_dma_dev)
> + dma_dev = queue_ops->ndo_queue_get_dma_dev(dev, idx);
> + else
> + dma_dev = dev->dev.parent;
> +
> + return dma_dev && dma_dev->dma_mask ? dma_dev : NULL;
> +}
> +EXPORT_SYMBOL(netdev_queue_get_dma_dev);
> \ No newline at end of file

This is in desperate need of a terminating new line.

But also -- why the export? iouring and devmem can't be modules.
On Wed, Aug 20, 2025 at 06:01:00PM -0700, Jakub Kicinski wrote:
> On Wed, 20 Aug 2025 20:11:52 +0300 Dragos Tatulea wrote:
> > + * @ndo_queue_get_dma_dev: Get dma device for zero-copy operations to be used
> > + * for this queue. When such device is not available,
> > + * the function will return NULL.
>
> nit: I think you're using a different tense/grammar than the doc for
> other callbacks (which is admittedly somewhat unusual :$)
> Also should we indicate that "not available" is an error? Maybe just:
>
>     Get dma device for zero-copy operations to be used
>     for this queue. Return NULL on error.
>
Sure. Will fix.

> > * Note that @ndo_queue_mem_alloc and @ndo_queue_mem_free may be called while
> > * the interface is closed. @ndo_queue_start and @ndo_queue_stop will only
> > * be called for an interface which is open.
>
> > +/**
> > + * netdev_queue_get_dma_dev() - get dma device for zero-copy operations
> > + * @dev: net_device
> > + * @idx: queue index
> > + *
> > + * Get dma device for zero-copy operations to be used for this queue.
> > + * When such device is not available or valid, the function will return NULL.
>
> Unfortunately kdoc really wants us to add Return: statements to all
> functions...
>
Ack. Will fix.

> > + */
> > +struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx)
> > +{
> > + const struct netdev_queue_mgmt_ops *queue_ops = dev->queue_mgmt_ops;
> > + struct device *dma_dev;
> > +
> > + if (queue_ops && queue_ops->ndo_queue_get_dma_dev)
> > + dma_dev = queue_ops->ndo_queue_get_dma_dev(dev, idx);
> > + else
> > + dma_dev = dev->dev.parent;
> > +
> > + return dma_dev && dma_dev->dma_mask ? dma_dev : NULL;
> > +}
> > +EXPORT_SYMBOL(netdev_queue_get_dma_dev);
> > \ No newline at end of file
>
> This is in desperate need of a terminating new line.
>
Ack. Will fix.

> But also -- why the export? iouring and devmem can't be modules.
Oh, right! Will replace with a newline.

Thanks,
Dragos
On Wed, Aug 20, 2025 at 10:13 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> For zerocopy (io_uring, devmem), there is an assumption that the
> parent device can do DMA. However that is not always the case:
> - Scalable Function netdevs [1] have the DMA device in the grandparent.
> - For Multi-PF netdevs [2] queues can be associated to different DMA
> devices.
>
> This patch introduces the a queue based interface for allowing drivers
> to expose a different DMA device for zerocopy.
>
> [1] Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst
> [2] Documentation/networking/multi-pf-netdev.rst
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Mina Almasry <almasrymina@google.com>

--
Thanks,
Mina
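
For readers following the thread, the lookup rule in the patch (use the per-queue callback if the driver provides one, otherwise fall back to the parent device, and reject any device without a DMA mask) can be modelled in plain userspace C. This is only a sketch under stated assumptions: the structs below are cut-down stand-ins for the kernel types, and `model_queue_get_dma_dev()` and `sf_get_dma_dev()` are hypothetical names that do not appear in the patch. The kdoc comment uses the "Return:" form requested in review.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down userspace stand-ins for the kernel structs (assumption:
 * only the fields the lookup touches are modelled). */
struct device {
	struct device *parent;
	const unsigned long long *dma_mask; /* NULL: device cannot do DMA */
};

struct net_device;

struct netdev_queue_mgmt_ops {
	struct device *(*ndo_queue_get_dma_dev)(struct net_device *dev,
						int idx);
};

struct net_device {
	struct device dev;
	const struct netdev_queue_mgmt_ops *queue_mgmt_ops;
};

/**
 * model_queue_get_dma_dev() - userspace model of the lookup in the patch
 * @dev: net_device
 * @idx: queue index
 *
 * Return: the DMA device to use for this queue, or NULL on error.
 */
struct device *model_queue_get_dma_dev(struct net_device *dev, int idx)
{
	const struct netdev_queue_mgmt_ops *queue_ops;
	struct device *dma_dev;

	assert(dev != NULL);
	queue_ops = dev->queue_mgmt_ops;

	/* Prefer the driver's per-queue answer, else assume the parent. */
	if (queue_ops && queue_ops->ndo_queue_get_dma_dev)
		dma_dev = queue_ops->ndo_queue_get_dma_dev(dev, idx);
	else
		dma_dev = dev->dev.parent;

	/* A device without a DMA mask is treated as unusable. */
	return dma_dev && dma_dev->dma_mask ? dma_dev : NULL;
}

/* Hypothetical driver callback for the Scalable Function case from the
 * commit message: the DMA-capable device is the grandparent. */
struct device *sf_get_dma_dev(struct net_device *dev, int idx)
{
	(void)idx; /* this sketch uses the same device for every queue */
	return dev->dev.parent ? dev->dev.parent->parent : NULL;
}
```

In this model, a netdev whose parent has no DMA mask yields NULL until the driver installs a callback such as `sf_get_dma_dev()`, after which the grandparent is returned. A Multi-PF driver would presumably choose the device based on `idx` instead.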