From nobody Mon Jun 8 09:51:36 2026
From: Bobby Eshleman
Date: Wed, 03 Jun 2026 17:42:58 -0700
Subject: [PATCH net-next 1/4] net: devmem: allow rx-buf-size > PAGE_SIZE per dmabuf binding
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260603-tcpdm-large-niovs-v1-1-f37a4ac6726c@meta.com>
References: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
In-Reply-To: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
To: Donald Hunter, Jakub Kicinski, "David S. Miller", Eric Dumazet,
 Paolo Abeni, Simon Horman, Andrew Lunn, Gerd Hoffmann, Vivek Kasireddy,
 Sumit Semwal, Christian König, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
 sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
 almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
 dw@davidwei.uk, Bobby Eshleman
X-Mailer: b4 0.14.3

From: Bobby Eshleman

Every devmem dmabuf binding today hands the page_pool PAGE_SIZE niovs.
This caps a single RX descriptor at PAGE_SIZE, burning CPU on buffer
churn for large flows.

Add a bind-time netlink attribute, NETDEV_A_DMABUF_RX_BUF_SIZE, that
lets userspace request a larger niov size. The value must be a power of
two >= PAGE_SIZE.
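
As a rough userspace illustration of the rule above (not the kernel code itself), the sketch below validates a requested rx-buf-size and derives the shift used to convert between niov index and byte offset. The `ex_` helpers and the fixed 4 KiB page size are assumptions made for the example; the kernel uses is_power_of_2(), ilog2() and the real PAGE_SIZE/PAGE_SHIFT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define EX_PAGE_SHIFT 12u
#define EX_PAGE_SIZE  (1u << EX_PAGE_SHIFT)

/* Userspace stand-in for the kernel's is_power_of_2(). */
static bool ex_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Userspace stand-in for ilog2(); only valid for a nonzero power of two. */
static unsigned int ex_ilog2(uint32_t n)
{
	unsigned int shift = 0;

	assert(ex_is_power_of_2(n));
	while (n > 1) {
		n >>= 1;
		shift++;
	}
	return shift;
}

/* Return the niov shift for a requested rx-buf-size, or -1 if rejected. */
static int ex_rx_buf_size_to_shift(uint32_t rx_buf_size)
{
	if (!ex_is_power_of_2(rx_buf_size) || rx_buf_size < EX_PAGE_SIZE)
		return -1;
	return (int)ex_ilog2(rx_buf_size);
}

/* Byte offset of niov @idx from the start of its chunk: idx << shift. */
static uint64_t ex_niov_offset(uint32_t idx, unsigned int shift)
{
	return (uint64_t)idx << shift;
}

/* Inverse mapping: index of the niov covering byte @offset of the chunk. */
static uint64_t ex_niov_index(uint64_t offset, unsigned int shift)
{
	return offset >> shift;
}
```

Storing the shift rather than the size is what lets every former PAGE_SIZE division or multiplication become a shift by a per-binding value.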

Measurements
------------

Setup: kperf in devmem RX/TX cuda mode, 4 flows, 64 MB messages, 60s,
dctcp, num-rx-queues=4, dmabuf-rx/tx-size-mb=2048, 10 runs per niov
size, mlx5.

CPU Util:

  niov   net sirq %         net idle %         app sys %          app idle %
  -----  ----------------   ----------------   ----------------   ----------------
  4K     62.38 +/- 8.27     33.40 +/- 7.51     54.15 +/- 10.23    43.67 +/- 10.53
  16K    58.91 +/- 5.35     35.23 +/- 5.88     41.05 +/- 8.87     56.42 +/- 9.24
  32K    64.12 +/- 0.68     31.09 +/- 1.48     44.54 +/- 3.51     52.63 +/- 3.65
  64K    54.69 +/- 5.54     39.67 +/- 5.81     35.47 +/- 3.11     61.97 +/- 3.27

RX app sys % drops by ~19 percentage points (54.15 to 35.47) from 4K to
64K.

Throughput:

  niov   RX dev Gbps        RX flow avg Gbps
  -----  ----------------   -----------------
  4K     300.63 +/- 53.21   75.16 +/- 13.30
  16K    321.35 +/- 28.20   80.34 +/- 7.05
  32K    347.63 +/- 2.20    86.91 +/- 0.55
  64K    332.11 +/- 14.26   83.03 +/- 3.56

Throughput appears to increase, but the stdev is wide enough that this
could just be noise.

kperf support (not yet merged):
https://github.com/facebookexperimental/kperf/commit/8837577f920876bce6986ec18869ac04439ebcd2

Signed-off-by: Bobby Eshleman
---
 Documentation/netlink/specs/netdev.yaml |  8 +++++
 include/uapi/linux/netdev.h             |  1 +
 net/core/devmem.c                       | 52 +++++++++++++++++++--------------
 net/core/devmem.h                       | 13 ++++++---
 net/core/netdev-genl-gen.c              |  5 ++--
 net/core/netdev-genl.c                  | 18 ++++++++++--
 tools/include/uapi/linux/netdev.h       |  1 +
 7 files changed, 68 insertions(+), 30 deletions(-)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index a1f4c5a561e9..063119907983 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -591,6 +591,13 @@ attribute-sets:
         type: u32
         checks:
           min: 1
+      -
+        name: rx-buf-size
+        doc: |
+          Size in bytes of each RX buffer the NIC writes into from the bound
+          dmabuf. Must be a power of two and >= PAGE_SIZE; defaults to
+          PAGE_SIZE.
+        type: u32
 
 operations:
   list:
@@ -805,6 +812,7 @@ operations:
             - ifindex
             - fd
             - queues
+            - rx-buf-size
         reply:
           attributes:
             - id
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 7df1056a35fd..180a4ffffd60 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -217,6 +217,7 @@ enum {
 	NETDEV_A_DMABUF_QUEUES,
 	NETDEV_A_DMABUF_FD,
 	NETDEV_A_DMABUF_ID,
+	NETDEV_A_DMABUF_RX_BUF_SIZE,
 
 	__NETDEV_A_DMABUF_MAX,
 	NETDEV_A_DMABUF_MAX = (__NETDEV_A_DMABUF_MAX - 1)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 957d6b96216b..5a1c0d7984a8 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -46,7 +46,7 @@ static dma_addr_t net_devmem_get_dma_addr(const struct net_iov *niov)
 
 	owner = net_devmem_iov_to_chunk_owner(niov);
 	return owner->base_dma_addr +
-	       ((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
+	       ((dma_addr_t)net_iov_idx(niov) << owner->binding->niov_shift);
 }
 
 static void net_devmem_dmabuf_binding_release(struct percpu_ref *ref)
@@ -93,13 +93,14 @@ net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
 	ssize_t offset;
 	ssize_t index;
 
-	dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
+	dma_addr = gen_pool_alloc_owner(binding->chunk_pool,
+					1UL << binding->niov_shift,
 					(void **)&owner);
 	if (!dma_addr)
 		return NULL;
 
 	offset = dma_addr - owner->base_dma_addr;
-	index = offset / PAGE_SIZE;
+	index = offset >> binding->niov_shift;
 	niov = &owner->area.niovs[index];
 
 	niov->desc.pp_magic = 0;
@@ -113,12 +114,13 @@ void net_devmem_free_dmabuf(struct net_iov *niov)
 {
 	struct net_devmem_dmabuf_binding *binding = net_devmem_iov_binding(niov);
 	unsigned long dma_addr = net_devmem_get_dma_addr(niov);
+	size_t niov_size = 1UL << binding->niov_shift;
 
 	if (WARN_ON(!gen_pool_has_addr(binding->chunk_pool, dma_addr,
-				       PAGE_SIZE)))
+				       niov_size)))
 		return;
 
-	gen_pool_free(binding->chunk_pool, dma_addr, PAGE_SIZE);
+	gen_pool_free(binding->chunk_pool, dma_addr, niov_size);
 }
 
 void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
@@ -163,6 +165,9 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
 	u32 xa_idx;
 	int err;
 
+	if (binding->niov_shift != PAGE_SHIFT)
+		mp_params.rx_page_size = 1U << binding->niov_shift;
+
 	err = netif_mp_open_rxq(dev, rxq_idx, &mp_params, extack);
 	if (err)
 		return err;
@@ -184,14 +189,16 @@ struct net_devmem_dmabuf_binding *
 net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 		       struct device *dma_dev,
 		       enum dma_data_direction direction,
-		       unsigned int dmabuf_fd, struct netdev_nl_sock *priv,
+		       unsigned int dmabuf_fd, unsigned int niov_shift,
+		       struct netdev_nl_sock *priv,
 		       struct netlink_ext_ack *extack)
 {
 	struct net_devmem_dmabuf_binding *binding;
+	size_t niov_size = 1UL << niov_shift;
 	static u32 id_alloc_next;
+	unsigned int sg_idx, i;
 	struct scatterlist *sg;
 	struct dma_buf *dmabuf;
-	unsigned int sg_idx, i;
 	unsigned long virtual;
 	int err;
 
@@ -213,6 +220,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 
 	binding->dev = dev;
 	binding->vdev = vdev;
+	binding->niov_shift = niov_shift;
 	xa_init_flags(&binding->bound_rxqs, XA_FLAGS_ALLOC);
 
 	err = percpu_ref_init(&binding->ref,
@@ -248,18 +256,14 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 			goto err_unmap;
 		}
 		binding->tx_vec = kvmalloc_objs(struct net_iov *,
-						dmabuf->size / PAGE_SIZE);
+						dmabuf->size >> niov_shift);
 		if (!binding->tx_vec) {
 			err = -ENOMEM;
 			goto err_unmap;
 		}
 	}
 
-	/* For simplicity we expect to make PAGE_SIZE allocations, but the
-	 * binding can be much more flexible than that. We may be able to
-	 * allocate MTU sized chunks here. Leave that for future work...
-	 */
-	binding->chunk_pool = gen_pool_create(PAGE_SHIFT,
+	binding->chunk_pool = gen_pool_create(niov_shift,
 					      dev_to_node(&dev->dev));
 	if (!binding->chunk_pool) {
 		err = -ENOMEM;
@@ -273,9 +277,11 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 		size_t len = sg_dma_len(sg);
 		struct net_iov *niov;
 
-		if (!IS_ALIGNED(len, PAGE_SIZE)) {
+		if (!IS_ALIGNED(dma_addr, niov_size) ||
+		    !IS_ALIGNED(len, niov_size)) {
 			err = -EINVAL;
-			NL_SET_ERR_MSG(extack, "dma-buf SG length must be PAGE_SIZE aligned");
+			NL_SET_ERR_MSG(extack,
+				       "dmabuf sg entry not aligned to niov size");
 			goto err_free_chunks;
 		}
 
@@ -288,7 +294,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 
 		owner->area.base_virtual = virtual;
 		owner->base_dma_addr = dma_addr;
-		owner->area.num_niovs = len / PAGE_SIZE;
+		owner->area.num_niovs = len >> niov_shift;
 		owner->binding = binding;
 
 		err = gen_pool_add_owner(binding->chunk_pool, dma_addr,
@@ -313,7 +319,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
 			if (direction == DMA_TO_DEVICE)
-				binding->tx_vec[owner->area.base_virtual / PAGE_SIZE + i] = niov;
+				binding->tx_vec[(owner->area.base_virtual >> niov_shift) + i] = niov;
 		}
 
 		virtual += len;
@@ -430,13 +436,15 @@ struct net_iov *
 net_devmem_get_niov_at(struct net_devmem_dmabuf_binding *binding,
 		       size_t virt_addr, size_t *off, size_t *size)
 {
+	size_t niov_size = 1UL << binding->niov_shift;
+
 	if (virt_addr >= binding->dmabuf->size)
 		return NULL;
 
-	*off = virt_addr % PAGE_SIZE;
-	*size = PAGE_SIZE - *off;
+	*off = virt_addr & (niov_size - 1);
+	*size = niov_size - *off;
 
-	return binding->tx_vec[virt_addr / PAGE_SIZE];
+	return binding->tx_vec[virt_addr >> binding->niov_shift];
 }
 
 /*** "Dmabuf devmem memory provider" ***/
@@ -454,8 +462,8 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
 	pool->dma_sync = false;
 	pool->dma_sync_for_cpu = false;
 
-	if (pool->p.order != 0)
-		return -E2BIG;
+	if (pool->p.order != binding->niov_shift - PAGE_SHIFT)
+		return -EINVAL;
 
 	net_devmem_dmabuf_binding_get(binding);
 	return 0;
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 3852a56036cb..4a293a7d1149 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -71,6 +71,8 @@ struct net_devmem_dmabuf_binding {
 	 */
 	struct net_iov **tx_vec;
 
+	unsigned int niov_shift;
+
 	struct work_struct unbind_w;
 };
 
@@ -93,7 +95,8 @@ struct net_devmem_dmabuf_binding *
 net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 		       struct device *dma_dev,
 		       enum dma_data_direction direction,
-		       unsigned int dmabuf_fd, struct netdev_nl_sock *priv,
+		       unsigned int dmabuf_fd, unsigned int niov_shift,
+		       struct netdev_nl_sock *priv,
 		       struct netlink_ext_ack *extack);
 struct net_devmem_dmabuf_binding *net_devmem_lookup_dmabuf(u32 id);
 void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding);
@@ -122,10 +125,11 @@ static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
 
 static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
 {
-	struct net_iov_area *owner = net_iov_owner(niov);
+	struct dmabuf_genpool_chunk_owner *co =
+		net_devmem_iov_to_chunk_owner(niov);
 
-	return owner->base_virtual +
-	       ((unsigned long)net_iov_idx(niov) << PAGE_SHIFT);
+	return net_iov_owner(niov)->base_virtual +
+	       ((unsigned long)net_iov_idx(niov) << co->binding->niov_shift);
 }
 
 static inline bool
@@ -175,6 +179,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, void *vdev,
 		       struct device *dma_dev,
 		       enum dma_data_direction direction,
 		       unsigned int dmabuf_fd,
+		       unsigned int niov_shift,
 		       struct netdev_nl_sock *priv,
 		       struct netlink_ext_ack *extack)
 {
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index c7e138bfe345..55e03b9cd227 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -106,10 +106,11 @@ static const struct nla_policy netdev_qstats_get_nl_policy[NETDEV_A_QSTATS_SCOPE
 };
 
 /* NETDEV_CMD_BIND_RX - do */
-static const struct nla_policy netdev_bind_rx_nl_policy[NETDEV_A_DMABUF_FD + 1] = {
+static const struct nla_policy netdev_bind_rx_nl_policy[NETDEV_A_DMABUF_RX_BUF_SIZE + 1] = {
 	[NETDEV_A_DMABUF_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
 	[NETDEV_A_DMABUF_FD] = { .type = NLA_U32, },
 	[NETDEV_A_DMABUF_QUEUES] = NLA_POLICY_NESTED(netdev_queue_id_nl_policy),
+	[NETDEV_A_DMABUF_RX_BUF_SIZE] = { .type = NLA_U32, },
 };
 
 /* NETDEV_CMD_NAPI_SET - do */
@@ -219,7 +220,7 @@ static const struct genl_split_ops netdev_nl_ops[] = {
 		.cmd		= NETDEV_CMD_BIND_RX,
 		.doit		= netdev_nl_bind_rx_doit,
 		.policy		= netdev_bind_rx_nl_policy,
-		.maxattr	= NETDEV_A_DMABUF_FD,
+		.maxattr	= NETDEV_A_DMABUF_RX_BUF_SIZE,
 		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
 	},
 	{
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index b4d48f3672a5..9902a97698f5 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -1012,6 +1012,7 @@ int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info)
 {
 	struct net_devmem_dmabuf_binding *binding;
 	u32 ifindex, dmabuf_fd, rxq_idx;
+	unsigned int niov_shift = PAGE_SHIFT;
 	struct netdev_nl_sock *priv;
 	struct net_device *netdev;
 	unsigned long *rxq_bitmap;
@@ -1028,6 +1029,18 @@ int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info)
 	ifindex = nla_get_u32(info->attrs[NETDEV_A_DEV_IFINDEX]);
 	dmabuf_fd = nla_get_u32(info->attrs[NETDEV_A_DMABUF_FD]);
 
+	if (info->attrs[NETDEV_A_DMABUF_RX_BUF_SIZE]) {
+		u32 rx_buf_size = nla_get_u32(info->attrs[NETDEV_A_DMABUF_RX_BUF_SIZE]);
+
+		if (!rx_buf_size || !is_power_of_2(rx_buf_size) ||
+		    rx_buf_size < PAGE_SIZE) {
+			NL_SET_ERR_MSG(info->extack,
+				       "rx_buf_size must be a power of 2 >= PAGE_SIZE");
+			return -EINVAL;
+		}
+		niov_shift = ilog2(rx_buf_size);
+	}
+
 	priv = genl_sk_priv_get(&netdev_nl_family, NETLINK_CB(skb).sk);
 	if (IS_ERR(priv))
 		return PTR_ERR(priv);
@@ -1078,7 +1091,8 @@ int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	binding = net_devmem_bind_dmabuf(netdev, NULL, dma_dev, DMA_FROM_DEVICE,
-					 dmabuf_fd, priv, info->extack);
+					 dmabuf_fd, niov_shift, priv,
+					 info->extack);
 	if (IS_ERR(binding)) {
 		err = PTR_ERR(binding);
 		goto err_rxq_bitmap;
@@ -1221,7 +1235,7 @@ int netdev_nl_bind_tx_doit(struct sk_buff *skb, struct genl_info *info)
 	binding = net_devmem_bind_dmabuf(bind_dev,
 					 bind_dev != netdev ? netdev : NULL,
 					 dma_dev, DMA_TO_DEVICE, dmabuf_fd,
-					 priv, info->extack);
+					 PAGE_SHIFT, priv, info->extack);
 	if (IS_ERR(binding)) {
 		err = PTR_ERR(binding);
 		goto err_unlock_bind_dev;
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 7df1056a35fd..180a4ffffd60 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -217,6 +217,7 @@ enum {
 	NETDEV_A_DMABUF_QUEUES,
 	NETDEV_A_DMABUF_FD,
 	NETDEV_A_DMABUF_ID,
+	NETDEV_A_DMABUF_RX_BUF_SIZE,
 
 	__NETDEV_A_DMABUF_MAX,
 	NETDEV_A_DMABUF_MAX = (__NETDEV_A_DMABUF_MAX - 1)
-- 
2.53.0-Meta

From nobody Mon Jun 8 09:51:36 2026
From: Bobby Eshleman
Date: Wed, 03 Jun 2026 17:42:59 -0700
Subject: [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260603-tcpdm-large-niovs-v1-2-f37a4ac6726c@meta.com>
References: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
In-Reply-To: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
To: Donald Hunter, Jakub Kicinski, "David S. Miller", Eric Dumazet,
 Paolo Abeni, Simon Horman, Andrew Lunn, Gerd Hoffmann, Vivek Kasireddy,
 Sumit Semwal, Christian König, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
 sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
 almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
 dw@davidwei.uk, Bobby Eshleman
X-Mailer: b4 0.14.3

From: Bobby Eshleman

get_sg_table() emits one PAGE_SIZE sg entry per page even when the
underlying folio is larger. Instead, walk folios[] and emit one sg entry
per contiguous run of pages backed by the same folio. When folios
represent large pages (as is the case for MFD_HUGETLB), each sg entry
covers a large page. Normal PAGE_SIZE sg tables are unchanged.

Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
udmabuf.

Signed-off-by: Bobby Eshleman
---
 drivers/dma-buf/udmabuf.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 94b8ecb892bb..f28dd3788ada 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -141,26 +141,63 @@ static void vunmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
 	vm_unmap_ram(map->vaddr, ubuf->pagecount);
 }
 
+/* Return the number of contiguous pages backed by the folio at @i.
+ * A udmabuf may map only part of a folio, or reference the same folio
+ * in multiple non-contiguous runs, so folio_nr_pages() can't be used.
+ */
+static pgoff_t udmabuf_folio_nr_pages(struct udmabuf *ubuf, pgoff_t i)
+{
+	struct folio *f = ubuf->folios[i];
+	pgoff_t j;
+
+	for (j = 1; i + j < ubuf->pagecount; j++) {
+		if (ubuf->folios[i + j] != f)
+			break;
+		/* Same folio, but not a sequential offset within it. */
+		if (ubuf->offsets[i + j] != ubuf->offsets[i] + j * PAGE_SIZE)
+			break;
+	}
+	return j;
+}
+
+/* Count the contiguous folio runs in @ubuf, one sg entry per run. */
+static unsigned int udmabuf_sg_nents(struct udmabuf *ubuf)
+{
+	unsigned int nents = 0;
+	pgoff_t i;
+
+	for (i = 0; i < ubuf->pagecount; i += udmabuf_folio_nr_pages(ubuf, i))
+		nents++;
+	return nents;
+}
+
 static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
 				     enum dma_data_direction direction)
 {
 	struct udmabuf *ubuf = buf->priv;
-	struct sg_table *sg;
 	struct scatterlist *sgl;
-	unsigned int i = 0;
+	struct sg_table *sg;
+	pgoff_t i, run;
+	unsigned int nents;
 	int ret;
 
+	nents = udmabuf_sg_nents(ubuf);
+
 	sg = kzalloc_obj(*sg);
 	if (!sg)
 		return ERR_PTR(-ENOMEM);
 
-	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
+	ret = sg_alloc_table(sg, nents, GFP_KERNEL);
 	if (ret < 0)
 		goto err_alloc;
 
-	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i)
-		sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
+	sgl = sg->sgl;
+	for (i = 0; i < ubuf->pagecount; i += run) {
+		run = udmabuf_folio_nr_pages(ubuf, i);
+		sg_set_folio(sgl, ubuf->folios[i], run << PAGE_SHIFT,
 			     ubuf->offsets[i]);
+		sgl = sg_next(sgl);
+	}
 
 	ret = dma_map_sgtable(dev, sg, direction, 0);
 	if (ret < 0)
-- 
2.53.0-Meta

From nobody Mon Jun 8 09:51:36 2026
From: Bobby Eshleman
Date: Wed, 03 Jun 2026 17:43:00 -0700
Subject: [PATCH net-next 3/4] selftests/net: ncdevmem: add -b option to set rx-buf-size on
 bind
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260603-tcpdm-large-niovs-v1-3-f37a4ac6726c@meta.com>
References: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
In-Reply-To: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
To: Donald Hunter, Jakub Kicinski, "David S. Miller", Eric Dumazet,
 Paolo Abeni, Simon Horman, Andrew Lunn, Gerd Hoffmann, Vivek Kasireddy,
 Sumit Semwal, Christian König, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
 sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
 almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
 dw@davidwei.uk, Bobby Eshleman
X-Mailer: b4 0.14.3

From: Bobby Eshleman

Add -b to request a non-default niov size via
NETDEV_A_DMABUF_RX_BUF_SIZE. When the value exceeds PAGE_SIZE,
udmabuf_alloc() switches to an MFD_HUGETLB-backed memfd so each 2 MB
hugepage produces one naturally-aligned sg entry.

Reject values > 2 MB up front: MFD_HUGETLB + udmabuf can only guarantee
2 MB per sg entry (one hugepage), so a larger rx_buf_size would fail the
per-sg length/alignment check.

Add CONFIG_HUGETLBFS=y to drivers/net/hw/config so the new path is
reachable in the CI kernels built for these tests.
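
The two rules above (cap at 2 MB, switch to hugepages above the page size) can be sketched in isolation as below. This is an illustrative stand-in rather than the selftest's code: the `ex_` names and `EX_MAX_RX_BUF_SIZE` are hypothetical, the page size is passed in instead of calling getpagesize(), and the `#ifndef` fallbacks only exist so the sketch builds against libc headers that lack the MFD_* macros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/mman.h>

/* Fallbacks for libc headers without these macros (Linux UAPI values). */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif
#ifndef MFD_HUGE_2MB
#define MFD_HUGE_2MB (21U << 26)
#endif

/* Hypothetical name: the largest rx-buf-size one 2 MB hugepage can back. */
#define EX_MAX_RX_BUF_SIZE (2u << 20)

/* Values above one hugepage are rejected before any binding is attempted. */
static bool ex_rx_buf_size_ok(uint32_t rx_buf_size)
{
	return rx_buf_size <= EX_MAX_RX_BUF_SIZE;
}

/* Pick memfd_create() flags: hugepage-backed only when niovs exceed a page. */
static unsigned int ex_memfd_flags(uint32_t rx_buf_size, uint32_t page_size)
{
	unsigned int flags = MFD_ALLOW_SEALING;

	/* Caller is expected to have rejected oversized values already. */
	assert(ex_rx_buf_size_ok(rx_buf_size));

	if (rx_buf_size > page_size)
		flags |= MFD_HUGETLB | MFD_HUGE_2MB;
	return flags;
}
```

Leaving rx_buf_size at 0 (the option not given) keeps the plain MFD_ALLOW_SEALING path, so default runs do not require hugepages to be reserved.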

Signed-off-by: Bobby Eshleman
---
 tools/testing/selftests/drivers/net/hw/config     |  1 +
 tools/testing/selftests/drivers/net/hw/ncdevmem.c | 49 +++++++++++++++++++++--
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/config b/tools/testing/selftests/drivers/net/hw/config
index b9f406dd7282..388721bee553 100644
--- a/tools/testing/selftests/drivers/net/hw/config
+++ b/tools/testing/selftests/drivers/net/hw/config
@@ -3,6 +3,7 @@ CONFIG_FAIL_FUNCTION=y
 CONFIG_FAULT_INJECTION=y
 CONFIG_FAULT_INJECTION_DEBUG_FS=y
 CONFIG_FUNCTION_ERROR_INJECTION=y
+CONFIG_HUGETLBFS=y
 CONFIG_INET6_ESP=y
 CONFIG_INET6_ESP_OFFLOAD=y
 CONFIG_INET_ESP=y
diff --git a/tools/testing/selftests/drivers/net/hw/ncdevmem.c b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
index d96e8a3b5a65..325c128191e2 100644
--- a/tools/testing/selftests/drivers/net/hw/ncdevmem.c
+++ b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
@@ -61,6 +61,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
@@ -79,6 +80,7 @@
 #define PAGE_SHIFT 12
 #define TEST_PREFIX "ncdevmem"
 #define NUM_PAGES 16000
+#define MB(x) ((x) << 20)
 
 #ifndef MSG_SOCK_DEVMEM
 #define MSG_SOCK_DEVMEM 0x2000000
@@ -100,6 +102,7 @@ static unsigned int dmabuf_id;
 static uint32_t tx_dmabuf_id;
 static int waittime_ms = 500;
 static bool fail_on_linear;
+static uint32_t rx_buf_size;
 
 /* System state loaded by current_config_load() */
 #define MAX_FLOWS 8
@@ -142,6 +145,7 @@ static struct memory_buffer *udmabuf_alloc(size_t size)
 {
 	struct udmabuf_create create;
 	struct memory_buffer *ctx;
+	unsigned int memfd_flags;
 	int ret;
 
 	ctx = malloc(sizeof(*ctx));
@@ -156,9 +160,14 @@ static struct memory_buffer *udmabuf_alloc(size_t size)
 		goto err_free_ctx;
 	}
 
-	ctx->memfd = memfd_create("udmabuf-test", MFD_ALLOW_SEALING);
+	memfd_flags = MFD_ALLOW_SEALING;
+	if (rx_buf_size > (uint32_t)getpagesize())
+		memfd_flags |= MFD_HUGETLB | MFD_HUGE_2MB;
+
+	ctx->memfd = memfd_create("udmabuf-test", memfd_flags);
 	if (ctx->memfd < 0) {
-		pr_err("[skip,no-memfd]");
+		pr_err("[skip,no-memfd%s]",
+		       (memfd_flags & MFD_HUGETLB) ? " (need hugepages)" : "");
 		goto err_close_dev;
 	}
 
@@ -168,6 +177,11 @@ static struct memory_buffer *udmabuf_alloc(size_t size)
 		goto err_close_memfd;
 	}
 
+	if (memfd_flags & MFD_HUGETLB) {
+		size = roundup(size, MB(2));
+		ctx->size = size;
+	}
+
 	ret = ftruncate(ctx->memfd, size);
 	if (ret == -1) {
 		pr_err("[FAIL,memfd-truncate]");
@@ -699,6 +713,8 @@ static int bind_rx_queue(unsigned int ifindex, unsigned int dmabuf_fd,
 	netdev_bind_rx_req_set_ifindex(req, ifindex);
 	netdev_bind_rx_req_set_fd(req, dmabuf_fd);
 	__netdev_bind_rx_req_set_queues(req, queues, n_queue_index);
+	if (rx_buf_size)
+		netdev_bind_rx_req_set_rx_buf_size(req, rx_buf_size);
 
 	rsp = netdev_bind_rx(*ys, req);
 	if (!rsp) {
@@ -1411,7 +1427,7 @@ int main(int argc, char *argv[])
 	int is_server = 0, opt;
 	int ret, err = 1;
 
-	while ((opt = getopt(argc, argv, "Lls:c:p:v:q:t:f:z:n")) != -1) {
+	while ((opt = getopt(argc, argv, "Lls:c:p:v:q:t:f:z:nb:")) != -1) {
 		switch (opt) {
 		case 'L':
 			fail_on_linear = true;
@@ -1446,6 +1462,33 @@ int main(int argc, char *argv[])
 		case 'n':
 			skip_config = 1;
 			break;
+		case 'b': {
+			char *endp;
+			unsigned long val;
+
+			errno = 0;
+			val = strtoul(optarg, &endp, 0);
+			if (errno || endp == optarg || *endp || val == 0 ||
+			    val > UINT32_MAX) {
+				pr_err("invalid rx_buf_size: %s", optarg);
+				return 1;
+			}
+			if (val & (val - 1)) {
+				pr_err("rx_buf_size must be a power of 2");
+				return 1;
+			}
+			if (val < (unsigned long)getpagesize()) {
+				pr_err("rx_buf_size must be >= PAGE_SIZE (%d)",
+				       getpagesize());
+				return 1;
+			}
+			if (val > MB(2)) {
+				pr_err("rx_buf_size > 2 MB not supported");
+				return 1;
+			}
+			rx_buf_size = val;
+			break;
+		}
 		case '?':
 			fprintf(stderr, "unknown option: %c\n", optopt);
 			break;
-- 
2.53.0-Meta

From nobody Mon Jun 8 09:51:36 2026
From: Bobby Eshleman
Date: Wed, 03 Jun 2026 17:43:01 -0700
Subject: [PATCH net-next 4/4] selftests/net: devmem.py: add check_rx_large_niov
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260603-tcpdm-large-niovs-v1-4-f37a4ac6726c@meta.com>
References: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
In-Reply-To: <20260603-tcpdm-large-niovs-v1-0-f37a4ac6726c@meta.com>
To: Donald Hunter, Jakub Kicinski, "David S. Miller", Eric Dumazet,
 Paolo Abeni, Simon Horman, Andrew Lunn, Gerd Hoffmann, Vivek Kasireddy,
 Sumit Semwal, Christian König, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
 sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
 almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
 dw@davidwei.uk, Bobby Eshleman
X-Mailer: b4 0.14.3

From: Bobby Eshleman

Add a new devmem test case for binding the dmabuf with rx-buf-size=16K.

The test sweeps RX payload sizes straddling the niov boundary to cover
the sub-niov, exact-niov, and multi-niov RX paths.
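
To make the three cases concrete, the sketch below classifies each payload size in the sweep against a 16 KiB niov and computes the smallest number of niovs that could hold it. The helper names are hypothetical, and the niov count is only a lower bound from simple arithmetic; how the NIC and the stack actually place received data into niovs is not modeled here.

```python
RX_BUF_SIZE = 16384  # niov size requested by the test (rx-buf-size=16K)
PAYLOAD_SIZES = [1024, 4096, 8192, 16384, 32768, 65536]  # sweep from the test


def min_niovs(payload, rx_buf_size=RX_BUF_SIZE):
    """Lower bound on niovs needed to hold payload bytes (ceiling division)."""
    return -(-payload // rx_buf_size)


def classify(payload, rx_buf_size=RX_BUF_SIZE):
    """Name the case a payload falls into relative to the niov size."""
    if payload < rx_buf_size:
        return "sub-niov"
    if payload == rx_buf_size:
        return "exact-niov"
    return "multi-niov"


if __name__ == "__main__":
    for size in PAYLOAD_SIZES:
        print(f"{size:>6} bytes: {classify(size):<10} "
              f"(at least {min_niovs(size)} niov)")
```

Three sizes fall below the boundary, one sits exactly on it, and two exceed it, so each of the cases named in the commit message is hit at least once.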

Signed-off-by: Bobby Eshleman
---
 tools/testing/selftests/drivers/net/hw/devmem.py   | 12 +++++-
 .../testing/selftests/drivers/net/hw/devmem_lib.py | 46 +++++++++++++++++++++-
 .../testing/selftests/drivers/net/hw/nk_devmem.py  | 11 +++++-
 3 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/devmem.py b/tools/testing/selftests/drivers/net/hw/devmem.py
index 031cf9905f65..47b54e18e7a6 100755
--- a/tools/testing/selftests/drivers/net/hw/devmem.py
+++ b/tools/testing/selftests/drivers/net/hw/devmem.py
@@ -2,7 +2,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
 from os import path
-from devmem_lib import setup_test, run_rx, run_tx, run_tx_chunks, run_rx_hds
+from devmem_lib import (setup_test, run_rx, run_tx, run_tx_chunks, run_rx_hds,
+                        run_rx_large_niov)
 from lib.py import ksft_run, ksft_exit, ksft_disruptive
 from lib.py import NetDrvEpEnv
 
@@ -30,11 +31,18 @@ def check_rx_hds(cfg) -> None:
     run_rx_hds(cfg)
 
 
+@ksft_disruptive
+def check_rx_large_niov(cfg) -> None:
+    """Run the devmem RX test with rx-buf-size = 16 KiB."""
+    run_rx_large_niov(cfg)
+
+
 def main() -> None:
     """Run the devmem test cases."""
     with NetDrvEpEnv(__file__) as cfg:
         setup_test(cfg, path.abspath(path.dirname(__file__) + "/ncdevmem"))
-        ksft_run([check_rx, check_tx, check_tx_chunks, check_rx_hds],
+        ksft_run([check_rx, check_tx, check_tx_chunks, check_rx_hds,
+                  check_rx_large_niov],
                  args=(cfg,))
     ksft_exit()
 
diff --git a/tools/testing/selftests/drivers/net/hw/devmem_lib.py b/tools/testing/selftests/drivers/net/hw/devmem_lib.py
index 0921ff03eb81..1d9ad3a294c8 100644
--- a/tools/testing/selftests/drivers/net/hw/devmem_lib.py
+++ b/tools/testing/selftests/drivers/net/hw/devmem_lib.py
@@ -8,7 +8,7 @@ from lib.py import (bkg, cmd, defer, ethtool, rand_port, wait_port_listen,
                     NetdevFamily)
 
 
-def require_devmem(cfg):
+def require_devmem(cfg, rx_buf_size=0):
     """Probe ncdevmem on cfg.ifname and SKIP the test if devmem isn't supported."""
     if not hasattr(cfg, "devmem_probed"):
         probe_command = f"{cfg.bin_local} -f {cfg.ifname}"
@@ -18,6 +18,19 @@ def require_devmem(cfg):
     if not cfg.devmem_supported:
         raise KsftSkipEx("Test requires devmem support")
 
+    if rx_buf_size > 0:
+        if not hasattr(cfg, "devmem_rx_buf_size_probed"):
+            cfg.devmem_rx_buf_size_probed = {}
+
+        if rx_buf_size not in cfg.devmem_rx_buf_size_probed:
+            probe_command = f"{cfg.bin_local} -f {cfg.ifname} -b {rx_buf_size}"
+            cfg.devmem_rx_buf_size_probed[rx_buf_size] = \
+                cmd(probe_command, fail=False, shell=True).ret == 0
+
+        if not cfg.devmem_rx_buf_size_probed[rx_buf_size]:
+            raise KsftSkipEx(
+                f"Test requires devmem rx-buf-size={rx_buf_size} support")
+
 
 def configure_nic(cfg):
     """Channels, rings, RSS, queue lease for netkit devmem."""
@@ -76,7 +89,8 @@ def set_flow_rule(cfg, port):
     return int(re.search(r'ID (\d+)', output).group(1))
 
 
-def ncdevmem_rx(cfg, port, verify=True, fail_on_linear=False, flow_steer=False):
+def ncdevmem_rx(cfg, port, verify=True, fail_on_linear=False, flow_steer=False,
+                rx_buf_size=0):
     """Build the ncdevmem RX listener command."""
     if hasattr(cfg, 'netns'):
         flow_rule_id = set_flow_rule(cfg, port)
@@ -96,6 +110,8 @@ def ncdevmem_rx(cfg, port, verify=True, fail_on_linear=False, flow_steer=False):
         extras.append("-v 7")
     if fail_on_linear:
         extras.append("-L")
+    if rx_buf_size > 0:
+        extras.append(f"-b {rx_buf_size}")
 
     parts = [cfg.bin_local, "-l", f"-f {ifname}", f"-s {addr}",
              f"-p {port}", *extras]
@@ -202,6 +218,32 @@ def run_tx_chunks(cfg):
     ksft_eq(socat.stdout.strip(), "hello\nworld")
 
 
+def run_rx_large_niov(cfg):
+    """Run the devmem RX test with a large niov (rx-buf-size > PAGE_SIZE).
+
+    Sweep payload sizes that straddle the niov boundary: below, equal to,
+    and above rx_buf_size, to exercise sub-niov, exact-niov, and multi-niov
+    RX paths.
+    """
+    require_devmem(cfg, rx_buf_size=16384)
+    configure_nic(cfg)
+    netns = getattr(cfg, "netns", None)
+
+    for size in [1024, 4096, 8192, 16384, 32768, 65536]:
+        port = rand_port()
+        socat = socat_send(cfg, port)
+        listen_cmd = ncdevmem_rx(cfg, port,
+                                 flow_steer=not netns,
+                                 rx_buf_size=16384)
+        data_pipe = (f"yes $(echo -e \x01\x02\x03\x04\x05\x06) | "
+                     f"head -c {size} | {socat}")
+        with bkg(listen_cmd, exit_wait=True, ns=netns) as ncdevmem:
+            wait_port_listen(port, proto="tcp", ns=netns)
+            cmd(data_pipe, host=cfg.remote, shell=True)
+        ksft_eq(ncdevmem.ret, 0,
+                f"large-niov failed for payload size {size}")
+
+
 def run_rx_hds(cfg):
     """Run the HDS test by running devmem RX across a segment size sweep."""
     require_devmem(cfg)
diff --git a/tools/testing/selftests/drivers/net/hw/nk_devmem.py b/tools/testing/selftests/drivers/net/hw/nk_devmem.py
index 300ed2a70ab4..7f1867e4ff32 100755
--- a/tools/testing/selftests/drivers/net/hw/nk_devmem.py
+++ b/tools/testing/selftests/drivers/net/hw/nk_devmem.py
@@ -3,7 +3,8 @@
 """Test devmem TCP with netkit."""
 
 import os
-from devmem_lib import setup_test, run_rx, run_tx, run_tx_chunks, run_rx_hds
+from devmem_lib import (setup_test, run_rx, run_tx, run_tx_chunks, run_rx_hds,
+                        run_rx_large_niov)
 from lib.py import ksft_run, ksft_exit, ksft_disruptive
 from lib.py import NetDrvContEnv
 
@@ -31,6 +32,12 @@ def check_nk_rx_hds(cfg) -> None:
     run_rx_hds(cfg)
 
 
+@ksft_disruptive
+def check_nk_rx_large_niov(cfg) -> None:
+    """Run the devmem RX large-niov test through netkit."""
+    run_rx_large_niov(cfg)
+
+
 def main() -> None:
     """Run the netkit devmem test cases."""
     with NetDrvContEnv(__file__, rxqueues=2, primary_rx_redirect=True) as cfg:
@@ -38,7 +45,7 @@ def main() -> None:
                    os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                 "ncdevmem"))
         ksft_run([check_nk_rx, check_nk_tx, check_nk_tx_chunks,
-                  check_nk_rx_hds], args=(cfg,))
+                  check_nk_rx_hds, check_nk_rx_large_niov], args=(cfg,))
     ksft_exit()
 
 
-- 
2.53.0-Meta