Later patches in the series adds TX net_iovs where there is no pp
associated, so we can't rely on niov->pp->mp_ops to tell what is the
type of the net_iov.
Add a type enum to the net_iov which tells us the net_iov type.
Signed-off-by: Mina Almasry <almasrymina@google.com>
---
v8:
- Since io_uring zcrx is now in net-next, update io_uring net_iov type
setting and remove the NET_IOV_UNSPECIFIED type
v7:
- New patch
fix iouring
---
include/net/netmem.h | 11 ++++++++++-
io_uring/zcrx.c | 1 +
net/core/devmem.c | 3 ++-
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index c61d5b21e7b4..64af9a288c80 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -20,8 +20,17 @@ DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
*/
#define NET_IOV 0x01UL
+enum net_iov_type {
+ NET_IOV_DMABUF,
+ NET_IOV_IOURING,
+
+ /* Force size to unsigned long to make the NET_IOV_ASSERTS below pass.
+ */
+ NET_IOV_MAX = ULONG_MAX,
+};
+
struct net_iov {
- unsigned long __unused_padding;
+ enum net_iov_type type;
unsigned long pp_magic;
struct page_pool *pp;
struct net_iov_area *owner;
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 0f46e0404c04..17a54e74ed5d 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -247,6 +247,7 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
niov->owner = &area->nia;
area->freelist[i] = i;
atomic_set(&area->user_refs[i], 0);
+ niov->type = NET_IOV_IOURING;
}
area->free_count = nr_iovs;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 6e27a47d0493..f5c3a7e6dbb7 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -30,7 +30,7 @@ static const struct memory_provider_ops dmabuf_devmem_ops;
bool net_is_devmem_iov(struct net_iov *niov)
{
- return niov->pp->mp_ops == &dmabuf_devmem_ops;
+ return niov->type == NET_IOV_DMABUF;
}
static void net_devmem_dmabuf_free_chunk_owner(struct gen_pool *genpool,
@@ -266,6 +266,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
for (i = 0; i < owner->area.num_niovs; i++) {
niov = &owner->area.niovs[i];
+ niov->type = NET_IOV_DMABUF;
niov->owner = &owner->area;
page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
net_devmem_get_dma_addr(niov));
--
2.49.0.805.g082f7c87e0-goog
On 4/18/25 00:15, Mina Almasry wrote: > Later patches in the series adds TX net_iovs where there is no pp > associated, so we can't rely on niov->pp->mp_ops to tell what is the > type of the net_iov. That's fine, but that needs a NULL pp check in io_uring as well, specifically in io_zcrx_recv_frag(). You can also move it to struct net_iov_area and check niov->owner->type instead. It's a safer choice than aliasing with struct page, there is no cost as you're loading ->owner anyway (e.g. for net_iov_virtual_addr()), and it's better in terms of normalisation / not unnecessary duplicating it, assuming we'll never have niovs of different types bound to the same struct net_iov_area. -- Pavel Begunkov
On Tue, Apr 22, 2025 at 1:16 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 4/18/25 00:15, Mina Almasry wrote:
> > Later patches in the series adds TX net_iovs where there is no pp
> > associated, so we can't rely on niov->pp->mp_ops to tell what is the
> > type of the net_iov.
>
> That's fine, but that needs a NULL pp check in io_uring as well,
> specifically in io_zcrx_recv_frag().
>
I think you mean this update in the code:
if (!niov->pp || niov->pp->mp_ops != &io_uring_pp_zc_ops ||
io_pp_to_ifq(niov->pp) != ifq)
return -EFAULT;
Yes, thanks, will do.
> You can also move it to struct net_iov_area and check niov->owner->type
> instead. It's a safer choice than aliasing with struct page, there is
> no cost as you're loading ->owner anyway (e.g. for
> net_iov_virtual_addr()), and it's better in terms of normalisation /
> not unnecessary duplicating it, assuming we'll never have niovs of
> different types bound to the same struct net_iov_area.
>
Putting it in niov->owner->type is an alternative approach. I don't
see a strong reason to go with one over the other. I'm thinking there
will be fast code paths that want to know the type of the frag or skb
and don't need the owner, so it will be good to save loading another
cacheline. We have more space in struct net_iov than we know what to
do with anyway.
--
Thanks,
Mina
On 4/22/25 15:03, Mina Almasry wrote: > On Tue, Apr 22, 2025 at 1:16 AM Pavel Begunkov <asml.silence@gmail.com> wrote: >> >> On 4/18/25 00:15, Mina Almasry wrote: >>> Later patches in the series adds TX net_iovs where there is no pp >>> associated, so we can't rely on niov->pp->mp_ops to tell what is the >>> type of the net_iov. >> >> That's fine, but that needs a NULL pp check in io_uring as well, >> specifically in io_zcrx_recv_frag(). >> > > I think you mean this update in the code: > > if (!niov->pp || niov->pp->mp_ops != &io_uring_pp_zc_ops || > io_pp_to_ifq(niov->pp) != ifq) > return -EFAULT; > > Yes, thanks, will do. That will work. I'm assuming that those pp-less niovs can end up in the rx path. I think it was deemed not impossible, right? >> You can also move it to struct net_iov_area and check niov->owner->type >> instead. It's a safer choice than aliasing with struct page, there is >> no cost as you're loading ->owner anyway (e.g. for >> net_iov_virtual_addr()), and it's better in terms of normalisation / >> not unnecessary duplicating it, assuming we'll never have niovs of >> different types bound to the same struct net_iov_area. >> > > Putting it in niov->owner->type is an alternative approach. I don't > see a strong reason to go with one over the other. I'm thinking there > will be fast code paths that want to know the type of the frag or skb> and don't need the owner, so it will be good to save loading another > cacheline. We have more space in struct net_iov than we know what to > do with anyway. That's fine. I wouldn't say it's about space, we can grow net_iov private bits beyond the pp fields in sturct page, but it's rather about the mess from the aliasing page. The fact that it's net_iov makes it better, but I'd rather avoid any additional aliasing altogether. -- Pavel Begunkov
On Tue, Apr 22, 2025 at 12:52 PM Pavel Begunkov <asml.silence@gmail.com> wrote: > > On 4/22/25 15:03, Mina Almasry wrote: > > On Tue, Apr 22, 2025 at 1:16 AM Pavel Begunkov <asml.silence@gmail.com> wrote: > >> > >> On 4/18/25 00:15, Mina Almasry wrote: > >>> Later patches in the series adds TX net_iovs where there is no pp > >>> associated, so we can't rely on niov->pp->mp_ops to tell what is the > >>> type of the net_iov. > >> > >> That's fine, but that needs a NULL pp check in io_uring as well, > >> specifically in io_zcrx_recv_frag(). > >> > > > > I think you mean this update in the code: > > > > if (!niov->pp || niov->pp->mp_ops != &io_uring_pp_zc_ops || > > io_pp_to_ifq(niov->pp) != ifq) > > return -EFAULT; > > > > Yes, thanks, will do. > > That will work. I'm assuming that those pp-less niovs can > end up in the rx path. I think it was deemed not impossible, > right? > I'm not sure these pp-less niovs can ever end up in the RX path, but I'm not sure, and I guess better safe than sorry. We usually get yelled at for defensive checks but I don't think this one is too defensive. There could be a path where a TX skb somehow ends up here. -- Thanks, Mina
© 2016 - 2025 Red Hat, Inc.