[v6] Device memory TCP TX

[PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Mina Almasry 11 months, 2 weeks ago

Currently net_iovs support only pp ref counts, and do not support a
page ref equivalent.

This is fine for the RX path as net_iovs are used exclusively with the
pp and only pp refcounting is needed there. The TX path however does not
use pp ref counts, thus, support for get_page/put_page equivalent is
needed for netmem.

Support get_netmem/put_netmem. Check the type of the netmem before
passing it to page or net_iov specific code to obtain a page ref
equivalent.

For dmabuf net_iovs, we obtain a ref on the underlying binding. This
ensures the entire binding doesn't disappear until all the net_iovs have
been put_netmem'ed. We do not need to track the refcount of individual
dmabuf net_iovs as we don't allocate/free them from a pool similar to
what the buddy allocator does for pages.

This code is written to be extensible by other net_iov implementers.
get_netmem/put_netmem will check the type of the netmem and route it to
the correct helper:

pages -> [get|put]_page()
dmabuf net_iovs -> net_devmem_[get|put]_net_iov()
new net_iovs ->	new helpers

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>

---

v2:
- Add comment on top of refcount_t ref explaining the usage in the XT
  path.
- Fix missing definition of net_devmem_dmabuf_binding_put in this patch.

---
 include/linux/skbuff_ref.h |  4 ++--
 include/net/netmem.h       |  3 +++
 net/core/devmem.c          | 10 ++++++++++
 net/core/devmem.h          | 20 ++++++++++++++++++++
 net/core/skbuff.c          | 30 ++++++++++++++++++++++++++++++
 5 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff_ref.h b/include/linux/skbuff_ref.h
index 0f3c58007488..9e49372ef1a0 100644
--- a/include/linux/skbuff_ref.h
+++ b/include/linux/skbuff_ref.h
@@ -17,7 +17,7 @@
  */
 static inline void __skb_frag_ref(skb_frag_t *frag)
 {
-	get_page(skb_frag_page(frag));
+	get_netmem(skb_frag_netmem(frag));
 }
 
 /**
@@ -40,7 +40,7 @@ static inline void skb_page_unref(netmem_ref netmem, bool recycle)
 	if (recycle && napi_pp_put_page(netmem))
 		return;
 #endif
-	put_page(netmem_to_page(netmem));
+	put_netmem(netmem);
 }
 
 /**
diff --git a/include/net/netmem.h b/include/net/netmem.h
index c61d5b21e7b4..a2148ffb203d 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -264,4 +264,7 @@ static inline unsigned long netmem_get_dma_addr(netmem_ref netmem)
 	return __netmem_clear_lsb(netmem)->dma_addr;
 }
 
+void get_netmem(netmem_ref netmem);
+void put_netmem(netmem_ref netmem);
+
 #endif /* _NET_NETMEM_H */
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 7c6e0b5b6acb..b1aafc66ebb7 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -325,6 +325,16 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
 	return ERR_PTR(err);
 }
 
+void net_devmem_get_net_iov(struct net_iov *niov)
+{
+	net_devmem_dmabuf_binding_get(net_devmem_iov_binding(niov));
+}
+
+void net_devmem_put_net_iov(struct net_iov *niov)
+{
+	net_devmem_dmabuf_binding_put(net_devmem_iov_binding(niov));
+}
+
 /*** "Dmabuf devmem memory provider" ***/
 
 int mp_dmabuf_devmem_init(struct page_pool *pool)
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 7fc158d52729..946f2e015746 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -29,6 +29,10 @@ struct net_devmem_dmabuf_binding {
 	 * The binding undos itself and unmaps the underlying dmabuf once all
 	 * those refs are dropped and the binding is no longer desired or in
 	 * use.
+	 *
+	 * net_devmem_get_net_iov() on dmabuf net_iovs will increment this
+	 * reference, making sure that the binding remains alive until all the
+	 * net_iovs are no longer used.
 	 */
 	refcount_t ref;
 
@@ -111,6 +115,9 @@ net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding)
 	__net_devmem_dmabuf_binding_free(binding);
 }
 
+void net_devmem_get_net_iov(struct net_iov *niov);
+void net_devmem_put_net_iov(struct net_iov *niov);
+
 struct net_iov *
 net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding);
 void net_devmem_free_dmabuf(struct net_iov *ppiov);
@@ -120,6 +127,19 @@ bool net_is_devmem_iov(struct net_iov *niov);
 #else
 struct net_devmem_dmabuf_binding;
 
+static inline void
+net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding)
+{
+}
+
+static inline void net_devmem_get_net_iov(struct net_iov *niov)
+{
+}
+
+static inline void net_devmem_put_net_iov(struct net_iov *niov)
+{
+}
+
 static inline void
 __net_devmem_dmabuf_binding_free(struct net_devmem_dmabuf_binding *binding)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5b241c9e6f38..6e853d55a3e8 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -89,6 +89,7 @@
 #include <linux/textsearch.h>
 
 #include "dev.h"
+#include "devmem.h"
 #include "netmem_priv.h"
 #include "sock_destructor.h"
 
@@ -7253,3 +7254,32 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes,
 	return false;
 }
 EXPORT_SYMBOL(csum_and_copy_from_iter_full);
+
+void get_netmem(netmem_ref netmem)
+{
+	if (netmem_is_net_iov(netmem)) {
+		/* Assume any net_iov is devmem and route it to
+		 * net_devmem_get_net_iov. As new net_iov types are added they
+		 * need to be checked here.
+		 */
+		net_devmem_get_net_iov(netmem_to_net_iov(netmem));
+		return;
+	}
+	get_page(netmem_to_page(netmem));
+}
+EXPORT_SYMBOL(get_netmem);
+
+void put_netmem(netmem_ref netmem)
+{
+	if (netmem_is_net_iov(netmem)) {
+		/* Assume any net_iov is devmem and route it to
+		 * net_devmem_put_net_iov. As new net_iov types are added they
+		 * need to be checked here.
+		 */
+		net_devmem_put_net_iov(netmem_to_net_iov(netmem));
+		return;
+	}
+
+	put_page(netmem_to_page(netmem));
+}
+EXPORT_SYMBOL(put_netmem);
-- 
2.48.1.658.g4767266eb4-goog

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Jakub Kicinski 11 months, 2 weeks ago

On Thu, 27 Feb 2025 04:12:02 +0000 Mina Almasry wrote:
>  static inline void __skb_frag_ref(skb_frag_t *frag)
>  {
> -	get_page(skb_frag_page(frag));
> +	get_netmem(skb_frag_netmem(frag));
>  }

Silently handling types of memory the caller may not be expecting
always worries me. Why do we need this? 

In general, I'm surprised by the lack of bug reports for devmem.
Can you think of any way we could expose this more to syzbot?
First thing that comes to mind is a simple hack in netdevsim,
to make it insert a netmem handle (allocated locally, not a real
memory provider), every N packets (controllable via debugfs).
Would that work?

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Mina Almasry 11 months, 2 weeks ago

On Fri, Feb 28, 2025 at 4:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 27 Feb 2025 04:12:02 +0000 Mina Almasry wrote:
> >  static inline void __skb_frag_ref(skb_frag_t *frag)
> >  {
> > -     get_page(skb_frag_page(frag));
> > +     get_netmem(skb_frag_netmem(frag));
> >  }
>
> Silently handling types of memory the caller may not be expecting
> always worries me.

Sorry, I'm not following. What caller is not expecting netmem? Here
we're making sure __skb_frag_ref() handles netmem correctly, i.e. we
were not expecting netmem here before, and after this patch we'll
handle it correctly.

> Why do we need this?
>

The MSG_ZEROCOPY TX path takes a page reference on the passed memory
in zerocopy_fill_skb_from_iter() that kfree_skb() later drops when the
skb is sent. We need an equivalent for netmem, which only supports pp
refs today. This is my attempt at implementing a page_ref equivalent
to net_iov and generic netmem.

I think __skb_frag_[un]ref is used elsewhere in the TX path too,
tcp_mtu_probe for example calls skb_frag_ref eventually.

> In general, I'm surprised by the lack of bug reports for devmem.

I guess we did a good job making sure we don't regress the page paths.

The lack of support in any driver that qemu will run is an issue. I
wonder if also the fact that devmem needs some setup is also an issue.
We need headersplit enabled, udmabuf created, netlink API bound, and
then a connection referring to created and we don't support loopback.
I think maybe it all may make it difficult for syzbot to repro. I've
had it on my todo list to investigate this more.

> Can you think of any way we could expose this more to syzbot?
> First thing that comes to mind is a simple hack in netdevsim,
> to make it insert a netmem handle (allocated locally, not a real
> memory provider), every N packets (controllable via debugfs).
> Would that work?

Yes, great idea. I don't see why it wouldn't work.

We don't expect mixing of net_iovs and pages in the same skb, but
netdevsim could create one net_iov skb every N skbs.

I guess I'm not totally sure something is discoverable to syzbot. Is a
netdevsim hack toggleable via a debugfs sufficient for syzbot? I'll
investigate and ask.

--
Thanks,
Mina

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Jakub Kicinski 11 months, 1 week ago

On Fri, 28 Feb 2025 17:29:13 -0800 Mina Almasry wrote:
> On Fri, Feb 28, 2025 at 4:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Thu, 27 Feb 2025 04:12:02 +0000 Mina Almasry wrote:  
> > >  static inline void __skb_frag_ref(skb_frag_t *frag)
> > >  {
> > > -     get_page(skb_frag_page(frag));
> > > +     get_netmem(skb_frag_netmem(frag));
> > >  }  
> >
> > Silently handling types of memory the caller may not be expecting
> > always worries me.  
> 
> Sorry, I'm not following. What caller is not expecting netmem?
> Here we're making sure __skb_frag_ref() handles netmem correctly,
> i.e. we were not expecting netmem here before, and after this patch
> we'll handle it correctly.
> 
> > Why do we need this?
> >  
> 
> The MSG_ZEROCOPY TX path takes a page reference on the passed memory
> in zerocopy_fill_skb_from_iter() that kfree_skb() later drops when the
> skb is sent. We need an equivalent for netmem, which only supports pp
> refs today. This is my attempt at implementing a page_ref equivalent
> to net_iov and generic netmem.
> 
> I think __skb_frag_[un]ref is used elsewhere in the TX path too,
> tcp_mtu_probe for example calls skb_frag_ref eventually.

Any such caller must be inspected to make sure it generates
/ anticipates skbs with appropriate pp_recycle and readable settings.
It's possible that adding a set of _netmem APIs would be too much
churn, but if it's not - it'd make it clear which parts of the kernel
we have inspected.

> > In general, I'm surprised by the lack of bug reports for devmem.  
> 
> I guess we did a good job making sure we don't regress the page paths.

:)

> The lack of support in any driver that qemu will run is an issue. I
> wonder if also the fact that devmem needs some setup is also an issue.
> We need headersplit enabled, udmabuf created, netlink API bound, and
> then a connection referring to created and we don't support loopback.
> I think maybe it all may make it difficult for syzbot to repro. I've
> had it on my todo list to investigate this more.
> 
> > Can you think of any way we could expose this more to syzbot?
> > First thing that comes to mind is a simple hack in netdevsim,
> > to make it insert a netmem handle (allocated locally, not a real
> > memory provider), every N packets (controllable via debugfs).
> > Would that work?  
> 
> Yes, great idea. I don't see why it wouldn't work.
> 
> We don't expect mixing of net_iovs and pages in the same skb, but
> netdevsim could create one net_iov skb every N skbs.
> 
> I guess I'm not totally sure something is discoverable to syzbot. Is a
> netdevsim hack toggleable via a debugfs sufficient for syzbot? I'll
> investigate and ask.

Yeah, my unreliable memory is that syzbot has a mixed record discovering
problems with debugfs. If you could ask Dmitry for advice that'd be
ideal.

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Mina Almasry 11 months, 1 week ago

On Mon, Mar 3, 2025 at 4:20 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 28 Feb 2025 17:29:13 -0800 Mina Almasry wrote:
> > On Fri, Feb 28, 2025 at 4:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Thu, 27 Feb 2025 04:12:02 +0000 Mina Almasry wrote:
> > > >  static inline void __skb_frag_ref(skb_frag_t *frag)
> > > >  {
> > > > -     get_page(skb_frag_page(frag));
> > > > +     get_netmem(skb_frag_netmem(frag));
> > > >  }
> > >
> > > Silently handling types of memory the caller may not be expecting
> > > always worries me.
> >
> > Sorry, I'm not following. What caller is not expecting netmem?
> > Here we're making sure __skb_frag_ref() handles netmem correctly,
> > i.e. we were not expecting netmem here before, and after this patch
> > we'll handle it correctly.
> >
> > > Why do we need this?
> > >
> >
> > The MSG_ZEROCOPY TX path takes a page reference on the passed memory
> > in zerocopy_fill_skb_from_iter() that kfree_skb() later drops when the
> > skb is sent. We need an equivalent for netmem, which only supports pp
> > refs today. This is my attempt at implementing a page_ref equivalent
> > to net_iov and generic netmem.
> >
> > I think __skb_frag_[un]ref is used elsewhere in the TX path too,
> > tcp_mtu_probe for example calls skb_frag_ref eventually.
>
> Any such caller must be inspected to make sure it generates
> / anticipates skbs with appropriate pp_recycle and readable settings.
> It's possible that adding a set of _netmem APIs would be too much
> churn, but if it's not - it'd make it clear which parts of the kernel
> we have inspected.
>

I see, you're concerned about interactions between skb->pp_recycle and
skb->unreadable. My strategy to handle this is to take what works for
pages, and extend it as carefully as possible to unreadable
net_iovs/skbs. Abstractly speaking, I think skb_frag_ref/unref should
be able to obtain/drop a ref on a frag regardless of what the
underlying memory type is.

Currently __skb_frag_ref() obtains a page ref regardless of
skb->pp_recycle setting, and skb_page_unref() drops a page_ref if
!skb->pp_recycle and a pp ref if skb->pp_recycle. I extend the logic
so that it applies as-is to net_iovs and unreadable skbs.

After this patch, __skb_frag_ref() obtains a 'page ref equivalent' on
the net_iov regardless of the pp_recycle setting. My conjecture is
that since that works for pages, it should also work for net_iovs.

Also after this patch, skb_page_unref will drop a pp ref on the
net_iov if skb->pp_recycle is set, and drop a 'page ref equivalent' on
the net_iov if !skb->pp_recycle. The conjecture again is that since
that works for pages, it should extend to net_iovs.

With regards to the callers, my thinking is that since we preserved
the semantics of __skb_frag_ref and skb_page_unref (and so
skb_frag_ref/skb_frag_unref by extension), then all existing callers
of skb_frag_[un]ref should work as is. Here is some code analysis of
individual callers:

1. skb_release_data
      __skb_frag_unref
        skb_page_unref

This code path expects to drop a pp_ref if skb->pp_recycle, and drop a
page ref if !skb->pp_recycle. We make sure to do this regardless if
the skb is actually filled with net_iovs or pages, and the conjecture
is that since the logic to acquire/drop a pp=page ref works for pages,
it should work for the net_iovs as well.

2. __skb_zcopy_downgrade_managed
      skb_frag_ref

Same thing here, since this code expects to get a page ref on the
frag, we obtain the net_iov equivalent of a page_ref and that should
work as well for net_iovs as pages. I could look at more callers, but
the general thinking is the same. Am I off the rails here?

Of course, we could leave skb_frag_[un]ref as-is and create an
skb_frag_[un]ref_netmem which support net_iovs, but my ambition here
was to make skb_frag_[un]ref itself support netmem rather than fork
the frag refcounting - if it seems like a good idea?

> > > In general, I'm surprised by the lack of bug reports for devmem.
> >
> > I guess we did a good job making sure we don't regress the page paths.
>
> :)
>
> > The lack of support in any driver that qemu will run is an issue. I
> > wonder if also the fact that devmem needs some setup is also an issue.
> > We need headersplit enabled, udmabuf created, netlink API bound, and
> > then a connection referring to created and we don't support loopback.
> > I think maybe it all may make it difficult for syzbot to repro. I've
> > had it on my todo list to investigate this more.
> >
> > > Can you think of any way we could expose this more to syzbot?
> > > First thing that comes to mind is a simple hack in netdevsim,
> > > to make it insert a netmem handle (allocated locally, not a real
> > > memory provider), every N packets (controllable via debugfs).
> > > Would that work?
> >
> > Yes, great idea. I don't see why it wouldn't work.
> >
> > We don't expect mixing of net_iovs and pages in the same skb, but
> > netdevsim could create one net_iov skb every N skbs.
> >
> > I guess I'm not totally sure something is discoverable to syzbot. Is a
> > netdevsim hack toggleable via a debugfs sufficient for syzbot? I'll
> > investigate and ask.
>
> Yeah, my unreliable memory is that syzbot has a mixed record discovering
> problems with debugfs. If you could ask Dmitry for advice that'd be
> ideal.

Yes, I took a look here and discussed with Willem. Long story short is
that syzbot support is possible but with a handful of changes. We'll
look into that.

Long story long, for syzbot support I don't think netdevsim itself
will be useful. Its our understanding so far that syzbot doesn't do
anything special with netdevsim. We'll need to add queue
API/page_pool/unreadable netmem support to one of the drivers qemu
(syzbot) uses, and that should get syzbot fuzzing the control plane.

To get syzbot to fuzz the data plane, I think we need to set up a
special syzbot instance which configures udmabuf/rss/flow
steering/netlink binding and start injecting packets through the data
path. Syzbot would not discover a working config on its own. I'm told
it's rare to set up specialized syzbot instances but we could sell
that this coverage is important enough.

Hacking netdevsim like you suggested would be useful as well, but
outside of syzbot, AFAICT so far. We can run existing netdevsim data
path tests with netdevsim in 'unreadable netmem mode' and see if it
can reproduce issues. Although I'm not sure how to integrate that with
continuous testing yet.

--
Thanks,
Mina

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Jakub Kicinski 11 months, 1 week ago

On Tue, 4 Mar 2025 17:39:37 -0800 Mina Almasry wrote:
> > > Yes, great idea. I don't see why it wouldn't work.
> > >
> > > We don't expect mixing of net_iovs and pages in the same skb, but
> > > netdevsim could create one net_iov skb every N skbs.
> > >
> > > I guess I'm not totally sure something is discoverable to syzbot. Is a
> > > netdevsim hack toggleable via a debugfs sufficient for syzbot? I'll
> > > investigate and ask.  
> >
> > Yeah, my unreliable memory is that syzbot has a mixed record discovering
> > problems with debugfs. If you could ask Dmitry for advice that'd be
> > ideal.  
> 
> Yes, I took a look here and discussed with Willem. Long story short is
> that syzbot support is possible but with a handful of changes. We'll
> look into that.
> 
> Long story long, for syzbot support I don't think netdevsim itself
> will be useful. Its our understanding so far that syzbot doesn't do
> anything special with netdevsim.

Meaning it doesn't currently do anything special, or you can't make it
do anything special with netdevsim?

> We'll need to add queue API/page_pool/unreadable netmem support to
> one of the drivers qemu (syzbot) uses, and that should get syzbot
> fuzzing the control plane.
> 
> To get syzbot to fuzz the data plane, I think we need to set up a
> special syzbot instance which configures udmabuf/rss/flow

To be clear for Tx you don't need RSS and flow steering, Tx should
be trivial for any device driver which managers DMAs directly (not USB).

> steering/netlink binding and start injecting packets through the data
> path. Syzbot would not discover a working config on its own. I'm told
> it's rare to set up specialized syzbot instances but we could sell
> that this coverage is important enough.
> 
> Hacking netdevsim like you suggested would be useful as well, but
> outside of syzbot, AFAICT so far. We can run existing netdevsim data
> path tests with netdevsim in 'unreadable netmem mode' and see if it
> can reproduce issues. Although I'm not sure how to integrate that with
> continuous testing yet.

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Mina Almasry 11 months, 1 week ago

On Thu, Mar 6, 2025 at 1:40 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 4 Mar 2025 17:39:37 -0800 Mina Almasry wrote:
> > > > Yes, great idea. I don't see why it wouldn't work.
> > > >
> > > > We don't expect mixing of net_iovs and pages in the same skb, but
> > > > netdevsim could create one net_iov skb every N skbs.
> > > >
> > > > I guess I'm not totally sure something is discoverable to syzbot. Is a
> > > > netdevsim hack toggleable via a debugfs sufficient for syzbot? I'll
> > > > investigate and ask.
> > >
> > > Yeah, my unreliable memory is that syzbot has a mixed record discovering
> > > problems with debugfs. If you could ask Dmitry for advice that'd be
> > > ideal.
> >
> > Yes, I took a look here and discussed with Willem. Long story short is
> > that syzbot support is possible but with a handful of changes. We'll
> > look into that.
> >
> > Long story long, for syzbot support I don't think netdevsim itself
> > will be useful. Its our understanding so far that syzbot doesn't do
> > anything special with netdevsim.
>
> Meaning it doesn't currently do anything special, or you can't make it
> do anything special with netdevsim?
>

Meaning it currently doesn't do anything special with netdevsim. I
imagine we may be able to create a specialized syzbot instance that
loads netdevsim and starts fuzzing its APIs. However I'm told
specialized syzbot instances are much less valuable than making the
feature discoverable to existing syzbot instances, which is why our
thoughts went to adding devmem/unreadable skb support to virtio or
tun/tap.

Do I surmise from your question you prefer a netdevsim-based approach?
(and just curious maybe, why?)
> > We'll need to add queue API/page_pool/unreadable netmem support to
> > one of the drivers qemu (syzbot) uses, and that should get syzbot
> > fuzzing the control plane.
> >
> > To get syzbot to fuzz the data plane, I think we need to set up a
> > special syzbot instance which configures udmabuf/rss/flow
>
> To be clear for Tx you don't need RSS and flow steering, Tx should
> be trivial for any device driver which managers DMAs directly (not USB).
>

Yes, we don't need queue API or page_pool support or header split
either for that matter. TX fuzzing is definitely simpler. Maybe we can
start with that.

--
Thanks,
Mina

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Jakub Kicinski 11 months, 1 week ago

On Thu, 6 Mar 2025 14:44:41 -0800 Mina Almasry wrote:
> > Meaning it doesn't currently do anything special, or you can't make it
> > do anything special with netdevsim?
>
> Meaning it currently doesn't do anything special with netdevsim. I
> imagine we may be able to create a specialized syzbot instance that
> loads netdevsim and starts fuzzing its APIs. However I'm told
> specialized syzbot instances are much less valuable than making the
> feature discoverable to existing syzbot instances, which is why our
> thoughts went to adding devmem/unreadable skb support to virtio or
> tun/tap.
> 
> Do I surmise from your question you prefer a netdevsim-based approach?
> (and just curious maybe, why?)

My exposure to syzbot is mostly as a consumer of reports, I thought
from looking at the repros that there's a way of teaching syzbot
how to perform more complex "ops", like a sequence of specific
writes. IIRC for netlink it does things like resolve family.
But not sure if it's true or how much of an exception adding such
things is.

Here we'd need to guide syzbot towards a specific series of
sysfs writes, so that it creates the correctly configured netdevsim
instance with higher probability.

Just explaining my thinking, not saying this is the way we should
necessarily go.

> > > We'll need to add queue API/page_pool/unreadable netmem support to
> > > one of the drivers qemu (syzbot) uses, and that should get syzbot
> > > fuzzing the control plane.
> > >
> > > To get syzbot to fuzz the data plane, I think we need to set up a
> > > special syzbot instance which configures udmabuf/rss/flow  
> >
> > To be clear for Tx you don't need RSS and flow steering, Tx should
> > be trivial for any device driver which managers DMAs directly (not USB).
> 
> Yes, we don't need queue API or page_pool support or header split
> either for that matter. TX fuzzing is definitely simpler. Maybe we can
> start with that.

Adding support to virtio would be ideal, if syzbot already fuzzes it. 
I was recently talking to David Wei about it for the Rx side, too, 
so we can test io_uring ZC. But io_uring can only ZC user memory now.
I'm not sure what adding DEVMEM support to virtio would entail.

Re: [PATCH net-next v6 1/8] net: add get_netmem/put_netmem support

Posted by Willem de Bruijn 11 months, 1 week ago

Jakub Kicinski wrote:
> On Thu, 6 Mar 2025 14:44:41 -0800 Mina Almasry wrote:
> > > Meaning it doesn't currently do anything special, or you can't make it
> > > do anything special with netdevsim?
> >
> > Meaning it currently doesn't do anything special with netdevsim. I
> > imagine we may be able to create a specialized syzbot instance that
> > loads netdevsim and starts fuzzing its APIs. However I'm told
> > specialized syzbot instances are much less valuable than making the
> > feature discoverable to existing syzbot instances, which is why our
> > thoughts went to adding devmem/unreadable skb support to virtio or
> > tun/tap.
> > 
> > Do I surmise from your question you prefer a netdevsim-based approach?
> > (and just curious maybe, why?)
> 
> My exposure to syzbot is mostly as a consumer of reports, I thought
> from looking at the repros that there's a way of teaching syzbot
> how to perform more complex "ops", like a sequence of specific
> writes. IIRC for netlink it does things like resolve family.
> But not sure if it's true or how much of an exception adding such
> things is.

The standard way of increasing coverage is by teaching syzbot about
new ABI extensions.

Adding additional initialization, such as setting up a usdma buf,
requires changing the repro scripts that it generates. Not sure where
that code gen lives. But all .c repros consist of a small loop() that
does the pertinent work, wrapped in a lot of initialization of the
tun devices, tunnel devices, netns, etc, etc.

> Here we'd need to guide syzbot towards a specific series of
> sysfs writes, so that it creates the correctly configured netdevsim
> instance with higher probability.
> 
> Just explaining my thinking, not saying this is the way we should
> necessarily go.


> > > > We'll need to add queue API/page_pool/unreadable netmem support to
> > > > one of the drivers qemu (syzbot) uses, and that should get syzbot
> > > > fuzzing the control plane.
> > > >
> > > > To get syzbot to fuzz the data plane, I think we need to set up a
> > > > special syzbot instance which configures udmabuf/rss/flow  
> > >
> > > To be clear for Tx you don't need RSS and flow steering, Tx should
> > > be trivial for any device driver which managers DMAs directly (not USB).
> > 
> > Yes, we don't need queue API or page_pool support or header split
> > either for that matter. TX fuzzing is definitely simpler. Maybe we can
> > start with that.
> 
> Adding support to virtio would be ideal, if syzbot already fuzzes it. 
> I was recently talking to David Wei about it for the Rx side, too, 
> so we can test io_uring ZC. But io_uring can only ZC user memory now.
> I'm not sure what adding DEVMEM support to virtio would entail.

By default syzbot uses a local tun device.

At least all the reports that I have seen. That is why virtio_net_hdr_to_skb
was such a frequent target.

We also added tun IFF_NAPI and IFF_NAPI_FRAGS to get more coverage of those
receive paths in syzbot.

If expanding syzkaller to a devmem rx path, tun would be more first choice.
But since devmem requires page_pool, queue API, etc., another virtual
device that already has those may be an alternative, not sure.