From: Bobby Eshleman <bobbyeshleman@meta.com>

Update devmem.rst documentation to describe the autorelease netlink
attribute used during RX dmabuf binding.

The autorelease attribute is specified at bind-time via the netlink API
(NETDEV_CMD_BIND_RX) and controls what happens to outstanding tokens
when the socket closes.

Document the two token release modes (automatic vs manual), how to
configure the binding for autorelease, the perf benefits, new caveats
and restrictions, and the way the mode is enforced system-wide.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v7:
- Document netlink instead of sockopt
- Mention system-wide locked to one mode
---
Documentation/networking/devmem.rst | 73 +++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)
diff --git a/Documentation/networking/devmem.rst b/Documentation/networking/devmem.rst
index a6cd7236bfbd..f85f1dcc9621 100644
--- a/Documentation/networking/devmem.rst
+++ b/Documentation/networking/devmem.rst
@@ -235,6 +235,79 @@ can be less than the tokens provided by the user in case of:
(a) an internal kernel leak bug.
(b) the user passed more than 1024 frags.
+
+Autorelease Control
+~~~~~~~~~~~~~~~~~~~
+
+The autorelease mode controls what happens to outstanding tokens (tokens not
+released via SO_DEVMEM_DONTNEED) when the socket closes. Autorelease is
+configured per-binding at binding creation time via the netlink API::
+
+ struct netdev_bind_rx_req *req;
+ struct netdev_bind_rx_rsp *rsp;
+ struct ynl_sock *ys;
+ struct ynl_error yerr;
+
+ ys = ynl_sock_create(&ynl_netdev_family, &yerr);
+
+ req = netdev_bind_rx_req_alloc();
+ netdev_bind_rx_req_set_ifindex(req, ifindex);
+ netdev_bind_rx_req_set_fd(req, dmabuf_fd);
+ netdev_bind_rx_req_set_autorelease(req, 0); /* 0 = manual, 1 = auto */
+ __netdev_bind_rx_req_set_queues(req, queues, n_queues);
+
+ rsp = netdev_bind_rx(ys, req);
+
+ dmabuf_id = rsp->id;
+
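+The example above omits error handling and cleanup. With the generated YNL C
+library, a failed bind returns NULL; an illustrative check (error reporting
+shown as in the ncdevmem selftest) might look like::
+
+ if (!rsp) {
+         /* bind rejected: e.g. bad ifindex/queues, or an autorelease
+          * mode that conflicts with an existing binding */
+         fprintf(stderr, "bind-rx failed: %s\n", ys->err.msg);
+         exit(1);
+ }
+ netdev_bind_rx_req_free(req);
+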
+When autorelease is disabled (0):
+
+- Outstanding tokens are NOT released when the socket closes
+- Outstanding tokens are only released when all RX queues are unbound AND all
+ sockets that called recvmsg() are closed
+- Provides better performance by eliminating xarray overhead (~13% CPU reduction)
+- Kernel tracks tokens via atomic reference counters in net_iov structures
+
+When autorelease is enabled (1):
+
+- Outstanding tokens are automatically released when the socket closes
+- Backwards compatible behavior
+- Kernel tracks tokens in an xarray per socket
+
+The default is autorelease disabled.
+
+Important: In both modes, applications should call SO_DEVMEM_DONTNEED to
+return tokens as soon as they are done processing. The autorelease setting only
+affects what happens to tokens that are still outstanding when close() is called.
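+
+For illustration, a minimal sketch of batching and returning tokens (names
+such as ``tokens``, ``n_tokens`` and ``client_fd`` are placeholders;
+``dmabuf_cmsg`` is the struct dmabuf_cmsg parsed from an SCM_DEVMEM_DMABUF
+control message as shown in the RX example earlier in this document)::
+
+ struct dmabuf_token tokens[128];
+ size_t n_tokens = 0;
+ int ret;
+
+ /* for each received devmem fragment, remember its token */
+ tokens[n_tokens].token_start = dmabuf_cmsg->frag_token;
+ tokens[n_tokens].token_count = 1;
+ n_tokens++;
+
+ /* once the referenced payload is no longer needed, return the tokens;
+  * the kernel processes at most 1024 frags per call
+  */
+ ret = setsockopt(client_fd, SOL_SOCKET, SO_DEVMEM_DONTNEED,
+                  tokens, n_tokens * sizeof(tokens[0]));
+ if (ret < 0 || (size_t)ret < n_tokens)
+         fprintf(stderr, "not all tokens were released\n");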
+
+The mode is enforced system-wide: once a binding is created with a specific
+autorelease mode, all subsequent bindings must use the same mode.
+
+
+Performance Considerations
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Disabling autorelease reduces CPU utilization by roughly 13% in RX workloads.
+In exchange, applications must ensure all tokens are released
+via SO_DEVMEM_DONTNEED before closing the socket, otherwise the backing pages
+will remain pinned until all RX queues are unbound AND all sockets that called
+recvmsg() are closed.
+
+
+Caveats
+~~~~~~~
+
+- Once a system-wide autorelease mode is selected (via the first binding),
+ all subsequent bindings must use the same mode. Attempts to create bindings
+ with a different mode will be rejected with -EBUSY.
+
+- Applications using manual release mode (autorelease=0) must ensure all tokens
+ are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource
+ leaks during the lifetime of the dmabuf binding. Tokens not released before
+ close() will only be freed when all RX queues are unbound AND all sockets
+ that called recvmsg() are closed. An orderly teardown sequence is sketched
+ below.
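+
+For illustration, a rough sketch of an orderly teardown in manual release
+mode, assuming the binding was created over a YNL socket ``ys`` as in the
+example above (``return_all_tokens()`` stands in for the SO_DEVMEM_DONTNEED
+batching shown earlier)::
+
+ /* 1. return every outstanding token received via recvmsg() */
+ return_all_tokens(client_fd);
+
+ /* 2. close the TCP socket(s) that received devmem payloads */
+ close(client_fd);
+
+ /* 3. drop the binding: the binding is tied to the netlink socket that
+  *    created it, so closing that socket unbinds the RX queues and
+  *    releases the dmabuf
+  */
+ ynl_sock_destroy(ys);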
+
+
TX Interface
============
--
2.47.3
On Thu, 15 Jan 2026 21:02:15 -0800 Bobby Eshleman wrote:
> +- Once a system-wide autorelease mode is selected (via the first binding),
> +  all subsequent bindings must use the same mode. Attempts to create bindings
> +  with a different mode will be rejected with -EBUSY.

Why?

> +- Applications using manual release mode (autorelease=0) must ensure all tokens
> +  are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource
> +  leaks during the lifetime of the dmabuf binding. Tokens not released before
> +  close() will only be freed when all RX queues are unbound AND all sockets
> +  that called recvmsg() are closed.

Could you add a short example on how? by calling shutdown()?
On Tue, Jan 20, 2026 at 04:36:50PM -0800, Jakub Kicinski wrote:
> On Thu, 15 Jan 2026 21:02:15 -0800 Bobby Eshleman wrote:
> > +- Once a system-wide autorelease mode is selected (via the first binding),
> > +  all subsequent bindings must use the same mode. Attempts to create bindings
> > +  with a different mode will be rejected with -EBUSY.
>
> Why?
>

Originally I was using EINVAL, but when writing the tests I noticed this
might be a confusing case for users to interpret EINVAL (i.e., some
binding possibly made by someone else is in a different mode). I thought
EBUSY could capture the semantic "the system is locked up in a different
mode, try again when it isn't".

I'm not married to it though. Happy to go back to EINVAL or another
errno.

> > +- Applications using manual release mode (autorelease=0) must ensure all tokens
> > +  are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource
> > +  leaks during the lifetime of the dmabuf binding. Tokens not released before
> > +  close() will only be freed when all RX queues are unbound AND all sockets
> > +  that called recvmsg() are closed.
>
> Could you add a short example on how? by calling shutdown()?

Show an example of the three steps: returning the tokens, unbinding, and closing the
sockets (TCP/NL)?

Best,
Bobby
On Tue, 20 Jan 2026 21:44:09 -0800 Bobby Eshleman wrote:
> On Tue, Jan 20, 2026 at 04:36:50PM -0800, Jakub Kicinski wrote:
> > On Thu, 15 Jan 2026 21:02:15 -0800 Bobby Eshleman wrote:
> > > +- Once a system-wide autorelease mode is selected (via the first binding),
> > > +  all subsequent bindings must use the same mode. Attempts to create bindings
> > > +  with a different mode will be rejected with -EBUSY.
> >
> > Why?
>
> Originally I was using EINVAL, but when writing the tests I noticed this
> might be a confusing case for users to interpret EINVAL (i.e., some
> binding possibly made by someone else is in a different mode). I thought
> EBUSY could capture the semantic "the system is locked up in a different
> mode, try again when it isn't".
>
> I'm not married to it though. Happy to go back to EINVAL or another
> errno.

My question was more why the system-wide policy exists, rather than
binding-by-binding. Naively I'd think that a single socket must pick
but system wide there could easily be multiple bindings not bothering
each other, doing different things?

> > > +- Applications using manual release mode (autorelease=0) must ensure all tokens
> > > +  are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource
> > > +  leaks during the lifetime of the dmabuf binding. Tokens not released before
> > > +  close() will only be freed when all RX queues are unbound AND all sockets
> > > +  that called recvmsg() are closed.
> >
> > Could you add a short example on how? by calling shutdown()?
>
> Show an example of the three steps: returning the tokens, unbinding, and closing the
> sockets (TCP/NL)?

TBH I read the doc before reading the code, which I guess may actually
be better since we don't expect users to read the code first either..

Now after reading the code I'm not sure the doc explains things
properly. AFAIU there's no association of token <> socket within the
same binding. User can close socket A and return the tokens via socket
B. As written the doc made me think that there will be a leak if socket
is closed without releasing tokens, or that there may be a race with
data queued but not read. Neither is true, really?
On Wed, Jan 21, 2026 at 05:35:12PM -0800, Jakub Kicinski wrote:
> On Tue, 20 Jan 2026 21:44:09 -0800 Bobby Eshleman wrote:
> > On Tue, Jan 20, 2026 at 04:36:50PM -0800, Jakub Kicinski wrote:
> > > On Thu, 15 Jan 2026 21:02:15 -0800 Bobby Eshleman wrote:
> > > > +- Once a system-wide autorelease mode is selected (via the first binding),
> > > > +  all subsequent bindings must use the same mode. Attempts to create bindings
> > > > +  with a different mode will be rejected with -EBUSY.
> > >
> > > Why?
> >
> > Originally I was using EINVAL, but when writing the tests I noticed this
> > might be a confusing case for users to interpret EINVAL (i.e., some
> > binding possibly made by someone else is in a different mode). I thought
> > EBUSY could capture the semantic "the system is locked up in a different
> > mode, try again when it isn't".
> >
> > I'm not married to it though. Happy to go back to EINVAL or another
> > errno.
>
> My question was more why the system-wide policy exists, rather than
> binding-by-binding. Naively I'd think that a single socket must pick
> but system wide there could easily be multiple bindings not bothering
> each other, doing different things?

Originally we allowed per-binding policy, but it seemed one-per-system
may 1) simplify reasoning through the code by only allowing one policy
per system, and 2) allow simpler deprecation of autorelease=on if its
found to be obsolete over time (just hack off that particular path of
the static branch set). It doesn't prevent any races/bugs or anything.

> > > > +- Applications using manual release mode (autorelease=0) must ensure all tokens
> > > > +  are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource
> > > > +  leaks during the lifetime of the dmabuf binding. Tokens not released before
> > > > +  close() will only be freed when all RX queues are unbound AND all sockets
> > > > +  that called recvmsg() are closed.
> > >
> > > Could you add a short example on how? by calling shutdown()?
> >
> > Show an example of the three steps: returning the tokens, unbinding, and closing the
> > sockets (TCP/NL)?
>
> TBH I read the doc before reading the code, which I guess may actually
> be better since we don't expect users to read the code first either..
>
> Now after reading the code I'm not sure the doc explains things
> properly. AFAIU there's no association of token <> socket within the
> same binding. User can close socket A and return the tokens via socket
> B. As written the doc made me think that there will be a leak if socket
> is closed without releasing tokens, or that there may be a race with
> data queued but not read. Neither is true, really?

That is correct, neither is true. If the two sockets share a binding the
kernel doesn't care which socket received the token or which one
returned it. No token <> socket association. There is no
queued-but-not-read race either. If any tokens are not returned, as long
as all of the binding references are eventually released and all sockets
that used the binding are closed, then all references will be accounted
for and everything cleaned up.

Best,
Bobby
On Wed, 21 Jan 2026 18:37:56 -0800 Bobby Eshleman wrote:
> > > Show an example of the three steps: returning the tokens, unbinding, and closing the
> > > sockets (TCP/NL)?
> >
> > TBH I read the doc before reading the code, which I guess may actually
> > be better since we don't expect users to read the code first either..
> >
> > Now after reading the code I'm not sure the doc explains things
> > properly. AFAIU there's no association of token <> socket within the
> > same binding. User can close socket A and return the tokens via socket
> > B. As written the doc made me think that there will be a leak if socket
> > is closed without releasing tokens, or that there may be a race with
> > data queued but not read. Neither is true, really?
>
> That is correct, neither is true. If the two sockets share a binding the
> kernel doesn't care which socket received the token or which one
> returned it. No token <> socket association. There is no
> queued-but-not-read race either. If any tokens are not returned, as long
> as all of the binding references are eventually released and all sockets
> that used the binding are closed, then all references will be accounted
> for and everything cleaned up.

Naming is hard, but I wonder whether the whole feature wouldn't be
better referred to as something to do with global token accounting
/ management? AUTORELEASE makes sense but seems like focusing on one
particular side effect.
On Wed, Jan 21, 2026 at 6:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 21 Jan 2026 18:37:56 -0800 Bobby Eshleman wrote:
> > > > Show an example of the three steps: returning the tokens, unbinding, and closing the
> > > > sockets (TCP/NL)?
> > >
> > > TBH I read the doc before reading the code, which I guess may actually
> > > be better since we don't expect users to read the code first either..
> > >
> > > Now after reading the code I'm not sure the doc explains things
> > > properly. AFAIU there's no association of token <> socket within the
> > > same binding. User can close socket A and return the tokens via socket
> > > B. As written the doc made me think that there will be a leak if socket
> > > is closed without releasing tokens, or that there may be a race with
> > > data queued but not read. Neither is true, really?
> >
> > That is correct, neither is true. If the two sockets share a binding the
> > kernel doesn't care which socket received the token or which one
> > returned it. No token <> socket association. There is no
> > queued-but-not-read race either. If any tokens are not returned, as long
> > as all of the binding references are eventually released and all sockets
> > that used the binding are closed, then all references will be accounted
> > for and everything cleaned up.
>
> Naming is hard, but I wonder whether the whole feature wouldn't be
> better referred to as something to do with global token accounting
> / management? AUTORELEASE makes sense but seems like focusing on one
> particular side effect.

Good point. The only real use case for autorelease=on is for backwards
compatibility... so I thought maybe DEVMEM_A_DMABUF_COMPAT_TOKEN
or DEVMEM_A_DMABUF_COMPAT_DONTNEED would be clearer?
On Wed, 21 Jan 2026 19:25:27 -0800 Bobby Eshleman wrote:
> > > That is correct, neither is true. If the two sockets share a binding the
> > > kernel doesn't care which socket received the token or which one
> > > returned it. No token <> socket association. There is no
> > > queued-but-not-read race either. If any tokens are not returned, as long
> > > as all of the binding references are eventually released and all sockets
> > > that used the binding are closed, then all references will be accounted
> > > for and everything cleaned up.
> >
> > Naming is hard, but I wonder whether the whole feature wouldn't be
> > better referred to as something to do with global token accounting
> > / management? AUTORELEASE makes sense but seems like focusing on one
> > particular side effect.
>
> Good point. The only real use case for autorelease=on is for backwards
> compatibility... so I thought maybe DEVMEM_A_DMABUF_COMPAT_TOKEN
> or DEVMEM_A_DMABUF_COMPAT_DONTNEED would be clearer?

Hm. Maybe let's return to naming once we have consensus on the uAPI.

Does everyone think that pushing this via TCP socket opts still makes
sense, even tho in practice the TCP socket is just how we find the
binding?
On 01/21, Jakub Kicinski wrote:
> On Wed, 21 Jan 2026 19:25:27 -0800 Bobby Eshleman wrote:
> > > > That is correct, neither is true. If the two sockets share a binding the
> > > > kernel doesn't care which socket received the token or which one
> > > > returned it. No token <> socket association. There is no
> > > > queued-but-not-read race either. If any tokens are not returned, as long
> > > > as all of the binding references are eventually released and all sockets
> > > > that used the binding are closed, then all references will be accounted
> > > > for and everything cleaned up.
> > >
> > > Naming is hard, but I wonder whether the whole feature wouldn't be
> > > better referred to as something to do with global token accounting
> > > / management? AUTORELEASE makes sense but seems like focusing on one
> > > particular side effect.
> >
> > Good point. The only real use case for autorelease=on is for backwards
> > compatibility... so I thought maybe DEVMEM_A_DMABUF_COMPAT_TOKEN
> > or DEVMEM_A_DMABUF_COMPAT_DONTNEED would be clearer?
>
> Hm. Maybe let's return to naming once we have consensus on the uAPI.
>
> Does everyone think that pushing this via TCP socket opts still makes
> sense, even tho in practice the TCP socket is just how we find the
> binding?

I'm not a fan of the existing cmsg scheme, but we already have userspace
using it, so getting more performance out of it seems like an easy win?
On Wed, 21 Jan 2026 20:07:11 -0800 Stanislav Fomichev wrote:
> On 01/21, Jakub Kicinski wrote:
> > On Wed, 21 Jan 2026 19:25:27 -0800 Bobby Eshleman wrote:
> > > Good point. The only real use case for autorelease=on is for backwards
> > > compatibility... so I thought maybe DEVMEM_A_DMABUF_COMPAT_TOKEN
> > > or DEVMEM_A_DMABUF_COMPAT_DONTNEED would be clearer?
> >
> > Hm. Maybe let's return to naming once we have consensus on the uAPI.
> >
> > Does everyone think that pushing this via TCP socket opts still makes
> > sense, even tho in practice the TCP socket is just how we find the
> > binding?
>
> I'm not a fan of the existing cmsg scheme, but we already have userspace
> using it, so getting more performance out of it seems like an easy win?
I don't like:
- the fact that we have to add the binding to a socket (extra field)
- single socket can only serve single binding, there's no technical
reason for this really, AFAICT, just the fact that we have a single
pointer in the sock struct
- the 7 levels of indentation in tcp_recvmsg_dmabuf()
I understand your argument, but as is this series feels closer to a PoC
than an easy win (the easy part should imply minor changes and no
detrimental effect on code quality IMHO).
On Mon, Jan 26, 2026 at 05:26:46PM -0800, Jakub Kicinski wrote:
> On Wed, 21 Jan 2026 20:07:11 -0800 Stanislav Fomichev wrote:
> > On 01/21, Jakub Kicinski wrote:
> > > On Wed, 21 Jan 2026 19:25:27 -0800 Bobby Eshleman wrote:
> > > > Good point. The only real use case for autorelease=on is for backwards
> > > > compatibility... so I thought maybe DEVMEM_A_DMABUF_COMPAT_TOKEN
> > > > or DEVMEM_A_DMABUF_COMPAT_DONTNEED would be clearer?
> > >
> > > Hm. Maybe let's return to naming once we have consensus on the uAPI.
> > >
> > > Does everyone think that pushing this via TCP socket opts still makes
> > > sense, even tho in practice the TCP socket is just how we find the
> > > binding?
> >
> > I'm not a fan of the existing cmsg scheme, but we already have userspace
> > using it, so getting more performance out of it seems like an easy win?
>
> I don't like:
> - the fact that we have to add the binding to a socket (extra field)
> - single socket can only serve single binding, there's no technical
>   reason for this really, AFAICT, just the fact that we have a single
>   pointer in the sock struct

The core reason is that sockets lose the ability to map a given token to
a given binding by no longer storing the niov ptr.

One proposal I had was to encode some number of bits in the token that
can be used to lookup the binding in an array, I could reboot that
approach.

With 32 bits, we can represent:

dmabuf max size = 512 GB, max dmabuf count = 8
dmabuf max size = 256 GB, max dmabuf count = 16
dmabuf max size = 128 GB, max dmabuf count = 32

etc...

Then, if the dmabuf count encoding space is exhausted, the socket would
have to wait until the user returns all of the tokens from one of the
dmabufs and frees the ID (or err out is another option).

This wouldn't change adding a field to the socket, we'd have to add one
or two more for allocating the dmabuf ID and fetching the dmabuf with
it. But it does fix the single binding thing.

> - the 7 levels of indentation in tcp_recvmsg_dmabuf()

For sure, it is getting hairy.

> I understand your argument, but as is this series feels closer to a PoC
> than an easy win (the easy part should imply minor changes and no
> detrimental effect on code quality IMHO).

Sure, let's try to find a way to minimize the changes.

Best,
Bobby
On Mon, 26 Jan 2026 18:30:45 -0800 Bobby Eshleman wrote:
> > > I'm not a fan of the existing cmsg scheme, but we already have userspace
> > > using it, so getting more performance out of it seems like an easy win?
> >
> > I don't like:
> > - the fact that we have to add the binding to a socket (extra field)
> > - single socket can only serve single binding, there's no technical
> >   reason for this really, AFAICT, just the fact that we have a single
> >   pointer in the sock struct
>
> The core reason is that sockets lose the ability to map a given token to
> a given binding by no longer storing the niov ptr.
>
> One proposal I had was to encode some number of bits in the token that
> can be used to lookup the binding in an array, I could reboot that
> approach.
>
> With 32 bits, we can represent:
>
> dmabuf max size = 512 GB, max dmabuf count = 8
> dmabuf max size = 256 GB, max dmabuf count = 16
> dmabuf max size = 128 GB, max dmabuf count = 32
>
> etc...
>
> Then, if the dmabuf count encoding space is exhausted, the socket would
> have to wait until the user returns all of the tokens from one of the
> dmabufs and frees the ID (or err out is another option).
>
> This wouldn't change adding a field to the socket, we'd have to add one
> or two more for allocating the dmabuf ID and fetching the dmabuf with
> it. But it does fix the single binding thing.

I think the bigger problem (than space exhaustion) is that we'd also
have some understanding of permissions. If an application guesses
the binding ID of another app it can mess up its buffers. ENOBUENO..
On Mon, Jan 26, 2026 at 06:44:40PM -0800, Jakub Kicinski wrote:
> On Mon, 26 Jan 2026 18:30:45 -0800 Bobby Eshleman wrote:
> > > > I'm not a fan of the existing cmsg scheme, but we already have userspace
> > > > using it, so getting more performance out of it seems like an easy win?
> > >
> > > I don't like:
> > > - the fact that we have to add the binding to a socket (extra field)
> > > - single socket can only serve single binding, there's no technical
> > >   reason for this really, AFAICT, just the fact that we have a single
> > >   pointer in the sock struct
> >
> > The core reason is that sockets lose the ability to map a given token to
> > a given binding by no longer storing the niov ptr.
> >
> > One proposal I had was to encode some number of bits in the token that
> > can be used to lookup the binding in an array, I could reboot that
> > approach.
> >
> > With 32 bits, we can represent:
> >
> > dmabuf max size = 512 GB, max dmabuf count = 8
> > dmabuf max size = 256 GB, max dmabuf count = 16
> > dmabuf max size = 128 GB, max dmabuf count = 32
> >
> > etc...
> >
> > Then, if the dmabuf count encoding space is exhausted, the socket would
> > have to wait until the user returns all of the tokens from one of the
> > dmabufs and frees the ID (or err out is another option).
> >
> > This wouldn't change adding a field to the socket, we'd have to add one
> > or two more for allocating the dmabuf ID and fetching the dmabuf with
> > it. But it does fix the single binding thing.
>
> I think the bigger problem (than space exhaustion) is that we'd also
> have some understanding of permissions. If an application guesses
> the binding ID of another app it can mess up its buffers. ENOBUENO..

I was thinking it would be per-socket, effectively:

sk->sk_devmem_info.bindings[binding_id_from_token(token)]

So sockets could only access those that they have already recv'd on.
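For concreteness, the encoding being discussed could look roughly like the
sketch below (the 3/29 bit split and all names are hypothetical, illustrating
the proposal above rather than any existing code):

    #include <stdint.h>

    /* Top bits select a per-socket binding slot, low bits give the frag
     * index within that binding's dmabuf.  More slot bits allow more
     * concurrent bindings per socket at the cost of a smaller maximum
     * dmabuf size, which is the size/count trade-off listed above.
     */
    #define BINDING_BITS  3                 /* up to 8 bindings per socket */
    #define FRAG_BITS     (32 - BINDING_BITS)
    #define FRAG_MASK     ((UINT32_C(1) << FRAG_BITS) - 1)

    static inline uint32_t token_pack(uint32_t slot, uint32_t frag_idx)
    {
            return (slot << FRAG_BITS) | (frag_idx & FRAG_MASK);
    }

    static inline uint32_t token_slot(uint32_t token)
    {
            return token >> FRAG_BITS;
    }

    static inline uint32_t token_frag(uint32_t token)
    {
            return token & FRAG_MASK;
    }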
On Mon, 26 Jan 2026 19:06:49 -0800 Bobby Eshleman wrote:
> > > Then, if the dmabuf count encoding space is exhausted, the socket would
> > > have to wait until the user returns all of the tokens from one of the
> > > dmabufs and frees the ID (or err out is another option).
> > >
> > > This wouldn't change adding a field to the socket, we'd have to add one
> > > or two more for allocating the dmabuf ID and fetching the dmabuf with
> > > it. But it does fix the single binding thing.
> >
> > I think the bigger problem (than space exhaustion) is that we'd also
> > have some understanding of permissions. If an application guesses
> > the binding ID of another app it can mess up its buffers. ENOBUENO..
>
> I was thinking it would be per-socket, effectively:
>
> sk->sk_devmem_info.bindings[binding_id_from_token(token)]
>
> So sockets could only access those that they have already recv'd on.

Ah, missed that the array would be per socket. I guess it'd have to be
reusing the token xarray otherwise we're taking up even more space in
the socket struct? Dunno.
On Mon, Jan 26, 2026 at 07:43:59PM -0800, Jakub Kicinski wrote:
> On Mon, 26 Jan 2026 19:06:49 -0800 Bobby Eshleman wrote:
> > > > Then, if the dmabuf count encoding space is exhausted, the socket would
> > > > have to wait until the user returns all of the tokens from one of the
> > > > dmabufs and frees the ID (or err out is another option).
> > > >
> > > > This wouldn't change adding a field to the socket, we'd have to add one
> > > > or two more for allocating the dmabuf ID and fetching the dmabuf with
> > > > it. But it does fix the single binding thing.
> > >
> > > I think the bigger problem (than space exhaustion) is that we'd also
> > > have some understanding of permissions. If an application guesses
> > > the binding ID of another app it can mess up its buffers. ENOBUENO..
> >
> > I was thinking it would be per-socket, effectively:
> >
> > sk->sk_devmem_info.bindings[binding_id_from_token(token)]
> >
> > So sockets could only access those that they have already recv'd on.
>
> Ah, missed that the array would be per socket. I guess it'd have to be
> reusing the token xarray otherwise we're taking up even more space in
> the socket struct? Dunno.

Yeah, unless we just want to break this all off into a malloc'd struct
we point to... or put into tcp_sock (not sure if either addresses the
unappealing bit of adding to struct sock)?