[PATCH bpf-next V2] xsk: allow remap of fill and/or completion rings

Nuno Gonçalves posted 1 patch 2 years, 10 months ago
There is a newer version of this series
net/xdp/xsk.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
Posted by Nuno Gonçalves 2 years, 10 months ago
Remapping the fill and completion rings has been rejected so far because
they control the usage of the UMEM, which does not support concurrent use.
However, this restriction also prevents remapping these rings into
another process.

A possible use case is that the user wants to transfer the socket/
UMEM ownership to another process (via SYS_pidfd_getfd) and so
would need to also remap these rings.

This will have no impact on current usages and just relaxes the
remap limitation.

Signed-off-by: Nuno Gonçalves <nunog@fr24.com>
---
 net/xdp/xsk.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5eb..e2571ec067526 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1301,9 +1301,10 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 	loff_t offset = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
 	unsigned long size = vma->vm_end - vma->vm_start;
 	struct xdp_sock *xs = xdp_sk(sock->sk);
+	int state = READ_ONCE(xs->state);
 	struct xsk_queue *q = NULL;
 
-	if (READ_ONCE(xs->state) != XSK_READY)
+	if (state != XSK_READY && state != XSK_BOUND)
 		return -EBUSY;
 
 	if (offset == XDP_PGOFF_RX_RING) {
@@ -1314,9 +1315,11 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 		/* Matches the smp_wmb() in XDP_UMEM_REG */
 		smp_rmb();
 		if (offset == XDP_UMEM_PGOFF_FILL_RING)
-			q = READ_ONCE(xs->fq_tmp);
+			q = READ_ONCE(state == XSK_READY ? xs->fq_tmp :
+							   xs->pool->fq);
 		else if (offset == XDP_UMEM_PGOFF_COMPLETION_RING)
-			q = READ_ONCE(xs->cq_tmp);
+			q = READ_ONCE(state == XSK_READY ? xs->cq_tmp :
+							   xs->pool->cq);
 	}
 
 	if (!q)
-- 
2.40.0
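
For readers unfamiliar with the use case in the commit message, here is a rough userspace sketch of what the receiving process might do: duplicate the AF_XDP socket with pidfd_getfd() and then mmap the fill ring. It is an illustration under assumptions, not code from this patch. The helper names `umem_ring_map_len` and `steal_fill_ring` are hypothetical, the caller is assumed to know the fill ring size the original owner configured, and pidfd_getfd() needs ptrace-level permission over the owning process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/if_xdp.h>

#ifndef SOL_XDP
#define SOL_XDP 283	/* fallback for older libc headers */
#endif

/*
 * Bytes to map for a fill or completion ring: everything up to the
 * descriptor array, plus one 64-bit UMEM address per entry.
 */
static size_t umem_ring_map_len(const struct xdp_ring_offset *off,
				uint32_t entries)
{
	return (size_t)off->desc + (size_t)entries * sizeof(uint64_t);
}

/*
 * Hypothetical helper: duplicate the AF_XDP socket owner_fd from process
 * owner_pid into this process and map its fill ring. fill_entries must
 * match the size the owner set with XDP_UMEM_FILL_RING (assumed known).
 * Returns MAP_FAILED on any error.
 */
static void *steal_fill_ring(pid_t owner_pid, int owner_fd,
			     uint32_t fill_entries, int *xsk_fd_out)
{
	struct xdp_mmap_offsets off;
	socklen_t optlen = sizeof(off);
	int pidfd, xsk_fd;
	void *ring;

	assert(xsk_fd_out != NULL);

	pidfd = (int)syscall(SYS_pidfd_open, owner_pid, 0);
	if (pidfd < 0)
		return MAP_FAILED;

	xsk_fd = (int)syscall(SYS_pidfd_getfd, pidfd, owner_fd, 0);
	close(pidfd);
	if (xsk_fd < 0)
		return MAP_FAILED;

	/* Ask the kernel where the ring fields live inside the mapping. */
	if (getsockopt(xsk_fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen) < 0) {
		close(xsk_fd);
		return MAP_FAILED;
	}

	/* On a bound socket this mmap is what currently fails with -EBUSY. */
	ring = mmap(NULL, umem_ring_map_len(&off.fr, fill_entries),
		    PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
		    xsk_fd, XDP_UMEM_PGOFF_FILL_RING);
	if (ring == MAP_FAILED) {
		close(xsk_fd);
		return MAP_FAILED;
	}

	*xsk_fd_out = xsk_fd;
	return ring;
}
```

The completion ring would be mapped the same way with `off.cr` and `XDP_UMEM_PGOFF_COMPLETION_RING`. The original owner is expected to stop touching the rings once ownership has moved, since the UMEM does not support concurrent use.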

Re: [PATCH bpf-next V2] xsk: allow remap of fill and/or completion rings
Posted by Magnus Karlsson 2 years, 10 months ago
On Mon, 20 Mar 2023 at 21:54, Nuno Gonçalves <nunog@fr24.com> wrote:
>
> The remap of fill and completion rings was frowned upon as they
> control the usage of UMEM which does not support concurrent use.
> At the same time this would disallow the remap of these rings
> into another process.
>
> A possible use case is that the user wants to transfer the socket/
> UMEM ownership to another process (via SYS_pidfd_getfd) and so
> would need to also remap these rings.
>
> This will have no impact on current usages and just relaxes the
> remap limitation.

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

> Signed-off-by: Nuno Gonçalves <nunog@fr24.com>
> ---
>  net/xdp/xsk.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 2ac58b282b5eb..e2571ec067526 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -1301,9 +1301,10 @@ static int xsk_mmap(struct file *file, struct socket *sock,
>         loff_t offset = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
>         unsigned long size = vma->vm_end - vma->vm_start;
>         struct xdp_sock *xs = xdp_sk(sock->sk);
> +       int state = READ_ONCE(xs->state);
>         struct xsk_queue *q = NULL;
>
> -       if (READ_ONCE(xs->state) != XSK_READY)
> +       if (state != XSK_READY && state != XSK_BOUND)
>                 return -EBUSY;
>
>         if (offset == XDP_PGOFF_RX_RING) {
> @@ -1314,9 +1315,11 @@ static int xsk_mmap(struct file *file, struct socket *sock,
>                 /* Matches the smp_wmb() in XDP_UMEM_REG */
>                 smp_rmb();
>                 if (offset == XDP_UMEM_PGOFF_FILL_RING)
> -                       q = READ_ONCE(xs->fq_tmp);
> +                       q = READ_ONCE(state == XSK_READY ? xs->fq_tmp :
> +                                                          xs->pool->fq);
>                 else if (offset == XDP_UMEM_PGOFF_COMPLETION_RING)
> -                       q = READ_ONCE(xs->cq_tmp);
> +                       q = READ_ONCE(state == XSK_READY ? xs->cq_tmp :
> +                                                          xs->pool->cq);
>         }
>
>         if (!q)
> --
> 2.40.0
>
Re: [PATCH bpf-next V2] xsk: allow remap of fill and/or completion rings
Posted by Daniel Borkmann 2 years, 10 months ago
On 3/21/23 12:13 PM, Magnus Karlsson wrote:
> On Mon, 20 Mar 2023 at 21:54, Nuno Gonçalves <nunog@fr24.com> wrote:
>>
>> The remap of fill and completion rings was frowned upon as they
>> control the usage of UMEM which does not support concurrent use.
>> At the same time this would disallow the remap of these rings
>> into another process.
>>
>> A possible use case is that the user wants to transfer the socket/
>> UMEM ownership to another process (via SYS_pidfd_getfd) and so
>> would need to also remap these rings.
>>
>> This will have no impact on current usages and just relaxes the
>> remap limitation.
> 
> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> 
>> Signed-off-by: Nuno Gonçalves <nunog@fr24.com>
>> ---
>>   net/xdp/xsk.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>> index 2ac58b282b5eb..e2571ec067526 100644
>> --- a/net/xdp/xsk.c
>> +++ b/net/xdp/xsk.c
>> @@ -1301,9 +1301,10 @@ static int xsk_mmap(struct file *file, struct socket *sock,
>>          loff_t offset = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
>>          unsigned long size = vma->vm_end - vma->vm_start;
>>          struct xdp_sock *xs = xdp_sk(sock->sk);
>> +       int state = READ_ONCE(xs->state);
>>          struct xsk_queue *q = NULL;
>>
>> -       if (READ_ONCE(xs->state) != XSK_READY)
>> +       if (state != XSK_READY && state != XSK_BOUND)
>>                  return -EBUSY;
>>
>>          if (offset == XDP_PGOFF_RX_RING) {
>> @@ -1314,9 +1315,11 @@ static int xsk_mmap(struct file *file, struct socket *sock,
>>                  /* Matches the smp_wmb() in XDP_UMEM_REG */
>>                  smp_rmb();
>>                  if (offset == XDP_UMEM_PGOFF_FILL_RING)
>> -                       q = READ_ONCE(xs->fq_tmp);
>> +                       q = READ_ONCE(state == XSK_READY ? xs->fq_tmp :
>> +                                                          xs->pool->fq);
>>                  else if (offset == XDP_UMEM_PGOFF_COMPLETION_RING)
>> -                       q = READ_ONCE(xs->cq_tmp);
>> +                       q = READ_ONCE(state == XSK_READY ? xs->cq_tmp :
>> +                                                          xs->pool->cq);

This triggers a build error:

   [...]
     CC      drivers/acpi/fan_attr.o
     CC      net/ipv6/syncookies.o
   ../net/xdp/xsk.c:1318:8: error: cannot take the address of an rvalue of type 'struct xsk_queue *'
                           q = READ_ONCE(state == XSK_READY ? xs->fq_tmp :
                               ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ../include/asm-generic/rwonce.h:50:2: note: expanded from macro 'READ_ONCE'
           __READ_ONCE(x);                                                 \
           ^           ~
   ../include/asm-generic/rwonce.h:44:70: note: expanded from macro '__READ_ONCE'
   #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
                                                                         ^ ~
   ../net/xdp/xsk.c:1321:8: error: cannot take the address of an rvalue of type 'struct xsk_queue *'
                           q = READ_ONCE(state == XSK_READY ? xs->cq_tmp :
                               ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ../include/asm-generic/rwonce.h:50:2: note: expanded from macro 'READ_ONCE'
           __READ_ONCE(x);                                                 \
           ^           ~
   ../include/asm-generic/rwonce.h:44:70: note: expanded from macro '__READ_ONCE'
   #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
                                                                         ^ ~
   2 errors generated.
   make[4]: *** [../scripts/Makefile.build:252: net/xdp/xsk.o] Error 1
   make[4]: *** Waiting for unfinished jobs....
     CC      fs/fs_types.o
     CC      kernel/bpf/offload.o
     AR      net/mpls/built-in.a
     CC      net/mptcp/subflow.o
   make[3]: *** [../scripts/Makefile.build:494: net/xdp] Error 2
   make[3]: *** Waiting for unfinished jobs....
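
The error comes from READ_ONCE() taking the address of its argument, while in C the result of a conditional expression is not an lvalue. One way to keep the intended behaviour is to apply READ_ONCE() to each arm of the conditional instead of to the whole expression. The userspace sketch below illustrates this with a simplified stand-in for the macro and cut-down versions of the structures. The macro definition, the struct layouts and the helper name `pick_fill_ring` are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel macro: a volatile load through the
 * address of the operand, so the operand has to be an lvalue.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

enum { XSK_READY, XSK_BOUND };

/* Cut-down versions of the kernel structures, for illustration only. */
struct xsk_queue { int id; };
struct xsk_buff_pool { struct xsk_queue *fq; struct xsk_queue *cq; };
struct xdp_sock {
	int state;
	struct xsk_queue *fq_tmp;
	struct xsk_queue *cq_tmp;
	struct xsk_buff_pool *pool;
};

/* Hypothetical helper mirroring the fill ring branch of xsk_mmap(). */
static struct xsk_queue *pick_fill_ring(struct xdp_sock *xs)
{
	int state = READ_ONCE(xs->state);

	/* In this sketch a socket that is not READY must have a pool. */
	assert(state == XSK_READY || xs->pool != NULL);

	/*
	 * READ_ONCE(cond ? a : b) does not compile because "cond ? a : b"
	 * is an rvalue. Each arm on its own is an lvalue, so the macro
	 * goes inside the conditional.
	 */
	return state == XSK_READY ? READ_ONCE(xs->fq_tmp)
				  : READ_ONCE(xs->pool->fq);
}
```

The completion ring branch would follow the same pattern with `cq_tmp` and `pool->cq`.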